Amazon Hosting TIGER Is Nice, OpenStreetMap Would Be Interesting
I’m sure most of you have seen the news of Amazon hosting TIGER shapefiles in S3 and now in EBS. Sure, I like TIGER being available for EC2 instances, but the really amazing stuff happens when you can work with OpenStreetMap XML data. That, mounted to either FME Server or some great open source tools running on EC2, really would open whole worlds up. TIGER is the low-hanging fruit here, but OSM would be the icing. My mouth waters thinking about what people could do with EC2 instances chomping on OSM data. You could do the lifting yourself, but Amazon’s rates are lower than what it would cost to host it yourself, and since you are already on AWS, the benefit would be huge.

The GeoBunny just wants to consume OSM data on AWS
Update: A couple of people have asked, and yes, you need to have an EC2 instance to leverage the EBS TIGER data.


This has already been discussed with the powers that be at Amazon and should be no problem; it just needs to be implemented. The question is this: since the planet.osm files are already hosted in a suitable manner, what sort of hosting would be beneficial for OSM data? A pgdata directory? A pg_dump file? Ideas welcome. I’m happy to set it up and get it sorted with the Public Data team. Just FYI, a pg_dump of the current planet dataset is > 5GB, which is the max size on S3.
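If a dump does end up larger than the per-file limit mentioned above, one workaround is to cut it into numbered parts before uploading. The following is only a rough sketch: the 5GB figure is taken from the comment as given, and `split_file`, the `.partNNNN` naming, and the buffer size are hypothetical choices for illustration, not anything Amazon or OSM prescribes.

```python
import os

# Per-file limit quoted in the comment above; treated here as an assumption.
S3_MAX_OBJECT_BYTES = 5 * 1024**3


def split_file(path, max_bytes=S3_MAX_OBJECT_BYTES, buffer_bytes=1024 * 1024):
    """Split `path` into numbered parts of at most `max_bytes` each.

    Returns the list of part paths, in order. The parts can be joined
    back together by concatenating them in that order.
    """
    if max_bytes <= 0 or buffer_bytes <= 0:
        raise ValueError("max_bytes and buffer_bytes must be positive")

    parts = []
    index = 0
    with open(path, "rb") as src:
        while True:
            part_path = f"{path}.part{index:04d}"
            written = 0
            with open(part_path, "wb") as dst:
                # Copy in small buffers so the whole dump never sits in memory.
                while written < max_bytes:
                    chunk = src.read(min(buffer_bytes, max_bytes - written))
                    if not chunk:
                        break
                    dst.write(chunk)
                    written += len(chunk)
            if written == 0:
                # Source exhausted: drop the empty trailing part and stop.
                os.remove(part_path)
                break
            parts.append(part_path)
            index += 1
    return parts
```

Calling `split_file("planet.dump")` (a made-up filename) would leave `planet.dump.part0000`, `planet.dump.part0001`, and so on next to the original, each small enough to upload on its own.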
Cool
Would it be possible to provide an EBS volume with data already preloaded to PostGIS?
Does 5GB mark the high end of a snapshot requiring modularized splitting of OSM data? Doesn’t sound like S3 will be much help in the long run if OSM continues expansion.
EBS – volumes from 1 GB to 1 TB. Do you need the snapshot (only 5GB) to start a new EBS?
Got to have more room for OpenAerial and someday OpenTerrain(LiDAR)!
[...] James Fee looks at AWS data and here is the Tiger .shp snapshot James mentions: Amazon TIGER snapshot More details here: Tom MacWright [...]
Many moons ago (mid 2008?) you mentioned that WeoGeo are putting FME Server up on AWS. I’ve seen nothing public since???
@Pete,
the moons do fly, don’t they? We haven’t made any public announcements of FME Server on AWS. However, James gave a nice presentation with Denice Ross at Where 2.0 on a project that is demonstrating the nascent integration of FME and WeoGeo (http://bit.ly/18BhYs).
We will be making a presentation at the FME UC on some more features that will be publicly available soon. I think that Safe will make those presentations available after the conference.
If you have some specific questions about roadmap and functionality, please contact us at support [at] weogeo [dot] com.
Just wanted to sort out some of the confusion on EC2/EBS here.
We currently have about 4TB of aerial image data up on EBS and S3 in different states of processing. As Randy mentions above, you can create EBS volumes from 1GB to 1TB. Beyond that you need to combine volumes with LVM. You can also snapshot those volumes (up to 1TB each) to S3, and later recreate any number of EBS volumes from that one snapshot simultaneously. In practice that number is limited by your max allotment of 20 EBS volumes.
The EBS snapshot is not limited to 5GB. 5GB is the max size for an S3 file you upload by other means. The snapshot process works independently of normal access to your S3 bucket system. You can’t actually see the EBS snapshots under your S3 account.
As long as the OSM dataset is under 1TB, it would be easy to implement as a ready-to-go EBS snapshot. If it is only in the 5GB range in size, you would hardly notice the EBS bill; it’s the EC2 instance that would cost. In fact even the standard EC2 instance comes with over 100GB of disk space, so the more practical approach for a read-only system would be to just copy it to the EC2 instance.
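To make the sizing arithmetic above concrete, here is a small back-of-the-envelope sketch. It assumes the limits quoted in this comment (1TB per volume, 20 volumes per account) and counts 1TB as 1024GB. `plan_volumes` is a hypothetical helper for illustration and does not call any AWS API.

```python
import math

# Limits as quoted in the comment above; both are assumptions here.
MAX_VOLUME_GB = 1024  # 1TB per EBS volume, counted as 1024GB
MAX_VOLUMES = 20      # per-account EBS volume allotment


def plan_volumes(dataset_gb, copies=1):
    """Estimate how many EBS volumes a dataset needs.

    `copies` is the number of independent restores of the same snapshot.
    """
    if dataset_gb <= 0 or copies < 1:
        raise ValueError("dataset_gb must be positive and copies at least 1")

    # A dataset larger than one volume has to be spread across several.
    per_copy = math.ceil(dataset_gb / MAX_VOLUME_GB)
    total = per_copy * copies
    return {
        "volumes_per_copy": per_copy,
        "needs_lvm": per_copy > 1,  # more than one volume means LVM
        "total_volumes": total,
        "within_allotment": total <= MAX_VOLUMES,
    }
```

By this estimate, a dataset in the 5GB range fits on a single volume with no LVM, while something like the roughly 4TB of imagery mentioned above would need four volumes per copy.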