Exploring OpenStreetMap

I want to make a map of the continental United States (CONUS) that contains the US border, state borders, and major highways using data from OpenStreetMap. It can’t be that difficult, right?

Getting OpenStreetMap Data

The first order of business is getting the data. A quick Google search led me to the Downloading data page on the OpenStreetMap wiki. A quick look through the extracts that others provide did not turn up what I wanted, so I opted to download the entire planet.osm file. To be a good internet citizen, I went with BitTorrent. A third-party service provides the torrent download and can be found here. The file I downloaded was the compressed OSM data from February 26, 2014, and came in at 33GB. Uncompressing the file used up another 464GB of drive space. Did I mention that an external drive is useful?

Put the US in a box

Extracting just the US data from the full planet.osm file would significantly decrease the size of the file I need to work with. Clipping the data to the actual US border looked like a difficult undertaking. Having a bit of Canada and Mexico mixed into the data seemed fine to me, so I went the bounding box route. There are several tools that the OpenStreetMap wiki suggests for getting data based on a bounding box. I arbitrarily chose Osmosis. After installing Osmosis and figuring out the latitudes and longitudes I wanted to bound the data by, I used the following command:

$ osmosis-latest/bin/osmosis \
        --read-xml file=osm_d/planet-140226.osm \
        --bounding-box \
        top=49.384444 \
        left=-124.733056 \
        bottom=24.520833 \
        right=-66.949722 \
        --tag-filter reject-relations \
        --write-xml file=osm_d/us-140226.osm
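
To make the trade-off of the bounding box route concrete, here is a minimal Python sketch that checks whether a point falls inside the same box. It is only an illustration, not what Osmosis does internally: the `BoundingBox` class is hypothetical, the edges are treated as inclusive by assumption, and the city coordinates are rough values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Hypothetical helper mirroring the top/left/bottom/right arguments above."""
    top: float
    left: float
    bottom: float
    right: float

    def contains(self, lat: float, lon: float) -> bool:
        # Assumption: edges are inclusive. This is a simplification for the sketch.
        return self.bottom <= lat <= self.top and self.left <= lon <= self.right


# Same values as in the osmosis command.
CONUS_BOX = BoundingBox(top=49.384444, left=-124.733056,
                        bottom=24.520833, right=-66.949722)

# Approximate coordinates, for illustration only.
print(CONUS_BOX.contains(39.74, -104.99))   # Denver -> True
print(CONUS_BOX.contains(43.65, -79.38))    # Toronto -> True, Canada leaks into the box
print(CONUS_BOX.contains(61.22, -149.90))   # Anchorage -> False, Alaska is outside
```

The Toronto case is the "bit of Canada and Mexico mixed in" that a rectangular cut cannot avoid.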

That left me with a 104GB file, which is still way too big to deal with. Since I only want country and state borders, there must be a way to filter out the other stuff.

Filtering OpenStreetMap data

Information about map features in OpenStreetMap can be found here. As I said before, I want US and state borders and highways, and let's throw in cities as well. After digging around I came up with this:

  • Administrative Boundaries - Admin level 2 for national borders and 4 for state borders. Element types node, way and relation.
  • Highway - motorway, trunk, primary and secondary. Element type way.
  • Cities - city and town. Maybe filter on population as well. I think I only want element type node, though cities can be area and relation elements as well.
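
The criteria in this list can be written down as a single predicate. The Python sketch below is only a way to pin down the selection rules and is not part of any OSM tool. The name `is_wanted` and the set constants are hypothetical, and the extra `boundary=administrative` check is an assumption based on the usual OSM tagging convention, whereas the Osmosis commands in this post filter on `admin_level` alone.

```python
HIGHWAY_CLASSES = {"motorway", "trunk", "primary", "secondary"}
ADMIN_LEVELS = {"2", "4"}          # OSM tag values are strings
PLACE_KINDS = {"city", "town"}


def is_wanted(element_type: str, tags: dict) -> bool:
    """Return True if an element matches one of the three criteria in the list."""
    # Highways: ways only.
    if element_type == "way" and tags.get("highway") in HIGHWAY_CLASSES:
        return True
    # Borders: any element type. Assumption: also require boundary=administrative.
    if (tags.get("boundary") == "administrative"
            and tags.get("admin_level") in ADMIN_LEVELS):
        return True
    # Cities: nodes only, as proposed in the list.
    if element_type == "node" and tags.get("place") in PLACE_KINDS:
        return True
    return False


print(is_wanted("way", {"highway": "trunk"}))        # True
print(is_wanted("way", {"highway": "residential"}))  # False
print(is_wanted("node", {"place": "town"}))          # True
```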

Going back to osmosis I figure the following might work to get the data I need:

#!/usr/bin/env bash
../osmosis-latest/bin/osmosis \
        --read-xml file=osm_files/us-140226.osm \
        --tag-filter accept-ways highway=motorway,trunk,primary,secondary \
        --tag-filter accept-ways admin_level=2,4 \
        --tag-filter reject-ways boundary=maritime \
        --tag-filter accept-nodes place=city,town \
        --used-node \
        --write-xml file=osm_files/filtered-140226.osm

Time to take a look in JOSM.

jsom_cities_only

That isn’t right. It looks like just a bunch of cities. The blank spots around what looks to be Oklahoma and Kansas, as well as the Carolinas, are kind of interesting though. I suspect what is going on here is that each filter in the chain removes more and more data. The last filter I have is --tag-filter accept-nodes place=city,town, so maybe that is the problem. Time to create a separate filter run for each thing I want to export. After several iterations of seeing strange things in the map, like parks and whatnot, I added extra filters.
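
The narrowing effect can be illustrated with a toy model in Python. This is not Osmosis code. It assumes that each accept filter drops non-matching elements of its own type and passes other element types through unchanged, which is consistent with the cities-only result. The `accept` and `ways_only` helpers and the sample elements are hypothetical.

```python
# Three made-up elements, one per thing the map should contain.
elements = [
    {"type": "way", "tags": {"highway": "motorway"}},
    {"type": "way", "tags": {"admin_level": "4"}},
    {"type": "node", "tags": {"place": "city"}},
]


def accept(items, element_type, key, values):
    """Toy accept filter: only elements of element_type are tested against the tag."""
    return [e for e in items
            if e["type"] != element_type or e["tags"].get(key) in values]


def ways_only(items):
    """Crude stand-in for dropping nodes that no kept way uses."""
    return [e for e in items if e["type"] == "way"]


# Chained filters: a way must pass BOTH way filters, so no way survives.
chained = accept(elements, "way", "highway", {"motorway"})
chained = accept(chained, "way", "admin_level", {"2", "4"})
chained = accept(chained, "node", "place", {"city", "town"})
print([e["tags"] for e in chained])   # [{'place': 'city'}]

# Separate passes, merged afterwards: each way only has to pass its own filter.
highways = ways_only(accept(elements, "way", "highway", {"motorway"}))
borders = ways_only(accept(elements, "way", "admin_level", {"2", "4"}))
merged = highways + borders
print([e["tags"] for e in merged])    # [{'highway': 'motorway'}, {'admin_level': '4'}]
```

In this model the chained filters behave like a logical AND on ways, while separate runs whose outputs are combined behave like an OR.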

../osmosis-latest/bin/osmosis \
        --read-xml file=osm_files/us-140226.osm \
        --tag-filter accept-ways admin_level=2 \
        --tag-filter reject-ways boundary=maritime \
        --tag-filter reject-ways maritime=yes \
        --tag-filter reject-ways boundary=national_park \
        --tag-filter reject-ways name="Pawnee Nation" \
        --tag-filter reject-ways designation="Neighborhood Group" \
        --tag-filter reject-ways designation="incorporated private school" \
        --tag-filter reject-ways name="Heer Park" \
        --used-node \
        --write-xml file=osm_files/admin2-140226.osm
../osmosis-latest/bin/osmosis \
        --read-xml file=osm_files/us-140226.osm \
        --tag-filter accept-ways admin_level=4 \
        --tag-filter reject-ways boundary=maritime \
        --tag-filter reject-ways boundary=national_park \
        --tag-filter reject-ways leisure=park \
        --used-node \
        --write-xml file=osm_files/admin4-140226.osm
../osmosis-latest/bin/osmosis \
        --read-xml file=osm_files/us-140226.osm \
        --tag-filter accept-ways highway=motorway,trunk,primary,secondary \
        --tag-filter reject-nodes highway=motorway_junction \
        --used-node \
        --write-xml file=osm_files/highways-140226.osm
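
To show what a run like the highways one is meant to achieve (keep matching ways, then keep only the nodes those ways reference), here is a small Python sketch using the standard library XML parser. It is an illustration under stated assumptions, not how Osmosis is implemented: `SAMPLE_OSM` is made-up data, `filter_highways` is a hypothetical name, and the whole document is loaded into memory, which would not work for a 104GB file.

```python
import xml.etree.ElementTree as ET

# Tiny made-up OSM XML document: one motorway, one park, and four nodes.
SAMPLE_OSM = """<osm version="0.6">
  <node id="1" lat="39.0" lon="-105.0"/>
  <node id="2" lat="39.1" lon="-105.1"/>
  <node id="3" lat="40.0" lon="-100.0"><tag k="highway" v="motorway_junction"/></node>
  <node id="4" lat="41.0" lon="-101.0"/>
  <way id="10"><nd ref="1"/><nd ref="2"/><tag k="highway" v="motorway"/></way>
  <way id="11"><nd ref="4"/><tag k="leisure" v="park"/></way>
</osm>"""


def filter_highways(osm_xml, classes):
    """Return (node ids, way ids) for ways whose highway tag is in classes."""
    root = ET.fromstring(osm_xml)
    kept_ways = []
    for way in root.findall("way"):
        tags = {t.get("k"): t.get("v") for t in way.findall("tag")}
        if tags.get("highway") in classes:
            kept_ways.append(way)
    # Keep only nodes referenced by a kept way (the idea behind --used-node).
    used = {nd.get("ref") for way in kept_ways for nd in way.findall("nd")}
    kept_nodes = [n for n in root.findall("node") if n.get("id") in used]
    return [n.get("id") for n in kept_nodes], [w.get("id") for w in kept_ways]


node_ids, way_ids = filter_highways(
    SAMPLE_OSM, {"motorway", "trunk", "primary", "secondary"})
print(node_ids, way_ids)   # ['1', '2'] ['10']
```

For real extracts a streaming parser such as `xml.etree.ElementTree.iterparse` would be needed, and a dedicated tool like Osmosis remains the practical choice.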

Using Mapnik, which I will not go into here, I made a few maps. The first one shows just the admin level 2 features.

admin level 2

Well, I was not expecting that. It looks like only the land borders came through and no coastlines. How about the state borders?

admin level 4

The area around the Great Lakes, as well as Texas, looks weird. Maybe river borders are not coming through. How about levels 2 and 4 combined?

admin level 2 and 4

That looks a little better, although the Great Lakes area, the area around Ohio and Kentucky, and a few other areas look weird. Now for fun, let's look at highways.

highways motorway, trunk, primary and secondary

That is pretty busy. Motorways are black, trunks are red, primary roads are green, and secondary roads are blue. Lastly, how about admin level 2, admin level 4, motorways, and trunks?

admin level 2, admin level 4, motorways, trunks

Well, it is starting to look like a map, but I think I will need to spend a lot more time filtering data to get this map to look right.

The scripts I used to generate the maps are here on GitHub.