OpenStreetMap, TIGER data, catalog.data.gov, and state government and regional (MPO) data are all very useful to me. I pondered data as a topic as I looked at the following error on the Census website.
The problem of inaccessible data.
It’s happened before and it will happen again. In this case, I was looking for the useful web interface for Census TIGER data shown on the right. Only, I got a MySQL error.
I waited a couple of days before sending an email about this (on a Sunday morning), and the issue was resolved within an hour of my sending it. I was very grateful. Should I download copious amounts of data to my computer so that I’m not at the mercy of such an issue? No, not if the only reason is that I fear failure of data sources.
Getting data via FTP
The Census data was available via FTP. I used
I discovered voting district (vtd) files in the Time Series Partnership Files. I never knew how to find these on the Census site, but they are in the TIGER2012 directory, listed by state number. Here’s what the data looks like in QGIS.
Download TIGER as needed
The key to accessing TIGER data is understanding the state and county codes. Montgomery County is identified as 42091, a combination of the two-digit state code (42) and the three-digit county code (091). It makes sense to know the codes or at least to post a reference on the wall.
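As a small illustration of how the combined code breaks apart, here is a minimal Python sketch. The function name `split_geoid` is my own invention for this example, not something from the Census tools. The codes are kept as strings so that leading zeros survive.

```python
def split_geoid(geoid: str) -> tuple[str, str]:
    """Split a 5-digit county code into (state_code, county_code).

    Hypothetical helper for illustration only. Codes stay as strings
    because values such as "091" lose their leading zero as integers.
    """
    if len(geoid) != 5 or not geoid.isdigit():
        raise ValueError(f"expected a 5-digit county code, got {geoid!r}")
    # First two digits are the state, last three are the county.
    return geoid[:2], geoid[2:]


state_code, county_code = split_geoid("42091")
print(state_code, county_code)  # 42 091
```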
TIGER data is also accessible online here, but downloading data for lots of counties at a time this way is a bit tedious. Say you are working with 10 counties. You would need to download water, streams, roads, rails, county subdivisions, and maybe some other data. You’d need to find parks and other natural data elsewhere.
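One way to take some of the tedium out of this is to generate the list of download links from the county codes. The sketch below only builds URLs and does not download anything. The base URL, folder names, and file naming pattern are my assumptions about how the TIGER2012 directory is laid out, so check them against the actual directory listing before relying on them. The second county code is just a placeholder.

```python
from itertools import product

# ASSUMPTION: base URL and naming pattern are guesses at the TIGER2012
# layout; verify against the real directory listing before use.
BASE_URL = "https://www2.census.gov/geo/tiger/TIGER2012"

# ASSUMPTION: folder name -> file suffix for layers published per county.
COUNTY_LAYERS = {
    "ROADS": "roads",
    "AREAWATER": "areawater",
    "LINEARWATER": "linearwater",
}


def county_layer_urls(geoids, layers=COUNTY_LAYERS, base_url=BASE_URL, year=2012):
    """Build one URL per (county, layer) pair. Hypothetical helper."""
    urls = []
    for geoid, (folder, suffix) in product(geoids, layers.items()):
        urls.append(f"{base_url}/{folder}/tl_{year}_{geoid}_{suffix}.zip")
    return urls


# "42091" is Montgomery County; "42017" is a placeholder second county.
for url in county_layer_urls(["42091", "42017"]):
    print(url)
```

With 10 counties and three layers this produces 30 links, which can then be handed to whatever download tool you prefer.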
Using OSM to save time
With the ease of downloading OSM via Geofabrik for larger areas, the ease of downloading specific data via QuickOSM in QGIS, and the ubiquitous availability of QGIS, it’s tempting to use OSM for larger projects. So… Why not just use OSM?
Because… I think it’s important to question data, if only to pay closer attention to the details.
If using OSM – Should I use the planet.osm.pbf file?
I downloaded the planet.osm.pbf file, and then my computer spent several hours just converting this to a geopackage. Ultimately, I couldn’t get the geopackage to run. I don’t want to set up a local database on my computer because I am constantly switching between computers. But I still may try this approach.
I don’t need a local version of the planet.osm file. QuickOSM extracts combined with OSM shapefile extracts from download.geofabrik.de work fine.
However OSM data is used, it needs to be saved to the project-specific gpkg.
Process for managing data
If I open a project I haven’t used for some time, I want to minimize source files. Currently, this is best done in a geopackage (gpkg), a file database that I can also save styles to. There would be one geopackage for the project and one geopackage with commonly used data.
Using a project Geopackage (gpkg).
I created a geopackage with layers and styles for a project. As I set up the map, I saved imported data to the geopackage, whether it was data specific to the map or base data. I also saved styles to the geopackage and set them to load by default.
The resulting geopackage was just over 1 gigabyte in size and zipped down to about 640 MB. It contained the following national data sets from Census, to which I added a couple of fields (color and population): ZIP, County, and State. To this, I added the project location, original tables, and qml styles.
I needed to share this file, but sharing a gigabyte file with someone just seemed too much. I created some isolines, some point data, and revised some of the original data. I saved the new layers as geojson files but then saved them into my original geopackage.
So, the layers were essentially these:
- County subdivisions (for two states)
- US States
- US Counties
- Target market area counties
- US ZIP with the origin, ACS, and computation fields
- American Community Survey (ACS) Population by ZIP
- Project site location
- 5 years of origin data from Excel
- World Countries (area)
- World Countries (centroids)
- 10 Styles saved into the database
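Because a geopackage is a SQLite file, an inventory like the one above can be checked without opening QGIS. The following is a minimal sketch using Python’s built-in `sqlite3` module. It reads the `gpkg_contents` table that the GeoPackage standard defines, and it assumes that QGIS keeps saved styles in a table named `layer_styles`. The function names and the file name `project.gpkg` are hypothetical.

```python
import sqlite3
from contextlib import closing


def list_gpkg_layers(path):
    """Return (table_name, data_type) rows from gpkg_contents."""
    with closing(sqlite3.connect(path)) as conn:
        return conn.execute(
            "SELECT table_name, data_type FROM gpkg_contents ORDER BY table_name"
        ).fetchall()


def count_saved_styles(path):
    """Count rows in layer_styles, or return 0 if the table is missing.

    ASSUMPTION: QGIS stores styles saved to the database in a table
    called layer_styles.
    """
    with closing(sqlite3.connect(path)) as conn:
        has_table = conn.execute(
            "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = 'layer_styles'"
        ).fetchone()
        if not has_table:
            return 0
        return conn.execute("SELECT COUNT(*) FROM layer_styles").fetchone()[0]


# Usage, with a hypothetical file name:
#   for name, kind in list_gpkg_layers("project.gpkg"):
#       print(name, kind)
#   print(count_saved_styles("project.gpkg"), "styles saved")
```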
Pros & Cons of a big geopackage
The benefit is that I have two files and a folder of original data received that make up the entire project. If I do a similar project, I can start with this one and modify it. The con is that the project has much more data than is needed.
Where I’m heading is to have a set of QGIS templates that use geopackages specific to project type. The geopackages will contain lots of styling by default. I expect to use the geopackages more than layer definition files (data and styles) just because I want to minimize the number of files I have.
When I am finished with a project, I can either zip it or just store it in the cloud, off my hard drive.
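For the zip option, here is a minimal sketch using Python’s standard `zipfile` module. The function name and the folder and archive names in the usage comment are hypothetical, and how much a given project shrinks will depend on the data inside it.

```python
import zipfile
from pathlib import Path


def zip_project(project_dir, archive_path):
    """Write every file under project_dir into a compressed zip archive.

    Hypothetical helper for illustration. Paths inside the archive are
    stored relative to project_dir.
    """
    project_dir = Path(project_dir)
    archive_path = Path(archive_path)
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for file in sorted(project_dir.rglob("*")):
            # Skip folders, and skip the archive itself if it sits inside the project.
            if file.is_file() and file.resolve() != archive_path.resolve():
                zf.write(file, file.relative_to(project_dir))
    return archive_path


# Usage, with hypothetical names:
#   zip_project("my_project", "my_project.zip")
```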