As part of our [Data Observatory rollout], CARTO is excited to announce that we are maintaining shoreline-clipped versions of the USA boundaries provided by U.S. Census TIGER (Topologically Integrated Geographic Encoding and Referencing).
TIGER publishes shapefiles of some of the most important boundaries in the U.S., including states, counties, congressional districts, census tracts, and zip code tabulation areas, among others. As TIGER's boundaries are a work of the United States government, they are not under copyright. They can be freely reproduced, reused, and altered. This makes them an especially valuable resource for the democratization of mapmaking: anyone can use them however they want to!
TIGER does a great job producing and maintaining files at an excellent resolution. However, since 2010 they have not published high-resolution files where boundaries follow shorelines. This means that a fledgling mapmaker's first experience with TIGER boundaries is frustrating:
Boundaries extend deep into the oceans and Great Lakes; counties and states reach across major rivers and bodies of water to touch their neighbors. While there are many statistical reasons one would want shapes to do this, they don't work well for maps.
GIS StackExchange abounds with questions about where to get TIGER boundaries that don't ignore the shore. The shoreline-hugging files that the Census does make available are low resolution and thus unsuitable for zoomed-in maps.
Using the same PostGIS that's available to every CARTO user, we've processed every TIGER boundary we make available to be "shoreline clipped". We took water shapes and "clipped" them out of the originals, leaving boundaries that clearly follow the shoreline. Take a look and you'll understand why this is a big deal:
As an added bonus, the water areas we use to clip to the shoreline are TIGER's own AREAWATER shapes. This means we can put the resulting dataset under the same extremely liberal license as regular TIGER data sources.
Shoreline-clipped boundaries aren't easy to come by because they're surprisingly difficult to put together. Fortunately, PostGIS is a powerful tool, and all the transformations can be achieved in SQL:
1. Split the "positive" land shapes, like states or counties, into smaller, simpler components by using ST_Subdivide. Some of these geometries start out quite large -- hundreds of thousands of vertices -- and splitting makes them easier to work with. Geometries with a land area of 0 in TIGER are also eliminated here. Each component is assigned a unique ID.
2. Create a water area layer by importing several thousand AREAWATER shapefiles from TIGER and bringing them into one table, excluding minor water features (lakes and ponds of less than 1/3 km square). This will be our "negative" layer, which we subtract from the positive.
3. Join the water and land areas using ST_Intersects, and union the resulting positive and negative geometries by the ID from step (1). This leaves us with manageably sized positive and negative shapes to subtract from each other. Note that this will exclude any land shapes that don't touch water areas.
4. Subtract the water shapes from the land shapes, leaving clipped versions with the unique IDs from step (1).
5. Add back to these all the land shapes that didn't touch water, which were excluded at step (3).
6. Re-union the shapes according to their Census geoid (a unique identifier for every census boundary). This stitches the shapes back together, as they've been broken apart since step (1). For example, a state could have been broken into hundreds of shapes in step (1) -- this step stitches those components, which have since had water subtracted from them, back into one geometry.
7. Eliminate any water features that have created holes in the original geometry using ST_ExteriorRing. Since this breaks geometries apart into their components, this is also where we eliminate pesky edge artifacts.
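
To make the steps above more concrete, here is a minimal PostGIS sketch of how they might fit together. It is an illustration under stated assumptions, not the exact queries we run: the table names (`tiger_land`, `tiger_water`, and the intermediate tables), the vertex limit passed to ST_Subdivide, the reading of the water threshold as roughly 333,333 square meters of TIGER's AWATER attribute, and the artifact-area cutoff are all hypothetical.

```sql
-- Hypothetical inputs (names are assumptions, not an actual CARTO schema):
--   tiger_land(geoid text, aland numeric, geom geometry)   -- e.g. counties
--   tiger_water(awater numeric, geom geometry)             -- merged AREAWATER files

-- Step 1: drop zero-land geometries, subdivide the rest, give each piece an ID.
CREATE TABLE land_parts AS
SELECT row_number() OVER () AS part_id, geoid, geom
FROM (
  SELECT geoid, ST_Subdivide(geom, 256) AS geom   -- 256 vertices: assumed limit
  FROM tiger_land
  WHERE aland > 0
) AS pieces;
CREATE INDEX ON land_parts USING gist (geom);

-- Step 2: build the "negative" layer, skipping minor water features.
-- AWATER is in square meters; the cutoff below is an assumed reading of 1/3 km square.
CREATE TABLE water AS
SELECT geom FROM tiger_water WHERE awater >= 333333;
CREATE INDEX ON water USING gist (geom);

-- Steps 3 and 4: union the water touching each land piece, then subtract it.
CREATE TABLE clipped_parts AS
WITH water_by_part AS (
  SELECT l.part_id, ST_Union(w.geom) AS geom
  FROM land_parts l
  JOIN water w ON ST_Intersects(l.geom, w.geom)
  GROUP BY l.part_id
)
SELECT l.part_id, l.geoid, ST_Difference(l.geom, wp.geom) AS geom
FROM land_parts l
JOIN water_by_part wp USING (part_id);

-- Step 5: add back the land pieces that never touched water.
INSERT INTO clipped_parts (part_id, geoid, geom)
SELECT l.part_id, l.geoid, l.geom
FROM land_parts l
WHERE NOT EXISTS (
  SELECT 1 FROM clipped_parts c WHERE c.part_id = l.part_id
);

-- Step 6: stitch the pieces back together by Census geoid.
CREATE TABLE clipped AS
SELECT geoid, ST_Union(geom) AS geom
FROM clipped_parts
WHERE NOT ST_IsEmpty(geom)          -- pieces that were entirely water
GROUP BY geoid;

-- Step 7: break each result into single polygons, keep only the exterior
-- ring (which drops interior holes), discard tiny artifacts, and recombine.
CREATE TABLE shoreline_clipped AS
SELECT geoid,
       ST_Union(ST_MakePolygon(ST_ExteriorRing(part))) AS geom
FROM (
  SELECT geoid, (ST_Dump(geom)).geom AS part
  FROM clipped
) AS dumped
WHERE GeometryType(part) = 'POLYGON'
  AND ST_Area(part) > 1e-9          -- assumed cutoff, in squared SRID units
GROUP BY geoid;
```

Subtracting water per subdivided piece, rather than per whole state or county, keeps each ST_Difference call small; the spatial indexes on both layers let the ST_Intersects join avoid comparing every land piece against every water shape.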
Happy data mapping!