and

CARTO Contributor

and

MVT generation: Mapnik vs PostGIS

Core Tech

mins read

With the last release of PostGIS a new set of functions were added to generate Mapbox vector tiles (MVT). The most important of these functions is ST_AsMVT which allows users to create a full tile from a SQL query. Björn Harrtell completed the majority of the project with partial sponsorship from CARTO.

You might be wondering "What's an MVT"? A map is usually created with a set of small images called tiles that are rendered together to create a mosaic. MVT's instead of using small images are created using the data that is stored in those tiles. That allows for rendering of maps client-side with WebGL (BTW we are hiring someone who loves rendering stuff on the client or any other render technology.

At CARTO we use Mapnik to render those tiles using a combination of Postgres & PostGIS. We wondered if rendering those tiles straight from Postgres/PostGIS would be faster.

The benchmark

First of all I created a benchmark took it as a baseline and then compared PostGIS'sST_AsMVT against Mapnik's MVT renderer.

The benchmark was to generate a tile with a lot ofpoints and then check how long it takes to render.

So let's get our hands dirty and create a tile with 100K points:

CREATE TABLE test_100k_points AS
  SELECT (ST_Dump(points)).geom AS the_geom FROM ST_GeneratePoints(cartodb.CDB_XYZ_Extent(0  0  16)  100000) as points;

SELECT CDB_CartodbfyTable('test_100k_points');

Mapnik

Let's now request the tile with 100K points using our Maps API:

$ curl -v -o /dev/null \
  "http://development.localhost.lan:8181/api/v1/map/${layergroup_id}/1/16/0/0.mvt"
#...
< X-Tiler-Profiler: {"setDBAuth":2 "res":221 "getTileOrGrid":221 "render-mvt":221 "total":223}
#...

Out of 223 ms what portion of that time is spent in DBquerying features versus encoding them into the MVT?

Let's instrument mapnik in order to help answer that question.

First we have to locate the interesting pieces of code:

here's the datasource that queries the database
and here's the processor which encodes vector tiles

Using systemtap we can hook into those functions to getan idea of how much time is spent retrieving features versus encodingthem into an MVT. Here's a simple script to do that:

# File mapnik.stp
# based on https://sourceware.org/systemtap/examples/profiling/fntimes.stp

function time() { return gettimeofday_us() }

function print_time(t)
{
    printf("function %s took %d us\n"  ppfunc()  t)
}

probe process("node_modules/mapnik/lib/binding/node-v48-linux-x64/mapnik/input/postgis.input").function("*get_resultset*").return
{
    print_time(time() - @entry(time()));
}

probe process("node_modules/mapnik/lib/binding/node-v48-linux-x64/mapnik.node").function("*update_tile*").return
{
    print_time(time() - @entry(time()));
}

then we can request the same tile and gather execution times:

$ sudo stap mapnik.stp
function _ZNK18postgis_datasource13get_resultsetERSt10shared_ptrI10ConnectionERKSsRKS0_IN6mapnik4PoolIS1_17ConnectionCreatorEEES0_INS6_17IProcessorContextEE took 93423 us
function _ZN6mapnik16vector_tile_impl9processor11update_tileERNS0_4tileEdii took 216795 us

This tells us that the update_tile function takes about 217 ms ofwhich 93 ms are spent retrieving features from the database.

If we execute the query used to retrieve features directly from the database weget the following numbers:

EXPLAIN ANALYZE
  SELECT
    ST_AsBinary(ST_Simplify(ST_SnapToGrid("the_geom_webmercator" 0.0597164) 0.119433))
      AS geom "cartodb_id"
    FROM (SELECT * FROM test_100k_points) as cdbq
    WHERE "the_geom_webmercator" &&
      ST_SetSRID('BOX3D(-20037508.3
      20036743.97250642 -20036743.97250639 20037508.3)'::box3d  3857);
                                                                                                                                                             QUERY PLAN                                                                                                                                                             
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on test_100k_points  (cost=0.00..5482.66 rows=99922 width=40) (actual time=0.015..63.357 rows=100000 loops=1)
   Filter: (the_geom_webmercator && '01030000A0110F00000100000005000000CDCCCC44F81B73C1E4628F7FC81B73410000000000000000CDCCCC44F81B73C1CDCCCC44F81B73410000000000000000DC628F7FC81B73C1CDCCCC44F81B73410000000000000000DC628F7FC81B73C1E4628F7FC81B73410000000000000000CDCCCC44F81B73C1E4628F7FC81B73410000000000000000'::geometry)
 Planning time: 0.735 ms
 Execution time: 66.432 ms
(4 rows)

that is significantly faster than the execution times reported inthe database log:

2017-08-09 18:18:05.420 CEST [7662] LOG:  duration: 98.150 ms  execute <unnamed>: SELECT ST_AsBinary(ST_Simplify(ST_SnapToGrid("the_geom_webmercator" 0.0597164)  0.119433)) AS geom "cartodb_id" FROM (SELECT * FROM test_100k_points) as cdbq WHERE "the_geom_webmercator" && ST_SetSRID('BOX3D(-20037508.3 20036896.84656299 -20036896.84656296 20037508.3)'::box3d  3857)

This is due to the mapnik postgis input plugin iteratingon the result set from postgres WKB format and transforming it to themapnik internal format.

Don't freak out at those magic numbers in the ST_Simplify call they are calculated using the tile zoom to simplify the geometry to a resolution enough to be rendered correctly but without sending too many vertices to the rendering process.

So roughly speaking out of those 223 ms:

93 ms are spent retrieving data and transforming it
124 ms are spent on the actual encoding of the tile
the rest (around 6 ms) is overhead because of authorization datatransfer etc.

ST_AsMVT

What if we made a similar query to spit the MVT straight from the database?

EXPLAIN ANALYZE
SELECT ST_AsMVT('test'  4096  'geom'  q)
  FROM (SELECT ST_Simplify(ST_SnapToGrid("the_geom_webmercator" 0.0597164)  0.119433) AS geom "cartodb_id" FROM (SELECT * FROM test_100k_points) as cdbq WHERE "the_geom_webmercator" && ST_SetSRID('BOX3D(-20037508.3 20036743.97250642 -20036743.97250639 20037508.3)'::box3d  3857)) as q;
                                                                                                                                                                QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
------------
 Aggregate  (cost=3234.42..3234.43 rows=1 width=32) (actual time=185.818..185.818 rows=1 loops=1)
   ->  Seq Scan on test_100k_points  (cost=0.00..2485.00 rows=99922 width=40) (actual time=0.039..26.918 rows=100000 loops=1)
         Filter: (the_geom_webmercator && '01030000A0110F00000100000005000000CDCCCC44F81B73C1E4628F7FC81B73410000000000000000CDCCCC44F81B73C1CDCCCC44F81B73410000000000000000DC628F7FC81B73C1CDCCCC44F81B73410000000000000000DC628F7FC81B73C1E4628F7FC81B73410000000000000000CDCCCC44F81B73C1E4628F7FC81B73410000000000000000'
::geometry)
 Planning time: 0.865 ms
 Execution time: 187.031 ms
(5 rows)

Note we removed the ST_AsBinary. It is not required by ST_AsMVT as it works with PostGIS native geometry types (saving some time as well).

We may also consider removing the ST_Simplify and ST_SnapToGrid as they aim to facilitate raster rendering and it's done by the MVT encoding mechanism. On the other hand the equivalent query also requires a transformation from webmercator to the coordinate space of an MVT which is achieved through ST_AsMVTGeom. The final query looks like this:

EXPLAIN ANALYZE
  SELECT ST_AsMVT('dbe9fc29-ef99-4a4f-afc2-0f652dc9985b'  4096 
  'geom'  q)
  FROM (SELECT cartodb_id  ST_AsMVTGeom(qbounds.the_geom_webmercator 
    ST_MakeBox2D(ST_Point(-20037508.5  20037508.5) 
    ST_Point(-20036897.00376892  20036897.00376892))  4096  0 false)
    AS geom
        FROM (SELECT * FROM test_100k_points WHERE
            "the_geom_webmercator" && ST_MakeEnvelope(-20037508.5 
            20037508.5  -20036897.00376892  20036897.00376892  3857))
    as qbounds
  ) AS q;
                                                                                                                        QUERY PLAN                                                                                                                        
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=2983.48..2983.49 rows=1 width=32) (actual time=328.254..328.255 rows=1 loops=1)
   ->  Seq Scan on test_100k_points  (cost=0.00..2485.00 rows=99696 width=40) (actual time=0.010..29.480 rows=100000 loops=1)
         Filter: (the_geom_webmercator && '0103000020110F0000010000000500000000000048F81B73C100000048F81B734100000048F81B73C100700F10D21B734100700F10D21B73C100700F10D21B734100700F10D21B73C100000048F81B734100000048F81B73C100000048F81B7341'::geometry)
 Planning time: 0.825 ms
 Execution time: 329.123 ms
(5 rows)

The execution time is significantly worse than in the previous case. Why is that?

In order to answer that question we need to look into how ST_AsMVT and ST_AsMVTGeom are implemented:

ST_AsMVT is actually anaggregate.Here isan interesting discussion about that design choice for theinterface. As mentioned in that thread there are some advantages tohaving the data first before encoding. However nothingprevents you from concatenating MVT's (e.g: from different layers)into a single tile.
ST_AsMVTGeom is meant to transform an arbitrary geometry into thecoordinate space of an MVT. It is kind of a requirement ofST_AsMVT to have the input geometries in such coordinatespace. Basically it performs an affine transformation from onecoordinate space to another plus some additional checks.

There's an excellent tutorial on how to profile postgreshere.

Using those tools we can profile our query and get an overviewof where the time is spent. Take a look at this kcachegrind screenshot:

By looking at that graph and the code you can figure out that a lot ofthe time is spent copying checking and passing geometriesaround. Here's a PR thatoptimizes some of those paths for the points case.

Here are the results after applying those optimizations:

EXPLAIN ANALYZE
  SELECT ST_AsMVT('dbe9fc29-ef99-4a4f-afc2-0f652dc9985b'  4096 
  'geom'  q)
  FROM (SELECT cartodb_id  ST_AsMVTGeom(qbounds.the_geom_webmercator 
    ST_MakeBox2D(ST_Point(-20037508.5  20037508.5) 
    ST_Point(-20036897.00376892  20036897.00376892))  4096  0 false)
    AS geom
        FROM (SELECT * FROM test_100k_points WHERE
            "the_geom_webmercator" && ST_MakeEnvelope(-20037508.5 
            20037508.5  -20036897.00376892  20036897.00376892  3857))
    as qbounds
  ) AS q;
                                                                                                                        QUERY PLAN                                                                                                                        
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=2983.48..2983.49 rows=1 width=32) (actual time=197.396..197.396 rows=1 loops=1)
   ->  Seq Scan on test_100k_points  (cost=0.00..2485.00 rows=99696 width=40) (actual time=0.011..29.509 rows=100000 loops=1)
         Filter: (the_geom_webmercator && '0103000020110F0000010000000500000000000048F81B73C100000048F81B734100000048F81B73C100700F10D21B734100700F10D21B73C100700F10D21B734100700F10D21B73C100000048F81B734100000048F81B73C100000048F81B7341'::geometry)
 Planning time: 0.887 ms
 Execution time: 198.373 ms
(5 rows)

To sum up: around 200 ms for data retrieval and encoding of anMVT. That is about 10% faster than mapnik.

Putting all the pieces together

You can find an implementation of a Windshaft renderer (the main piece of ourMaps API) based on ST_AsMVT in the 1000x branch.

Let's run our little benchmark again using this new renderer:

$ curl -v -o /dev/null "http://development.localhost.lan:8181/api/v1/map/${layergroup_id}/1/16/0/0.mvt"
#...
< X-Tiler-Profiler: {"setDBAuth":2 "req2params":1 "res":272 "getTileOrGrid":272 "render-mvt":271 "query":271 "total":275}
#...

Ouch! The bulk of the job is meant to be done in the database (takingabout 200 ms). I expected it to have some overhead on the node sideof things but not that much. How come?

We can alsoprofile the node serverto get some insights about where the time is being spent:

What you can see is some initial setup time then the querybeing executed in the database and outside node (in grey taking about210 ms in this very instance) and then node copying parsing andprocessing the results of the query.

It is worth noting that despite the node server is able to performnon-blocking IO it has to wait for the query results and cannotbegin streaming the encoded tile until the query returns.

Actually if we just request a tile with 10k features instead the time goes down to about 25 ms on my machine which is on par withmapnik and that after all makes perfect sense.

Conclusions

Most of the above has been centered around executionspeed. This was my main focus at the time of writing the code andexploring the possibilities.

It is true that generating the MVT's straight in the database ispotentially faster as it can avoid data transfer conversion betweeninternal representations of the features to encode and soon. It also opens up the possibility of offloading the encoding operation(which is CPU-intensive) from the tiler to the database. Depending onyour set up this may be (or may be not) a good idea.

We also found out that MVT's are good for rendering client side butnot so good when you want to render tiles with a lot of features.

However it is not all about speed. More importantly having MVT's inthe database as a building block lets you build the SQL on top which gives you a lot of flexibility.

I hope you enjoyed reading this post as much as I enjoyed writingit. If that is the case you might also be interested injoining us. Thanks!

Don’t forget to share this post on Twitter, Facebook and Linkedin!

About the author

More Posts from

About the author

Provided by our community, industry experts, or the CARTO Team, these blog posts cover the entire spectrum of spatial analysis. From location intelligence to GIS, spatial data science, industry trends, and much more, we’ve crafted relevant content to accompany you at every stage of your journey, whether you have a technical or business background. With our Blog, you are one step closer to taking spatial analysis to the next level.

About the author

About the author

About the author

About the author

About the author