CARTO's Use of Foreign Data Wrappers
PostgreSQL is packed full of under-appreciated features (pg_trgm full-text search logical replication ranges to name a few) but I'd like to highlight one feature that we use under the covers in CARTO more than you might expect: foreign data wrappers.
Foreign data wrappers (FDW for short) allow developers to expose external sources of data as tables inside PostgreSQL. For data sources that are already tabular FDW just looks like a mirror of a remote table (Oracle FDW MySQL FDW etc) where local queries are magically fulfilled by the remote server. For other data sources (Twitter FDW passwd FDW random number FDW etc) the FDW developer maps the external source into a tabular shape where queries can be evaluated against the source and still returned as a PostgreSQL tuple set.
CARTO's use of FDW is pretty conventional:
- We use postgres_fdw to link data in remote databases to users Carto databases and
- We use odbc_fdw to expose data in customer's operational databases to their CARTO instances.
In both cases the FDW connection allows us to provide data access without incurring the cost and complication of replicating that actual data from one database to another.
Push Queries to Remote Servers Faster with postgres_fdw
The native postgres_fdw ships with PostgreSQL and is naturally the most full-featured FDW implementation -- there is an almost perfect mapping of features from one instance of PostgreSQL to another so the FDW can be very clever about pushing queries to the remote server.
During development when we joined local spatial data with remote data we found that PostGIS types and functions weren't being sent to the remote server. That rendered remote data access impossibly slow.
To achieve our ends we had to extend postgres_fdw to support the PostGIS geometry types and functions so that spatial queries could be sent to the remote server.
In order to get our enhancements accepted into the PostgreSQL community we then generalized them to support any PostgreSQL extended type.
When creating a foreign server definition with spatial data just add an extensions option to let FDW know that it is safe to ship PostGIS objects and functions to the remote server.
With the "extension shipping" it's possible to make use of any custom type or function over a postgres_fdw connection as long as it is a part of an extension that is installed in both the local and remote databases.
Schedule Live Data Syncs with odbc_fdw
CARTO Database Connectors allow users to connect their CARTO instances directly to remote databases with queries that can be refreshed on a schedule. For folks with live data they want to feed into analysis and display the database connectors can be very useful!
Under the covers the connectors are setting up and managing odbc_fdw connections between the CARTO instance and the remote databases. By using the generic ODBC driver as the base for our connectors we can support connections to
- MySQL
- PostgreSQL
- SQL Server and
- Apache Hive.
In order to keep up with new PostgreSQL releases as we upgrade our infrastructure to the latest version we have taken on the maintenance of the odbc_fdw.