Democratizing Spatial Analysis with Raster Data on the Cloud

Summary

Discover CARTO's vision for democratizing spatial analysis by making raster data accessible on the cloud & learn about upcoming initiatives.


At CARTO, we believe that Spatial Analysis should be democratized and accessible to all. Our strategy to make this happen is to set geospatial data free, breaking the silo it has been living in. We know the power of geospatial data can be truly unlocked when it is integrated into the broader modern analytics stack, allowing it to benefit from a rich ecosystem of services, tools, and programming interfaces.

We've seen Spatial Analysis become more widely accessible and integrated thanks to developments across open-source packages, open standard data formats, cloud-native connectivity, and intuitive low-code interfaces. However, because no single platform answers every question or solves every problem, interoperability remains one of the most crucial hurdles in building a truly open geospatial industry.

We have already taken some steps to move towards our vision, such as driving forward the GeoParquet initiative, which aims to democratize vector data. Now, we are extending this approach to raster data.

A screenshot of a map showing gridded data clipped to roofs
Using raster to visualise the annual amount of solar energy on roofs, via the Google Maps Platform Solar API

Why is Raster Data so Difficult to Work with?

Size

Imagine working with large datasets like climate data, weather data, flood regions, soil data, or harvest data. Depending on the region you want to cover and the resolution of the grid, a single dataset can contain millions of pixels. Raster data is notoriously challenging because of its size and the need for specialized software to analyze and visualize it.
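To get a feel for how quickly pixel counts grow, here is a rough back-of-the-envelope sketch. The function name, the 100 km region, the 10 m resolution, and the 4 bytes per pixel are all illustrative assumptions, not figures from CARTO.

```python
import math

def estimate_raster_size(width_km, height_km, resolution_m, bands=1, bytes_per_pixel=4):
    """Return (pixel_count, uncompressed_bytes) for a rectangular raster.

    Hypothetical helper: assumes square pixels and no compression.
    """
    cols = math.ceil(width_km * 1000 / resolution_m)
    rows = math.ceil(height_km * 1000 / resolution_m)
    pixels = cols * rows
    return pixels, pixels * bands * bytes_per_pixel

# A 100 km x 100 km region at 10 m resolution, one 32-bit band.
pixels, size_bytes = estimate_raster_size(100, 100, 10)
print(f"{pixels:,} pixels, about {size_bytes / 1e6:,.0f} MB uncompressed")
# prints: 100,000,000 pixels, about 400 MB uncompressed
```

Halving the pixel size quadruples the pixel count, which is why national or global rasters become hard to handle on a single machine.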

Skills Paywall

Unlike commonly used vector data, raster data sits behind a “skills paywall”. Professionals in the data science & analytics industry often lack training in handling raster formats, and obtaining GIS certifications or mastering advanced frameworks like GDAL can be prohibitively time-consuming. This skills gap creates a barrier to accessing valuable insights hidden in raster data.

Common Misconception

The data science industry is divided into two communities: those aware of raster data and those who are "raster-blind." But why? The common misconception is that vector data alone is enough to perform successful spatial analysis. However, ignoring raster data can lead to incomplete analyses. This data format can provide extra spatial context for environmental monitoring, urban planning, and catastrophe modeling.

A map showing buildings overlaid with a gridded flood risk layer
Analysing building-level flood risk with raster data

CARTO is breaking this GIS data silo by advocating for better integration of vector and raster data. We aim to make raster data as accessible and easy to use as vector data by adopting open standards like Parquet for raster data. Stay tuned: we will be launching our work in this direction soon.

CARTO's Vision for Raster Data

Our vision is not to create a new format but to scale the potential of raster data through open standards and cloud-native solutions. By making raster data as accessible as vector data, we aim to democratize and simplify spatial analysis.

Principles Guiding Our Vision

Raster data should:

  • Be based on open standards
  • Be accessible in a way that is cloud native
  • Work seamlessly across all major data warehouses and cloud providers
  • Be efficient and cost-effective to use
  • Be compatible with distributed computing

The Role of the Cloud

We are betting on the cloud to integrate geospatial data (including raster data) within the modern tech stack. But why the cloud? Over the past decade, cloud computing has become the standard for building and delivering data systems and platforms. While on-premise has not fully disappeared from the scene, cloud computing is the clear winner of the race, as it allows users to embrace the open data ethos by either integrating or building on open-source components.

In fact, CARTO’s cloud-native approach is enabling hundreds of companies to benefit from cloud capabilities (scalability, speed, and security) with our platform. This approach is critical for democratizing the geospatial domain and unlocking the power of location for various data use cases, from analytics to AI.

The Importance of Open Data Standards

Openness is the key to growth and progress. For that reason, we are actively involved in the design and promotion of GeoParquet, a cloud standard for vector-based geospatial data, built on the Parquet format.

Why is Parquet so interesting? It has been around for more than a decade, and its extensibility has made it a foundation for many open-source frameworks and commercially available SaaS and PaaS solutions, including table formats such as Delta Lake and Apache Iceberg.

For raster data, our research points to advances in cloud-native formats and specifications such as Cloud Optimized GeoTIFFs (COGs) and SpatioTemporal Asset Catalogs (STAC). We also see potential in array-based technologies that are not geo-specific, such as Zarr and Xarray. Zarr (or GeoZarr, for geospatial data) provides an efficient chunked storage format, and Xarray offers a robust foundation for working with labeled multi-dimensional arrays.
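To illustrate why chunked storage helps, here is a minimal sketch of the idea in plain Python. It is not Zarr's API or on-disk layout; the function names and the tiny 4 x 6 grid are hypothetical and only show how a reader can fetch one pixel by touching a single chunk.

```python
def split_into_chunks(grid, chunk_rows, chunk_cols):
    """Split a 2D list into chunks keyed by (chunk_row, chunk_col)."""
    chunks = {}
    for r0 in range(0, len(grid), chunk_rows):
        for c0 in range(0, len(grid[0]), chunk_cols):
            key = (r0 // chunk_rows, c0 // chunk_cols)
            chunks[key] = [row[c0:c0 + chunk_cols] for row in grid[r0:r0 + chunk_rows]]
    return chunks

def read_pixel(chunks, row, col, chunk_rows, chunk_cols):
    """Read one pixel by looking up only the chunk that contains it."""
    chunk = chunks[(row // chunk_rows, col // chunk_cols)]
    return chunk[row % chunk_rows][col % chunk_cols]

# A toy 4 x 6 raster whose pixel value encodes its position.
grid = [[r * 6 + c for c in range(6)] for r in range(4)]
chunks = split_into_chunks(grid, 2, 3)

print(sorted(chunks))                   # [(0, 0), (0, 1), (1, 0), (1, 1)]
print(read_pixel(chunks, 3, 4, 2, 3))   # 22, same as grid[3][4]
```

In a real chunked store each chunk would be a separately compressed object in cloud storage, so a query over a small area downloads only the chunks it overlaps.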

To integrate raster data seamlessly into the broader data ecosystem, we propose using Parquet to store GeoZarr data. This concept could evolve into a form of "GeoParquet" or "E-GeoParquet" (extended to support raster data) and ensure that data moved downstream from warehouses into data lakes remains accessible and usable for everyone.
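As a rough illustration of what raster data in a tabular format could look like, the sketch below turns a small grid into one row per block, with the pixel values packed into a binary column. The column names (block_x, block_y, band_1), the block size, and the float32 encoding are assumptions made for this example; the post does not specify a schema, and this is not CARTO's proposed layout.

```python
import struct

def raster_to_rows(grid, block_size):
    """Turn a 2D grid into table-like rows, one row per square block."""
    rows = []
    for y0 in range(0, len(grid), block_size):
        for x0 in range(0, len(grid[0]), block_size):
            # Flatten the block row by row.
            block = [v for line in grid[y0:y0 + block_size]
                     for v in line[x0:x0 + block_size]]
            rows.append({
                "block_x": x0 // block_size,   # hypothetical column names
                "block_y": y0 // block_size,
                "band_1": struct.pack(f"<{len(block)}f", *block),  # little-endian float32
            })
    return rows

def decode_block(row):
    """Unpack the binary band column back into a flat list of floats."""
    count = len(row["band_1"]) // 4  # 4 bytes per float32 value
    return list(struct.unpack(f"<{count}f", row["band_1"]))

grid = [[float(y * 4 + x) for x in range(4)] for y in range(4)]
rows = raster_to_rows(grid, 2)

print(len(rows))              # 4
print(decode_block(rows[0]))  # [0.0, 1.0, 4.0, 5.0]
```

Rows shaped like this are ordinary tabular records, so in principle they could be written to a Parquet file or loaded into a data warehouse table and filtered by block coordinates like any other column.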

What’s next? A Call for Collaboration

Our strategy is designed to instill confidence in organizations as they structure their raster data. We will continue to develop and implement this vision within our raster module, collaborating with others to achieve the democratization of spatial analysis.

Join our webinar "Master Raster Data Analytics with CARTO & Snowflake" happening on June 26th to learn more.

By embracing open standards and cloud-native solutions, we can make the geospatial capabilities of raster data accessible to all. Join us to shape the future of raster data. Share your feedback, collaborate with us, and explore the possibilities that CARTO offers.