Jun 5, 2025

Miguel Álvarez

and

CARTO Contributor

Foundation Models: Transforming the Future of Spatial Analytics

Spatial Data Science

The rapid growth of foundation models in machine learning has been transforming the way we approach natural language, images, and even scientific research. Now, these models are making their way into the spatial domain. One of the most promising examples is Google's recently released Population Dynamics Foundation Model (PDFM), which aims to represent how people interact with places over time by learning from vast, multimodal datasets.

Traditional models are typically built for specific tasks using a limited set of data - for example, predicting foot traffic at a store using only census data or recent transactions. In contrast, foundation models like PDFM are trained on massive, diverse datasets - satellite imagery, mobile data, POIs, and more - to learn a deep, general-purpose understanding of how people interact with places. The result is a set of rich embeddings that can be reused across many different applications, from disaster response to infrastructure planning. This shift is like moving from a highly specialized, one-use-case tool to a universal toolkit for spatial analysis.

Why these models matter

The promise of population dynamics models lies in their ability to capture complex patterns of human movement and behavior using data from satellite imagery, mobile devices, POIs (Points of Interest), and more. While traditional approaches have relied on static data or manual aggregation, foundation models enable continuous, automated, and adaptable forecasts at unprecedented scales. Here are just a few ways these models can make an impact:

Disaster Response: Quickly estimate population impacts in areas affected by earthquakes, floods, or wildfires, even in the absence of up-to-date census data.
Urban and Regional Planning: Anticipate where infrastructure, housing, or services will be needed based on predicted population shifts.
Telecommunications and Retail: Optimize network expansion or store locations by anticipating changes in where customers live, work, and travel.
Epidemiology: Model the spread of diseases by understanding fine-grained human movement and aggregation.
Insurance & Risk Modeling: Assess exposure to risks by linking population density shifts to climate or economic scenarios.

Validating Google’s Population Dynamics Foundation Model with CARTO

At CARTO, we’re really excited about what this new class of models could mean for our industry and our users. We are actively exploring their applications, running experiments, comparing them to our own datasets, and iterating on how they might enhance spatial decision-making. But the first step for us was to see how they stack up against the traditional ways we’ve been doing geospatial analytics for years.

To put Google’s Population Dynamics Foundation Model to the test, we ran an internal experiment using CARTO Workflows - our low-code tool for automating spatial analysis pipelines. Our goal? To predict one of CARTO’s own flagship spatial features: the Human Activity Index, a proprietary composite index that quantifies human activity on Earth by combining data on population, POIs, nighttime lights, and telecom cell towers.

This index is commonly used by our customers to support decisions in retail site selection, infrastructure planning, and mobility analysis, as it provides a high-resolution, globally consistent view of where and when people are active.

We did this by constructing the following workflow:

Figure: The full workflow in CARTO Workflows (screenshot)

This analysis consists of the following steps:

Data preparation: Each ZIP Code Tabulation Area (ZCTA) centroid was assigned to its corresponding H3 resolution 8 cell to enable a join with the Human Activity Index from CARTO Spatial Features.
Feature selection: We used the PDFM embeddings (a total of 330 features) as predictive variables.
Train-test split: The dataset was randomly split into training (70%) and testing (30%) subsets to evaluate model generalizability.
Model training & evaluation: Using the CARTO Workflows extension for BigQuery ML, we trained several regressors (see below). A random forest regressor delivered the strongest performance.
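
As a rough illustration of the train-test split step above, here is a minimal Python sketch using only the standard library. It is not the CARTO Workflows or BigQuery ML implementation: the function name `train_test_split`, the fixed seed, and the synthetic rows (a cell id, a 330-value embedding, and a target value) are all assumptions made for the example.

```python
import random


def train_test_split(rows, test_fraction=0.3, seed=42):
    """Randomly partition rows into (train, test) lists.

    Hypothetical helper for illustration; the seed only makes the
    split reproducible and is not taken from the original analysis.
    """
    rng = random.Random(seed)
    shuffled = list(rows)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Synthetic stand-in data: (cell_id, embedding with 330 features, target).
rows = [(f"cell_{i}", [0.0] * 330, float(i)) for i in range(1000)]

train, test = train_test_split(rows)
print(len(train), len(test))
```

Holding out 30% of the rows before training is what allows the test score to be read as performance on unseen areas.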

The outcome? An R² of 0.922 on the training set and 0.882 on the test set. This means that a model built only on Google’s PDFM embeddings can explain about 88% of the variation in human activity as measured by CARTO’s index, even on data it hasn't seen before. The modest gap between training and test scores suggests limited overfitting. This level of accuracy is a strong signal that the embeddings have captured real-world dynamics in a meaningful way. The following map illustrates the predicted values from Google’s PDFM embeddings compared with CARTO’s Human Activity Index.
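
For readers less familiar with the metric, the sketch below shows how R² (the coefficient of determination) is commonly computed: one minus the ratio of the residual sum of squares to the total sum of squares. We assume this standard definition matches the reported scores; the function name `r_squared` and the toy values are illustrative and not taken from the experiment.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes y_true is not constant (otherwise SS_tot would be zero).
    """
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values for illustration only.
observed = [1.0, 2.0, 3.0]
perfect = r_squared(observed, [1.0, 2.0, 3.0])    # predictions match exactly
mean_only = r_squared(observed, [2.0, 2.0, 2.0])  # always predict the mean
print(perfect, mean_only)
```

A score of 1 means the predictions reproduce the observed values exactly, while a score of 0 means the model does no better than always predicting the mean.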

These results show that Google’s foundation model embeddings and CARTO’s Human Activity Index capture some of the same underlying patterns, despite being built from very different data sources and methodologies.

The PDFM was trained on massive, diverse, and often unstructured global datasets, while CARTO’s index is a curated, domain-driven product built from carefully selected spatial indicators. The fact that their outputs strongly correlate speaks to the flexibility and power of Google’s foundation model: it has learned to generalize patterns of human activity in a way that aligns with expert-engineered indicators.

This provides strong mutual validation: for CARTO, in confirming the strength of our spatial indicators, and for Google, in showcasing the real-world relevance of PDFM embeddings across geographies and domains.

The way forward: geospatial foundation models will change the way we do spatial analytics

Current models like Google's PDFM already demonstrate impressive capabilities in capturing patterns of human activity across time and space. At CARTO, we see these developments as a major step forward for the geospatial industry, and we’re focused on helping push them even further. We are actively working on advancing these models in key areas that we believe will make them even more powerful and useful. For example:

Moving Beyond Administrative Boundaries: Google’s PDFM embeddings are aggregated by ZCTA or county, units which vary in size across geographies and may not reflect “natural” patterns. Using regular Spatial Indexes like H3 or Quadkeys allows for more consistent global modeling.
Smart Data Selection: Foundation models are data-hungry, but more data isn’t always better. Careful curation of signal variables (mobility, infrastructure, land use, and real-world events) can make models both more interpretable and more robust.
Fine-Tuning for Local Use Cases: While foundation models are great for global consistency, fine-tuning for specific regions (urban vs rural, developing vs developed) or use cases (disaster response, logistics, or urban planning) will drive even more accurate, actionable insights.
Transparent & Explainable AI: Making sure that the results from these complex models can actually be used to support real decisions is still a key challenge.
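
To make the idea of a regular spatial index more concrete, the sketch below assigns a latitude/longitude point to a quadkey cell using the widely used Web Mercator tile scheme. It is an illustration only: the function names are hypothetical, and we assume, rather than claim, that this encoding corresponds to the Quadkeys mentioned above. H3 follows a different, hexagonal scheme that is not shown here.

```python
import math


def tile_xy_to_quadkey(tile_x, tile_y, level):
    """Interleave the bits of the tile x/y coordinates into a base-4 string."""
    digits = []
    for i in range(level, 0, -1):
        mask = 1 << (i - 1)
        digit = 0
        if tile_x & mask:
            digit += 1
        if tile_y & mask:
            digit += 2
        digits.append(str(digit))
    return "".join(digits)


def latlon_to_quadkey(lat, lon, level):
    """Map a WGS84 point to the quadkey of its Web Mercator tile."""
    # Clip to the latitude range covered by the Web Mercator projection.
    lat = max(min(lat, 85.05112878), -85.05112878)
    lon = max(min(lon, 180.0), -180.0)
    x = (lon + 180.0) / 360.0
    sin_lat = math.sin(math.radians(lat))
    y = 0.5 - math.log((1 + sin_lat) / (1 - sin_lat)) / (4 * math.pi)
    n = 1 << level  # number of tiles per axis at this level
    tile_x = min(max(int(x * n), 0), n - 1)
    tile_y = min(max(int(y * n), 0), n - 1)
    return tile_xy_to_quadkey(tile_x, tile_y, level)


# Example point (central Madrid, chosen arbitrarily) at two zoom levels.
coarse = latlon_to_quadkey(40.4168, -3.7038, 10)
fine = latlon_to_quadkey(40.4168, -3.7038, 12)
print(coarse, fine)
```

Because each finer cell's key starts with the key of its parent cell, values can be aggregated up or down the hierarchy with simple string operations, which is one practical reason regular grids are convenient for consistent modeling across regions.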

Bringing next-gen population modeling to everyone - with CARTO Workflows

At CARTO, we are committed to bringing these powerful new capabilities to our users as quickly and as seamlessly as possible. That's why we're making population dynamics foundation models available directly within CARTO Workflows and through SQL-based components. Our goal is to make it easy for analysts, planners, and data scientists to start using these models in their everyday projects without needing to be AI experts.

We see even greater opportunities when coupling them with CARTO AI Agents, an approach we recently showcased at the SDSC25 Keynote on the future interface of geography. Combining foundation models with agentic workflows opens up new possibilities that were previously unimaginable.

If you're excited about building the next generation of geospatial foundation models, we’d love to hear from you. CARTO is currently hiring for a senior data scientist position focused on this work. We’re collaborating with some of the top research organizations in the field and have access to the computing power needed to experiment and scale. Join us to help shape the future of spatial AI!

About the author

Miguel is Lead Data Scientist at CARTO, an amazing team working to provide spatial analytics capabilities to our users as well as to develop custom models for our customers using spatial statistics and machine learning.
