
How to Use Snowflake Cortex AI for Geospatial Use Cases

Amine Kriaa 

Solutions Architect

June 18, 2025


Large Language Models (LLMs) are redefining how we interact with data, especially in domains traditionally siloed behind specialized knowledge and technical complexity, like geospatial analysis. By translating natural language into structured spatial queries, LLMs dramatically reduce the technical barrier to extracting location-based insights. 

With platforms like Snowflake Cortex, this capability is now embedded directly into the data warehouse, enabling faster, more accessible geospatial experimentation than ever before. These kinds of capabilities are at the heart of what’s now called GeoAI: combining geospatial intelligence with artificial intelligence.

We can now ask specific questions in natural language, and the model generates the spatial SQL needed to answer them, for example:

  • “What is the distance between each row in this table and a specific location?”
  • “Which customers are within 5 km of a service outage zone?”
  • “Can you clean up this list of POIs and tell me if the coordinates match the actual place and city?”
  • “Can you standardize these addresses in Portugal and include the district, city, and country?”
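For example, the first question above can be handed to Cortex in a single call, with the model returning spatial SQL for us to review. The sketch below is illustrative only; the model name, table, and column names are assumptions:

    -- Illustrative sketch; the model, table, and column names are assumptions.
    SELECT SNOWFLAKE.CORTEX.COMPLETE(
        'mistral-large2',
        'You write Snowflake SQL. Table INSURED_PROPERTIES has columns PROPERTY_ID, '
        || 'LATITUDE, and LONGITUDE. Write a query that returns, for each row, the '
        || 'distance in kilometers to the point (46.8139, -71.2080). Return only the SQL.'
    ) AS generated_sql;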
 

But while LLMs make it easier to explore and prototype with spatial data, they aren’t a replacement for geospatial logic, governance, or production-grade performance. Without careful integration of geospatial functions—distance calculations, spatial indexing, projection handling—LLMs can produce flawed or oversimplified results.

In this article, we walk you through three practical use cases that showcase the power of LLMs for geospatial analytics inside Snowflake:

  1. Insurance Risk Assessment based on proximity to emergency services

  2. Address Standardization to improve insurance data quality and geocoding precision

  3. POI Normalization to clean and contextualize amenities in real estate listings

Use Case #1: Insurance Risk Assessment – Emergency Service Proximity

Business Context

In property insurance, proximity to emergency services like fire stations is a critical factor in underwriting and pricing. The closer a property is to first responders, the lower the perceived risk—and potentially, the premium. Traditionally, calculating these spatial relationships requires GIS tools and spatial SQL expertise, creating friction between business users and actionable insights.

What the LLM Does

With Snowflake Cortex and a small, embedded list of fire station coordinates, we asked an LLM to compute the nearest fire station for each property. Here’s an example:

Prompt:
“Calculate the distance between each insured property and the nearest fire station in Québec City. Only return the distance in kilometers, no other text.”

Here’s the SQL query used along with the results returned by the model:
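In essence, the query embeds each property’s coordinates and the small fire station list into the prompt and calls SNOWFLAKE.CORTEX.COMPLETE once per row. The sketch below illustrates the idea; the table, column names, station coordinates, and model are assumptions:

    -- Illustrative sketch; table, columns, coordinates, and model are assumptions.
    SELECT
        p.property_id,
        p.address,
        SNOWFLAKE.CORTEX.COMPLETE(
            'mistral-large2',
            'Calculate the distance between each insured property and the nearest fire '
            || 'station in Québec City. Only return the distance in kilometers, no other text. '
            || 'Property coordinates: ' || p.latitude || ', ' || p.longitude || '. '
            || 'Fire stations (lat, lon): 46.8139, -71.2080; 46.8295, -71.2434; 46.7800, -71.2750'
        ) AS nearest_fire_station_km   -- the model returns text; validate and cast downstream
    FROM insured_properties AS p;

Because the distance here is estimated by the model rather than computed by a geospatial function, the output should still be validated against a native calculation such as ST_DISTANCE (see the limitations section below).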

Benefits

This approach allows insurers to quickly assess risk based on emergency service proximity—without needing a full GIS stack, and without extracting data outside the warehouse. All analysis happens where the data already lives—ensuring better performance, governance, and scalability.

Use Case #2: Address Standardization in Insurance

Business Context

In the insurance industry, accurate and consistent address data is essential for underwriting, risk assessment, and claims management. Customer-submitted addresses often come in varied formats—with missing postal codes, inconsistent abbreviations, or formatting errors—that can compromise geocoding precision and downstream risk calculations.

Using Snowflake Cortex, we provided a raw list of customer addresses and prompted the LLM to clean and standardize the records:

Prompt:
“Clean and standardize the following list of customer-provided addresses. Ensure proper casing, consistent formatting, and include missing province or postal code information where possible. Use Canadian address formatting conventions. Only return the cleaned and standardized addresses.”

 Here’s the SQL query used along with the results returned by the model:
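The same per-row pattern applies here: each raw address is appended to the prompt and passed to SNOWFLAKE.CORTEX.COMPLETE. The table, column, and model names in the sketch below are assumptions:

    -- Illustrative sketch; table, column, and model names are assumptions.
    SELECT
        raw_address,
        SNOWFLAKE.CORTEX.COMPLETE(
            'mistral-large2',
            'Clean and standardize the following customer-provided address. Ensure proper '
            || 'casing, consistent formatting, and include missing province or postal code '
            || 'information where possible. Use Canadian address formatting conventions. '
            || 'Only return the cleaned and standardized address. Address: ' || raw_address
        ) AS standardized_address
    FROM customer_addresses;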

Benefits

The LLM interpreted the intent, corrected typos, filled in missing location hierarchy levels, and returned a clean, structured output—all within the Snowflake environment. This ensured all addresses conformed to a consistent format, enabling precise geocoding and risk zone mapping.

 

Use Case #3: POI Normalization in Real Estate

Business Context

In real estate analytics, points of interest (POIs)—like parks, grocery stores, cafés, and schools—are key to scoring neighborhoods and evaluating property value. But user-submitted or scraped listing data often contains misspelled, vague, or colloquial POI names that can’t be reliably geocoded or aggregated.

By applying LLM-powered POI normalization, platforms can:

  • Standardize POI names across listings (e.g., “Starbux” → “Starbucks”).
  • Resolve vague or colloquial references (e.g., “the park near downtown,” or an informal name like “Whole Foods” for “Whole Foods Market”) using city and coordinate context.
  • Improve map-based search, amenity scoring, and neighborhood analytics.

What the LLM Does

Using Snowflake Cortex and an input list of incorrectly spelled POI names, we applied a prompt-driven LLM function to automatically clean, standardize, and enrich the data with consistent formatting and missing geographic details.

Prompt:
“Given a POI name, a city and coordinates, return the corrected POI name and the corrected city (if needed). Also, indicate if the coordinates likely match the POI. Only return a JSON with “corrected_poi”, “corrected_city”, and “location_match”.”

 Here’s the SQL query used along with the results returned by the model:
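One way to express this per row is sketched below; the table, column, and model names are assumptions:

    -- Illustrative sketch; table, column, and model names are assumptions.
    SELECT
        poi_name,
        city,
        latitude,
        longitude,
        SNOWFLAKE.CORTEX.COMPLETE(
            'mistral-large2',
            'Given a POI name, a city and coordinates, return the corrected POI name and the '
            || 'corrected city (if needed). Also, indicate if the coordinates likely match the '
            || 'POI. Only return a JSON with "corrected_poi", "corrected_city", "location_match". '
            || 'POI: ' || poi_name || ', City: ' || city
            || ', Coordinates: ' || latitude || ', ' || longitude
        ) AS normalized_poi_json
    FROM listing_pois;

Because the prompt requests JSON, individual fields can then be extracted downstream, for example with TRY_PARSE_JSON(normalized_poi_json):corrected_poi::STRING.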

Benefits

The LLM generated enriched, consistent names, corrected typos, and disambiguated local references based on spatial hints. This approach accelerates POI normalization by eliminating manual processing or the need for external tools. By interpreting the user’s intent and transforming messy input into consistent, geocodable formats, the LLM dramatically improves the quality of the POI data used in downstream analytics.

Limitations of LLMs for Geospatial Uses

While Large Language Models (LLMs) offer remarkable capabilities for prototyping spatial queries and accelerating early-stage analysis, they are not yet dependable for production-grade geospatial workflows without proper GIS integration. Several critical limitations persist:

  • Inaccurate or Oversimplified Results

    LLMs can misinterpret unit conversions, mislabel outputs, or overlook coordinate reference systems. For example, a request to convert distances from miles to kilometers might return the same numeric values with only the unit label changed—failing to apply the correct conversion factor. In spatial calculations, even small numeric errors can lead to large-scale business consequences, especially in fields like insurance and logistics.

  • Lack of Spatial Context and Semantic Understanding

    Unlike GIS systems, which are designed to respect geographic hierarchies (e.g., neighborhoods within cities, cities within provinces), LLMs operate without embedded geographic ontologies. They have no inherent understanding of administrative boundaries, coordinate reference systems (e.g., WGS84 vs. UTM), or topological relationships like adjacency and containment. This makes them prone to generating syntactically correct but semantically flawed spatial queries.

  • Limited Performance Awareness

    The SQL code generated by LLMs is often not optimized for scale. It may neglect key spatial performance practices such as spatial pre-filtering (for example, with ST_DWITHIN), enabling search optimization on GEOGRAPHY columns, filtering early in joins, or minimizing computation over large bounding boxes. This can result in slow-running queries that don’t scale across millions of rows, hindering production use in high-throughput environments.

  • Inability to Validate or Debug Results

    LLMs lack the ability to reason about the validity of the output they generate. They don’t know if their spatial join returned the correct results, or if a buffer radius was applied correctly. It is therefore up to humans to manually examine and test queries, hence the importance of geospatial experts like Korem.

  • Cost Efficiency and Query Optimization

    LLMs can generate working spatial queries, but they don’t always optimize for cost. For example, asking for distances between a table of properties and a list of fire stations might result in an unnecessarily complex or inefficient query. In Snowflake, a simple function like ST_DISTANCE would be more efficient and far less costly (see the sketch after this list). Even well-written prompts can lead to excessive credit usage—query validation and optimization remain essential.
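For comparison, the sketch below shows a native, set-based way to answer the fire station question from Use Case #1: candidate pairs are pre-filtered with ST_DWITHIN, exact distances come from ST_DISTANCE, and the meters-to-kilometers conversion is explicit. Table and column names are again assumptions.

    -- Illustrative sketch; table and column names are assumptions.
    -- Pre-filter candidate pairs with ST_DWITHIN, then keep the minimum ST_DISTANCE.
    SELECT
        p.property_id,
        MIN(ST_DISTANCE(
            ST_MAKEPOINT(p.longitude, p.latitude),
            ST_MAKEPOINT(f.longitude, f.latitude)
        )) / 1000 AS nearest_fire_station_km      -- ST_DISTANCE returns meters
    FROM insured_properties AS p
    JOIN fire_stations AS f
      ON ST_DWITHIN(
           ST_MAKEPOINT(p.longitude, p.latitude),
           ST_MAKEPOINT(f.longitude, f.latitude),
           10000)                                  -- only pairs within 10 km
    GROUP BY p.property_id;

Properties with no station inside the 10 km pre-filter radius simply drop out of the result, so the radius should be chosen to match the business question.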

Key Takeaways

LLMs can accelerate experimentation, reduce time-to-first-insight, and expand geospatial access to non-experts, but they are not a replacement for robust GIS systems, domain knowledge, or validated spatial logic. When integrated carefully within a governed environment like Snowflake, they become a powerful co-pilot (but not an autopilot) for location-based analytics.

For production-grade solutions, especially those involving large datasets or requiring high spatial accuracy, the role of geospatial engines remains essential.

The most effective path forward is a hybrid one: integrating LLMs where they can add the most value (like natural language interfaces, prototyping, and automation) while continuing to rely on GIS platforms and geospatial experts to ensure precision, scalability, and reliability.

As a Snowflake Select Tier Partner, Korem can help organizations unlock the power of this platform by combining our geospatial expertise with Snowflake AI capabilities. Whether you’re looking for advanced geo-analytics, complex modeling based on large-scale data, or seamless integration of location intelligence into your data workflows, we can help you move faster—with confidence.

Contact us today to discuss your project.
