Introduction

Early Release

Welcome to Building Serverless Geospatial Apps on AWS. Geospatial systems are not a natural fit for serverless architectures. Their underlying data models — large rasters, complex geometries, and spatial indexes — clash with the event-driven, stateless patterns that make serverless systems effective. As a result, building scalable geospatial applications has often meant falling back on traditional infrastructure, missing the advantages of serverless approaches.

This guide is an experimental resource developed to address that gap. It collects patterns observed and tested in production, showing how serverless technologies can be applied to geospatial problems in practice. The focus is on reusable, adaptable solutions built with AWS services and open-source tools, offering a foundation for developers who want to combine modern cloud architectures with geospatial workloads.

A central idea is that the choice of data model is as important as the choice of technology. Geographic data can be restructured to align with serverless patterns: large spatial files stored in an object store, tiled or indexed features mapped to key-value stores, or normalised geometries persisted in relational databases. Each model makes different trade-offs, but restructuring the data is what enables scalability. By adapting the data model — rather than the platform — cloud-native patterns can be applied effectively to geospatial problems.

Cloud Native

The guide presents practical architecture patterns for developers building geospatial solutions on AWS. Each pattern includes:

  • A short description
  • When to use it (and trade-offs)
  • An architecture diagram
  • Links to examples and further resources

The patterns presented here are grounded in production experience and refined through developing geospatial architectures at scale. They are also inspired by ideas shared within the FOSS4G community, with references to relevant talks and repositories included throughout. This guide offers a set of practical building blocks for developers to adapt to their own needs.

About

Tomas Holderness is a Geographer and Technologist, and CTO at Addresscloud where he leads product, engineering and data teams building geospatial solutions for insurance.

© Tomas Holderness 2025. All rights reserved.

Serve Tiles

When you need to deliver map tiles for a web or mobile client, there are several serverless approaches on AWS. These patterns trade off simplicity, performance, cost, and security.

Patterns in this section

  • Serve Tiles with S3
  • Serve Tiles with CloudFront
  • Serve Tiles with API Gateway

How to choose

  • Use S3 only for small-scale, public datasets or testing.
  • Add CloudFront for production-scale, public tiles.
  • Use API Gateway when you need to control who can access tiles, or when data residency is critical.

Serve Tiles with S3

TL;DR
Serve static z/x/y map tiles directly from an S3 bucket over HTTP. Best for simple, public, low-traffic use cases.

Tiles with S3 architecture

Description

A web map requests tiles using the standard /z/x/y path structure. S3 responds with the corresponding file, e.g. 10/512/384.png.

This pattern serves static web map tiles (vector or raster in the z/x/y format) directly from an Amazon S3 bucket. S3 provides HTTP access, allowing you to use it as a basic tile server with minimal setup.
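As a rough illustration, the sketch below (Python with boto3; the bucket name and tile directory are placeholders) uploads a local z/x/y pyramid to S3, preserving the directory structure as object keys so a map client can request tiles by path.

    import mimetypes
    from pathlib import Path

    import boto3

    s3 = boto3.client("s3")
    BUCKET = "my-tile-bucket"  # placeholder bucket name

    def upload_tiles(tile_dir: str) -> None:
        """Upload a z/x/y tile pyramid, keeping the path structure as the object key."""
        for tile in Path(tile_dir).rglob("*"):
            if not tile.is_file():
                continue
            key = tile.relative_to(tile_dir).as_posix()  # e.g. "10/512/384.png"
            content_type = mimetypes.guess_type(tile.name)[0] or "application/octet-stream"
            s3.upload_file(
                str(tile),
                BUCKET,
                key,
                ExtraArgs={
                    "ContentType": content_type,
                    "CacheControl": "public, max-age=3600",
                },
            )

    if __name__ == "__main__":
        upload_tiles("./tiles")  # placeholder local directory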

When to Use

  • Prototyping or internal demos
  • Public datasets with light to moderate traffic
  • Regional or low-resolution base maps
  • Situations where simplicity is more important than performance

Trade-offs

  • No caching — every tile request hits S3
  • Higher latency compared to CDN-backed solutions
  • S3 request and data transfer costs add up quickly
  • Not suitable for authenticated access

Alternatives

Resources

Serve Tiles with CloudFront

TL;DR
Add CloudFront in front of your S3 bucket to cache tiles and reduce latency and cost at scale. Good for public tiles with moderate to high traffic.

Tiles with CloudFront architecture

Description

User requests a tile → CloudFront checks its edge cache → if not found, it fetches from S3 → response is cached at the edge for future use.

This pattern extends Public Tiles with S3 by placing a CloudFront distribution in front of the S3 bucket. CloudFront caches tile requests at edge locations, reducing latency and minimizing S3 request charges. This setup is especially useful for production deployments of public tile sets.
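One possible way to wire this up is with the AWS CDK (Python), as sketched below; the construct names are illustrative and the managed cache policy is only a starting point.

    from aws_cdk import (
        Stack,
        aws_cloudfront as cloudfront,
        aws_cloudfront_origins as origins,
        aws_s3 as s3,
    )
    from constructs import Construct

    class TileStack(Stack):
        def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
            super().__init__(scope, construct_id, **kwargs)

            # Private bucket holding the z/x/y tile pyramid
            bucket = s3.Bucket(self, "TileBucket")

            # CloudFront caches tiles at edge locations; clients never hit S3 directly
            cloudfront.Distribution(
                self,
                "TileDistribution",
                default_behavior=cloudfront.BehaviorOptions(
                    origin=origins.S3Origin(bucket),
                    cache_policy=cloudfront.CachePolicy.CACHING_OPTIMIZED,
                    viewer_protocol_policy=cloudfront.ViewerProtocolPolicy.REDIRECT_TO_HTTPS,
                ),
            )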

When to Use

  • Serving public tile sets at scale
  • Reducing S3 request and data transfer costs
  • Lowering latency for end users globally
  • Light or moderate access control needs (e.g. cookie-based auth)

Trade-offs

  • Signed requests are secure but too slow for tile serving
  • Cookie-based auth is possible but awkward to configure
  • Cache invalidation can be tricky for frequently updated tiles
  • Excellent performance for read-heavy, public datasets

Alternatives

Resources

Serve Tiles with API Gateway

TL;DR
Use API Gateway in front of S3 (or other stores) to serve tiles securely using a Lambda authorizer, with caching available out of the box as a bonus. Good for use cases where tiles must be secured, with moderate to high traffic volumes.

Tiles with API Gateway architecture

Description

User requests a tile → API Gateway authorizes request → API Gateway checks its edge cache → if not found, it fetches from S3 → response is cached at the edge for future use.

This pattern extends Public Tiles with S3 by placing an API Gateway instance in front of the S3 bucket. API Gateway can secure requests using a Lambda authorizer, and also provides out-of-the-box caching at either edge locations or on a per-region basis, reducing latency and S3 request charges. This setup is especially useful for production deployments of private tile sets.
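As a minimal sketch of the authorization piece, the Lambda token authorizer below (Python) allows or denies requests by comparing the supplied token against a shared secret in an environment variable (TILE_API_TOKEN is a placeholder); a production authorizer would typically validate a JWT or session instead.

    import os

    def handler(event, context):
        """Minimal API Gateway token authorizer: return an IAM policy that
        allows or denies execute-api:Invoke for this request."""
        token = event.get("authorizationToken", "")
        effect = "Allow" if token and token == os.environ.get("TILE_API_TOKEN") else "Deny"
        return {
            "principalId": "tile-client",
            "policyDocument": {
                "Version": "2012-10-17",
                "Statement": [
                    {
                        "Action": "execute-api:Invoke",
                        "Effect": effect,
                        "Resource": event["methodArn"],
                    }
                ],
            },
        }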

When to Use

  • Serving private tile sets at scale
  • Reducing S3 request and data transfer costs
  • Strong access control requirements
  • Running tile services alongside other API endpoints
  • Regional deployments where data residency is a consideration

Trade-offs

  • Higher complexity than the other solutions
  • Authorization typically requires custom logic
  • Likely more expensive than CloudFront

Alternatives

Resources

Query COGs with Lambda

TL;DR
Use Lambda with rasterio (and/or rio-tiler) to read Cloud Optimized GeoTIFFs (COGs) in S3 via HTTP range requests and return clipped/warped subsets or tiles on demand.

Query COGs with Lambda architecture

Flow: Client → API Gateway → Lambda → S3 (COG) → Lambda returns PNG/WEBP/JSON.
Cache headers (or API Gateway/Lambda@Edge/CloudFront in front) can reduce repeated reads.

Overview

This pattern serves on-demand reads from large rasters stored as COGs in S3. A client requests a window/tile/stat; API Gateway invokes Lambda, which uses rasterio (or rio-tiler) to read only the needed byte ranges from the COG and returns the result (PNG/WEBP/COG tile, or JSON stats).
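A minimal handler along these lines, assuming an API Gateway route with z/x/y path parameters and a placeholder COG in S3, might look like the sketch below (rio-tiler; 8-bit data assumed, otherwise rescale before rendering).

    import base64

    from rio_tiler.io import Reader  # rio-tiler v4; earlier releases expose COGReader

    COG_URL = "s3://my-raster-bucket/dem.tif"  # placeholder dataset

    def handler(event, context):
        """Return a single XYZ tile from a COG as a base64-encoded PNG
        (API Gateway Lambda proxy response format)."""
        params = event["pathParameters"]
        z, x, y = int(params["z"]), int(params["x"]), int(params["y"])

        with Reader(COG_URL) as cog:
            img = cog.tile(x, y, z)              # range-reads only the needed bytes
            body = img.render(img_format="PNG")  # assumes 8-bit data; rescale otherwise

        return {
            "statusCode": 200,
            "headers": {
                "Content-Type": "image/png",
                "Cache-Control": "public, max-age=86400",
            },
            "isBase64Encoded": True,
            "body": base64.b64encode(body).decode("utf-8"),
        }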

When to Use

  • You need dynamic subsets (windows, tiles, reprojected extracts) without pre-generating tiles.
  • You want direct access to pixel data, without tiling for visualisation
  • Pay-per-use access to large rasters; low/irregular traffic.
  • You want to keep data in S3 and avoid maintaining tile servers.

Trade-offs

  • Cold starts and Lambda timeout/memory limits for heavy reads.
  • Throughput/concurrency can be constrained for very hot workloads.
  • Per-request CPU is limited; heavy reprojection/warps can be slow.
  • Requires packaging the Lambda as a Docker container image to include rasterio and GDAL.

References

  • Blog post: https://www.addresscloud.com/blog/serverless-flood
  • Example code (archived): https://github.com/addresscloud/aws-lambda-docker-rasterio

For tiling from COGs, check out TiTiler (note: this should probably be its own pattern).

Query Attributes by H3 with DynamoDB

TL;DR
Use H3 cell IDs as partition keys in DynamoDB for fast attribute lookups. Resolve H3 on the server (Lambda) or on the client, then do a direct key-value read.

H3 + DynamoDB

Description

Discrete Global Grid Systems (DGGS) such as H3 partition the globe into cells with unique, hierarchical IDs, so any location can be resolved to a cell ID at a chosen resolution and used directly as a key for key-value lookups.

A client requests attributes for a location. The client computes the H3 cell ID → calls API Gateway with the cell ID → (optionally) a direct DynamoDB HTTP API pass-through or a thin Lambda performs the key-value read.
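A minimal server-side sketch of that read path (Python with h3-py and boto3; the table name, key attribute, and resolution are placeholders) might look like this:

    import boto3
    import h3  # h3-py v4; v3 used h3.geo_to_h3 instead of latlng_to_cell

    TABLE_NAME = "location-attributes"  # placeholder table keyed on the H3 cell ID
    RESOLUTION = 9                      # ~0.1 km^2 hexagons; choose to match your data

    table = boto3.resource("dynamodb").Table(TABLE_NAME)

    def lookup(lat: float, lng: float) -> dict:
        """Resolve a point to its H3 cell and fetch the denormalized attributes."""
        cell = h3.latlng_to_cell(lat, lng, RESOLUTION)
        response = table.get_item(Key={"h3": cell})  # "h3" is the assumed partition key
        return response.get("Item", {})

    if __name__ == "__main__":
        print(lookup(51.5074, -0.1278))  # central London, for example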

When to Use

  • Low-latency point-in-cell lookups at web scale.
  • You have denormalized attributes per H3 cell/resolution.
  • Cost-predictable key-value access (no scans/joins).

Trade-offs

  • H3 resolution choice impacts precision vs. storage.
  • Updates require re-materializing affected cells.
  • Hotspots possible (popular cells / zoomed-in areas) → consider caching or replication.
  • Client-side H3 reduces server cost but exposes resolution logic to clients.

References

TODO.