Is COG Scaleable?
Jeff Albrecht
Tile Server
Blob Store Characteristics
s3://cog-layers-glad/7/28/49/0231102/tile.tif
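The last directory segment of this key, 0231102, equals the standard quadkey of Tile(x=28, y=49, z=7). A minimal sketch of how such a key could be derived follows. The z/x/y/quadkey ordering is inferred from this single example, and the function names are hypothetical.

```python
def quadkey(x: int, y: int, z: int) -> str:
    """Interleave the bits of x and y (most significant first) into base-4 digits."""
    digits = []
    for level in range(z, 0, -1):
        mask = 1 << (level - 1)
        digit = 0
        if x & mask:
            digit += 1
        if y & mask:
            digit += 2
        digits.append(str(digit))
    return "".join(digits)


def tile_key(x: int, y: int, z: int) -> str:
    # Assumed layout, inferred from the example key: z/x/y/quadkey/tile.tif
    return f"{z}/{x}/{y}/{quadkey(x, y, z)}/tile.tif"


print(tile_key(28, 49, 7))  # 7/28/49/0231102/tile.tif
```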
Scaling Blob Stores is Easy!
COG Internal Structure
8192 x 8192 pixels
Tile(x=28, y=49, z=7)
Web Optimized COG
This is, in theory, the most optimized COG layout for this use case.
Caching
Most applications have some sort of local cache.
Requests are load balanced across replicas (round robin).
Requires 3 requests for all replicas to cache the header.
Round Robin
Number of Images | Number of Replicas | Scenario | Number of Requests to Blob Store |
100 | 100 | Perfect Caching | 100 |
100 | 100 | Local Caching | 10,000 |
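The two rows can be reproduced with a small simulation. This is a sketch under stated assumptions: each image header is requested once per replica, requests are dispatched round robin, and "perfect caching" is modeled as a single cache shared by all replicas. The function name is hypothetical.

```python
def origin_requests(n_images: int, n_replicas: int, shared_cache: bool) -> int:
    """Count header fetches that miss the cache and go to the blob store."""
    shared = set()
    local = [set() for _ in range(n_replicas)]
    misses = 0
    turn = 0
    for image in range(n_images):
        # Request the same image once per replica so every replica sees it.
        for _ in range(n_replicas):
            replica = turn % n_replicas  # round-robin dispatch
            turn += 1
            cache = shared if shared_cache else local[replica]
            if image not in cache:
                misses += 1
                cache.add(image)
    return misses


print(origin_requests(100, 100, shared_cache=True))   # 100
print(origin_requests(100, 100, shared_cache=False))  # 10000
```

With local caches only, every replica must fetch every header itself, so the request count grows as n_images * n_replicas.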
Round Robin
n_replicas = 100
n_images = 100
Cache (In)efficiency
Distributed Caching Patterns
Cache aside
Peer-to-peer cache
Forward-proxy
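As an illustration of the first pattern, here is a minimal cache-aside sketch. A plain dict stands in for a shared cache service, and fake_fetch stands in for a header read from the blob store; both are assumptions for illustration, not part of any real library.

```python
class CacheAsideReader:
    """Check the shared cache first; on a miss, read the origin and populate the cache."""

    def __init__(self, cache: dict, fetch_header):
        self.cache = cache
        self.fetch_header = fetch_header

    def get_header(self, key: str) -> bytes:
        header = self.cache.get(key)
        if header is None:
            header = self.fetch_header(key)  # cache miss: go to the blob store
            self.cache[key] = header         # populate so other replicas hit
        return header


origin_calls = []


def fake_fetch(key: str) -> bytes:
    # Hypothetical stand-in for a blob store header request.
    origin_calls.append(key)
    return b"header:" + key.encode()


shared = {}
replicas = [CacheAsideReader(shared, fake_fetch) for _ in range(3)]
for replica in replicas:
    replica.get_header("a.tif")

print(len(origin_calls))  # 1
```

Three replicas read the same header, but only the first read reaches the origin.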
MetaTiling
(8192, 8192) image with (256x256) chunk size contains 1024 chunks.
(8192, 8192) image with (4096x4096) meta-chunk size contains 4 meta-chunks.
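The arithmetic behind these counts, as a short sketch (the function name is hypothetical, and square images and chunks are assumed):

```python
import math


def chunk_count(image_size: int, chunk_size: int) -> int:
    """Number of square chunks needed to cover a square image."""
    per_side = math.ceil(image_size / chunk_size)
    return per_side * per_side


chunks = chunk_count(8192, 256)        # 32 x 32
meta_chunks = chunk_count(8192, 4096)  # 2 x 2
chunks_per_meta = chunk_count(4096, 256)  # 16 x 16

print(chunks, meta_chunks, chunks_per_meta)  # 1024 4 256
```

Each meta-chunk covers 256 native chunks, so reading by meta-chunk needs 4 requests where reading chunk by chunk needs 1024.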
Sending Requests Quickly
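One common way to keep many requests in flight from a single process is asyncio with a bounded semaphore. The sketch below is illustrative only: asyncio.sleep stands in for an HTTP range request, and fetch_range, fetch_all, and the max_in_flight limit of 64 are hypothetical choices.

```python
import asyncio


async def fetch_range(key: str, semaphore: asyncio.Semaphore) -> str:
    async with semaphore:          # cap the number of concurrent requests
        await asyncio.sleep(0.01)  # stand-in for an HTTP range request
        return key


async def fetch_all(keys, max_in_flight: int = 64):
    semaphore = asyncio.Semaphore(max_in_flight)
    tasks = (fetch_range(key, semaphore) for key in keys)
    return await asyncio.gather(*tasks)  # results keep the input order


results = asyncio.run(fetch_all([f"tile-{i}" for i in range(1024)]))
print(len(results))  # 1024
```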
In Practice
pyasyncio-benchmark
https://github.com/geospatial-jeff/pyasyncio-benchmark
Header Requests (m5.8xlarge)
MetaTiling (m5.8xlarge)
Dump all native resolution tiles (1024).
ignore httpx :(
Is COG Scaleable?
Yes, it is. But our software is not written in a way that encourages or achieves scalable system design in the cloud.
Software, hardware, and data are in constant co-evolution. Code doesn’t age, but context does change.
As the CNG community standardizes on individual data formats (e.g., Zarr and COG), it makes sense to invest more in specialized tooling for those formats.
Questions