Geospatial¶
SecantusDB ships full geo support — operators, the $geoNear
aggregation stage, and both 2d and 2dsphere index acceleration.
This page is the operator-by-operator and index-by-index reference;
the Indexes page covers the rest of the index
machinery.
Operators¶
| Operator | Doc-side data | Notes |
|---|---|---|
| $geoWithin | GeoJSON, legacy | Containment test. Accepts |
| $geoIntersects | GeoJSON | |
| $near | GeoJSON or legacy pair | Containment + sort by distance. |
| $nearSphere | GeoJSON or legacy pair | Same as |
All four are reachable through pymongo, mongo-go-driver,
mongo-node-driver, mongo-java-driver (Filters.geoWithin /
Filters.geoIntersects / Filters.near / Filters.nearSphere), and
mongo-ruby-driver. The mongo-java-driver gauge’s
GeoJsonFiltersFunctionalSpecification and
GeoFiltersFunctionalSpecification exercise the full driver-side
Filters builder path against SecantusDB and pass 10/10.
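As a rough illustration of the four operators, the filter documents below use standard MongoDB query syntax and are built as plain dicts, so they can be inspected without a server. The field name `loc`, the coordinates, and the dictionary keys are illustrative assumptions; which legacy shapes SecantusDB accepts beyond the ones named on this page is not confirmed here.

```python
# Filter documents for the four geo operators, written as plain dicts.
# The field name "loc" and every coordinate are illustrative only.
point = {"type": "Point", "coordinates": [-0.1280, 51.5080]}
square = {
    "type": "Polygon",
    "coordinates": [[[-1, -1], [1, -1], [1, 1], [-1, 1], [-1, -1]]],
}

filters = {
    # Containment against a GeoJSON polygon.
    "geoWithin_geojson": {"loc": {"$geoWithin": {"$geometry": square}}},
    # Containment against a legacy box: [bottom-left, upper-right].
    "geoWithin_box": {"loc": {"$geoWithin": {"$box": [[-1, -1], [1, 1]]}}},
    # Intersection takes a GeoJSON geometry.
    "geoIntersects": {"loc": {"$geoIntersects": {"$geometry": square}}},
    # GeoJSON $near: the distance bound is in meters.
    "near_geojson": {
        "loc": {"$near": {"$geometry": point, "$maxDistance": 5000}}
    },
    # Legacy $nearSphere: the bound is in radians and sits beside the operator.
    "nearSphere_legacy": {
        "loc": {
            "$nearSphere": [-0.1280, 51.5080],
            "$maxDistance": 5000 / 6_378_100,
        }
    },
}
```

Any of these can be passed to `coll.find(...)` unchanged.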
$geoNear aggregation stage¶
pipeline = [
    {
        "$geoNear": {
            "near": {"type": "Point", "coordinates": [0.0, 0.0]},
            "distanceField": "distance",
            "key": "loc",
            "maxDistance": 500,            # meters (GeoJSON)
            "query": {"category": "A"},    # pre-filter
            "includeLocs": "matchedLoc",   # echo raw doc geometry under this field
        }
    }
]
$geoNear auto-picks a 2d or 2dsphere index on the named field
when one exists and falls back to a full-scan distance computation
otherwise. Output is sorted ascending by distance, and the field
named by distanceField carries the value. includeLocs echoes the raw
doc geometry under a named field so the client can plot the matched
points without a second round-trip.
Index types¶
2dsphere — modern spherical¶
coll.create_index([("loc", "2dsphere")])
Best for GeoJSON data and any computation that should be
geodesically correct on a sphere. Doc-side geometries must be valid
GeoJSON (or legacy pairs interpreted as [lng, lat]).
Implementation:
- Each indexed geometry’s S2 cell covering is computed via s2sphere.RegionCoverer (min level 4, max level 16, max 64 cells), and every covering cell plus every ancestor back to level 0 is written as an index entry. The ancestor expansion is what lets a query at any level (a coarse covering, a leaf point cell, anything in between) find the doc.
- Query-side coverings expand the same way. The storage layer does exact point-lookups against the entries table; Shapely (planar) and haversine (spherical) verify candidates.
- Cell IDs are encoded as fixed-width 8-byte big-endian uint64 so the WT B-tree’s lexicographic byte ordering aligns with S2 cell-ID ordering.
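The ancestor expansion and key encoding can be sketched with plain integer arithmetic. This is a minimal sketch that assumes the public S2 cell-ID layout (3 face bits, 2 bits per level, one trailing marker bit); the function names are hypothetical and it is not SecantusDB's actual code, which goes through s2sphere.

```python
import struct

MAX_LEVEL = 30  # S2 leaf level


def lsb_for_level(level: int) -> int:
    """Lowest set bit of any S2 cell ID at `level`."""
    return 1 << (2 * (MAX_LEVEL - level))


def cell_level(cell_id: int) -> int:
    """Recover the level from the position of the trailing marker bit."""
    lsb = cell_id & -cell_id
    return MAX_LEVEL - (lsb.bit_length() - 1) // 2


def parent(cell_id: int, level: int) -> int:
    """Ancestor of `cell_id` at the same or a coarser `level`."""
    lsb = lsb_for_level(level)
    return (cell_id & -lsb) | lsb


def with_ancestors(cell_id: int) -> list[int]:
    """The cell itself plus every ancestor back to level 0."""
    return [parent(cell_id, lvl) for lvl in range(cell_level(cell_id), -1, -1)]


def encode(cell_id: int) -> bytes:
    """Fixed-width 8-byte big-endian key, so byte order equals numeric order."""
    return struct.pack(">Q", cell_id)


# A made-up leaf cell on face 2, cut back to level 16 (the covering's max level).
leaf = (2 << 61) | (0x123456789ABCDEF << 1) | 1
covering_cell = parent(leaf, 16)

doc_entries = with_ancestors(covering_cell)   # what the index would store
query_cells = with_ancestors(leaf)            # a leaf-level point query
hit = covering_cell in query_cells            # exact lookup finds the doc
keys = [encode(c) for c in doc_entries]
```

Because both sides expand to the root, a query cell finer or coarser than the stored covering still shares at least one exact key with it.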
2d — legacy planar¶
coll.create_index([("loc", "2d")])
For legacy [x, y] coordinate pairs (lng/lat by default). Useful
when working with non-geographic 2D data (game-world positions, plot
coordinates, etc.) where the spherical assumption is wrong.
Implementation:
- Each indexed point gets one bit-interleaved geohash entry at the configured precision (bits, default 26; min/max, default -180 / 180).
- Query-side: the bbox is decomposed into a list of tight Z-order ranges via a quadtree (each 2^k × 2^k power-of-2-aligned cell that lands fully inside the bbox yields one contiguous Z-range). Falls back to a single coarse range when the decomposition exceeds max_ranges=32 (very tortuous bboxes). The Shapely / haversine verifier filters false positives.
- 2d indexes are point-only on the doc side; mongod itself doesn’t index arbitrary shapes against a 2d index.
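The interleaving and the quadtree decomposition can be sketched in a few lines. This is a minimal sketch, not the code in geo_index.py: the function names, the choice of putting x in the odd bits, and working directly on integer grid coordinates are all assumptions made for illustration.

```python
def interleave(ix: int, iy: int, bits: int) -> int:
    """Bit-interleave two grid coordinates into one Z-order key (x in odd bits)."""
    z = 0
    for i in range(bits):
        z |= ((ix >> i) & 1) << (2 * i + 1)
        z |= ((iy >> i) & 1) << (2 * i)
    return z


def quantize(v: float, lo: float, hi: float, bits: int) -> int:
    """Map v in [lo, hi] onto an integer bucket in [0, 2**bits - 1]."""
    n = 1 << bits
    return min(n - 1, max(0, int((v - lo) / (hi - lo) * n)))


def geohash(x: float, y: float, bits: int = 26,
            lo: float = -180.0, hi: float = 180.0) -> int:
    """One interleaved key per point, at the configured precision and range."""
    return interleave(quantize(x, lo, hi, bits), quantize(y, lo, hi, bits), bits)


def z_ranges(x0, y0, x1, y1, bits, max_ranges=32):
    """Decompose the inclusive grid bbox [x0..x1] x [y0..y1] into Z-order ranges."""
    out = []

    def visit(cx, cy, k):
        size = 1 << k
        if cx > x1 or cy > y1 or cx + size - 1 < x0 or cy + size - 1 < y0:
            return  # cell is disjoint from the bbox
        if x0 <= cx and cx + size - 1 <= x1 and y0 <= cy and cy + size - 1 <= y1:
            start = interleave(cx, cy, bits)  # aligned cell: one contiguous range
            out.append((start, start + size * size - 1))
            return
        half = size >> 1
        for dx, dy in ((0, 0), (0, 1), (1, 0), (1, 1)):  # children in Z order
            visit(cx + dx * half, cy + dy * half, k - 1)

    visit(0, 0, bits)

    merged = []
    for start, end in out:  # glue ranges that touch end to end
        if merged and merged[-1][1] + 1 == start:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    if len(merged) > max_ranges:  # too fragmented: one coarse range instead
        return [(merged[0][0], merged[-1][1])]
    return merged
```

For example, on a 16 × 16 grid (`bits=4`) the aligned 4 × 4 block `z_ranges(4, 4, 7, 7, 4)` collapses to the single range `[(48, 63)]`. The recursion visits every boundary cell, so this sketch is only practical at small `bits`; a production version would cap the depth.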
Compound geo + scalar¶
coll.create_index([("loc", "2dsphere"), ("category", 1)])
The geo column drives the cell-covering scan; the trailing scalar column gets filtered at the verifier step. Useful when most queries combine a geo predicate with a category / status filter — the combined index cuts down the scan from “all geo matches” to “geo matches in category X.”
Custom 2d range¶
coll.create_index([("pos", "2d")], min=0, max=1000, bits=20)
Override the default lng / lat range when storing non-geographic
coords. bits sets the geohash precision per axis (1–32; default
26). The grid is 2^bits × 2^bits buckets.
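To get a feel for what `bits` buys, the width of one bucket along an axis is the range divided by 2^bits. The helper name below is hypothetical; the arithmetic follows directly from the grid description above.

```python
def bucket_width(lo: float, hi: float, bits: int) -> float:
    """Width of one grid bucket along a single axis."""
    return (hi - lo) / (1 << bits)


# The custom index above: range 0..1000 at 20 bits per axis.
custom = bucket_width(0, 1000, 20)      # about 0.00095 input units

# The defaults: range -180..180 at 26 bits per axis.
default = bucket_width(-180, 180, 26)   # about 5.4e-6 degrees
```

Raising `bits` halves the bucket width per extra bit, at the cost of longer keys to decompose on the query side.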
Distance units — the gotcha¶
Three different conventions are in play depending on the spec shape and index type. SecantusDB matches mongod’s rules:
| Spec shape | Operator | |
|---|---|---|
| GeoJSON | | Meters (great-circle on Earth) |
| Legacy | | Input units (planar Pythagoras) |
| Legacy | | Radians on the unit sphere |
The legacy + spherical case is the most surprising — the bound is in radians, not meters. To convert: meters / 6_378_100 ≈ radians.
Internally, SecantusDB’s distance(spherical=True) returns meters
(Earth-radius scaled), so the matcher converts legacy+spherical
bounds via * EARTH_RADIUS_METERS. The 2d-index picker for the same
shape converts via * 180 / π to get degrees (matching mongod’s
behaviour against a 2d index).
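The conversions described above are small enough to write out. This is a sketch only: `EARTH_RADIUS_METERS` is the constant named above, with the value taken from the `meters / 6_378_100` rule, and the helper names are hypothetical.

```python
import math

EARTH_RADIUS_METERS = 6_378_100  # value from the conversion rule on this page


def meters_to_radians(meters: float) -> float:
    """Bound to send with a legacy + spherical query."""
    return meters / EARTH_RADIUS_METERS


def radians_to_meters(radians: float) -> float:
    """What the matcher does with a legacy + spherical bound."""
    return radians * EARTH_RADIUS_METERS


def radians_to_degrees(radians: float) -> float:
    """What the 2d-index picker does with the same bound."""
    return radians * 180 / math.pi


# A 5 km bound expressed for a legacy $nearSphere query.
bound_rad = meters_to_radians(5000)
bound_deg = radians_to_degrees(bound_rad)
```

A client that only has meters can call `meters_to_radians` before building a legacy query; with the GeoJSON form no conversion is needed.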
This isn’t usually a problem if you use the GeoJSON form everywhere
(unambiguously meters). The legacy forms only come up when a driver
builder API like Java’s Filters.nearSphere(field, x, y, max, min)
serializes to the legacy shape on the wire.
Doc-side geometry shapes accepted¶
| Shape | Example | Notes |
|---|---|---|
| GeoJSON | | The canonical form for both index types |
| GeoJSON | | 2dsphere only (2d indexes don’t index polygons) |
| GeoJSON | | 2dsphere only |
| Legacy | | Treated as |
| Legacy | | Treated as |
| Legacy | | Explicit aliases |
Malformed geometries are rejected at insert / update / upsert /
createIndex time with mongod’s documented code 16572
("Can't extract geo keys"). Bad geometry that is already stored is
tolerated by the operators (treated as “no match”) without raising,
which mirrors mongod.
Worked example¶
from pymongo import MongoClient
from secantus import SecantusDBServer
with SecantusDBServer(port=0) as srv:
    client = MongoClient(srv.uri)
    coll = client["app"]["places"]

    # Insert some restaurants in central London.
    coll.insert_many([
        {"_id": 1, "name": "The Fox", "loc": {"type": "Point", "coordinates": [-0.1276, 51.5072]}},
        {"_id": 2, "name": "Borough", "loc": {"type": "Point", "coordinates": [-0.0900, 51.5050]}},
        {"_id": 3, "name": "Camden", "loc": {"type": "Point", "coordinates": [-0.1426, 51.5395]}},
        {"_id": 4, "name": "Greenwich", "loc": {"type": "Point", "coordinates": [0.0098, 51.4769]}},
    ])
    coll.create_index([("loc", "2dsphere")])

    # All restaurants within 5 km of Trafalgar Square.
    cursor = coll.find({
        "loc": {
            "$near": {
                "$geometry": {"type": "Point", "coordinates": [-0.1280, 51.5080]},
                "$maxDistance": 5000,  # meters
            }
        }
    })
    for doc in cursor:
        print(doc["name"])

    # Same query as aggregation with distance attached.
    pipeline = [{
        "$geoNear": {
            "near": {"type": "Point", "coordinates": [-0.1280, 51.5080]},
            "distanceField": "metres",
            "key": "loc",
            "maxDistance": 5000,
        }
    }]
    for doc in coll.aggregate(pipeline):
        print(f"{doc['name']}: {doc['metres']:.0f} m")

    # $geoWithin a polygon.
    westminster = {
        "type": "Polygon",
        "coordinates": [[
            [-0.14, 51.49], [-0.12, 51.49],
            [-0.12, 51.52], [-0.14, 51.52],
            [-0.14, 51.49],
        ]],
    }
    inside = list(coll.find({"loc": {"$geoWithin": {"$geometry": westminster}}}))
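To sanity-check the 5 km query without a server, the same distances can be computed with a plain haversine formula. This is a standalone sketch: the function name is hypothetical, the radius is the 6_378_100 m figure from the distance-units section, and it stands in for the haversine verifier described on this page rather than reproducing it.

```python
import math

EARTH_RADIUS_METERS = 6_378_100


def haversine_m(lng1: float, lat1: float, lng2: float, lat2: float) -> float:
    """Great-circle distance in meters between two lng/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_METERS * math.asin(math.sqrt(a))


# Same coordinates as the worked example, as (lng, lat).
places = {
    "The Fox": (-0.1276, 51.5072),
    "Borough": (-0.0900, 51.5050),
    "Camden": (-0.1426, 51.5395),
    "Greenwich": (0.0098, 51.4769),
}
origin = (-0.1280, 51.5080)  # Trafalgar Square

distances = {name: haversine_m(*origin, *pos) for name, pos in places.items()}
within_5km = sorted(
    (name for name, d in distances.items() if d <= 5000),
    key=distances.get,
)
print(within_5km)  # ['The Fox', 'Borough', 'Camden']
```

Greenwich sits roughly 10 km out, so it is the only restaurant the `$near` query should leave out.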
Validation coverage¶
| Surface | Tests |
|---|---|
| Unit tests (parser, distance, containment) | |
| Operator integration via pymongo | |
| Index acceleration + explain | |
| Cross-driver smoke (mongosh / node / go) | |
| pymongo conformance gauge | |
| mongo-java-driver gauge | |
Out of scope¶
- Exact mongod error-string matching: we surface mongod’s error codes (16572 for bad geo extraction, 2 for bad operator args) but not the exact errmsg wording. Driver tests that pin specific English strings fall here; tests that key on the code pass.
- Geo-haystack indexes: deprecated in MongoDB 4.4 and removed in 5.0; no point.
- The geoSearch command: superseded by $geoNear.
Where the code lives¶
- src/secantus/geo.py: geometry primitives, GeoJSON parsing, Shapely / haversine distance + containment. Pure module, no storage import.
- src/secantus/geo_index.py: S2 cell coverings (2dsphere), bit-interleaved geohash + quadtree Z-order range decomposition (2d), cell-ID encoding for the WT entries table.
- src/secantus/storage.py (_pick_geo_index_for_filter / _try_geo_index_id_keys / _geo_query_cells): picker integration.
- src/secantus/query.py (_op_geo_within / _op_geo_intersects / _op_geo_near / _parse_near_spec): operator matcher (also handles the legacy mongod sibling-form $maxDistance / $minDistance).
- src/secantus/aggregate.py (_stage_geoNear): the aggregation stage.