# Backup & point-in-time recovery
SecantusDB supports two recovery models:
1. **Snapshot backup / restore** — a consistent copy of the whole database at the
moment the backup was taken.
2. **Point-in-time recovery (PITR)** — rebuild the database as it was at *any*
target time, by replaying the oplog forward.
Both are **offline restores**: they produce a fresh data directory that you then
point a *new* server at (`secantusdb --storage-path` /
`SecantusDBServer(storage_path=…)`). Hot in-place restore over a live
WiredTiger connection isn't supported; real `mongod` restores work the same way
(stop, swap the data directory, start).
## One interface, two servers
SecantusDB ships as [two separate servers](servers.md) — the pure-Python
`SecantusDBServer` and the standalone Rust `secantusdb` binary — and **both
implement the full PITR surface** with the same command names:
- the `secantusdb restore` command (the Python console script *and* the Rust
binary are both named `secantusdb`);
- the `secantusAdmin.backupArchive`, `secantusAdmin.restoreToTimestamp`, and
`secantusAdmin.archiveBaseSnapshot` wire commands;
- the `--oplog-archive-dir` server flag.
Because both servers store data in **the same WiredTiger schema** and write **the
same mongod-shaped oplog**, a backup or data directory produced by one server is
restorable by the other. The Python tooling restores a Rust server's data and the
Rust binary restores a Python server's data, byte-for-byte — there is one PITR
format, not two. (This identity is pinned by the cross-server tests
`tests/test_rust_pitr_cross_server.py` and `tests/test_rust_binary_pitr.py`.)
Everything below applies to both servers unless a heading says otherwise.
## How it works
PITR is **snapshot + oplog replay**. The pieces:
### The oplog
Every write a server accepts is recorded in a mongod-shaped operations log
(surfaced as `local.oplog.rs`), stored in the same WiredTiger connection as the
data. Each entry mirrors mongod's shape — `ts` (a `Timestamp(secs, ord)`), `op`,
`ns`, `ui` (collection UUID), `o`, `o2`, `wall` — and uses the same op codes:
| `op` | Meaning | `o` payload |
|------|---------|-------------|
| `i` | insert | the inserted document |
| `u` | update | `{$v: 2, diff}` for an operator update (a dotted-path `updateDescription`), or the whole replacement document |
| `d` | delete | the deleted `_id` (in `o2`) |
| `c` | command (DDL) | `create` (with collection options as siblings), `createIndexes`, `dropIndexes`, `collMod`, `drop`, `dropDatabase`, `renameCollection` |
| `n` | no-op | heartbeats / replica-set init — skipped on replay |
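
As a rough illustration of the entry shape in the table above, the sketch below builds one insert entry and one operator-update entry as plain Python dicts. Everything in it is made up for the example: the namespace, the UUID, the document contents, the layout inside `diff`, and the use of `o2` to carry the updated document's `_id` (borrowed from mongod's convention). A `Timestamp(secs, ord)` is modelled as a tuple so the snippet runs without a driver.

```python
import uuid
from datetime import datetime, timezone

# Hypothetical values throughout; a Timestamp(secs, ord) is modelled as a tuple.
orders_uuid = uuid.uuid4()
secs = 1781716542

insert_entry = {
    "ts": (secs, 7),
    "op": "i",
    "ns": "shop.orders",
    "ui": orders_uuid,
    "o": {"_id": 1, "status": "new"},
    "wall": datetime.fromtimestamp(secs, tz=timezone.utc),
}

update_entry = {
    "ts": (secs, 8),
    "op": "u",
    "ns": "shop.orders",
    "ui": orders_uuid,
    # Assumed layout of the diff: an updateDescription-style document.
    "o": {"$v": 2, "diff": {"updatedFields": {"status": "paid"},
                            "removedFields": [], "truncatedArrays": []}},
    "o2": {"_id": 1},  # assumption: the target _id travels in o2, as in mongod
    "wall": datetime.fromtimestamp(secs, tz=timezone.utc),
}

# Tuples order the same way as (secs, ord) timestamps: seconds first, then ordinal.
assert insert_entry["ts"] < update_entry["ts"]
```

Modelling the timestamp as an ordered pair is what makes "replay up to a target" well defined: two writes in the same second are still totally ordered by their ordinal.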
Because the oplog lives beside the data, a WiredTiger checkpoint captures both
consistently, and a backup archive is **self-contained**: it carries the oplog up
to the checkpoint.
### The applier
Recovery opens a stopped source (a backup archive or a stopped server's data
directory; a *live* directory can't be opened because WiredTiger holds a
single-writer lock), then replays the oplog forward into a fresh target, **stopping
before the first entry past the target time**. Each entry is applied through the
server's **ordinary write paths** (`i` inserts, `d` deletes by `_id`, `c` re-runs
the DDL), so the documents, indexes, collection options, and natural (insertion)
order come out exactly as they were produced live.
- An **operator update** (`{$v: 2, diff}`) is rolled forward by re-applying the
`updateDescription` to the document's current state (`updatedFields` are set,
`removedFields` unset, `truncatedArrays` shortened) — the inverse of how the
oplog diff was computed. A **replacement update** simply restores the whole `o`.
- During replay the target's own oplog emission is **suppressed**, because the
oplog is the *input*, not something to regenerate. (See
[resume continuity](#change-stream-resume-continuity) for the opt-in exception.)
- Collection options (`capped` / `size` / `max` / `validator` / `viewOn` / …) and
index / `collMod` (incl. TTL `expireAfterSeconds` retunes) / rename DDL are all
reconstructed.
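
The roll-forward of an operator update can be sketched in a few lines. This is a minimal model, not SecantusDB's applier: it assumes the `updateDescription` uses the MongoDB change-event layout (`updatedFields` as a mapping, `removedFields` as a list of paths, `truncatedArrays` as `{"field", "newSize"}` items), and it only walks dotted paths through nested dicts, not array indexes. The function names are hypothetical.

```python
import copy

def _parent_and_key(doc, dotted_path, create=False):
    """Walk a dotted path through nested dicts; return (parent, last_key).

    Returns (None, last_key) if an intermediate key is missing and create=False.
    """
    *parents, last = dotted_path.split(".")
    node = doc
    for key in parents:
        if key not in node:
            if not create:
                return None, last
            node[key] = {}
        node = node[key]
    return node, last

def apply_update_description(doc, desc):
    """Return a copy of doc with the update description rolled forward onto it."""
    out = copy.deepcopy(doc)
    # Shorten truncated arrays first, then set and unset fields.
    for trunc in desc.get("truncatedArrays", []):
        parent, key = _parent_and_key(out, trunc["field"])
        if parent is not None and key in parent:
            parent[key] = parent[key][: trunc["newSize"]]
    for path, value in desc.get("updatedFields", {}).items():
        parent, key = _parent_and_key(out, path, create=True)
        parent[key] = value
    for path in desc.get("removedFields", []):
        parent, key = _parent_and_key(out, path)
        if parent is not None:
            parent.pop(key, None)
    return out

before = {"_id": 1, "status": "new", "tags": ["a", "b", "c"],
          "meta": {"seen": True, "tmp": 1}}
desc = {
    "updatedFields": {"status": "paid", "meta.seen": False},
    "removedFields": ["meta.tmp"],
    "truncatedArrays": [{"field": "tags", "newSize": 1}],
}
after = apply_update_description(before, desc)
```

Working on a deep copy keeps the input document untouched, which makes the step easy to test in isolation.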
### The manifest
Every backup archive embeds a small `pitr-manifest.json` describing the oplog
range it can recover to — floor / head seq, the floor / head timestamps and
wall-clock times, and whether the oplog still reaches genesis (an un-pruned
front). It's advisory (restore reads the oplog directly) but lets tooling report a
backup's recoverable range without opening WiredTiger.
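
Because the manifest is plain JSON inside the archive, tooling can read it with the standard library alone. The sketch below is one way to do that under stated assumptions: it matches the member by basename because the manifest's directory inside the archive isn't specified here, and the keys in the demo manifest (`floor_seq`, `head_seq`) are hypothetical placeholders, not the real schema.

```python
import json
import tarfile
import tempfile
from pathlib import Path

MANIFEST_NAME = "pitr-manifest.json"

def read_manifest(archive_path):
    """Return the parsed manifest from a .tar.gz backup, or None if absent."""
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar:
            # Match on the basename, since the member's directory is an unknown here.
            if member.isfile() and Path(member.name).name == MANIFEST_NAME:
                with tar.extractfile(member) as fh:
                    return json.load(fh)
    return None

# Demo on a throwaway archive; the manifest keys below are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    manifest_file = Path(tmp) / MANIFEST_NAME
    manifest_file.write_text(json.dumps({"floor_seq": 1, "head_seq": 42}))
    archive = Path(tmp) / "db.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(manifest_file, arcname=MANIFEST_NAME)
    manifest = read_manifest(archive)

print(manifest)
```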
## Snapshot backup
`Storage.create_archive` forces a WiredTiger checkpoint and tars the consistent
file set into a single `.tar.gz`. Over the wire it's the
`secantusAdmin.backupArchive` command — taken **against the live server** (a
consistent snapshot off WiredTiger's `backup:` cursor, no downtime):
```python
from pymongo import MongoClient
admin = MongoClient("mongodb://127.0.0.1:27017")["admin"]
admin.command({"secantusAdmin.backupArchive": 1, "outputPath": "/backups/db.tar.gz"})
```
Restore the snapshot by extracting it into a fresh directory — with the
`secantusdb-restore-archive` tool (Python) or the `secantusAdmin.restoreArchive`
command — then start a new server on it. A plain snapshot restore lands you at the
backup's checkpoint; for an arbitrary target time, use PITR below.
## Point-in-time recovery
Recovery replays the oplog into a fresh store, stopping at a target timestamp or
wall-clock time. With neither, the whole oplog is replayed ("latest").
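
The cut rule can be modelled as a simple filter over the ordered oplog. This is an illustrative sketch only: entries are reduced to `ts` and `op`, timestamps are `(secs, ord)` tuples, and `entries_to_replay` is a hypothetical name rather than part of the SecantusDB API.

```python
def entries_to_replay(oplog, to_ts=None):
    """Yield replayable entries in order, stopping before the first one past to_ts.

    to_ts=None means no bound, i.e. replay to "latest".
    """
    for entry in oplog:
        if to_ts is not None and entry["ts"] > to_ts:
            break  # first entry past the target: stop here
        if entry["op"] == "n":
            continue  # no-ops carry no data change
        yield entry

oplog = [
    {"ts": (100, 1), "op": "i"},
    {"ts": (100, 2), "op": "n"},
    {"ts": (101, 1), "op": "u"},
    {"ts": (105, 1), "op": "d"},
]
cut = [e["ts"] for e in entries_to_replay(oplog, to_ts=(101, 1))]
latest = [e["ts"] for e in entries_to_replay(oplog)]
```

An entry whose timestamp equals the target is included, since the replay stops only at the first entry strictly past it.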
### CLI
```bash
# Recover to a wall-clock time:
secantusdb restore --source /backups/db.tar.gz \
    --target-dir /restore/at-1430 \
    --to-time 2026-06-17T14:30:00Z

# Or to a precise cluster timestamp (seconds[,ordinal]):
secantusdb restore --source /path/to/stopped-data-dir \
    --target-dir /restore/exact \
    --to-timestamp 1781716542,7
# With neither --to-time nor --to-timestamp, the whole oplog is replayed
# ("latest"). Then start a server on the result:
secantusdb --storage-path /restore/at-1430
```
`--source` is a backup `.tar.gz`, a stopped server's data directory, **or** a PITR
archive directory, which is auto-detected (see
[Arbitrary window](pitr-arbitrary-window) below). `--target-dir` must be a fresh path.
```{note}
`secantusdb restore` is provided by **both** the Python console script and the
Rust binary, with identical flags: `--source`, `--target-dir`, `--to-time`,
`--to-timestamp`, and `--preserve-oplog` all work on either one.
```
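
When restores are driven from a script, it can help to build the command line in one place. The helper below is a hypothetical wrapper, not something SecantusDB ships: it uses only the flags shown on this page, and its refusal to accept both bounds at once is a choice made in the wrapper, not documented CLI behaviour.

```python
def build_restore_argv(source, target_dir, to_time=None, to_timestamp=None,
                       preserve_oplog=False):
    """Build the argv for `secantusdb restore` from the flags documented here."""
    if to_time is not None and to_timestamp is not None:
        # Wrapper-level guard (assumption): pick one bound, or neither for "latest".
        raise ValueError("pass at most one of to_time / to_timestamp")
    argv = ["secantusdb", "restore",
            "--source", str(source),
            "--target-dir", str(target_dir)]
    if to_time is not None:
        argv += ["--to-time", to_time]
    if to_timestamp is not None:
        secs, ordinal = to_timestamp
        argv += ["--to-timestamp", f"{secs},{ordinal}"]
    if preserve_oplog:
        argv.append("--preserve-oplog")
    return argv

argv = build_restore_argv("/backups/db.tar.gz", "/restore/at-1430",
                          to_time="2026-06-17T14:30:00Z")
# To actually run it: subprocess.run(argv, check=True)
```

Passing the list to `subprocess.run` without a shell avoids quoting problems in paths.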
### Wire command
`secantusAdmin.restoreToTimestamp` exposes the same operation for admin tooling
(both servers):
```python
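from bson.timestamp import Timestamp  # ships with PyMongo

# `admin` is the MongoClient(...)["admin"] handle from the snapshot example.
admin.command({
    "secantusAdmin.restoreToTimestamp": 1,
    "source": "/backups/db.tar.gz",    # archive, stopped data dir, or archive dir
    "targetDir": "/restore/at-1430",
    "toTimestamp": Timestamp(1781716542, 7),  # or pass "toTime" instead; omit both for latest
    "preserveOplog": False,            # see "resume continuity" below
})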
```
### Python API
The Python server's machinery is also importable directly:
```python
from secantus import oplog_replay
# A backup archive or a stopped data directory:
oplog_replay.restore_to_timestamp(source_dir, target_dir, to_ts=ts) # data dir
oplog_replay.restore_archive_to_timestamp(archive, target_dir, to_wall=t) # .tar.gz
# A PITR v2 archive directory:
from secantus import pitr_archive
pitr_archive.restore_from_archive_dir(archive_dir, target_dir, to_ts=ts)
```
### Transactions
Every statement in a multi-document transaction shares one commit timestamp, so
the timestamp cut is always **all-or-nothing** for a transaction — a recovery
point never lands in the middle of one.
## The recovery window
### Live oplog (the simple case)
The simplest restore replays onto an **empty** base, which is exact whenever the
source oplog still reaches genesis — i.e. it hasn't been pruned from the front.
The recovery window is then the **oplog retention window**. Tune it for the
horizon you need:
```bash
secantusdb --oplog-retention-seconds 604800 --oplog-max-entries 5000000 # ~1 week
```
(or the `[oplog]` section of `secantusdb.toml`). The rule of thumb: *keep enough
oplog and you can rewind to any point in it.* If the oplog has been pruned past
genesis and no archive is configured, this restore **fails loudly** rather than
silently rebuilding a partial database.
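
Sizing the two limits is simple arithmetic once you pick a horizon. The sketch below is a back-of-the-envelope helper with loud assumptions: a steady write rate, one oplog entry per write, and a hypothetical function name. It only formats the two flags shown above; it says nothing about how the server combines them.

```python
import math
from datetime import timedelta

def retention_flags(horizon, writes_per_second):
    """Turn a desired rewind horizon into the two oplog sizing flags."""
    seconds = int(horizon.total_seconds())
    # Assumption: one oplog entry per write at a steady rate.
    max_entries = math.ceil(seconds * writes_per_second)
    return ["--oplog-retention-seconds", str(seconds),
            "--oplog-max-entries", str(max_entries)]

flags = retention_flags(timedelta(weeks=1), writes_per_second=8)
```

For a one-week horizon the retention value comes out as 604800 seconds, matching the example above; the entry cap scales with whatever write rate you plug in.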
(pitr-arbitrary-window)=
### Arbitrary window: oplog archiving + base snapshots
To recover to a time *before* the live oplog floor — without keeping the entire
oplog live — turn on **oplog archiving** and take periodic **base snapshots** into
the same directory:
```bash
secantusdb --storage-path /data --oplog-archive-dir /pitr-archive
```
With `--oplog-archive-dir` set, the rows `prune_oplog` is about to drop are first
written to durable segment files (`oplog--.seg`) in that directory.
Take base snapshots on demand (there is no background scheduler — same explicit
model as `prune_ttl` / `prune_oplog`):
```python
admin.command({"secantusAdmin.archiveBaseSnapshot": 1, "archiveDir": "/pitr-archive"})
```
Each writes a `base-.tar.gz` into the directory. To recover, point
`restore` at the **archive directory** (the CLI and wire command auto-detect it):
```bash
secantusdb restore --source /pitr-archive --target-dir /restore/at-T \
    --to-time 2026-06-10T09:00:00Z
```
Restore picks the newest base snapshot at or before the target time, extracts it,
and stitches the archived oplog forward onto it up to the target — so any moment
in the archived history is reachable. If the base snapshots plus segments don't
cover the requested time (a gap), it fails loudly rather than returning a
truncated database.
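
The "newest base snapshot at or before the target" rule is a sorted lookup. The sketch below models it with integer times and a hypothetical function name; it covers only base selection and the fail-loudly case where no snapshot qualifies, not the segment stitching that follows.

```python
from bisect import bisect_right

def pick_base_snapshot(snapshot_times, target):
    """Return the newest snapshot time <= target, or raise if none exists."""
    times = sorted(snapshot_times)
    idx = bisect_right(times, target)  # count of snapshots at or before target
    if idx == 0:
        raise LookupError("no base snapshot at or before the target time")
    return times[idx - 1]

snapshots = [1000, 2000, 3000]
chosen = pick_base_snapshot(snapshots, 2500)
exact = pick_base_snapshot(snapshots, 3000)
```

Raising instead of falling back to an older or newer snapshot mirrors the documented behaviour of refusing to return a truncated database.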
## Change-stream resume continuity
By default the restored data directory starts a **fresh oplog timeline** — the
replayed history isn't carried into the target, so a change stream on the restored
server resumes only from the restore point forward (this matches `mongorestore`).
Pass `--preserve-oplog` (`secantusdb restore`) or `preserveOplog: true`
(`secantusAdmin.restoreToTimestamp`) to carry the replayed oplog onto the restored
directory **verbatim** — same seq, timestamp, and pre-images. A change stream on
the restored server can then resume from a [resume token](change-streams.md)
minted *before* the restore point, because the rows that token references are
present.
## Quick reference
| Task | CLI | Wire command | Python API |
|------|-----|--------------|------------|
| Snapshot backup | — | `secantusAdmin.backupArchive` | `Storage.create_archive` |
| Extract a snapshot | `secantusdb-restore-archive` | `secantusAdmin.restoreArchive` | `extract_backup_archive` |
| Restore to a time | `secantusdb restore --to-time/--to-timestamp` | `secantusAdmin.restoreToTimestamp` | `oplog_replay.restore_to_timestamp` |
| Restore "latest" | `secantusdb restore` (no bound) | `restoreToTimestamp` (no bound) | `restore_to_timestamp()` |
| Carry oplog for resume | `--preserve-oplog` | `preserveOplog: true` | `carry_oplog=True` |
| Take a base snapshot | — | `secantusAdmin.archiveBaseSnapshot` | `Storage.archive_base_snapshot` |
| Restore from an archive dir | `secantusdb restore --source ` | `restoreToTimestamp` (dir source) | `pitr_archive.restore_from_archive_dir` |
| Enable oplog archiving | `--oplog-archive-dir DIR` | — | `Storage(oplog_archive_dir=…)` |
## Notes & limitations
- Restore is **offline**: it writes a fresh data directory you then start a new
server on. There is no in-place / hot restore (`mongod` has none either).
- The source must be a **stopped** server's data directory or a backup archive —
WiredTiger's single-writer lock forbids opening a live one. Take a
`backupArchive` from the live server instead, then restore from that.
- Base snapshots and oplog archiving have **no background scheduler** by design —
the operator drives `archiveBaseSnapshot` / pruning explicitly.
- See [Change streams](change-streams.md) for the oplog model, [Running in
production](production.md) for a deployment shape, and
[Compatibility](compatibility.md) for the broader divergence list.