Snapshot system
How Backstop captures, stores, verifies, and restores table snapshots using Parquet files in S3-compatible object storage.
Backstop's snapshot system is the recovery foundation. Snapshots are point-in-time table captures stored as Parquet files in your own S3-compatible storage. They are what make a CRITICAL operation reversible.
How snapshots are taken
The sync sidecar takes snapshots in two modes:
Continuous background snapshots — The sidecar runs on a configurable interval. By default it captures every discovered table on startup, then captures new, changed, or retry-needed tables on later polls. --snapshot-every-poll=true is available when you want a full table snapshot every poll. For each captured table, it:
- Reads all rows from PostgreSQL
- Converts row values to Apache Parquet
- Uploads to S3/MinIO under s3://bucket/snapshots/{table}/{timestamp}.parquet
- Writes a manifest file with row count, schema DDL, captured indexes/constraints, and checksums
- Reports liveness via the heartbeat record in SQLite
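The capture rule for the continuous mode can be sketched as follows. This is a minimal illustration, not Backstop's implementation: the `TableState` fields and the `tables_to_capture` function are hypothetical names, and how the sidecar actually detects a changed table is not specified here.

```python
from dataclasses import dataclass


@dataclass
class TableState:
    """Hypothetical per-table bookkeeping; field names are illustrative."""
    name: str
    has_snapshot: bool = False  # False for a newly discovered table
    changed: bool = False       # assumed flag: table changed since last capture
    needs_retry: bool = False   # assumed flag: previous capture attempt failed


def tables_to_capture(tables, first_poll, snapshot_every_poll=False):
    """Return the names of the tables to snapshot on this poll."""
    if first_poll or snapshot_every_poll:
        # Startup, or --snapshot-every-poll=true: capture every discovered table.
        return [t.name for t in tables]
    # Later polls: only new, changed, or retry-needed tables.
    return [
        t.name
        for t in tables
        if not t.has_snapshot or t.changed or t.needs_retry
    ]
```

With this rule, an unchanged table that already has a snapshot is skipped on later polls unless the every-poll option is enabled.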
On-demand snapshots — SDK/local guarded flows can capture before-images immediately before destructive table operations. The gateway recovery gate verifies the latest sidecar snapshot before approved CRITICAL execution.
Storage format
Snapshots are stored in Apache Parquet:
- Columnar storage: efficient reads for large tables
- Portable row data: row values are stored in a form the restore path can load back through the captured table schema
- Compression: Snappy by default
- Manifest: every snapshot has a .manifest.json file alongside it, for example:
{
"snapshot_id": "snap_a3f9e2c1",
"table": "users",
"schema": "public",
"row_count": 1842933,
"columns": ["id", "email", "name", "created_at"],
"parquet_path": "snapshots/users/2026-05-06T10:30:00Z.parquet",
"checksum": "sha256:9f86d081884c7d659a2feaa0c55ad015",
"created_at": "2026-05-06T10:30:00Z",
"source": "sidecar",
"db_size_bytes": 218000000
}
Snapshot storage URL format
Backstop uses a URL convention to specify S3-compatible storage. The bucket name comes first, optionally followed by @ and the endpoint URL:
s3://bucket-name@http://endpoint-url
s3://bucket-name # AWS S3 (uses AWS SDK credentials)
s3://bucket-name@http://localhost:9000 # MinIO local
s3://bucket-name@https://storage.example.com # any S3-compatible API
Recovery readiness gate
Before any CRITICAL operation can be approved, the gateway checks:
| Check | Default threshold |
|---|---|
| Snapshot exists for target table | Required |
| Snapshot age | Max 300 seconds (5 min) |
| Sidecar heartbeat age | Max 120 seconds (2 min) |
| Manifest checksum valid | Required |
All checks must pass. If the sidecar has been down for more than 2 minutes, CRITICAL operations are blocked even with operator approval.
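The gate logic can be sketched as a pure function over the four checks. This is an assumption-laden illustration rather than the gateway's code: `GateInput`, `recovery_gate`, and the failure messages are hypothetical, and treating an age exactly equal to the threshold as passing is a guess about how "Max" is interpreted.

```python
from dataclasses import dataclass
from typing import Optional

MAX_SNAPSHOT_AGE_S = 300   # default threshold from the table above
MAX_HEARTBEAT_AGE_S = 120  # default threshold from the table above


@dataclass
class GateInput:
    """Hypothetical inputs the gate would need for one target table."""
    snapshot_age_s: Optional[float]   # None when no snapshot exists
    heartbeat_age_s: Optional[float]  # None when no heartbeat was recorded
    checksum_valid: bool


def recovery_gate(inp,
                  max_snapshot_age_s=MAX_SNAPSHOT_AGE_S,
                  max_heartbeat_age_s=MAX_HEARTBEAT_AGE_S):
    """Return (ok, failures); ok is True only when every check passes."""
    failures = []
    if inp.snapshot_age_s is None:
        failures.append("no snapshot for target table")
    elif inp.snapshot_age_s > max_snapshot_age_s:
        failures.append("snapshot too old")
    if inp.heartbeat_age_s is None or inp.heartbeat_age_s > max_heartbeat_age_s:
        failures.append("sidecar heartbeat stale")
    if not inp.checksum_valid:
        failures.append("manifest checksum invalid")
    return (not failures, failures)
```

Collecting every failure instead of stopping at the first one lets an operator see all the reasons a CRITICAL operation was blocked in a single response.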
Restoring from a snapshot
Guided table recovery
For operator-driven recovery, start with the guided command:
backstop recover \
--db postgresql://postgres:postgres@localhost:5432/mydb \
--storage s3://backstop-snapshots@http://localhost:9000 \
--table users
The wizard lists only valid checksummed recovery points, refuses to restore over the original table by default, restores into users_recovered, validates the restored table, and prints copyback SQL only after validation passes.
Fast table restore
The lower-level backstop restore command reads a Parquet snapshot and writes it back to the target table. It is intended for automation and scripted incident procedures. Runtime depends on table size, storage throughput, and whether you restore into a recovered table first for validation.
# Dry run — shows what would be restored
backstop restore \
--db postgresql://postgres:postgres@localhost:5432/mydb \
--storage s3://backstop-snapshots@http://localhost:9000 \
--snapshot-id snap_a3f9e2c1 \
--table users \
--dry-run
# Execute restore
backstop restore \
--db postgresql://postgres:postgres@localhost:5432/mydb \
--storage s3://backstop-snapshots@http://localhost:9000 \
--snapshot-id snap_a3f9e2c1 \
--table users
The restore:
- Verifies the manifest checksum
- Reads the Parquet file from S3
- Creates the target schema/table from the captured DDL
- Inserts all rows in batches
- Reapplies captured indexes and constraints on a best-effort basis
- Verifies row count, target existence, indexes, constraints, and sample equality where possible
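Two of the steps above, checksum verification and batched inserts, can be illustrated with the standard library alone. This is a sketch under stated assumptions, not the restore command's implementation: it assumes the manifest `checksum` field is `sha256:` followed by the full hex digest of the Parquet file bytes, and the helper names (`sha256_checksum`, `verify_checksum`, `batches`) are hypothetical.

```python
import hashlib
from itertools import islice


def sha256_checksum(data: bytes) -> str:
    """Format a digest the way the manifest example does: 'sha256:<hex>'."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def verify_checksum(parquet_bytes: bytes, manifest: dict) -> bool:
    """Compare the downloaded file's digest with the manifest entry.

    Assumption: the checksum covers the raw Parquet file bytes.
    """
    return sha256_checksum(parquet_bytes) == manifest.get("checksum")


def batches(rows, size):
    """Yield lists of at most `size` rows, for batched inserts."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk
```

A restore loop built on these helpers would refuse to continue when `verify_checksum` returns False, then compare the number of inserted rows with the manifest's `row_count` once all batches are written.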
Listing available snapshots
backstop snapshots list \
--db postgresql://postgres:postgres@localhost:5432/mydb \
--storage s3://backstop-snapshots@http://localhost:9000 \
--table users
Or via the API:
curl "http://localhost:8080/metadata/snapshots?table=users" | jq '.snapshots'
Limitations
- Snapshots are per-table. They can capture table DDL, indexes, and constraints, but they do not guarantee cross-table transactional consistency. For multi-table domains, configure recovery groups and validate before copyback.
- Snapshots are not a full PostgreSQL backup. Functions, triggers, grants, extensions, custom types, schemas outside the captured target, and cluster-level state require logical backups or PITR.
- Snapshot data reflects the state at snapshot time. Data written after the snapshot was taken is not recoverable from the snapshot alone — you need PITR for that window.
For schema recovery and post-snapshot data recovery, see the Operations runbooks.