A periodic snapshot of the volatile state under programfiles/ — every
authored document plus the user store — written as a zip archive on a
configurable interval, with a configurable retention cap. The shape
mirrors the sitemap writer: one tokio task per process, kicked off from
main after the runtime is up.

backup::spawn_writer() is called from main.rs right after the runtime
comes online. It’s a no-op when either backup_interval_secs or
backup_retain is missing, zero, or negative: config_u64 only returns
Some for strictly positive integers, and both gates short-circuit to
return before any task is spawned. A deployment that hasn’t set those
keys stays completely quiet; nothing is written, no background task
exists.
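The gating can be sketched as follows. This is a minimal illustration, not the code in src/backup.rs: positive_u64 is a hypothetical stand-in for config_u64 that takes an already-parsed value instead of reading config.json, and writer_settings is a hypothetical helper that models the two gates.

```rust
/// Hypothetical stand-in for config_u64. `raw` models the value already
/// looked up in the config (None means the key is missing).
fn positive_u64(raw: Option<i64>) -> Option<u64> {
    match raw {
        // Only strictly positive integers count as "set".
        Some(n) if n > 0 => Some(n as u64),
        _ => None,
    }
}

/// Returns (interval_secs, retain) only when both gates pass.
/// A None result corresponds to "return before spawning anything".
fn writer_settings(interval: Option<i64>, retain: Option<i64>) -> Option<(u64, u64)> {
    let interval = positive_u64(interval)?; // first gate
    let retain = positive_u64(retain)?; // second gate
    Some((interval, retain))
}
```

With these definitions, writer_settings(Some(3600), Some(7)) yields a pair, while a zero, negative, or missing value for either key yields None.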
INCLUDE_PATHS in src/backup.rs lists exactly two directories:

- programfiles/content/: the doc tree (every resource’s ctx.json,
  body files, perms).
- programfiles/local_auth/: the user store.

programfiles/op/ is not in the list. It holds checked-in static config
(config.json, navbar.json, support_lang.json, robots.txt), so losing it
costs nothing; it comes back with git pull.

Each archive is named backup-<unix_ts>.zip, where <unix_ts> is unix_now()
at the start of the cycle. The timestamp is embedded in the filename so
that lexical sort matches chronological order; list_archives and prune
rely on that. The on-disk mtime additionally preserves the
human-readable timestamp for ls -l.
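A small sketch of the naming scheme and why sorting by name works. The helper names archive_name and parse_ts are hypothetical and only illustrate the backup-<unix_ts>.zip convention. Lexical order equals numeric order only while all timestamps have the same number of digits, which holds for every ten-digit Unix timestamp.

```rust
/// Builds the canonical archive file name for a cycle start time.
fn archive_name(unix_ts: u64) -> String {
    format!("backup-{}.zip", unix_ts)
}

/// Recovers the timestamp from a canonical name; returns None for
/// anything that is not exactly backup-<digits>.zip.
fn parse_ts(name: &str) -> Option<u64> {
    name.strip_prefix("backup-")?
        .strip_suffix(".zip")?
        .parse()
        .ok()
}
```

Sorting a list of such names with the default string ordering puts the oldest archive first, and a leftover backup-<ts>.zip.tmp does not parse as a canonical name.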
run_cycle writes to <archive>.tmp and then renames over the final
path. Same atomicity trick routes/sitemap.rs::persist uses. If
archive_paths_to_file fails partway through, the .tmp file is
unlinked and the cycle returns without renaming, so a half-written archive
never appears under a canonical backup-<ts>.zip name. If the rename
itself fails, the .tmp is also unlinked — list_archives filters by
the backup-*.zip pattern so a stray .tmp wouldn’t be counted anyway,
but the cleanup keeps the destination tidy.
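The write-then-rename sequence can be sketched with the standard library alone. This is an illustration under assumptions, not run_cycle itself: write_atomically is a hypothetical name, and a plain byte slice stands in for the zip payload that archive_paths_to_file would produce.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Writes `bytes` to `<final_path>.tmp`, then renames it over `final_path`.
/// On either failure the temp file is removed (best effort) and the error
/// is returned, so no partial file appears under the final name.
fn write_atomically(final_path: &Path, bytes: &[u8]) -> io::Result<()> {
    // Keep the temp file next to the destination so the rename stays
    // within one filesystem.
    let mut tmp_name = final_path.as_os_str().to_owned();
    tmp_name.push(".tmp");
    let tmp = PathBuf::from(tmp_name);

    if let Err(e) = fs::write(&tmp, bytes) {
        let _ = fs::remove_file(&tmp);
        return Err(e);
    }
    if let Err(e) = fs::rename(&tmp, final_path) {
        let _ = fs::remove_file(&tmp);
        return Err(e);
    }
    Ok(())
}
```

After a successful call only the final file exists; the .tmp sibling is gone because the rename consumed it.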
prune(dir, retain) lists archives via list_archives (which sorts
lexically, i.e. oldest first thanks to the timestamped names), then
deletes everything except the newest retain entries. Individual
remove_file errors are logged and swallowed — a single undeletable file
shouldn’t block the pruning of the rest, and certainly shouldn’t block
the next cycle’s write.
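The retention logic described above might look roughly like this. The bodies and signatures are assumptions for illustration (the real list_archives and prune may differ in return types and logging); only the behaviour is taken from the text: filter by backup-*.zip, sort lexically, delete all but the newest retain entries, and keep going when a delete fails.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Lists files named backup-*.zip in `dir`, sorted oldest first.
fn list_archives(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        let is_archive = path
            .file_name()
            .and_then(|n| n.to_str())
            .map_or(false, |n| n.starts_with("backup-") && n.ends_with(".zip"));
        if is_archive {
            found.push(path);
        }
    }
    found.sort(); // lexical order; chronological thanks to the timestamp
    Ok(found)
}

/// Deletes everything except the newest `retain` archives.
/// Returns how many files were actually removed.
fn prune(dir: &Path, retain: usize) -> io::Result<usize> {
    let archives = list_archives(dir)?;
    let excess = archives.len().saturating_sub(retain);
    let mut removed = 0;
    for path in &archives[..excess] {
        match fs::remove_file(path) {
            Ok(()) => removed += 1,
            // Log and continue: one stuck file must not block the rest.
            Err(e) => eprintln!("backup: could not remove {}: {}", path.display(), e),
        }
    }
    Ok(removed)
}
```

Using saturating_sub keeps the slice bounds valid when fewer than retain archives exist, in which case nothing is deleted.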
spawn_writer runs one refresh().await before entering the
tokio::time::sleep loop, so a fresh restart always has a recent
snapshot even on hosts that get restarted more often than the configured
interval.
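Configuration is read fresh from programfiles/op/config.json each cycle
(the writer doesn’t cache config beyond a single function call), so
editing the file takes effect on the next tick with no restart required.
Because a disabled writer never spawns its task, this applies only when
the writer was enabled at startup; turning backups on from a disabled
state still needs a restart.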
- backup_interval_secs: seconds between snapshots. Goes through
  config_u64, which means zero / negative / missing all mean
  “disabled”. When disabled, spawn_writer returns immediately without
  spawning the task.
- backup_retain: keep at most this many archives. Same config_u64
  semantics; same “disabled = no task” behavior. Required because
  without a cap, the archive directory grows without bound.
- backup_dir: destination. Optional string. Empty / missing falls back
  to DEFAULT_BACKUP_DIR (programfiles/backup). Relative paths resolve
  from the working directory, and an external mount works without code
  changes: just point it at /mnt/backups/fds or similar.

To restore, run unzip backup-<ts>.zip at the repo root. Archive entry
paths preserve each include’s leading folder (content/foo.md for the
programfiles/content include; see src/zip.rs::archive_paths_to_file
for how strip_root is set to the include’s parent), so a top-level
unzip recreates the programfiles/content/... and
programfiles/local_auth/... layout.

No metadata is stripped; perms files, ctx.jsons, and body files all
round-trip byte-for-byte. The test
run_cycle_writes_archive_with_payload in src/backup.rs verifies the
round-trip on a synthetic workspace.
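How an entry path keeps the include’s leading folder can be sketched as below. This assumes, as the text states, that strip_root is the include’s parent directory; entry_name is a hypothetical helper and is not taken from src/zip.rs.

```rust
use std::path::Path;

/// Computes the archive entry name for `file`, which must live under
/// `include`. The include's parent is stripped, so the include's own
/// folder name stays as the first path segment.
fn entry_name(include: &Path, file: &Path) -> Option<String> {
    let strip_root = include.parent()?;
    let rel = file.strip_prefix(strip_root).ok()?;
    // Join with forward slashes, the separator used in zip entry names.
    let parts: Vec<String> = rel
        .components()
        .map(|c| c.as_os_str().to_string_lossy().into_owned())
        .collect();
    Some(parts.join("/"))
}
```

For the programfiles/content include, a file at programfiles/content/foo.md maps to the entry content/foo.md; a path outside the include’s parent yields None.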
The #[cfg(test)] mod tests block in src/backup.rs shows a pattern
worth reusing for any module that touches the filesystem. Each test owns
a fresh target/test_backup_<name>/ directory built by a workspace
helper that remove_dir_alls and re-creates the path, so cargo’s default
parallel test execution doesn’t create cross-test interference. target/
is already gitignored, so the test workspace can’t leak into a commit
even if cleanup fails. Three tests cover the core contract:
run_cycle_writes_archive_with_payload (single cycle, round-trip
payload), prune_keeps_n_newest (retention behaviour with 10 fake
archives pruned to 3), and list_archives_ignores_unrelated_files
(filter rejects .tmp and unrelated user files). A fourth test,
run_cycle_no_op_when_no_paths_present, pins the empty-includes case to
“do nothing” rather than “write an empty archive”.
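A workspace helper in the spirit described above could be written like this. It is a sketch, not the helper from src/backup.rs: the name workspace_in and the explicit base parameter are assumptions made so the example does not depend on the current directory. Passing Path::new("target") as the base gives the target/test_backup_<name>/ layout the tests use.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Returns a fresh, empty directory `<base>/test_backup_<name>`.
/// Any leftovers from a previous run are removed first.
fn workspace_in(base: &Path, name: &str) -> PathBuf {
    let dir = base.join(format!("test_backup_{}", name));
    // Ignore the error: the directory may simply not exist yet.
    let _ = fs::remove_dir_all(&dir);
    fs::create_dir_all(&dir).expect("create test workspace");
    dir
}
```

Because each test passes its own name, two tests running in parallel never share a directory, and a rerun starts from an empty one even if the previous run left files behind.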
See SEO and sitemap for the parallel background-writer + atomic-rename pattern.