Deploy¶
The whole stack is a single Compose file, dataland-infrastructure/compose.yml. Deploy is one shell script, deploy.sh, that fast-forwards every repo, derives an immutable image tag, runs a pre-deploy boot guard, then rolls the stack with docker compose up -d --build.
Where this runs
Production lives on the Spark DGX VDS (ege@100.124.170.43, repos under /home/cobanov/DATALAND/). The public surface is fronted by Cloudflare tunnels; operators on the Tailscale tailnet reach the stack directly via the *_PUBLIC_BIND host bindings. Service-to-service traffic always uses the dataland-network bridge and is unaffected by host port binding. See Services for per-service detail.
Recent changes
- DAT-291 added the pre-deploy boot-guard fail-fast (run the real agent boot guard against the production .env before any rebuild), wired the tailnet *_PUBLIC_BIND publishing across stateful services, added the docs.dataland.chat service, and folded prod secret rotation into the deploy flow.
- DAT-269 standardized every service on gemini-3.5-flash (chat + Gemini captioning); RAG vectors use gemini-embedding. A deploy after a model bump just rebuilds the agent + rag images; the model id is read from .env at boot.
- DAT-82 introduced the host-metrics Compose profile (cAdvisor + node-exporter) that deploy.sh layers automatically on Linux deploys.
First time on a new host¶
cd /home/cobanov/DATALAND
cp dataland-infrastructure/.env.example .env
mkdir -p secrets && chmod 700 secrets # (1)!
# fill in .env, drop secrets/gcp-key.json, then:
chmod 600 .env secrets/gcp-key.json # (2)!
1. 700 on secrets/ so only the operator can traverse the directory. The gcp-key.json you drop inside, and the .env next to the directory, still need their own 600: the directory mode alone does not protect the files.
2. chmod 600 matters (DAT-88): the default umask on a fresh secrets/ would leave new files world-readable. The Compose mount is :ro from the container, but host-side file permissions are the operator's job, so lock both the .env and the GCP key down to 600 (owner read/write only).
Fill the :?-required secrets before the first deploy
Several services hard-fail at startup if their secret is missing or still a placeholder. Compose enforces a subset via ${VAR:?...}:
| Variable | Service | Enforced by |
|---|---|---|
| REDIS_PASSWORD | redis + every consumer (DAT-76) | Compose :? + healthcheck |
| MUSEUM_PASSWORD, MUSEUM_SESSION_SECRET | museum-api dashboard gate | Compose :? |
| RDC_REDIS_URL | museum-api (external RDC redis) | Compose :? |
| INFORMATION_WEBUI_PASSWORD, INFORMATION_WEBUI_SESSION_SECRET | Catalog Studio | Compose :? |
| DOCS_PASSWORD | docs.dataland.chat basic auth | Compose :? + entrypoint |
| GRAFANA_ADMIN_PASSWORD | Grafana (DAT-266) | Compose :? |
Anything else flagged by the boot guard (DAT-291) is caught on the next deploy — see below.
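The :? rule matches the POSIX shell expansion of the same name: it fails when the variable is unset or empty. The following is a minimal sketch of a pre-flight that applies that rule to an env file before the first deploy. The helper name missing_required_vars and the plain KEY=VALUE parsing are assumptions for illustration and are not part of deploy.sh; quoted values, export prefixes, and multi-line values are not handled.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of deploy.sh): print every required variable
# that is unset or empty in an env file, mirroring the ${VAR:?...} rule.
# Assumes plain KEY=VALUE lines only.
missing_required_vars() {
  local env_file="$1"; shift
  local name value missing=0
  for name in "$@"; do
    # Take the last assignment of NAME in the file and strip the "NAME=" part.
    value="$(grep -E "^${name}=" "$env_file" | tail -n 1 | cut -d= -f2-)"
    if [ -z "$value" ]; then
      echo "$name"
      missing=1
    fi
  done
  return "$missing"
}

# Variable names taken from the table above.
required_vars=(
  REDIS_PASSWORD MUSEUM_PASSWORD MUSEUM_SESSION_SECRET RDC_REDIS_URL
  INFORMATION_WEBUI_PASSWORD INFORMATION_WEBUI_SESSION_SECRET
  DOCS_PASSWORD GRAFANA_ADMIN_PASSWORD
)
# Usage: missing_required_vars .env "${required_vars[@]}"
```

The function returns non-zero when anything is missing, so it can gate a first deploy. It only checks presence; placeholder values are still the boot guard's job.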
Routine deploy¶
deploy.sh (cds to the parent of dataland-infrastructure first, so paths are relative to /home/cobanov/DATALAND) runs, in order:
flowchart TD
A["git checkout main<br/>git pull --ff-only<br/>(all 6 repos)"] --> B["Derive IMAGE_TAG<br/>= UTC timestamp + infra short SHA"]
B --> C{"dataland/agent:latest<br/>image exists?"}
C -- "yes" --> D["DAT-291 boot guard:<br/>docker run --rm --env-file .env<br/>agent:latest → assert_boot_required_env()"]
C -- "no (first deploy)" --> F
D -- "exit 1" --> E["ABORT before rebuild<br/>(no crash-loop)"]
    D -- "exit 0" --> F["docker compose --profile host-metrics<br/>up -d --build"]
F --> G["Retag each built image<br/>:IMAGE_TAG → :latest"]
G --> H["docker compose ps<br/>echo Active deploy: IMAGE_TAG"]
1. Fast-forward every repo¶
The script checks out main and git pull --ff-only for all six repos:
repos=(
dataland-infrastructure # (1)!
dataland-agent
dataland-rag-v2
dataland-museum
dataland-notification
dataland-atlas
)
1. All six repos are checked out to main and pulled with git pull --ff-only.
--ff-only is deliberate: a deploy must never silently create a merge commit on the box. If a repo has diverged from origin/main, the pull fails loudly and the deploy stops before anything is rebuilt.
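A sketch of how the fast-forward loop can stop at the first diverged repo. The DRY_RUN switch and the function names run and fast_forward_all are hypothetical additions for illustration; the actual loop in deploy.sh may be written differently.

```bash
#!/usr/bin/env bash
# Sketch of the fast-forward step. DRY_RUN=1 (a hypothetical switch, not a
# deploy.sh feature) prints the git commands instead of running them.
repos=(
  dataland-infrastructure dataland-agent dataland-rag-v2
  dataland-museum dataland-notification dataland-atlas
)

run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

fast_forward_all() {
  local repo
  for repo in "$@"; do
    # Stop at the first failure so nothing is rebuilt from a diverged repo.
    run git -C "$repo" checkout main || return 1
    run git -C "$repo" pull --ff-only || return 1
  done
}
# Usage: DRY_RUN=1 fast_forward_all "${repos[@]}"
```

Running it once with DRY_RUN=1 shows exactly which commands a deploy would issue, without touching any working tree.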
2. Derive IMAGE_TAG¶
infra_sha=$(git -C dataland-infrastructure rev-parse --short HEAD) # (1)!
export IMAGE_TAG="${IMAGE_TAG:-$(date -u +%Y%m%d-%H%M%S)-${infra_sha}}" # (2)!
1. Short SHA of the dataland-infrastructure repo at deploy time. The deploy is keyed to the infra repo HEAD, not the per-service repos, since compose.yml lives here.
2. DAT-77: ${IMAGE_TAG:-…} only mints a fresh <UTC-timestamp>-<infra-sha> tag (e.g. 20260604-093015-eb4857a) when IMAGE_TAG is not already set, which is what lets you pin a deploy to an existing tag. The timestamp + SHA makes every deploy's image set immutable and rollback-addressable.
IMAGE_TAG is a UTC timestamp plus the infrastructure repo short SHA, e.g. 20260604-093015-eb4857a (DAT-77). Every internally-built image in compose.yml is tagged dataland/<svc>:${IMAGE_TAG:-latest}, so each deploy produces an immutable, timestamped set of images you can roll back to.
Pinning a deploy
Set IMAGE_TAG in the environment (or in .env) to reuse an existing tag instead of minting a new one. Leaving it unset in .env means a plain docker compose up uses whatever :latest the local Docker last built. deploy.sh always exports a fresh value unless you pre-set one.
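The mint-or-pin behaviour can be sketched as a small function. The names resolve_image_tag and is_minted_tag are hypothetical, and the format check is an assumption based on the example tag above, not something deploy.sh is known to enforce.

```bash
#!/usr/bin/env bash
# Sketch of the tag logic: reuse IMAGE_TAG when it is already set, otherwise
# mint <UTC-timestamp>-<short-sha>. resolve_image_tag is a hypothetical name.
resolve_image_tag() {
  local infra_sha="$1"
  echo "${IMAGE_TAG:-$(date -u +%Y%m%d-%H%M%S)-${infra_sha}}"
}

# Hypothetical sanity check for the minted format, e.g. 20260604-093015-eb4857a.
is_minted_tag() {
  [[ "$1" =~ ^[0-9]{8}-[0-9]{6}-[0-9a-f]{7,40}$ ]]
}
# Usage:
#   export IMAGE_TAG="$(resolve_image_tag "$(git -C dataland-infrastructure rev-parse --short HEAD)")"
```

With IMAGE_TAG pre-set in the shell, the function returns it unchanged; with it unset, each call mints a new timestamped tag.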
3. Pre-deploy boot guard (DAT-291)¶
Before rebuilding anything, deploy.sh runs the real agent boot guard against the production .env using the current dataland/agent:latest image:
if docker image inspect dataland/agent:latest > /dev/null 2>&1; then # (1)!
if ! docker run --rm --env-file .env dataland/agent:latest /app/.venv/bin/python -c "
import sys
from app.runtime import assert_boot_required_env
try:
assert_boot_required_env()
except RuntimeError as exc:
print(exc, file=sys.stderr)
sys.exit(1)
"; then
echo "ABORT: .env failed the agent boot guard (see above)." >&2 # (2)!
exit 1 # (3)!
fi
fi
1. Guards the whole block on dataland/agent:latest existing. On the very first deploy there is no :latest image to run the guard from, so the check is skipped and the deploy trusts the operator's .env (DAT-291).
2. Runs the real assert_boot_required_env(), the same function the container runs at boot, against the production .env, so the pre-flight can never drift from the boot-time contract. --rm discards the throwaway container; --env-file .env feeds it the exact env a real boot would see.
3. Exits non-zero before docker compose ... --build is ever reached. Nothing is rebuilt or rolled: fix the flagged secrets in .env and re-run deploy.sh. This is the check that prevents the crash-loop outage.
Why this exists: the agent's boot guard (assert_boot_required_env) crash-loops the container if the production .env still holds placeholder/default secrets. Before DAT-291, that meant a freshly-built agent would build cleanly, start, fail the guard, restart, and fail again, taking chat offline. That is the exact outage this check prevents.
Properties of the guard
- It runs the same function the container runs at boot, so the pre-flight check can never drift from the boot-time contract.
- It is a no-op outside APP_ENV=production, so dev deploys are unaffected.
- It is skipped on the very first deploy when no dataland/agent:latest image exists yet (nothing to run the guard from). The first deploy therefore trusts the operator's .env; subsequent deploys are guarded.
- On failure the script exits 1 before docker compose ... --build is invoked. Nothing is rebuilt, nothing is rolled. Fix the flagged secrets in .env, re-run deploy.sh.
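The ordering these properties describe (skip on first deploy, abort before any rebuild, roll only on success) can be sketched with the Docker calls abstracted away. guarded_roll is a hypothetical name, and each argument is the name of a command or function supplied by the caller, so the control flow can be exercised with stubs.

```bash
#!/usr/bin/env bash
# Sketch of the fail-fast ordering. Arguments are command or function names:
#   $1 succeeds when a previous agent image exists
#   $2 succeeds when the .env passes the boot guard
#   $3 builds and rolls the stack
# guarded_roll is a hypothetical name, not a function in deploy.sh.
guarded_roll() {
  local have_image="$1" guard="$2" roll="$3"
  if "$have_image"; then
    if ! "$guard"; then
      echo "ABORT: .env failed the agent boot guard (see above)." >&2
      return 1          # nothing is rebuilt, nothing is rolled
    fi
  fi
  # First deploy (no image) or guard passed: proceed to the roll.
  "$roll"
}
# Usage with real commands wrapped in functions:
#   have_latest_agent() { docker image inspect dataland/agent:latest > /dev/null 2>&1; }
#   guarded_roll have_latest_agent run_boot_guard roll_stack
```

Keeping the guard before the roll in a single function makes it hard to reorder the two by accident.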
4. Build + roll the stack (with the host-metrics profile)¶
docker compose -f dataland-infrastructure/compose.yml --profile host-metrics --env-file .env up -d --build # (1)!
1. DAT-82: --profile host-metrics layers in cadvisor + node-exporter, which mount the host /proc and /sys and only work on Linux. deploy.sh always passes it because production is the Spark Linux box; on a macOS dev box you drop the flag so those two containers stay off. --build rebuilds only images whose Dockerfile/source changed, then -d recreates each service detached.
This rebuilds any image whose Dockerfile or source changed, then recreates each service with the new image. restart: unless-stopped (from the x-common-env anchor) does the rest.
The host-metrics profile (DAT-82)
The profile enables cadvisor + node-exporter, which mount the host /proc and /sys and only work on Linux. deploy.sh always passes --profile host-metrics because production is the Spark Linux box. On a macOS dev box you would run docker compose up without the profile so those two containers stay off. The rest of the monitoring stack (prometheus, grafana, alertmanager, postgres-exporter, redis-exporter) has no profile and starts everywhere.
The simulator profile is separate and is not enabled by deploy.sh; the simulator only runs when you explicitly opt in — see Simulator.
5. Retag :IMAGE_TAG → :latest¶
for image in "${internal_images[@]}"; do
if docker image inspect "${image}:${IMAGE_TAG}" > /dev/null 2>&1; then # (1)!
docker tag "${image}:${IMAGE_TAG}" "${image}:latest" # (2)!
fi
done
1. The inspect guard skips any image that was not built this run. Only images that actually got the fresh :IMAGE_TAG are retagged.
2. Points :latest at the just-built :IMAGE_TAG. This is what makes the next deploy's boot guard (step 3) and any plain docker compose up resolve :latest to the most recent deploy.
The six internal images (dataland/auth-server, dataland/rag, dataland/museum-api, dataland/agent, dataland/notification, dataland/information-webui) get a :latest alias pointing at the just-built :IMAGE_TAG. The next boot guard run (step 3 of the following deploy) and any plain docker compose up then resolve :latest to the most recent deploy. The conditional skips images that weren't built this run.
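A sketch of the retag step with the two Docker calls placed behind small functions, so the selection logic can be checked without a Docker daemon. The function names image_exists, tag_image, and retag_latest are hypothetical; the image list is the one named above.

```bash
#!/usr/bin/env bash
# Sketch of the retag loop. The Docker calls live in two small functions so
# they can be stubbed; all three function names are hypothetical.
internal_images=(
  dataland/auth-server dataland/rag dataland/museum-api
  dataland/agent dataland/notification dataland/information-webui
)

image_exists() { docker image inspect "$1" > /dev/null 2>&1; }
tag_image()    { docker tag "$1" "$2"; }

retag_latest() {
  local tag="$1"; shift
  local image
  for image in "$@"; do
    # Only images that received the fresh tag this run are aliased.
    if image_exists "${image}:${tag}"; then
      tag_image "${image}:${tag}" "${image}:latest"
    fi
  done
}
# Usage: retag_latest "$IMAGE_TAG" "${internal_images[@]}"
```

An image that was not rebuilt keeps its existing :latest alias, which matches the skip behaviour described above.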
6. Report¶
docker compose -f dataland-infrastructure/compose.yml --env-file .env ps
echo "Active deploy: ${IMAGE_TAG}" # (1)!
echo "Roll back with: IMAGE_TAG=<previous-tag> docker compose -f dataland-infrastructure/compose.yml --env-file .env up -d" # (2)!
- Prints the immutable tag this deploy just stamped, so you have a record of exactly which image set is live.
- Emits the ready-to-paste rollback one-liner with the current tag interpolated — copy it before the next deploy and you have a no-rebuild path back to this exact state.
The final lines print the active tag and the exact rollback one-liner.
What deploy.sh does NOT do
The docs service is not in internal_images, so it never gets a :IMAGE_TAG → :latest retag in step 5 (it builds fine; it just isn't aliased). The script also does not run the smoke suite, run migrations explicitly (agent + auth run them on startup), or touch volumes.
Surgical single-service redeploy¶
A full deploy.sh rebuilds whatever changed across six repos. When you've changed exactly one service and want to roll only that container — without re-evaluating dependencies or disturbing healthy services — use a targeted up:
cd /home/cobanov/DATALAND
# Rebuild + recreate ONE service, leave its dependencies untouched:
docker compose -f dataland-infrastructure/compose.yml --env-file .env \
up -d --no-deps --build agent # (1)!
- --no-deps stops Compose from also recreating postgres, redis, rag, museum-api just because agent depends_on them. Those stay up and serving.
- --build rebuilds the target image from its repo context.
- Pass the compose service name, not the container name: agent, rag, museum-api, notification-worker, notification-api, information-webui, auth, docs.
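The service-name rule can be enforced before anything is run. This sketch validates the name against the list above and prints the surgical command rather than executing it; is_valid_service and surgical_command are hypothetical helpers, not part of the repo.

```bash
#!/usr/bin/env bash
# Hypothetical helpers: reject container names such as dataland-agent and
# print the surgical command for a valid compose service name.
valid_services=(
  agent rag museum-api notification-worker notification-api
  information-webui auth docs
)

is_valid_service() {
  local candidate="$1" svc
  for svc in "${valid_services[@]}"; do
    [ "$svc" = "$candidate" ] && return 0
  done
  return 1
}

surgical_command() {
  local svc="$1"
  if ! is_valid_service "$svc"; then
    echo "unknown compose service: $svc" >&2
    return 1
  fi
  echo "docker compose -f dataland-infrastructure/compose.yml --env-file .env up -d --no-deps --build $svc"
}
# Usage: surgical_command agent
```

Printing instead of executing leaves the operator a chance to review the command before it touches the stack.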
Pin the surgical redeploy to the active tag
A bare surgical up builds a :latest image (no IMAGE_TAG exported). To keep the deployed set consistent and immutable, mint a tag the same way deploy.sh does:
export IMAGE_TAG="$(date -u +%Y%m%d-%H%M%S)-$(git -C dataland-infrastructure rev-parse --short HEAD)" # (1)!
docker compose -f dataland-infrastructure/compose.yml --env-file .env \
up -d --no-deps --build agent
docker tag dataland/agent:${IMAGE_TAG} dataland/agent:latest # (2)!
1. Mints the tag exactly the way deploy.sh does (UTC timestamp + infra short SHA) so a surgical redeploy stays immutable instead of building an anonymous :latest.
2. deploy.sh's retag loop (step 5) does not run on a surgical up, so you alias :latest to the freshly-built tag by hand; otherwise the next full deploy's boot guard would still resolve :latest to the old image.
An .env change needs up, not restart
docker compose restart <svc> reuses the old environment. To pick up an edited .env, recreate the container: docker compose ... up -d --no-deps <svc>. (See the runbook — ".env change didn't take effect".)
The consumer-name guard restart window (DAT-109)¶
The notification consumers are the one place where a surgical redeploy has a timing constraint. notification-worker and notification-api both join the museum:telemetry Redis Streams consumer group (CONSUMER_GROUP, default notification-service) and each must own a unique consumer name in that group, or two replicas race over the same PEL slot and acks silently collide.
To defend against this, the consumer refuses to start if a consumer with its name is already registered in the group with an idle time short enough to mean it's still alive (app/consumer.py::_reject_duplicate_consumer, DAT-109):
RuntimeError: consumer_name '<name>' is already registered in group
'notification-service' with idle=<n>ms (< pel_reap_idle_ms=60000ms).
Another worker is using this name. Set CONSUMER_NAME to a unique value
or wait for the prior process to age out.
How this interacts with deploys:
- Default CONSUMER_NAME is unique per process: hostname-<8 hex> (_default_consumer_name). So a normal recreate (old container stops, new one starts with a fresh uuid suffix) does not collide. The guard is invisible in the happy path.
- If you pin CONSUMER_NAME to a fixed value, the new container can collide with the outgoing one during the brief overlap of a rolling recreate, or with a still-registered prior process. The guard then aborts the new container.
The restart window
The "still alive" threshold is pel_reap_idle_ms, default 60s (PEL_REAP_IDLE_MS=60000). A consumer whose last activity is older than that is treated as gone — the PEL reaper will reclaim its orphaned entries and re-registering the same name is safe (logged as consumer_name_reclaimed_from_idle_predecessor).
Practically, if you run the notification service with a fixed CONSUMER_NAME, leave ~60s between stopping the old container and starting the new one, or just rely on the unique default. The cleanest surgical redeploy of a fixed-name worker is:
docker compose -f dataland-infrastructure/compose.yml --env-file .env stop notification-worker
# wait > pel_reap_idle_ms (60s) for the old name to age out, then:
docker compose -f dataland-infrastructure/compose.yml --env-file .env \
up -d --no-deps --build notification-worker # (1)!
1. DAT-109: only needed when CONSUMER_NAME is pinned to a fixed value. The new replica's _reject_duplicate_consumer guard aborts startup if a consumer with the same name is still registered with idle < pel_reap_idle_ms (default 60s). Stopping first and waiting out that window lets the PEL reaper reclaim the old name before the new container claims it. With the unique default name (hostname-<8 hex>) you can skip the stop-and-wait entirely.
Or, set a fresh CONSUMER_NAME for the new replica and skip the wait entirely.
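The restart window reduces to one comparison between a consumer's idle time and pel_reap_idle_ms. The sketch below assumes the boundary implied by the error message (a name is rejected while idle < threshold); the function names are hypothetical and the exact comparison in app/consumer.py is not confirmed here.

```bash
#!/usr/bin/env bash
# Sketch of the restart-window arithmetic. A registered name is treated as
# reusable once its idle time has reached the threshold (assumed boundary).
PEL_REAP_IDLE_MS="${PEL_REAP_IDLE_MS:-60000}"

name_is_reusable() {
  local idle_ms="$1" threshold_ms="${2:-$PEL_REAP_IDLE_MS}"
  [ "$idle_ms" -ge "$threshold_ms" ]
}

# Whole seconds still to wait before a fixed CONSUMER_NAME can be claimed.
seconds_until_reusable() {
  local idle_ms="$1" threshold_ms="${2:-$PEL_REAP_IDLE_MS}"
  local remaining_ms=$(( threshold_ms - idle_ms ))
  if [ "$remaining_ms" -le 0 ]; then
    echo 0
  else
    echo $(( (remaining_ms + 999) / 1000 ))   # round up to whole seconds
  fi
}
# Usage: sleep "$(seconds_until_reusable "$idle_ms")"
```

The idle value for each consumer can be read from Redis with XINFO CONSUMERS museum:telemetry notification-service, which reports idle time in milliseconds.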
Rebuilding the docs container¶
docs.dataland.chat is its own Compose service (docs) built from dataland-infrastructure/docs/Dockerfile — a two-stage build that runs mkdocs build --strict (MkDocs 1.6.1 + Material 9.5.44 + pymdown-extensions 10.11.2) and serves the static site/ from nginx:1.27-alpine.
Edits to any page under docs/src/ (this site) or to docs/mkdocs.yml only land after the docs image is rebuilt:
cd /home/cobanov/DATALAND
docker compose -f dataland-infrastructure/compose.yml --env-file .env \
up -d --no-deps --build docs # (1)!
1. Edits to any page under docs/src/ or to docs/mkdocs.yml are baked into the static site/ at image-build time, so they only land after a --build. --no-deps keeps the rest of the stack untouched. The two-stage build runs mkdocs build --strict, so a broken link fails the build here rather than serving a broken page.
--strict will fail the build on a broken link
mkdocs build --strict (in the Dockerfile) turns broken internal links, missing nav entries, and broken anchors into build errors. A docs redeploy that fails to come up is almost always a --strict failure. Because the failure happens at image build time, read the build output printed by the up -d --no-deps --build docs command (docker logs dataland-docs only shows the container that was already running), then fix the link/anchor before re-running. This is why cross-links between pages must be valid relative paths, e.g. [RAG](../services/rag.md).
How the docs container is served and secured (DAT-291 added the service, DAT-73 the bind policy):
- Basic auth is generated at container start by docker-entrypoint.sh: htpasswd -bcB from DOCS_USERNAME / DOCS_PASSWORD, so the password lives only in env + memory, never baked into the image. The entrypoint hard-fails if DOCS_PASSWORD is unset.
- TLS terminates at Cloudflare; the credential never travels plaintext as long as docs is reached through the cloudflared tunnel.
- Health is GET /healthz (auth-exempt in nginx.conf), checked with wget --spider.
- Bind policy (DAT-73): published on two host IPs, 127.0.0.1:${DOCS_PUBLIC_PORT:-4148} for the host cloudflared systemd unit (→ public docs.dataland.chat) and ${DOCS_PUBLIC_BIND:-100.124.170.43}:4148 for direct tailnet browser access (spark:4148). Never set DOCS_PUBLIC_BIND to 0.0.0.0, because basic auth is the only protection.
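The bind policy lends itself to a small check that could run before a deploy. This is a sketch only: is_safe_public_bind is a hypothetical helper, and it rejects just the common wildcard spellings rather than validating the address fully.

```bash
#!/usr/bin/env bash
# Hypothetical guard for *_PUBLIC_BIND values: refuse wildcard addresses that
# would publish the port on every interface.
is_safe_public_bind() {
  case "$1" in
    0.0.0.0|::|"[::]") return 1 ;;
    *) return 0 ;;
  esac
}
# Usage:
#   is_safe_public_bind "${DOCS_PUBLIC_BIND:-100.124.170.43}" || exit 1
```

A loopback or tailnet address passes; only the wildcard forms are refused.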
Adding docs to the public tunnel
If docs.dataland.chat is a new ingress, add the route once in the Cloudflare Zero Trust dashboard (Tunnels → minotaur tunnel → Public Hostname → Add), pointing at http://127.0.0.1:4148. After that, every docs change is just a --no-deps --build docs.
Rollback¶
Deploy strategy is :latest in place — there is no blue-green. Rollback is a redeploy of a previous image. Because every deploy stamps an immutable IMAGE_TAG, the fastest rollback re-points the stack at a prior tag without rebuilding:
cd /home/cobanov/DATALAND
IMAGE_TAG=<previous-tag> \
docker compose -f dataland-infrastructure/compose.yml --env-file .env up -d # (1)!
1. No --build: this re-points the running stack at an already-built prior tag, so rollback is near-instant. The strategy is :latest in place with no blue-green, which is why every deploy stamps an immutable IMAGE_TAG you can address here. List candidates with docker images 'dataland/*'.
(deploy.sh prints exactly this line, with the active tag, at the end of every run.) List available tags with docker images 'dataland/*'.
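Because minted tags begin with a UTC timestamp, sorting them lexically also sorts them chronologically, which makes finding the previous tag mechanical. A minimal sketch, assuming tags arrive one per line on stdin; previous_tag is a hypothetical helper and non-minted tags such as latest are ignored.

```bash
#!/usr/bin/env bash
# Sketch: print the minted tag that immediately precedes the given one.
# Fails when the tag is unknown or has no predecessor.
previous_tag() {
  local current="$1"
  grep -E '^[0-9]{8}-[0-9]{6}-' | LC_ALL=C sort -u | awk -v cur="$current" '
    $0 == cur { if (prev == "") exit 1; print prev; found = 1; exit }
    { prev = $0 }
    END { if (!found) exit 1 }
  '
}
# Usage (assumes the agent image carries every deploy tag):
#   docker images dataland/agent --format '{{.Tag}}' | previous_tag "$IMAGE_TAG"
```

The result can be fed straight into the IMAGE_TAG=<previous-tag> rollback one-liner above.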
If the bad state is in source rather than an image you still have locally, roll the affected repo back and rebuild:
1. cd /home/cobanov/DATALAND/dataland-<repo>
2. git log --oneline -10 → identify the last known-good commit.
3. git checkout <sha> (or git reset --hard <sha> if bad code was already pushed).
4. cd /home/cobanov/DATALAND && ./dataland-infrastructure/deploy.sh (or a surgical up --no-deps --build <svc> for a single repo).
5. Confirm health: docker compose -f dataland-infrastructure/compose.yml ps.
For dependency-upgrade and migration rollbacks (uv.lock revert, alembic hand-revert), see the on-call runbook — the migration path is forward-only, so a bad migration is reverted by hand or restored from a Postgres dump.
Smoke test¶
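After a deploy, run the smoke script, bash dataland-infrastructure/scripts/smoke.sh (the same script the write-path run below invokes). By default it runs a read-only suite per service (agent, museum, rag-v2, notification, information-webui), prints a green/red rollup, and writes logs to dataland-infrastructure/.smoke-logs/.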
| Exit code | Meaning |
|---|---|
| 0 | all green |
| 1 | one or more suites failed |
| 2 | one or more skipped (tunnel down, suite missing) |
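The exit codes above can be turned into a gate for follow-up automation. In this sketch, smoke_verdict is a hypothetical wrapper and the optional second argument (treat skips as acceptable) is an assumption, not a feature of smoke.sh.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper: map a smoke.sh exit code to a message and a pass/fail
# status. Pass 1 as the second argument to accept skipped suites.
smoke_verdict() {
  local code="$1" allow_skips="${2:-0}"
  case "$code" in
    0) echo "all green"; return 0 ;;
    1) echo "one or more suites failed"; return 1 ;;
    2) echo "one or more skipped (tunnel down, suite missing)"
       [ "$allow_skips" = "1" ] ;;
    *) echo "unexpected exit code: $code"; return 1 ;;
  esac
}
# Usage:
#   bash dataland-infrastructure/scripts/smoke.sh; smoke_verdict "$?"
```

Treating exit code 2 as a failure by default keeps a down tunnel from being mistaken for a healthy deploy.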
For full coverage including write paths, open the internal-port tunnel and flip writes on:
ssh -L 4143:127.0.0.1:4143 -L 8080:127.0.0.1:8080 -L 4145:127.0.0.1:4145 \
ege@100.124.170.43 # (1)!
SMOKE_ALLOW_WRITES=1 \
SMOKE_NOTIFICATION_REDIS_CLEANUP=tunneled \
bash dataland-infrastructure/scripts/smoke.sh # (2)!
1. Forwards the loopback-only internal ports (4143, 8080, 4145) over SSH so the write-path suites can reach services that are bound to 127.0.0.1 on the box and are not exposed publicly.
2. SMOKE_ALLOW_WRITES=1 flips the suite from read-only to exercising write paths, which is only safe behind the tunnel. SMOKE_NOTIFICATION_REDIS_CLEANUP=tunneled tells the suite to clean up its notification Redis test keys through the forwarded port rather than assuming a local redis.
There is also an end-to-end visit-flow harness, scripts/smoke-visit-flow.sh, which exercises the ticket → telemetry → notification path and tails dataland-notification-worker to confirm rules fire.
Logs¶
docker compose -f dataland-infrastructure/compose.yml --env-file .env logs -f agent
docker compose -f dataland-infrastructure/compose.yml --env-file .env logs -f museum-api
# ...etc
Each service also writes JSONL to ${DATALAND_LOG_DIR:-/home/cobanov/DATALAND/logs}/<service>/. The Docker json-file driver is capped at 10 MiB × 5 rotations (50 MiB max per container, DAT-81) so a chatty service can't fill host disk; the bind-mounted JSONL needs a separate logrotate config — see reports/runbook.md.
Simulator¶
For local end-to-end testing without an Empatica band, layer compose.sim.yml:
SIM_TICKET_ID=<a-real-ticket> \
SIM_REDIS_PASSWORD=$(openssl rand -hex 24) \
SIM_FORCE_HR_SPIKE=1 SIM_FORCE_EDA_SPIKE=1 \
bash dataland-infrastructure/scripts/start-simulator.sh # (1)!
docker logs -f dataland-notification-worker # (2)!
1. SIM_TICKET_ID must be a real ticket so the synthetic telemetry maps to a routable visitor. SIM_REDIS_PASSWORD is minted fresh for the isolated sidecar redis, not reused from prod. SIM_FORCE_HR_SPIKE / SIM_FORCE_EDA_SPIKE inject heart-rate and EDA spikes so the notification rules actually fire instead of waiting on organic signal.
2. Tail the worker to watch the rules trigger end-to-end. Because the overlay points notification-worker at the sim redis, these are sim events only: the live dataland-redis is untouched and real visitors are undisturbed.
The overlay does three things:
- Spins up a sidecar dataland-redis-sim (separate volume, loopback port 4149).
- Spins up dataland-telemetry-sim, a Python publisher that XADDs synthetic events to museum:telemetry on the sim redis.
- Recreates notification-worker + notification-api with REDIS_HOST pointed at the sim redis. The main dataland-redis keeps serving every other service, so real visitors aren't disturbed.
Stop with bash dataland-infrastructure/scripts/stop-simulator.sh.
Never replay simulation telemetry into live redis
The playback/simulator path must stay isolated on its own redis container. Replaying museum-simulation Avro/telemetry into the production dataland-redis would corrupt the live museum:telemetry stream that the notification consumers drain.
Destructive reset¶
The reset wraps docker compose down -v --remove-orphans followed by a fresh up -d --build. -v deletes every named volume: postgres-data, qdrant-data, redis-data, auth-data, prometheus-data, alertmanager-data, grafana-data, plus the webui SQLite bind. Read reports/backup-restore.md first: this is not reversible from inside the stack, and a wiped auth-data means re-provisioning the JWKS mirror (DAT-286, data/extra_jwks.json).
Inspect¶
docker compose -f dataland-infrastructure/compose.yml --env-file .env ps
docker compose -f dataland-infrastructure/compose.yml --env-file .env config # (1)!
docker images 'dataland/*' # (2)!
1. Renders the fully resolved Compose config with .env interpolated and anchors expanded. This is the canonical way to confirm what a service will actually run with (env, binds, profiles) before or after a deploy.
2. Lists every built dataland/* image tag. Each :IMAGE_TAG here is a rollback candidate you can feed straight into the IMAGE_TAG=<previous-tag> … up -d one-liner.
Adding a new service¶
1. Add the service block to compose.yml, attached to dataland-network and inheriting <<: *common-env.
2. Pin mem_limit / cpus to a realistic budget; defaults are unbounded (DAT-80).
3. Bind host port to 127.0.0.1: (and optionally a *_PUBLIC_BIND tailnet IP, DAT-73) unless it's genuinely a public ingress. Never 0.0.0.0.
4. Add a healthcheck.
5. If it builds an internal image, add dataland/<svc> to the internal_images array in deploy.sh so it gets the :IMAGE_TAG → :latest retag.
6. Re-deploy. If it's public-facing, add the Cloudflare ingress route (Zero Trust → Tunnels → minotaur tunnel → Public Hostname → Add).
7. Document it: copy a service page in dataland-infrastructure/docs/src/services/, add it to docs/mkdocs.yml nav (so the --strict build passes), and rebuild the docs container.