Flagsmith Flag Rollback Runbook
Owner: Platform Ops
Last updated: 2026-05-22
Related ADRs: ADR-0018 §F (multi-tenancy + feature flag governance)
Related auxiliary memory: reference_flagsmith_v2_env_patch.md, reference_flagsmith_v2_value_patch.md, feedback_flagsmith_dual_env.md
Overview
Curaway uses Flagsmith for runtime feature flagging with V2 environment versioning enabled. Every flag exists in two environments — Production and Development — and rollback always flips both, never one. This runbook covers:
- The general procedure for flipping any V2-versioned flag back to OFF
- Per-flag rollback semantics for flags with known asymmetric side effects, starting with `auto_invoke_matcher_on_intake_complete`
If a flag is not listed in the per-flag section below, follow the general procedure and assume symmetric semantics (flipping OFF reverts behavior cleanly for all subsequent traffic).
Curaway Flagsmith environment reference
| Environment | env_id (int) | env api_key |
|---|---|---|
| Production | 85219 | `X4CdBvak98wpn6Ljq7eUSs` |
| Development | 85220 | `SCE375zGzViFpGZhWoiK7D` |
- Project ID: `36214`
- Admin token: Railway env var `FLAGSMITH_ADMIN_TOKEN` (Production project). Local: pull from Railway with `railway variables --service backend | grep FLAGSMITH`.
- API base: `https://api.flagsmith.com`
Both envs are V2-versioned (EnvironmentFeatureVersion rows present on every featurestate). The legacy unscoped PATCH endpoint /api/v1/features/featurestates/{id}/ returns HTTP 400 with the message "This environment uses v2 feature versioning. Use the environment feature version endpoint instead." — do not use it.
General procedure: flip a V2-versioned flag back to OFF
Step 1: Confirm with SD before flipping
Flag flips are shared-state changes. Always confirm with SD before running the POST calls, especially on Production. The dual-env-flip is the default shape of the operation — not a license to skip the confirmation.
Step 2: Look up the featurestate ID in each environment
The featurestate ID differs between Prod and Dev. Look up by feature name in each env first:
# Production
curl -s -H "Authorization: Token $FLAGSMITH_ADMIN_TOKEN" \
"https://api.flagsmith.com/api/v1/environments/X4CdBvak98wpn6Ljq7eUSs/featurestates/?feature_name=<FLAG_NAME>" \
| jq '.results[0] | {id, enabled, environment_feature_version}'
# Development
curl -s -H "Authorization: Token $FLAGSMITH_ADMIN_TOKEN" \
"https://api.flagsmith.com/api/v1/environments/SCE375zGzViFpGZhWoiK7D/featurestates/?feature_name=<FLAG_NAME>" \
| jq '.results[0] | {id, enabled, environment_feature_version}'
The environment_feature_version UUID confirms V2 is enabled. Record both id values — call them <PROD_FS_ID> and <DEV_FS_ID>.
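The `jq '.results[0]'` filter above takes whichever row the API returns first. A minimal Python sketch of a stricter selection follows. It assumes the list response may also contain segment or identity overrides, marked by non-null `feature_segment` or `identity` fields; that is an assumption about the response shape and should be checked against a real response. `pick_env_featurestate` is a hypothetical helper, not something in the repo.

```python
from typing import Any


def pick_env_featurestate(response: dict[str, Any]) -> dict[str, Any]:
    """Return the environment-level featurestate from a featurestates list response.

    Assumption: override rows carry a non-null `feature_segment` or `identity`
    field, and the environment default has both unset.
    """
    candidates = [
        row
        for row in response.get("results", [])
        if row.get("feature_segment") is None and row.get("identity") is None
    ]
    if len(candidates) != 1:
        # Zero rows usually means a misspelled flag name; several means the
        # filter above does not fit this response shape. Stop either way.
        raise LookupError(
            f"expected 1 environment-level featurestate, got {len(candidates)}"
        )
    row = candidates[0]
    if not row.get("environment_feature_version"):
        raise LookupError("no environment_feature_version: environment is not V2-versioned")
    return {
        "id": row["id"],
        "enabled": row["enabled"],
        "environment_feature_version": row["environment_feature_version"],
    }
```

Failing loudly on zero or multiple candidates is deliberate: during an incident it is safer to stop than to flip the wrong featurestate ID.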
Step 3: Look up the feature ID
Feature ID is shared across envs (it's per-project). Look it up once:
curl -s -H "Authorization: Token $FLAGSMITH_ADMIN_TOKEN" \
"https://api.flagsmith.com/api/v1/projects/36214/features/?search=<FLAG_NAME>" \
| jq '.results[0] | {id, name}'
Record as <FEATURE_ID>.
Step 4: POST a new environment-feature-version with `enabled: false`
For boolean flips on V2 envs, the body must include feature_state_value even when null. Use {"type": "unicode", "string_value": null} to signal "no value" for boolean flags.
Production:
curl -s -X POST -H "Authorization: Token $FLAGSMITH_ADMIN_TOKEN" \
-H "Content-Type: application/json" \
"https://api.flagsmith.com/api/v1/environments/85219/features/<FEATURE_ID>/versions/" \
-d '{
"feature_states_to_update": [{
"id": <PROD_FS_ID>,
"enabled": false,
"feature_state_value": {"type": "unicode", "string_value": null}
}],
"feature_states_to_create": [],
"segment_ids_to_delete_overrides": [],
"publish_immediately": true
}'
Development:
curl -s -X POST -H "Authorization: Token $FLAGSMITH_ADMIN_TOKEN" \
-H "Content-Type: application/json" \
"https://api.flagsmith.com/api/v1/environments/85220/features/<FEATURE_ID>/versions/" \
-d '{
"feature_states_to_update": [{
"id": <DEV_FS_ID>,
"enabled": false,
"feature_state_value": {"type": "unicode", "string_value": null}
}],
"feature_states_to_create": [],
"segment_ids_to_delete_overrides": [],
"publish_immediately": true
}'
Expected response: HTTP 201 with the new EnvironmentFeatureVersion UUID in the response body. publish_immediately: true activates the new version atomically — no second "publish" step needed.
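For scripted use, the same request can be assembled in Python rather than hand-edited JSON. This is a sketch under the body shape documented above; `build_off_request` is a hypothetical helper name, and the feature and featurestate IDs in the usage line are placeholders, not real values.

```python
import json

API_BASE = "https://api.flagsmith.com"


def build_off_request(env_id: int, feature_id: int, featurestate_id: int) -> tuple[str, str]:
    """Return (url, json_body) for a V2 version POST that disables a boolean flag."""
    url = f"{API_BASE}/api/v1/environments/{env_id}/features/{feature_id}/versions/"
    body = {
        "feature_states_to_update": [
            {
                "id": featurestate_id,
                "enabled": False,
                # Required even for boolean flags; a null string_value means "no value".
                "feature_state_value": {"type": "unicode", "string_value": None},
            }
        ],
        "feature_states_to_create": [],
        "segment_ids_to_delete_overrides": [],
        "publish_immediately": True,
    }
    return url, json.dumps(body)


if __name__ == "__main__":
    # Placeholder IDs for illustration only.
    url, body = build_off_request(85219, feature_id=111, featurestate_id=222)
    print(url)
    print(body)
```

Building the body from a dict avoids the quoting mistakes that are easy to make when substituting IDs into a shell heredoc by hand.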
Step 5: Verify the flip landed
Re-run the GET from Step 2 against both envs and confirm enabled: false:
for KEY in X4CdBvak98wpn6Ljq7eUSs SCE375zGzViFpGZhWoiK7D; do
curl -s -H "Authorization: Token $FLAGSMITH_ADMIN_TOKEN" \
"https://api.flagsmith.com/api/v1/environments/$KEY/featurestates/?feature_name=<FLAG_NAME>" \
| jq --arg env "$KEY" '.results[0] | {env: $env, enabled, environment_feature_version}'
done
Both rows should show enabled: false and a new environment_feature_version UUID.
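The two conditions above (disabled, and a changed version UUID) can be checked mechanically by comparing the Step 2 and Step 5 outputs. A minimal sketch, assuming each snapshot is a dict keyed by environment name holding the `enabled` and `environment_feature_version` fields from the `jq` output; `check_flip` is a hypothetical helper.

```python
def check_flip(before: dict[str, dict], after: dict[str, dict]) -> list[str]:
    """Compare Step 2 and Step 5 snapshots, keyed by environment name.

    Returns a list of problems; an empty list means the flip landed everywhere.
    """
    problems = []
    for env, old in before.items():
        new = after.get(env)
        if new is None:
            problems.append(f"{env}: no snapshot after the flip")
            continue
        if new["enabled"] is not False:
            problems.append(f"{env}: still enabled")
        if new["environment_feature_version"] == old["environment_feature_version"]:
            problems.append(f"{env}: version UUID unchanged")
    return problems
```

An unchanged UUID with `enabled: false` would suggest the flag was already OFF before the POST, which is worth noting in the rollback record.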
Step 6: Verify backend picked up the change
Flagsmith SDKs cache for ~60 seconds by default. After ~1 minute, hit an endpoint that reads the flag and confirm the OFF path runs. For flags that show up in logs (e.g., flag-gated branches with a log line), tail the Railway logs.
Step 7: Document the rollback
- Comment on the originating PR with the rollback timestamp + reason
- Post in `#ops` Slack with the new version UUIDs
- If the flag has known asymmetric semantics (see per-flag section below), capture any data-cleanup work that remains
CONFIG flag rollback (value flip, not boolean)
For CONFIG flags where you need to change the value (not just enabled), the body shape is the same but feature_state_value.string_value carries the JSON-encoded payload:
{
"feature_states_to_update": [{
"id": <FS_ID>,
"enabled": true,
"feature_state_value": {
"type": "unicode",
"string_value": "[\"decided_on_destination\"]"
}
}],
"feature_states_to_create": [],
"segment_ids_to_delete_overrides": [],
"publish_immediately": true
}
Gotcha: the GET returns feature_state_value as a flat string, but the POST body requires the dict shape. The string_value should be the JSON-encoded payload (escaped quotes for list flags). See reference_flagsmith_v2_value_patch.md for the full pattern.
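The GET/POST shape mismatch can be handled with two small converters. This is a sketch that assumes the flag's value is JSON (like the list example above); a flag holding a plain non-JSON string would need different handling. `to_post_value` and `from_get_value` are hypothetical names.

```python
import json
from typing import Any


def to_post_value(payload: Any) -> dict[str, Any]:
    """Wrap a Python value in the dict shape the version POST expects."""
    return {"type": "unicode", "string_value": json.dumps(payload)}


def from_get_value(flat: str) -> Any:
    """Decode the flat string that GET returns back into a Python value."""
    return json.loads(flat)


if __name__ == "__main__":
    value = to_post_value(["decided_on_destination"])
    # Serializing the whole request body escapes the inner quotes automatically.
    print(json.dumps({"feature_state_value": value}))
```

Letting `json.dumps` run twice (once for the value, once for the request body) produces the escaped-quote form shown above without hand-escaping.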
Per-flag rollback semantics
auto_invoke_matcher_on_intake_complete
- Flipped ON: Production + Development, 2026-05-22 (the `config/feature_flags.yaml` default remains `false`; a runtime override was applied per-identity for SD dogfooding before the tenant-wide flip on 2026-05-21).
- Default: `false` (in `config/feature_flags.yaml`).
- Code path: `app/agents/auto_invoke_matcher.py` (`maybe_auto_invoke_matcher`); called from `app/agents/orchestrator_phases/intake_triage.py:~403`.
- Predicates checked when ON: intake gates pass + decision_stage ∈ {`comparing_options`, `ready_to_commit`} + records_requested + `workflow_state.matching_complete is False` (idempotency).
- What it does when ON: Inside the intake_triage phase, when the gates pass, the orchestrator auto-invokes the matcher and emits a `match_results` rich-content card to the patient instead of falling through to the Triage Agent's stock "matches in 24-48 hours" deferral.
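The predicates listed above can be restated as a single boolean check. The sketch below is a hypothetical restatement for reading purposes, not the real `maybe_auto_invoke_matcher`; in particular, where `decision_stage` and `records_requested` live on the case object is an assumption.

```python
AUTO_INVOKE_STAGES = {"comparing_options", "ready_to_commit"}


def should_auto_invoke(flag_on: bool, intake_gates_pass: bool, case: dict) -> bool:
    """Hypothetical restatement of the listed predicates (field layout assumed)."""
    ws = case.get("workflow_state", {})
    return (
        flag_on
        and intake_gates_pass
        and case.get("decision_stage") in AUTO_INVOKE_STAGES
        and bool(case.get("records_requested"))
        and not ws.get("matching_complete")  # idempotency guard
    )
```

Two terms matter for rollback: with `flag_on` false no new case takes this path, and with `matching_complete` already true a case never re-enters it regardless of the flag.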
Asymmetric semantics: read before rolling back
Rollback (flag → false) is safe-but-asymmetric:
- Safe for the patient: Cases that already fired the matcher card keep their card. The patient sees no regression — the match_results card persists in the conversation transcript and the patient's view of the case is identical before and after the rollback.
- Asymmetric on case state: Those cases carry `workflow_state.matching_complete = True` permanently (set in `orchestrator_phases/matching.py:201,278`). They never re-enter the auto-invoke path because of the idempotency guard at `auto_invoke_matcher.py:89` (`if ws.get("matching_complete"): return`).
- Inconsistency that remains: Cases that fired the matcher card via the flag have `matching_complete=True` but may carry `ehr_constructed=False` because the flag-on path does not synthesize the EHR before invoking the matcher in some branches. The inconsistency is invisible to the patient but visible in admin views and to any downstream service that filters on `ehr_constructed`.
- Safe-default direction: Default is OFF, so rolling back is the safe direction. The asymmetry is that new cases stop using the flag-on path, while old flag-on cases are NOT reverted: there is no migration to undo their `matching_complete=True`.
What rollback does NOT do
- Does not delete already-rendered `match_results` cards from patient transcripts.
- Does not unset `workflow_state.matching_complete=True` on already-routed cases.
- Does not roll back any `match_results` row in Postgres (those are durable matching outputs, not flag-derived state).
Cleanup steps if a data-fix is required after rollback
If the architecture review's flagged concern (the matching_complete=True + ehr_constructed=False cohort) becomes load-bearing — e.g., a downstream filter starts producing wrong counts — the cleanup is:
- Query the affected cohort:
SELECT id, tenant_id, patient_id
FROM cases
WHERE (workflow_state->>'matching_complete')::bool IS TRUE
AND (workflow_state->>'ehr_constructed')::bool IS NOT TRUE
AND created_at >= '<flag_on_timestamp>';
- Either backfill the EHR via the retro-rebuild path (`scripts/retro_ehr_rebuild.py` per PR #1075/#1080) or reset `matching_complete` on the cohort and let the next intake turn re-invoke the matcher (only safe if the flag is being flipped back ON).
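One subtlety in the cohort query is its NULL handling: `IS NOT TRUE` matches rows where `ehr_constructed` is false and rows where the key is missing entirely. The Python mirror below illustrates that, assuming `workflow_state` has been loaded as a dict with real booleans; it omits the `created_at` filter, and `in_cleanup_cohort` is a hypothetical name.

```python
def in_cleanup_cohort(workflow_state: dict) -> bool:
    """Mirror the SQL filter: matching_complete IS TRUE AND ehr_constructed IS NOT TRUE."""
    matching_complete = workflow_state.get("matching_complete") is True
    # A missing key behaves like SQL NULL, which IS NOT TRUE also matches.
    ehr_not_true = workflow_state.get("ehr_constructed") is not True
    return matching_complete and ehr_not_true
```

This means cases that never wrote an `ehr_constructed` key at all land in the cohort too, which is usually the desired behavior for a backfill.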
When to flip this flag back ON
Re-enable only after:
- The asymmetric semantics above are addressed via either a synchronous EHR-before-matcher write or a documented acceptance that the inconsistency does not affect any downstream consumer
- A regression test exercises both the flag-ON and flag-OFF paths (per the PR #1088 / #1092 / #1094 Maria-replay pattern)
- Architecture review signs off again
Reference scripts
- `scripts/flip_mso_flags.py`: legacy boolean-only flipper; uses the env-scoped PATCH endpoint. Does NOT work on V2 envs (returns HTTP 400). Extend to use the version endpoint before next boolean flip.
- `scripts/sync_flagsmith.py`: YAML ↔ Flagsmith sync; run with `--dry-run` first to verify drift.
- `scripts/create_v6_flags.py`: pattern for creating new flags via the V2 endpoint.
Related runbooks
- `runbook/triage-tuning.md`: for triage threshold flags (uses per-identity overrides, not env-level)
- `runbook/deployment.md`: for the full deploy sequence including post-deploy flag verification
Open items
- `flip_mso_flags.py` needs an update to use the V2 environment-feature-versions endpoint. Tracked in the work queue.
- The dual-env rollback flow should be wrapped in a single script (`scripts/rollback_flag.py FLAG_NAME`) to remove the per-step copy-paste risk during incident response.
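One possible shape for the core of that wrapper is sketched below. Everything here is illustrative: `rollback_flag` and the injected `post` callable are hypothetical, and reading the new version from a `uuid` field in the 201 response is an assumption to verify against a real response. Injecting `post` keeps the flow testable and dry-runnable without network access.

```python
from typing import Callable

ENV_IDS = {"Production": 85219, "Development": 85220}


def rollback_flag(
    feature_id: int,
    featurestate_ids: dict[str, int],
    post: Callable[[int, int, dict], dict],
) -> dict[str, str]:
    """Disable a boolean flag in every environment and return the new version UUIDs.

    `post(env_id, feature_id, body)` should send the version POST and return
    the parsed response; the `uuid` response field is an assumption.
    """
    missing = set(ENV_IDS) - set(featurestate_ids)
    if missing:
        # Refuse a partial flip: the runbook requires both environments.
        raise ValueError(f"missing featurestate IDs for: {sorted(missing)}")
    versions = {}
    for env_name, env_id in ENV_IDS.items():
        body = {
            "feature_states_to_update": [
                {
                    "id": featurestate_ids[env_name],
                    "enabled": False,
                    "feature_state_value": {"type": "unicode", "string_value": None},
                }
            ],
            "feature_states_to_create": [],
            "segment_ids_to_delete_overrides": [],
            "publish_immediately": True,
        }
        versions[env_name] = post(env_id, feature_id, body)["uuid"]
    return versions
```

Validating that both featurestate IDs are present before sending anything enforces the dual-env rule up front, so a lookup failure cannot leave one environment flipped and the other not.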