Remediation API Reference
Remediation endpoints drive approval workflows, execution audit, and operator control-surface signals.
Base URL: http://localhost:3003 (local) or your deployed core-platform.
Auth + Permissions
All endpoints require dashboard JWT auth.
| Endpoint class | Permission |
|---|---|
| Read actions/stats/history/priority signal | remediation:view |
| Read reporting settings | remediation:view |
| Update reporting settings | policy:update |
| Approve action | remediation:approve |
| Reject action | remediation:reject |
Endpoints
GET /remediation/reporting/settings
Get organization-scoped compliance settings for remediation data retention and scheduled report preferences.
Response — 200 OK
{
"organizationId": "org_123",
"retentionDays": 30,
"scheduledReportEnabled": false,
"reportCadence": "weekly",
"reportRecipients": [],
"lastReportAt": null,
"createdAt": "2026-04-21T00:00:00.000Z",
"updatedAt": "2026-04-21T00:00:00.000Z"
}
GET /remediation/reporting/runs?status=&from=&to=&limit=
List recent scheduled remediation report executions (newest first).
Query params (optional):
- `status`: `running | success | failed`
- `from`: ISO timestamp lower bound for `startedAt`
- `to`: ISO timestamp upper bound for `startedAt`
- `limit`: max rows
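As a rough illustration of how a client might assemble these optional filters, the sketch below builds the list URL and skips any parameter that is not set. The helper name `buildRunsUrl` and the `RunQuery` type are hypothetical and not part of the API; only the path and parameter names come from this reference.

```typescript
// Hypothetical client-side helper; not part of the documented API.
type RunStatus = "running" | "success" | "failed";

interface RunQuery {
  status?: RunStatus;
  from?: string; // ISO timestamp, lower bound for startedAt
  to?: string; // ISO timestamp, upper bound for startedAt
  limit?: number;
}

function buildRunsUrl(baseUrl: string, query: RunQuery = {}): string {
  const pairs: string[] = [];
  // Append a key=value pair only when the filter is actually set.
  const add = (key: string, value: string | number | undefined): void => {
    if (value !== undefined && value !== "") {
      pairs.push(`${key}=${encodeURIComponent(String(value))}`);
    }
  };
  add("status", query.status);
  add("from", query.from);
  add("to", query.to);
  add("limit", query.limit);
  const path = `${baseUrl}/remediation/reporting/runs`;
  return pairs.length > 0 ? `${path}?${pairs.join("&")}` : path;
}

// Example: failed runs only, capped at 10 rows.
console.log(buildRunsUrl("http://localhost:3003", { status: "failed", limit: 10 }));
```

Omitting unset filters, rather than sending empty values such as `status=`, is a design choice of this sketch; the reference does not say how the server treats empty parameters.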
Response — 200 OK
[
{
"id": "94000000-0000-4000-8000-000000000001",
"organizationId": "org_123",
"cadence": "weekly",
"recipients": ["sre@company.com"],
"deliveredCount": 1,
"failedDeliveryCount": 0,
"deliveryFailures": [],
"deletedCount": 2,
"totalActions": 14,
"failedActions": 3,
"pendingApprovals": 1,
"queuedExecutions": 2,
"status": "success",
"error": null,
"startedAt": "2026-04-20T06:00:00.000Z",
"completedAt": "2026-04-20T06:00:05.000Z"
}
]
GET /remediation/reporting/runs/export/csv?status=&from=&to=&limit=
Export report-run history as CSV.
Response — 200 OK
Content-Type: text/csv
Content-Disposition: attachment; filename="remediation-report-runs.csv"
POST /remediation/reporting/runs/:runId/retry
Retry delivery for a specific prior report run (creates a new run record).
Response — 200 OK
Returns new run row.
PUT /remediation/reporting/settings
Update retention and scheduled report preferences.
Request Body
{
"retentionDays": 90,
"scheduledReportEnabled": true,
"reportCadence": "weekly",
"reportRecipients": ["sre@company.com", "compliance@company.com"]
}
Rules:
- `retentionDays`: integer `7..3650`
- `reportCadence`: `daily | weekly`
- `reportRecipients`: max 20 emails
- scheduled delivery uses server-side email adapter (`SMTP_*` env config)
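A client can check these rules before sending the request. The following is a minimal sketch, assuming the `7..3650` range is inclusive at both ends; `validateSettings` and the error strings are hypothetical, and the server remains the authority on what it accepts.

```typescript
// Hypothetical pre-flight check mirroring the documented rules.
interface ReportingSettingsUpdate {
  retentionDays: number;
  scheduledReportEnabled: boolean;
  reportCadence: string;
  reportRecipients: string[];
}

function validateSettings(body: ReportingSettingsUpdate): string[] {
  const errors: string[] = [];
  // Assumption: both bounds of 7..3650 are inclusive.
  if (
    !Number.isInteger(body.retentionDays) ||
    body.retentionDays < 7 ||
    body.retentionDays > 3650
  ) {
    errors.push("retentionDays must be an integer in 7..3650");
  }
  if (body.reportCadence !== "daily" && body.reportCadence !== "weekly") {
    errors.push("reportCadence must be daily or weekly");
  }
  if (body.reportRecipients.length > 20) {
    errors.push("reportRecipients allows at most 20 emails");
  }
  return errors;
}

// The sample request body above passes all three checks.
console.log(
  validateSettings({
    retentionDays: 90,
    scheduledReportEnabled: true,
    reportCadence: "weekly",
    reportRecipients: ["sre@company.com", "compliance@company.com"],
  }).length
);
```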
POST /remediation/reporting/enforce-retention
Run retention cleanup immediately for current organization using configured retentionDays.
Response — 200 OK
{
"organizationId": "org_123",
"retentionDays": 30,
"cutoff": "2026-03-23T00:00:00.000Z",
"deletedCount": 18
}
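The `cutoff` in this response is consistent with subtracting `retentionDays` whole days from the time of the call. The sketch below shows that arithmetic under two assumptions not stated in this reference: that the server computes the cutoff this way, and that the sample call ran at `2026-04-22T00:00:00.000Z`.

```typescript
// Assumed derivation of the retention cutoff; the server logic is not shown in this reference.
const MS_PER_DAY = 24 * 60 * 60 * 1000;

function retentionCutoff(now: Date, retentionDays: number): Date {
  // Rows older than the returned instant would be eligible for deletion.
  return new Date(now.getTime() - retentionDays * MS_PER_DAY);
}

// Assumed call time: 2026-04-22T00:00:00.000Z with retentionDays = 30.
const cutoff = retentionCutoff(new Date("2026-04-22T00:00:00.000Z"), 30);
console.log(cutoff.toISOString()); // 2026-03-23T00:00:00.000Z
```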
GET /remediation/actions?status=&environment=&limit=
List remediation actions from audit trail with optional filtering.
Query Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| status | string | — | Filter by action status (PENDING_APPROVAL, QUEUED, EXECUTING, FAILED, COMPLETED, etc.) |
| environment | string | — | Filter by environment slug |
| limit | number | 50 | Max rows returned |
Response — 200 OK
Array of remediation audit rows (latest first).
GET /remediation/stats?environment=
Get aggregate remediation stats used by control center summaries.
Query Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| environment | string | — | Optional environment scope |
Response — 200 OK
{
"total": 42,
"pending": 3,
"executing": 2,
"completed": 31,
"failed": 6,
"successRate": 73.8
}
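The sample values are consistent with `successRate` being the completed share of all actions, expressed as a percentage rounded to one decimal place. This formula is inferred from the sample and is not confirmed by this reference; returning 0 when there are no actions is likewise an assumption.

```typescript
// Inferred formula: completed / total as a percentage, one decimal place.
function successRate(completed: number, total: number): number {
  if (total === 0) return 0; // assumption: no actions reports 0 rather than NaN
  return Math.round((completed / total) * 1000) / 10;
}

console.log(successRate(31, 42)); // 73.8
```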
GET /remediation/priority-signal?environment=
Get operator-priority signal for Remediation Control Center hero.
Derived from approval pressure, failure pressure, and active execution load.
Query Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| environment | string | — | Optional environment scope |
Response — 200 OK
{
"level": "warning",
"label": "Warning",
"reason": "Failures rising and approval queue growing.",
"pendingApprovals": 4,
"activeExecutions": 2,
"failedInLast24h": 5
}
POST /remediation/request
Request remediation action for an incident.
This endpoint is what the incident investigation UI calls after an operator confirms Execute on a suggested fix. The frontend maps the selected diagnosis suggestion into the request payload, sends it here, then refetches GET /remediation/history/:incidentId and highlights the newest history row.
Request Body
{
"incidentId": "a0000000-0000-4000-8000-000000000001",
"actionType": "restart-service",
"parameters": { "service": "api-gateway" },
"serviceName": "api-gateway",
"environment": "production",
"severity": "HIGH"
}
Response — 201 Created
Returns the created remediation audit/action row. Status depends on policy evaluation:
| Policy outcome | Expected status | Operator UX |
|---|---|---|
| Manual approval required | PENDING_APPROVAL | Show in pending remediation surfaces. |
| Auto-approved / allowed | QUEUED or execution status | Show in timeline/history; worker may advance to EXECUTING, COMPLETED, or FAILED. |
| Blocked / validation error | Error response | Keep execute dialog open and show error toast. |
Frontend Contract
- Do not optimistically mark execution successful. Treat this call as "request created", not "fix completed".
- On success, refetch incident remediation history and pending-action queries before trusting local state.
- If history refetch returns rows, highlight the newest `createdAt` row for operator feedback.
- If suggestion mapping fails client-side, block the request and show `Unable to map suggested action to executable remediation`.
POST /remediation/:id/approve
Approve pending remediation action.
Request Body
{
"reason": "Safe change window active"
}
Response — 200 OK
{
"status": "QUEUED"
}
Other successful response shapes:
{
"status": "POLICY_VIOLATION",
"violations": ["Environment production is blocked from automated remediation."],
"detail": "Environment production is blocked from automated remediation. Policy match: global blockedEnvironments includes production."
}
{
"status": "COMPLETED"
}
Semantics
- Only `PENDING_APPROVAL` actions are queued by approval.
- If action is already no longer pending, API returns current `status` instead of queueing duplicate work.
- If safety guardrail blocks approval and `reason` is omitted, response is `POLICY_VIOLATION`; no queue job is created.
- If safety guardrail blocks approval and `reason` is provided, request is treated as an operator override; action moves to `QUEUED` and audit `details` include override reason plus policy violations.
- `404 Not Found` means action ID is not visible in current organization.
- A queued approval enqueues BullMQ remediation job; worker later advances status through execution lifecycle.
Operator UX Contract
- Incident detail opens approve dialog first so operator can provide optional reason or override reason.
- Remediation Control Center approves inline with default reason `Approved from remediation control center`.
- UI may optimistically hide pending card/table row immediately after click.
- If response is `POLICY_VIOLATION`, restore hidden row and show policy-violation error/override prompt.
- If request rejects or throws, restore hidden row and show backend message when available.
- On success, refetch remediation history/actions and close action sheet/dialog.
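Because a `200 OK` from the approve endpoint can still carry `POLICY_VIOLATION`, the UI has to branch on the response body rather than on the HTTP status alone. The sketch below is one way to express that branch; the `ApproveResponse` shape follows the samples above, while `nextUiAction` and the `kind` values are hypothetical names.

```typescript
// Response shape taken from the samples above; optional fields appear only on POLICY_VIOLATION.
interface ApproveResponse {
  status: string;
  violations?: string[];
  detail?: string;
}

// Hypothetical UI instructions derived from the contract bullets.
type ApproveUiAction =
  | { kind: "restore-row"; message: string }
  | { kind: "refetch-and-close"; status: string };

function nextUiAction(response: ApproveResponse): ApproveUiAction {
  if (response.status === "POLICY_VIOLATION") {
    // Prefer the detailed explanation, fall back to the violation list.
    const message =
      response.detail !== undefined
        ? response.detail
        : (response.violations || []).join(" ");
    return { kind: "restore-row", message };
  }
  // QUEUED, or the current status of an action that was no longer pending.
  return { kind: "refetch-and-close", status: response.status };
}

console.log(nextUiAction({ status: "QUEUED" }).kind); // refetch-and-close
```

Thrown or rejected requests are not covered here; per the contract they also restore the hidden row.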
POST /remediation/:id/reject
Reject pending remediation action.
Request Body
{
"reason": "Requires manual rollback instead"
}
Response — 200 OK
{
"status": "REJECTED"
}
Semantics
- Only `PENDING_APPROVAL` actions can be rejected.
- Rejecting a non-pending action returns `400 Bad Request` with message like `Cannot reject action in "QUEUED" status — only PENDING_APPROVAL actions can be rejected`.
- `404 Not Found` means action ID is not visible in current organization.
- Rejection never enqueues execution.
- Audit `details` include actor, optional rejection reason, and prior policy context.
Operator UX Contract
- Incident detail rejects inline with reason `Rejected by operator from incident investigation view`.
- Remediation Control Center rejects inline with reason `Rejected from remediation control center`.
- UI may optimistically hide pending card/table row immediately after click.
- On success, show rejected toast, refetch history/actions, and close any open action sheet.
- On error, restore hidden row and show backend message when available.
GET /remediation/history/:incidentId
List remediation action history for one incident.
Response — 200 OK
Array of audit rows scoped to incident and organization.
Frontend Contract
- Treat history as source of truth after request/approve/reject mutations.
- New rows should be sorted by `createdAt` when choosing newest action to highlight.
- Timeline and pending-approval UI should derive final status from refetched history/actions, not from local optimistic state.
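Choosing the row to highlight can be done without relying on the order in which the server returns rows. The sketch below picks the row with the latest `createdAt`; the `HistoryRow` type lists only the fields this example needs and is an assumption about the row shape, as is the `newestRow` helper.

```typescript
// Minimal assumed row shape; real audit rows carry more fields.
interface HistoryRow {
  id: string;
  status: string;
  createdAt: string; // ISO timestamp
}

function newestRow(rows: HistoryRow[]): HistoryRow | undefined {
  let newest: HistoryRow | undefined;
  for (const row of rows) {
    // Compare parsed timestamps rather than array position.
    if (
      newest === undefined ||
      Date.parse(row.createdAt) > Date.parse(newest.createdAt)
    ) {
      newest = row;
    }
  }
  return newest;
}

const sample: HistoryRow[] = [
  { id: "a", status: "COMPLETED", createdAt: "2026-04-20T06:00:00.000Z" },
  { id: "b", status: "PENDING_APPROVAL", createdAt: "2026-04-21T09:30:00.000Z" },
];
const latest = newestRow(sample);
console.log(latest ? latest.id : "none"); // b
```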
GET /remediation/export/csv?status=&environment=&limit=
Export remediation audit rows as CSV for internal compliance/reporting workflows.
Query Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| status | string | — | Optional status filter |
| environment | string | — | Optional environment scope |
| limit | number | 500 | Maximum exported rows |
Response — 200 OK
Content-Type: text/csv
Content-Disposition: attachment; filename="remediation-audit-export.csv"
- CSV columns: `id`, `incidentId`, `actionType`, `status`, `performedBy`, `serviceName`, `environment`, `createdAt`, `updatedAt`, `details`, `error`
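A consumer of the export may want to confirm the column layout before processing rows. The sketch below assumes the file starts with a header row listing the columns in the order given above; this reference does not confirm that a header row is emitted, and `hasExpectedHeader` is a hypothetical helper.

```typescript
// Column names and order as listed in this reference.
const EXPECTED_COLUMNS = [
  "id", "incidentId", "actionType", "status", "performedBy", "serviceName",
  "environment", "createdAt", "updatedAt", "details", "error",
];

function hasExpectedHeader(csv: string): boolean {
  // Assumption: the first line of the export is a header row.
  const firstLine = csv.split(/\r?\n/, 1)[0];
  const columns = firstLine
    .replace(/^\uFEFF/, "") // tolerate a UTF-8 byte order mark
    .split(",")
    .map((column) => column.trim().replace(/^"|"$/g, ""));
  return (
    columns.length === EXPECTED_COLUMNS.length &&
    columns.every((column, index) => column === EXPECTED_COLUMNS[index])
  );
}

console.log(hasExpectedHeader(EXPECTED_COLUMNS.join(",") + "\n")); // true
```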