Incident response
Last reviewed 2026-05-14
Severity levels
| Severity | Definition | Examples | Response target |
|---|---|---|---|
| Sev 1: Critical | Confirmed breach, exfiltration, widespread outage, or regulator-reportable privacy event | Verified data exfil; auth bypass in production; ransomware | IC engaged < 15 min; customer notice < 4 h; status page < 30 min |
| Sev 2: High | Suspected breach, major outage affecting all users, payment-system failure | Stripe webhook broken for hours; XSS with user impact; major data corruption | IC engaged < 30 min; status page < 30 min |
| Sev 3: Medium | Partial outage, elevated error rate, limited-scope vulnerability found internally | Single feature broken; isolated data issue on one tenant | Triage < 2 h business-day; fix within 7 d |
| Sev 4: Low | Minor bug, no user impact | Cosmetic issue, log noise | Backlog |
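The response targets in the table can be encoded as data so that absolute deadlines are computed rather than worked out by hand during an incident. The following is a minimal sketch, not an existing tool: `ResponseTargets`, `TARGETS`, and `deadlines` are hypothetical names, and starting every clock at detection time is an assumption the table does not state.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class ResponseTargets:
    """Wall-clock targets from the severity table; None means no target is defined."""
    ic_engaged: Optional[timedelta] = None
    status_page: Optional[timedelta] = None
    customer_notice: Optional[timedelta] = None


# Values copied from the table. Sev 3 is left empty because its triage target is
# measured in business-day hours, and Sev 4 goes to the backlog; neither is a
# simple wall-clock timer.
TARGETS = {
    1: ResponseTargets(
        ic_engaged=timedelta(minutes=15),
        status_page=timedelta(minutes=30),
        customer_notice=timedelta(hours=4),
    ),
    2: ResponseTargets(
        ic_engaged=timedelta(minutes=30),
        status_page=timedelta(minutes=30),
    ),
    3: ResponseTargets(),
    4: ResponseTargets(),
}


def deadlines(severity: int, detected_at: datetime) -> dict:
    """Return the absolute deadline for each target defined at this severity.

    Assumption: all clocks start at detected_at.
    """
    targets = TARGETS[severity]
    result = {}
    for name in ("ic_engaged", "status_page", "customer_notice"):
        delta = getattr(targets, name)
        if delta is not None:
            result[name] = detected_at + delta
    return result
```

For example, `deadlines(1, datetime.now(timezone.utc))` returns three timestamps that can be pasted into the incident channel when it is opened.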
Roles
| Role | Description |
|---|---|
| Incident Commander (IC) | Security Officer. Owns the response and documents decisions, timestamps, and escalations. |
| Technical Lead | Diagnoses, implements mitigation. |
| Communications Lead | Drafts customer/regulator comms with IC sign-off. |
| Scribe | Maintains the incident timeline in a running doc. |
For smaller response teams, roles may be combined, but every decision and timestamp is still logged.
Phases
1. Detection
Sources:
- Sentry alerts (Fatal; elevated error-rate)
- Uptime monitors (status page integration)
- Stripe webhook failures
- Hourly audit-log anomaly monitor for MFA failures, MFA disablement, high admin mutation volume, all-session revocation bursts, and emergency-change creation
- Customer report (via [email protected] or in-app support)
- Responsible disclosure via security.txt
Alerts at or above configured thresholds are reviewed by the Security Officer; an incident record is opened when the trigger criteria above are met.
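The hourly audit-log anomaly monitor can be pictured as a count-and-compare pass over one hour of events. This is a minimal sketch under stated assumptions: the event-type names and every threshold value below are hypothetical (the configured thresholds are not given in this document), and `find_anomalies` is an illustrative name rather than the production monitor.

```python
from collections import Counter

# Hypothetical per-hour thresholds; the real configured values are not stated here.
HOURLY_THRESHOLDS = {
    "mfa_failure": 20,
    "mfa_disabled": 1,
    "admin_mutation": 100,
    "all_sessions_revoked": 3,
    "emergency_change_created": 1,
}


def find_anomalies(event_types, thresholds=HOURLY_THRESHOLDS):
    """Count one hour of audit-log event types and return those at or above threshold."""
    counts = Counter(event_types)
    return {
        event_type: counts[event_type]
        for event_type, limit in thresholds.items()
        if counts[event_type] >= limit
    }
```

A non-empty result would be routed to the Security Officer for review; event types without a threshold are ignored by this sketch.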
2. Triage
- Assign severity (see table).
- Open an incident channel / document (private).
- Begin timeline log.
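A timeline log is most useful when entries are append-only and carry UTC timestamps, so the post-mortem can reconstruct the order of events. The sketch below shows one possible shape; `Timeline` and its fields are hypothetical, and in practice the log may simply be a running document.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Timeline:
    """Append-only incident timeline; entries are added but never edited."""
    incident_id: str
    entries: List[str] = field(default_factory=list)

    def log(self, actor: str, note: str, at: Optional[datetime] = None) -> str:
        """Append one entry stamped in UTC and return the formatted line."""
        at = (at or datetime.now(timezone.utc)).astimezone(timezone.utc)
        line = f"{at.isoformat(timespec='seconds')} [{actor}] {note}"
        self.entries.append(line)
        return line
```

Passing `at` explicitly is useful when backfilling an event that was noticed before the log was opened.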
3. Containment
- Sev 1/2: immediate actions may include rotating secrets, revoking all sessions, disabling the affected feature via flag, firewall-blocking the attacker IP, and freezing payments.
- Do NOT delete logs / evidence.
- Preserve forensic state where feasible (snapshot affected Firestore collections before remediation).
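When a snapshot is taken before remediation, recording a digest of the snapshot makes it possible to show later that the evidence was not altered. The sketch below illustrates only that idea and does not call Firestore: `snapshot_documents` is a hypothetical helper, and it assumes the documents have already been read into plain, JSON-serializable dictionaries (real Firestore documents can hold timestamps and references that need converting first).

```python
import hashlib
import json


def snapshot_documents(docs):
    """Serialize documents deterministically and return (payload_bytes, sha256_hex).

    `docs` maps a document ID to a plain dict of JSON-serializable fields.
    Keys are sorted so the same content always yields the same digest.
    """
    payload = json.dumps(docs, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return payload, hashlib.sha256(payload).hexdigest()
```

The payload would be stored with the incident evidence and the digest noted in the timeline at the moment of capture.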
4. Eradication
- Identify root cause.
- Remove malicious artifacts or vulnerability.
- Patch and deploy; verify fix.
5. Recovery
- Restore service; monitor for re-occurrence for 24 h.
- Communicate resolution to customers.
6. Post-incident (within 14 days)
- Write a post-mortem (template below) for every Sev 1/2 incident.
- For Sev 1/2, store a private post-mortem in the restricted incident evidence set and, where appropriate, publish a public-safe summary to the trust center.
- Action items tracked as GitHub issues.
Post-mortem template
# Post-mortem: <title>
Date: <YYYY-MM-DD>
Severity: <1|2|3|4>
Detection: <how + when>
Duration: <detection → resolution>
Customer impact: <scope; # affected; data involved; regulatory?>
Timeline: <chronological events>
Root cause: <5-whys>
Resolution: <what was done>
What went well: <>
What went wrong: <>
Action items: <owner, due date>
Evidence preservation
- Sentry retention: 90 d.
- Audit log retention: 2 y.
- Vercel logs: vendor-retained platform/runtime logs, plus a Vercel log drain to Sentry, enabled for production and preview, covering selected sources.
- Incident documents: retained indefinitely.
Do not delete logs during an active incident.
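Because Sentry data ages out well before audit logs do, it helps to know the date by which incident-relevant data must be exported into the incident documents. The following is a minimal sketch: `RETENTION` and `expiry_dates` are hypothetical names, 2 y is approximated as 730 days, and only the two stores with a stated fixed period are modeled.

```python
from datetime import date, timedelta

# Retention periods from this section; 2 y is approximated as 730 days.
RETENTION = {
    "sentry": timedelta(days=90),
    "audit_log": timedelta(days=730),
}


def expiry_dates(event_date: date) -> dict:
    """Return the date on which data recorded on event_date ages out of each store."""
    return {store: event_date + period for store, period in RETENTION.items()}
```

For an event on 2026-05-14, this gives 2026-08-12 for Sentry, which is the practical deadline for copying relevant events into the indefinitely retained incident record.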
Customer and regulatory notification
- Breach notification deadlines: GDPR 72 h to the supervisory authority after becoming aware of the breach; state laws per jurisdiction; contractual obligations per customer DPA.
- Template communications are maintained in the restricted incident-response evidence set.
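The GDPR window is short enough that it is worth computing the deadline as soon as the breach is confirmed. This is a minimal sketch with hypothetical function names; it models only the 72 h GDPR window, since state-law and DPA deadlines vary and are not specified here.

```python
from datetime import datetime, timedelta, timezone

GDPR_NOTIFICATION_WINDOW = timedelta(hours=72)


def gdpr_deadline(aware_at: datetime) -> datetime:
    """Return the UTC deadline, 72 h after the organization became aware of the breach."""
    if aware_at.tzinfo is None:
        # Naive timestamps are rejected so the deadline is never off by a UTC offset.
        raise ValueError("aware_at must be timezone-aware")
    return aware_at.astimezone(timezone.utc) + GDPR_NOTIFICATION_WINDOW


def hours_remaining(aware_at: datetime, now: datetime) -> float:
    """Return hours left before the deadline; negative once it has passed."""
    return (gdpr_deadline(aware_at) - now).total_seconds() / 3600
```

The `aware_at` value should be the timestamp recorded in the incident timeline when the breach was confirmed, not the time the incident was first detected.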
Drills
Tabletop exercises are held quarterly: pick a scenario (SQL/NoSQL exfil, compromised admin, vendor outage), walk through the response in 20 minutes, and retain the record in the restricted incident-drill evidence set.