Incidents View

Track and manage production incidents across your Meteor application.

Overview

The Incidents view gives you a central place to manage production incidents -- whether they come from alert triggers or manual reports from your team. You can find it in the sidebar under Incidents.

Each incident tracks:

What went wrong
Current investigation status
Who's involved
Timeline of updates and status changes

Severity Levels

Incidents use four severity levels:

Severity	Description	Typical Response
SEV1	Critical	All hands on deck, customer-facing impact
SEV2	Major	Significant degradation, needs immediate attention
SEV3	Minor	Limited impact, can be addressed during business hours
SEV4	Low	Minimal impact, fix when convenient

Choose the severity that matches the actual user impact. You can always change it later as you learn more about the issue.
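For teams that mirror these levels in their own tooling, the table above can be modeled as a small lookup. This is a minimal sketch, not SkySignal's actual schema: the names `Severity`, `SEVERITY_INFO`, `severityRank`, and `sortBySeverity` are hypothetical and exist only for illustration.

```typescript
// Hypothetical model of the four severity levels described above.
// None of these names come from SkySignal itself.
type Severity = "SEV1" | "SEV2" | "SEV3" | "SEV4";

interface SeverityInfo {
  label: string;
  typicalResponse: string;
}

const SEVERITY_INFO: Record<Severity, SeverityInfo> = {
  SEV1: { label: "Critical", typicalResponse: "All hands on deck, customer-facing impact" },
  SEV2: { label: "Major", typicalResponse: "Significant degradation, needs immediate attention" },
  SEV3: { label: "Minor", typicalResponse: "Limited impact, can be addressed during business hours" },
  SEV4: { label: "Low", typicalResponse: "Minimal impact, fix when convenient" },
};

// A lower number means a more severe incident, so SEV1 ranks first.
function severityRank(severity: Severity): number {
  return Number(severity.slice(3));
}

// Returns a new array ordered from most to least severe; the input is not mutated.
function sortBySeverity<T extends { severity: Severity }>(incidents: T[]): T[] {
  return [...incidents].sort((a, b) => severityRank(a.severity) - severityRank(b.severity));
}
```

Keeping the rank derived from the label means a severity change only requires updating one field on the incident.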

Incident Lifecycle

Every incident moves through a defined lifecycle, ending with an optional Postmortem stage:

Investigating → Identified → Monitoring → Resolved → Postmortem

Status Descriptions

Investigating - You know something is wrong but haven't pinpointed the cause yet
Identified - Root cause found, working on a fix
Monitoring - Fix deployed, watching to confirm it holds
Resolved - Issue confirmed fixed, normal operation restored
Postmortem - Incident is resolved; the team documents root cause, impact, and action items

Status transitions are tracked in the incident timeline, so you always have a record of when each phase started.
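The lifecycle can be thought of as a small state machine in which each transition appends an entry to the timeline. The sketch below is illustrative only. It assumes an incident may move only to the next status in the order shown above; the product may well allow other moves (for example, returning to Investigating if a fix does not hold). The names `IncidentState`, `canTransition`, and `transition` are hypothetical.

```typescript
type Status = "Investigating" | "Identified" | "Monitoring" | "Resolved" | "Postmortem";

// Order taken from the lifecycle described above.
const LIFECYCLE: readonly Status[] = [
  "Investigating",
  "Identified",
  "Monitoring",
  "Resolved",
  "Postmortem",
];

interface StatusChange {
  from: Status;
  to: Status;
  changedBy: string;
  at: Date;
}

interface IncidentState {
  status: Status;
  timeline: StatusChange[];
}

// Assumption: only a move to the immediately following status is valid.
function canTransition(from: Status, to: Status): boolean {
  return LIFECYCLE.indexOf(to) === LIFECYCLE.indexOf(from) + 1;
}

// Returns a new state with the change recorded; the input is left untouched.
function transition(
  incident: IncidentState,
  to: Status,
  changedBy: string,
  at: Date = new Date(),
): IncidentState {
  if (!canTransition(incident.status, to)) {
    throw new Error(`Cannot move from ${incident.status} to ${to}`);
  }
  const change: StatusChange = { from: incident.status, to, changedBy, at };
  return { status: to, timeline: [...incident.timeline, change] };
}
```

Recording `from`, `to`, who, and when on every change is what makes it possible to answer "when did each phase start" later.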

Creating Incidents

From Alerts

When an alert fires, SkySignal can automatically create an incident. The alert's severity, metric data, and context are carried over to the incident so you don't have to re-enter anything.
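Conceptually, the carry-over amounts to copying fields from the alert onto a new incident. The following is a rough sketch under stated assumptions: the `AlertTrigger` and `IncidentDraft` shapes, their field names, and the `incidentFromAlert` function are all hypothetical, and the sketch assumes alerts use the same SEV1 to SEV4 scale as incidents. It does not reflect SkySignal's internal data model.

```typescript
type Severity = "SEV1" | "SEV2" | "SEV3" | "SEV4";

// Hypothetical shape of a fired alert.
interface AlertTrigger {
  ruleName: string;
  severity: Severity;
  metric: { name: string; value: number; threshold: number };
  context: Record<string, string>;
  firedAt: Date;
}

// Hypothetical shape of the incident created from it.
interface IncidentDraft {
  title: string;
  severity: Severity;
  description: string;
  status: "Investigating";
  source: "alert";
  metric: AlertTrigger["metric"];
  context: Record<string, string>;
  createdAt: Date;
}

// Copies severity, metric data, and context from the alert onto a new incident.
function incidentFromAlert(alert: AlertTrigger): IncidentDraft {
  const { name, value, threshold } = alert.metric;
  return {
    title: `Alert: ${alert.ruleName}`,
    severity: alert.severity,
    description: `${name} was ${value} (threshold ${threshold})`,
    status: "Investigating", // assumption: new incidents start at the first lifecycle status
    source: "alert",
    metric: { ...alert.metric },
    context: { ...alert.context },
    createdAt: alert.firedAt,
  };
}
```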

Manually

1. Navigate to Incidents in the sidebar
2. Click Create Incident
3. Fill in the details:
   - Title - Short description of the issue
   - Severity - SEV1 through SEV4
   - Description - What you know so far

Manual creation is useful for issues reported by users or caught during manual testing that haven't triggered any alerts.

Incident Timeline

Each incident has a timeline that records:

Status changes (with who made the change and when)
Updates and notes added by team members
Related alert triggers
Resolution details

The timeline is the source of truth for what happened during an incident. It's especially valuable during postmortem analysis when you need to reconstruct the sequence of events.
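To illustrate how a timeline supports reconstruction, the sketch below models the four kinds of entries listed above and derives how long an incident spent in each status. The entry shapes and the `timeInStatus` function are hypothetical, and the sketch assumes every incident opens in the Investigating status; SkySignal's stored format may differ.

```typescript
type Status = "Investigating" | "Identified" | "Monitoring" | "Resolved" | "Postmortem";

interface StatusChangeEntry {
  kind: "status_change";
  at: Date;
  by: string;
  from: Status;
  to: Status;
}

// Hypothetical entry shapes, one per kind of record listed above.
type TimelineEntry =
  | StatusChangeEntry
  | { kind: "note"; at: Date; by: string; text: string }
  | { kind: "alert_trigger"; at: Date; alertName: string }
  | { kind: "resolution"; at: Date; by: string; summary: string };

// Returns the milliseconds spent in each status, derived from status changes only.
function timeInStatus(
  openedAt: Date,
  entries: TimelineEntry[],
  now: Date,
): Partial<Record<Status, number>> {
  const durations: Partial<Record<Status, number>> = {};
  let current: Status = "Investigating"; // assumption: incidents open in this status
  let since = openedAt.getTime();

  const changes = entries
    .filter((e): e is StatusChangeEntry => e.kind === "status_change")
    .sort((a, b) => a.at.getTime() - b.at.getTime());

  for (const change of changes) {
    durations[current] = (durations[current] ?? 0) + (change.at.getTime() - since);
    current = change.to;
    since = change.at.getTime();
  }
  // Time in the current status runs up to `now`.
  durations[current] = (durations[current] ?? 0) + (now.getTime() - since);
  return durations;
}
```

Numbers like these are often useful input for the impact and timeline review sections of a postmortem.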

Postmortem Analysis

Once an incident is resolved, move it to Postmortem status to document what happened:

Root cause - What actually went wrong
Impact - How many users were affected, for how long
Timeline review - Key events and decisions during the incident
Action items - What to do to prevent recurrence

Postmortems are optional but recommended for SEV1 and SEV2 incidents. They help the team learn from incidents and improve reliability over time.
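If your team tracks postmortems in its own tooling, the four sections above map naturally onto a record with a completeness check. This is a sketch under assumptions, not a SkySignal API: the `Postmortem` shape, `isPostmortemRecommended`, and `missingSections` are hypothetical names.

```typescript
type Severity = "SEV1" | "SEV2" | "SEV3" | "SEV4";

// Hypothetical record with one field per postmortem section listed above.
interface Postmortem {
  rootCause: string;
  impact: { usersAffected: number; durationMinutes: number };
  timelineReview: string[];
  actionItems: string[];
}

// Mirrors the guidance above: recommended for SEV1 and SEV2, optional otherwise.
function isPostmortemRecommended(severity: Severity): boolean {
  return severity === "SEV1" || severity === "SEV2";
}

// Lists the sections that are still empty in a draft.
function missingSections(draft: Partial<Postmortem>): (keyof Postmortem)[] {
  const missing: (keyof Postmortem)[] = [];
  if (!draft.rootCause?.trim()) missing.push("rootCause");
  if (!draft.impact) missing.push("impact");
  if (!draft.timelineReview?.length) missing.push("timelineReview");
  if (!draft.actionItems?.length) missing.push("actionItems");
  return missing;
}
```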

Next Steps

Alerts - Configure alerts that create incidents automatically
Alerting Guide - Set up alert rules and notification channels
Errors View - Investigate errors related to incidents
