The page goes off at 2:14 AM. By 2:18 you're on the bridge with two engineers, the on-call from the database team is pulling down logs, and the customer success lead has just joined to ask what to tell the three accounts who already wrote in. By 4 AM the issue is mitigated, by 6 AM it's resolved, and by the time you sit down at 10 to write the postmortem, the only timeline you have is what's in your chat scrollback and what people half-remember.

Most operations teams run incidents with a rough mix of: a bridge call, a chat thread, a few people taking notes on the side, and a runbook nobody reads during the actual incident. The artifacts that survive are usually the chat scrollback and a postmortem written from imperfect memory. The decisions made during the incident — *we ruled out the upstream provider at 2:51 because Sarah pulled the latency graph and said it was clean* — usually don't survive at all.

This post is about what it looks like to run an incident with a workspace that holds the timeline, the decisions, and the conversational record — so the postmortem writes itself from what actually happened.

## What an incident notes system actually needs

The shape is different from a runbook or a postmortem template. Incident notes need to:

- **Capture in real time** without anyone having to format anything during a high-stress moment
- **Hold the timeline** with the exact wall-clock times of every meaningful action and decision
- **Capture decisions and the reasoning** — not just *we restarted the service* but *we restarted the service because Sarah's hypothesis at 3:02 was that the connection pool was exhausted*
- **Hold the comms trail** — what we told customers, when we told them, who approved the wording
- **Survive the handoff** — when the relief on-call comes in at 4 AM, they need full context fast
- **Become the postmortem source material** without anyone having to reconstruct it from chat history

The common failure mode isn't that the notes aren't taken. It's that the notes are taken across five places — chat, a doc, someone's terminal scrollback, the bridge call, the CS team's customer-facing comms — and the synthesis happens days later when half the participants have moved on. Adjacent shapes — daily team coordination and the post-incident audit trail — are covered in [Run a Daily Standup From Your Notes App](/guides/field-service-ops/daily-standup-from-notes/) and [AI Notes for Compliance and Audit Preparation](/guides/field-service-ops/compliance-audit-preparation-notes/).

## A page per incident, opened the moment the page fires

In Docapybara, every incident gets a markdown page. The convention: title is *INC-YYYYMMDD-HHMM-short-description*, started the moment the on-call confirms there's an incident. A common shape inside:

```
# INC-20260426-0214 — checkout 500s

## Summary
[one-paragraph status, updated as it evolves]

## Current Theory
[what we think is wrong, updated continuously]

## Timeline
[wall-clock entries, append-only]

## Decisions
[material decisions and the reasoning, append-only]

## Comms
[what we told customers and when]

## Followups
[things to chase after resolution]
```

Page nesting holds the rest. Sub-pages for relevant data dumps — graphs pulled, log excerpts saved, screenshots of dashboards. The agent can read across all of them.

Plain markdown means the page is searchable, copyable, and exportable later. When the postmortem is published or shared with a customer, you can hand over the relevant sections as text without needing to rebuild anything.

## A live database for the running timeline

Embed a `:::database:::` directive on the incident page for the timeline. Six column types — text, number, date, select, checkbox, link — give you what you need. A timeline database with columns for time, event, actor, type, link looks like:

| Time | Event | Actor | Type | Link |
|---|---|---|---|---|
| 02:14 | Page received — checkout 500 spike | system | Detection | datadog dashboard |
| 02:18 | Bridge opened, IC: J. Patel | J. Patel | Coordination | |
| 02:23 | Hypothesis: connection pool exhaustion | S. Chen | Hypothesis | |
| 02:27 | Pool metrics pulled, no exhaustion | S. Chen | Investigation | grafana panel |
| 02:31 | New hypothesis: upstream payment provider | M. Liu | Hypothesis | |
| 02:34 | Provider status page green | M. Liu | Investigation | status.example.com |
| 02:51 | Latency graph clean — provider ruled out | S. Chen | Decision | grafana panel |

Sort by time, you have the chronological story. Filter by type, you see only the decisions, or only the hypotheses, or only the investigations.

The incident commander (or anyone on the bridge) just adds rows as things happen. No formatting, no templates, just *time, what just happened, who did it*. The cognitive cost during the incident is small. The value at 10 AM the next morning is large.

## Capy reads the page back when you need it

The assistant inside the workspace can read the running incident page during the incident itself. The kinds of questions that come up at 3 AM:

- *"Summarize the last forty-five minutes for the CTO who just joined the bridge."* — the agent reads the page, returns a one-paragraph summary
- *"What hypotheses have we ruled out and what's still on the table?"* — pulls from the timeline by type
- *"Draft an updated customer comms message based on the current status."* — uses the summary, the running theory, and the comms section history

The relief on-call coming in at 4 AM gets a full handoff in two minutes by asking *"give me a complete handoff: timeline, current theory, what's been ruled out, what we're trying right now, what comms have gone out, what's pending."*

That handoff is the difference between losing momentum at the shift change and continuing the same investigation with the same context. The agent-acts-on-docs idea behind that is described in [Claude Code for Documents](/blog/claude-code-for-documents/).

## Recording the bridge call when it matters

Most incident bridge calls don't need to be recorded. Some do. The one where a major decision is being made and the reasoning needs to survive. The one where multiple teams are negotiating responsibility and the exact wording matters. The one where a customer-facing executive is making a call about disclosure and you need a record of what was said.

Audio with speaker labels keeps each speaker attributable on replay. Drop the recording on the incident page and the transcript lands alongside it. When you write the postmortem the next morning and need the exact phrase the database team's lead used to describe the failure mode, the answer is searchable text.

The recording question is a workflow choice — what's appropriate to record, who needs to consent, what gets retained. Once your team has settled the policy, the workspace just holds whatever you decide to capture.

## The comms section that protects the customer-facing story

During an incident, the customer-facing message lives in a status page, a few support tickets, and possibly a tweet. After the incident, when a customer asks *"why didn't you tell us until 4 AM that the issue was a database failover?"*, the team needs to know what it actually said.

The *Comms* section on the incident page is one paragraph per customer-facing update, with the exact wording, the channel, and the time it went out. When the agent helps draft each update, the wording goes into both the channel and the comms section automatically: *"Draft an update for the status page based on the current state. Save the final wording to the comms section."*

The result is that the postmortem has the customer-facing record built in. *We waited 47 minutes before posting the database hypothesis publicly because we wanted to confirm it before stating it as a cause. The first public mention was at 02:58 in the form: 'We are investigating an issue affecting checkout. Customers may see intermittent errors. Updates here every 15 minutes.'*

That's defensible. That's also what most teams can't actually reconstruct after the fact.

## The postmortem that writes itself from the timeline

The morning after the incident, the postmortem starts as a question the workspace can answer. *"Read the incident page for INC-20260426-0214. Draft a postmortem in the standard format: one-paragraph summary, impact, timeline, contributing causes, what went well, what didn't, action items. Use the timeline database for the chronology and the decisions section for the reasoning."*

The draft comes back. It's not perfect. It's also vastly closer than starting from a blank doc.

Where the agent's draft falls short is in the *contributing causes* and *action items* — these need human judgment, the kind of pattern recognition that comes from having lived the incident and the prior similar ones. The workspace can help there too: *"Read the postmortems from the last six months. Are there patterns where this incident shares contributing causes with prior incidents?"*

The agent reads across, returns the patterns, and you decide which to call out in the new postmortem.

## Old postmortems and runbooks count too

Most operations teams have years of postmortem PDFs and runbook docs sitting in a knowledge base nobody opens during an actual incident. They're useful when findable and a problem when not.

Drop them into the workspace. Each one converts to markdown automatically, which means the agent can read across them. During the next incident that looks vaguely familiar, ask *"have we seen this signature before? Pull any past postmortems where the symptom matched."* The agent finds the relevant ones, you read the parts that apply, and you skip a half-hour of investigation.

For an established operations team, this is the layer that turns *we keep solving the same incident every six months* into *the workspace remembers what we learned the last time.* The postmortems compound instead of evaporating. The shape that supports this — runbooks that don't rot — is covered in [Standard Operating Procedures, Without the Wiki Maintenance Tax](/guides/field-service-ops/ai-notes-standard-operating-procedures/).

## Try Docapybara free

You probably can't simulate an incident on demand, but you can prepare the page template before the next one. Open Docapybara, create an `Incidents` section, draft a template page with the headings above, and add a starter timeline database. The next time the page goes off, you open a new page from the template and the structure is already there. [Try Docapybara free](/accounts/signup/), drop in the postmortems from the last twelve months, and see what patterns the workspace surfaces before the next 2 AM page.