
What 'partial outage' actually means at Anthropic — a Statuspage taxonomy

8 min read
statuspage · taxonomy · incidents · reading

Anthropic’s public Statuspage encodes incidents using four impact levels: none, minor, major, and critical. These are not arbitrary. The Atlassian Statuspage product has internal definitions for each, providers populate them under their own incident-management discipline, and downstream consumers — like this dashboard — render them into colors and badges that drive user reactions.

Most users misread them. Specifically: most users overreact to minor and underreact to the difference between major and critical. This post is a working guide to what each level actually means at Anthropic, what the underlying user experience tends to look like, and which ones should change your behavior.

The four levels, defined

The Atlassian Statuspage product documents these as the following gradient:

| Level | Definition | Color in our UI |
| --- | --- | --- |
| none | No active issue. The component is operating within its normal envelope. | green |
| minor | Degraded performance. The service is up, but slower or less reliable. | amber |
| major | Partial outage. A meaningful subset of users cannot use the surface. | red |
| critical | Major outage. Broad, severe impact across the surface. | red (with banner) |

These map cleanly to the Statuspage component states operational, degraded_performance, partial_outage, and major_outage, in the same order.
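The mapping above can be sketched as a small lookup. This is an illustration with invented names (`IMPACT_TO_STATE`, `badge`), not code from this dashboard:

```python
# Impact levels as posted on the Statuspage, mapped to the underlying
# component states they correspond to, in the same order.
IMPACT_TO_STATE = {
    "none": "operational",
    "minor": "degraded_performance",
    "major": "partial_outage",
    "critical": "major_outage",
}

def badge(impact: str) -> tuple[str, bool]:
    """Return (color, show_banner) as rendered in our UI: green, amber,
    red, and red-with-banner for critical."""
    color = {"none": "green", "minor": "amber",
             "major": "red", "critical": "red"}[impact]
    return color, impact == "critical"
```

The deliberate collapse of major and critical into the same color (distinguished only by the banner flag) is discussed below.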

Two facts about how Anthropic uses them are worth knowing.

They are set by humans, often during the incident. The level is not auto-detected from telemetry. An on-call engineer or incident commander chooses it when posting the incident. They will adjust it as the picture clarifies. This means the level can lag the user experience by minutes — sometimes by half an hour.

The level is per-component, not per-account or per-region. A single component flipping to major does not mean every API user is broken; it means a meaningful subset is.

What each level actually looks like

none — green

Everything you would expect. The component grid on this dashboard shows green dots and full uptime bars. Latency in your region is normal. Community report counts in the sidebar are at background levels (typically zero to two reports per hour).

This is the state Claude is in roughly 95–98% of any given week. It is not interesting and that is the point.

minor — amber

The most-misread state. “Minor” sounds like “small,” and users often interpret amber as “you might experience small slowness.” That undersells it.

In practice, minor at Anthropic means the component is up but a measurable subset of requests is failing or much slower than usual. What that looks like depends on how you use Claude:

For an interactive user — someone chatting on claude.ai — minor is often invisible. The web app retries gracefully and the user sees a slightly longer wait on a few responses.

For a programmatic user — someone hitting the API in a tight loop — minor is usually visible as a small uptick in errors that exponential backoff handles. If your retry policy is correct (see our 529 playbook), minor events will mostly look like a temporary 1–3% bump in your error metric.

If you are running a high-throughput batch job, minor is the right moment to slow down voluntarily — not because Anthropic told you to, but because your retries during a minor incident contribute to the load that is keeping it minor.
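The "slow down voluntarily" advice can be made concrete with full-jitter exponential backoff that backs off harder while the page reports minor. The doubling policy and the cap here are invented for illustration, not Anthropic guidance:

```python
import random

def backoff_delay(attempt: int, status: str = "none",
                  base: float = 1.0, cap: float = 60.0) -> float:
    """Full-jitter exponential backoff: sleep a random time in
    [0, min(cap, base * 2**attempt)].

    While the page reports 'minor', double the base so our retries
    contribute less to the load that is keeping the incident minor.
    """
    if status == "minor":
        base *= 2.0  # back off harder while the component is amber
    return random.uniform(0, min(cap, base * 2 ** attempt))
```

Full jitter (a uniform draw over the whole window, rather than sleeping the full exponential delay) spreads retries out so clients do not retry in synchronized waves.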

major — red

This is where users tend to underreact. “Major” looks just as red as “critical” in most renderings, including ours, and that creates a perception that all-red is all-the-same. It is not.

major at Anthropic means a meaningful subset of users cannot use the surface at all.

If you are in the affected subset, your experience is indistinguishable from a critical outage. If you are not in the affected subset, you may not notice anything.

The right reaction to major is to investigate the subset. Read the Statuspage incident text — Anthropic almost always names the affected scope in the first update.

If your traffic does not match the named scope, your retry policy is fine. If it does match, switch to your fallback path immediately and stop retrying.
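That decision rule can be sketched as a scope check. The component names and return values here are made up for the example:

```python
def react_to_major(affected: set[str], depends_on: set[str]) -> str:
    """Decide a reaction to a 'major' incident: fail over only if a
    component you depend on appears in the named scope; otherwise the
    normal retry policy is fine."""
    if affected & depends_on:
        return "failover"  # switch to fallback path, stop retrying
    return "retry"         # not in the affected subset
```

The point is that the same red badge produces two opposite actions depending on whether your traffic matches the named scope.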

critical — red, banner

Rare. Roughly one to three times per quarter. critical means broad, severe impact — the kind of incident where the official Twitter account posts an acknowledgement and the support team’s queue measurably spikes.

Practically, when critical is posted you should assume you are in the affected set rather than hunting for a scoped subset.

The right reaction is to stop sending requests, communicate the outage to your own users, and switch to whatever degraded path your application has. Retries during critical events are not just useless, they extend the recovery time.
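One way to implement "stop sending requests" is to gate outbound traffic on the page-level indicator that Statuspage exposes at `/api/v2/status.json`. This is a minimal sketch: the fail-open choice on fetch errors is a debatable judgment call, and the URL assumes Anthropic's standard Statuspage hosting at status.anthropic.com:

```python
import json
import urllib.request

STATUS_URL = "https://status.anthropic.com/api/v2/status.json"

def allows_traffic(payload: dict) -> bool:
    """True unless the page-level indicator is 'critical'.
    Statuspage reports 'none', 'minor', 'major', or 'critical'."""
    return payload.get("status", {}).get("indicator") != "critical"

def should_send_traffic(url: str = STATUS_URL) -> bool:
    """Fetch the Statuspage v2 status document and gate on it.
    Fails open if the status page itself is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return allows_traffic(json.load(resp))
    except OSError:
        return True  # could not reach the status page: fail open
```

Poll this on a timer (every 30 to 60 seconds is plenty), not per request, so your gate does not become its own traffic source.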

Why the same color for major and critical, then?

Because the difference between “lots of users broken” and “all users broken” matters less than the difference between “some users broken” (red) and “everyone slow” (amber). For someone glancing at the dashboard, the actionable distinction is amber-vs-red, not red-vs-red. We use a banner for critical to flag the worst case, but we deliberately do not invent a fourth color.

How to read the level alongside everything else

The impact level is one signal. Three others are visible on the same dashboard, and using them in combination gives you a more accurate picture than the level alone.

The component grid below the banner. Tells you which surface is affected. If the banner is red because Claude for Government is in major but you are an API user, your code path is fine.

The community reports in the sidebar. Tells you whether what users are reporting matches what Anthropic has classified. A minor incident with a sudden spike of 30 community reports per hour is probably under-classified by Anthropic — this happens occasionally in the window it takes to escalate from minor to major. A major incident with five community reports per hour is probably narrower than it looks.

The latency widget. Tells you whether the surface is reachable at all. Useful during ambiguous incidents.

The level is a starting point, not an answer. Read it together with the other three.
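Reading the signals together can be sketched as a small decision function. The thresholds are invented for illustration (they echo the numbers in the examples above):

```python
def read_dashboard(impact: str, reports_per_hour: int,
                   latency_ok: bool) -> str:
    """Combine the posted impact level, community-report volume, and
    latency reachability into a rough interpretation."""
    if impact == "none" and reports_per_hour > 20 and not latency_ok:
        return "likely incident, statuspage lagging"
    if impact == "minor" and reports_per_hour > 30:
        return "probably worse than classified"
    if impact == "major" and reports_per_hour < 5:
        return "probably narrower than it looks"
    return "classification matches user reports"
```

The interesting branches are the disagreements: they are where the lag between user experience and official classification shows up.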

Common misreads

A few patterns we see frequently:

“It says minor so it’s nothing.” No — minor means measurable degradation. If your application is sensitive to error rates (a real-time user-facing feature), even minor should change behavior, even if the change is just “back off and retry less aggressively.”

“It says all systems operational so my problem must be on my side.” Maybe. But the indicator can lag the user experience by minutes at the start of an incident. If the latency widget is also red and community reports are spiking, your problem is probably not on your side, even if Statuspage has not caught up yet.

“Major and critical are the same.” They look the same in most renderings, but they imply different actions. major rewards investigating which subset is affected. critical rewards stopping work and communicating.

“Resolved means fixed for everyone.” It means Anthropic believes the underlying cause is addressed. Some users on long-lived connections, cached errors, or stale routing may continue to see the symptoms for several minutes after resolved. Your error metric returning to baseline is the real signal.
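The "your error metric returning to baseline is the real signal" check can be sketched as a simple window test, with an invented baseline and window size:

```python
def back_to_baseline(recent_error_rates: list[float],
                     baseline: float = 0.01,
                     window: int = 5) -> bool:
    """Treat 'resolved' as a claim, not a fact: require the last
    `window` samples of your own error rate to sit at or below
    baseline before resuming full traffic."""
    tail = recent_error_rates[-window:]
    return len(tail) == window and all(r <= baseline for r in tail)
```

Requiring several consecutive clean samples, rather than one, avoids declaring recovery during the noisy tail where long-lived connections and stale routing are still shaking out.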

When to use the official Statuspage versus this dashboard

Both are useful, for different reasons.

The official Anthropic Statuspage is the source of truth for impact levels, official incident text, and the post-incident timeline. Anything we publish about an incident is downstream of what Anthropic publishes there. If you are writing a postmortem for a customer or a regulator, cite Anthropic’s page, not ours.

This dashboard adds three things that the official page does not: a community-report sidebar that surfaces user experience faster than incidents are classified, a 17-country latency widget that is independent of Anthropic’s own telemetry, and a single page that combines them — which is the page you can leave open in a tab during an outage and trust to update on its own.

Together, they tell you both what Anthropic believes is happening and what users are reporting. Most of the time those line up. When they do not, you have learned something useful about the lag between user experience and official acknowledgement, and that lag is the most actionable thing a status dashboard can teach you.
