
How we calculate 30-day uptime — and why our number is lower than the marketing figure

8 min read
methodology · uptime · sre

If you compare the 30-day uptime number on this site with the figures Anthropic might quote in a sales deck, ours will almost always be lower. That is not a bug, and it is not us being mean — it is a deliberate choice about what “uptime” should mean for a user of a hosted AI service.

This post walks through the exact math, runs it against a real recent incident, and shows why two honest people can both report different uptime numbers for the same service in the same month.

The textbook formula, and why nobody actually uses it

The textbook definition is simple:

uptime = (total_seconds − downtime_seconds) / total_seconds

In a 30-day window, total_seconds = 2,592,000. If a service was hard-down for 60 minutes during the month, that is 3,600 seconds of downtime, and the figure is (2,592,000 − 3,600) / 2,592,000 = 99.86%.
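As a sanity check, the textbook arithmetic is a one-liner. The function name here is illustrative, not anything from this site's code:

```typescript
// Textbook uptime: fraction of the window not spent down.
function textbookUptime(totalSeconds: number, downtimeSeconds: number): number {
  return (totalSeconds - downtimeSeconds) / totalSeconds;
}

const THIRTY_DAYS = 30 * 24 * 60 * 60; // 2,592,000 seconds
const pct = textbookUptime(THIRTY_DAYS, 3_600) * 100; // ≈ 99.86
```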

That formula is fine for a single binary “up or down” service. It falls apart the moment any of these things is true:

  1. The service has multiple components. Claude has five (claude.ai, platform.claude.com, the API, Claude Code, Claude for Government). Any of them being down counts against “Claude is up” for the user who relies on that surface, but does not count for users on the other surfaces.
  2. There are multiple impact levels. Statuspage encodes incidents as minor, major, or critical. A “minor” event is degraded performance — the service is still answering, just slowly or with elevated error rates. Should that count as downtime?
  3. Incidents overlap. During cascading failures, providers commonly post a parent incident plus child incidents covering the same window. Naive summing double-counts.
  4. The window includes “now.” If you are calculating uptime mid-day, the denominator should be elapsed seconds, not the full day, otherwise your number is misleadingly high in the early morning UTC.

Every status dashboard makes a choice on each of these. Most providers’ marketing numbers make the most generous choice every time. We make the strictest defensible choice every time.

Our exact math, in four rules

For each of the last 30 days, for each component, we compute a daily uptime number using these rules:

Rule 1: Per-component, per-day, in UTC

The window is 00:00 UTC to 24:00 UTC for that day. For “today,” the upper bound is now. We do not align to the maintainer’s timezone — the canonical timestamps on Statuspage are UTC, so the math is UTC.
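One way to sketch that window in TypeScript, using millisecond epoch timestamps (the helper name is made up for illustration):

```typescript
// Rule 1 window: [00:00 UTC, 24:00 UTC) for the given day,
// capped at `now` so "today" never includes the future.
function dayWindowUtc(day: Date, now: Date): { start: number; end: number } {
  const start = Date.UTC(day.getUTCFullYear(), day.getUTCMonth(), day.getUTCDate());
  const fullEnd = start + 24 * 60 * 60 * 1000;
  return { start, end: Math.min(fullEnd, now.getTime()) };
}
```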

Rule 2: Critical and major impact count as outage; minor counts as degraded

Both reduce uptime. The day’s color in the strip is determined by whether outage seconds > 0 (red), or only degraded seconds > 0 (amber), or neither (green).
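Rule 2's coloring reduces to a tiny precedence function — outage beats degraded, degraded beats clean. A hypothetical sketch, not the page's actual code:

```typescript
type DayColor = "red" | "amber" | "green";

// Red if any outage seconds, amber if only degraded seconds, else green.
function dayColor(outageSeconds: number, degradedSeconds: number): DayColor {
  if (outageSeconds > 0) return "red";
  if (degradedSeconds > 0) return "amber";
  return "green";
}
```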

Rule 3: Overlap each incident’s window with the day, then merge

For incident i with start s_i and end e_i (where e_i = now if the incident is unresolved), the contribution to day d is the overlap interval [max(s_i, d_start), min(e_i, d_end)]. If max(s_i, d_start) >= min(e_i, d_end), the contribution is zero — the incident did not touch this day.
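The clipping step translates directly. This sketch assumes epoch-millisecond timestamps; `clipToDay` is an illustrative name:

```typescript
interface Interval { start: number; end: number } // epoch milliseconds

// Overlap of incident [s, e] with day window [dStart, dEnd];
// null when the incident does not touch this day.
function clipToDay(s: number, e: number, dStart: number, dEnd: number): Interval | null {
  const start = Math.max(s, dStart);
  const end = Math.min(e, dEnd);
  return start >= end ? null : { start, end };
}
```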

After collecting overlap intervals for all incidents that affect this component on this day, we merge overlapping intervals before summing duration. This is the step almost every naive implementation gets wrong. Two simultaneous incidents from 14:00–15:00 UTC contribute one hour of downtime, not two.

The merge is a standard sweep:

sort intervals by start
for each interval in order:
  if it starts at or before the end of the last merged interval:
    extend that interval's end to the later of the two ends
  else:
    push it as a new merged interval
sum (end − start) over the merged set
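Written out in TypeScript, the sweep looks roughly like this — a sketch of the same algorithm, not this site's exact code:

```typescript
interface Interval { start: number; end: number }

// Merge overlapping (or touching) intervals; input need not be sorted.
function mergeIntervals(intervals: Interval[]): Interval[] {
  const sorted = [...intervals].sort((a, b) => a.start - b.start);
  const merged: Interval[] = [];
  for (const iv of sorted) {
    const last = merged[merged.length - 1];
    if (last && iv.start <= last.end) {
      last.end = Math.max(last.end, iv.end); // extend to cover both
    } else {
      merged.push({ ...iv });
    }
  }
  return merged;
}

function totalSeconds(merged: Interval[]): number {
  return merged.reduce((sum, iv) => sum + (iv.end - iv.start), 0);
}
```

Note the `Math.max` on the end: a later interval can be entirely contained in an earlier one, and naively taking its end would shrink the merged span.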

Rule 4: For “today,” the denominator is elapsed seconds

If you call our uptime endpoint at 00:01 UTC, the denominator for “today” is 60 seconds, not 86,400. This avoids a class of bug that plagues homemade status pages: the page reports 99.999% at 00:01 because almost no time has passed and no downtime has happened, then drops to 99.5% at noon because of a 2-hour incident the previous evening.
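The elapsed-seconds denominator is one line once the day window from Rule 1 is in hand. Helper names here are illustrative:

```typescript
// Rule 4 denominator: seconds elapsed in the day so far, capped at a full day.
function daySeconds(dayStartSec: number, nowSec: number): number {
  return Math.min(86_400, Math.max(0, nowSec - dayStartSec));
}
```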

The 30-day percentage in the page header is the simple arithmetic mean of those daily uptimes across all monitored components. We chose the arithmetic mean rather than weighting by severity or raw seconds because users tell us they want each day to count once — three full-day component outages should drop the number more than one short critical incident, even if the critical incident felt worse in the moment.
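The rollup itself is just a mean over the flat list of per-component, per-day fractions. `headerUptime` and its argument are illustrative names:

```typescript
// Header number: arithmetic mean of daily uptime fractions,
// one entry per component per day.
function headerUptime(dailyUptimes: number[]): number {
  if (dailyUptimes.length === 0) return 1; // no data: report fully up
  return dailyUptimes.reduce((a, b) => a + b, 0) / dailyUptimes.length;
}
```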

Worked example

Suppose a hypothetical day on Statuspage looks like this for the API component:

Incident     Impact   Created      Resolved
Incident A   major    12:00 UTC    13:30 UTC
Incident B   major    13:00 UTC    14:00 UTC
Incident C   minor    20:00 UTC    21:00 UTC

A naive sum gives 90 + 60 + 60 = 210 minutes of downtime. But A and B overlap from 13:00 to 13:30. After merging A and B you get one major-impact interval from 12:00 to 14:00 = 120 minutes, plus C’s 60 minutes of degraded performance separately.

The day’s uptime is:

outage_seconds   = 120 * 60 = 7,200
degraded_seconds = 60 * 60  = 3,600
total_down       = 10,800
day_seconds      = 86,400

uptime = 1 − (10,800 / 86,400) = 87.5%
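The whole worked example fits in a few lines of TypeScript (times in seconds since 00:00 UTC; the helper is a compact restatement of the merge sweep above, not the site's exact code):

```typescript
type Iv = { start: number; end: number };

// Merge overlapping intervals (same sweep as in Rule 3).
function merge(ivs: Iv[]): Iv[] {
  const out: Iv[] = [];
  for (const iv of [...ivs].sort((a, b) => a.start - b.start)) {
    const last = out[out.length - 1];
    if (last && iv.start <= last.end) last.end = Math.max(last.end, iv.end);
    else out.push({ ...iv });
  }
  return out;
}

const h = 3600;
// Incidents A and B (major) merge into one 12:00-14:00 window.
const major = merge([{ start: 12 * h, end: 13.5 * h }, { start: 13 * h, end: 14 * h }]);
const outage = major.reduce((s, iv) => s + iv.end - iv.start, 0); // 7,200
const degraded = 1 * h;                                           // incident C
const uptime = 1 - (outage + degraded) / 86_400;                  // 0.875
```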

A “marketing” calculation might:

  1. Count only the merged major window and ignore the degraded hour: 1 − (7,200 / 86,400) = 91.7%.
  2. Count only critical-impact incidents, of which this day has none: 100%.
  3. Apply the contract’s exclusions first — scheduled maintenance, incidents below a duration threshold — and count only what survives.

All three of those calculations are defensible if you read the contract carefully. None of them is wrong. They simply answer a different question than “what fraction of the day was the service usable as advertised.”

Why we picked the strict version

Three reasons.

It matches what users feel. When a user reaches for Claude and the API is timing out, they do not file a contract claim, and they do not want a marketing number. They want a status page that confirms: yes, this is real, you are not imagining it, here is the data. The strict version is closer to that reality.

It is reproducible. Anyone can pull the same Statuspage feed we use (the endpoints are listed on our Methodology page) and arrive at the same number. The marketing-style calculations require knowing which incidents the provider chose to exclude, which is not always public.

It degrades gracefully. During a quiet month, the strict version is barely lower than the lenient version. During a bad month, the difference is large and informative. A status page whose number always says “99.9%” is not communicating anything; ours is.

How this compares to other providers

We do not claim ours is the only honest math, just that it is the math we picked and the math we explain. Other dashboards make different, usually more generous, calls on each of the four questions above — which impact levels count, how overlapping incidents merge, how “today” is denominated, and which incidents are excluded.

If you want the rolled-up number to be different, we would rather you understand exactly how ours is built and disagree with it on the merits than be surprised by it.

Where the math lives

The implementation is small and unglamorous. The interval-merge function is about 12 lines of TypeScript. The per-day loop is another 30. There is no machine learning, no proprietary smoothing, and no data the public API does not give you. If you want to fork it, the Methodology page lists the upstream endpoints; everything else is bookkeeping.

What matters is that the math is the same every day, written down here, and visible alongside the number it produces. That is the whole brand promise of this site: the number you see has a definition, and the definition is on the page.
