
When claude.ai loads but the chat doesn't — debugging the gap between front door and inference

6 min read
debugging · claude-ai · chat · infrastructure

A specific failure mode shows up in our community report sidebar maybe twice a week: the user reports that claude.ai “loads but won’t reply.” Our latency widget for their region shows green. The Statuspage component grid is all green. The user is, by the dashboard’s lights, fine — and yet the user is not getting an answer when they type a prompt.

This is the gap between the front door and the inference path. They are different surfaces, served by different infrastructure, and they fail in different ways. This post is about why the dashboard cannot fully cover the gap, and what users can read off it anyway.

The two surfaces

When you load claude.ai, several things happen, in roughly this order:

  1. DNS resolves the hostname. Cheap, cached, almost always fine.
  2. Edge fetch of the marketing-and-app page. This is the surface our latency widget probes — TTFB to the HTML/JS bundle. If this is slow, you cannot even get the chat UI to render.
  3. JavaScript boot in the browser. Loads the React-or-similar bundle, initializes the chat client, connects to Anthropic’s auth and session services.
  4. Chat session opens — typically a long-lived connection (WebSocket or streamed HTTP) that the client uses to send prompts and receive responses.
  5. Inference call is made through that session for each user prompt; the model produces tokens, the session streams them back.

The latency widget on this dashboard probes step 2. Steps 3, 4, and 5 are downstream and not measured by the widget at all.

When users report “loads but won’t reply,” they have completed steps 1 and 2 successfully. The page rendered. The interesting failure is somewhere in 3, 4, or 5.
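The front-door probe in step 2 can be sketched as a plain TTFB measurement. This is a hypothetical reconstruction, not the widget's actual implementation: the real probe endpoints, regions, and thresholds are not public.

```python
import http.client
import time

def ttfb(host: str, path: str = "/", timeout: float = 10.0) -> float:
    """Seconds from sending a GET to receiving the response status line and headers."""
    conn = http.client.HTTPSConnection(host, timeout=timeout)
    try:
        start = time.monotonic()
        conn.request("GET", path)
        conn.getresponse()  # returns once the first bytes of the response arrive
        return time.monotonic() - start
    finally:
        conn.close()

def classify(seconds: float) -> str:
    """Traffic-light bucketing. The cutoffs are illustrative, not the dashboard's."""
    if seconds < 1.0:
        return "green"
    if seconds < 3.0:
        return "amber"
    return "red"
```

A green result from a probe like this only proves steps 1 and 2 completed; it says nothing about auth, sessions, or inference.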

Why each step can fail independently

Step 3 (JavaScript boot) is mostly client-side. It can fail because of browser issues (extension conflicts, cache corruption, unusual security configurations) that have nothing to do with Anthropic. These failures are local to the user; no broad signal would surface them.

Step 4 (chat session) depends on auth services and session-establishment infrastructure that is distinct from the static-asset edge. A degradation here typically produces “I can load the page but I cannot start a chat.” This often shows up on the Statuspage as claude.ai going amber or red, but only after the on-call engineer notices and posts.

Step 5 (inference call) depends on the actual model-serving capacity. A degradation here looks like “I started a chat but my prompt is hanging” or “I get partial tokens then nothing.” This is the most common failure mode for the “loads but won’t reply” pattern. The Statuspage reflects it on the API component (which serves model inference for both the web app and direct API users), and our community report sidebar tends to spike with it.

The three steps fail differently and recover differently. Reading the dashboard during an event helps you tell them apart.

What the dashboard can show you, and what it cannot

What it can show:

  - Whether the front door is open: per-region TTFB to the claude.ai edge (the latency widget, step 2).
  - Whether Anthropic has officially classified an incident: the Statuspage component grid, including the API component that serves inference for both the web app and direct API users.
  - Whether other users are hitting the same wall right now: the community report sidebar, which spikes during broad inference-path incidents.

What it cannot show:

  - Client-side failures in step 3: extension conflicts, cache corruption, and local network interception produce no broad signal.
  - Session-scoped or account-scoped problems that affect only you.
  - Inference-path degradations before the on-call classifies them; during the lead window, every official signal stays green.

If the dashboard is mostly green and your chat is broken, the most likely explanations in rough probability order:

  1. Local browser issue — extension conflict, cached bad state, network interception. Try an incognito window first.
  2. Session-scoped issue — your specific session got into a bad state. Refresh the page.
  3. Account-scoped issue — something on Anthropic’s side specific to your account or organization. Rare but possible.
  4. Early-incident — Anthropic is in the lead window before classification. Check the community sidebar for spikes.

If the dashboard is amber or red on the API component while latency is still green, you are seeing the canonical signature of “loads but won’t reply” — front door open, inference path degraded.
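Statuspage-hosted pages conventionally expose a machine-readable summary at /api/v2/summary.json. Assuming status.anthropic.com follows that convention and the inference component is labeled "API" (both are assumptions about the page, not confirmed details), the signature check looks roughly like:

```python
import json
import urllib.request

SUMMARY_URL = "https://status.anthropic.com/api/v2/summary.json"  # Statuspage convention

def component_statuses(url: str = SUMMARY_URL) -> dict:
    """Map component name -> status ('operational', 'degraded_performance', ...)."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        summary = json.load(resp)
    return {c["name"]: c["status"] for c in summary["components"]}

def signature(statuses: dict, front_door_green: bool) -> str:
    """Label the combinations described in the post.

    The component name 'API' is an assumption about how the grid is labeled.
    """
    api_ok = statuses.get("API", "operational") == "operational"
    if not front_door_green:
        return "front-door issue"
    if not api_ok:
        return "front door open, inference path degraded"
    return "all green on official signals"
```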

The lead window

The most informative thing the dashboard does for this failure mode is provide a lead signal during the window between user impact and official classification.

A typical timeline for a step-5 (inference) degradation:

  1. T+0: the degradation begins; prompts hang or streams truncate mid-response.
  2. T+a few minutes: the community report sidebar spikes as affected users pile in.
  3. Between 5 and 25 minutes after the spike: the on-call detects, classifies, and posts; the API component goes amber or red on the Statuspage.
The 5–25 minute window between community-report spike and official classification is where this dashboard provides the most distinctive value. During that window, a user looking at our page sees: latency green, Statuspage green, but community reports loud. That combination is diagnostic. It almost always means an inference-path degradation in the lead window.
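A minimal spike detector over per-interval community report counts might look like this; the window, multiplier, and floor are illustrative tuning, not the sidebar's real parameters:

```python
def is_spiking(counts, window: int = 12, factor: float = 3.0, floor: int = 5) -> bool:
    """Flag the latest per-interval report count as a spike.

    Compares the newest sample against the mean of the preceding `window`
    samples; `floor` suppresses noise when baseline traffic is near zero.
    """
    if len(counts) < window + 1:
        return False  # not enough history to establish a baseline
    baseline = sum(counts[-(window + 1):-1]) / window
    return counts[-1] >= max(floor, factor * baseline)
```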

Reading the gap as a triage decision

A practical triage flow when “loads but won’t reply” hits you:

  1. Check the latency widget. If your country is red, you are looking at a front-door issue, not an inference issue. Different problem, different fix.
  2. If latency is green, check the API component on the grid. If amber/red, the inference path is officially degraded. Wait it out or fall back.
  3. If both are green, check the community sidebar. If the count is elevated and the trend is rising, you are likely in the lead window of an inference incident. Treat it as if check 2 had already gone amber: a real incident that just has not been classified yet.
  4. If everything looks calm, refresh the page and try in an incognito window. Most likely a local issue.
  5. If incognito also fails, sign out and sign back in. Session-scoped issues recover this way.
  6. If still broken, file a ticket with Anthropic support. Account-scoped issues need provider intervention.

This flow takes about 90 seconds. It eliminates most of the false-positive “Claude is broken” reactions and isolates the cases where the issue is genuinely on the inference path.
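The flow above can be encoded as a decision chain. The booleans stand in for the three dashboard signals and the two local checks; the labels are just shorthand for the post's six outcomes:

```python
def triage(latency_green: bool, api_component_green: bool,
           community_elevated: bool,
           incognito_works=None, relogin_works=None) -> str:
    """Walk the 90-second triage flow; None means 'not yet tried'."""
    if not latency_green:
        return "front-door issue: not an inference problem"
    if not api_component_green:
        return "inference path officially degraded: wait it out or fall back"
    if community_elevated:
        return "likely lead window of an inference incident: treat as real"
    if incognito_works:
        return "local browser issue: extension conflict or bad cached state"
    if relogin_works:
        return "session-scoped issue: resolved by re-auth"
    return "possibly account-scoped: file a support ticket"
```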

What direct API users see

The same gap shows up differently for users hitting api.anthropic.com directly. There is no front door for them — every request is an inference call. So step 2 is not a thing they ever see; their entire experience is step 5.

For API users, “loads but won’t reply” translates to: requests return 200 status codes but responses arrive slowly, or responses are incomplete (truncated streams, malformed JSON, missing tool-use blocks). The dashboard’s inference-path indicators (API component status, community reports) are the relevant signals; the latency widget is not, because their problem is not at the marketing edge.
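On the client side, a mid-stream stall can be caught with a watchdog around the token iterator. This is a generic sketch over any iterator of text chunks (an SSE reader, an SDK stream), not Anthropic-SDK-specific code; note that a stream that hangs forever still needs a read timeout on the underlying connection, since the loop only notices a stall once the next chunk finally arrives:

```python
import time

def collect_with_stall_guard(token_stream, stall_seconds: float = 30.0):
    """Collect chunks from a streaming response, flagging mid-stream stalls.

    Returns (text_so_far, stalled_flag). The 30-second default is illustrative.
    """
    chunks = []
    last = time.monotonic()
    for chunk in token_stream:
        now = time.monotonic()
        if now - last > stall_seconds:
            # Tokens resumed after a long gap: the stream stalled mid-response.
            return "".join(chunks), True
        chunks.append(chunk)
        last = now
    return "".join(chunks), False
```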

The general lesson

A status dashboard that probes only the front door will routinely show “all green” during incidents that affect only the inference path. The marketing edge being healthy is a necessary but not sufficient condition for the service being usable.

Our latency widget, by itself, is not the answer to “is Claude usable right now.” It is the answer to “is the front door open.” Combining it with the Statuspage component grid (which Anthropic uses to classify inference-path incidents) and the community report sidebar (which catches the lead window) produces a more complete picture, but even the combination has gaps — incidents narrow enough to affect only a small subset of users will not produce a community spike, and the official Statuspage will only catch them once the on-call detects internally.

The honest framing is: the dashboard tells you most of what is going on most of the time, with a known set of gaps. “Loads but won’t reply” is the most common failure mode that lives in those gaps. Reading the three signals together, with awareness of what each can and cannot show, gets you most of the way there. The rest is patience and refresh.
