Codex Web Response Latency Has Worsened in Measurable Public Signals Since May 2026

For / Key Points

For: Developers and team leads who rely on Codex Web, the Codex app, or CLI/IDE extensions and need to triage recent latency issues.

Key Points:

- Since May 7, 2026, OpenAI Status has shown a cluster of Codex-related degradation records
- GitHub Issues and OpenAI Developer Community reports show parallel complaints about slowness, reconnect loops, and stream disconnects
- User-side triage should start with surface switching, reasoning settings, and context-size reduction

The sense that Codex Web has "recently become slow" is reflected in public records from early May 2026. From May 7 to May 11, OpenAI Status listed multiple incidents involving Codex or adjacent model infrastructure. During the same window, GitHub Issues and the OpenAI Developer Community showed reports of slow responses, reconnect loops, and stream disconnects [1][2][3].

This article answers one question: is the Codex slowdown a local environment problem, or should it be treated as a service-side degradation signal?

The practical answer is that early May 2026 produced enough public service-side signals that the slowdown deserves to be taken seriously as a service issue. That does not mean every slow turn has the same root cause. Official incidents, user reports, and local configuration need to be separated.

The strongest evidence is the public OpenAI Status timeline.

Between the evening of May 7 and May 11, multiple records touched Codex directly or the model layer Codex can depend on. The main public records are:

| Date | Official record | Codex relevance |
| --- | --- | --- |
| May 7 17:15 - May 8 08:31 | Elevated transcription failures affecting ChatGPT & Codex | Codex was listed as an affected component [12] |
| May 8 12:31 - 14:14 | Increased error rate for gpt-5.5 model in the API | An update also mentioned higher-than-normal latency [8] |
| May 8 15:07 - 16:17 | Degraded Performance with Codex Cloud Tasks | Follow-up prompts were failing at a higher-than-expected rate [7] |
| May 11 01:23 - 05:35 | Elevated errors in ChatGPT uploads and Codex Cloud task creation | Codex Cloud task creation was affected [1] |
| May 11 16:11 - 17:59 | Elevated error rates with GPT 5.5 | Not Codex-specific, but adjacent to the model layer [9] |

This table does not prove that every Codex Web user experienced identical latency. OpenAI Status also notes that availability metrics are aggregated across tiers, models, and error types, so individual customer experience may vary.

Still, the service-side signal is too strong to dismiss as a single user's Wi-Fi problem. There were multiple Codex-adjacent public degradation records in the same short window.
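To check whether a particular slow turn fell inside one of the incident windows in the table above, a small lookup is enough. The following is a minimal sketch, not an official tool: the windows are copied from the table, the year 2026 is taken from the article, and the time zone is an assumption because the table does not state one. The function name incidents_covering is hypothetical.

```python
from datetime import datetime

# Incident windows copied from the table above.
# Assumption: naive datetimes; the table does not state a time zone, so the
# timestamp you pass in must use the same zone the status page displayed.
INCIDENTS = [
    (datetime(2026, 5, 7, 17, 15), datetime(2026, 5, 8, 8, 31),
     "Elevated transcription failures affecting ChatGPT & Codex"),
    (datetime(2026, 5, 8, 12, 31), datetime(2026, 5, 8, 14, 14),
     "Increased error rate for gpt-5.5 model in the API"),
    (datetime(2026, 5, 8, 15, 7), datetime(2026, 5, 8, 16, 17),
     "Degraded Performance with Codex Cloud Tasks"),
    (datetime(2026, 5, 11, 1, 23), datetime(2026, 5, 11, 5, 35),
     "Elevated errors in ChatGPT uploads and Codex Cloud task creation"),
    (datetime(2026, 5, 11, 16, 11), datetime(2026, 5, 11, 17, 59),
     "Elevated error rates with GPT 5.5"),
]


def incidents_covering(ts):
    """Return the titles of all listed incidents whose window contains ts."""
    return [title for start, end, title in INCIDENTS if start <= ts <= end]


if __name__ == "__main__":
    # Example: a slow turn observed on May 8 at 15:30.
    for title in incidents_covering(datetime(2026, 5, 8, 15, 30)):
        print(title)
```

An empty result does not rule out a service-side cause, since the table lists only the main public records and individual experience may differ from aggregated status metrics.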

GitHub Issues Separate "Slow" From "Disconnecting"

Outside the official status page, user symptoms are visible in GitHub Issues.

Issue #21527, opened on May 7, reports that both the VS Code extension and Codex app were extremely slow for a Pro user on Codex App 26.429.61741. The reported model was gpt-5.5 fast, which matters because a faster model choice did not remove the perceived delay [2].

Issue #15334, opened on March 20, is useful context because it shows that slow Codex responses existed before May. The reporter said even simple questions took ten minutes or more, while ChatGPT Web worked smoothly on the same machine and network [4]. That makes a purely local-network explanation weak.

Issue #18960, opened on April 22, describes a different symptom: Codex App streaming failed because the WebSocket was closed server-side before response.completed. The user saw reconnect attempts such as Reconnecting... 2/5 through 5/5, followed by a failed turn [5].

Slowness and disconnects are technically different. For users, they collapse into the same outcome: waiting, retrying, and losing development flow.

The "2x Capacity Until May 30" Rollout Is a Candidate Trigger, Not a Confirmed Root Cause

The May 12 Developer Community post captures the user-side suspicion well.

The poster described a rollout they called "2x capacity until May 30" and said Codex became dramatically less stable afterward. They reported reconnect loops, stream disconnects, frozen thinking states, and interrupted generations after one to two minutes, even while only around 50% of usage had been consumed [3]. OpenAI's Help Center separately confirms that, for a limited time, Codex is included with Free and Go plans and that other plans receive 2x rate limits [6].

The important distinction is correlation versus causation. The 2x rate-limits messaging exists, and there are user reports close to it in time. But OpenAI has not publicly said that the 2x rollout changed routing, capacity, or WebSocket behavior in a way that caused disconnects.

The safe reading is this: the 2x rate-limits rollout is a plausible timeline marker, not a confirmed root cause.

That caution matters. If the cause is decided too early, users may skip settings and context checks that are still under their control.

The Likely Degradation Factors Sit in Three Layers

The public record points to three layers rather than one single failure.

The first layer is service incidents. On May 8, OpenAI described Codex Cloud Tasks degradation where follow-up prompts failed at a higher-than-expected rate [7]. On May 11, Codex Cloud task creation was again listed as affected [1].

The second layer is model instability. GPT-5.5 incidents were recorded on May 8 and May 11, and the May 8 incident update explicitly mentioned higher-than-normal latency [8][9]. When Codex surfaces depend on that model layer, model-side latency can show up as Codex UI latency.

The third layer is user-side context and configuration. OpenAI's Help Center says Codex usage depends on task size, complexity, and execution surface, and that larger codebases, long-running tasks, and extended sessions can require more context [6]. An OpenAI Codex repository maintainer also recommended medium reasoning for day-to-day coding and noted that xhigh increases token usage and turn times [10].

Separating these layers gives users a practical path. They cannot fix OpenAI's incident, but they can reduce factors that amplify the delay.

User-Side Triage Steps

Start with OpenAI Status and your own settings.

If Codex alone is slow during a known incident window, check the service status before debugging your network. Then try the same lightweight task across Codex Web, the Codex app, and the CLI or IDE extension. The next diagnostic step depends on the result: only Web is slow, Codex Cloud task creation is failing, or local CLI sessions are slow as well.
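The surface comparison can be written down as a small decision helper so that the result of each test run leads to a consistent next step. This is a rough sketch under stated assumptions: the surface labels "web", "app", and "cli", the function name classify_surfaces, and the wording of each suggestion are all hypothetical, and the branching is a heuristic reading of the paragraph above rather than an official diagnostic.

```python
def classify_surfaces(slow):
    """Suggest a next step from per-surface results of one small task.

    `slow` maps a surface label to True when the task was slow there.
    The labels "web", "app", and "cli" are hypothetical names for this sketch.
    """
    slow_surfaces = {name for name, is_slow in slow.items() if is_slow}

    if not slow_surfaces:
        # The small task is fine, so the delay likely scales with the workload.
        return "small task is fast everywhere: look at context size and reasoning settings"
    if slow_surfaces == set(slow):
        # Every tested surface is slow, which points away from one client.
        return "slow on every surface: suspect the service or model layer, check OpenAI Status"
    if slow_surfaces == {"web"}:
        return "only Web is slow: check OpenAI Status for Codex Cloud incidents"
    return "mixed result: retest the slow surfaces and compare their settings"


if __name__ == "__main__":
    print(classify_surfaces({"web": True, "app": False, "cli": False}))
```

Running the same small prompt on each surface before calling the helper keeps the comparison fair, because prompt size and reasoning level stay constant across surfaces.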

Next, inspect reasoning and model settings. If routine work is running with xhigh, switch back to medium and retest. If possible, compare GPT-5.5 with another available model on the same small prompt to separate model-layer delay from surface delay.

Finally, reduce context. In large repositories, use .codexignore to exclude dependencies, build outputs, large data files, and generated artifacts. InventiveHQ's troubleshooting guide also suggests measuring time_connect and time_starttransfer, then comparing a tiny prompt with a context-heavy prompt to isolate context overhead [11].
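time_connect and time_starttransfer are names of curl --write-out variables: the first is the time from the start of the request until the TCP connection is established, the second the time from the start until the first response byte is about to arrive. A minimal sketch for collecting both from Python follows. It assumes curl is installed and on the PATH. The target URL is left to the caller because the article does not name an endpoint, and the helper names measure, parse_timings, and wait_after_connect are hypothetical.

```python
import os
import subprocess

# curl --write-out format: both values are reported in seconds.
CURL_FORMAT = "%{time_connect} %{time_starttransfer}"


def parse_timings(output):
    """Parse 'time_connect time_starttransfer' as printed by curl."""
    connect, starttransfer = (float(value) for value in output.split())
    return connect, starttransfer


def measure(url):
    """Run curl against `url` and return (time_connect, time_starttransfer).

    Assumption: `url` is an endpoint you choose; this measures one plain
    HTTP request, not a full Codex turn.
    """
    result = subprocess.run(
        ["curl", "-s", "-o", os.devnull, "-w", CURL_FORMAT, url],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_timings(result.stdout)


def wait_after_connect(connect, starttransfer):
    """Seconds between an established connection and the first response byte."""
    return starttransfer - connect


if __name__ == "__main__":
    import sys

    connect, starttransfer = measure(sys.argv[1])
    print(f"connect: {connect:.3f}s")
    print(f"first byte: {starttransfer:.3f}s")
    print(f"wait after connect: {wait_after_connect(connect, starttransfer):.3f}s")
```

A short connect time combined with a long wait after connect suggests the delay is not in basic network reachability. Note that the wait after connect still includes the TLS handshake on HTTPS endpoints, so it is an upper bound on server processing time rather than an exact figure.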

A useful operational order is:

  1. Check OpenAI Status
  2. Run the same small task across multiple Codex surfaces
  3. Reset reasoning to medium
  4. Reduce context with .codexignore
  5. Look for concurrent GitHub Issue or Community reports

This will not fix a service incident. It does prevent local settings and service degradation from being mixed into one vague "Codex is slow" diagnosis.

Single-Vendor Dependence Is a Development-Stoppage Risk

This Codex latency episode shows what happens when AI coding agents become production development infrastructure.

The more teams delegate implementation, review, and PR work to an agent, the more service degradation turns into direct development downtime. Model benchmark scores are not enough. Teams also need to evaluate incident frequency, recovery time, limit transparency, and pricing clarity.

In the short term, the realistic hedge is keeping multiple agent harnesses usable. GitHub Discussion #9588 includes a user saying they moved toward Claude Code because of Codex slowness and behavior regressions [10]. That is not merely tool preference. It is redundancy for development flow.

Summary

Reports of Codex Web response latency are backed by public records from early May 2026. OpenAI Status shows Codex Cloud and GPT-5.5-related degradation, while GitHub Issues and the Developer Community show reports of slowness, disconnects, and reconnect loops.

At the same time, it is too early to reduce the cause to "2x capacity until May 30." What is confirmed is the existence of 2x rate-limits messaging and nearby user reports of instability.

The practical response is narrower: check Status, switch surfaces, review reasoning settings, reduce context, and keep another harness ready. If AI agents are now part of the main development path, reliability has to become a procurement criterion alongside raw capability.