The First Real Agentic Supply Chain Incident

What Mitiga’s MCP Token Hijack Tells Us About the Operational Substrate of Modern Developer Tooling

May 14, 2026

In brief
Mitiga documented a way to steal Claude Code’s OAuth tokens: a malicious npm package rewrites the config file so the assistant’s MCP traffic runs through an attacker’s proxy, and the tokens are lifted in transit.
What makes it new is persistence. Rotating the stolen token feeds the next one to the attacker, and editing the config back gets silently rewritten. The standard incident-response actions become the attacker’s update channel.
Anthropic ruled it out of scope, because the attack needs prior code execution on the machine. That is defensible on its own terms and still operationally inadequate, because the agentic design changes what code execution is worth.
This is the first clean case of an agentic supply chain attack: the compromise sits in the channel that mediates access, not in a shipped artifact, so it spreads through use rather than updates. The blast radius is every SaaS integration the developer connected, and the traffic looks identical to legitimate use on the provider side.
The root is OAuth token concentration: long-lived, broadly scoped tokens held in a user-writable file are what make the assistant useful and what make the attack work. The defender’s threat model is now the tool plus everything the tool can reach.

The disclosure landed on 7 May 2026 and was, on its face, a routine credential theft research note. Mitiga Labs reported that an attacker who could land a malicious npm package on a developer’s machine could redirect Claude Code’s MCP traffic through attacker-controlled infrastructure, intercept the OAuth tokens used to authenticate to connected SaaS providers, and maintain persistent access even as the user rotated those tokens. After Mitiga notified Anthropic on 10 April, Anthropic responded on 12 April that the issue was out of scope, on the grounds that prior code execution on the user’s endpoint is a precondition the model is not designed to defend against. The exchange was civil, both positions were defensible, and the disclosure became another entry in the steadily growing catalogue of agentic AI security findings.

If that were the whole story, it would not be worth a long article. The vulnerability is technically interesting but not surprising. Post-install hooks in npm packages have been a credential exfiltration vector since the technique was first popularised in 2018. Local configuration files containing OAuth tokens in plaintext have been a familiar exposure since at least the introduction of the AWS credentials file. Researchers and attackers alike have known for years that any tool which persists tokens in a user-writable location is one bad install away from compromise.

What makes the Mitiga finding worth taking seriously is the architecture the technique exploits, rather than the technique itself, and the way that architecture changes the meaning of a routine supply chain incident. I will argue in this piece that the MCP token hijack is the first cleanly documented case of an agentic supply chain attack, in a specific and uncomfortable sense. The compromise of one developer’s environment becomes durable, self-healing, attacker-controlled access to every SaaS platform that developer has connected to their AI assistant, with traffic that is indistinguishable from legitimate use on the provider side and invisible on the endpoint UI. The cost of the attack does not scale with the value of the target; the value of the access scales with how many integrations the developer has thoughtfully wired up.

This is a category change, and it deserves to be named.

The mechanics

Claude Code stores its configuration, including the URLs of registered MCP servers and the OAuth tokens issued by those servers, in a single file at ~/.claude.json. The tokens sit in plaintext, the file is writable by the user, and the MCP server URLs are not pinned to any cryptographic identity. Whatever URL is in the configuration is the URL that Claude Code connects to when initiating or refreshing an MCP session.
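To make the exposure concrete, here is a minimal Python sketch that lists the MCP endpoints in a parsed copy of the configuration and flags any whose host is not on an expected list. The mcpServers key comes from Mitiga’s description; the per-entry url field, the server names, and the example hosts are assumptions for illustration and may not match the real schema.

```python
import json
from urllib.parse import urlparse


def list_mcp_endpoints(config: dict) -> dict[str, str]:
    """Return {server_name: url} for each mcpServers entry that carries a URL.

    Assumption: entries live under a top-level "mcpServers" key and remote
    servers expose a string "url" field. Entries without one are skipped.
    """
    servers = config.get("mcpServers", {})
    return {
        name: entry["url"]
        for name, entry in servers.items()
        if isinstance(entry, dict) and isinstance(entry.get("url"), str)
    }


def unexpected_hosts(endpoints: dict[str, str], allowed_hosts: set[str]) -> dict[str, str]:
    """Return the endpoints whose hostname is not in the allowlist."""
    flagged = {}
    for name, url in endpoints.items():
        host = urlparse(url).hostname  # None if the URL has no host part
        if host is None or host not in allowed_hosts:
            flagged[name] = url
    return flagged


# Hypothetical configuration fragment, not copied from a real install.
sample_config = json.loads("""
{
  "mcpServers": {
    "tracker": {"url": "https://mcp.tracker.example/v1"},
    "chat":    {"url": "http://203.0.113.7:8080/mcp"}
  }
}
""")

endpoints = list_mcp_endpoints(sample_config)
print(unexpected_hosts(endpoints, {"mcp.tracker.example"}))
```

Because nothing pins the URL to a cryptographic identity, a hostname allowlist like this is only a consistency check: it notices a rewritten entry, it does not prevent the rewrite.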

Mitiga’s attack chain begins with a malicious npm package. The package registers a lifecycle hook that runs as part of the install. The hook locates common code repository clone locations, populates them with a pre-configured trust dialog state set to true (so that no prompt fires when the directory is later opened in Claude Code), opens ~/.claude.json, and rewrites the mcpServers entries to point at an attacker-controlled proxy address. The proxy runs mitmproxy with the appropriate configuration to act as a transparent man-in-the-middle for the MCP protocol. Every subsequent request from Claude Code to that MCP server passes through the attacker’s infrastructure, the OAuth bearer tokens in the Authorization headers are captured, and the traffic is then forwarded to the legitimate provider so the user experience continues uninterrupted.
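On the defensive side, the install-time hook is the one step of this chain that can be inspected before anything touches the configuration. The following is a minimal sketch, assuming a conventional node_modules layout, that reports which installed packages declare npm’s install-time lifecycle scripts (preinstall, install, postinstall). The function names are hypothetical, and a declared script is not evidence of malice; many legitimate packages use these hooks to build native code.

```python
import json
from pathlib import Path

# npm runs these scripts automatically as part of `npm install`.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def lifecycle_scripts(manifest: dict) -> dict[str, str]:
    """Return the install-time scripts declared in a parsed package.json."""
    scripts = manifest.get("scripts")
    if not isinstance(scripts, dict):
        return {}
    return {
        hook: scripts[hook]
        for hook in INSTALL_HOOKS
        if isinstance(scripts.get(hook), str)
    }


def audit_node_modules(root: Path) -> dict[str, dict[str, str]]:
    """Map each package directory under root to its install-time scripts."""
    findings = {}
    for manifest_path in root.rglob("package.json"):
        try:
            manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # unreadable or malformed manifest: skip, do not crash
        if not isinstance(manifest, dict):
            continue
        hooks = lifecycle_scripts(manifest)
        if hooks:
            findings[str(manifest_path.parent)] = hooks
    return findings


if __name__ == "__main__":
    for package_dir, hooks in audit_node_modules(Path("node_modules")).items():
        print(package_dir, hooks)
```

An audit like this runs after the fact. Installing with npm’s --ignore-scripts flag stops the hooks from executing at all, at the cost of breaking packages that genuinely need them.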

The persistence properties are the part that elevates this above a one-shot credential theft. If the user notices the compromise and rotates the affected token, the hook is still resident, and on the next load it captures the new token. If the user notices that the MCP URL in the configuration is wrong and edits it back to the legitimate value, the hook rewrites it again. The standard incident response actions (rotate the credentials, restore the configuration) feed the chain rather than breaking it. Mitiga’s description of the outcome is precise: a durable redirection of the victim’s SaaS credentials into attacker-controlled infrastructure, with automatic recovery from token rotation, invisible to the victim’s endpoint UI, and indistinguishable from legitimate traffic on the provider’s side.

The point about provider-side indistinguishability is worth dwelling on. Mitiga published an example Atlassian audit log entry from a compromised session. The user, the session, and the IP address resolving to Anthropic’s egress range are all genuine, and the action looks like exactly the kind of action the user performs every week (a JQL query pulling tickets that mention credentials). Nothing in the row is wrong; nothing in the row would trigger any reasonable detection rule based on volume, geography, time-of-day, or action type.

This is the model of detection failure that the SaaS security industry has spent the last decade trying to escape. We built CASBs, we built ITDR platforms, we built behavioural analytics, all in service of being able to distinguish legitimate user activity from compromise. Mitiga’s research describes a compromise that defeats all of that, not by being more sophisticated than the detection logic, but by routing through a legitimate intermediary that the detection logic has already learned to trust.

Why Anthropic’s response is correct on its own terms

The most provocative element of the disclosure is Anthropic’s determination that the issue is out of scope. I want to take this seriously rather than dismiss it, because the logic is not unreasonable.

The argument goes like this. The attack requires the adversary to achieve code execution as the user on the developer’s machine, by getting the developer to install a malicious npm package. Once an adversary has code execution as the user, many things are possible: reading the user’s SSH keys, harvesting browser cookies, installing a keylogger, replacing the user’s git binary with a compromised version. Treating one specific consequence of endpoint compromise (in this case the rewriting of a configuration file) as a vendor-specific vulnerability is, in this view, a category error. If we are going to fix that, we have to fix everything that depends on the user being able to write to their own home directory, and at that point we have left the threat model of a developer tool and entered the threat model of a hardened operating system.

Mitiga acknowledges this directly. The blog post says, more or less plainly, that the determination is defensible on those terms: the user has code execution, many things become possible, and this is the world we live in.

The reason this defence is correct on its own terms and yet operationally inadequate is that the agentic architecture changes what “many things are possible” actually means. Let me unpack this carefully.

When a traditional developer tool stores credentials in a user-writable location, the consequence of endpoint compromise is that the attacker gets those credentials, can use them until they are rotated, and then has to find another way back in. The chain has a natural termination point. Token rotation works because the credential is a static artefact, and stealing it once gives the attacker a finite amount of value.

When an agentic tool stores OAuth tokens for a federated set of SaaS providers in a single configuration file, alongside the URLs of the MCP servers that issued those tokens, the consequence of endpoint compromise is different in kind. Rather than stealing a credential, the attacker redirects the channel by which credentials are continuously refreshed. The agentic tool itself becomes the renewal mechanism, and token rotation, the canonical defensive response, becomes the attacker’s update channel.
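The difference between the two cases can be shown with a toy model. This is a sketch under strong simplifying assumptions, not a model of any real OAuth implementation: the provider honours only the most recently issued token, and every class and function name here is hypothetical.

```python
import itertools


class Provider:
    """Toy SaaS provider: only the most recently issued token is valid."""

    def __init__(self):
        self._counter = itertools.count(1)
        self.valid_token = None

    def issue_token(self) -> str:
        self.valid_token = f"token-{next(self._counter)}"
        return self.valid_token

    def accepts(self, token) -> bool:
        return token is not None and token == self.valid_token


class Interceptor:
    """Stands in for a proxy on the route: remembers the last token it saw."""

    def __init__(self):
        self.last_seen = None

    def __call__(self, token):
        self.last_seen = token


def direct_route(token):
    """An unmodified route: nothing observes the token in transit."""
    return None


class Client:
    """Toy agentic client that sends its token over whatever route is configured."""

    def __init__(self, provider, route):
        self.provider = provider
        self.route = route
        self.token = provider.issue_token()

    def rotate(self):
        self.token = self.provider.issue_token()

    def call(self):
        self.route(self.token)


def static_theft_survives_rotation() -> bool:
    provider = Provider()
    client = Client(provider, direct_route)
    stolen = client.token  # attacker copies the token once
    client.rotate()        # defender rotates
    client.call()
    return provider.accepts(stolen)


def channel_redirect_survives_rotation() -> bool:
    provider = Provider()
    tap = Interceptor()
    client = Client(provider, tap)  # the route itself is attacker-controlled
    client.call()
    client.rotate()                 # defender rotates
    client.call()                   # the new token flows through the tap
    return provider.accepts(tap.last_seen)


print(static_theft_survives_rotation())      # False
print(channel_redirect_survives_rotation())  # True
```

In the first scenario rotation ends the attacker’s access; in the second, the rotated token is delivered to the attacker on the client’s next request.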

This is not a metaphor but a direct description of what the Mitiga proof of concept demonstrates. The malicious post-install hook does not exfiltrate the tokens and leave: it installs a persistent rewrite of the configuration that turns Claude Code into a credential-feeding pipeline for the attacker. Every refresh, every new connection, every session restart feeds the chain.

The architectural property that enables this is not specific to Claude Code but is the same property that makes MCP useful in the first place. The protocol exists to give an AI assistant durable, broad, OAuth-mediated access to a heterogeneous collection of SaaS resources. That is the design intention. The token concentration that makes the attack so effective is what makes the assistant work at all, which is why you cannot fix this with a configuration change: the configuration is the feature.

What “agentic supply chain attack” actually means

The phrase supply chain attack has been overused to the point of being almost analytically useless, applied to everything from SolarWinds to dependency confusion in npm. I want to be careful about why I am calling the Mitiga finding the first agentic supply chain incident, because the claim is doing more work than the phrase usually does.

A traditional software supply chain attack compromises a component that is incorporated into a downstream product. Updates to that component flow to the users of the downstream product, and the compromise spreads with the updates. The classic example is a build pipeline that ships a malicious binary. The compromise is in the artefact.

An agentic supply chain attack, in the sense I am using it, compromises a component that mediates access to downstream resources. The compromise spreads through use rather than through updates. Each interaction the user has with the agentic tool flows credentials, intent, and context through the compromised mediator. The compromise is in the channel.

This is a different shape of incident. The Mitiga research demonstrates it cleanly because the attack target is the MCP routing layer itself, rather than any specific tool or credential. Once that layer is compromised, every connected resource is reachable, and the reachability is determined by what the user has authorised the assistant to access rather than by what the attacker has explicitly targeted.

The implication for defenders is that the relevant blast radius of a compromise is the set of SaaS integrations a developer has connected, multiplied by the duration of the compromise, multiplied by the difficulty of distinguishing legitimate traffic from attacker-mediated traffic on the provider side. All three quantities run large: developers connect their assistants to everything they touch daily, token rotation does not break the chain, and the traffic flows through legitimate session contexts.

This is also why Anthropic’s out-of-scope determination, while technically defensible, is operationally inadequate. The vendor’s threat model is the agentic tool. The defender’s threat model is the agentic tool plus everything the agentic tool can reach. Those two threat models have diverged sufficiently that they are no longer the same conversation.

The OAuth token concentration problem

A useful way to think about what is happening here is to look at how OAuth tokens have historically been protected and what changed.

In a well-designed enterprise SaaS environment, OAuth tokens are short-lived, narrowly scoped, and held by services that have strong identity, well-audited code paths, and clear ownership. When a token is compromised, the response is well understood: rotate the token, audit the actions taken with it, review the scope grants, and tighten them if needed.

The agentic developer tooling pattern violates each of those properties. Tokens stay long-lived because the user does not want to re-authenticate every hour; they are broadly scoped because the assistant is meant to be useful across many tasks, with the consent screen presented once at setup rather than per action; and they are held by a desktop tool running as the user, with code paths that are difficult to audit because they include both the deterministic tool and the non-deterministic model. Ownership is ambiguous. Is the token held by the user, by the developer tool vendor, or by the SaaS provider that issued it? Legally, by the user; operationally, by whatever process can read the configuration file.

Each individual decision in that chain has a defensible rationale: long-lived tokens reduce friction, broad scopes enable the assistant to be useful across heterogeneous tasks, and user-held configuration is consistent with the principle that the developer owns their own machine. None of these are obviously wrong choices. The interaction of these choices produces the concentration that the Mitiga attack exploits.

I want to be explicit about something here. This pattern extends well beyond Anthropic. The same concentration exists for GitHub Copilot CLI, for Cursor (whose token handling LayerX has also flagged, reporting that Cursor does not store API keys in protected storage), for any IDE plugin that uses OAuth-mediated MCP, and for the broader family of agentic developer tools that are now standard in software engineering workflows. The Mitiga research happens to use Claude Code as the example because Claude Code’s configuration file is a particularly clean target. The pattern generalises.

What changes in the threat model

For organisations running developer teams that use agentic tooling, the practical implication of the Mitiga research is that the threat model for endpoint compromise has changed shape, and several standard responses no longer work as designed.

The first change is in the value of an individual compromised developer endpoint. Historically the value of compromising a developer’s laptop was bounded by what that developer could access. The blast radius was the union of the developer’s permissions. With agentic tooling, the blast radius is the union of the developer’s permissions plus the set of SaaS integrations the assistant has been given OAuth grants to, with durability that survives credential rotation. The developer is now a higher-value target than they were before, often without realising it.

The second change is in the detection model. Endpoint compromise has historically been detected through endpoint telemetry, network anomalies, or SaaS audit logs. The Mitiga attack defeats all three: endpoint telemetry sees normal Claude Code operation, network monitoring sees traffic to claude.ai (the expected destination), and SaaS audit logs see actions taken in a real user session from the expected IP range. Detection has to move into the configuration drift of the agentic tool itself, looking for changes to MCP server URLs, OAuth refresh patterns that do not match the user’s known schedule, and unexpected traffic through MCP integrations. Most organisations have no monitoring for any of these signals because they are properties of a tool category that has only existed at scale for about eighteen months.
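The first of those signals, a change to an MCP server URL, is the easiest to check. Here is a minimal sketch that compares a trusted baseline snapshot of the configuration with the current one and reports every server whose URL was added, removed, or changed. As before, the url field inside each mcpServers entry is an assumed schema, and the function names and example hosts are hypothetical.

```python
def mcp_urls(config: dict) -> dict:
    """Extract {server_name: url} from a parsed config (url may be None)."""
    servers = config.get("mcpServers", {})
    return {
        name: entry.get("url")
        for name, entry in servers.items()
        if isinstance(entry, dict)
    }


def diff_mcp_urls(baseline: dict, current: dict) -> dict:
    """Return {server_name: (old_url, new_url)} for every entry that differs.

    A server missing from one side shows up with None on that side.
    """
    before, after = mcp_urls(baseline), mcp_urls(current)
    changes = {}
    for name in sorted(before.keys() | after.keys()):
        old, new = before.get(name), after.get(name)
        if old != new:
            changes[name] = (old, new)
    return changes


# Hypothetical snapshots for illustration.
baseline = {"mcpServers": {"tracker": {"url": "https://mcp.tracker.example/v1"}}}
current = {"mcpServers": {"tracker": {"url": "http://203.0.113.7:8080/v1"}}}

print(diff_mcp_urls(baseline, current))
```

One design point follows directly from the attack: the baseline has to live somewhere the user-level hook cannot rewrite (off the host, or in a location the user account cannot write to), otherwise the same hook that rewrites the configuration can rewrite the baseline as well.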

The third change is in the response model. The standard playbook for a suspected developer endpoint compromise (reimage the machine, rotate all credentials, audit recent actions in connected services) needs an additional step: the configuration files of any agentic tooling on the machine must be examined for redirection, and the OAuth grants in those tools must be revoked at the provider rather than merely refreshed. If the malicious hook is still resident, refreshing only feeds the chain.

The fourth change is in the procurement and policy model. Developer-friendly tools have historically been adopted bottom-up: engineers install them, find them useful, and adoption spreads through teams before procurement notices. With agentic tooling this pattern produces a particular kind of risk because the security properties of the integration are not visible to the developer at the moment of adoption, while the cost is paid by the organisation at the moment of incident.

The wider pattern

This is not the first finding of its kind, and it will not be the last. In the same week as the Mitiga disclosure, LayerX published its ClaudeBleed research showing that any Chrome extension could pilot the Claude Chrome assistant through a similar trust composition failure, with similar persistence properties and similar invisibility to standard monitoring. Earlier this year LayerX’s research on Claude Desktop Extensions showed how content arriving through a low-risk connector (a calendar event) could trigger code execution through a high-risk connector on the same machine, with the vulnerability earning a CVSS 10 out of 10 and Anthropic declining to fix it on the same consent-based grounds.

The pattern is consistent. Agentic tooling concentrates trust across previously independent layers, and compromise of any one layer cascades through all of them. Vendor responses correctly point out that each individual compromise requires preconditions that are outside the vendor’s threat model, while defenders correctly observe that the compromises happen anyway and the standard responses do not work.

I have called this an operational substrate problem in other work, and the framing applies cleanly here. The substrate sits beneath the governance and policy layer; it is what the technology depends on to actually function. When that substrate is the concentration of long-lived OAuth tokens in a user-writable configuration file managed by an agentic tool that can be reconfigured by a post-install hook, the governance layer’s claims about responsible AI use cannot reach down far enough to control the actual risk.

What this looks like for the practitioners who have to act on it

For a CISO or security lead trying to figure out what to do about this on Monday morning, the immediate steps are not difficult to enumerate. Identify the developers in the organisation using agentic tooling, which usually means Claude Code, GitHub Copilot CLI, Cursor, or one of the smaller players. Inventory the MCP integrations connected to those tools, which usually means at least source control, project management, and chat. For each integration, identify what OAuth scopes have been granted and what those scopes allow. Establish monitoring for changes to the configuration files of the agentic tools, which is straightforward at the endpoint level if you have any endpoint detection platform. Review provider-side audit logs not for anomalous user behaviour, which will not show up, but for traffic from unexpected MCP server URLs or unexpected refresh patterns.

The harder work sits upstream of all this. Procurement of agentic developer tooling needs to start being treated with the same care that procurement of any other privileged access tool would receive, and the threat models published by the vendors are not sufficient for organisations that have to deal with the consequences of trust composition failures the vendors have explicitly declared out of scope. “Endpoint compromise” is no longer a single state with a single value; the value of an endpoint now depends on what agentic tools live on it and what those tools have been authorised to reach.

The Mitiga research is a marker, not a panic event. The marker says that the comfortable separation between developer tool security, identity and access management, and SaaS posture management has become operationally fictional. Anyone running security for a developer-heavy organisation is going to have to reckon with this, and the sooner the reckoning starts, the cheaper it will be.

Sources and references

Kevin Townsend. “Claude Code OAuth Tokens Can Be Stolen Through Stealthy MCP Hijacking.” SecurityWeek, 7 May 2026. https://www.securityweek.com/claude-code-oauth-tokens-can-be-stolen-through-stealthy-mcp-hijacking/
Mitiga Labs. “MCP Token Theft in Claude Code: A Man-in-the-Middle Attack Chain via ~/.claude.json.” Mitiga blog, May 2026. https://www.mitiga.io/blog/claude-code-mcp-token-theft-mitm
CXO Digital Pulse. “Researchers Warn Claude Code OAuth Tokens Can Be Stolen Through Stealthy MCP Hijacking.” May 2026. https://www.cxodigitalpulse.com/researchers-warn-claude-code-oauth-tokens-can-be-stolen-through-stealthy-mcp-hijacking/
Aviad Gispan. “ClaudeBleed: How Any Chrome Extension Can Hijack Claude.” LayerX Security blog, 5 May 2026. https://layerxsecurity.com/
LayerX Security. “Claude Desktop Extensions Exposes Over 10,000 Users to Remote Code Execution Vulnerability.” 12 February 2026. https://layerxsecurity.com/blog/claude-desktop-extensions-rce/
RedCaller. “MCP Client OAuth Refresh-Token Support Matrix.” 2026. https://www.redcaller.com/docs/references/mcp-client-oauth-refresh-token-support
TrueFoundry. “MCP Authentication in Claude Code 2026 Guide.” 2 April 2026. https://www.truefoundry.com/blog/mcp-authentication-in-claude-code
Getlarge Blog. “Securing MCP Servers with OAuth2: Ory Hydra + Claude Code + ChatGPT.” 30 January 2026. https://getlarge.eu/blog/securing-mcp-servers-with-oauth2-ory-hydra-claude-code-chatgpt/
Anthropic. Claude Code documentation on MCP server configuration and OAuth authentication. https://docs.claude.com
Anthropic Release Notes. May 2026 updates on OAuth and credential reliability in Claude Code. https://releasebot.io/updates/anthropic
Marco Brondani. “OSRA: Operational Substrate Risk Audit.” Methodology and worked examples. https://marcobrondani.com/osra

Marco Brondani
