There is no free()
Claude Code's extended thinking is not a use-after-free. The real exposure is indirect prompt injection into a tool-holding agent - OWASP LLM01, ATLAS T0051.
A use-after-free is a memory-corruption primitive. A pointer continues to reference a heap chunk after that chunk has been returned to the allocator. CWE-416. Reclaim the chunk with attacker-controlled data, dereference the stale pointer, and the program operates on a structure it no longer owns. The claim that Claude Code’s extended thinking exhibits use-after-free behaviour fails at the first layer. There is no heap chunk. There is no free(). There is no dangling pointer in a stream of generated tokens. The premise maps a memory-safety concept onto a process with no memory-safety boundary to violate.
The mechanism the term actually describes is worth stating, because the gap is the whole point. A UAF lives in a process heap. On glibc, freed chunks land in the tcache or the fastbins. On Windows, the Low Fragmentation Heap services the same size classes. The exploit primitive is reallocation control. Free an object, allocate a different object of the same size class into the same chunk, and the original pointer now reads or writes a type it was never meant to touch. Type confusion follows. A controlled write into a function pointer, a vtable, or an allocator metadata field follows that. The end state is control of execution. Every step depends on a real allocator managing real virtual memory.
Extended thinking is autoregressive token generation. The model emits intermediate reasoning tokens before it emits the final answer. Each token is sampled from a probability distribution over the vocabulary, conditioned on every prior token. The only state that resembles memory is the KV cache - the cached attention keys and values for tokens already processed. That cache is not a heap object an attacker frees and reclaims with a spray. It is not indexed by an offset the attacker supplies. There is no free list, no chunk reuse, no stale pointer dereference that lands execution on attacker bytes. The structure nearest to an allocation is read-only with respect to anything an external party controls.
Push the analogy as far as it goes and it still breaks. Suppose an attacker influences the context - prior turns, retrieved documents, cached prefixes in a shared serving stack. The worst achievable outcome is altered output, because the cache holds values the model attends to, not pointers the model dereferences. There is no metadata field to corrupt, no allocator invariant to break, no path from a poisoned cache entry to control of the host process running inference. Cache poisoning in a serving stack is a real research area. It changes what the model says, not where the instruction pointer goes. The distance between “the model produced different tokens” and “the attacker controls execution” is the entire exploit, and nothing in the described condition crosses it.
The phrase “bounds checking during extended prompt execution” compounds the error. Bounds checking validates an index against the length of an array before a read or write. The failure mode it prevents is an out-of-bounds access into adjacent memory. Reasoning tokens are not an array indexed by an attacker-supplied offset. A long prompt does not walk a pointer past the end of a buffer. The model can produce wrong, inconsistent, or manipulable output. That is a property of statistical inference, not of an unchecked array index. A logit distribution cannot be heap-sprayed. A fakeobj primitive cannot be landed in a softmax.
The described vulnerability does not exist as stated. No CVE has been assigned, no CVSS vector applies, and no advisory will issue one, because the condition is not a memory-safety defect. That is the accurate finding. It is also the less interesting one, because a real exposure sits underneath, and the fabricated framing buries it.
The genuine concern with an LLM coding agent is the input it trusts and the tools it holds. Claude Code reads files, fetches content, edits source, and executes shell commands under the operator’s identity. The reachable input includes any file in the working tree, any dependency it inspects, and any web content it retrieves. That input is not validated as hostile. The relevant bug class is indirect prompt injection. OWASP catalogues it as LLM01. MITRE ATLAS tracks it as AML.T0051, prompt injection, with indirect variants where the instruction is staged in content the model ingests rather than typed by the user.
The mechanism is a trust-boundary violation, not a memory one. Instructions and data share one channel - the context window. A malicious instruction placed in a README, a code comment, a dependency’s documentation, or a fetched web page enters the context as ordinary text. The model holds no hard separation between content to reason about and commands to obey. When the injected text says to run a command, write a file, or read an environment variable, and the agent holds the tool permission to do it, the instruction reaches the host. This is the RCE-equivalent. It routes through T1059, command interpreter execution, on the host, under whatever identity launched the agent. No memory corruption is involved. The agent was asked, and it complied.
The chain has the shape of a supply-chain compromise, which is why it deserves the attention the UAF framing wastes. Initial access is a poisoned artifact - a dependency whose README carries injected instructions, a pull request with a crafted comment, a documentation page the agent fetches mid-task. ATLAS catalogues the staging under AML.T0051, indirect prompt injection. The agent ingests the content as part of normal operation. If the injection succeeds, it reaches the tool layer on the same turn, with no separate escalation step, because the agent already holds the permissions. Persistence is possible where the injection writes instructions back into project files the agent reads on a later run. None of this requires a single byte of memory corruption.
The inversion is worth naming directly. An LLM does not contain a use-after-free in its reasoning. It can write one into the code it generates. A model emitting C that frees a buffer and uses it on a later path produces CWE-416 in the output artifact. That is a real and measurable failure - insecure code suggestion - and it is the opposite of the claim. The defect lands in the generated source, where a reviewer and a sanitizer catch it, not in the token stream that produced it.
Credential and context exposure is the second real path. The agent’s context window can hold secrets - an .env file it read, an API key in a config, a token pasted into the session. An injection that steers the agent to embed that material in a tool call, a commit, or an outbound request exfiltrates it. Where the context holds regulated data, this is a reportable exposure under the Privacy Act, and under SOCI obligations where the host operates critical infrastructure. The control is scoping what the agent can read and treating its context as a credential store, not a scratchpad.
A red-team exercise against this kind of agent does not fuzz the model for memory bugs. It stages injected instructions in every input the agent trusts - repository files, dependency metadata, retrieved web content, tool output - and measures whether any of them redirect the agent’s tool calls. The objective is an unauthorised command, an unintended file write, or egress of context data. The finding is a permissions-and-trust result, expressed as which inputs reached which tools, not a crash with a controlled instruction pointer. That is the assessment that maps to the real exposure.
Telemetry is where the fabricated framing collapses, because it points defenders at the wrong layer. There is no Sysmon event for a reasoning flaw. The model’s inference is opaque to host telemetry. No EDR sensor reads logits, and no SIEM rule correlates attention weights. What is observable is the tool boundary. When the agent spawns a shell, Sysmon Event ID 1 records the process creation, with the parent being the Node or Python runtime hosting the agent. Windows Security Event 4688 captures the command line where command-line auditing is enabled. Sysmon Event ID 3 records outbound connections from that process tree. File writes surface as Event ID 11. The signal is entirely at the point where the model crosses into the operating system, never inside the model.
That is the detection-engineering reality. The monitored events are agent-initiated process creation, file modification, and network egress - not the content of the reasoning. A useful correlation is a read of untrusted external content followed closely by an anomalous tool invocation, which is the observable shadow of an indirect injection succeeding. The blind spot is the decision itself. The host sees that the agent ran a command, never why it chose to. Detection has to assume the reasoning is unauditable and instrument the consequences.
The same boundary holds in CI/CD. The agent runs under a pipeline identity, and the events to watch are process creation and egress from the build job, with the pipeline’s credentials as the asset at risk. Over-provisioned tokens in that job become attacker capability the moment an injection lands, which is the same failure pattern that turns a scanner into a payload.
The residual reality, post-framing. There is no patch boundary, because there is no vulnerability of the described class. The exposure that does exist is structural - an agent that trusts its input context and holds tools that reach the host and the network. The controls are capability scoping, isolation of the execution environment, treating all ingested file and web content as untrusted instruction input, and logging every tool call as a security event. Suspected compromise of an agent session - anomalous commands, unexpected egress, secret access outside the task - escalates to the team that owns the host identity, the same as any endpoint executing untrusted instructions.
The label was wrong. A use-after-free is a dangling pointer into freed heap memory. Extended thinking is generated text. Conflating the two produces a CVE that will never exist and aims an incident response team at a layer with no telemetry. The actual exploitation vector is prompt injection into a tool-holding agent, mapped to OWASP LLM01 and ATLAS AML.T0051. It is detectable at the boundary where the model touches the operating system, and nowhere before it.
Contains a referral link.
Keep Reading
use-after-freeCypherpunk frees the key schedule twice
UAF in the Cypherpunk Library's context teardown - CWE-416, heap reuse, sandbox-free RCE path, and why EDR misses the corruption stage.
prompt-injectionThe chatbot answered the door for attackers
Meta's Instagram chatbot abuse case is a prompt injection and confused deputy failure. Technical breakdown of the vector, telemetry gap, and residual exposure.
honeypotBinding 65535 ports is the easy part
Architecture and evasion realities of an LLM honeypot binding all 65535 ports - TPROXY, latency tiers, fingerprint defence, and detection traps.
Stay in the loop
New writing delivered when it's ready. No schedule, no spam.