Read the mark hidden in your bot's requests

A claim is circulating: Claude Code embeds steganographic markers in the requests it generates. Set the accusation aside. The framing is what matters. This is not the claim that a model mass-watermarks every token for provenance. It is narrower. Specific requests carry a hidden mark, placed deliberately, readable after the fact. That is not brute force. That is targeted marking. Targeted marking is a reconnaissance primitive, and reconnaissance primitives get modeled differently from noise.

Separate what is established from what is asserted. Text steganography in machine-generated output is trivial, documented for years, and nearly invisible to standard logging. Established. That any specific vendor build ships a covert marking channel by default is not established. Treat the vendor-specific claim as a hypothesis to verify against captured output, not as a finding. The mechanism holds regardless of who runs it. The same primitive works when an attacker marks a target’s own automated traffic, and that is the version detection engineers should plan for. The distinction is not academic. A defender who plans only for the vendor case builds the wrong detection and misses the attacker case entirely.

Three techniques get flattened into one word, and the flattening is where analysis fails. Steganography hides a payload inside a cover channel. The mark is a message. Watermarking embeds a robust statistical signal across many tokens. The mark is a probability distribution. Fingerprinting identifies a source without hiding anything. The mark is a correlation. Marking a request for later recognition is steganography fused with fingerprinting. It hides a short, near-unique token in output that renders clean to a human and survives copy-paste. The goal is not to carry data. The goal is to make one request recognizable later.

The carriers are Unicode. A string that looks like plain ASCII can hold codepoints that occupy zero visual width. Zero-width space U+200B, zero-width non-joiner U+200C, zero-width joiner U+200D, word joiner U+2060, and the byte-order mark U+FEFF render as nothing in most surfaces. Chain them and there is a binary channel inside whitespace. Within the Unicode Tags block, U+E0000 to U+E007F, the tag characters U+E0020 to U+E007E mirror printable ASCII 0x20 to 0x7E one-to-one and render invisibly in most surfaces. They are the basis of the ASCII-smuggling technique documented against LLM interfaces, where hidden instructions ride tag characters that the model tokenizes and a human never sees. Variation selectors extend the method: VS1 to VS16 at U+FE00-U+FE0F and the supplement at U+E0100-U+E01EF can append to a base glyph and smuggle arbitrary bytes, demonstrated publicly in 2025. Homoglyph substitution, Cyrillic а for Latin a, is a lower-bandwidth carrier that survives aggressive normalization, because Unicode normalization does not fold one script into another.
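A minimal sketch of the tag-character carrier, using only the Python standard library. The function names and the marker value `req-7f3a` are hypothetical, chosen for illustration. Nothing here describes any vendor's scheme. It only shows why a string can carry a short token that a renderer does not draw.

```python
TAG_BASE = 0xE0000  # start of the Unicode Tags block


def hide_in_tags(token: str) -> str:
    """Map printable ASCII onto the invisible tag characters U+E0020-U+E007E."""
    if not all(0x20 <= ord(c) <= 0x7E for c in token):
        raise ValueError("token must be printable ASCII")
    return "".join(chr(TAG_BASE + ord(c)) for c in token)


def reveal_tags(text: str) -> str:
    """Recover any ASCII carried by tag characters; empty string if none."""
    return "".join(
        chr(ord(c) - TAG_BASE) for c in text if 0xE0020 <= ord(c) <= 0xE007E
    )


visible = "GET /api/v1/status"
marked = visible + hide_in_tags("req-7f3a")  # hypothetical marker value

# The two strings usually render identically, but they are not equal,
# and the hidden token is recoverable by anyone who knows where to look.
print(marked == visible)    # False
print(reveal_tags(marked))  # req-7f3a
```

The same `reveal_tags` call is the defender's side of the technique: run it on raw captured text, before any sanitizer touches it.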

None of this is new. CVE-2021-42574, the Trojan Source class, weaponized Unicode bidirectional control characters, the embeddings and overrides at U+202A to U+202E and the isolates at U+2066 to U+2069, to make source code read one way to a human and compile another way. It was rated high severity, and it worked because review pipelines rendered the deception invisible. The reader trusts the render. The render lies. Invisible-character attacks against code review, against WAF regex, against copy-paste trust: the pattern is a decade deep. Marking requests is the same physics pointed at a different objective.

Statistical watermarking is the other branch and it behaves nothing like the above. The 2023 method of Kirchenbauer et al. seeds a green-list partition of the vocabulary from a hash of preceding tokens, then biases the sampler toward green tokens. A detector that knows the hashing key runs a z-test on green-token frequency and flags machine origin without ever seeing a hidden character. Google DeepMind’s SynthID-Text ships this class in production. Meta has published adjacent watermarking work. The consequence for defenders: the mark is not in the bytes, it is in the token-selection distribution, and it is designed to be hard to detect without the key. If a scheme is statistical rather than character-based, character-level scanning finds nothing. That is not the absence of a mark. That is the wrong detector.
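The z-test can be sketched as follows. This is a toy under stated assumptions, not the published implementation. The keyed SHA-256 partition in `is_green`, the key, the integer token IDs, and the always-green sampler `toy_marked_tokens` are all hypothetical stand-ins. What follows the green-list approach is the statistic itself: with T scored tokens and green fraction gamma, z = (green - gamma*T) / sqrt(T * gamma * (1 - gamma)).

```python
import hashlib
import math
from typing import List


def is_green(prev_token: int, token: int, key: bytes, gamma: float = 0.5) -> bool:
    """Toy keyed partition: hash (key, prev, token) into [0, 1), compare to gamma."""
    material = key + prev_token.to_bytes(4, "big") + token.to_bytes(4, "big")
    bucket = int.from_bytes(hashlib.sha256(material).digest()[:8], "big")
    return bucket / 2**64 < gamma


def z_score(green: int, scored: int, gamma: float = 0.5) -> float:
    """One-proportion z statistic for the observed green-token count."""
    if scored < 1:
        raise ValueError("need at least one scored token")
    return (green - gamma * scored) / math.sqrt(scored * gamma * (1.0 - gamma))


def green_z(tokens: List[int], key: bytes, gamma: float = 0.5) -> float:
    """Score a token sequence; the first token has no predecessor and is skipped."""
    pairs = list(zip(tokens, tokens[1:]))
    green = sum(is_green(p, t, key, gamma) for p, t in pairs)
    return z_score(green, len(pairs), gamma)


def toy_marked_tokens(length: int, key: bytes, vocab: int = 1000) -> List[int]:
    """Stand-in for a 'hard' watermark: always emit the lowest green token."""
    tokens = [0]
    while len(tokens) < length:
        prev = tokens[-1]
        tokens.append(next(t for t in range(vocab) if is_green(prev, t, key)))
    return tokens


marked = toy_marked_tokens(101, b"demo-key")  # 100 scored tokens, all green
print(green_z(marked, b"demo-key"))           # 10.0
```

Scoring the same tokens with a different key should, in expectation, give a z-score near zero. That is the defender's problem in one line: without the key, the text looks ordinary.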

Now the behavior model, because this is where the topic lives. An operator who wants intelligence does not spray. Spraying is loud and it generalizes. Marking is quiet and it specifies. A steganographic request mark is valuable precisely because it is targeted. It answers a question the observer already holds. Which of these thousand inbound requests came from the automated agent and not a human? Did the request that hit the staging API also hit production? Is the same tool driving traffic across three systems that do not otherwise correlate? A unique invisible token embedded at generation, surviving into logs, tickets, commits, and forwarded messages, lets an observer stitch a target’s automated activity together after the fact. That is reconnaissance (MITRE ATT&CK T1592 and T1589, the gathering of victim host and identity information) carried by a channel the target’s own tooling propagates for free.

Model the operator, not the tool. An intelligence collector optimizes for two things: signal that survives handling, and signal that does not draw attention. A high-bandwidth covert channel, kilobytes hidden per message, is fragile and conspicuous. It breaks under re-encoding and it spikes under entropy analysis. A single recognizable token of a few bytes is neither. It survives paraphrase of the surrounding visible text because it is not in the visible text. It survives forwarding, quoting, and pasting into a ticket. It does not raise entropy because it is short. The collector is not moving data. The collector is planting a recognizable object in content that humans and systems will carry to places the collector cannot otherwise reach. That is why the targeted framing matters. The value is the correlation the mark enables, not the payload it carries.

The persistence is the underrated part. A visible fingerprint - a distinctive comment style, a header ordering, a TLS JA3 hash - lives in one layer and dies when that layer is stripped. An invisible-Unicode mark lives in the content itself and rides wherever the content goes. Generated code committed to a repository. A summary pasted into a Jira ticket. A message forwarded to a vendor. Each hop is a fresh observation point for anyone holding the mark and a detector. A request that started in one system becomes an identifiable object in five. Network-layer fingerprinting cannot do that. Content-layer marking can, and that reach is the entire reason an operator would choose it.

The obfuscation maps cleanly. T1001.002, data obfuscation through steganography. T1027.003, obfuscated information via steganography. T1205, traffic signaling, when the mark gates or triggers downstream behavior. Recon value and C2 value are separable but built on one primitive: a covert channel that rides legitimate content and passes review. An attacker does not need to exfiltrate through it. Recognition is enough. Knowing which request is which is intelligence, and intelligence precedes the intrusion rather than following it. MITRE has no clean technique for prompt-layer marking, and that gap in the matrix is itself a detection gap - analysts map to what exists and skip what does not.

What defenders see is close to nothing, and the reason is structural. Logging pipelines normalize. SIEM ingestion commonly strips control characters, collapses whitespace, coerces to ASCII or lossy UTF-8, and truncates fields. Any one of those steps can destroy the mark before an analyst queries it. The zero-width run is gone at ingest. The tag characters are escaped or dropped by an encoder along the way. The variation selectors are mangled by the terminal that rendered the log line. The mark existed on the wire and does not exist in the index. Cloudflare, WAF tiers, and CDN normalization sit in the same position. They may pass the codepoints through untouched or scrub them, and which one happens is rarely documented and almost never tested. Defenders are blind not because the signal is faint but because their own pipeline deletes it upstream of detection.
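A small illustration of that loss, assuming a hypothetical ingest step. `simulate_lossy_ingest` is not modeled on any specific SIEM. It only applies two of the transformations named above (dropping format characters, collapsing whitespace) so the before and after can be compared.

```python
import unicodedata
from typing import List


def hidden_codepoints(raw: bytes) -> List[str]:
    """List format-class (Cf) codepoints present in raw UTF-8 bytes."""
    text = raw.decode("utf-8", errors="replace")
    return [f"U+{ord(ch):04X}" for ch in text if unicodedata.category(ch) == "Cf"]


def simulate_lossy_ingest(raw: bytes) -> str:
    """Hypothetical ingest: drop Cf characters, then collapse whitespace."""
    text = raw.decode("utf-8", errors="replace")
    kept = "".join(ch for ch in text if unicodedata.category(ch) != "Cf")
    return " ".join(kept.split())


wire = "deploy\u200b\u200c\u200b  staging".encode("utf-8")
indexed = simulate_lossy_ingest(wire)

print(hidden_codepoints(wire))                     # ['U+200B', 'U+200C', 'U+200B']
print(repr(indexed))                               # 'deploy staging'
print(hidden_codepoints(indexed.encode("utf-8")))  # []
```

The indexed string is clean and searchable, and the evidence is gone. Running the first function on the wire bytes and the second on nothing at all is the cheapest way to find out which side of that line a given pipeline sits on.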

Detection has to run before normalization, on raw captured bytes. The rule set is enumerable. Flag any format-class Cf codepoint outside expected ranges. Flag the Tags block U+E0000 to U+E007F outright, because legitimate text almost never uses it outside emoji subdivision-flag sequences. Flag zero-width runs across U+200B-U+200F, U+2060-U+2064, and U+FEFF. Flag bidi controls U+202A-U+202E and U+2066-U+2069. Flag variation selectors on non-emoji base characters. Run entropy analysis on whitespace sequences. Maintain a confusables map for homoglyph substitution. The detection logic is cheap. The hard part is capturing the raw request before a well-meaning parser sanitizes the evidence. For statistical watermarks, character scanning is useless. That needs the scheme’s own detector and key, which a defender will not hold, so statistical marking is a verify-at-source problem rather than a SIEM problem.
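The codepoint rules above can be sketched as a scanner over raw bytes. This is a starting point, not a complete detector. The labels and the `scan_raw` name are hypothetical, the entropy and confusables rules are left out, and every variation selector is flagged because deciding whether the base character is an emoji needs emoji data the standard library does not ship. Expect false positives on legitimate emoji and subdivision flags.

```python
import unicodedata
from typing import List, Optional, Tuple

SUSPECT_RANGES = {
    "zero-width": [(0x200B, 0x200F), (0x2060, 0x2064), (0xFEFF, 0xFEFF)],
    "bidi-control": [(0x202A, 0x202E), (0x2066, 0x2069)],
    "tag-block": [(0xE0000, 0xE007F)],
    "variation-selector": [(0xFE00, 0xFE0F), (0xE0100, 0xE01EF)],
}


def classify(cp: int) -> Optional[str]:
    """Return a label for a suspect codepoint, or None if it looks ordinary."""
    for label, ranges in SUSPECT_RANGES.items():
        if any(lo <= cp <= hi for lo, hi in ranges):
            return label
    # Catch-all for format characters not listed above (e.g. soft hyphen).
    if unicodedata.category(chr(cp)) == "Cf":
        return "other-format"
    return None


def scan_raw(raw: bytes) -> List[Tuple[int, str, str]]:
    """Return (char_index, 'U+XXXX', label) for each suspect codepoint."""
    # surrogateescape keeps undecodable bytes instead of raising or replacing.
    text = raw.decode("utf-8", errors="surrogateescape")
    findings = []
    for index, ch in enumerate(text):
        label = classify(ord(ch))
        if label is not None:
            findings.append((index, f"U+{ord(ch):04X}", label))
    return findings


sample = "ok\u200b\U000E0041x\u202e".encode("utf-8")
for finding in scan_raw(sample):
    print(finding)
# (2, 'U+200B', 'zero-width')
# (3, 'U+E0041', 'tag-block')
# (5, 'U+202E', 'bidi-control')
```

The input is `bytes` on purpose. If the only thing available to scan is a string that has already been through a parser, the scan is running too late.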

The technical reality is narrow and it holds. The vendor-specific accusation is unverified and should be tested directly. Capture raw output at the byte level, enumerate codepoints outside printable ASCII, diff across repeated identical prompts, and look for an invisible token that recurs across runs but changes value per session or per request. That test is deterministic and it settles the question without speculation. Independent of the accusation: invisible-Unicode carriers are real, they survive into systems that trust text, and most logging deletes them before anyone looks. That gap is the exposure. A request mark only serves an observer if it reaches the observer intact, and the same normalization that blinds defenders can also break the mark. That cuts both ways and is worth measuring rather than assuming. Teams that find stable non-printable codepoints in captured agent traffic should preserve the raw bytes and escalate to their security function rather than sanitize and move on. The evidence lives in exactly the characters the pipeline is built to discard.
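The capture-and-diff test can be sketched like this, assuming the raw outputs of repeated identical prompts have already been saved as bytes. The capture names and both function names are hypothetical. The signature also picks up legitimate non-ASCII such as accented letters or emoji, so a hit is a reason to look, not a verdict.

```python
from typing import Dict, Tuple


def non_ascii_signature(raw: bytes) -> Tuple[str, ...]:
    """Ordered codepoints outside printable ASCII, ignoring tab, LF, and CR."""
    text = raw.decode("utf-8", errors="replace")
    ordinary_controls = {"\t", "\n", "\r"}
    return tuple(
        f"U+{ord(ch):04X}"
        for ch in text
        if not (0x20 <= ord(ch) <= 0x7E) and ch not in ordinary_controls
    )


def compare_captures(captures: Dict[str, bytes]) -> Dict[str, object]:
    """Compare signatures across captures of the same prompt."""
    signatures = {name: non_ascii_signature(raw) for name, raw in captures.items()}
    non_empty = {name: sig for name, sig in signatures.items() if sig}
    distinct = set(non_empty.values())
    return {
        "signatures": signatures,
        "any_hidden": bool(non_empty),
        # Same non-empty signature in every capture: a fixed artifact.
        "stable": len(non_empty) == len(signatures) and len(distinct) == 1,
        # Different signatures between captures: a candidate per-request token.
        "varies": len(distinct) > 1,
    }


captures = {  # hypothetical captures of one prompt, two sessions
    "session-a": "done.\u200b\u200c\u200c".encode("utf-8"),
    "session-b": "done.\u200c\u200b\u200c".encode("utf-8"),
}
report = compare_captures(captures)
print(report["any_hidden"], report["stable"], report["varies"])  # True False True
```

An empty signature in every capture is a negative result for character-based marking only. It says nothing about a statistical scheme.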
