RC RANDOM CHAOS

The machine quoted an EFF staffer who never existed

A news system generated quotes from nonexistent EFF staff because it resolves references without confirming that what they point to is real.

A news site published articles containing the names of Electronic Frontier Foundation staff members who do not exist. Each name appeared inside a quotation. Each quotation was formatted as direct attribution, presented as something a specific person had said about a specific event. The people were not real. The statements were never given. The articles were generated and published as routine output, not as a flagged anomaly.

The structure of each fabrication was consistent: a name, an affiliation with a known organization, a title, and a quote that fit the surrounding narrative. The EFF checked its own roster and confirmed the names belonged to no one on staff, past or present. The system repeated the behavior across multiple stories. It did not produce these attributions once and stop. It produced them the way it produced everything else.

The system did not distinguish the invented names from real attributions elsewhere in its output. It generated the false speakers with the same formatting, the same placement, and the same confidence as any verifiable fact. Nothing in the output marked the difference. The fabrication was not a malfunction sitting beside normal operation. It was normal operation.

The assumption was that a reference and a referent are the same thing. A system that generates news operates on the relationship between tokens, not the relationship between claims and the world. When it produces a name attached to an organization, it is resolving a familiar pattern: organizations have staff, staff have titles, statements have speakers. The assumption built into that resolution is that producing the form of an attribution is equivalent to possessing one.

Trust here was treated as a property of structure. A quote that is well formed, attributed to a plausible name, and situated inside a coherent article satisfies every internal condition the system uses to decide that output is acceptable. The premise was that consistency is correctness. That if a reference resolves cleanly and reads without friction, the thing it points to exists. The system was never built to ask the second question, because the form already answered the first.

This assumption was also transferable, and that is the part that matters. The format of a real attribution and the format of an invented one are identical. There is no structural difference between quoting a person who spoke and generating a person who did not. The system inherited the credibility of the form without inheriting the obligation behind it. Once the shape of sourced journalism was learned, every output wearing that shape was treated as valid by the same logic, because the logic only ever measured the shape.

What changed was not the system’s capability. The model did exactly what it was built to do, at the quality it was built to produce. What changed was the validity of the assumption that a resolved reference corresponds to something real. That assumption held in the training data, where attributions mostly pointed at statements that had actually been made. It does not hold at generation time, where the system produces references with no mechanism to determine whether a referent exists. The assumption no longer holds, but the system was never told.

The system never re-evaluated trust. It inherited it. The credibility of attribution as a form was established across decades of journalism in which a quoted name meant a real person had spoken, because something outside the text enforced that connection. The model absorbed that credibility as a statistical property of well-formed writing. When it generates a new attribution, it draws on trust accumulated in a prior state, a state in which reference and reality were held together by an external act of verification. The form transferred. The verification did not.

This is where the artifact became the objective. The system was optimized to produce attributions that resolve consistently, that fit the surrounding narrative, that read as sourced. Producing the attribution is the outcome the system can measure. Whether the attribution is true is a property the system cannot observe. Optimization pressure flows toward what can be measured, and over time the well-formed quote attached to a named person became the thing the system produces. The validation that the quote once stood for was never part of the loop, so the loop closed without it.

Externally, what the output shows is a name, a title, an affiliation, and a quotation, arranged so that each element points cleanly to the next. The attribution resolves. The name resolves to an organization, the organization resolves to a kind of authority, the authority resolves to a statement that fits the article around it. Every pointer in the chain has a target the system can produce. Nothing in that chain requires the person to exist. The system generated a complete reference and treated the completeness of the reference as the presence of the referent. Resolution succeeded, so the output was acceptable. That the target of the resolution was never real is not a condition the output exposes, because the output only ever carried the reference, never the thing it pointed to.

What is visible in the result is identity of source standing in for integrity of content. A quotation attributed to a named figure at the Electronic Frontier Foundation arrives carrying the weight of that institution. The credibility belongs to the name and the affiliation, not to the words. The system attached institutional authority to content the institution never produced, and the authority did the work that verification would otherwise have done. The statement reads as true because of who is said to have spoken it. The system generated the speaker precisely because a named source is what makes the statement resolve as sourced. The source was not consulted. The source was constructed, because the form of the source was the part the system could measure as complete.

None of this is a bypass. No check was evaded and no boundary was crossed, because the operation that would have caught it was never an operation the system performed. The model resolved a familiar pattern into well-formed text, which is the entire function. The fabricated staffer is not the system departing from its design. It is the system executing it. There was no verification step waiting to be skipped. Reference resolution is the whole behavior, and reference resolution ran to completion exactly as built. What looks like failure from outside is, from the system’s position, a clean success. The output is well formed. The output is also false, and the system has no surface on which those two facts conflict.

The pattern is execution based on reference, not verification. A system acts on the resolution of a pointer rather than on confirmation of what the pointer addresses. Once a reference resolves into a form the system accepts, the system proceeds as though the referenced thing is present and valid. The pointer is checked for form. The target is never checked for existence. The reference was credible in a prior state because something outside the system held it to a real referent. When that external act is absent, the reference still resolves with the same cleanliness, and the system treats clean resolution as sufficient grounds to act.
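The gap between the two checks can be shown in a minimal sketch. Everything in it is hypothetical: the Attribution type, the ROSTER table, the organization, and the names are illustrative assumptions, not a description of the news system or of any real staff list. It shows only that checking the form of a pointer and checking the existence of its target are separate operations, and that an invented speaker passes the first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribution:
    """A quote plus the reference that claims to source it (hypothetical type)."""
    name: str
    title: str
    organization: str
    quote: str


# Hypothetical roster standing in for the external record that would
# have to be consulted. The names are placeholders, not real people.
ROSTER = {"Example Org": {"Alex Real"}}


def is_well_formed(a: Attribution) -> bool:
    """Form check: every field of the reference is present and non-empty."""
    return all(s.strip() for s in (a.name, a.title, a.organization, a.quote))


def referent_exists(a: Attribution, roster: dict) -> bool:
    """Existence check: the named person appears on the organization's roster."""
    return a.name in roster.get(a.organization, set())


real = Attribution("Alex Real", "Staff Attorney", "Example Org", "We object.")
invented = Attribution("Sam Invented", "Policy Director", "Example Org", "We object.")

for a in (real, invented):
    print(a.name, is_well_formed(a), referent_exists(a, ROSTER))
```

Both attributions pass is_well_formed. Only the real one passes referent_exists, and only because the roster sits outside the text being checked. A pipeline that runs the first check alone cannot tell the two apart.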

The same mechanism operates in dependency resolution. A build system is given a package name and a version number. The reference resolves to an artifact in a registry. The system retrieves the artifact and executes it because the reference matched, not because the contents were confirmed to be what that name once meant. The name is a pointer to a maintainer’s intent at some past moment, and the build trusts the pointer. When the artifact behind the name is replaced, the reference resolves as cleanly as it ever did, and the system executes whatever now sits at that address with the same confidence it executed the original. The resolution of the reference is the operation. The integrity of the referent was never inside the loop. A version string that once stood for verified code is reused in a state where nothing enforces that connection, and the system treats the string as equivalent to the code.
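The build case can be sketched the same way. The in-memory registry, the package name example-lib, and the pinned table below are assumptions made for illustration, and they do not model any particular package manager. The sketch contrasts a resolver that trusts the name and version with one that also compares the artifact's SHA-256 digest against a value recorded when the contents were known.

```python
import hashlib

# Hypothetical registry: (name, version) -> artifact bytes.
registry = {("example-lib", "1.2.0"): b"def pad(s, n):\n    return s.rjust(n)\n"}


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of the artifact contents."""
    return hashlib.sha256(data).hexdigest()


# Digest recorded at the moment the artifact was known to be the intended one.
pinned = {key: sha256_hex(blob) for key, blob in registry.items()}


def resolve(name: str, version: str) -> bytes:
    """Reference resolution only: return whatever now sits at that address."""
    return registry[(name, version)]


def resolve_verified(name: str, version: str) -> bytes:
    """Resolution plus a check on the referent: contents must match the pin."""
    artifact = resolve(name, version)
    if sha256_hex(artifact) != pinned[(name, version)]:
        raise ValueError(f"{name}=={version}: contents do not match pinned digest")
    return artifact


# Replace the artifact behind an unchanged name and version.
registry[("example-lib", "1.2.0")] = b"print('something else entirely')\n"

swapped = resolve("example-lib", "1.2.0")  # resolves as cleanly as before
try:
    resolve_verified("example-lib", "1.2.0")
except ValueError as err:
    print("rejected:", err)
```

After the swap, resolve returns the replaced contents without complaint, while resolve_verified refuses them. The digest is doing the job that the version string alone cannot: it binds the reference to a specific referent rather than to an address.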

The two cases differ only in domain. The news generator resolves a name into a quotation. The build system resolves a version into code. Neither verifies, and both execute. In each, a reference that once stood for a confirmed reality is reused in a state where no external act maintains that link, and the reference is treated as the reality itself. The system is not malfunctioning in either case. It is performing the one operation it has, which is resolution, on the assumption that resolution and verification are the same step. They were the same step only as long as something outside the system kept them joined.

The system resolves the reference once. It does not return to ask whether the referent exists. The attribution is produced. The verification it was built to imply is not.
