An AI engineer skipped the prediction test

Linear A has gone undeciphered for about 120 years, since Arthur Evans began pulling tablets out of Knossos around 1900 and named the script. The surviving corpus is small: roughly 1,400 documents - mostly clay tablets, bars, and sealings - totaling somewhere near 7,000 to 7,500 signs of text. That is less material than a short magazine article. Every few years someone announces they have solved it. The newest version of that announcement has an AI engineer attached to it. The pattern is worth understanding because the claim is almost never wrong in an interesting way. It is wrong in a predictable way.

“Cracked” is not a technical term

The first problem is the verb. In epigraphy there is no single event called “cracking” a script. There is a chain of claims, each of which can be checked: assigning sound values to signs, reading those signs into words, assigning meanings to the words, and then predicting what untranslated tablets will say before you read them. A real decipherment survives that last step. When Linear B was read, the test was brutal and public - a new tablet came out of the ground at Pylos, and the proposed values produced sensible Greek words for the vessels drawn beside them, including a word for tripod next to a picture of a tripod. Nobody had fed that tablet into the system.

So when a headline says an AI “cracked” Linear A, ask which link in that chain is actually being claimed. Usually it is the weakest one: the engineer has produced sound values, or a clustering of similar signs, or a handful of words that resemble some known language. That is the beginning of the work, not the end of it. Treating the first link as the whole chain is where the hype lives.

The data problem nobody puts in the headline

Machine learning runs on volume. The models that read modern text train on billions of tokens. The decipherment models that have actually worked trained on corpora large enough to learn statistical structure - letter frequencies, word boundaries, inflectional patterns.

Linear A gives you about 7,000 signs. Many tablets are broken. Many signs appear only a handful of times, some only once. You cannot learn the grammar of a language from a corpus that small, no matter what is running underneath, because the patterns you would need to detect do not repeat often enough to separate signal from coincidence. This is not a tuning issue or a bigger-GPU issue. It is an information ceiling. Any system that claims to have learned the language from this corpus has either found a pattern that is too thin to test, or has quietly imported assumptions from somewhere else and is reading those back to you.
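The ceiling is easier to see with a toy simulation. The sketch below does not use Linear A data. It draws words from an invented language with a Zipf-like frequency curve and asks how many of the observed word types turn up exactly once. The vocabulary size, the frequency curve, and both corpus sizes are assumptions picked for illustration.

```python
import random
from collections import Counter

def hapax_share(n_tokens, n_types=5000, seed=0):
    """Share of observed word types that occur exactly once in a sample.

    Draws n_tokens from a Zipf-like distribution (weight 1/rank) over
    n_types hypothetical word types. Both numbers are toy assumptions.
    """
    rng = random.Random(seed)
    weights = [1.0 / rank for rank in range(1, n_types + 1)]
    sample = rng.choices(range(n_types), weights=weights, k=n_tokens)
    counts = Counter(sample)
    once = sum(1 for c in counts.values() if c == 1)
    return once / len(counts)

small = hapax_share(2_000)      # a corpus of a few thousand tokens
large = hapax_share(200_000)    # same toy language, one hundred times more text
print(f"seen once, small corpus: {small:.0%}")
print(f"seen once, large corpus: {large:.0%}")
```

In this toy setup, well over half of the word types in the small sample are attested only once, and a pattern seen once cannot be told apart from an accident. The larger sample of the same invented language has very few such one-offs. No choice of model changes which of those two situations you are in.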

Why Linear B fell and Linear A has not

Linear B was the harder-looking problem that turned out to be solvable, and the reason is specific. Alice Kober spent the 1940s working out, without guessing the language, that the script was inflected and which signs shared vowels and consonants - she built a grid of relationships. Michael Ventris then made the move that broke it: he guessed that certain repeated sign-groups were Cretan place names, and the sound values that followed from that guess spelled out an early form of Greek. John Chadwick then joined him to work through the language in detail. Once you know the language, you have a dictionary, a grammar, and millions of words of comparison material. The script becomes a code over a language you already speak.

Linear A has no such anchor. We do not know what language it records. It is not Greek. It has no confirmed living or documented relative. There is no bilingual text - no Rosetta Stone - pairing it with something readable. So even a perfect reading of the sounds leaves you holding pronounceable words with no way to confirm what they mean. That missing anchor is the entire difficulty, and no model removes it. A model can only work with the relationships that are actually present in the data.

What the AI can do, and what it cannot

Here is the part that gets blurred. Linear B borrowed most of its syllabic signs from Linear A, so scholars can take the known Linear B sound values and apply them to the matching Linear A signs. This is standard and has been done for decades by hand. It lets you transliterate Linear A - turn the signs into syllables you can say out loud. “A-sa-sa-ra” is a real example of a recurring Linear A sequence read this way.
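Mechanically, this step is a table lookup, which is why it is the easy half. Here is a minimal sketch, assuming signs are identified by their conventional AB numbers. The three entries are the Linear B values usually given for those signs. The table is a tiny illustrative subset, and the function name and the placeholder format for unknown signs are made up for this example.

```python
# Illustrative subset only: three shared signs with their conventional
# Linear B sound values. A real table would cover the whole shared syllabary.
LINEAR_B_VALUES = {
    "AB08": "a",
    "AB31": "sa",
    "AB60": "ra",
}

def transliterate(sign_ids):
    """Turn a sequence of sign identifiers into hyphenated syllables.

    Signs with no known value are kept as '*ID' placeholders rather than
    guessed, so gaps in the table stay visible in the output.
    """
    return "-".join(LINEAR_B_VALUES.get(s, "*" + s) for s in sign_ids)

word = transliterate(["AB08", "AB31", "AB31", "AB60"])
print(word)  # a-sa-sa-ra
```

Nothing in that lookup knows or cares what the word means. It would produce equally confident output for a shopping list, a prayer, or a name.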

Transliteration is not translation. You can pronounce a word in a language you do not speak and still have no idea what it means. An AI that outputs syllabic readings of Linear A is doing the easy, already-solved half. The moment it assigns meanings, you should ask where those meanings came from. If the answer is “the model matched the syllables to words in Language X,” then the real claim is that Minoan is related to Language X - and that is a hypothesis people have floated for Luwian, Semitic languages, Greek, and others for a century without consensus. The model has not proven the relationship. It has assumed one and produced output consistent with the assumption. That is a bias baked into the method, not a discovery.
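One way to put a number on that circularity is a chance baseline: before counting matches against Language X as evidence, count how often random syllable strings of the same shape match it too. The sketch below uses an invented five-syllable inventory, an invented lexicon, and invented readings. None of it is real Linear A or a real language. It only shows the shape of the check.

```python
import random

# All three of these are made up for illustration.
SYLLABLES = ["ka", "ti", "ru", "ma", "no"]
LEXICON_X = {
    "ka-ti", "ka-ru", "ka-ma", "ti-ka", "ti-ru",
    "ti-no", "ru-ka", "ru-ma", "ru-no", "ma-ka",
    "ma-ti", "ma-no", "no-ka", "no-ti", "no-ru",
}
READINGS = ["ka-ti", "ru-ma", "no-ka", "ti-ti", "ma-no", "ka-no"]

def hits(words, lexicon):
    """Number of words that appear in the lexicon."""
    return sum(1 for w in words if w in lexicon)

def chance_rate(readings, lexicon, syllables, trials=2000, seed=0):
    """Observed hit count, plus the share of random trials that reach it.

    Each trial replaces every reading with random syllables of the same
    length and counts lexicon hits again.
    """
    rng = random.Random(seed)
    observed = hits(readings, lexicon)
    lengths = [len(w.split("-")) for w in readings]
    reached = 0
    for _trial in range(trials):
        fake = ["-".join(rng.choice(syllables) for _i in range(n)) for n in lengths]
        if hits(fake, lexicon) >= observed:
            reached += 1
    return observed, reached / trials

observed, rate = chance_rate(READINGS, LEXICON_X, SYLLABLES)
print(f"{observed} of {len(READINGS)} readings match; "
      f"random strings do as well in {rate:.0%} of trials")
```

In this toy case, four of six readings "match" Language X, which sounds impressive until random strings manage the same roughly half the time. Short words, a small syllable inventory, and a large dictionary make matches cheap. A claim that does not report this baseline has not shown that its matches mean anything.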

The questions to ask before you believe it

Treat the claim the way you would treat any system output you cannot yet trust. Ask:

What is the falsifiable prediction? A genuine decipherment says “this unread tablet will, when read, say roughly this.” If there is no prediction that could turn out false, there is nothing to verify.
What underlying language is being assumed, and was that assumption an input or a result? If the language family was chosen up front, the matches are circular.
How were the training and test sets split, and how large was each? With about 7,000 signs, a model that looks good probably saw most of its test material during fitting.
Can an independent scholar reproduce the readings from the published method and data alone? Not the press release - the method and the data.
Has it been submitted to a journal in the field, and what did reviewers who read Linear A for a living say?
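The split question can be checked mechanically. Below is a minimal sketch of one reasonable discipline: hold out whole tablets rather than individual signs, then measure how much of the held-out material the model had already seen. The tablet IDs and sign-groups are placeholders, and the function names are invented for this example.

```python
import random

def split_by_tablet(tablets, test_share=0.2, seed=0):
    """Hold out whole tablets so no test text is seen during fitting.

    tablets: dict mapping a tablet id to its list of sign-groups.
    """
    ids = sorted(tablets)
    random.Random(seed).shuffle(ids)
    n_test = max(1, round(len(ids) * test_share))
    test_ids = set(ids[:n_test])
    train = {t: tablets[t] for t in ids if t not in test_ids}
    test = {t: tablets[t] for t in ids if t in test_ids}
    return train, test

def seen_share(train, test):
    """Share of held-out sign-groups that also occur in the training tablets."""
    train_vocab = {g for groups in train.values() for g in groups}
    test_groups = [g for groups in test.values() for g in groups]
    if not test_groups:
        return 0.0
    return sum(1 for g in test_groups if g in train_vocab) / len(test_groups)

# Placeholder corpus: hypothetical tablet ids and sign-groups.
TABLETS = {
    "T1": ["ka-ti", "ru-ma"], "T2": ["ka-ti", "no-ka"],
    "T3": ["ma-no", "ti-ru"], "T4": ["ru-ma", "ka-ti"],
    "T5": ["no-ti", "ka-ma"],
}
train, test = split_by_tablet(TABLETS)
print(sorted(test), f"{seen_share(train, test):.0%} of held-out groups seen in training")
```

Holding out whole tablets is the closest laboratory stand-in for the real test, which is a tablet that comes out of the ground after the method is published. A high seen share is not a flaw in itself, but a score earned mostly on material the model has already seen is a memory test, not a prediction.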

If the engineer can answer those in plain terms and the answers hold, you may have something. If the answers are vague, or the work lives only in a preprint and a thread, you have a candidate, not a result. Both happen. Only one deserves the headline.

How the hype machine actually runs

The mechanism is consistent and it is not usually fraud. An engineer with real skills points modern tooling at a famous unsolved problem. They get output that looks structured, because these tools always produce something that looks structured. They write it up, reasonably, as “a promising approach.” Then it leaves their hands. A university communications office needs a story. A reporter needs a clean verb. “Promising statistical approach to sign clustering” becomes “AI cracks ancient language.” Each step is a small exaggeration, and the sum is a claim the original author never quite made.

The field has seen this with AI specifically. In 2019 a team at MIT’s CSAIL published a neural method that automatically deciphered Linear B and Ugaritic, an ancient Semitic language. It was solid work - and it only succeeded because both scripts could be mapped onto a known related language, Greek and Hebrew respectively. The same paper noted the method had no purchase on Linear A, for exactly the reason above: nothing to map it to. The competent version of this research is honest about the wall. The hyped version is the one that forgets to mention it.

Peer review is the verification layer, and it is slow on purpose

In security you do not trust a control because someone says it works. You trust it after someone hostile tried to break it and failed. Peer review in a small philological field is that adversarial layer. The people who can actually check a Linear A claim number maybe a few dozen specialists worldwide. They will look at whether the proposed values are consistent across the whole corpus, whether the meanings are forced, whether the method was applied honestly. This takes months, sometimes years, and it should. The slowness is the feature.

A preprint, a conference talk, or a viral post is not that layer. It is the submission to it. When you see “cracked” before any of those specialists have weighed in, you are watching a claim skip the only step that could confirm it. That does not make the engineer dishonest or the method worthless. The work might genuinely move the field a few inches. But “moved the field a few inches” and “solved a 120-year-old problem” are different sentences, and the gap between them is where you should keep your attention.

The quiet cost of the overclaim is the one that matters most. Every cycle of “AI cracked it” followed by silence makes the next real advance harder to hear. If a careful team does produce a partial reading that holds up, it arrives into an audience that has already learned to scroll past the headline. Skepticism is not the enemy of the breakthrough here. It is the thing that will let you recognize the breakthrough when it finally shows up with its evidence attached.
