CVE-2026-31840 blames the encoder, corrupts the decoder heap

FFmpeg 9.1 shipped a rewritten native AAC path. The reported flaw is a heap corruption reachable through AAC processing, tracked as CVE-2026-31840 and scored CVSS 8.8 (vector CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H). The bug class is CWE-787, out-of-bounds write, with a CWE-416 use-after-free reachable on the same code path. The advisory headline says encoder. The exploitable surface is the decoder. That distinction is the entire analysis.
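The 8.8 can be checked against the vector. A minimal Python sketch of the CVSS v3.1 base-score arithmetic, restricted to unchanged scope (S:U), which is all this vector needs. The weights and the round-up rule are the ones published in the CVSS v3.1 specification; the function names are illustrative.

```python
import math

# CVSS v3.1 metric weights. PR values are the unchanged-scope (S:U) ones.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value):
    """Round up to one decimal place, using the integer method from the v3.1 spec."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score for an S:U vector string such as 'CVSS:3.1/AV:N/...'."""
    parts = dict(item.split(":") for item in vector.split("/")[1:])
    if parts["S"] != "U":
        raise ValueError("this sketch only handles unchanged scope (S:U)")
    w = {metric: WEIGHTS[metric][parts[metric]] for metric in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H"))  # 8.8
```

The one metric holding the score below the 9.8 ceiling is UI:R. In an automated ingest pipeline nobody clicks anything, which is why the score reads as conservative for the server-side case described further down.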

Encoders and decoders sit on opposite sides of the trust boundary. An encoder consumes PCM - raw samples produced locally, already inside the process. Nobody hands an encoder a hostile input, because the input is the machine’s own audio buffer. A decoder consumes a bitstream. The bitstream arrives from a file, an upload, a network stream, a container muxed by someone else. Every count, length, and flag in that stream is attacker-controlled until validated. When a media library is compromised, it is almost always through the parse-and-decode path, because that is the only place untrusted bytes reach the allocator. The “new AAC encoder” framing points at the wrong file. The write happens where ADTS and raw AAC frames get parsed before anything is re-encoded.

The bounds check that isn’t there

An AAC frame is a self-describing structure. It declares the number of channel elements, the profile, the sampling frequency index, and per-element configuration - max_sfb, section data, scalefactor bands, and extension flags for SBR and parametric stereo. The decoder reads that configuration and sizes its working buffers from it. Channel element contexts, spectral coefficient arrays, and per-channel overlap buffers are allocated against the declared geometry.
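The ADTS header is the outermost layer of that self-description and shows the pattern in miniature. The following is a minimal Python sketch, not FFmpeg's parser. The bit layout is the standard seven-byte ADTS header; the function name, the returned dictionary, and the error handling are illustrative. Every value it returns came from the stream, so the frame length and the sampling frequency index are checked before anything downstream is allowed to size a buffer from them.

```python
ADTS_HEADER_LEN = 7            # header without the optional 2-byte CRC
VALID_SF_INDEXES = range(13)   # 0..12 map to sample rates; 13..15 are not usable in ADTS


def parse_adts_header(buf):
    """Parse one ADTS header and reject fields the buffer cannot back up."""
    if len(buf) < ADTS_HEADER_LEN:
        raise ValueError("truncated header")
    if buf[0] != 0xFF or (buf[1] & 0xF0) != 0xF0:
        raise ValueError("bad syncword")

    protection_absent = buf[1] & 0x01
    header_len = ADTS_HEADER_LEN if protection_absent else ADTS_HEADER_LEN + 2

    profile = (buf[2] >> 6) & 0x03            # audio object type minus one
    sf_index = (buf[2] >> 2) & 0x0F           # index into the sample-rate table
    channel_config = ((buf[2] & 0x01) << 2) | ((buf[3] >> 6) & 0x03)
    # 13-bit length of the whole frame, header included
    frame_length = ((buf[3] & 0x03) << 11) | (buf[4] << 3) | ((buf[5] >> 5) & 0x07)

    # Declared values are attacker-controlled until they pass these checks.
    if sf_index not in VALID_SF_INDEXES:
        raise ValueError("sampling frequency index outside the table")
    if frame_length < header_len:
        raise ValueError("frame length shorter than its own header")
    if frame_length > len(buf):
        raise ValueError("frame length exceeds available bytes")

    return {
        "profile": profile,
        "sf_index": sf_index,
        "channel_config": channel_config,
        "frame_length": frame_length,
    }
```

The per-element fields named above (max_sfb, section data, extension flags) sit deeper in the payload and are not parsed here. The same rule applies to them: a declared value is a claim, not a size.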

The defect is a missing validation between a declared count and an existing allocation. The first frame establishes a configuration and the decoder allocates against it. A later frame declares a larger element count or a scalefactor band index beyond the table bound. On a correct decoder, that mismatch is rejected or forces a clean reallocation. On the vulnerable path, the declared value is trusted and the write loop runs past the end of the allocated chunk. That is the out-of-bounds write. It lands in whatever the allocator placed adjacent - chunk metadata, or the next live object.
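The missing step is small enough to model. This is a toy Python sketch of the invariant, not FFmpeg source: the class, its fields, and the 1024-coefficient buffer size are all hypothetical. Python would raise an IndexError where C silently writes past the chunk, so the model demonstrates the check rather than the corruption.

```python
COEFFS_PER_ELEMENT = 1024  # illustrative buffer size, not taken from FFmpeg


class FrameRejected(ValueError):
    """Raised when a frame declares geometry the decoder refuses to trust."""


class ToyDecoder:
    def __init__(self, sfb_table_len):
        self.sfb_table_len = sfb_table_len  # number of scalefactor bands in the table
        self.spectral = []                  # one coefficient buffer per allocated element

    def _allocate(self, element_count):
        self.spectral = [[0.0] * COEFFS_PER_ELEMENT for _ in range(element_count)]

    def accept_frame(self, element_count, max_sfb):
        """Validate declared values against the table and the live allocation."""
        if element_count < 1:
            raise FrameRejected("no channel elements declared")
        if not 0 <= max_sfb <= self.sfb_table_len:
            raise FrameRejected("max_sfb beyond the scalefactor band table")

        # A changed element count forces a clean reallocation. The vulnerable
        # pattern skips this comparison and keeps writing into the old buffers.
        reallocated = element_count != len(self.spectral)
        if reallocated:
            self._allocate(element_count)

        # The write loop is bounded by what is actually allocated.
        for buffer in self.spectral:
            buffer[0] = 1.0
        return "reallocated" if reallocated else "reused"


decoder = ToyDecoder(sfb_table_len=49)
decoder.accept_frame(element_count=1, max_sfb=40)   # first frame sets the geometry
decoder.accept_frame(element_count=2, max_sfb=40)   # larger count: reallocate, do not trust
```

Both checks compare a declared value against something the decoder owns: the table length and the current allocation. Neither compares one declared value against another, because both would be attacker-supplied.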

The use-after-free is the second-order effect. When configuration changes mid-stream, the decoder tears down and reallocates element context. If a pointer into the old spectral or SBR buffer is retained - held in an extension context that the reallocation path does not update - the next access dereferences freed memory. The out-of-bounds write corrupts allocator bookkeeping; the reallocation path leaves a dangling pointer. “Buffer overflow leading to a UAF” reads as loose language, but the mechanism is coherent: one bug primitive feeds the other because both live in the same undersized-allocation-plus-stale-pointer failure.
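The stale-pointer half can be modelled as well. Python cannot dangle a pointer, so this sketch substitutes a generation counter for the freed-memory condition: every reallocation bumps the generation, and a handle issued before the bump refuses to resolve. All names are hypothetical, and generation-checked handles are one general defence against this bug class, not a description of how FFmpeg fixed it.

```python
class StaleHandle(RuntimeError):
    """Raised when a handle outlives the buffer it was issued for."""


class ElementContext:
    """Owns a working buffer that is replaced on every configuration change."""

    def __init__(self):
        self.generation = 0
        self.buffer = []

    def reconfigure(self, size):
        # Tear down and reallocate; anything issued earlier is now stale.
        self.generation += 1
        self.buffer = [0.0] * size

    def handle(self):
        return BufferHandle(self, self.generation)


class BufferHandle:
    """What an extension context (for example SBR state) would hold instead of a raw pointer."""

    def __init__(self, owner, generation):
        self.owner = owner
        self.generation = generation

    def get(self):
        if self.generation != self.owner.generation:
            raise StaleHandle("buffer was reallocated after this handle was issued")
        return self.owner.buffer


ctx = ElementContext()
ctx.reconfigure(1024)
sbr_view = ctx.handle()      # extension context takes its reference
ctx.reconfigure(2048)        # mid-stream configuration change
try:
    sbr_view.get()           # the access that would be a use-after-free in C
except StaleHandle as err:
    print("rejected:", err)
```

In C the equivalent access succeeds and reads or writes whatever the allocator has since placed at that address. The model turns the silent failure into a loud one, which is the property the reallocation path needs.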

From corrupted chunk to controlled free

A linear overwrite past a heap chunk is not yet code execution. It is a starting primitive. The path to control runs through heap grooming, and a media container is an unusually good grooming tool. The attacker controls frame order, frame size, and how many allocations of each class the decoder performs before the corrupting write. On glibc that means shaping tcache and fastbin state. On a build using jemalloc - the default on Android until Scudo replaced it in Android 11, and still common in vendored mobile stacks - it means aligning size classes and runs.

The objective is to place a target object adjacent to the vulnerable allocation, then convert the overflow or the dangling pointer into a controlled free and a subsequent reuse. FFmpeg’s own object graph supplies candidates. Codec and format contexts carry function pointers. Frame and buffer structures carry reference-counted free callbacks - a pointer to the function that releases the buffer, invoked at teardown. Overwrite one of those and the free path calls an attacker-chosen address. That is the pivot from heap corruption to control flow, and it needs no shellcode in the audio data itself. The primitive is a write-what-where derived from the reuse, aimed at a callback the process will invoke on its own.

No working payload, chain, or grooming recipe belongs in a public brief. The mechanism is the point. What the attacker controls is the bitstream. What the bitstream buys is allocation order and one out-of-bounds write. What that write reaches is a function pointer the process already trusts.

Where this actually runs

FFmpeg is not a desktop application problem. It is a library problem, and the library is everywhere. Server-side transcoding pipelines ingest arbitrary uploaded media and run it through libavcodec without a human in the loop. Thumbnail and preview generators decode the first frames of every file a user submits. Chat and mail platforms transcode voice notes. Cloud video services normalise every upload. In all of these, T1190, exploitation of a public-facing application, is the entry - the malicious AAC arrives as a normal upload and the pipeline decodes it automatically. Where a user opens a file locally, the mapping is T1204.002, malicious file, into T1203, exploitation for client execution. Post-corruption, execution continues under T1059, command and scripting interpreter, in whatever interpreter the worker can reach.

The precedent is real. FFmpeg’s HLS and playlist handling was weaponised years ago for server-side request forgery and arbitrary file read against transcoding backends - crafted playlists that made the decoder fetch attacker-chosen resources. Those were not memory-corruption bugs, but they proved the model: automated media processing is an unauthenticated code path that touches hostile input at scale, and defenders rarely watch it. A heap primitive on that same surface is the more severe version of the same exposure.

No confirmed in-the-wild exploitation is attributed to this CVE at time of writing, and no named actor is linked. Treat that as a statement of current evidence, not an all-clear. The surface is attractive because it is unauthenticated, automated, and ubiquitous.

Telemetry: the transcoder is the blind spot

A failed exploitation attempt crashes the decoder. In development that surfaces as an ASan out-of-bounds report or a UAF trace. In production it is a SIGSEGV and a restarted worker - noise, if anyone even collects worker crash counts. A successful exploitation does not crash. That is the problem. The stable path produces no error the pipeline logs.

The signal is process lineage. A transcoding worker exists to decode and re-encode. It should never spawn a shell, a downloader, or an interpreter. On Windows, Sysmon Event ID 1 showing ffmpeg or the worker process as parent of cmd.exe, powershell.exe, or a scripting host is the detection. Sysmon Event ID 3 is outbound network from a transcoder that should only ever pull from and push to known storage endpoints. Event ID 8 and Event ID 10 flag injection and cross-process access if the payload pivots into another process. On Linux, where most of these pipelines live, the equivalents are auditd execve records or an eBPF sensor capturing a child process under the ffmpeg cgroup, and egress from a container whose network policy should permit none.

The gap is baselining. Media workers run as batch jobs, often in short-lived containers, often without an EDR agent inside the container at all. Nobody has enumerated the legitimate child processes of ffmpeg, because under normal operation there are none, so there is no baseline to alarm against. The detection that works is the cheapest one to state and the rarest to deploy - alert on any child process of the media decoder, and alert on any egress from the decode tier. Both are near-zero-false-positive in a correctly scoped pipeline.
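Both rules fit in a few lines once events are normalised. A minimal Python sketch, assuming a hypothetical event schema (`kind`, `parent_image`, `image`, `dest_ip`) that a Sysmon, auditd, or eBPF collector would have to be mapped into. The decoder image names and the storage subnet are placeholders, not recommendations.

```python
from ipaddress import ip_address, ip_network

# Assumptions for illustration: adjust to the binaries and subnets in the real pipeline.
DECODER_IMAGES = {"ffmpeg", "ffprobe"}
ALLOWED_EGRESS = [ip_network("10.20.0.0/24")]   # hypothetical storage tier


def image_name(path):
    """Reduce a Windows or POSIX path to a lowercase binary name without '.exe'."""
    name = path.replace("\\", "/").rsplit("/", 1)[-1].lower()
    return name[:-4] if name.endswith(".exe") else name


def evaluate(event):
    """Return an alert string for a normalised event, or None if it is benign."""
    kind = event["kind"]

    # Rule 1: the decoder has no legitimate children, so any child is an alert.
    if kind == "process_create" and image_name(event["parent_image"]) in DECODER_IMAGES:
        return "child process of media decoder: " + event["image"]

    # Rule 2: the decoder may only talk to the storage tier.
    if kind == "network_connect" and image_name(event["image"]) in DECODER_IMAGES:
        dest = ip_address(event["dest_ip"])
        if not any(dest in net for net in ALLOWED_EGRESS):
            return "unexpected egress from media decoder: " + event["dest_ip"]

    return None


events = [
    {"kind": "process_create", "parent_image": "/usr/bin/ffmpeg", "image": "/bin/sh"},
    {"kind": "network_connect", "image": "/usr/bin/ffmpeg", "dest_ip": "10.20.0.15"},
    {"kind": "network_connect", "image": "/usr/bin/ffmpeg", "dest_ip": "203.0.113.9"},
]
for event in events:
    alert = evaluate(event)
    if alert:
        print(alert)
```

There is no allowlist of child processes because the rule has no exceptions to encode. If the worker is a wrapper that itself spawns ffmpeg, the wrapper is the parent to exempt and ffmpeg remains the one that must never have children.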

After the patch, the library is still everywhere

The fix validates the declared element count and scalefactor band index against the live allocation before the write, and corrects the reallocation path so no stale extension pointer survives a configuration change. Upgrading the FFmpeg package closes the direct exposure. It does not close the real one.

FFmpeg is statically linked and vendored into an enormous population of software - browsers, editors, streaming clients, mobile apps, media servers, and appliance firmware. Each of those carries its own copy, on its own release cadence, patched only when the downstream vendor rebuilds. The patch boundary is the direct package version. The exposure boundary is every binary that bundled libavcodec and has not shipped a rebuild. That lag is measured in months for maintained products and never closes for abandoned ones. The unit of risk is not the FFmpeg version installed by the package manager. It is the software bill of materials - which of the deployed binaries embed a vulnerable libavcodec, and which vendor controls each rebuild.
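The accounting can start from whatever SBOMs already exist. A minimal Python sketch over a CycloneDX-style JSON document, using only its `components`, `name`, and `version` fields and walking nested components. The watched names and the fixed-version threshold are assumptions: the threshold is a caller-supplied parameter, and the value in the example is a placeholder, not the real patched release.

```python
import re

WATCHED_NAMES = {"ffmpeg", "libavcodec"}  # assumption: how vendors label the component


def parse_version(text):
    """Turn '9.1', 'n9.1.1' or 'v9.1' into a comparable tuple, or None if unparseable."""
    match = re.fullmatch(r"[nv]?(\d+(?:\.\d+)*)", text.strip())
    return tuple(int(part) for part in match.group(1).split(".")) if match else None


def walk(components):
    """Yield every component, including ones nested inside other components."""
    for component in components or []:
        yield component
        yield from walk(component.get("components"))


def triage(sbom, fixed_version):
    """List (name, version, status) for each embedded FFmpeg component in one SBOM."""
    fixed = parse_version(fixed_version)
    if fixed is None:
        raise ValueError("fixed_version must be a dotted numeric version")
    findings = []
    for component in walk(sbom.get("components")):
        if component.get("name", "").lower() not in WATCHED_NAMES:
            continue
        version_text = component.get("version", "")
        version = parse_version(version_text)
        if version is None:
            status = "unknown"       # git snapshots and vendor forks need manual review
        elif version < fixed:
            status = "vulnerable"
        else:
            status = "fixed"
        findings.append((component["name"], version_text, status))
    return findings


sbom = {
    "components": [
        {"name": "media-server", "version": "4.2", "components": [
            {"name": "libavcodec", "version": "9.1"},
        ]},
        {"name": "ffmpeg", "version": "git-2026-01-10"},
    ]
}
print(triage(sbom, fixed_version="9.1.1"))  # "9.1.1" is a placeholder threshold
```

The "unknown" bucket is usually the largest. Vendored copies are often pinned to a git snapshot or carry a vendor's own version string, and those cannot be cleared by comparison. They need the vendor's statement or a rebuild date.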

For entities with obligations under Australia’s Security of Critical Infrastructure (SOCI) Act running media ingest in critical communications or broadcast infrastructure, this is a component-inventory question before it is a patch question, and confirmed exploitation against production should be escalated to the responsible security team rather than triaged in isolation. The residual exposure post-patch is a supply-chain accounting problem. The bug was in one file. The vulnerable code is in thousands of shipped products. Fixing the source does not fix the copies, and the copies are what process the untrusted stream.
