RC RANDOM CHAOS

htop is a reconnaissance surface

How htop and top expose Linux resource contention - OOM-killer steering, D-state telemetry gaps, niced miners, and PID exhaustion mapped to MITRE T1562 and T1499.

top and htop are not performance tools with a security footnote. They render, in real time, the kernel’s scheduler and memory-manager state for every process on the box. Priority, residency, run state, CPU time. That render is a contention map. Contention is predictable. Predictable is exploitable. This isn’t advanced. It’s overlooked. That’s why it works.

Everything both tools display comes from /proc. Not inference - direct reads of /proc/[pid]/stat, /proc/[pid]/status, /proc/meminfo, /proc/loadavg. The display is the kernel’s own bookkeeping, formatted. For an operator, that makes htop a reconnaissance surface and a contention gauge in the same pane. Reconnaissance because it enumerates every running process, its user, and its command line. Contention because it shows exactly which resource is starved and which process is starving.
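As a rough sketch of what those direct reads involve, the Python below parses one /proc/[pid]/stat line into the fields htop's columns are built from. The names StatFields and parse_stat are illustrative and not part of htop or any library. Field positions follow the proc(5) man page.

```python
from typing import NamedTuple


class StatFields(NamedTuple):
    pid: int
    comm: str
    state: str
    ppid: int
    utime_ticks: int
    stime_ticks: int
    priority: int
    nice: int


def parse_stat(line: str) -> StatFields:
    """Parse the text of /proc/[pid]/stat (hypothetical helper)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate the LAST ')' instead of splitting the whole line on whitespace.
    lpar = line.index("(")
    rpar = line.rindex(")")
    rest = line[rpar + 1:].split()
    # rest[0] is field 3 (state), so field N of proc(5) sits at rest[N - 3].
    return StatFields(
        pid=int(line[:lpar]),
        comm=line[lpar + 1:rpar],
        state=rest[0],              # field 3
        ppid=int(rest[1]),          # field 4
        utime_ticks=int(rest[11]),  # field 14
        stime_ticks=int(rest[12]),  # field 15
        priority=int(rest[15]),     # field 18
        nice=int(rest[16]),         # field 19
    )
```

On a Linux host, `parse_stat(open("/proc/self/stat").read())` returns the calling interpreter's own row. Nothing here needs privileges beyond read access to /proc.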

Start with load average. The three numbers on the top line are exponentially damped moving averages, over 1, 5, and 15 minutes, of the number of tasks that are either runnable or in uninterruptible sleep. On Linux, that count covers state R and state D, not the run queue alone. The D-state inclusion is the detail most people never internalise. A load average of 40 on an 8-core host with CPU columns sitting near idle is not compute saturation. It is I/O blocking: processes stuck in uninterruptible sleep, waiting on a disk or a stalled network mount. The number is high and the CPU is bored. That divergence is the signal.
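The divergence can be checked mechanically. The sketch below parses the three averages from /proc/loadavg and flags the high-load, idle-CPU case. The function names and both thresholds (load_factor, busy_ceiling) are assumptions chosen for illustration. The CPU busy fraction is assumed to be measured separately, for example from /proc/stat deltas.

```python
def parse_loadavg(text: str) -> tuple[float, float, float]:
    """Return the 1-, 5-, and 15-minute averages from /proc/loadavg text."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def looks_io_bound(load1: float, ncpu: int, cpu_busy: float,
                   load_factor: float = 2.0, busy_ceiling: float = 0.5) -> bool:
    """Heuristic: load far above core count while the CPUs are mostly idle.

    cpu_busy is a fraction between 0.0 and 1.0. The default thresholds are
    illustrative, not tuned values.
    """
    return load1 > load_factor * ncpu and cpu_busy < busy_ceiling
```

With the article's numbers, `looks_io_bound(40.0, 8, 0.05)` is true, while the same load with busy CPUs is ordinary compute saturation and is not flagged.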

The state column carries the rest. R is running or runnable. S is interruptible sleep. D is uninterruptible sleep. Z is a zombie. T is stopped or traced. Each value is an attacker-relevant fact, not a curiosity.

D is the one that matters most. A process in uninterruptible sleep cannot be killed while it stays there. SIGKILL is left pending and takes effect only once the kernel completes the I/O the process is blocked on; the exception is a wait the kernel has explicitly marked killable, which is also reported as D. Security agents that log synchronously behave badly here. auditd flushing to a saturated disk, an EDR sensor shipping events to a congested socket, a SIEM forwarder writing to a full spool - under I/O pressure they enter D. While they sit in D, they are not recording. An attacker who can generate disk or network I/O pressure can push a synchronous logging agent into uninterruptible sleep and open a window where host telemetry simply stops emitting. The agent is alive in the process table. It is producing nothing. That maps to MITRE T1562.001 (Impair Defenses: Disable or Modify Tools), achieved without touching the agent binary or its config.
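A defender can sample for this condition directly. The following is a minimal sketch, assuming the agents are identifiable by the Name field of /proc/[pid]/status. The function name agents_in_dstate and the watch list are hypothetical, and the proc_root parameter exists only so the logic can be exercised against a test directory.

```python
from pathlib import Path


def agents_in_dstate(watch: set[str], proc_root: str = "/proc") -> list[tuple[int, str]]:
    """Return (pid, name) for watched processes currently in state D."""
    hits = []
    for status in Path(proc_root).glob("[0-9]*/status"):
        try:
            text = status.read_text()
        except OSError:
            continue  # the process exited between the listing and the read
        fields = dict(
            line.split(":", 1) for line in text.splitlines() if ":" in line
        )
        name = fields.get("Name", "").strip()
        state = fields.get("State", "").strip()  # e.g. "D (disk sleep)"
        if name in watch and state.startswith("D"):
            hits.append((int(status.parent.name), name))
    return sorted(hits)
```

A single sample proves little, since healthy processes pass through D briefly. The signal is a watched agent that stays in D across consecutive samples, and the sampler itself should not depend on the disk or socket under pressure.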

Memory columns are the next lever. VIRT is the total virtual address space mapped. RES is resident set size - physical pages actually held. SHR is the shared portion. MEM% is RES over total RAM. RES is the field that decides who dies under pressure. When the kernel runs out of reclaimable memory, the out-of-memory killer selects a victim. Selection is driven by a badness heuristic - oom_score, tunable per process through oom_score_adj - that weights heavily toward large resident set size. An attacker who can inflate memory pressure controls which processes become eligible. Drive RSS up in a controlled way and the OOM killer’s victim selection becomes steerable. Steer it onto auditd. Onto the EDR daemon. Onto the monitoring collector. The kernel does the killing. The attacker only shaped the pressure. This is defense evasion executed by the operating system on the attacker’s behalf.
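The kernel publishes the current ranking, so it can be inspected before pressure arrives. The sketch below reads oom_score, oom_score_adj, and comm for every PID and sorts by score. The function name oom_ranking is illustrative, and proc_root is a parameter only so the logic can be pointed at a test directory.

```python
from pathlib import Path


def oom_ranking(proc_root: str = "/proc", top: int = 10) -> list[tuple[int, int, int, str]]:
    """Return (oom_score, oom_score_adj, pid, comm), highest score first."""
    rows = []
    for pid_dir in Path(proc_root).glob("[0-9]*"):
        try:
            score = int((pid_dir / "oom_score").read_text())
            adj = int((pid_dir / "oom_score_adj").read_text())
            comm = (pid_dir / "comm").read_text().strip()
        except (OSError, ValueError):
            continue  # process vanished mid-read, or the entry is not a process
        rows.append((score, adj, int(pid_dir.name), comm))
    rows.sort(reverse=True)
    return rows[:top]
```

If a security daemon appears near the top of this list on an otherwise idle host, it is already the likely victim once pressure builds.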

The kernel logs the kill. The kernel ring buffer, read with dmesg, records a line such as “Out of memory: Killed process 1234 (auditd)” followed by the victim’s memory figures and, depending on kernel version, its score or oom_score_adj. The catch is obvious once stated. If the process the OOM killer terminated was the thing shipping logs off the host, the record of its own death never leaves the box. The evidence and the eviction share a fate.
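Matching that record is straightforward where the log is still readable. The sketch below scans kernel log lines for OOM kills of watched process names. The pattern assumes the "Killed process PID (comm)" wording quoted above, and exact formatting varies across kernel versions. The function name and the watch list are hypothetical.

```python
import re

# Matches both the global message and the "Memory cgroup out of memory" variant.
OOM_KILL = re.compile(
    r"[Oo]ut of memory: Killed process (?P<pid>\d+) \((?P<comm>[^)]*)\)"
)


def find_oom_kills(log_lines, watched):
    """Return (pid, comm) for each OOM kill whose comm is in `watched`."""
    hits = []
    for line in log_lines:
        match = OOM_KILL.search(line)
        if match and match.group("comm") in watched:
            hits.append((int(match.group("pid")), match.group("comm")))
    return hits
```

Run against the output of `dmesg` or `journalctl -k`, split into lines, this only helps if the result leaves the host by a path the killed process was not responsible for.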

PRI and NI describe scheduler position. NI is niceness, from -20 to 19, and it biases the Completely Fair Scheduler’s virtual runtime accounting. A process at nice -20 gets a large share of CPU relative to peers. The inverse is the interesting case for an operator watching for intrusions. An attacker process running at nice 19, or throttled inside a cgroup with a tight cpu.max quota, deliberately holds a low share. It does not sort to the top of the CPU% column. It is patient and quiet by design. Cryptojacking implants do exactly this. Kinsing and the TeamTNT tooling documented across cloud Linux fleets throttle their miners and, in TeamTNT’s case, actively kill competing processes and disable cloud provider monitoring agents. A miner that sorts to the top of htop by CPU% and TIME+ gets found in the first triage pass. One that runs niced and cgroup-throttled hides under the sort order the analyst trusts.

TIME+ is cumulative CPU time consumed since the process started. It is the field that betrays long-running compute regardless of instantaneous CPU%. A miner that keeps its per-sample CPU low still accumulates TIME+. Sort by it and the patient process surfaces even when the CPU% column stays clean. Most triage never re-sorts. That is the gap the throttling exploits.
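The re-sort is a one-liner once the fields are in hand. The sketch below orders process samples by cumulative CPU time instead of instantaneous CPU%. The Sample record and function names are illustrative. The tick counts are assumed to come from the utime and stime fields of /proc/[pid]/stat, and the default of 100 ticks per second is an assumption to be replaced with `os.sysconf("SC_CLK_TCK")` on a real host.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    pid: int
    comm: str
    nice: int
    utime_ticks: int
    stime_ticks: int
    cpu_pct: float  # instantaneous CPU% at sample time


def cpu_seconds(s: Sample, ticks_per_sec: int = 100) -> float:
    """Cumulative user + system CPU time in seconds (what TIME+ displays)."""
    return (s.utime_ticks + s.stime_ticks) / ticks_per_sec


def by_time_plus(samples: list[Sample], ticks_per_sec: int = 100) -> list[Sample]:
    """Sort descending by cumulative CPU time, ignoring instantaneous CPU%."""
    return sorted(samples, key=lambda s: cpu_seconds(s, ticks_per_sec), reverse=True)
```

A process at nice 19 showing 3% CPU but tens of thousands of accumulated CPU seconds lands first in this ordering and near the bottom of a CPU% sort.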

Task and thread counts on the header line map to process-table and PID pressure. kernel.pid_max bounds the total. Exhaust it and no new process can fork - including the shell an incident responder needs and the recovery tooling the platform relies on. Resource exhaustion of this class is MITRE T1499, endpoint denial of service. The fork storm does not need root. It needs a shell and an unbounded cgroup. htop shows the ceiling approaching in the task counter before the box stops accepting logins.
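The approaching ceiling can be expressed as a ratio. The sketch below takes the text of /proc/loadavg, whose fourth field reports runnable and total scheduling entities, and the text of /proc/sys/kernel/pid_max. The function name pid_headroom is hypothetical, and what counts as an alarming ratio is left to the operator.

```python
def pid_headroom(loadavg_text: str, pid_max_text: str) -> tuple[int, int, float]:
    """Return (existing_tasks, pid_max, fraction_used).

    The fourth field of /proc/loadavg looks like "3/950": runnable entities,
    then the total number of scheduling entities (processes and threads).
    """
    total_tasks = int(loadavg_text.split()[3].split("/")[1])
    pid_max = int(pid_max_text)
    return total_tasks, pid_max, total_tasks / pid_max
```

On a host, the inputs would be `open("/proc/loadavg").read()` and `open("/proc/sys/kernel/pid_max").read()`. Per-cgroup limits such as pids.max can bite well before the global figure does, so this is an upper bound on headroom, not a guarantee.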

Zombies deserve their own note. A Z-state process has terminated but its parent has not reaped the exit status. A scattering of zombies is normal churn. A growing population of them, or zombies parented to a supervisor that should be reaping, indicates the parent has stalled or died. When the parent is a monitoring daemon or a process supervisor, accumulating zombies under it is a direct indicator that the supervisor stopped doing its job. A dead watchdog announces itself in the state column if anyone reads it.
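Grouping zombies by parent makes the stalled supervisor stand out. The sketch below counts Z-state entries per parent PID from raw /proc/[pid]/stat lines. The function name is illustrative, and the input is assumed to be one stat line per process, collected however the operator prefers.

```python
from collections import Counter


def zombies_by_parent(stat_lines) -> Counter:
    """Count zombie processes per parent PID from /proc/[pid]/stat lines."""
    counts = Counter()
    for line in stat_lines:
        # Fields after the last ')' start with state (field 3) and ppid (field 4).
        state, ppid = line[line.rindex(")") + 1:].split()[:2]
        if state == "Z":
            counts[int(ppid)] += 1
    return counts
```

`zombies_by_parent(lines).most_common(5)` lists the parents holding the most unreaped children. A count that grows between samples under the same parent is the indicator described above.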

There is a subtler use of the contention map. Time-of-check-to-time-of-use races have a success probability tied to the width of the window between the check and the use. Under CPU contention, the scheduler preempts more often and the window widens. An attacker exploiting a setuid TOCTOU or a symlink race can deliberately raise system load to shift a one-in-ten-thousand race toward one-in-three. htop shows precisely how much load is present and how much headroom remains to raise it. The tool that a defender uses to confirm the box is healthy is the same tool that tells an attacker how far the race window can be stretched.

What fires in telemetry, and what does not, follows directly. The OOM kill lands in the kernel ring buffer and, on most builds, in the journal and kern.log. auditd, if it is alive and its spool is draining, records the syscalls its rules cover, such as the clone and execve calls behind a fork storm and any kill(2) issued from user space. The OOM killer’s SIGKILL is delivered inside the kernel, so it does not appear as a kill syscall. Sysmon for Linux emits event ID 5 on process termination and event ID 1 on the fork storm’s process creations. A network-tier control at the perimeter, such as Cloudflare or an equivalent, sees nothing of this, because none of it crosses the network boundary. The whole sequence is local resource physics. The blind spots are specific and repeatable. If the induced pressure kills the log shipper, downstream detections never receive the OOM event. If I/O pressure parks auditd in D-state, the signals and process creations that happen during that window can go unrecorded. The detection depends on the survival and responsiveness of the exact component the technique is engineered to starve. That circular dependency is the flaw.

There is no patch boundary here, because there is no CVE. None of this is a bug. The OOM killer selecting the largest resident process is the memory manager working as specified. D-state blocking SIGKILL is uninterruptible sleep behaving as designed. The Completely Fair Scheduler honouring niceness is the scheduler doing its job. This is not a vulnerability to remediate. It is operating-system behaviour to account for.

Exposure remains after any hardening, but it maps to concrete controls rather than general advice. Setting oom_score_adj strongly negative on security daemons, down to -1000 to exempt a process outright, removes them from the OOM killer’s easy-victim pool. cgroup v2 with per-workload memory.max and pids.max caps the pressure any single tenant can generate. I/O priority and separate spool devices help keep synchronous loggers out of D-state under load. Detection that depends on the agent it is meant to protect is detection built on the wrong assumption. The agent’s liveness is itself a signal - its absence, its D-state, its OOM death are all events, and they must be observed from somewhere the starvation cannot reach. htop shows all of this to whoever opens it. The attacker opened it first.
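Whether the cgroup caps are actually in place can be audited per workload. The sketch below checks a cgroup v2 directory for memory.max and pids.max and reports which are unbounded. The function name and the example path are hypothetical. A missing file is treated as unbounded at that level, on the assumption that the controller is not enabled for the cgroup, though a limit on an ancestor may still apply.

```python
from pathlib import Path


def unbounded_limits(cgroup_dir: str,
                     files: tuple[str, ...] = ("memory.max", "pids.max")) -> list[str]:
    """Return the limit files in a cgroup v2 directory that impose no cap."""
    unbounded = []
    for name in files:
        try:
            value = (Path(cgroup_dir) / name).read_text().strip()
        except OSError:
            # Controller not enabled here: no limit at this level (assumption).
            value = "max"
        if value == "max":  # cgroup v2 writes the literal "max" for "no limit"
            unbounded.append(name)
    return unbounded
```

For example, `unbounded_limits("/sys/fs/cgroup/tenant.slice")`, where the slice name is made up, returning an empty list means both caps are set. Any entry in the result is a workload that can still generate the pressure described above.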
