
A junior operator, an API key, a hundred payloads

Google warns AI-powered hacking has reached industrial scale. Practical operational resilience steps for defenders facing faster, cheaper, adaptive attacks.


The Threat Has Changed Speed, Not Shape

Google’s Threat Intelligence Group reported in November 2025 that state-aligned actors and criminal crews are now using large language models inside the active phases of intrusions - not just for phishing drafts, but for live malware generation, target reconnaissance, and adaptive command-and-control. They named specific families: PROMPTFLUX rewriting itself hourly, PROMPTSTEAL querying a hosted model API mid-operation to generate its next commands, FRUITSHELL designed to bypass LLM-based defensive analysis. The label they used was industrial scale.

That phrase matters. It does not mean attacks are smarter. It means they are cheaper, faster, and produced at volume. A junior operator with a paid API key can now generate variants faster than your detection engineering team can write signatures. The economics flipped. That is the only thing that flipped.

The techniques themselves are old. Phishing is still phishing. Lateral movement is still lateral movement. Credential theft, persistence, exfiltration - none of it is new. What changed is the cost of producing the next variant. Where an attacker used to spend a week customising a payload for one target, they now spend twenty minutes producing a hundred. The defensive question is no longer can we stop the technique. The defensive question is can we operate at a tempo that survives the volume.

This post is about that operating tempo. Not the tools. The tempo. Because if you fix the tempo, the tools follow. If you fix the tools without the tempo, you buy expensive boxes that watch you get breached.

What Industrial Scale Actually Looks Like On Your Network

Forget the marketing diagrams. Here is what defenders are seeing in the field this year.

A phishing campaign used to mean one lure, sent to a thousand inboxes, with one payload. Now it means a thousand lures, each individually generated against the target’s LinkedIn profile, each delivering a payload that was compiled fresh sixty seconds before delivery. Your email gateway hashes do not match. Your URL reputation lists do not match. Your sandbox detonation sees a slightly different binary each time. Detection-by-signature is finished for this class of traffic. It was already weakening. It is now finished.

A compromised endpoint used to beacon to a known command-and-control domain. Now the beacon target is generated at runtime, sometimes by querying a public LLM API endpoint that the operator wraps as a relay. Your egress filtering sees traffic to api.openai.com or generativelanguage.googleapis.com - which your developers use legitimately every day - and the malicious flow hides inside that legitimate traffic. Blocking the API breaks your business. Allowing it gives the operator a covert channel.

A social engineering call used to be a human reading from a script. Now it is a voice clone of your CFO, generated from a thirty-second sample pulled off an earnings call, calling your accounts payable clerk at 4:47pm on a Friday to authorise a wire. The clerk has worked with the real CFO for six years. The clone passes the voice. It passes the cadence. It knows the project codename because the operator scraped it from a press release that morning.

This is what industrial scale means in practice. It does not mean more attacks. It means attacks that are individually tailored at the volume that used to belong to spray-and-pray. The defender’s old assumption - that targeted attacks are rare and broad attacks are obvious - is dead.

The Resilience Frame, Not The Defence Frame

Most cyber writing tells you how to stop the attack. That advice is now incomplete, because the attacks you cannot stop have multiplied. The question that matters for an organisation in 2026 is not how do we stop them. It is how do we keep operating while they are happening.

That is the resilience frame. It treats compromise as a recurring operational state, not an exceptional event. It assumes you will be inside a live intrusion at some point in any given quarter, and asks what your business does during that intrusion. The defence frame protects the perimeter. The resilience frame protects the function.

The distinction shows up in budget arguments. A defence-framed team asks for another endpoint product. A resilience-framed team asks how long the finance team can close month-end without the ERP, and writes the answer down, and tests it. The defence-framed team wants to add controls. The resilience-framed team wants to remove dependencies.

You can run both frames at once. Most mature organisations do. But if you are choosing where to put the next dollar this year, the resilience frame returns more. Defensive controls are now in an arms race where the attacker iterates faster than your procurement cycle. Resilience properties - segmentation, recovery time, manual fallback, decision authority - degrade more slowly and pay back over multiple incidents.

Map The Five Functions That Cannot Stop

Before you spend another dollar on tooling, run this exercise with your operations leaders. Ask them to name the five business functions that genuinely cannot stop for forty-eight hours without causing material harm. Not five hundred. Five.

For a hospital, the answer might be: patient identification, medication administration, imaging access, theatre scheduling, payroll. For a logistics firm: dispatch, fuel card authorisation, customs paperwork, driver communication, invoicing. For a mid-sized law firm: client communication, document version control, court filings, time recording, trust account access.

The list is hard to make. Operations leaders will resist the constraint. They will tell you everything is critical. Hold the line at five. The reason for the constraint is that you cannot protect everything with the same intensity, and pretending you can means you protect nothing well. Five is enough that the list reflects reality. More than five and you are back to a list of everything.

For each of those five functions, document three things on a single sheet of paper. The minimum people, the minimum systems, the minimum data. If the systems went away tomorrow, what is the manual fallback? If the manual fallback requires authority decisions, who has the authority and what is the trigger? Print those sheets. Laminate them. Put them in a binder in the operations room. The binder is your resilience artefact. It is worth more than a SIEM.

The people who tell you binders are old-fashioned have not been in an incident where Active Directory was encrypted and nobody could log into the wiki to find the runbook. Paper has uptime properties that digital systems do not.

Segment Like You Mean It

Segmentation is the single highest-value technical control for an AI-augmented threat landscape, and it is the one most organisations have done worst. The reason it is high-value now is that the attacker’s speed advantage is amplified by reach. A toolkit that can iterate variants hourly is dangerous in proportion to how far it can travel once it lands. A flat network is a multiplier. A segmented network is a divider.

Segmentation is not VLANs. VLANs are addressing. Segmentation is enforcement. You have segmentation when a compromised workstation in marketing cannot reach a domain controller, cannot reach the finance file share, cannot reach the OT network, and cannot reach the backup repository. You verify that by trying it, not by reading the firewall config.
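
If you want a starting point for that verification, the sketch below is the shape of it: run it from an ordinary workstation in the user zone and attempt the connections segmentation is supposed to block. The hostnames and ports are placeholders for your own assets, not real systems.

```python
# Minimal sketch: verify segmentation by attempting the connections that should fail.
# Hostnames and ports are placeholders - substitute the assets in your own zones.
import socket

# Destinations a marketing workstation should NOT be able to reach (hypothetical examples)
SHOULD_BE_BLOCKED = [
    ("dc01.corp.example", 445),         # domain controller, SMB
    ("dc01.corp.example", 389),         # domain controller, LDAP
    ("finance-fs.corp.example", 445),   # finance file share
    ("backup-repo.corp.example", 443),  # backup repository API
]

def reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    failures = [(h, p) for h, p in SHOULD_BE_BLOCKED if reachable(h, p)]
    for host, port in failures:
        print(f"SEGMENTATION GAP: {host}:{port} is reachable from this zone")
    print("OK: all tested paths are blocked" if not failures else f"{len(failures)} gap(s) found")
```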

The practical model is to define three or four trust zones and enforce all inter-zone traffic through an identity-aware checkpoint. User workstations are one zone. Server workloads are another. Privileged administration is a third. Backup and recovery infrastructure is the fourth, and it is the one most organisations get wrong. Backup systems sit on the production domain, authenticate with production credentials, and get encrypted in the same incident that they were supposed to recover from.

If your backup repository can be reached from a domain user account, you do not have backups. You have a second copy of the production data, which an attacker will encrypt or delete during the dwell period before they trigger the visible portion of the attack. Real backup separation means a different identity store, a different network path, immutable storage where retention cannot be reduced by an authenticated administrator, and physical or logical air gap on at least one copy.

Microsoft published guidance in 2024 on the 3-2-1-1-0 backup rule, which adds an immutable copy and a zero-error verification step to the older 3-2-1 advice. Adopt it. The cost of immutable cloud storage is trivial compared to the cost of a ransom negotiation.
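
If your immutable copy lives in object storage, the immutability claim can be checked programmatically rather than taken on faith. A minimal sketch, assuming AWS S3 with Object Lock and a placeholder bucket name - adapt it to whatever your backup product actually writes to:

```python
# Minimal sketch: confirm the off-site backup copy is actually immutable.
# Assumes the immutable copy lives in AWS S3 with Object Lock; the bucket name is a placeholder.
import boto3
from botocore.exceptions import ClientError

BACKUP_BUCKET = "example-backup-immutable"  # hypothetical bucket name

def check_object_lock(bucket: str) -> None:
    s3 = boto3.client("s3")
    try:
        cfg = s3.get_object_lock_configuration(Bucket=bucket)["ObjectLockConfiguration"]
    except ClientError:
        print(f"FAIL: {bucket} has no Object Lock configuration - copies are deletable")
        return
    rule = cfg.get("Rule", {}).get("DefaultRetention", {})
    mode = rule.get("Mode")
    days = rule.get("Days") or (rule.get("Years", 0) * 365)
    if mode != "COMPLIANCE":
        # GOVERNANCE mode can be bypassed by a privileged principal; COMPLIANCE cannot.
        print(f"WARN: {bucket} retention mode is {mode}, a privileged account can still shorten it")
    print(f"{bucket}: Object Lock {cfg.get('ObjectLockEnabled')}, mode={mode}, retention={days} days")

if __name__ == "__main__":
    check_object_lock(BACKUP_BUCKET)
```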

Identity Is The New Perimeter, And It Is Already Breached

The firewall stopped being your perimeter the day your first employee took a laptop home. Identity is the perimeter now. Identity is also the most actively attacked surface, because credentials are stolen at industrial scale through the same AI-powered tooling that produces the phishing.

The state of the art for identity controls in 2026 is not multi-factor authentication. MFA is the floor. The floor is still useful - accounts without MFA are picked off immediately - but treating MFA as the ceiling is how organisations get hit by adversary-in-the-middle phishing kits that capture the session cookie after the second factor. EvilProxy, Tycoon 2FA, Mamba - these kits have been industrialised for two years and are now sold with support contracts.

The upgrade is phishing-resistant authentication: FIDO2 hardware keys, platform passkeys, or certificate-based authentication for the small set of accounts that have administrative reach. If a compromised user can read mail, that is one risk. If a compromised user can create new accounts, modify group memberships, or read shared mailboxes, that is a different risk and it needs a different authentication grade. Tier your identities. Tier your authentication to match.

The second upgrade is conditional access that looks at signals beyond the password. Device compliance, network location, session anomalies, behavioural baselines. The conditional access engine is the choke point where you get to evaluate whether the request makes sense, not just whether the credentials are correct. Spend time on the conditional access policies. They are where the actual security lives.

The third item, which most organisations have not done, is session control. The session cookie is the prize after a successful phish, and it survives password resets, MFA enrolment changes, and most clean-up activity. After any suspected compromise, revoke sessions explicitly. Make session revocation a one-button action in your incident playbook. Practice it.
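
What the one-button action can look like, assuming a Microsoft Entra ID tenant: a single call to the Microsoft Graph revokeSignInSessions endpoint. Token acquisition and the required Graph permission are deliberately left out of the sketch - the placeholder token stands in for your usual OAuth flow.

```python
# Minimal sketch: the "one button" session revocation, assuming a Microsoft Entra ID tenant.
# ACCESS_TOKEN is a placeholder; acquire it through your existing app registration.
import requests

GRAPH = "https://graph.microsoft.com/v1.0"
ACCESS_TOKEN = "<token acquired via your usual OAuth flow>"  # placeholder

def revoke_sessions(user_principal_name: str) -> bool:
    """Invalidate refresh tokens and session cookies issued to the user."""
    resp = requests.post(
        f"{GRAPH}/users/{user_principal_name}/revokeSignInSessions",
        headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.ok

if __name__ == "__main__":
    # Run immediately after disabling the account and resetting credentials,
    # so the stolen session cookie dies along with everything else.
    if revoke_sessions("compromised.user@example.com"):
        print("Sessions revoked - existing cookies and refresh tokens are now invalid")
```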

Detection Engineering For A Volume Problem

If the attacker can produce a hundred variants an hour, your detection content has to detect behaviours, not variants. This shift has been preached for a decade and ignored for a decade. The economics now force the issue.

The replacement for signature-based detection is behavioural detection - looking for what the attacker has to do, regardless of how the tool is written. They have to discover the environment. They have to escalate. They have to persist. They have to move laterally. They have to exfiltrate. Each of those phases produces telemetry that does not depend on which binary was used.

The MITRE ATT&CK framework is the public reference for this approach. Build your detection coverage against the techniques, not the implementations. T1059 - command and scripting interpreter abuse - is a technique. PowerShell Empire is an implementation. The technique survives the implementation. Detection content written against the technique survives the variant.

The practical work is unglamorous. Inventory which ATT&CK techniques your telemetry can theoretically detect, then test which ones it actually detects, then close the gaps. Atomic Red Team is a free library that lets you run individual technique tests in your own environment. Run them. Find out what fires and what does not. The first time most teams do this, they discover that thirty per cent of their assumed coverage is fictional.
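
The inventory does not need a product to hold it. A minimal sketch of the assumed-versus-verified diff - the technique IDs are real ATT&CK identifiers, but the coverage list and test results are illustrative:

```python
# Minimal sketch: keep the coverage inventory as data, then diff what you assume
# against what actually fired when the corresponding Atomic Red Team tests ran.

ASSUMED_COVERAGE = {
    "T1059.001": "PowerShell",
    "T1003.001": "LSASS memory dumping",
    "T1021.002": "SMB/Windows admin shares",
    "T1567.002": "Exfiltration to cloud storage",
}

# Filled in by hand (or from your SIEM API) after each test run: did an alert fire?
TESTED_RESULTS = {
    "T1059.001": True,
    "T1003.001": True,
    "T1021.002": False,   # test executed, nothing fired
    # T1567.002 never tested
}

def coverage_report(assumed: dict, tested: dict) -> None:
    for technique, name in sorted(assumed.items()):
        if technique not in tested:
            status = "UNTESTED - coverage is an assumption"
        elif tested[technique]:
            status = "verified"
        else:
            status = "GAP - test ran, no detection fired"
        print(f"{technique:<10} {name:<32} {status}")

if __name__ == "__main__":
    coverage_report(ASSUMED_COVERAGE, TESTED_RESULTS)
```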

The second piece is reducing alert volume so analysts can think. AI-augmented attacks produce more activity, which produces more alerts, which produces more fatigue. The answer is not more analysts. The answer is fewer, better alerts, each one carrying enough context that the disposition is obvious. Aim for a state where every alert your tier-one analyst sees comes with the affected user, the affected asset, the technique mapping, the recent context, and a recommended next action. Anything less and you are paying salary to read noise.
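
One way to enforce that standard is to make the context a hard gate rather than an aspiration. A minimal sketch of the contract, with illustrative field names - the alert does not reach the queue until every field is populated:

```python
# Minimal sketch of the enrichment contract. Field names and sources are illustrative.
from dataclasses import dataclass, fields

@dataclass
class EnrichedAlert:
    alert_id: str
    technique: str            # ATT&CK mapping, e.g. "T1566.002"
    affected_user: str
    affected_asset: str
    recent_context: str       # e.g. "user authenticated from a new ASN 40 minutes earlier"
    recommended_action: str   # e.g. "isolate endpoint, revoke sessions"

def ready_for_analyst(alert: EnrichedAlert) -> bool:
    """Gate: every context field must be non-empty before the alert enters the queue."""
    return all(getattr(alert, f.name) for f in fields(alert))
```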

The Speed Equation In An Incident

In a live incident, two clocks run against each other. The attacker’s clock, measured from initial access to objective. The defender’s clock, measured from detection to containment. The time the attacker operates before detection - the stretch before your clock even starts - is the dwell time, and dwell time predicts damage almost perfectly.

The industry average dwell time, depending on whose report you read, ranges from eight days to twenty-one days. The figure that matters is yours. Most organisations do not measure it because measuring it requires admitting that you have a baseline of incidents, which the board does not want to hear. Measure it anyway. You cannot improve a clock you do not start.
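
Starting the clock is mostly a data-hygiene problem. A minimal sketch, with illustrative timestamps - all it assumes is that every incident record carries an estimated initial access time, a first detection time, and a confirmed containment time:

```python
# Minimal sketch: measure your own clocks from whatever incident records exist.
from datetime import datetime
from statistics import median

incidents = [
    # (estimated initial access, first detection, confirmed containment) - illustrative
    (datetime(2025, 3, 2, 9, 10), datetime(2025, 3, 9, 14, 0), datetime(2025, 3, 10, 2, 30)),
    (datetime(2025, 6, 17, 22, 5), datetime(2025, 6, 19, 8, 45), datetime(2025, 6, 19, 13, 0)),
]

dwell_days = [(detect - access).total_seconds() / 86400 for access, detect, _ in incidents]
contain_hours = [(contain - detect).total_seconds() / 3600 for _, detect, contain in incidents]

print(f"median dwell time: {median(dwell_days):.1f} days")
print(f"median detection-to-containment: {median(contain_hours):.1f} hours")
```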

The AI-augmented attacker has shortened their own clock. Initial access to domain admin used to be measured in days for a capable operator. The well-publicised cases in 2025 showed it dropping to hours, sometimes under sixty minutes, when the operator used LLM-assisted tooling to compress reconnaissance and tool selection. Your detection-to-containment clock has to come down faster than theirs.

The levers on the defender side are limited. Automation, pre-authorised playbooks, and decision rights. Automation handles the high-confidence, narrow-scope responses - isolating an endpoint that is clearly executing known-bad behaviour, disabling a user account that just authenticated from two continents in three minutes. Pre-authorised playbooks let your on-call analyst execute the response without waking the CISO. Decision rights let your CISO act without waiting for the CEO when the action affects the business.
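
The two-continents-in-three-minutes case is a good candidate for the narrow, high-confidence tier, because the physics does the analysis for you. A minimal sketch of the check - the coordinates, timestamps, and speed threshold are illustrative, and the account-disable action itself belongs in your pre-authorised playbook, not in this snippet:

```python
# Minimal sketch of an impossible-travel check that can feed a pre-authorised response.
from datetime import datetime
from math import radians, sin, cos, asin, sqrt

MAX_PLAUSIBLE_KMH = 900  # roughly airliner speed; anything faster is physically impossible

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 6371 * 2 * asin(sqrt(a))

def impossible_travel(login_a, login_b) -> bool:
    """login = (timestamp, lat, lon). True if the implied speed is not physically possible."""
    (t1, lat1, lon1), (t2, lat2, lon2) = sorted([login_a, login_b])
    hours = max((t2 - t1).total_seconds() / 3600, 1e-6)
    return haversine_km(lat1, lon1, lat2, lon2) / hours > MAX_PLAUSIBLE_KMH

# Sydney, then Frankfurt three minutes later -> trigger the pre-authorised account disable
a = (datetime(2026, 1, 9, 16, 44), -33.87, 151.21)
b = (datetime(2026, 1, 9, 16, 47), 50.11, 8.68)
print(impossible_travel(a, b))  # True
```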

The decision rights piece is where most programs fail. The technical capability to contain exists. The authority to contain does not. The on-call analyst sees the activity, calls the manager, the manager calls the director, the director calls the CISO, the CISO calls the COO to authorise taking the payroll system offline before midnight. Two hours pass. The exfiltration completes during the phone tree. Write down who can pull the cord without asking. Write down what they can pull. Tell their boss in advance, on paper, that this authority exists. Without that document, your incident response runs at the speed of email approvals.

The Human Tempo Problem

The BREAKPOINT thesis, the reason this newsletter exists, is that the human operating tempo determines whether the technical controls work. You can have the best playbook in the industry and lose the incident because the people executing it were on hour fourteen of a continuous shift and missed the obvious step. Resilience is a human property before it is a technical one.

In a sustained intrusion, fatigue compounds in three ways. Cognitive - the analysts get slower at pattern matching, miss correlations, default to known scripts even when the situation does not fit. Emotional - the stakes feel heavier with time, decisions get more cautious or more reckless, communication degrades. Physical - sleep loss, irregular eating, caffeine debt, all of which feed back into cognitive degradation.

The operational response is to rotate, not endure. Build a roster that assumes a major incident will run for seventy-two hours minimum, and that no single person can be on the bridge for more than twelve of those hours. If you do not have the bench depth for that roster, your incident response capacity is overstated and you should not be telling the board you have twenty-four-seven coverage. You have one shift, and after that shift, you have an exhausted shift making bad decisions.

The pre-incident work is to set the rotation expectations in calm time. Who is primary, who is secondary, who is the relief. What time the relief comes in. Who is allowed to refuse a shift change. Who has the authority to call in external help and stop the local team from working past their limit. The expectations are written down so that during the incident, nobody is negotiating who goes home while the attacker is still active.

The second piece is what happens after the incident. The team that just worked seventy-two hours does not go straight back to the queue. They get a forced stand-down. They get a debrief that surfaces what they actually saw, not the version sanitised for the board report. They get permission to say what hurt and what did not work. Programs that skip the debrief lose the institutional learning and burn out their people. Programs that do the debrief get better every incident.

Drilling Under Pressure

A plan that has never been executed under pressure is a hypothesis. Tabletop exercises validate the hypothesis. Functional drills test the execution. Full simulations test the system. You need all three, in increasing order of cost and decreasing order of frequency.

The tabletop is quarterly. Pull the incident team into a room for ninety minutes. Read out a scenario. Walk the response. Find the gaps in roles, communication, and authority. The output is a short list of fixes that go into the runbook before the next quarter. The cost is one afternoon. The return is alignment.

The functional drill is twice-yearly. Pick one capability - backup restoration, account isolation, network segmentation cutover, customer notification - and actually execute it. Not in production, but with production tooling, against production-equivalent systems, with production decision-makers in the loop. The functional drill catches the gap between the runbook says we can and we have actually done this once.

The full simulation is annual. A live exercise, ideally with a red team operating against your environment with your blue team responding in real time. The full simulation costs real money and produces real findings, and most organisations cannot tolerate the disruption. Run it anyway. The first one is brutal. The third one is informative. By the fifth, your team has muscle memory you cannot buy.

For smaller organisations that cannot afford a contracted red team, tabletop scenarios written with a destructive imagination work. Have someone outside the security team write the scenario. Make it specific. Use real assets, real accounts, real third parties. Bad scenarios are vague. Good scenarios make the team uncomfortable.

What To Measure And What To Ignore

Measurement drives behaviour, and most cyber metrics drive the wrong behaviour. Vulnerabilities patched per month, training completions, mean time to detect - all of them can be gamed and all of them can improve while the actual security posture degrades. Pick the metrics that map to outcomes.

The first metric is critical asset coverage. Of the five business functions you identified earlier, what percentage of the underlying systems are inside your monitoring scope, your patching scope, your backup scope, and your authentication tier? If a critical system is not in scope for any of those, you have a known gap that maps directly to a known impact. Report the percentage. Report the named systems that are missing.
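
The calculation itself is set arithmetic, nothing more. A minimal sketch, with placeholder system names:

```python
# Minimal sketch: critical asset coverage as set arithmetic. System names are placeholders.
critical_systems = {"erp-prod", "dispatch-api", "fuel-auth", "customs-portal", "invoicing-db"}

scopes = {
    "monitoring": {"erp-prod", "dispatch-api", "invoicing-db"},
    "patching":   {"erp-prod", "dispatch-api", "fuel-auth", "invoicing-db"},
    "backups":    {"erp-prod", "invoicing-db"},
    "auth tier":  {"erp-prod", "dispatch-api", "fuel-auth", "customs-portal", "invoicing-db"},
}

for scope, covered in scopes.items():
    missing = critical_systems - covered
    pct = 100 * len(critical_systems & covered) / len(critical_systems)
    print(f"{scope:<11} {pct:>4.0f}%  missing: {', '.join(sorted(missing)) or 'none'}")
```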

The second is mean time to contain, not mean time to detect. Detection is half the story. Containment is the half that bounds the damage. Time the clock from first alert to confirmed isolation of the affected scope. If it is more than an hour for endpoint scope, more than four hours for account scope, more than twelve hours for tenant scope, your incident response is not keeping pace with the threat.
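
Those targets are simple enough to encode next to the incident records so the comparison runs itself. A minimal sketch, with an illustrative timestamp pair:

```python
# Minimal sketch: the containment clock checked against the per-scope targets above.
from datetime import datetime, timedelta

TARGETS = {
    "endpoint": timedelta(hours=1),
    "account":  timedelta(hours=4),
    "tenant":   timedelta(hours=12),
}

def containment_verdict(scope: str, first_alert: datetime, confirmed_isolation: datetime) -> str:
    elapsed = confirmed_isolation - first_alert
    verdict = "within target" if elapsed <= TARGETS[scope] else "NOT keeping pace"
    return f"{scope}: contained in {elapsed}, target {TARGETS[scope]} - {verdict}"

# Illustrative: an account-scope incident contained in five and a half hours misses the target
print(containment_verdict("account", datetime(2026, 2, 3, 9, 0), datetime(2026, 2, 3, 14, 30)))
```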

The third is recovery time objective, tested, not stated. The RTO that lives in the business continuity document is aspirational. The RTO you can prove in a drill is real. The gap between them is the gap between your perceived resilience and your actual resilience. Close it by either fixing the recovery capability or fixing the stated objective. Do not leave them in disagreement.

The fourth is third-party exposure. Most modern intrusions touch a third party at some point - a managed service provider, a software vendor, a contractor with VPN access. Count your third parties with privileged access. Count the ones you have actually assessed in the last twelve months. Count the ones that have a security incident notification clause in the contract. The three counts should be close to equal. They almost never are.

Metrics to ignore: training completion rates above ninety per cent (people click through), number of vulnerabilities (volume tells you nothing about exploitability), number of alerts handled (more alerts is not better), security spend as percentage of IT budget (industry benchmarks are meaningless for your specific risk).

The Supply Chain Is Now A Modelled Surface

The AI-augmented attacker does not just target you. They target the path to you. Software vendors, managed service providers, contractors, even open source maintainers - anyone with a trust relationship that lets them push code or commands into your environment is now a modelled target. Operators run open-source models against public registries to find packages with weak maintainer hygiene, then social engineer the maintainer or compromise the publishing credentials.

The defensive response operates in three layers. Reduce your trust surface - which third parties actually need privileged access, and can that access be downscoped to something narrower than a domain admin equivalent? Most cannot answer that question because the access was granted years ago by someone who has since left. Audit it.

Monitor the trust surface - when a third party logs in, when they execute, when they pull or push code, can you see it in the same telemetry stream where you see your own users? If their activity sits in a separate console that nobody watches, you have a blind spot the size of their access. Pipe their telemetry to the same place.

Contract the trust surface - incident notification clauses, audit rights, breach disclosure timelines, indemnification. Most third-party contracts do not have these because they were signed by procurement without a security review. Re-paper the ones that matter. Start with the top five third parties ranked by their access level. If the contract is not negotiable, document the residual risk and price it.

Communicating With The Business Without Lying

The CISO who tells the board everything is fine is the CISO who will be unemployed when everything is not fine. The CISO who tells the board everything is on fire is the CISO who loses budget arguments because nothing they say is actionable. The middle path is structural honesty.

Structural honesty means presenting the state of resilience in a way that maps to business decisions. Three categories. What we can absorb without business impact. What will cause measurable business impact but is recoverable within a defined window. What will cause existential or regulatory impact. Each category gets named systems, named scenarios, named numbers.

The board does not need the threat intel briefing. The board needs to know that if the customer-facing platform goes down for forty-eight hours, revenue impact is twelve million and customer churn is around eight per cent, and that the current recovery capability supports a twenty-six hour restore against a twenty-four hour stated objective. That is a decision the board can engage with. They can approve more recovery investment or accept the gap.

The industrial-scale AI threat does not change this framing. It changes the probability inputs. The probability of a serious incident in any given year has gone up. The cost of preventing every incident has gone up faster than budgets. The cost of recovering from an incident has stayed roughly flat. The arithmetic now favours preparation for recovery over investment in prevention, at the margin. The board needs to hear that arithmetic.

What This Year Looks Like For Defenders

The next twelve months will produce more incidents per organisation than the last twelve, on average, across most sectors. The incidents will be faster from initial access to objective. The phishing will be harder to spot. The voice clones will be better. The malware will recompile itself between detonations. The exfiltrated data will be processed by the same models that wrote the phishing email, looking for the next pivot.

None of that is a counsel of despair. It is a counsel of operational seriousness. The organisations that come through the next twelve months without material loss will not be the ones with the best products. They will be the ones with the clearest priorities, the shortest decision chains, the tightest segmentation, the most boring backups, and the rotation policy that keeps their people functional through a long incident.

The ones that get hurt will not be hurt by the technology gap. They will be hurt by the meeting that should have happened in calm time and was deferred to the next quarter. They will be hurt by the runbook step that nobody had executed end-to-end. They will be hurt by the third-party access that was granted in 2019 and never reviewed. They will be hurt by the on-call analyst who did not have the authority to act and could not reach the person who did.

Fix those things. The threat will keep evolving. The defensive posture that holds against it is mostly structural, mostly human, mostly old. AI did not change what works. It raised the cost of not doing what works.

A Practical Sequence For The Next Ninety Days

If you read this and want a concrete starting sequence, run it in this order.

Week one: identify the five business functions that cannot stop. Write them down. Get the operations leaders to sign the list.

Week two and three: for each of the five functions, document minimum people, systems, and data, and the manual fallback. Print the binder.

Week four: audit privileged access. Every account, every third party, every service principal. Cut anything that does not map to a current named owner and a current named purpose.

Week five: verify backup separation. Try to reach the backup repository from a standard user account. If you can, fix the architecture. Test a restore from the immutable copy on at least one critical system end-to-end.

Week six: write the decision-rights document. Who can isolate an endpoint, disable an account, take a system offline, engage external help, notify customers, notify regulators. Names, not roles. Sign it. File it with legal.

Week seven and eight: run a tabletop exercise against a specific scenario - an AI-augmented business email compromise with a wire fraud attempt, or a ransomware event with public data leak threat. Find the gaps. Fix the top three before the next round.

Week nine: review third-party contracts for the top five most-privileged external parties. Identify which ones lack incident notification clauses. Open the renegotiation.

Week ten: pick three ATT&CK techniques relevant to your environment. Run the Atomic Red Team tests for them. Confirm what fires. Fix what does not.

Week eleven: run a functional drill on backup restore. Time it. Compare against the stated RTO. Document the gap.

Week twelve: present to the board. Three categories of impact, named systems, named scenarios, named numbers. Ask for the specific investment that closes the largest gap. Do not ask for a generic budget increase.

This is not a transformation program. It is twelve weeks of unglamorous work that will move your resilience posture more than any product purchase you make this year. The threat is moving faster. Your structure has to move first, because the structure is what survives the next variant.

A Note On Mental Posture

The last piece, which is the one this newsletter exists to repeat. The people who work in cyber and in adjacent high-pressure functions are operating in a domain where the threat is now industrial and the work is unending. The temptation is to match the threat’s intensity with personal intensity. That ends badly, every time, for the individual and for the program they run.

The sustainable posture is the opposite. Steady cadence. Defined off-hours. Visible rest. Refusal to be the only person who knows how a system works. Documented decisions. Honest debriefs. The cyber defender who lasts ten years in this work is not the one who works the hardest. It is the one who built the structure that allowed them to work less hard than the situation seemed to demand, and to be available for the next incident, and the one after that, and the one after that.

The industrial-scale threat is a marathon environment. Run it like a marathon. Pause when you need to. Adapt to what is in front of you. Continue.


