RC RANDOM CHAOS

The 9.x exposure with nothing to patch

Why exposed DigitalOcean data needs no exploit, what defenders miss in telemetry, and how to escalate a leak responsibly.

Open storage. No authentication. The kind of exposure that needs no exploit chain, no memory corruption, no sandbox escape. The data is reachable. A request returns it. That is the whole attack.

The report behind this came from a reader who found a large dataset sitting on DigitalOcean and wanted a contact to escalate it to. The instinct is right. The mechanism under it repeats every quarter, same root cause, different provider, so it is worth taking apart.

This is not a CVE-class memory safety bug. There is no patch boundary because there is no software defect in the conventional sense. The defect is configuration. CWE-200, exposure of sensitive information to an unauthorized actor. It pairs with CWE-306, missing authentication for a critical function, and CWE-732, incorrect permission assignment for a critical resource. Where the resource shipped with a permissive default that nobody changed, CWE-1188 applies, insecure default initialization. No CVSS vector covers it because the weakness is in the deployment, not the code. If a score were forced onto it, an anonymous unauthenticated read of a full dataset over the network scores 7.5 under CVSS v3.1 on confidentiality impact alone, and lands in the 9.x range once the same unauthenticated access also allows writes, which an open database usually does.
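As a rough check on where a forced score would land, here is a minimal sketch of the CVSS v3.1 base-score arithmetic for the vector AV:N/AC:L/PR:N/UI:N/S:U. It covers unchanged scope only, and the function names are illustrative, not part of any scoring library. Under these weights a read-only exposure comes out at 7.5, and the 9.x range is reached when integrity is lost as well.

```python
import math

# CVSS v3.1 metric weights for the values used here (scope unchanged only).
AV_NETWORK = 0.85
AC_LOW = 0.77
PR_NONE = 0.85
UI_NONE = 0.85
IMPACT_WEIGHT = {"H": 0.56, "L": 0.22, "N": 0.0}


def roundup(value: float) -> float:
    """CVSS v3.1 Roundup: smallest one-decimal number >= value."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(c: str, i: str, a: str) -> float:
    """Base score for AV:N/AC:L/PR:N/UI:N/S:U with the given C/I/A values."""
    iss = 1 - (
        (1 - IMPACT_WEIGHT[c]) * (1 - IMPACT_WEIGHT[i]) * (1 - IMPACT_WEIGHT[a])
    )
    impact = 6.42 * iss
    exploitability = 8.22 * AV_NETWORK * AC_LOW * PR_NONE * UI_NONE
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("H", "N", "N"))  # anonymous read only: 7.5
print(base_score("H", "H", "N"))  # anonymous read and write: 9.1
```

The score describes the access, not the data. It says nothing about how many records sat behind the open door.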

The exposure takes two dominant shapes on DigitalOcean. The first is Spaces, the S3-compatible object storage product, left with a public-read ACL or a public bucket policy. The bucket lists. Objects download anonymously over HTTPS. No signature, no key, no token. The second is a database reachable on the public interface. A managed database with its trusted-source firewall left open, or a self-hosted instance on a droplet bound to 0.0.0.0 with no cloud firewall in front of it. MongoDB, Elasticsearch, Redis, PostgreSQL, MySQL. The service listens on its default port, accepts the connection, and serves records to a client that never authenticated because authentication was never enforced.
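An owner can confirm the first shape against their own bucket by looking at what an unsigned request to the bucket root returns. The sketch below only classifies a response that has already been fetched. It assumes the store answers in the S3 style, with a ListBucketResult document for a public listing and a 401 or 403 when access is refused. The function name and return labels are hypothetical.

```python
import xml.etree.ElementTree as ET


def classify_anonymous_response(status: int, body: str) -> str:
    """Classify the response to an unsigned GET on a bucket's root URL."""
    if status in (401, 403):
        return "denied"
    if status != 200:
        return "inconclusive"
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return "inconclusive"
    # S3-style listings use a namespaced <ListBucketResult> root element,
    # so strip the "{namespace}" prefix before comparing.
    local_name = root.tag.rsplit("}", 1)[-1]
    if local_name == "ListBucketResult":
        return "public-listing"
    return "inconclusive"
```

A result of "denied" on the bucket root does not clear individual objects, which can carry their own public-read ACL. It only shows that the bucket does not list.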

The database case has more texture than a missing firewall rule. A bind address of 0.0.0.0 means the listening socket accepts connections on every interface, including the public one DigitalOcean assigns to the droplet. Older MongoDB builds shipped with exactly that default before the project changed it, and a long tail of instances still runs that way through copied configs and stale images. Redis is worse than a read. An unauthenticated Redis instance exposes the CONFIG command, a known path to writing attacker-controlled files and from there to code execution, so an exposed cache is a foothold, not only a data problem. Unauthenticated Elasticsearch answers a search query against every index it holds. The common thread is a service that treats network reachability as proof of authorization, which on a public interface it never is.
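The bind-address point is easy to check mechanically. A minimal sketch, using Python's standard ipaddress module, that sorts a configured bind address by how far it reaches. The function name and the four labels are illustrative, and it expects a literal IP address, not a hostname.

```python
import ipaddress


def bind_exposure(bind_address: str) -> str:
    """Describe how far a service bound to this literal IP address reaches."""
    addr = ipaddress.ip_address(bind_address)  # raises ValueError for hostnames
    if addr.is_unspecified:
        # 0.0.0.0 or :: means every interface, including the public one.
        return "all-interfaces"
    if addr.is_loopback:
        return "loopback-only"
    if addr.is_private:
        return "private-network"
    return "public"


for candidate in ("0.0.0.0", "127.0.0.1", "10.0.0.5", "::"):
    print(candidate, bind_exposure(candidate))
```

Anything that comes back as "all-interfaces" or "public" depends entirely on a firewall and on service-level authentication, since the bind address itself restricts nothing.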

The reason this keeps happening is the development workflow, not ignorance. A bucket gets set to public during a build so a frontend can pull assets without signed URLs. A database gets bound to all interfaces so a teammate can connect during testing. The cloud firewall is the last step and the step that gets skipped. The mental model is that the bucket name is unguessable and the droplet IP is obscure, so nobody will find it. Obscurity is not a control. Internet-wide scanners sweep the full IPv4 range continuously and enumerate common object-storage namespaces around the clock. A reachable service is a found service, usually within hours of going live.
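Because the firewall is the step that gets skipped, it is a reasonable thing to check automatically before a deployment goes live. The sketch below flags inbound rules that open a default database port to the whole internet. The rule layout, a dict with "port" and "sources", is a hypothetical simplification and not DigitalOcean's API schema. The ports are the services' well-known defaults.

```python
import ipaddress

# Default listening ports for the services named above.
DATABASE_PORTS = {
    27017: "MongoDB",
    9200: "Elasticsearch",
    6379: "Redis",
    5432: "PostgreSQL",
    3306: "MySQL",
}


def open_database_rules(inbound_rules):
    """Return (service, port, source) for database ports open to any address."""
    findings = []
    for rule in inbound_rules:
        service = DATABASE_PORTS.get(rule["port"])
        if service is None:
            continue
        for source in rule["sources"]:
            network = ipaddress.ip_network(source, strict=False)
            # A prefix length of 0 (0.0.0.0/0 or ::/0) matches every address.
            if network.prefixlen == 0:
                findings.append((service, rule["port"], source))
    return findings


rules = [
    {"port": 5432, "sources": ["0.0.0.0/0", "::/0"]},
    {"port": 6379, "sources": ["10.0.0.0/8"]},
]
print(open_database_rules(rules))
```

A check like this catches only the unscoped rule. A droplet with no firewall attached at all has no rules to inspect, so that case has to be flagged separately.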

The exploit path is the part that disappoints anyone expecting a chain. There is no chain. Discovery runs at internet scale through mass indexers. The actor does not target a victim. The actor harvests whatever responds. MITRE maps this cleanly. T1595, active scanning, for the discovery. T1530, data from cloud storage, for an open Spaces bucket. T1190 where a public-facing database is the door. No credential theft, no privilege escalation, no lateral movement, because the data sits on the front step and the door is open. The weaponization is a GET request or a database client connect. The volume and the lack of obfuscation noted in the original report are not anomalies. They are what an unauthenticated store looks like when the owner assumed nobody was looking.

The precedent is not theoretical. The Meow attacks in 2020 wiped thousands of unauthenticated MongoDB and Elasticsearch instances with an automated bot that left no ransom note, only destruction and a placeholder string. Groups in the ShinyHunters lineage built a business on harvesting exposed stores and reselling them on breach forums. Public indexers like Shodan, Censys, and BinaryEdge catalog open services as a matter of routine, which means the gap between exposure and inclusion in a searchable index is short. The data does not have to be interesting to a human. A scraper takes it because it responded, and the triage happens later.

The detail that the data sat in the clear matters as much as the volume. An open store holding records encrypted under a key the owner controls is an availability and access problem. An open store serving plaintext PII is a confidentiality breach the moment the first object lands on an external host. No tokenization, no field-level encryption, no client-side wrapping means the harvested copy is usable on arrival, ready to be parsed, correlated against prior breach corpuses, and resold. Lack of obfuscation is not a stylistic observation. It sets the harm assessment, because directly readable identifiers are what trip notification thresholds.
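For contrast, here is a minimal sketch of what tokenization at write time could look like, using a keyed HMAC from Python's standard library. It is one illustrative approach, not a description of any system in the report. The field names and the key are hypothetical, and the key has to live outside the store for the scheme to mean anything.

```python
import hashlib
import hmac


def tokenize(value: str, key: bytes) -> str:
    """Replace an identifier with a keyed, non-reversible token."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def tokenize_record(record: dict, fields: set, key: bytes) -> dict:
    """Return a copy of the record with the named fields tokenized."""
    return {
        name: tokenize(str(value), key) if name in fields else value
        for name, value in record.items()
    }


# Hypothetical key and record, for illustration only.
key = b"example-key-held-outside-the-datastore"
record = {"email": "person@example.com", "plan": "basic"}
print(tokenize_record(record, {"email"}, key))
```

The same input and key always yield the same token, so records can still be joined on the tokenized field. A harvested copy holds tokens that are hard to tie back to a person without the key.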

Telemetry is where the defender is blind, and the blindness is structural. A droplet has no endpoint agent by default. DigitalOcean does not run code inside the customer operating system, so there is no EDR feed, no Sysmon stream, no process telemetry unless the owner installed and configured it. Spaces access logging is off until explicitly enabled. There is no flow-log equivalent capturing reads at the network layer by default. So a bulk anonymous download produces almost nothing on the victim side. Maybe a bandwidth line on the billing graph that nobody watches. Database query logs if logging was turned up, which on a misconfigured instance it was not. Authentication logs show nothing, because no authentication failed, because none was required. The exfiltration is indistinguishable from legitimate traffic. The service was built to serve anonymous reads, and that is exactly what it did.

The platform holds signal the tenant never sees. DigitalOcean runs the hypervisor and the network fabric, so the provider can observe the connection and the egress volume, but that view does not surface to the customer as a queryable log. The tenant’s own droplet records nothing useful either. The system auth log captures SSH sessions, not anonymous reads against a database port or an HTTP fetch from a public bucket. So the owner is left correlating a billing anomaly after the fact against a service that has no native audit trail, which is the same as having no detection at all until the data turns up somewhere public.

Walk the same event with the controls that should have fired. A cloud firewall scoped to known sources would have dropped the connection at the edge and logged the drop with a source address and ASN. Spaces access logging would have recorded the enumerator user agent, the source ASN, and the object keys pulled, which is enough to scope the breach after the fact. A managed database restricted to trusted sources would have refused the connection. Every one of those controls produces evidence. Every one of them is opt-in. None was opted into. The detection gap is not a missing product. It is the difference between the telemetry the platform can emit and the telemetry the owner enabled, and on a self-managed deployment that difference is usually everything.
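Once access records exist, scoping is mostly aggregation. A minimal sketch, assuming each record has already been parsed into a dict with the hypothetical fields "source_ip", "key", and "bytes_sent". Real log formats differ and need their own parser.

```python
from collections import defaultdict


def scope_by_source(records):
    """Summarise requests, bytes, and distinct object keys per source address."""
    summary = defaultdict(lambda: {"requests": 0, "bytes": 0, "keys": set()})
    for rec in records:
        entry = summary[rec["source_ip"]]
        entry["requests"] += 1
        entry["bytes"] += rec["bytes_sent"]
        entry["keys"].add(rec["key"])
    return dict(summary)


# Hypothetical parsed records, for illustration only.
records = [
    {"source_ip": "192.0.2.10", "key": "exports/users.csv", "bytes_sent": 1000},
    {"source_ip": "192.0.2.10", "key": "exports/orders.csv", "bytes_sent": 2500},
    {"source_ip": "198.51.100.7", "key": "assets/logo.png", "bytes_sent": 300},
]
for source, stats in scope_by_source(records).items():
    print(source, stats["requests"], stats["bytes"], sorted(stats["keys"]))
```

The per-source key sets are what turn "something was read" into a list of which objects left and where they went.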

Escalation is the part the original post got right and most finders get wrong. The correct route is DigitalOcean’s abuse and security reporting channel, not a personal contact sourced from a thread. The provider can reach the account owner and act on the resource. The finder cannot, and should not try. Pulling more of the dataset than the minimum needed to confirm exposure crosses from research into access, and good intent does not clean that up after the fact. Where Australian law applies, the obligation sits with the entity that holds the data. The Privacy Act 1988 and the Notifiable Data Breaches scheme require assessment and notification to the OAIC and affected individuals where serious harm is likely. If the data belongs to a critical infrastructure entity, the SOCI Act adds incident reporting timelines on top. The responsible move is to report to the provider and the relevant CERT, document what was observed without redistributing it, and let the data holder run its obligations. Do not download. Do not mirror. Do not post a sample.

The closing reality is the uncomfortable one. There is no patch to apply and no version to roll forward to. The fix is a permission flip on the bucket policy and a firewall rule on the database, and both take minutes. What does not get fixed is the window. The residual exposure after closure is whatever was harvested while the store was open, and for an internet-facing resource that has been reachable and crawled, the safe assumption is that the copy is gone. Indexers cache. Scrapers retain. Once a dataset has answered anonymous requests at scale, the owner no longer controls every copy of it. Closing the bucket stops the next read. It does not recall the ones that already happened, and the breach assessment has to start from that assumption, not the hope that nobody got there first.
