RANDOM CHAOS

Cloudflare opens self-managed OAuth to all devs, surviving a live Hydra upgrade

· via Hacker News

Original source

Cloudflare launched self-managed OAuth for all

Cloudflare has made self-managed OAuth available to every developer on its platform, letting them register and run their own OAuth clients for scoped, delegated access to the Cloudflare API. Until now, third-party OAuth was limited to a handful of manually onboarded partners like PlanetScale, leaving everyone else to lean on API tokens that are clumsy to manage and ill-suited to delegated app flows. The push was driven largely by the rise of agentic tools demanding delegated access, alongside the usual SaaS integrations and internal developer platforms. To support it safely, Cloudflare reworked its consent screens, added dashboard-based revocation, and made app ownership visible to blunt OAuth phishing.

The more interesting story is the engineering underneath. Cloudflare’s OAuth runs on Hydra, an open-source engine that needed a major version jump it couldn’t afford to do in place. The team split the work into two sequential upgrades, first to 1.X and then to 2.X. Even the 1.X step required rewriting schema migrations to use CREATE INDEX CONCURRENTLY and building a custom Hydra that selected explicit columns instead of SELECT *, which had been breaking on the new schema. After cutover, stricter refresh-token reuse handling started invalidating entire token chains, a real problem for high-volume Wrangler and MCP clients. To short-circuit the retries that triggered it, the team added refresh-token coalescing in the routing Worker.
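To make the coalescing idea concrete, here is a minimal sketch of one way it could work: concurrent or quickly repeated refresh requests that present the same refresh token share a single upstream exchange and its result. This is not Cloudflare's implementation. The names RefreshCoalescer, TokenResponse, RefreshFn, and replayWindowMs are all hypothetical, and the in-memory maps stand in for whatever shared state a real routing Worker would use.

```typescript
// Hypothetical token response shape; not Hydra's actual wire format.
interface TokenResponse {
  accessToken: string;
  refreshToken: string;
}

// Whatever actually performs the refresh grant against the OAuth server.
type RefreshFn = (refreshToken: string) => Promise<TokenResponse>;

interface CachedResponse {
  response: TokenResponse;
  expiresAt: number;
}

class RefreshCoalescer {
  private readonly upstream: RefreshFn;
  private readonly replayWindowMs: number;
  private readonly now: () => number;
  // Exchanges currently in progress, keyed by the presented refresh token.
  private readonly inFlight = new Map<string, Promise<TokenResponse>>();
  // Recently completed exchanges, replayed to late retries.
  private readonly recent = new Map<string, CachedResponse>();

  constructor(
    upstream: RefreshFn,
    replayWindowMs: number = 5000, // assumed value, purely illustrative
    now: () => number = () => Date.now(),
  ) {
    this.upstream = upstream;
    this.replayWindowMs = replayWindowMs;
    this.now = now;
  }

  refresh(refreshToken: string): Promise<TokenResponse> {
    // A retry that arrives shortly after success gets the same answer back.
    const cached = this.recent.get(refreshToken);
    if (cached !== undefined) {
      if (cached.expiresAt > this.now()) {
        return Promise.resolve(cached.response);
      }
      this.recent.delete(refreshToken);
    }

    // A retry that arrives while the exchange is running joins it.
    const pending = this.inFlight.get(refreshToken);
    if (pending !== undefined) {
      return pending;
    }

    const request = this.exchange(refreshToken);
    this.inFlight.set(refreshToken, request);
    const clear = (): void => {
      this.inFlight.delete(refreshToken);
    };
    // Clear the in-flight entry on success and on failure.
    request.then(clear, clear);
    return request;
  }

  private async exchange(refreshToken: string): Promise<TokenResponse> {
    const response = await this.upstream(refreshToken);
    this.recent.set(refreshToken, {
      response,
      expiresAt: this.now() + this.replayWindowMs,
    });
    return response;
  }
}
```

With this in place, a client that fires the same refresh twice only causes one upstream exchange, so the server never sees the old token reused. A production version would likely key on a hash of the token rather than the raw value, and would need state shared across Worker instances, since per-instance memory alone would not catch retries routed elsewhere.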

The 2.X migration used a blue-green strategy spanning several hours of live traffic. Rather than freeze writes (which would have blocked revocations), they kept writes enabled, stretched token expiry to hours to cut new-token churn, and captured revocation events in a Cloudflare Queue so they could be replayed against the green database. That replay was critical: without it, the cutover would have silently restored access that users had deliberately revoked. The episode is a useful case study in migrating a stateful auth system without downtime. Notably, 2.X’s configurable refresh-token grace period ultimately solved the invalidation problem natively.
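The capture-and-replay step can be sketched as follows, under loud assumptions: an in-memory array stands in for the Cloudflare Queue, and TokenStore, InMemoryTokenStore, RevocationRecorder, and replayRevocations are hypothetical names rather than anything from Cloudflare or Hydra. The point it illustrates is that revocations applied to the blue database during the migration are also recorded, then re-applied idempotently to green before traffic moves over.

```typescript
// Hypothetical event shape for a revocation captured during the migration.
interface RevocationEvent {
  tokenId: string;
  revokedAt: number;
}

// Minimal stand-in for a token database; not Hydra's actual storage API.
interface TokenStore {
  revoke(tokenId: string, revokedAt: number): Promise<void>;
  isRevoked(tokenId: string): Promise<boolean>;
}

class InMemoryTokenStore implements TokenStore {
  private readonly revokedAt = new Map<string, number>();

  // Idempotent: revoking twice keeps the earliest timestamp.
  async revoke(tokenId: string, revokedAt: number): Promise<void> {
    const existing = this.revokedAt.get(tokenId);
    if (existing === undefined || revokedAt < existing) {
      this.revokedAt.set(tokenId, revokedAt);
    }
  }

  async isRevoked(tokenId: string): Promise<boolean> {
    return this.revokedAt.has(tokenId);
  }
}

// Applies revocations to blue and records them for later replay.
class RevocationRecorder {
  private readonly blue: TokenStore;
  // Stand-in for a durable queue; a plain array is an assumption of this sketch.
  private readonly log: RevocationEvent[] = [];

  constructor(blue: TokenStore) {
    this.blue = blue;
  }

  async revoke(tokenId: string, revokedAt: number): Promise<void> {
    await this.blue.revoke(tokenId, revokedAt);
    this.log.push({ tokenId, revokedAt });
  }

  events(): readonly RevocationEvent[] {
    return this.log;
  }
}

// Re-applies every captured revocation to green; safe to run more than once.
async function replayRevocations(
  events: readonly RevocationEvent[],
  green: TokenStore,
): Promise<number> {
  let applied = 0;
  for (const event of events) {
    await green.revoke(event.tokenId, event.revokedAt);
    applied += 1;
  }
  return applied;
}
```

Making revoke idempotent is the design choice that matters here: a queue may deliver an event more than once, and a replay that can be safely repeated avoids having to track exactly which events green has already seen.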

This is an AI-generated summary. Read the original for the full story.