▲Backend Platform · Systems · AI Tooling
maria khan.
pages from a runbook
I'm Maria. I started this blog to write down the things I've spent a week debugging, partly so I remember them, partly because writing forces me to actually understand them. Email's on the contact page.
- since 2026
The interesting parts of any system are the ones nobody designed on purpose. The bottleneck nobody named, the retry that fans out, the cache that benchmarked clean and turned into a serial choke point the moment production thread count went up.
Production scale is the only honest test of an idea. Everything before it is a hypothesis with good intentions.
Every system has a hidden serial bottleneck. The job is finding the one nobody put there on purpose.
Most platform incidents aren't caused by the design. They're caused by the gap between the design and how the design actually runs.
A retry without a budget is a fan-out generator.
If your test passes the first time, the test is wrong.
Postgres did exactly what you asked it to. You asked for the wrong thing.
The framework default is rarely wrong. It's also rarely right at production scale.
The runbook entry that helps the most is the one you wrote at 2am after the incident, not the one you wrote when you finally knew what to say.
You don't get to thread-safe Ruby by sprinkling Mutexes over an existing design. You get there by deciding ahead of time who owns each piece of mutable state.
The dashboard tells you what is on fire. It doesn't tell you what just got dry.
Half of platform work is convincing the other half that the boring fix is the right fix.
Most performance bugs hide one syscall down from where you are looking. The application sees a slow query; one strace later, the truth is a 5ms futex wait under it.
Read the library's source before reading its docs. The docs are aspirational. The source is what runs at 2am.
Every abstraction is a lie at one layer down. The fun part is which layer.
When in doubt, strace it.
Shipped
A secret scanner I built end to end. A Rust engine that finds hardcoded secrets, confirms which ones are actually live by calling the provider, and rewrites them to read from the environment. It ships as a CLI, an MCP server an agent can call before it commits, a VS Code extension, and a package in every major registry.
4,800+ downloads across
- crates.io
- npm
- RubyGems
- Go
- VS Code
- Open VSX
- GitHub Actions
- MCP Registry
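A rough sketch of the first and last steps described above (find a hardcoded secret, then rewrite it to read from the environment), assuming a single pattern for quoted AWS-style access key IDs. The `rewrite_secrets` name, the `KEY_LITERAL` regex, and the default variable name are hypothetical and for illustration only. This is not the shipped Rust engine, and the live-verification step is left out because it needs a call to the provider.

```ruby
# Hypothetical pattern: a quoted AWS-style access key ID (AKIA + 16 chars).
KEY_LITERAL = /(["'])(AKIA[0-9A-Z]{16})\1/

# Returns [rewritten_source, found_keys]. Each quoted key literal is replaced
# with an ENV.fetch call; the variable name is an assumption for illustration.
def rewrite_secrets(source, env_name: "AWS_ACCESS_KEY_ID")
  found = []
  rewritten = source.gsub(KEY_LITERAL) do
    found << Regexp.last_match(2)
    %(ENV.fetch("#{env_name}"))
  end
  [rewritten, found]
end

# AKIAIOSFODNN7EXAMPLE is the placeholder key ID used in AWS documentation.
before = %(client = Client.new(key: "AKIAIOSFODNN7EXAMPLE"))
after, found = rewrite_secrets(before)
```

A pattern match like this only says the string looks like a key. Whether the key is live is a separate question that a regex cannot answer.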
What I think about
Lately
- Drafting a long-form piece on the dynamic-config system I shipped. The kind of post I wish I had read before I started.
- Sat through a secure code review. Half intimidating, half satisfying, half a long list of things I now want to read about.
- Built a small in-browser etcd-watch playground for my own learning. Useless, fun, the kind of project a Saturday demands.
- Read 'Designing Data-Intensive Applications' (again). Different parts hit this time.
- Quietly removed three layers of caching that were each there to fix the previous one. Felt like cleaning a kitchen.
- Started writing in the open after a year of journal entries that nobody read.
Reading shelf
- Designing Data-Intensive Applications · the book everyone owns and re-reads. earned.
- Site Reliability Engineering · skip the fluff chapters. the post-mortem and capacity ones are gold.
- Working Effectively with Legacy Code · older than half the platforms it would help with. still right.
- Database Internals · the right level of detail for someone who works with databases but does not write them.
- Understanding Distributed Systems · covers the parts of distributed systems that nobody warns you about until you ship something.
- Working with Ruby Threads · the GVL chapter is the one I send people first. still right in 2026 even though half the toolbox is new.
- AI Fluency: Framework & Foundations · anthropic's free course on working with claude. the 4D framework is genuinely useful once you stop treating ai as autocomplete.
- The Code Book · read this as a teen. got me into the field and never left.
Writing timeline
April 2026 to June 2026
Recent writing
- Catching the secret before the commit, not after the audit
The cheapest place to catch a hardcoded secret is before it is ever committed. On false-positive fatigue and why scanners get muted, the gap between looks-like-a-key and is-this-key-live, and how I built leakferret to classify, verify, and rewrite secrets in the editor, the pre-commit hook, and the AI agent itself.
- What concurrent Ruby looks like, twelve years after the book
Walking the foundations of Storimer's Working with Ruby Threads with the actual race conditions and deadlocks you'll see in production, then walking the primitives that have shown up since 2014: Ractors, Fiber.scheduler, async, Falcon, and the snapshot-plus-AtomicReference pattern I keep using for hot-path lookups.
- Debugging Redis::CannotConnectError in Ruby
A month of thousands of connect-timeout errors a day from a Rails app on redis-rb. The dead ends (pool size, KEDA, DNS, kernel knobs you can't tune on managed Redis), the error taxonomy that actually narrows it down, and the four-line fix for what turned out to be a footgun in your own code.
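The snapshot-plus-AtomicReference pattern named in the concurrency post can be sketched roughly as below. This is an assumption about its general shape, not code from the post: readers take one reference to a frozen Hash and never lock, while writers copy, modify, and swap. To stay dependency-free, the sketch uses a plain instance variable with a Mutex on the write side as a stand-in for concurrent-ruby's `Concurrent::AtomicReference`. `SnapshotLookup` and its methods are hypothetical names.

```ruby
class SnapshotLookup
  def initialize(initial = {})
    @write_lock = Mutex.new
    @snapshot = initial.dup.freeze
  end

  # Hot path: one reference read, no lock. The Hash is frozen, so a reader
  # never observes a half-applied update.
  def [](key)
    @snapshot[key]
  end

  # Cold path: copy the current snapshot, let the caller modify the copy,
  # freeze it, then swap the reference. Writers serialize on the Mutex.
  def update
    @write_lock.synchronize do
      draft = @snapshot.dup
      yield draft
      @snapshot = draft.freeze
    end
  end
end

flags = SnapshotLookup.new("new_checkout" => false)
readers = 4.times.map { Thread.new { 1_000.times { flags["new_checkout"] } } }
flags.update { |h| h["new_checkout"] = true }
readers.each(&:join)
```

On CRuby the GVL makes the instance-variable swap safe to read without a lock. On runtimes with truly parallel threads, an explicit atomic reference is probably the safer choice for visibility guarantees.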