Writing on software, systems, and hard-won lessons.
The last two developers I onboarded both asked about test-driven development. They were shocked when I said TDD is unnecessary for most work. I've tried writing tests first, but over the past 25 years I've found it's very rarely the best approach, and I've watched developers on my team struggle with lost productivity and extra cognitive load trying to make it work.
Development should drive the tests, not the other way around.
TDD advocates have told me that difficulty is the process working as intended. That our red-green-refactor loops were too large, and we should be writing a single tiny test for just the next few lines of code. But if the change is that small, static analysis or existing tests might catch that. Beller et al. observed 2,443 developers over 2.5 years and found that only 2.2% of sessions contained strict TDD patterns. If almost nobody does "true TDD," the problem might be the methodology, not the developers. And if what most developers actually practice looks nothing like TDD, then calling it TDD is just lending the methodology's credibility to a completely different workflow.
Just because you didn't work test-first doesn't mean you have to work test-never.
Others will argue TDD's real value is design, not testing. That writing the test first forces better architecture. In my experience, getting better at planning is more valuable than forcing design through a testing framework. On a healthcare SaaS platform I've worked on for 13 years, the best architecture decisions came from understanding the domain and planning the approach, not from writing a test and letting it push me toward a shape. Plan before you code, update the plan as you learn, and sharpen it as you go. That's a skill worth developing. Writing throwaway tests to arrive at the same design decisions is the long way around.
There are moments where writing the test first does make sense, when the behaviour is already fully understood and you're codifying a known expectation. But that's a small subset of real development work, and it doesn't require TDD as a philosophy.
The pattern across TDD research is consistent: the test-first part, the thing that actually defines TDD, has little to no measurable effect.
Erdogmus, Morisio, and Torchiano ran a controlled experiment and found that writing more tests correlated with higher productivity regardless of whether tests were written first or last. The benefit came from testing, not from sequencing.
Fucci et al. (2017) studied 39 professionals. The order you write tests and code in? Didn't matter. What mattered was working in small cycles. The test-first part had no effect on quality or productivity.
Rafique and Mišić (2013) looked across 27 studies. TDD showed a small quality benefit, and that's worth acknowledging: it's the strongest thing the research has in TDD's favour. But there was no productivity benefit. When they split by setting, productivity went up 19% in university studies but dropped 22% in industry. The more the work looked like real development, the worse TDD performed on productivity.
Bissi et al. (2016) reviewed studies from 1999 to 2014. About 44% showed lower productivity with TDD compared to writing tests after. Professional teams consistently did worse.
Tosun et al. (2017) tested 24 professionals on tasks of varying complexity. TDD helped on simple tasks; on complex ones, productivity dropped. The same pattern again: the closer the task to real development, the worse TDD fared.
Ghafari et al. (2020) asked why TDD research is so inconclusive and identified five problems: studies define TDD differently, participants are often TDD newcomers, experiments focus on greenfield code, no one tests TDD on existing codebases, and the comparison baseline varies wildly. Even TDD's own research base can't agree on what it's measuring.
A TDD advocate will point out that Beller et al.'s 2.2% finding could mean TDD is just hard to measure in the wild rather than that nobody does it. That's fair. But it also means the studies claiming TDD benefits may not be measuring strict TDD either. The research is messy in both directions.
This argument has been going on for over a decade. Here's where it stands.
"TDD causes design damage." DHH (creator of Ruby on Rails) argued in 2014 that designing code for testability rather than for the problem domain produces worse architecture. TDD advocates respond that this is a mocking problem, not a TDD problem. Kent Beck himself said he mocks almost nothing.
"Most unit testing is waste." Jim Coplien argued in 2014 that unit tests mostly cover cases that never occur in practice, and only have real value when testing algorithmic logic. TDD advocates respond that tests catch regressions you can't predict. Coplien's counter: integration and system tests do that better.
"You're doing TDD wrong." The most common defence. If it's not working, your loops are too large or you're over-isolating. But the Beller study found only 2.2% of developers follow strict TDD. If almost nobody does it correctly, that's worth paying attention to.
"TDD is a design tool, not a testing tool." The strongest defence. Writing the test first makes you think about the interface before you write the code. But Kent Beck said he tests "as little as possible to reach a given level of confidence." Martin Fowler said self-testing code matters more than whether you wrote the test first.
Even TDD's creators have softened. In the 2014 "Is TDD Dead?" conversations, Beck framed testing decisions as trade-offs involving skill, scope, lifespan, and coupling rather than as a universal discipline. Fowler drew a line between self-testing code and TDD, saying TDD is one path to self-testing code but not the only one. They concluded closer to "test what matters, when it matters" than to strict test-first. In my view, that's development driven testing.
Writing a test first usually means you're just testing a guess.
When I integrated a new payment provider into our platform, the documentation looked comprehensive. But their sandbox behaved differently from what the docs described for edge cases around partial refunds and webhook retry logic. If I'd written tests first based on the docs, I'd have thrown them all out within a day. Instead, I made exploratory API calls, figured out how the service actually worked, then wrote tests that matched reality.
Testing first is usually a bad idea when you're still figuring out what to build: messy UI work, quick prototypes, strange real-world data, or a new external service you don't yet understand. Once the shape of the solution becomes clear, that's the point to start adding tests.
A passing test suite means the scenarios you thought of work. It doesn't mean the code is safe. Users are unpredictable. They'll submit forms half-filled, hit back and forward in sequences you didn't consider, and connections will drop mid-request. No amount of tests written before or after the code will cover what you didn't think to test.
TDD can make this worse by creating false confidence. A green test suite feels like a safety net, but it's only as good as the cases you imagined. The bugs that actually make it to production are almost always scenarios nobody thought to test. Manual testing, real user feedback, monitoring in production, and good error handling catch those. A unit test you wrote before the code existed doesn't.
TDD requires thinking about what the required change is, how it should be implemented, and how it should be tested, all before you've confirmed whether your approach even works. There might be a subtle nuance to how the new code will interact with the rest of the code base, which isn't apparent until you run the code and review the output. How often does a non-trivial change work exactly as you expected on the first run? I can count those on one hand across 25 years.
Then there's the context switching between "what am I testing" and "what am I building." You're being pulled in different directions because you wrote the test before you understood the problem.
Noel Rappin, author of several books on Rails testing, wrote in 2016 that even as a test-first advocate, he finds "it nearly impossible to determine what I want to test for until I code up at least part of the view." He acknowledged that writing the code first swaps out TDD's exploration benefit in exchange for making useful tests easier to write. He was right, and his honest framing of the trade-off is closer to how most experienced developers actually work than the strict TDD gospel.
When I started my software development company 25 years ago, it was all fixed-price contracts. Dozens of them over the first 5 years. If I went over time, I was effectively working for free. With only so many hours in a day, I had to get better at estimating and faster at delivering while maintaining quality.
I tried TDD at first. But after finding I'd deleted half the tests by the time I was ready to commit, I tried writing the code first and testing after. It saved so much time, and made the work so much more enjoyable, that I've worked this way ever since.
I used to write lots of unit tests; they were especially useful when simplifying a feature or upgrading a codebase, and they'd run during development to confirm I hadn't missed anything. But static code analysis has become far more powerful. Now my unit tests cover only logic, and that makes TDD even less practical than before.
I run preflight check scripts before every commit, and CI with every pull request. The tooling available today is seriously good: ESLint and TypeScript strict mode for Node, mypy and ruff for Python, golangci-lint for Go, cargo clippy for Rust, PHPStan for PHP. These catch type errors, null references, dead code, unreachable logic branches, security patterns, and code complexity automatically, before I've written a single new test.
My Go + Next.js preflight script runs over 20 checks. Backend gets go vet, golangci-lint, nilaway, revive, gofmt, and a full build before any tests run. Security checks run govulncheck and gosec. The SQL layer verifies that sqlc generate output matches committed code. Frontend gets ESLint, TypeScript strict mode, Prettier, plus knip and depcheck for unused code and dependencies, and madge for circular dependency detection. Backend and frontend checks can run in parallel. I have similar scripts for PHP, Rust, and Python projects.
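The shape of a script like that can be sketched in a few lines of shell. This is a minimal illustration, not my actual script: the tool names are the ones mentioned above, the `backend/` and `frontend/` directory layout is an assumption, and any tool or directory that isn't present is skipped so the sketch stays runnable anywhere.

```shell
#!/bin/sh
# Minimal preflight sketch. Assumes a Go backend in ./backend and a
# Next.js frontend in ./frontend (hypothetical layout); tools that
# aren't installed are reported as skipped rather than failed.
set -u

run() {  # run <label> <cmd> [args...]; fail fast on a real check failure
  label="$1"; shift
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "skip: $label ($1 not installed)"
    return 0
  fi
  echo "run:  $label"
  "$@" || { echo "FAIL: $label"; exit 1; }
}

backend() (
  cd backend 2>/dev/null || { echo "skip: no backend/ directory"; exit 0; }
  run "go vet"        go vet ./...
  run "golangci-lint" golangci-lint run
  run "govulncheck"   govulncheck ./...
)

frontend() (
  cd frontend 2>/dev/null || { echo "skip: no frontend/ directory"; exit 0; }
  run "eslint"        npx eslint .
  run "tsc --noEmit"  npx tsc --noEmit
)

# Backend and frontend checks run in parallel.
backend & bpid=$!
frontend & fpid=$!
fail=0
wait "$bpid" || fail=1
wait "$fpid" || fail=1
[ "$fail" -eq 0 ] && echo "preflight OK" || { echo "preflight FAILED"; exit 1; }
```

The full version adds the remaining checks (nilaway, gofmt, the sqlc diff, knip, madge, and so on), but the structure is the point: every check reports what it did, any failure stops the commit, and independent suites run in parallel.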
Static analysis doesn't replace testing. But it has replaced a significant chunk of what I used to write unit tests for, like type mismatches, unused variables, null safety, and unreachable code paths. The tests that remain are the ones that actually matter: business logic, integration behaviour, and regression prevention. Kent C. Dodds put it well: "Write tests. Not too many. Mostly integration." That's closer to how I work than anything TDD prescribes.
Testing matters. I'm not arguing against tests. I'm arguing against writing them before you understand what you're building. Build the thing, understand it, confirm it works, then lock it down with the right test for the right risk.