
The following article originally appeared on Wes McKinney’s blog and is being republished here with the author’s permission.
Like a lot of people, I’ve found that AI is terrible for my sleep schedule. In the past I’d wake up briefly at 4:00 or 4:30 in the morning to have a sip of water or use the bathroom; now I have trouble going back to sleep. I could be doing things. Before, I could get a solid 7–8 hours a night; now I’m lucky when I get 6. I’ve largely stopped fighting it: Now when I’m rolling around restlessly in bed at 5:07am with ideas to feed my AI coding agents, I just get up and start my day.
Among my inner circle of engineering and data science friends, there is a lot of discussion about how long our competitive edge as humans will last. Will having good ideas (and lots of them) still matter as the agents begin having better ideas themselves? The human-expert-in-the-loop feels essential now to get good results from the agents, but how long will that last until our wildest ideas can be turned into working, tasteful software while we sleep? Will it be a gentle obsolescence where we happily hand off the reins, or something else?
For now, I feel needed. I don’t describe the way I work now as “vibe coding,” as this sounds like a pejorative “prompt and chill” way of building AI slop software projects. I’ve been building tools like roborev to bring rigor and continuous supervision to my parallel agent sessions, and to heavily scrutinize the work that my agents are doing. With this radical new way of working it’s hard not to be contemplative about the future of software engineering.
Probably the book I’ve referenced the most in my career is The Mythical Man-Month by Fred Brooks, whose now-famous Brooks’s law argues that “adding manpower to a late software project makes it later.” Lately I find myself asking whether the lessons from this book are applicable in this new era of agentic development. Will a talented developer orchestrating a swarm of AI agents be able to build complex software faster and better, and will the short-term productivity gains lead to long-term project success? Or will we run into the same bottlenecks (scope creep, architectural drift, and coordination overhead) that have plagued software teams for decades?
Revisiting The Mythical Man-Month (TMMM)
One of Brooks’s central arguments is that small teams of elite people outperform large teams of average ones, with one “chief surgeon” supported by specialists. This leads to a high degree of conceptual integrity about the system design, as if “one mind designed it, even if many people built it.”
Agentic engineering appears to amplify these problems, since the quality of the software being built is now only as good as the humans in the loop curating and refining specs, saying yes or no to features, and taming unnecessary code and architectural complexity. One of the metaphors in TMMM is the “tar pit”: “Everyone can see the beasts struggling in it, and it looks like any one of them could easily free itself, but the tar holds them all together.” Now we have a new “agentic tar pit” where our parallel Claude Code sessions and git worktrees are engaged in combat with the code bloat and incidental complexity generated by their virtual colleagues. You can systematically refactor, but invariably an agentic codebase will end up larger and more overwrought than anything built by human hand. This is technical debt on an unprecedented scale, accrued at machine speed.
In TMMM, Brooks observed that a working program is maybe 1/9th of the way to a programming product, one that has the necessary testing, documentation, and hardening against edge cases and is maintainable by someone other than its author. Agents are now making the “working program” (or “appears-to-work” program, more accurately) a great deal more accessible, though many newly minted AI vibe coders clearly underestimate the work involved in going from prototype to production.
These problems compound when considering the closely related Conway’s law, which asserts that the architecture of software systems tends to resemble the organization’s team or communication structure. What does that look like when applied to a virtual “team” of agents with no persistent memory and no shared understanding of the system they are building?
Another “big idea” from TMMM that has stuck with people is the n(n-1)/2 coordination problem as teams scale. With agentic engineering, there are fewer humans involved, so the coordination problem doesn’t disappear but rather changes shape. Different agent sessions may produce contradictory plans that humans have to reconcile. I’ll leave this agent orchestration question for another post.
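As a rough illustration of the n(n-1)/2 figure, the sketch below counts the distinct pairs among n collaborators for a few team sizes. The function name `communication_channels` and the sample sizes are illustrative assumptions, not something taken from Brooks or from this post.

```python
def communication_channels(n: int) -> int:
    """Return the number of distinct pairs among n collaborators: n(n-1)/2."""
    if n < 0:
        raise ValueError("team size must be non-negative")
    # n * (n - 1) is always even, so integer division is exact.
    return n * (n - 1) // 2


# Hypothetical team sizes, chosen only to show how quickly the pair count grows.
for size in (2, 5, 10, 50):
    print(size, communication_channels(size))
```

Going from 5 to 50 collaborators multiplies headcount by 10 but the number of pairwise channels by more than 100 (from 10 to 1,225).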
No silver bullet
“There is no single development, in either technology or management technique, which by itself promises even one order-of-magnitude improvement within a decade in productivity, in reliability, in simplicity.”
—“No Silver Bullet” (1986)
Brooks wrote a follow-up essay to TMMM to look at software design through the lens of essential complexity and accidental complexity. Essential complexity is fundamental to achieving your goal: If you made the system any simpler, it would fall short of its problem statement. Accidental complexity is everything else imposed by our tools and processes: programming languages, tools, and the layer of design and documentation needed to make the system understandable by engineers.
Coding agents are probably the most powerful tool ever created to tackle accidental complexity. To think: I basically don’t write code anymore, and now write tons of code in a language (Go) I have never written by hand. There is a lot of discussion about whether IDEs are still going to be relevant in a year or two, when maybe all we need is a text editor to review diffs. The productivity gains are enormous, and I say this as someone burning north of 10 billion tokens a month across Claude, Codex, and Gemini.
But Brooks’s “No Silver Bullet” argument predicts exactly the problem I’m experiencing in my agentic engineering: The accidental complexity is no problem at all anymore, but what’s left is the essential complexity, which was always the hard part. Agents can’t reliably tell the difference. LLMs are extraordinary pattern matchers trained on the entirety of humanity’s open source software, so while they are brilliant at dealing with accidental complexity (refactor this code, write these tests, clean up this mess), they struggle with the more subtle essential design problems, which often have no precedent to pattern match against. They also tend to introduce unnecessary complexity, generating large amounts of defensive boilerplate that is rarely needed in real-world use.
Put another way, agents are so good at attacking accidental complexity that they generate new accidental complexity that can get in the way of the essential structure you are trying to build. With a couple of my new projects, roborev and msgvault, I’m already dealing with this problem as I begin to reach the 100 KLOC mark and watch the agents begin to chase their own tails and contextually choke on the bloated codebases they have generated. At some point beyond that (the next 100 KLOC, or 200 KLOC) things start to fall apart: Every new change has to hack through the code jungle created by prior agents. Call it a “brownfield barrier.” At Posit we have seen agents struggle much more in 1 million-plus-line codebases such as Positron, a VS Code fork. This seems to support Brooks’s complexity scaling argument.
I would hesitate to place a bet on whether the present is a ceiling or a plateau. The models are clearly getting better fast, and the problems I’m describing here may look charmingly quaint in two years. But Brooks’s essential/accidental distinction gives me some confidence that this isn’t just about the current limitations of the technology. Figuring out what to build was the hard part long before we had LLMs, and I don’t see how a flawless coding agent changes that.
Agentic scope creep
When generating code is free, knowing when to say “no” is your last defense.
With the cost of generating code now converging to zero, there is practically nothing stopping agents and their human taskmasters from pursuing all avenues that would previously have been cost or time prohibitive. The temptation to spend your day prompting “and now can you just…?” is overwhelming. But any newly generated feature or subsystem, while cheap to create, is not costless to maintain, test, debug, and reason about in the future. What seems free now carries a contextual burden for future agent sessions, and each new bell or whistle becomes a new vector of brittleness or bugs that can harm users.
From this perspective, building great software projects maybe never was about how fast you can type the code. We can “type” 10x, maybe 100x faster with agents than we could before. But we still have to make good design decisions, say no to most product ideas, maintain conceptual integrity, and know when something is “done.” Agents are accelerating the “easy part” while paradoxically making the “hard part” potentially even more difficult.
Agentic scope creep also seems to be actively destroying the open source software world. Now that the bar is lower than ever for contributors to jump in and offer help, projects are drowning in torrents of 3,000-line “helpful” PRs that add new features. As developers become increasingly hands-off and disengaged from the design and planning process, the agents’ runaway scope creep can get out of control quickly. When the person submitting a pull request didn’t write or fully read the code in it, there’s likely no one involved who’s truly accountable for the design decisions.
I have seen in my own work on roborev and msgvault that agents will propose overwrought solutions to problems when a simple solution would do just fine. It takes judgment to know when to intervene and how to keep the agent in check.
Design and taste as our last foothold
Brooks’s argument is that design talent and good taste are the most scarce resources, and now with agents doing all of the coding labor, I argue that these skills matter more than ever. The bottleneck was never hands on keyboards. Now with the new “Mythical Agent-Month,” we can reasonably conclude that design, product scoping, and taste remain the practical constraints on delivering high-quality software. The developers who thrive in this new agentic era won’t be the ones who run the most parallel sessions or burn the most tokens. They’ll be the ones who are able to hold their projects’ conceptual models in their minds, who are shrewd about what to build and what to leave out, and who exercise taste over the enormous volume of output.
The Mythical Man-Month was published in 1975, more than 50 years ago. In that time, a lot has happened: tremendous progress in hardware performance, programming languages, development environments, cloud computing, and now large language models. The tools have changed, but the constraints are still the same.
Maybe I’m trying to justify my own continued relevance, but the reality is more complex than that. Not all software is created equal: CRUD business productivity apps aren’t the same as databases and other critical systems software. I think the median software consulting shop is completely toast. But my thesis is more about development work in the 1% tail of the distribution: problems inaccessible to most engineers. This will continue to require expert humans in the loop, even if they aren’t doing much or any manual coding. As one recent adjacent example, my friend Alex Lupsasca at OpenAI and his world-class physicist collaborators were able to create a formulation of a difficult physics problem and arrive at a solution with AI’s help. Without such experts in the loop, it’s much more dubious whether LLMs would be able to both pose the questions and come up with the solutions.
I’ll probably still be getting out of bed at 5am to feed and tame my agents for the foreseeable future. The coding is easier now, and honestly more fun, and I can spend my time thinking about what to build rather than wrestling with the tools and systems around the engineering process.
Thanks to Martin Blais, Josh Bloom, Phillip Cloud, Jacques Nadeau, and Dan Shapiro for giving feedback on drafts of this post.
