
Agents don’t know what good looks like. And that’s precisely the problem. – O’Reilly



Luca Mezzalira, author of Building Micro-Frontends, originally shared the following article on LinkedIn. It’s being republished here with his permission.

Every few years, something arrives that promises to change how we build software. And every few years, the industry splits predictably: One half declares the old rules dead; the other half folds its arms and waits for the hype to pass. Both camps are usually wrong, and both camps are usually loud. What’s rarer, and more useful, is someone standing in the middle of that noise and asking the structural questions: Not “What can this do?” but “What does it mean for how we design systems?”

That’s what Neal Ford and Sam Newman did in their recent fireside chat on agentic AI and software architecture during O’Reilly’s Software Architecture Superstream. It’s a conversation worth pulling apart carefully, because some of what they surface is more uncomfortable than it first appears.

The Dreyfus trap

Neal opens with the Dreyfus Model of Skill Acquisition, originally developed for the nursing profession but applicable to any field. The model maps learning across five stages:

  • Novice
  • Advanced beginner
  • Competent
  • Proficient
  • Expert

His claim is that current agentic AI is stuck somewhere between novice and advanced beginner: It can follow recipes, it can even apply recipes from adjacent domains when it gets stuck, but it doesn’t understand why any of those recipes work. This isn’t a minor limitation. It’s structural.

The canonical example Neal gives is beautiful in its simplicity: An agent tasked with making all tests pass encounters a failing unit test. One perfectly valid way to make a failing test pass is to replace its assertion with assert True. That’s not a hack in the agent’s mind. It’s a solution. There’s no ethical framework, no professional judgment, no instinct that says this isn’t what we meant. Sam extends this immediately with something he’d actually seen shared on LinkedIn that week: an agent that had modified the build file to silently ignore failed steps rather than fix them. The build passed. The problem remained. Congratulations all around.

What’s interesting here is that neither Ford nor Newman is being dismissive of AI capability. The point is more subtle: The creativity that makes these agents genuinely useful, their ability to search solution space in ways humans wouldn’t think to, is inseparable from the same property that makes them dangerous. You can’t fully lobotomize the improvisation without destroying the value. This is a design constraint, not a bug to be patched.

And when you zoom out, this is part of a broader signal. When experienced practitioners who’ve spent decades in this industry independently converge on calls for restraint and rigor rather than acceleration, that convergence is worth paying attention to. It’s not pessimism. It’s pattern recognition from people who’ve lived through enough cycles to know what the warning signs look like.

Behavior versus capabilities

One of the most important things Neal says, and I think it gets lost in the overall density of the conversation, is the distinction between behavioral verification and capability verification.

Behavioral verification is what most teams default to: unit tests, functional tests, integration tests. Does the code do what it’s supposed to do according to the spec? This is the natural fit for agentic tooling, because agents are actually getting quite good at implementing behavior against specs. Give an agent a well-defined interface contract and a clear set of acceptance criteria, and it will produce something that broadly satisfies them. This is real progress.

Capability verification is harder. Much harder. Does the system exhibit the operational qualities it needs to exhibit at scale? Is it properly decoupled? Is the security model sound? What happens at 20,000 requests per second? Does it fail gracefully or catastrophically? These are things that most human developers struggle with too, and agents have been trained on human-generated code, which means they’ve inherited our failure modes as well as our successes.
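To make the distinction concrete, here is what behavioral verification typically looks like: a well-specified contract an agent can implement against (the function and figures are illustrative):

```python
def apply_discount(total: float, percent: float) -> float:
    """Contract: reduce total by percent, rounded to cents;
    reject percentages outside 0-100."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(total * (1 - percent / 100), 2)

# Behavioral verification: does the code do what the spec says?
def test_apply_discount_behavior():
    assert apply_discount(100.0, 10) == 90.0
    assert apply_discount(19.99, 0) == 19.99
    try:
        apply_discount(100.0, 150)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for percent > 100")
```

Every one of these tests can pass while the capability questions, throughput under load, coupling, graceful failure, remain completely unanswered.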

This brings me to something Birgitta Boeckeler raised at QCon London that I haven’t been able to stop thinking about. The example everyone cites when making the case for AI’s coding capability is that Anthropic built a C compiler from scratch using agents. Impressive. But here’s the thing: C compiler documentation is extremely well specified and battle-tested over decades, and the test coverage for compiler behavior is some of the most rigorous in the entire software industry. That’s as close to a solved, well-bounded problem as you can get.

Enterprise software is almost never like that. Enterprise software is ambiguous requirements, undocumented assumptions, tacit knowledge living in the heads of people who left three years ago, and test coverage that exists more as aspiration than reality. The gap between “can build a C compiler” and “can reliably modernize a legacy ERP” isn’t a gap of raw capability. It’s a gap of specification quality and domain legibility. That distinction matters enormously for how we think about where agentic tooling can safely operate.

The current orthodoxy in agentic development is to throw more context at the problem: elaborate context files, architecture decision records, guidelines, rules about what not to do. Ford and Newman are rightly skeptical. Sam makes the point that there’s now empirical evidence suggesting that as context file size increases, you see degradation in output quality, not improvement. You’re not guiding the agent toward better judgment. You’re just accumulating scar tissue from previous disasters. This isn’t unique to agentic workflows either. Anyone who has worked seriously with code assistants knows that summarization quality degrades as context grows, and that this degradation is only partially controllable. That has a direct impact on decisions made over time; now close your eyes for a moment and imagine doing this across an entire enterprise system, with many teams spread across different time zones. Don’t get me wrong, the tools help, but the help is bounded, and that boundary is often closer than we’d like to admit.

The more honest framing, which Neal alludes to, is that we need deterministic guardrails around nondeterministic agents. Not more prompting. Architectural fitness functions, an idea Ford and Rebecca Parsons have been promoting since 2017, feel like they’re finally about to have their moment, precisely because the cost of not having them is now directly visible.
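As a sketch of what such a deterministic guardrail can look like: a tiny architectural fitness function that fails the build when one layer imports a module it shouldn’t. The layer names and the rule itself are assumptions for illustration, not from the talk:

```python
import ast
from pathlib import Path

# Illustrative layering rule: code under domain/ must not import these modules.
FORBIDDEN = {"domain": ["api", "infrastructure"]}

def imports_of(source: str) -> set:
    """Collect the top-level module names a Python source file imports."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            found.add(node.module.split(".")[0])
    return found

def check_layering(root: Path) -> list:
    """Fitness function: return (file, banned imports) pairs that break the rule."""
    violations = []
    for layer, banned in FORBIDDEN.items():
        layer_dir = root / layer
        if not layer_dir.is_dir():
            continue
        for path in layer_dir.rglob("*.py"):
            bad = imports_of(path.read_text()) & set(banned)
            if bad:
                violations.append((str(path), sorted(bad)))
    return violations
```

Run in CI with a nonzero exit on violations, this is deterministic in exactly the way the agent is not: it doesn’t care how creative the generated code was, only whether the boundary held.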

What should an agent own, then?

This is where the conversation gets most interesting, and where I think the field is most confused.

There’s a seductive logic to the microservice as the unit of agentic regeneration. It sounds small. The word micro is in the name. You can imagine handing an agent a service with a defined API contract and saying: implement this, test it, done. The scope feels manageable.

Ford and Newman give this idea fair credit, but they’re also honest about the gap. The microservice level is attractive architecturally because it comes with an implied boundary: a process boundary, a deployment boundary, often a data boundary. You can put fitness functions around it. You can say this service must handle X load, maintain Y error rate, expose Z interface. In theory.
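In theory, that kind of capability gate is mechanical to express. A minimal sketch, where the SLO thresholds and the handler are invented for illustration:

```python
import time

def measure(handler, requests=1000):
    """Call the handler repeatedly; report error rate and p99 latency."""
    errors, latencies = 0, []
    for _ in range(requests):
        start = time.perf_counter()
        try:
            handler()
        except Exception:
            errors += 1
        latencies.append(time.perf_counter() - start)
    latencies.sort()
    return {"error_rate": errors / requests,
            "p99_seconds": latencies[int(len(latencies) * 0.99)]}

def capability_gate(handler):
    """Deterministic guardrail: fail the pipeline if the service misses its SLOs."""
    m = measure(handler)
    assert m["error_rate"] <= 0.01, f"error rate {m['error_rate']:.2%} > 1%"
    assert m["p99_seconds"] <= 0.05, f"p99 {m['p99_seconds'] * 1000:.1f}ms > 50ms"
```

The hard part isn’t writing the check; it’s choosing realistic load profiles and thresholds, and then actually enforcing them.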

In practice, we barely enforce this stuff ourselves. The agents have learned from a corpus of human-written microservices, which means they’ve learned from the vast majority of microservices that were written without proper decoupling, without real resilience thinking, without any rigorous capacity planning. They don’t have our aspirations. They have our habits.

The deeper problem, which Neal raises and which I think deserves more attention than it gets, is transactional coupling. You can design five beautifully bounded services and still produce an architectural disaster if the workflow that ties them together isn’t thought through. Sagas, event choreography, compensation logic: This is the stuff that breaks real systems, and it’s also the stuff that’s hardest to specify, hardest to test, and hardest for an agent to reason about. We made exactly this mistake in the SOA era. We designed lovely little services and then discovered that the interesting complexity had simply migrated into the integration layer, which nobody owned and nobody tested.
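The shape of the problem is easy to sketch, even if getting it right in production is not. A toy saga with compensations (the order workflow is invented for illustration):

```python
class Saga:
    """Toy saga: run steps in order; if one fails, run the
    compensations of the already-completed steps in reverse."""

    def __init__(self):
        self._steps = []  # (action, compensation) pairs

    def step(self, action, compensation):
        self._steps.append((action, compensation))
        return self

    def run(self):
        completed = []
        for action, compensation in self._steps:
            try:
                action()
            except Exception:
                for comp in reversed(completed):
                    comp()  # best-effort rollback; can itself fail
                raise
            completed.append(compensation)

def fail(msg):
    raise RuntimeError(msg)

# Order workflow: shipping fails, so the charge and reservation are undone.
log = []
order = (Saga()
         .step(lambda: log.append("reserve stock"), lambda: log.append("release stock"))
         .step(lambda: log.append("charge card"), lambda: log.append("refund card"))
         .step(lambda: fail("shipping unavailable"), lambda: None))
try:
    order.run()
except RuntimeError:
    pass
# log is now: reserve stock, charge card, refund card, release stock
```

Even this toy surfaces the hard questions: What happens when a compensation itself fails? Who owns the workflow that spans the services? That is precisely the layer agents reason about worst.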

Sam’s line here is worth quoting directly, roughly: “To err is human, but it takes a computer to really screw things up.” I suspect we’re going to produce some genuinely legendary transaction management disasters before the field develops the muscle memory to avoid them.

The sociotechnical gap nobody is talking about

There’s a dimension to this conversation that Ford and Newman gesture toward but that I think deserves much more direct examination: the question of what happens to the humans on the other side of this generated software.

It’s not entirely accurate to say that all agentic work is happening on greenfield projects. There are tools already in production helping teams migrate legacy ERPs, modernize old codebases, and tackle the modernization challenge that has defeated conventional approaches for years. That’s real, and it matters.

But the challenge in these cases isn’t merely the code. It’s whether the sociotechnical system, the teams, the processes, the engineering culture, the organizational structures built around the existing software, is able to inherit what gets built. And here’s the thing: Even if agents combined with deterministic guardrails could produce a well-structured microservice architecture or a clean modular monolith in a fraction of the time it would take a human team, that architectural output doesn’t automatically come with organizational readiness. The system can arrive before the people are ready to own it.

One of the underappreciated aspects of iterative migration, the incremental strangler fig approach, the slow decomposition of a monolith over 18 months, isn’t primarily risk reduction, though it does that too. It’s learning. It’s the process by which a team internalizes a new way of working, makes mistakes in a bounded context, recovers, and builds the judgment that lets them operate confidently in the new world. Compress that journey too aggressively and you can end up with an architecture whose operational complexity exceeds the organizational capacity to manage it. That gap tends to be expensive.

At QCon London, I asked Patrick Debois, after a talk covering best practices for AI-assisted development, whether applying all of those practices consistently would make him comfortable working on enterprise software with real complexity. His answer was: It depends. That felt like the honest answer. The tooling is improving. Whether the humans around it are keeping pace is a separate question, and one the industry isn’t spending nearly enough time on.

Existing systems

Ford and Newman close with a subject that almost never gets covered in these conversations: the vast, unglamorous majority of software that already exists and that our society depends on in ways that are easy to underestimate.

Much of the discourse around agentic AI and software development is implicitly greenfield. It assumes you’re starting fresh, that you get to design your architecture sensibly from the beginning, that you have clean APIs and tidy service boundaries. The reality is that most valuable software in the world was written before any of this existed, runs on platforms and languages that aren’t the natural habitat of modern AI tooling, and contains decades of accumulated decisions that nobody fully understands anymore.

Sam is working on a book about this: how to adapt existing architectures to enable AI-driven functionality in ways that are actually safe. He makes the interesting point that existing systems, despite their reputation, often give you a head start. A well-structured relational schema carries implicit meaning about data ownership and referential integrity that an agent can actually reason from. There’s structure there, if you know how to read it.

The general lesson, which he states without much drama, is that you can’t just expose an existing system through an MCP server and call it done. The interface isn’t the architecture. The risks around security, data exposure, and vendor dependency don’t go away because you’ve wrapped something in a new protocol.

This matters more than it might seem, because the software that runs our financial systems, our healthcare infrastructure, our logistics and supply chains, isn’t greenfield and never will be. If we get the modernization of those systems wrong, the consequences are not abstract. They’re social. The instinct to index heavily on what these tools can do in ideal conditions, on well-specified problems with good documentation and thorough test coverage, is understandable. But it’s exactly the wrong instinct when the systems in question are the ones our lives depend on. The architectural mindset that has served us well through previous paradigm shifts, the one that starts with trade-offs rather than capabilities, that asks what we’re giving up rather than just what we’re gaining, isn’t optional here. It’s the minimum requirement for doing this responsibly.

What I take away from this

Three things, mainly.

The first is that introducing deterministic guardrails into nondeterministic systems isn’t optional. It’s imperative. We’re still figuring out exactly where and how, but the framing needs to shift: The goal is control over outcomes, not just oversight of output. There’s a difference. Output is what the agent generates. Outcome is whether the system it generates actually behaves correctly under production conditions, stays within architectural boundaries, and remains operable by the humans responsible for it. Fitness functions, capability tests, boundary definitions: the boring infrastructure that connects generated code to the real constraints of the world it runs in. We’ve had the tools to build this for years.

The second is that the people saying this is the future and the people saying this is just another hype cycle are both probably wrong in interesting ways. Ford and Newman are careful to say they don’t know what good looks like yet. Neither do I. But we have better prior art to draw on than the discourse usually acknowledges. The principles that made microservices work, when they worked, real decoupling, explicit contracts, operational ownership, apply here too. The patterns that made microservices fail, leaky abstractions, distributed transactions handled badly, complexity migrating into integration layers, will cause exactly the same failures, just faster and at larger scale.

The third is something I took away from QCon London this year, and I think it may be the most important of the three. Across two days of talks, including sessions that took diametrically opposite approaches to integrating AI into the software development lifecycle, one thing became clear: We’re all novices. Not in the dismissive sense but in the most literal application of the Dreyfus model. Nobody, regardless of experience, has figured out the right way to fit these tools within a sociotechnical system. The recipes are still being written. The war stories that will eventually become the prior art are still happening to us right now.

What got us here, collectively, was sharing what we saw, what worked, what failed, and why. That’s how the field moved from SOA disasters to microservices best practices. That’s how we built a shared vocabulary around fitness functions and evolutionary architecture. The same process has to happen again, and it will, but only if people with real experience are honest about the uncertainty rather than performing confidence they don’t have. The speed, ultimately, is both the opportunity and the danger. The technology is moving faster than the organizations, the teams, and the professional instincts that need to absorb it. The best response to that isn’t to pretend otherwise. It’s to keep comparing notes.

If this resonated, the full fireside chat between Neal Ford and Sam Newman is worth watching in its entirety. They cover more ground than I’ve had space to react to here. And if you’d like to learn more from Neal, Sam, and Luca, check out their most recent O’Reilly books: Building Resilient Distributed Systems, Architecture as Code, and Building Micro-Frontends, second edition.
