Coding Agents and Developer Economics on the Microsoft Stack (2026)
Coding agents are normal tool evolution, not rupture. The Microsoft business-apps read: execution compresses, but judgment and accountability do not.
Every few years a tool arrives that is supposed to end programming as a paid profession. FORTRAN was going to let scientists write their own code. COBOL was going to let managers read it. SQL was going to let business users query the database without a developer in the room. Each prediction was reasonable. Each was wrong in the same direction: demand for skilled builders went up, not down.
Brad DeLong’s essay “Coding agents as a continuation of normal software tool evolution” puts AI coding agents in that lineage. His argument is not that agents are unimpressive. It is that they are normal. They compress the part of software work that was already getting compressed, the execution, while leaving the parts that resisted automation for seventy years exactly where they were: deciding what to build, verifying that it works, and holding the contextual understanding that makes both possible.
That framing matters more in the Microsoft business-apps world than almost anywhere else, because this is the stack where “anyone can build it” has been the marketing promise the longest. Copilot Studio, low-code Power Platform, makers shipping agents without writing a line of C#: the execute phase here is already close to free. So the question DeLong forces is the useful one. If building is no longer the constraint, what is the scarce skill an enterprise actually pays for in 2026?
The headcount data is the whole argument
DeLong’s strongest move is to put numbers on a thing people assert from intuition. Software tooling has gotten dramatically more productive over ninety years, and the population of people doing software work did not shrink. It multiplied.
He traces the count from 1935 to 2025:
| Year | Role | Approximate headcount |
|---|---|---|
| 1935 | Calculator-tabulator machine wirers | about 2,000 |
| 1965 | Coders | about 80,000 |
| 1995 | Programmers | about 500,000 |
| 2025 | Software developers plus programmers | about 2.25 million developers plus 250,000 programmers |
The growth is orders of magnitude: from about 2,000 people in 1935 to about 2.5 million in 2025, more than a thousandfold. Even the conservative recent-window read, the thirty years from 1995 to 2025, is a fivefold rise, from about 500,000 to about 2.5 million. The assembler did not end the coder. The compiler did not end the assembler programmer. The high-level language did not end the compiler programmer. Each tool moved the work up a level and the number of people doing it grew.
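The multiples in that paragraph can be checked directly from the table. Here is a minimal sketch in Python, using only the approximate figures quoted above. The compound annual rate is an added convenience that assumes smooth growth between two endpoints, which the source does not claim.

```python
# Approximate headcounts as quoted in the table above.
headcount = {
    1935: 2_000,
    1965: 80_000,
    1995: 500_000,
    2025: 2_250_000 + 250_000,  # developers plus programmers
}


def growth_multiple(start_year: int, end_year: int) -> float:
    """How many times larger the end-year headcount is than the start-year one."""
    return headcount[end_year] / headcount[start_year]


def annual_growth_rate(start_year: int, end_year: int) -> float:
    """Compound annual growth rate implied by the two endpoints (assumes smooth growth)."""
    years = end_year - start_year
    return growth_multiple(start_year, end_year) ** (1 / years) - 1


print(f"1995-2025: {growth_multiple(1995, 2025):.0f}x")  # 5x
print(f"1935-2025: {growth_multiple(1935, 2025):.0f}x")  # 1250x
print(f"1995-2025 implied annual rate: {annual_growth_rate(1995, 2025):.1%}")
```

The fivefold figure for 1995 to 2025 counts developers and programmers together. Counting developers alone against the 1995 programmer figure would give 4.5x.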
This is the Jevons-paradox shape, and it is the part of the essay most worth internalizing if your career is on this stack. When a productive input gets cheaper, you do not automatically consume less of it. Often you consume far more, because uses that were previously too expensive to justify suddenly clear the bar. Cheaper software production did not mean less software. It meant software went into places no one would have funded a custom build for in 1995.
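A stylized way to see when cheaper production means more total work rather than less, assuming constant-elasticity demand (a textbook simplification, not something DeLong's essay specifies): let $p$ be the cost of producing a unit of software, $Q$ the quantity demanded, $S$ the total spend on producing it, and $\varepsilon$ the price elasticity of demand.

$$
Q(p) = A\,p^{-\varepsilon}, \qquad S(p) = p\,Q(p) = A\,p^{1-\varepsilon}, \qquad \frac{dS}{dp} = (1-\varepsilon)\,A\,p^{-\varepsilon}
$$

When $\varepsilon > 1$ the derivative is negative, so a fall in $p$ raises total spend $S$. That is the Jevons case: demand is elastic enough that the newly affordable uses outweigh the savings on the old ones. When $\varepsilon < 1$ the opposite holds, which is why the historical pattern is evidence about software demand and not a general law.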
The Microsoft business-apps stack is living through exactly this right now. When a department can stand up a Copilot Studio agent in an afternoon, the answer is not fewer agents. It is more agents, in more corners of the business, owned by more people, touching more data. Every one of those agents is a decision someone made and an outcome someone is accountable for. That work does not compress. It multiplies along with the agents.
What actually compresses, and what does not
DeLong’s clean line is that writing code was never the bottleneck. The constraints were always upstream and downstream of the keyboard: deciding what to build, verifying that the deliverable is correct, and maintaining the deep contextual understanding that lets you do either one well. Agents are very good at the middle. They are not good at the ends, and there is no strong reason to expect that to change soon.
It helps to name the three phases plainly and ask which one an AI agent genuinely takes off your plate.
| Phase | What it is | Who owns it after agents arrive |
|---|---|---|
| Decide | Choosing what to build, for whom, against which constraint. Picking the agent, the data it sees, the boundary it must not cross. | Human-led. The agent can surface options, but it has no stake and no accountability for the choice. |
| Execute | Producing the artifact: the flow, the plugin, the prompt, the C# orchestration, the connector wiring. | Increasingly the agent, with a human in review. This is the phase that compresses. |
| Deliver | Verifying the result is correct, shipping it into production, and owning it on-call when it misbehaves. | Human-led. The agent can draft tests and checks, but verification and accountability do not transfer to a tool that cannot be held responsible. |
On the Microsoft stack the three phases are concrete, not abstract. Decide is the architect choosing whether a problem wants a Copilot Studio agent, a Foundry-hosted container, or no agent at all. Execute is the build itself, and this is where Copilot, agents, and low-code tooling have made enormous progress. Deliver is the part that keeps people honest: who signs off that a Copilot Studio agent gives correct answers about pricing or eligibility, and who gets paged when it gives a wrong one to a customer at 2am.
That last question is the one the technology cannot answer. An agent can draft the plugin. It cannot be the name in the incident channel. Accountability is not a capability you can add to a model. It is a property of a person inside an organization, and it is precisely what an enterprise is buying when it hires a senior architect rather than renting more compute.
The job changes shape, exactly as it always has
DeLong’s other useful observation is that the profession does not vanish under a new tool. It transforms. He contrasts the 1995 programmer, who translated specifications into code while managing memory and databases by hand, with the 2025 developer, who orchestrates tools and services across distributed systems and owns the result from design through deployment to on-call pager duty. Same profession, different center of gravity. The hand-management of low-level mechanics fell away. The scope of ownership expanded.
The Microsoft business-apps practitioner has lived a smaller version of this transition more than once. The person who hand-wrote plugin registration steps and FetchXML from memory in 2012 is, if they kept current, the person designing solution-aware ALM, environment strategy, and governance guardrails in 2026. The low-level mechanics got abstracted. The ownership got bigger. AI agents are the next turn of that same wheel, not a different machine.
| Then | Now |
|---|---|
| Hand-wrote the plugin, the flow, the query. | Specs the behavior, then reviews and verifies what the agent drafts. |
| Owned a feature inside an application. | Owns an agent's outcomes from intake through retirement, including its on-call. |
| Effort concentrated on typing the code. | Effort concentrates on deciding what is worth building and proving it is correct. |
| Scarce skill: knowing the API surface cold. | Scarce skill: judgment, verification discipline, and accountability. |
None of this is a downgrade for the practitioner. It is the opposite. The work that compresses is the work that was always the least differentiated. The work that remains is the work that was always the reason a skilled person was in the room.
What should you hire a Microsoft AI architect for in 2026?
Hire for the work that does not compress: decision quality (judging which agents should exist and what they must not touch), verification discipline (proving an agent’s output before trusting it), and accountability ownership (being the name on it in production). Build speed is the one axis a tool already wins, so it is the wrong axis to hire on.
This is the part that matters for anyone making or seeking a senior hire on this stack. If execution compresses and judgment plus accountability stay scarce, then hiring on build speed is hiring on the wrong axis. The candidate who can produce a working flow fastest is competing with a tool that is getting faster every quarter. The candidate who can decide which agent should exist, prove it behaves, and own it when it does not is competing with no tool at all.
Three things are worth paying for, in order.
Decision quality. The ability to look at a business problem and correctly judge whether it wants an agent, what that agent should and should not touch, and what the failure modes are before a line is built. It costs almost nothing to get this decision wrong at the outset and a great deal to discover the mistake late. It does not show up in a coding test.
Verification discipline. The instinct to treat an agent’s output as a draft to be proven, not a result to be trusted. On this stack that means eval datasets, deterministic gates, governance checks, and refusing to ship a Copilot Studio answer to customers just because it looked right in three manual tries. The more the build compresses, the more this matters, because the volume of things to verify goes up while the cost of producing them goes down.
On this stack that discipline takes a concrete form. As an illustration, a Copilot Studio agent that answers pricing-eligibility questions gets an eval set and a gate before it ever faces a customer:
# Eval set for a Copilot Studio pricing-eligibility agent (excerpt, illustrative)
Q: Are Claude models on Azure covered by Founders Hub credits? -> Expected: No
Q: Does a sponsored subscription with a card on file get charged? -> Expected: Yes
Q: Which regions deploy Claude on Foundry today? -> Expected: East US 2, Sweden Central
# Gate: a Power Automate test flow runs this set on every solution import;
# a pass-rate below 100% fails the pipeline and the solution does not promote.
# Owner of record: a named architect. Audit: Managed Environment, DLP and log retention on.
The agent can draft that eval set. It cannot decide the threshold, hold the gate, or be the name on the incident. Those are the parts that do not compress.
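The gate described in the excerpt can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how Power Automate or Copilot Studio implement it: `EvalCase`, `run_gate`, and `ask_agent` are hypothetical names, the agent is a stub callable, and answers are compared by normalized exact match, which a real eval would likely replace with something more tolerant of free-text replies.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class EvalCase:
    question: str
    expected: str


def normalize(text: str) -> str:
    """Comparison key that ignores case and extra whitespace."""
    return " ".join(text.lower().split())


def run_gate(
    cases: List[EvalCase],
    ask_agent: Callable[[str], str],  # hypothetical stand-in for the deployed agent
    threshold: float = 1.0,           # chosen by the owner of record, not by the code
) -> Tuple[bool, float, List[EvalCase]]:
    """Run every case; return (promote, pass_rate, failing cases)."""
    if not cases:
        raise ValueError("an empty eval set cannot pass the gate")
    failures = [
        case for case in cases
        if normalize(ask_agent(case.question)) != normalize(case.expected)
    ]
    pass_rate = (len(cases) - len(failures)) / len(cases)
    return pass_rate >= threshold, pass_rate, failures


cases = [
    EvalCase("Are Claude models on Azure covered by Founders Hub credits?", "No"),
    EvalCase("Does a sponsored subscription with a card on file get charged?", "Yes"),
]


def stub_agent(question: str) -> str:
    # Hypothetical agent that always answers "No", so the second case fails.
    return "No"


promote, pass_rate, failures = run_gate(cases, stub_agent)
print(promote, pass_rate, [case.question for case in failures])
```

The threshold stays a parameter on purpose. The code enforces whatever number a named owner chose; it does not choose it.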
Accountability ownership. The willingness and the standing to be the name on the agent in production: to define its success metrics, hold its release gate, and lead the response when it misfires. This is the scarcest of the three because it cannot be automated, outsourced, or faked, and because most organizations have not yet built the role that holds it.
A useful interview reframe falls out of this directly. Stop asking a senior candidate to build the thing faster. Ask them to tell you what should not be built, how they would prove the thing is correct once it exists, and who they think should own it on-call. The answers separate someone who can drive a tool from someone you can hand a production estate to.
Where the analogy has edges
The continuity argument is strong, and it is worth holding honestly rather than as a comfort blanket. DeLong’s own framing (agents as cranes or photolithography, automating heavy execution while humans keep supervisory control) is a claim about supervised execution, not autonomous judgment. It holds while a human stays in the loop on the decide and deliver ends. The open question for the Microsoft stack is how disciplined enterprises will be about keeping that human there as the agents multiply and the temptation to let them self-approve grows.
That is not a refutation of the thesis. It is the condition the thesis runs on. The headcount grew across ninety years because skilled judgment and accountability stayed essential at every tool transition. They stay essential through this one only if organizations choose to keep them in the loop. The architects who make that case, and who can fill the decide and deliver roles themselves, are the ones the data says will be in more demand, not less.
DeLong’s bottom line, translated for this stack: the agent is the crane. Someone still has to decide where the building goes, sign off that it is safe to occupy, and answer the phone when something cracks. That someone is the hire. It always was.
What this does not mean
The continuity read is optimistic, not a guarantee. Hold the edges honestly.
- Aggregate growth is consistent with individuals being squeezed out of the execute-only tier. The category grows; a given job is not safe by default.
- Agents do draft evals, tests, and options. The human owns the gate, not the absence of agent involvement.
- Cheaper execution is not free quality. Review burden rises with agent volume; it does not fall.
- The historical pattern is correlational, not a law. Ninety years of growth does not guarantee the next ten.
- The headcount-and-continuity read is DeLong’s. The hire-for-decide-and-deliver prescription is mine, an extension of his data rather than his claim.
Read Next
- Foundry Hosted vs In-Process vs Copilot Studio Agents (2026 Decision). The decision framework for which build path an agent actually wants, picked by who builds and who runs it.
- AI Copilots vs Custom Azure Build: The Build-Buy Decision. The upstream decide question this article argues is the scarce skill, worked through for the buy-vs-build choice.
- Source: Brad DeLong, “Coding agents as a continuation of normal software tool evolution”. The essay this piece responds to.