The environment is the product

Same models, opposite outcomes

You’ve probably already noticed this, even if you haven’t named it yet. Some parts of your codebase produce clean agent output with minimal review overhead. Other parts produce output that technically works but generates long review threads, quiet rework, and a growing sense that the AI is creating more work than it saves. The difference isn’t randomness. The clean parts have clear conventions, explicit boundaries, and enough documented context that an agent can make good decisions. The messy parts have knowledge that lives in people’s heads. Same models, same team, wildly different outcomes — and the variable is always the environment.

That environment needs a dedicated owner, a function we’ll be calling DAX: the Developer and Agent Experience engineer.

Week one: the audit

Your DAX engineer shows up, and the first thing they do is not install anything. They sit with the codebase and try to answer a simple question: if an agent started working here today, what would it get wrong?

They find the usual things. Conventions that live in people’s heads but aren’t written anywhere an agent can read them. An architectural boundary the team agreed on six months ago that exists only as a Slack message and a shared understanding. A test suite that works fine when a human knows which parts to run and in what order, but that no automated process can invoke end-to-end. Documentation that was accurate once and has drifted since.

The industry spent 2023 and 2024 on prompt engineering, but the discipline that actually determines output quality turned out to be context engineering: controlling what an agent knows before it starts working, not how cleverly you phrase the request. Your DAX engineer’s first job is to figure out the gap between what your team knows and what an agent can access, because that gap is where every quality problem originates. An agent encountering ambiguity doesn’t ask a colleague. It guesses. And at scale, guesses compound into a codebase that looks correct and is quietly, expensively wrong.

Month one: the wiring

By the end of the first month, the DAX engineer has started encoding the team’s actual conventions into machine-readable context: rule files, structured decision records, agent configuration that reflects how the team genuinely works rather than how a wiki page from last year says it should work. Think of this as the evolution from .editorconfig telling a linter what to flag, to a context layer telling an agent how to think about your codebase.
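One way to picture that context layer is a short, agent-readable rules file checked into the repository. The sketch below is hypothetical — the filename, module names, and ADR number are invented, not any specific tool’s format — but it shows the kind of knowledge that usually lives only in people’s heads:

```
# agent-rules.md — hypothetical shared context file, read by every agent session

## Architecture
- The billing module must never import from the web layer; all
  communication goes through the event bus (decided Sept 2024, ADR-014).

## Conventions
- New endpoints use the repository pattern in internal/repo, not raw SQL.
- Error messages shown to users come from the i18n catalogue, never
  hard-coded strings.

## Testing
- Payment-flow tests run via `make test-payments`; the full suite is
  not safe to run in parallel.
```

The point isn’t the format. It’s that each line above is something an agent would otherwise have to guess.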

They’ve also started redesigning the verification pipeline. This matters because the review bottleneck is the constraint nobody planned for. Teams that gave agents the ability to write code at high velocity discovered they’d created a new problem: thousands of lines of functioning-but-unreviewed code that couldn’t be deployed. The answer isn’t to hire more reviewers. It’s to build layers: automated checks catch mechanical issues, AI-assisted review catches structural ones, and by the time a human reviewer opens a pull request, they’re exercising judgment about design and intent rather than catching things a machine should have caught first.
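The layering above can be sketched as a simple filter chain: each layer handles what it can, and only what survives reaches the next. This is a minimal illustration with invented function names and toy checks, not a real CI integration:

```python
# Sketch of a layered verification pipeline (all names and checks illustrative).
# Each layer filters what it can so the next layer sees only what remains.

def mechanical_checks(diff: str) -> list[str]:
    """Layer 1: linters, formatters, type checks -- cheap and deterministic."""
    findings = []
    if "TODO" in diff:
        findings.append("unresolved TODO left in diff")
    return findings

def structural_review(diff: str) -> list[str]:
    """Layer 2: stand-in for AI-assisted review -- layering, naming, duplication."""
    findings = []
    if "import web" in diff and "billing/" in diff:
        findings.append("billing module imports from the web layer")
    return findings

def review_pipeline(diff: str) -> dict:
    """Run layers in order; escalate to a human only what machines can't judge."""
    mechanical = mechanical_checks(diff)
    if mechanical:
        return {"stage": "mechanical", "findings": mechanical}
    structural = structural_review(diff)
    if structural:
        return {"stage": "structural", "findings": structural}
    # Nothing mechanical or structural left: the human reviews design and intent.
    return {"stage": "human", "findings": []}
```

The design choice that matters is the ordering: the expensive reviewer (human judgment) only ever sees pull requests the cheaper layers have already passed.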

And they’ve started thinking about specs. The “vibe coding” era produced its cautionary data fast (a quarter of Y Combinator’s Winter 2025 cohort shipped codebases that were 95% AI-generated, then drowned in technical debt and security holes), and the counter-movement is spec-driven development, where the specification becomes the source of truth rather than the code. The workflow gaining traction is to write a minimal spec, have an agent interview you to sharpen it, then execute the refined spec in a clean session with full project context. The DAX engineer’s job is to make this workflow feel natural rather than bureaucratic, because if people don’t actually use it, it doesn’t matter how good it is.
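A minimal spec doesn’t need to be elaborate to do its job. The template below is a hypothetical example of the shape such a spec might take — the section names are invented, not a standard:

```
# Spec: <feature name>          (hypothetical template)

## Problem
One paragraph: what is broken or missing, and for whom.

## Constraints
- Modules and public interfaces that must not change.
- Performance budgets, security requirements.

## Out of scope
Explicitly list what the agent should NOT touch.

## Acceptance
How a reviewer verifies the result — ideally as runnable checks.
```

The interview step then exists to fill the gaps a first draft always leaves: the agent asks about the constraints and out-of-scope items the author forgot to write down.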

Month three: the compound

Here is where the role starts to pay for itself. The DAX engineer notices that one team figured out a constraint that dramatically reduced agent mistakes in a particular part of the codebase, but nobody else knows about it. They build the mechanism for that insight to propagate: not a wiki page nobody reads, but an actual change to the shared context that every agent session picks up automatically.

They notice that the senior engineers are spending most of their time reviewing agent output rather than building, and that this is a capacity problem with no obvious solution. So they invest in making the automated verification layers good enough that senior review becomes about the genuinely hard calls, not about catching issues that a well-configured CI pipeline should have prevented.

They start tracking the metrics that actually matter in this new world. Not velocity or lines of code (an agent produces a thousand lines in minutes, which tells you nothing), but things like: what fraction of agent-generated code passes review without significant rework, and how quickly a workflow improvement on one team reaches every other team. These are the numbers that tell you whether your environment is actually getting better, and most organisations aren’t tracking them because nobody’s job depends on them.
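The first of those metrics is straightforward to compute once review data is tagged. A sketch, assuming each pull request record carries flags for agent authorship and post-review rework (field names are illustrative):

```python
# Sketch: "clean-pass rate" for agent-generated pull requests.
# Assumes PR records with illustrative boolean fields.

def clean_pass_rate(pull_requests: list[dict]) -> float:
    """Fraction of agent-authored PRs merged without significant rework."""
    agent_prs = [pr for pr in pull_requests if pr["agent_authored"]]
    if not agent_prs:
        return 0.0
    clean = sum(1 for pr in agent_prs if not pr["needed_rework"])
    return clean / len(agent_prs)

prs = [
    {"agent_authored": True,  "needed_rework": False},
    {"agent_authored": True,  "needed_rework": True},
    {"agent_authored": True,  "needed_rework": False},
    {"agent_authored": False, "needed_rework": False},  # human PR, excluded
]
print(clean_pass_rate(prs))  # 2 of the 3 agent-authored PRs were clean
```

Tracked weekly, the trend line in this number is a direct readout of whether the context and verification work is paying off.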

And they’re building the feedback loop that makes the whole system compound. Every piece of encoded knowledge makes every future agent interaction slightly better. Every improvement to the verification pipeline means humans spend less time on mechanical review and more on the judgment calls that only humans can make. Every spec template that actually gets used means fewer implementations that compile beautifully and solve the wrong problem. This compounding is one of the few genuine structural advantages available right now, and it only happens when someone’s job is to make it happen.

The bet

Platform engineering builds what you deploy on. Developer experience smooths friction in your daily tools. Quality engineering focuses on testing. None of these functions was designed to answer how humans and agents should produce software together, and because the work cuts across all of them, no existing function will pick it up spontaneously.

Whether this role lives in platform, as a dedicated team, or reporting to a VP of engineering matters less than whether anyone has the mandate at all. Every engineering organisation now has access to the same models, the same tools, the same capabilities. The differentiation is entirely in the environment, and the organisations that invest in it deliberately will compound their advantage with every release while the rest discover that moving faster without moving better is just a more efficient way to accumulate debt.

The interesting question is not whether this function is needed. It’s what the organisation that gets it right looks like in two years, and how far ahead they’ll be by the time everyone else catches on.