The State of AI Deployment
AI model progress and deployment are not the same thing. And outside of the big three verticals of coding, law, and finance, the net impact of major model progressions has been muted.
The labs obviously need to rectify this imbalance between model capability and utility, and so are now jumping directly into the AI deployment game, with OpenAI creating The Deployment Company (DeployCo) and Anthropic rumored to follow suit soon.
There’s at least some suspicion that the labs are doing this for subtler reasons, including access to further training data and commercial vendor lock-in. These may all very well be true, but the biggest reason is simply this: deployment is a lot harder than anyone gave it credit for.
We are nowhere near peak deployment and, not to spoil the rest of this piece, nobody has cracked the sort of autonomous software lifecycle development patterns that are required to enable rapid model deployments in the enterprise.
As it stands, deployment is constrained by commercial and sociotechnical obstacles, and by limited knowledge on the part of buyers and even deployers about the exact capabilities of models.
We will talk through the core gaps to solve, but perhaps the current limits and scope of deployment are clearest in DeployCo’s acquisition of Tomoro.
The Current Limitations of Deployment
DeployCo announced its existence with the acquisition of Tomoro, bringing 150 Forward Deployed Engineers (FDEs) into the fold.
Tomoro consists mostly of ex-Accenture strategists and data scientists in the UK and was partially incubated with OpenAI.
Their public portfolio consists of projects like Virgin Atlantic’s travel chatbot[1], a Masters of the Universe wrapper around OpenAI’s Sora and Image models, and a meal planning/grocery shopping chat experience with Tesco, a large supermarket chain.
Importantly, these projects are not transformational. They are additive, but it’s unclear if they even contain the relevant categories of data that labs, deployment firms, and investment management firms want to use for transformation. Let’s just say Virgin Atlantic isn’t re-architecting its customer service organization due to the capabilities of this chatbot.
But they do consume tokens. And importantly, they serve as training grounds for the relevant FDEs at Tomoro around how to work in enterprise contexts.
In short, dedicated AI deployment is still so early that the wrapper projects that no longer get funding inside of vertical AI are the bread and butter of AI deployment in the enterprise today.
If there’s some latent suspicion that OpenAI and Anthropic have this motion or strategy figured out, I don’t think this is true.[2] They will figure it out, but there’s currently no dedicated advantage to the data and inference they possess. There is currently no path to deploying AI into the enterprise without bodies.
The Deployment Gap
Yet OpenAI and Anthropic feel a need to figure it out given a few key factors:
Buyer knowledge around model capabilities is unsophisticated. The translation problem from an evaluation to a tractable business insight is challenging. This is amplified by current evaluations being only pseudo-capability assessments.[3]
The sociotechnical problems necessitate an outside partner in many cases. AI as a technology is so expansive and so ruthless in its potential impact on organizations and their employees that deployment consultants ultimately inhabit the role McKinsey pioneered in management consulting: giving an outside endorsement for transformation.
Traditional vendors are not yet adapting to customer budgets and the labs are not willing to wait for inference to come online given current model capability.
And specifically for the labs, there are several more reasons.
I think deployment is actually on the path to recursive self-improvement for models. The prospect of more autonomous agent deployments that can ingest the context of an enterprise and self-direct more deployment opportunities depends on the sort of decision data that FDEs are going to emit.
The labs want as much long-horizon task and process data as possible from every industry in the pursuit of AGI.[4]
Commercial lock-in matters given how tightly correlated OpenAI and Anthropic’s model performances are to date. The cost of compute and data to drive the next batch of models depends on hundreds of billions in revenues and the enterprise is the only viable path to these sorts of model revenues from here.
The labs aren’t the only players, of course. Beyond the labs, there are plenty of organizations offering deployment services, including Distyl, BrainCo, Ciridae, and more.
We are nowhere near peak model performance, nowhere near peak deployment demand, and nowhere near peak dollars that will be extracted from deployments.
Again, the reason is simple: right now AI has barely registered on the P&L or product offerings of industries. If and when certain companies engage in cybernetic transformations that give them a decisive advantage in their industry, deployment at any cost will ensue.
The quip today is that the economy is now a big data generation machine for models. The quip soon will be that the economy is one big implementation engine for the geniuses in the data centers.
When the economy fully internalizes the potential of current-generation models, the forthcoming potential of physical AI (robotics), and the talent barrier around deployment, FDEs will number in the millions.
The Near Future
Agents are improving at a staggering rate, especially in coding lines of work. Already today, FDEs rarely write code by hand and function as conduits for Codex and Claude to translate user interviews into code.
As of now, no company has truly cracked autonomous lifecycle development for deployment. Simply put, it’s still far more effective to run a human-heavy deployment business than to spend time and energy on the devops and process-mining layers that would extricate FDEs further from the transformation process and yield higher-margin deployment businesses.[5]
This is obviously forthcoming, and I’d expect many organizations to begin publicly indicating that this is what they are aiming for. Beyond the margins, the second anyone has assembled a plausible path here, a lab will acquire them for one reason: deployment lifecycles like this will have some of the longest-horizon reasoning traces for coding agents. We don’t have enough of these today and, as training data, there’s a colorable argument that the datasets here are worth more than Cursor’s dataset.
As coding agents continue to progress, the forthcoming work will look even more esoteric, involving scrutinizing org charts, mining every process inside of an organization, assembling bespoke evaluations for every business unit, and charting the progression of companies towards AI-enabled futures.[6]
The Deployment Era will quickly bleed into the Advisory Era. The best deployment companies will have autonomous lifecycle development systems, people ops and strategy orgs, and will ultimately act as kingmakers in industries. Companies will quickly be valued in part by their consortium of partners that can execute transformations and chart the path to increased multiples, better products and services, and more durable businesses in an era marked by constant change.
I’d expect a certain portion of these deployment/advisory companies to create a halo effect for the companies that they work with. The expected value from an engagement will be so high that the client’s stock price will go up dramatically as a result.[7]
Open Questions
Will the most successful deployment companies be fully associated with one lab? Or will they instead be independent and capable of directing tokens and compute budgets fully in the best interest of the company?
Ultimately, I believe that the labs have looked upon the current crop of systems integrators, software providers, and more and found it all wanting. Reliable partners for research goals around model feedback loops and data accumulation, and for driving inference inside of the broader economy, are few and far between.
That said, implicit in the Deployment Era game is a new mandate. Enterprises and the mid-market need help. You want to facilitate tokens, intercept reasoning traces for your own model iterations, and ensure you are critical to the future of your customer? You better have a deployment plan.
It is non-obvious that the labs are the best deployment partners. They certainly can be if one gets a sizable model advantage and access is gated behind exclusive deployment contracts. But this seems implausible. We are likely 2 years away from state-of-the-art models and harnesses being sufficient for any particular workflow in knowledge work. And already small models are a plausible path towards cost-effective work takeover.
And so it would seem to me that despite the billions already poured into deployment (most of which is admittedly concentrated in DeployCo), there is no clear frontrunner to win the market. In fact, market structure may preclude any single winner altogether.
As a result, I’m extremely bullish on companies that are organizing themselves as autonomous lifecycle companies focused on deployment, with differentiated technology that helps FDEs rapidly improve the enterprise and a huge focus on training data and capability evaluation.
If you’re building this, I’d love to talk.
Systems of Record Strike Back?
But there is one more group that could naturally slide in here and actually begin to take market share and compound their existing advantages. I’m of course talking about industry software companies.
Again, the only reason the labs can theoretically build deployment companies in your vertical is that the existing industry players don’t have one. Systems of record have the data, have the relationships, and certainly have the need to transform the company into an AI-enabled one. As a result, building out these organizations is simply mandatory.
The net impact will certainly be better products, proprietary models and research, and evals that nobody else in the market has access to.
To sum it all up, the current state of deployment is nascent, relatively manual, and not R&D driven. The future state will be technologically based, autonomous with human intervention, and advisory in nature.
Whichever players occupy the gap and productize the forthcoming transformation imperatives will become $50B organizations.
Upcoming Pieces:
Long Lake’s acquisition of Amex GBT (aiming for Friday)
Physical AI infrastructure
Sim2Real, Robotics Data, and More
[1] Still in beta with zero tool calling, no booking workflows, and extremely limited utility as a CX agent.
[2] And I’ve heard nothing privately to convince me of the contrary.
[3] There are only 4-5 verticals with anything approaching rigorous evals: biology, healthcare, law, finance, and coding. And even here, in a deployment context, public evals are only moderately helpful for gauging implementation strategies or, for investors, AI underwriting strategies like the ones that Long Lake performs.
[4] This is also why it’s borderline insane to partner directly with a lab for deployment strategies even if there is no clear line of sight today to the labs competing vertically. Don’t let the fox in the henhouse.
[5] Similar to human data companies, there are very few existing organizations modeling themselves in this way; most are instead body shop operations.
[6] There’s a very specific set of data desired here to inform these decisions. You want organizational charts, software systems transaction/process mining, work product monitoring, and evaluation layers to audit model progress on work.
That starts to look like a somewhat wacky purpose-built product and software layer comprising Celonis x Workforce Management x Braintrust x RDM x Model Forecasting. But the payoff will ultimately be the ability to drop into any company and start drastically improving its operations.
[7] The Reverse Anthropic Announcement. Long Lake, with its capabilities, is the closest to this idea and is of course reaping the full equity upside of the transformation.


