Two Ideas on AI
RL and revenue
Good afternoon and Happy Thanksgiving,
Let’s get to it, shall we?
Today, an overview of ideas in AI I’m thinking through in relation to vertical AI.
RL partnerships could soon be a defining strategy for vertical SaaS companies
Vertical SaaS companies are some of the largest beneficiaries of reinforcement learning (RL) becoming the state-of-the-art approach for applied agents.
There’s a whole litany of companies and strategies across the stack here, but generally everyone agrees on a couple of core ideas:
Agents need to train in realistic environments with rewards and verifiable signals for their actions.
Regardless of whether RL fully generalizes, it presents a cost-effective path to vertical or function-specific agents. The labs are betting that RL does somewhat generalize, but even in the worst-case scenario they end up with highly adept agents for particular knowledge work functions like financial analysis, healthcare work, etc.
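For the more technically inclined, here is a minimal sketch of what "a realistic environment with a verifiable signal" could look like for one vertical workflow. Everything in it is a hypothetical illustration: the invoice-coding task, the `Invoice` and `InvoiceCodingEnv` names, and the 0/1 reward are my assumptions, not any lab's or vendor's actual interface.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Invoice:
    vendor: str
    amount_cents: int
    # Hypothetical ground truth, e.g. a ledger code confirmed by an expert reviewer.
    correct_gl_code: str


class InvoiceCodingEnv:
    """Toy single-step environment: the agent assigns a ledger code to an invoice."""

    def __init__(self, invoices, seed=0):
        self._invoices = list(invoices)
        self._rng = random.Random(seed)
        self._current = None

    def reset(self):
        # Sample one task; the observation hides the ground-truth label.
        self._current = self._rng.choice(self._invoices)
        return {
            "vendor": self._current.vendor,
            "amount_cents": self._current.amount_cents,
        }

    def step(self, action):
        if self._current is None:
            raise RuntimeError("call reset() before step()")
        # Verifiable reward: exact match against the expert-confirmed code.
        reward = 1.0 if action == self._current.correct_gl_code else 0.0
        self._current = None  # each episode is a single decision
        return reward


def evaluate(env, policy, episodes):
    """Average reward of a policy (a function from observation to action)."""
    total = 0.0
    for _ in range(episodes):
        obs = env.reset()
        total += env.step(policy(obs))
    return total / episodes


# Example usage with made-up data and a simple lookup policy.
invoices = [
    Invoice("Acme Freight", 12500, "6100"),
    Invoice("City Power", 48000, "6400"),
]
env = InvoiceCodingEnv(invoices, seed=1)
known_codes = {"Acme Freight": "6100", "City Power": "6400"}
score = evaluate(env, lambda obs: known_codes[obs["vendor"]], episodes=20)
```

The point of the sketch is the shape, not the task: the software supplies the tasks, the expert-confirmed records supply the check, and the reward is something a machine can verify without a human grading every run.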
Enterprises are also placing bets here due to the reduced cost of RL training and the corresponding impact on their operations, often favoring partnerships with RLaaS vendors to train models on their unique operations.
SMBs are never going to stand up reinforcement learning on their business operations. They’re going to simply rely upon vertical vendors to perform this necessary function for them.
Likewise, it’s borderline uneconomical for the labs to build RL environments and recruit the experts for labeling data in every vertical.
In my view, these two insights position vertical SaaS companies as the RL power brokers in their verticals. They benefit from foundation models needing RL environments, which could lead to partnerships between labs and vertical SaaS vendors. And they are the distribution mechanism for RL-trained models specific to their vertical.
Which means the real leverage sits with the systems of record that already intermediate these workflows.
The incentives of systems of record are more aligned with the labs than with most vertical AI startups. Systems of record benefit when their software becomes a core part of the benchmark, and when agents are trained to treat their application as the default surface for getting work done.
They have already assembled the distribution, and more importantly, the expert networks who actually know what “good” looks like in that vertical. Those same experts are the ones who can label and define proper evals.
Since state-of-the-art agents are still suboptimal for nearly every industry’s operations, whoever first wires their workflows into repeatable RL loops will have real power. If your software is the environment, then your customer base becomes the training distribution. That is a very different position than simply “adding an AI copilot.”
I do not see any indicators that systems of record are under siege from foundation model providers. The pressure is hitting the vertical agent companies instead. Labs are actively searching for ways to produce evals and agents that can properly navigate traditional web applications and existing SaaS. That points them toward platforms that already have stable usage and structured workflows.
The cost of RL is nowhere near the cost of pre-training. With potential subsidies from labs in exchange for access to RL environments, the economics can become even more attractive. You can imagine co-funded programs where the lab underwrites runs and the SaaS vendor provides stable environments plus expert reviewers.
Even so, while RL training costs are lower than pre-training, human data labeling costs are climbing, and there is every reason to believe this remains a bottleneck for model progress.
Vertical SaaS companies may be well positioned to source human experts at lower cost: they have existing distribution and real insight into which users take which actions inside their software. And of course they’ll be getting paid handsomely for enabling these sorts of expert marketplaces.
In short, the same insight that drove embedded initiatives inside vertical SaaS may end up driving the frontier of RL environments, training, and RL runs. If there is a lower-cost way to procure the ingredients that agents require to improve, and if those ingredients already sit inside systems of record, it becomes rational to treat RL partnerships as both an R&D priority and a revenue line.
There are clear paths to revenue, both from releasing the resulting agents to customers and from the training itself.
Of course, this hinges on having the machine learning expertise and ability to recruit human annotators for the required vertical functions.
But doesn’t this start to look like other embedded opportunities? Vertical SaaS companies didn’t want to become payments, lending, or banking experts. They instead chose to partner with preferred experts across the stack to expose these products to their customers. I wonder if a similar insight will drive RL partnership initiatives among vertical SaaS companies to deliver RL environments, expert networks, and more.
In fact, I think there are clear reasons why this will end up being a company.
Labs don’t want to have to procure environments from tons of individual vendors. They want their procurement costs to be low, want little variability in training across RL environments, and likewise want Mercor-like simplicity in recruiting experts and knowing the end data that they’re getting back.
For vertical SaaS, the majority do not want to build a research group in-house. They want to participate (greatly) in the upside from successful RL runs, but incur as little opex as possible.
All of these look very similar to the reasons why embedded partnerships took off across financial products inside vertical SaaS. We may be entering an era where the same holds true for frontier AI.
If you know any RL engineers exploring this, I would love to chat with them.
As an aside, I am thinking about hosting an “RL for vertical SaaS” meetup in San Francisco. If you are interested, let me know.
Systems of Action are judged by revenue growth, not cost optimization
The winners inside vertical AI so far are primarily centered around revenue uplift.
This will likely continue for the next few years, at minimum.
There is no need to make this overly elaborate. If AI usage still requires a human in the loop, and it clearly does, then the most successful AI implementations will enable revenue growth through more efficient use of human labor.
This logic has underwritten the success of EvenUp and related companies that convert labor bottlenecks around revenue into far more digestible services and automations.
AI-enabled services are thus far more compelling narratives and more robust companies than AI-enabled rollups. Rollups have mostly bought existing distribution and then concentrated on opex reduction. AI-enabled services have focused on reimagining the service itself and, when successful, grow organically alongside their end customers’ growth.
A couple examples:
Crosby AI lets customers move faster on MSAs, NDAs, and contract reviews, all bottlenecks inside of a deal.
Owner for restaurants is totally focused on the revenue impact from their AI and software.
And lastly, Endeavor focuses almost exclusively on the revenue-generating functions for manufacturers, leading to more revenue and faster revenue recognition through sales order automation.
In short, the leading AI companies are synonymous with revenue growth for their clients. The more intimately tied an AI product or service is to revenue growth, the more likely it is to find incredible product-market fit.




