“All teams will henceforth expose their data and functionality to LLMs, and anyone who doesn't do this will be fired.” - Jeff Bezos (theoretically)
Jeff Bezos famously issued a mandate like this for web services in the early days of AWS, and an updated version for the LLM era seems likely.
When building for developers, this discipline made AWS services “externalizable by default,” propelled Amazon’s platform strategy, and helped cement microservices as modern dogma.
When building for LLMs, this discipline means having the right context from a variety of systems easily accessible. Although they are incredibly powerful, LLMs are only as intelligent as the context they have.
At Theory Ventures, we’re investors, but we’re also all builders, creating internal software to move fast. Our goal is to answer investor questions about any company in seconds, from “What’s new since the last call?” to “Show revenue by cohort” to “Summarize the last three meetings.”
As a simple example, let’s consider an investor writing a company memo to share internally; information about the company comes from several different sources:
Remembering and managing all of these different data sources and copying them is a lot of work, but what if all that context could be available to the investor’s LLM just by mentioning the company name?
Model Context Protocol (MCP) has emerged as a simple and robust way to give LLMs the right context. MCP is a standardized protocol that empowers LLMs to connect with data sources, tools, and workflows.
A well-designed MCP enables users and agents to complete their work wherever that work is happening: in chat or in a container. MCP is intentionally boring in all the right places: predictable schemas, explicit descriptions, and a clean contract between the system and the model.
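To make that contract concrete, here is a simplified sketch of an MCP-style tool registry — not the real MCP wire protocol, which runs over JSON-RPC, but an illustration of the shape: every tool pairs a handler with a name, an explicit description, and an input schema the model can reason about (all names and schemas here are illustrative):

```python
# Simplified sketch of an MCP-style tool contract (not the real wire protocol).
# Each tool exposes a name, a model-readable description, and an input schema.

TOOLS = {
    "company_context": {
        "description": "Return structured context for a company by name or domain.",
        "input_schema": {
            "type": "object",
            "properties": {"company": {"type": "string"}},
            "required": ["company"],
        },
        # Placeholder handler; the real one would orchestrate internal services.
        "handler": lambda args: {"company": args["company"], "notes": []},
    }
}

def list_tools():
    """What a client sees when it asks the server for its tools."""
    return [
        {"name": name, "description": t["description"], "input_schema": t["input_schema"]}
        for name, t in TOOLS.items()
    ]

def call_tool(name, arguments):
    """Check the request against the contract, then dispatch."""
    tool = TOOLS[name]
    for field in tool["input_schema"]["required"]:
        if field not in arguments:
            raise ValueError(f"missing required argument: {field}")
    return tool["handler"](arguments)
```

A real server would validate against the full JSON Schema and speak JSON-RPC; the point is that the model only ever sees the tool list and a single call interface.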
We deploy an MCP server on FastMCP Cloud so the LLM client can call it from anywhere without custom infrastructure.
LLMs don’t magically know our world. They need:
So we exposed one MCP tool that the model can reason about: company_context. Given a company name or domain, it returns a structured summary with IDs, core metadata, notes, historical activity, and (when applicable) financials. Internally, this tool orchestrates multiple services, but the LLM only sees a single, well‑documented interface.
What the tool returns at a high level:
Here’s the company_context tool’s internal orchestration:
# Pseudocode – orchestrating a company context query
def get_company_context(company_name: str):
    # Resolve the name or domain to our canonical company ID
    company_id = get_company(company_name)
    core = fetch_core_company_data(company_id)
    notes = fetch_notion_pages(company_id)
    history = fetch_historical_data(company_id)
    financials = None
    # Financials are only available for portfolio companies
    if core.get("is_portfolio_company"):
        financials = fetch_financials(company_id)
    return serialize_company_data(core, notes, history, financials)
The tool returns a compact, documented schema so the model knows what each field means and how to consume it. For example:
{
  "name": {
    "value": "Company A",
    "description": "Public-facing name of the company"
  },
  "id": {
    "value": "123444ee-e7f4-4c9f-9a4e-e018eae944d6",
    "description": "Canonical company UUID in our system"
  },
  "domains": {
    "value": ["example.com", "example.ai"],
    "description": "Known web domains used by the company"
  },
  "notion_pages": {
    "value": [
      {"page_id": "abcd1234", "title": "Intro & thesis", "last_edited": "2025-07-28"}
    ],
    "description": "Notion pages with analyst/investor notes"
  },
  "is_portfolio_company": {
    "value": true,
    "description": "Whether the company is in our portfolio"
  }
}
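A payload like this is cheap to produce. As a minimal sketch (field names and descriptions are illustrative, not our production code), a serializer can pair every value with its documented meaning so the schema describes itself:

```python
# Sketch of a serializer that wraps each field in {"value", "description"}
# so the payload documents itself. Field names here are illustrative.

FIELD_DESCRIPTIONS = {
    "name": "Public-facing name of the company",
    "id": "Canonical company UUID in our system",
    "domains": "Known web domains used by the company",
    "is_portfolio_company": "Whether the company is in our portfolio",
}

def describe(fields: dict) -> dict:
    """Pair every value with its documented meaning."""
    return {
        key: {"value": value, "description": FIELD_DESCRIPTIONS.get(key, "")}
        for key, value in fields.items()
    }

payload = describe({
    "name": "Company A",
    "domains": ["example.com", "example.ai"],
    "is_portfolio_company": True,
})
```

Because the descriptions travel with the data, the model never has to guess what a field means.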
This isn’t fancy agent pixie dust; it’s just clear contracts that give the model access to the context it needs without human input.

If you remember only one thing, make it this: Expose your service’s core functionality as MCP tools and make them excellent. That’s the shortest path to truly AI‑native software. It’s the clearest mandate for the next decade.
Imagine dropping Einstein into a back-office job at a random Fortune 500 company. Despite his genius, if he didn’t know what the company does or how the role works, he wouldn’t be much help.
AI systems are rapidly improving at work tasks, like summarizing notes, writing queries, and updating slides. But they suffer from the same challenge: knowing how to do work is very different from actually working at a company.
As our models continue to get smarter, how do we get them to be better at doing real jobs?
Building AI automation for enterprises is so complicated because every company operates differently. Even two businesses in the same industry can have distinct processes, systems, and decision-making.
No matter how smart foundation models get, there is no way for them to address this. It’s not an intelligence issue; it requires knowledge of companies’ internal operations, which are proprietary, idiosyncratic, and often undocumented.
So how can we make AI automations work? We need some way to:
Understand how an enterprise works,
Deliver that knowledge to an AI system, and
Maintain and update that knowledge over time.
We believe this will create the first major new system of record in years: a Business Context Layer.
Today, a new customer support rep might be told to read a 100-page Standard Operating Procedure (SOP) during onboarding. The SOP includes instructions on how to run processes and handle exceptions: If a customer wants to change account information, always ask for verification. If they ask for a refund, consult these policy rules.
In most companies, these documents are incomplete, outdated, and even contradictory. This leads to teams building tribal knowledge on their own and following processes inconsistently.
When there are millions of AI agents performing complex tasks across the enterprise, we will need something much better: a living system of record that documents all of the written and unwritten rules for how a company operates, and delivers the right instructions to AI systems as they do work. We call this a Business Context Layer or BCL.
We think the key components of this platform are:
Automated context extraction/synthesis from operational data: Today’s SOPs are created manually via painstaking process mining, but a future BCL will need to do this largely automatically. This is a complex problem: a system needs to observe human activity (e.g. logs, tickets, chats, or screen recordings), infer a set of rules that describes the behavior, test those rules on real-world data, then iterate on them until they are as accurate as possible.
A retrieval system to deliver the right context to AI agents: At enterprise scale, a document describing all of the business rules would be far too large to pass the whole thing to every AI agent: it would be prohibitively slow/expensive, and you would likely suffer decreased accuracy due to context rot. An enterprise-grade BCL will need to (1) index/store this data efficiently, (2) find and deliver the context required to complete a given task, and (3) track what context was used for which queries to inform future improvements.
An excellent interface for domain experts to maintain and update context: The BCL must be constantly updated and improved as processes change. Maintaining this knowledge base will become a primary job for humans. This is a complex product to build – it’s got elements of a source control platform like GitHub, experimentation like Statsig, and user-friendly collaboration like Figma. Product & UX will be a major differentiator in the space.
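To make the retrieval component concrete, here is a toy sketch, assuming a rule store where each rule is tagged by the processes it governs (the rules, tags, and structure are hypothetical):

```python
# Toy sketch of BCL retrieval: serve an agent only the rules relevant to its
# task, rather than the entire rule book. Rules and tags are hypothetical.

RULES = [
    {"text": "Always verify identity before changing account information.",
     "tags": {"account_change", "verification"}},
    {"text": "Refunds over $500 require manager approval.",
     "tags": {"refund"}},
    {"text": "Escalate access issues to the Customer Verification team.",
     "tags": {"account_access", "verification"}},
]

def retrieve_context(task_tags: set, rules=RULES) -> list:
    """Return the text of every rule whose tags overlap the task's tags."""
    return [r["text"] for r in rules if r["tags"] & task_tags]

# An agent handling a refund gets only the refund rule, not all three.
context = retrieve_context({"refund"})
```

A production system would use embeddings or a real index instead of tag overlap, and would log which rules were served for which tasks to inform future improvements.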
In parallel to this emerging context layer, many startups are building “digital twins” of enterprise software and systems. You can let millions of AI agents loose in these simulated environments, provide a goal (e.g., “resolve these support inquiries”), and they will learn how to make business decisions and operate tools via Reinforcement Learning (RL). We think this technique is powerful, but it solves a different problem than a BCL.
Any model fine-tuning comes with trade-offs. You need the expertise and capacity to run training jobs. You need to continuously evaluate and backtest models, because fine-tuning can impact them in unexpected ways. When new models come out, you have to do the whole thing over. RL has all these challenges and more: it is notoriously difficult/unstable to train and very hard to design the appropriate scoring/reward functions, often resulting in unexpected behaviors.
For enterprise workflow automation, there are two other major limitations of the RL approach:
It is a black-box: You don’t know what the model learned or why it made a certain decision. A BCL might show a simple learned rule in text: “Anyone having trouble accessing their account should be passed on to the Customer Verification team.” But with an RL system, these learnings will be hidden in model weights.
It is not easily modified: Companies are constantly changing their processes, and AI agents will need to, too. Say you want to modify your workflow to “Anyone having trouble accessing their account should first try to reauthenticate with our new portal. If that doesn’t work, then send them to the Customer Verification team.” With a BCL, this change could be made in a few minutes in plain text (then using a testing/eval harness to evaluate the impact). With RL, you might need to update the environment, design a new reward function, re-run training, and then evaluate the impact. That is a long and arduous process.
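The modifiability argument can be shown with a toy sketch: in a BCL, a process change is a plain-text edit, and a small eval harness verifies the new behavior before rollout (the rule text, routing logic, and eval cases here are all hypothetical):

```python
# Toy sketch: a BCL rule change is a plain-text edit, checked by a tiny
# eval harness before deployment. Rules and cases are hypothetical.

rules = {
    "account_access": "Send users with access trouble to the Customer Verification team.",
}

def route(issue_type: str) -> str:
    """Decide the first action by reading the current rule text."""
    rule = rules[issue_type]
    return "reauth_portal" if "reauthenticate" in rule else "customer_verification"

# Process change, made in plain text: try the new reauthentication portal first.
rules["account_access"] = (
    "Users with access trouble should first try to reauthenticate with the new "
    "portal; if that fails, send them to the Customer Verification team."
)

# Minimal eval harness: confirm the updated routing before rollout.
eval_cases = [("account_access", "reauth_portal")]
results = [route(issue) == expected for issue, expected in eval_cases]
```

The equivalent change in an RL system would mean a new reward function and a retraining run; here it is a string edit plus a test.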
Our hypothesis is that RL environments will play an important role, but primarily serve large research labs. Using them, foundation models will get dramatically better at doing enterprise work generally: updating CRMs, processing tickets, writing messages, etc. Companies will then use a BCL to provide instructions on how these models should do work at their business – in a human-interpretable, easily-modifiable form.
To bring a BCL to market, you need to sell outcomes, not infrastructure. Outcomes are what drive executive urgency. And most enterprises will not be capable of building complex applications on top of raw infrastructure, even if, in theory, they could create ROI.
The clearest value proposition for a BCL is operational process automation and augmentation. Whether an enterprise buys a workflow automation platform or tries to build it internally, out-of-the-box performance will likely be poor due to a lack of business context. A BCL solves this problem without requiring teams of consultants and engineers to hardcode information in prompts. It can help automate a much larger proportion of tasks, with greater reliability and controllability.
There are additional value propositions from a BCL: visibility/management (providing leaders with insight into how the organization operates) and productivity (providing front-line workers with additional context or information to do their jobs better), but we think these are secondary to core automation.
Our key questions on the future of the BCL are about the packaging and delivery model:
Any AI automation needs organization-specific context to improve performance. Today’s ascendant platforms in customer support, ITSM, sales automation, etc., are already limited by a lack of context. Right now, they solve this primarily through forward-deployed resources, but they will likely try to productize this capability over time.
Will there be a standalone context layer, or will this just be an approach/feature of each enterprise AI app? We think businesses will benefit from a single shared context layer, versus having context siloed across many separate applications, but it remains an open question.
We think an effective context layer must be created and maintained mostly autonomously. However, building end-to-end automations could still be manual – you might need process discovery to figure out what you should automate in the first place, systems/data engineering to get a solution into production, and change management when a system is deployed.
Companies like Distyl.ai are trying to build a next-generation Palantir: selling complete, services-led solutions built on top of a central platform that can drive recurring revenue and expansion use cases. We think this approach is likely to dominate in the F100, but that there is an equally exciting opportunity to build a more scalable, product-led company for the rest of the enterprise and mid-market.
If you’re thinking about business context for AI systems, we’d love to chat! Send a note to at@theoryvc.com.
Developer Relations (“DevRel”) is most often associated with companies that build for developers, but what does it look like in venture capital?
In traditional DevRel, the function is straightforward: advocacy ensures developers not only adopt a product, but also use it well. The DevRel sits at the intersection of community, technical credibility, and storytelling, helping translate between builders and the broader ecosystem.
For Theory, a DevRel takes a different shape.
We’ve been a technical firm since day one and have engineered numerous AI workflows to improve how we operate as a team. Along the way, we’ve seen both failures and successes, and we believe in sharing those experiences with other builders.
By building DevRel into how we operate, we are expanding our efforts to reach and engage the communities that help technical founders succeed.
Founders don’t just evaluate capital. They evaluate investors for their value creation: are these people serious technologists who can give credible advice? Do they understand the tools and communities that matter today, not just the boardroom abstractions? Traditional marketing cannot answer that question. A DevRel can.
The role is a hybrid of technologist, writer, strategist, and investor.
By participating in the technical conversations & forums as users, we collapse the distance between our firm and the developers who strive to build at the cutting edge. The result is better dialogue, sharper insight, and an ecosystem that sees us as genuine collaborators rather than distant financiers.
The obvious function of DevRel is outward-facing: exploring & amplifying ideas, highlighting tech we find impressive, and engaging on topics that matter. A strong outward presence ensures the firm is visible not just when capital is exchanged, but long before—at meetups, in online forums, and in the debates that shape new technologies.
Less obvious, but equally important, is the inward function. A DevRel is also a sensor: detecting signals from technical communities, surfacing emerging trends, and translating the chatter of early adopters into actionable insight. In this way, DevRel doesn’t just broadcast a message; they tune the dial on our collective understanding of where technology is headed.
To our knowledge, no venture capital firm has yet invested in a full-fledged DevRel role. For us, the motivation is clear: by showing up and contributing in the open, we build relationships with the people pushing technology forward—the very people we hope to back and collaborate with in the future.
It’s a natural extension of our research-driven approach to identifying technical discontinuities, and it reflects what we value most: intellectual honesty, technical depth, and staying close to builders. We’ll see you in the arena.