AI systems are typically evaluated with humans as the gold standard. How many college-level math problems can they solve? How many medical questions can they answer? How accurately can they extract information from a contract or purchase order?
Even as LLMs become superhuman test-takers, it’s clear they still can’t reliably do many actual jobs. Most of the thought and work going into AI applications is oriented towards building value with a system that’s only ~80% as good as a human. How do you limit AI to tasks that it can do more reliably? How do you incorporate humans to check AI activities? This is critical, and drives our research in many areas of AI software.
But there are also jobs where AI systems are structurally better than humans. For these, AI won’t be struggling to keep up with the average employee; it will be many multiples better than any person could be. And while automation in general can reduce costs, integrating AI into jobs it does better than humans can create new company capabilities, improve customer experience, and drive revenue.
We think these categories are particularly ripe for new AI software startups.
No matter how smart humans are, they are fundamentally limited in how much they can do. You can only read so many pages, or draft so many emails.
For jobs that require high throughput, humans must build systems to make decisions in aggregate. They might just search for specific keywords across those pages, or use rules-based decision trees to select from pre-drafted email responses. Perhaps a fraction of the tasks are escalated to human review.
Regardless of how carefully you draft your email campaigns or how many rules are in your decision tree, these systems will inevitably fail on some inputs. Every page or email is slightly different. While humans can handle that variability, the rules-based systems they implement cannot.
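To make that brittleness concrete, here is a minimal sketch of the kind of keyword-based routing described above. The rules, canned replies, and the `route` function are all hypothetical and only meant to illustrate the failure mode, not any particular product's implementation.

```python
# Hypothetical keyword rules mapped to pre-drafted replies (illustrative only).
RULES = [
    ("refund", "Here is how to request a refund."),
    ("password", "You can reset your password from the login page."),
]
ESCALATE = "ESCALATE_TO_HUMAN"


def route(email_text: str) -> str:
    """Return the first canned reply whose keyword appears in the email.

    Falls back to human escalation when no keyword matches.
    """
    text = email_text.lower()
    for keyword, reply in RULES:
        if keyword in text:
            return reply
    return ESCALATE


if __name__ == "__main__":
    # Matches the rule as intended.
    print(route("I'd like a refund for my order"))
    # Same intent, different wording: no keyword matches, so it escalates.
    print(route("Please send my money back for this order"))
    # Different intent, but the word "refund" appears, so the wrong reply is sent.
    print(route("I'm not asking for a refund, I just want to confirm my password change"))
```

The second and third emails are the "slightly different" cases: one falls through to a human, and the other gets a confidently wrong canned reply. Adding more rules shifts where the gaps are but does not remove them.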
In an earlier post, we described LLMs as an infinite supply of near-free interns.
Like a new intern, the tasks they do should be straightforward. That could be because the task:
When tasks are simple, imagine how much work an infinite number of these AI interns can do. They can process 10,000 pages just as easily as they process 100. They can operate 24/7, without getting tired or bored. While there can be issues with hallucinations, when provided with contextual information they are less likely than a human to forget a name or make a typo.
Perhaps most important of all, the AI interns look at each task and apply human-like reasoning independently for each one. For high-throughput jobs, the alternative to an AI doing a task is not actually a human doing it: it’s a legacy rules-based system, or nobody doing it at all.
What will happen to these jobs that AI does best? As discussed in a previous post, they won’t go away entirely, but could dramatically change in scope. Generally, they will uplevel – from execution to orchestration, or first-pass to escalation/review. Without entry-level work, it may be challenging to onboard and train talent.
Security operations
Analysts in security operations centers (SOCs) get overwhelmed by a deluge of alerts generated by their detection tools. Often there are dozens of near-identical ones; each could take the better part of an hour to investigate fully. To stay afloat, they compile rules-based filters and playbooks, even though they know these let things slip through the cracks.
Dropzone AI, a Theory portfolio company, has shown that agentic systems can replicate manual investigations. Their AI systems have expert skillsets – for example, they have deep knowledge of dozens of tool-specific querying languages. But they wouldn’t need to be the best analyst in the world to have a massive impact. The fact that AI agents can review each individual alert, in minutes, at any time of day or night, and with perfect memory, is a dramatic step change from the small fraction of alerts that get reviewed (often far too late) today.
Customer engagement
Customer engagement platforms like Salesforce and Braze send billions of texts, notifications, and emails per week. Of course, it would be impossible for a human to write each one of them. Instead, marketing teams must draw out rules-based journeys. This is the sequence we’ll send to new users. Here’s what we’ll do when a customer leaves something in their cart. But every user is different. Even if you could identify and define micro cohorts/segments, it’s just not possible to manage many thousands of different messaging campaigns at once.
AI agents don’t have this constraint. They can use millions of different strategies for millions of users, experimenting with content, channel, and timing. It doesn’t matter how perfectly crafted a message is if it’s directed at the wrong person – agents that personalize messages for each individual are much more likely to find the specific attributes that drive business outcomes. We’ll share more on this space next week!
Investment research
In investment research, ideas mean nothing unless they can be translated into actions. Say you want to invest in businesses that will benefit from increased AI usage. Of course, Microsoft and NVIDIA will be on the list, but there are scores more companies along the value chain – datacenter component manufacturers, system integrators, REITs, etc. Researching this thesis would require analysts to comb through thousands of pages of documents and create massively complex financial models.
Human analysts can cover a small number of companies and race to update models when new earnings reports drop. AI analysts can easily screen hundreds of companies on an ongoing basis, just as quickly and accurately as they could scan one. We expect this will dramatically change how investment firms operate, allowing firms to systematize qualitative strategies in the way they run quant strategies today.
Supply chain operations
Supply chain organizations manage hundreds of vendors supplying thousands of goods and services (if not more). The challenging cognitive work is dealing with inevitable problems that arise daily, and figuring out how to optimize procurement and logistics over time. But most of the day-to-day work is data collection and relationship management. Each day, professionals spend hours tracking status updates, copying numbers, reviewing RFP responses, and matching invoices.
AI systems can easily maintain thousands of email conversations at once. They can instantly read through lengthy PDFs and spreadsheets and extract just the relevant information. In addition to freeing up time for humans to focus on more important work, they will enable new strategic capabilities – like dramatically expanding the frequency and scope of RFPs, or proactively monitoring and alerting for supplier issues.
—
If you’re building automation for jobs where AI has a structural advantage, we’d love to hear from you! Reach out to at@theory.ventures.