Imagine a web with no websites. All your favorite apps are now just databases, queried for information by agents that zip around at your behest.
Bots made up nearly half of global web traffic in 2023. With LLM agents, this number will explode. It seems plausible that 90% or more of web traffic will be non-human in the near future.
Today’s products and websites are designed for humans. They render information visually via carefully crafted UIs. They erect captchas and other roadblocks to try to stop bots, which are associated with spam and fraud.
In a world primarily composed of bots, what happens to websites? Do agents learn to navigate them like a human? Do they go away entirely?
It’s tempting to think that websites are obsolete in a world full of agents. Websites are just visual interfaces to interact with data. Why make an AI agent click through a set of buttons when it could just query the application directly?
Even though some applications will build agent-friendly APIs, for the foreseeable future we believe the web will remain dual-use. This is because:
UIs are effective
The information bandwidth of human vision is over 1 million times higher than the bandwidth of reading or listening to language. Product designers take advantage of this with carefully crafted visual interfaces that efficiently convey complex information and help users take action on it.
Say you want to compare restaurant options. Most likely, a scrollable map with location pins, overlays with photos or reviews, and buttons for common filters will be more effective than typing all your criteria into a chat interface and reading the results one at a time.
There will certainly be embedded applets and content cards, but the best and most robust UIs will remain on native web apps themselves – more on that below.
Building good APIs is hard
There are many beautiful web apps but few amazing APIs. Even when web apps have one, it usually doesn't provide the same feature set as the app and often has poor DX/usability. This is partly because APIs are not prioritized and partly because building effective APIs is hard – especially if the application was originally architected around a UI.
For AI agents, even the best API might not be good enough. LLMs struggle to use tools reliably, and user flows that involve multiple API calls (e.g. “find a hotel and book it for me”) pose a challenge. Developers will likely need to enrich APIs with additional documentation, prompting, or validation to ensure that AI agents can use them reliably.
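As a sketch of what that enrichment might look like, here's a tool definition that pairs an OpenAI-style function-calling schema with explicit validation. The search_hotels endpoint, its parameters, and the constraints are all hypothetical:

```python
# A hypothetical "enriched" API surface for agents: the documentation and
# guardrails live alongside the tool itself. Nothing here is a real API.
from datetime import date

search_hotels_tool = {
    "name": "search_hotels",
    "description": (
        "Search for available hotels. Always call this BEFORE book_hotel. "
        "Returns at most 10 results, sorted by relevance."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name, e.g. 'Lisbon'"},
            "check_in": {"type": "string", "description": "ISO date, e.g. '2026-03-14'"},
            "nights": {"type": "integer", "minimum": 1, "maximum": 30},
        },
        "required": ["city", "check_in", "nights"],
    },
}

def validate_search_hotels(args: dict) -> dict:
    """Catch malformed agent calls before they reach the real API."""
    check_in = date.fromisoformat(args["check_in"])  # raises on bad date strings
    if check_in < date.today():
        raise ValueError("check_in must be in the future")
    if not 1 <= args["nights"] <= 30:
        raise ValueError("nights must be between 1 and 30")
    return args
```

The point is that the usage guidance ("always call this before book_hotel") and the validation travel with the tool, rather than relying on the agent to have read the API docs.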
Companies are incentivized to prioritize their surface over others
The 2010s demonstrated the impact demand-aggregating platforms like Facebook had on the publishing industry. These platforms boosted publishers’ traffic initially but had a damaging long-term impact as many consumers stopped going to the publishers’ sites entirely (depriving them of advertising dollars).
With AI assistants, many other companies face the same kind of disintermediation.
Airbnb wants you to think of its brand when planning a weekend trip. If you choose to go to Airbnb to initiate a search, you’re likely to complete a transaction on the platform.
Imagine you instead ask your AI assistant to plan the trip and get results from Airbnb, Vrbo, and Booking.com (perhaps not even showing which is which). You’re much less likely to complete that transaction on Airbnb; the company also gets less information about your behaviors/interests, less ability to serve targeted recommendations, and loses brand equity.
There will be a spectrum of responses – some companies embracing bots, others attempting to block them entirely. But on the whole, we expect that most companies will prioritize their own websites, chatbots, and apps over working smoothly with third-party AI assistants.
For human users, we expect the dual-use web will look pretty similar to today. But behind the scenes, there are early signs of how internet tooling will change to support AI agents.
Browser + automation infrastructure
Outside of flashy demos, AI agents will not browse websites on actual computer screens. Instead, they will use headless browsers, which replicate the functionality of a regular browser but can be run in large numbers in the cloud.
Similarly, AI agents will not move around a cursor or type with a keyboard. They’ll use a browser automation framework that lets them interact with web pages programmatically using code.
There is a healthy ecosystem of headless browsers and automation frameworks today. Historically, they’ve been designed for web scraping and web app testing. These use cases share some common needs with AI agents, such as resource management, bot avoidance, and various tricks to improve reliability of actions (e.g. only clicking a button when it’s active).
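To make this concrete, here's roughly what an agent-driven action looks like through such a framework. This sketch uses Playwright's Python API; the site and element names are hypothetical:

```python
# A minimal sketch of an agent acting through a headless browser, using
# Playwright's Python API. The URL and element names are hypothetical.
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)  # no screen; runs fine in the cloud
    page = browser.new_page()
    page.goto("https://example.com/reservations")

    # Playwright locators auto-wait until the element is visible and enabled:
    # the "only click a button when it's active" trick, built in.
    page.get_by_label("Party size").fill("4")
    page.get_by_role("button", name="Find a table").click()

    # Hand the resulting page text back to the LLM to decide the next step.
    result_text = page.locator("main").inner_text()
    browser.close()
```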
But AI agents will also have a set of unique needs, unlike those of the scrapers that came before.
In addition to enabling AI agents, browser and automation infrastructure will support the collection of data for foundation model training, fine-tuning, and retrieval systems. Because LLMs can easily generate automation framework scripts, we expect there will be an explosion of AI-generated end-to-end tests for every application.
Combined, these will all dramatically increase the need for and value of next-generation browser and automation infrastructure. For more on this topic, check out this great memo from Paul Klein at Browserbase.
Authentication/authorization
Most websites today have defenses that try to identify and block bots, which will make it even more difficult for AI agents to access sites and act on users' requests.
Many things you’d want an assistant to do require not just that it accesses a website, but that it takes actions on your behalf. Booking a flight, changing a dinner reservation, or sending a message all require the user to be signed in. How will AI agents do that?
In the short term, many agents will spoof human activity to take these actions. This is complex, requiring realistic human IP addresses, devices, and behaviors, as well as managing sensitive user credentials. It’s also relatively high-risk: a misstep could result in a user’s account being banned. But we expect that so long as the actions are reasonable and directed by a user, most platforms should be amenable (or at least turn a blind eye) to the activity.
In the longer term, we expect there will be separate pathways for machine authentication and authorization. These might look like consumer-directed service accounts. We think it will be a while before these are commonplace and might only be delivered with agent APIs instead of the web interface.
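One way to picture such a pathway is OAuth-style scoped delegation, where a user grants their assistant a credential that can take a narrow set of actions and nothing else. This sketch is purely speculative; the endpoint, grant type, and scopes are all invented:

```python
# A speculative sketch of machine auth via scoped delegation. Nothing here
# is a real API: the endpoint, grant type, and scopes are all hypothetical.
import requests

def get_agent_token(user_grant: str) -> str:
    """Exchange a user's consent grant for a narrowly scoped agent credential."""
    resp = requests.post(
        "https://auth.example.com/oauth/token",  # hypothetical endpoint
        data={
            "grant_type": "urn:example:agent-delegation",
            "assertion": user_grant,
            # The agent can manage reservations but not, say, delete the account.
            "scope": "reservations:read reservations:write",
        },
    )
    resp.raise_for_status()
    return resp.json()["access_token"]
```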
Website design/implementation
Today’s web applications are designed for visual consumption. The underlying code is often difficult to interpret, whether because of system design choices (e.g. calls to external services) or deliberate obfuscation meant to deter scraping.
If they want to support AI agents, websites of the future might undergo redesigns that are invisible to the human eye but help agents navigate. You could imagine HTML elements decorated with additional attributes or comments that help an agent understand what is what. Websites could even have invisible “agent sections” specifically designed for AI users.
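As one hypothetical illustration, a site could annotate elements with data-agent-* attributes that are invisible to human users but trivial for an agent-side parser to read. The attribute convention and the page below are invented:

```python
# A hypothetical annotation convention: data-agent-* attributes that don't
# change the rendered page but tell an agent what each element is for.
from bs4 import BeautifulSoup

html = """
<div class="listing" data-agent-type="restaurant" data-agent-id="r_123">
  <h3>Luigi's</h3>
  <button data-agent-action="reserve" data-agent-params="party_size,time">
    Book now
  </button>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
for el in soup.select("[data-agent-action]"):
    print(el["data-agent-action"], el.get("data-agent-params"))
# -> reserve party_size,time
```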
We think dual-use websites are likely to remain important for a long time. But there are a couple of potential trends that could change their relevance.
AI-generated APIs: As discussed above, it’s hard to build and maintain useful APIs. As AI code generation matures, the effort required to create a full-featured, agent-friendly API could drop by an order of magnitude. If that’s the case, more companies might maintain both websites designed for humans and APIs designed for agents.
New business models: Most businesses today rely on direct traffic to make money. There are inherent disadvantages when a site’s usage comes through a third-party assistant rather than a direct visit, but it’s possible more platforms find usage-based models, exclusive partnerships, or other business arrangements that incentivize them to prioritize agents.
Assistant adoption/demand aggregation: It’s possible that assistant platforms like ChatGPT, Gemini, and Meta AI become the predominant way many consumers engage with the web. Much of this change is behavioral – how many people enjoy browsing the web and will prefer to shop themselves versus having a virtual assistant provide 1-2 options? If a majority of people use assistants to make purchasing decisions, most companies will have no choice but to provide first-class support to AI agents (just as most brands have to sell on Amazon).
Search is one of the hardest technical problems in computer science. Only a handful of products like Google, Amazon, and Instagram do it well.
Until recently, most products didn’t need great search – it wasn’t core to the user experience.
But with the explosion of LLMs and the retrieval systems that support them, every LLM company suddenly needs world-class search embedded in its product just to make it work.
With this emerging need, how will this new wave of AI companies solve search?
Retrieval is a critical component of LLM systems, and isn’t going away
Retrieval-augmented generation (RAG) systems deliver relevant information to an LLM to help it respond to a query. This grounds LLM generation in real and relevant information.
Imagine an LLM is answering a question on a history test. Without RAG, the LLM would have to recall the information from things it’s learned in the past. With RAG, it’s like an open-book test: along with the question, the model is provided with the paragraph from the textbook that contains the answer. It’s clear why the latter is much easier.
Finding the right paragraph in the textbook may not be easy. Now imagine trying to find one code snippet in a massive codebase, or the relevant item in a stack of thousands of invoices. Retrieval systems are designed to tackle this challenge.
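In code, the open-book pattern is only a few lines: retrieve the relevant passages, then put them in the prompt. Here retrieve and llm are placeholders for a real retrieval system and model call:

```python
# The "open-book test" pattern in miniature. retrieve() and llm() are
# placeholders for a real retrieval system and a real model call.

def answer_with_rag(question: str, retrieve, llm) -> str:
    passages = retrieve(question, k=3)  # e.g. the top-3 textbook paragraphs
    context = "\n\n".join(passages)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
    return llm(prompt)
```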
New LLMs have longer context windows, which allow them to process larger inputs at once. Why go to the effort of finding one paragraph in the textbook when you could just load in the entire book?
For most applications, we think that retrieval won’t go away even with >1M token context windows: processing an entire corpus on every query is slower and far more expensive than retrieving the few passages that matter, and models still tend to use a focused context more reliably than an enormous one.
Semantic similarity search is just one piece of the puzzle
As LLM prototyping exploded, people quickly turned to semantic similarity search.
The underlying vector-space approach has been used in information retrieval for decades. First, separate the data into chunks (e.g. each paragraph of a document). Next, run each chunk through a text embedding model, which outputs a vector encoding the chunk’s semantic meaning. At retrieval time, embed the query and fetch the chunks with the nearest vector representations; these are the chunks whose meaning is (in theory) most similar to the query’s.
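The retrieval step itself reduces to a nearest-neighbor search over those vectors. Here's a minimal numpy sketch; chunk_vecs stands in for embeddings produced by a real model:

```python
# Nearest-neighbor retrieval by cosine similarity. chunk_vecs is an
# (n_chunks, dim) array of embeddings from some real embedding model.
import numpy as np

def top_k_chunks(query_vec, chunk_vecs, chunks, k=3):
    """Return the k chunks whose embeddings are closest to the query."""
    q = query_vec / np.linalg.norm(query_vec)
    m = chunk_vecs / np.linalg.norm(chunk_vecs, axis=1, keepdims=True)
    sims = m @ q                   # cosine similarity to every chunk
    best = np.argsort(-sims)[:k]   # indices of the k most similar chunks
    return [chunks[i] for i in best]
```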
Semantic similarity is simple to build, but it results in pretty mediocre search. The core limitation is that it ranks results purely by how closely their meaning matches the query, ignoring every other signal of relevance. If you look at any of the companies that do search well, semantic similarity is only one piece of the puzzle.
The goal of search is to return the best results, not the most similar ones.
YouTube combines the meaning of your search query with vectorized predictions of what videos you’re most likely to watch, based on global popularity and your viewing history. Amazon makes sure to prioritize previous purchases in search results, which it knows you were probably looking to re-order.
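A toy version of this blending might look like the following, with semantic similarity as just one feature among several. The signals and weights are invented for illustration:

```python
# Blending semantic similarity with behavioral signals, in the spirit of
# the YouTube/Amazon examples above. Signals and weights are invented.

def rank(candidates):
    """candidates: dicts with a similarity score plus behavioral signals."""
    def score(c):
        return (
            0.5 * c["semantic_sim"]      # how well the meaning matches the query
            + 0.3 * c["popularity"]      # global engagement signal
            + 0.2 * c["user_affinity"]   # e.g. previously purchased or watched
        )
    return sorted(candidates, key=score, reverse=True)
```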
The future of retrieval systems
Google was founded on the PageRank algorithm, a simple way to rank web pages. But today’s Google search would be unrecognizable to the initial team: it is an incredibly complex system that uses many approaches to return the best results.
Similarly, teams building RAG systems started with simple semantic similarity search. We believe they will quickly become more complex and end up looking like today’s production search or recommender systems. The problems are not that different: from a large set of candidate items, select a small subset that is most likely to achieve some goal.
Today, most retrieval systems look like the simple pipeline described above: chunk the data, embed each chunk, store the vectors in a database, and return the nearest neighbors to the embedded query.
A future system might look more like a production search engine: query understanding and rewriting, multiple candidate retrievers (semantic, keyword, structured), and ranking models that blend relevance with application-specific signals.
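Structurally, that reduces to the classic two-stage pattern from production search: cheap candidate generation from several sources, followed by a heavier ranking model. Every component in this skeleton is a placeholder:

```python
# A two-stage retrieve-then-rank skeleton, shaped like a production search
# system. retrievers and reranker are placeholders for real components.

def retrieve(query, user, retrievers, reranker, k=10):
    # Stage 1: cast a wide net across sources (semantic index,
    # keyword index, structured filters, ...).
    candidates = []
    for r in retrievers:
        candidates.extend(r(query))

    # Stage 2: score each candidate with many signals, not just
    # similarity (freshness, popularity, user history, ...).
    scored = [(reranker(query, user, c), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:k]]
```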
Retrieval systems will have dramatic impacts on the capabilities of LLM applications: their effective memory, response quality, reliability, and performance/latency. We think that for many applications, these systems will have more of an impact on end capabilities than the LLM itself.
Because of this, we think that most companies will build these systems in-house as a core competency and differentiator. These builders will rely on a new set of infrastructure to build retrieval systems specific to their application.
To date, most investment has gone to databases to store vectors and retrieve nearest neighbors. But thinking through a future stack like the one described above, the database is only a small part of the solution.
Building these new systems will require better tooling across the stack: for preparing and embedding data, orchestrating retrieval across sources, and evaluating and tuning result quality.
Although most companies will build retrieval systems themselves, it’s possible that parts of the stack consolidate over time.
We’re excited for the evolution of retrieval and search as product enablers. If you’re building infrastructure for retrieval systems, or retrieval systems to power a new application, we’d love to chat!