Ambient AI: why the best assistants will run on-device

Sep 1, 2025

Intro

AI copilots have popped up everywhere. From dedicated platforms like ChatGPT, Claude, and Gemini to shopping websites and notes applications, there are integrated assistants ready to answer our questions.

But there is one huge limitation: They don’t know anything about each other, and they don’t know much about us either. Our days involve constant switching between websites, on-device apps, and SaaS platforms. Using AI across them involves piecemeal copying, pasting, and prompting the model with additional context.

Say you’re planning a trip with some friends. You texted back and forth about preferences, did some research on Google Maps, and now want to send an email proposing a plan to the group. An assistant today wouldn’t have the context of your text discussions, knowledge of the Google Maps listings you viewed, or awareness that you currently have an email draft open. You’d have to tell it all that yourself, copy over any content, then paste its output into your message.

We believe that AI assistants of the future must operate ambiently. They will be always-on, running locally, with visibility across all the applications on your device. With shared, global context on everything you’re doing, they will be able to answer questions and automate workflows seamlessly. Whoever builds this system will own the next major AI developer platform.

There can only be one dominant interface to ask questions. People don’t like context switching, and will naturally gravitate to the interface that works best. In the early 2000s, it was Google’s search bar. By 2030, it will be ambient assistants. Read on to find out why.

Ambient is the final form for search / assistants

In the search and assistant space, we believe three key dynamics are true:

  1. Users naturally allocate ~100% of their time to the platform with the broadest capabilities: When a web search-enabled Gemini can do all of my Google searches and support a much broader range of queries, there is no reason to use legacy search for a subset of them. Because use is so frequent it’s almost reflexive, people will naturally default to a single platform for their search/assistant queries. This results in a winner-takes-all dynamic. We think this will hold true for audio-based assistants and soon, ambient ones.
  2. There is low user tolerance for failure: There is nothing more frustrating than seeing or hearing, “Sorry, I can’t help with that.” For a given question, assistants have a couple of chances to get it right before users give up — either on that query, or on the platform entirely.
  3. User behaviors co-evolve with platform capabilities: Over the past couple decades, web users honed their skills around keyword search and later, semantic similarity. Now, with LLMs, users can ask a much broader range of queries and expect more from the results. Once assistants have real-time, always-on context on current work, users will again develop new behaviors and expectations of proactive support.

Our perspective is that ambient assistants are the final form in this space.

Just as browser-based AI assistants fully subsume the capabilities of traditional search, ambient assistants fully subsume the capabilities of browser-based assistants. They can search the web for the weather or recent news. They can help you draft an email or summarize meeting notes. And, uniquely, they will have the context to help you with your ongoing work, regardless of what it is, and without you needing to provide it.

Imagine an intelligent auto-complete that predicts your next action across any application. Want to send an email following up on next steps from a meeting? Add texted dinner plans to your calendar? Forward a meme to your partner? Do it with a simple command or single click in your ambient assistant app.
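
To make that concrete, here is one way such a prediction loop could be wired up, assuming a local model served via Ollama; the Event schema, the model name, and the sample events are all illustrative, not any product’s actual API.

```python
# A minimal sketch of cross-application next-action prediction.
# The Event schema and model name ("llama3.2") are illustrative
# assumptions, not a real assistant's API.
from dataclasses import dataclass

import ollama  # assumes a model is served locally via Ollama


@dataclass
class Event:
    app: str      # e.g. "Messages", "Google Maps", "Mail"
    summary: str  # extracted description of what the user did


def suggest_next_action(events: list[Event]) -> str:
    """Ask a local LLM to propose the user's likely next action."""
    context = "\n".join(f"[{e.app}] {e.summary}" for e in events)
    response = ollama.chat(
        model="llama3.2",
        messages=[{
            "role": "user",
            "content": "Given this recent cross-application activity, "
                       "suggest one concrete next action:\n" + context,
        }],
    )
    return response["message"]["content"]


print(suggest_next_action([
    Event("Messages", "Friends agreed on a Lisbon trip, June 12-15"),
    Event("Google Maps", "User saved three hotels near Alfama"),
    Event("Mail", "An empty draft addressed to the trip group is open"),
]))
```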

Why won’t applications just integrate directly with existing LLMs?

While it might seem logical for applications to integrate directly with platforms like Gemini, ChatGPT, or Claude, the reality is more complex.

Computer use always spans several applications, with a substantial long tail. Point-to-point integrations come with a number of challenges; when you don’t own the API, you are subject to its limitations and changes.

Many common assistant use cases are practically impossible to achieve with direct integrations alone: What if I want to ask about a past discussion, but don’t remember if it was in Slack or Gmail? What if I want help with documents that I’m actively working on but haven’t yet saved?

These would be challenging to answer with APIs but straightforward if you have full access to the state of a device. Universal visibility is required to provide the context on user workflows and intent that enables new automations.

What will make ambient assistants excellent?

While they are conceptually simple, it’s only recently become possible to build ambient assistants through the maturation of LLMs and infrastructure. Improvements in smaller open-source language and audio models allow for local-first assistants. Multi-modal models can now reliably extract and parse information from desktop screencaps. And maturing interfaces and retrieval approaches help assistants find the right context for a query.
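
As a rough illustration of the screencap piece, the sketch below captures the primary display with the open-source mss library and asks a locally served vision model (here, LLaVA via Ollama) to describe it; both the library and the model choice are assumptions for illustration.

```python
# A lightweight sketch of local screen-context extraction. The use
# of `mss` for capture and LLaVA via Ollama for parsing are assumptions.
import mss
import ollama


def capture_screen(path: str = "screen.png") -> str:
    """Save a screenshot of the primary monitor to disk."""
    with mss.mss() as sct:
        sct.shot(output=path)  # shot() grabs the first monitor
    return path


def describe_screen(image_path: str) -> str:
    """Ask a local vision-language model what the user is doing."""
    response = ollama.chat(
        model="llava",
        messages=[{
            "role": "user",
            "content": "Summarize the visible application and the "
                       "user's current task in one sentence.",
            "images": [image_path],
        }],
    )
    return response["message"]["content"]


print(describe_screen(capture_screen()))
```

In practice this loop would run continuously in the background, feeding extracted context into local storage rather than printing it.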

We think that the winning ambient assistant will be built on:

Novel UX with excellent execution: A successful ambient assistant will be used many times a day, for a broad set of tasks, by a diverse set of users. Barriers to learning new habits are high, and switching costs if a user is dissatisfied are low. This is one of the hardest products to design, and excellent UX will in large part determine who wins.

The key challenge is making the product simultaneously unobtrusive (so it works out of the box) and configurable (so it works how more advanced users want it to). The vast majority of users will need to be shown exactly what to do in the platform, while advanced ones (and in the future, developers) will want full configurability.

Last but not least, it will be critical to work seamlessly cross-platform, which both unlocks a broader set of use cases and will be a major differentiator from OS incumbents (more on them below).

We frequently use Highlight, which makes cross-application automations easy. With a simple shortcut, you can take on-device context (such as a text note or audio recording of a meeting), dictate instructions to the assistant, and have it generate outputs directly into an application. These types of low-friction workflows will be essential for everyday use.

Distribution: It’s a big ask to download a new application, let alone one that requires always-on access to everything on your device. In addition to excellent UX to ensure users stay on the platform, distribution ability will be critical — particularly since OS incumbents will have a massive advantage in this area. There are multiple potential distribution paths, from direct PLG and consumer growth loops to enterprise licenses and device partnerships. Because UX is so important, we don’t think distribution is sufficient to win — people won’t use a crappy built-in assistant — but it is necessary to win big.

Performant, reliable, private on-device operations: For privacy reasons, as well as cost and latency, ambient assistants will need to operate on-device. Even with a cloud-based LLM, data capture from the screen and device state must happen locally. Data extraction, synthesis, embedding, and storage may need to be on-device as well. Performance will be vital — we believe users will not tolerate any substantial degradation of their device performance or battery life. And data privacy will be essential, both for personal and enterprise use.
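
For a sense of how lightweight fully local retrieval can be, here is a small example using an open embedding model from the sentence-transformers library; the model choice and the sample snippets are illustrative assumptions.

```python
# A small sketch of on-device embedding and retrieval. The model
# ("all-MiniLM-L6-v2") is a compact open model that runs on a laptop
# CPU; the snippets are illustrative stand-ins for extracted context.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# Context extracted from the device; nothing leaves the machine.
snippets = [
    "Text thread: group agreed on a beach trip the weekend of June 12",
    "Google Maps: viewed Hotel Miramar, 4.5 stars, near the beach",
    "Mail: a draft titled 'Trip plan' is open and still empty",
]
index = model.encode(snippets, normalize_embeddings=True)


def search(query: str, k: int = 2) -> list[str]:
    """Return the k snippets most similar to the query."""
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = index @ q  # cosine similarity (embeddings are normalized)
    return [snippets[i] for i in np.argsort(-scores)[:k]]


print(search("where did we decide to stay?"))
```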

Context and memory: Ambient assistants will consume a massive amount of unstructured data and must identify what is relevant to a user query. Depending on the use case, they might need to persist that information over days or weeks. Doing this will involve memory and retrieval systems combined with interfaces and workflows for users to prompt the assistant with the right context. The right approach(es) will depend on user behaviors: How often will they ask different kinds of questions about the past? But over time, we think reliable long- and short-term memory will be an important capability.
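
One plausible retrieval recipe, sketched below, blends semantic similarity with an exponential recency decay so that fresher context wins ties; the 24-hour half-life and the 70/30 weighting are illustrative assumptions, not a known production formula.

```python
# A sketch of recency-weighted memory scoring. The half-life and
# blend weights are illustrative assumptions.
import math
import time


def memory_score(similarity: float, created_at: float,
                 half_life_hours: float = 24.0) -> float:
    """Blend semantic similarity with how recently a memory was stored."""
    age_hours = (time.time() - created_at) / 3600
    recency = math.exp(-math.log(2) * age_hours / half_life_hours)
    return 0.7 * similarity + 0.3 * recency


# A week-old memory needs noticeably higher similarity to outrank
# an hour-old one.
print(memory_score(similarity=0.9, created_at=time.time() - 7 * 86400))
print(memory_score(similarity=0.6, created_at=time.time() - 3600))
```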

The next application platform

When ambient assistants emerge as the dominant way people interact with AI, they will also create the next major platform opportunity.

As described in recent posts, general assistants won’t be the best choice to automate most end-to-end workflows. We expect a proliferation of domain-specific applications with LLM orchestration, data integrations, and interfaces designed specifically for their use case.

But when any of these workflows needs access to the computer — whether it needs context across applications, or to run a model locally — it will be more convenient for developers to build on top of an existing platform. A successful ambient assistant will already (1) have performant, secure access to device context, and (2) be deeply embedded in user habits. In fact, it’s almost impossible to imagine multiple on-device platforms running concurrently: Who wants two conflicting search bars sucking power from their machine?
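
To illustrate what building on such a platform could feel like, here is a purely hypothetical sketch of a developer surface; every name in it is invented for illustration and does not describe any shipping product.

```python
# A hypothetical sketch of an ambient-assistant developer surface.
# All names here (DeviceContext, register_action) are invented.
from typing import Callable, Protocol


class DeviceContext(Protocol):
    """Read-only, permissioned access to assistant-collected state."""
    def active_app(self) -> str: ...
    def recent_events(self, limit: int = 20) -> list[str]: ...


ActionHandler = Callable[[DeviceContext, str], str]
_registry: dict[str, ActionHandler] = {}


def register_action(name: str):
    """Let a domain-specific app plug a workflow into the platform."""
    def wrap(fn: ActionHandler) -> ActionHandler:
        _registry[name] = fn
        return fn
    return wrap


@register_action("draft_trip_email")
def draft_trip_email(ctx: DeviceContext, instruction: str) -> str:
    """Use platform-provided context instead of rebuilding capture."""
    events = "\n".join(ctx.recent_events())
    return f"Draft based on:\n{events}\n\nInstruction: {instruction}"
```

The point is that the platform owns capture, permissions, and user habit, while domain-specific apps simply register workflows against shared context.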

Beating the incumbents

If the ambient assistant becomes the next major application platform, owning it will be the objective of every incumbent in the space. Here’s why we think there’s an opportunity for a startup to win:

Foundation model providers have native apps and have started to demo some agentic computer-use capabilities. But these companies are locked in a capital- and time-intensive battle around core model performance. As described above, we believe that ambient assistant success will be driven by performant on-device inference, UX, and memory systems. These are a very different set of priorities and constraints from current hosted foundation model development, so it’s likely that a focused startup can outperform them.

OS incumbents like Microsoft, Apple, and Google are more formidable competitors because they own the operating systems and have sufficient alignment to invest heavily in local applications and infrastructure. It’s possible that they win in certain areas, such as Siri’s deep integration with iMessage and Google’s ownership of Android/Chrome. But we think these companies will be constrained by their need to support a broad general consumer base and their desire/philosophy to control their assistant experience. For example, Apple has announced an “onscreen awareness” feature where developers can make data available to Siri, but it is unlikely they will allow developers (or users) to build applications that change how Siri operates. These companies are also unlikely to prioritize cross-platform (e.g. Android-Mac) or web application (e.g. Notion, Jira) support as first-class citizens.

It’s clear that while foundation model providers and OS incumbents have certain advantages in the race, they are constrained by competing priorities. This leaves room for a nimble, focused startup to address gaps and emerge as the next major application platform.

We are excited to hear from companies innovating in this space — please reach out to at@theory.ventures.
