Ars Technica AI October 3, 2024

What Copilot Vision means for browsing with AI in Edge

Microsoft is testing Copilot Vision, an opt-in feature that lets Copilot view pages inside Edge and answer questions about them. The trial is limited to some Copilot Pro subscribers, and Microsoft says it will not store Vision audio, images, text, or conversations.

An AI assistant that can view browser pages raises mild surveillance and autonomy concerns, though it is opt-in and privacy-limited.

Microsoft is moving its consumer AI assistant closer to the browser. With Copilot Vision, the company is testing a feature that lets Copilot see the page a user is viewing in Microsoft Edge and respond to questions about that page.

The feature is part of a broader set of updates to Copilot that Microsoft unveiled on Monday for a limited group of $20/month Copilot Pro subscribers. Alongside Vision, Microsoft is also introducing Copilot Labs, a place for testing AI features before they reach more users.

What Microsoft is testing

The two experimental features serve different purposes. Copilot Labs is designed as a proving ground for Microsoft’s latest AI tools. Microsoft describes it as offering “a glimpse into ‘work-in-progress’ projects.”

The first Labs feature is called “Think Deeper.” It uses step-by-step processing to handle more complex problems than the regular Copilot. Microsoft’s version is based on OpenAI’s new o1-preview and o1-mini AI models.

Think Deeper has so far reached some Copilot Pro users in Australia, Canada, New Zealand, the UK, and the US. Microsoft has not said when either Think Deeper or Copilot Vision will become more widely available.

Copilot Vision is the more visible shift for everyday browsing. Instead of asking users to copy text from a page into a prompt, Vision gives Copilot direct context from the page open in Edge, when the feature is enabled.

How Copilot Vision changes the browser assistant

Copilot Vision is meant to let the AI assistant understand what is on the page a person is browsing. According to Microsoft, when the feature is turned on, Copilot can “understand the page you’re viewing and answer questions about its content.”

That changes the interaction model. A user would no longer need to describe a webpage from scratch before asking for help. Copilot could use the page itself as the shared context for a question.

The practical idea is simple: browsing becomes more conversational. If the assistant can see what the user sees in Edge, the user can ask about the page directly. Microsoft presents that as a move toward more natural interaction and task assistance beyond text-only prompts.

At the same time, the feature is still an experiment. It is not being released broadly, and Microsoft is initially restricting where Vision can operate.

The privacy limits Microsoft is emphasizing

Because Copilot Vision can inspect browsing activity inside Edge, privacy is central to the test. Microsoft says the feature is entirely opt-in. The company also says no audio, images, text, or conversations from Vision will be stored or used for training.

Microsoft is placing other boundaries around the trial. Vision is initially limited to a pre-approved list of websites. The company is also blocking the feature on paywalled and sensitive content.

Those limits are important because the feature asks users to accept a new kind of assistant behavior. A browser AI that can see page content may be useful, but it also creates a sharper trust question than a chatbot waiting for pasted text.

The core privacy points Microsoft has stated are:

Copilot Vision is opt-in.
Microsoft says Vision audio, images, text, and conversations will not be stored.
Microsoft says that Vision data will not be used for training.
Vision is starting on a pre-approved website list.
Paywalled and sensitive content are blocked at the start.
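
Read together, those stated limits behave like a chain of checks that must all pass before the assistant looks at a page. The short Python sketch below is purely illustrative and is not Microsoft's implementation. The names `PageContext`, `vision_may_read`, and `APPROVED_HOSTS` are hypothetical, as are the example hosts, and the sketch assumes a page has already been flagged as paywalled or sensitive by some other means.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

# Hypothetical allow list for illustration only; the real list is not given in the article.
APPROVED_HOSTS = frozenset({"example.com", "shop.example.org"})


@dataclass(frozen=True)
class PageContext:
    """Hypothetical description of the page open in the browser."""
    url: str
    is_paywalled: bool = False
    is_sensitive: bool = False


def vision_may_read(page: PageContext, user_opted_in: bool) -> bool:
    """Return True only if every stated limit allows reading the page."""
    if not user_opted_in:
        # The feature is opt-in, so nothing happens without consent.
        return False
    host = urlparse(page.url).hostname or ""
    if host not in APPROVED_HOSTS:
        # Only sites on the pre-approved list qualify.
        return False
    if page.is_paywalled or page.is_sensitive:
        # Paywalled and sensitive content is blocked.
        return False
    return True
```

In this sketch the default answer is "no": a page is read only when the user has opted in, the site is on the list, and the content is not blocked.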

The company says the rollout is gradual because it wants to balance “pioneering features and a deep sense of responsibility.” It also says it will be “listening carefully” to user feedback as access expands.

Why the rollout is cautious

The cautious launch follows a wider tension around AI assistants that observe user activity. The source article points to Microsoft’s Recall feature, which keeps a record of everything a person does on a PC so an AI model can recall it later, and which drew criticism over privacy.

That reaction matters because Copilot Vision also sits on the boundary between helpful context and surveillance. Even when a feature is opt-in, users may be sensitive to an assistant monitoring their activity, especially if data is sent to the cloud for processing.

Microsoft AI chief executive Mustafa Suleyman told Reuters that he sees Copilot as an “ever-present confidant” that could, with permission, learn from Microsoft-connected devices and documents. He also said Microsoft co-founder Bill Gates has shown particular interest in Copilot’s potential to read and parse emails.

Those comments point to a broader direction for Copilot: an assistant that works across more of a user’s digital life, if the user grants access. Copilot Vision is a browser-focused step in that direction, but Microsoft is keeping the first version narrow.

What to watch next

The key question is not only whether Copilot Vision works, but whether users decide the tradeoff is acceptable. For Microsoft, the feature has to show that an AI assistant can use live browsing context without making people feel that their browsing is being watched in a way they cannot control.

For now, the test remains limited to some Copilot Pro subscribers, and Microsoft has not provided a timeline for wider availability. That makes Copilot Vision less a finished product than a signal of where Microsoft wants its AI assistant to go: closer to the user’s screen, more aware of context, and more dependent on clear permission and trust.