Microsoft Launches Preview of Copilot Vision, an AI Tool That Reads Screen Content

On Thursday, Microsoft began rolling out a limited, U.S.-only preview of Copilot Vision, a tool that can understand and respond to questions about sites you’re visiting using Microsoft Edge. Gated behind Copilot Labs, an opt-in program for experimental AI capabilities, Copilot Vision can analyze text and images on web pages to answer queries like ‘What’s the recipe for this lasagna?’

What is Copilot Vision?

Copilot Vision is an AI feature built into Microsoft Edge. Beyond answering questions about the page you’re viewing, it can summarize and translate text and handle tasks like spotlighting discounted products in a store catalog.

How Does It Work?

Copilot Vision is opt-in. Once you enable it, it sees the page you’re on and reads along with you, so you can talk through the problem you’re facing together. Microsoft wrote in a blog post that ‘it’s a new way to invite AI along with you as you navigate the web, tucked neatly into the bottom of your Edge browser whenever you want to ask for help.’

What Can Copilot Vision Do?

Copilot Vision can do several things:

  • Answer questions: It can analyze text and images on web pages to answer queries like ‘What’s the recipe for this lasagna?’
  • Summarize and translate text: It can condense or translate the text on a page, making the information easier to understand
  • Handle tasks: It can take on chores like spotlighting discounted products in a store catalog

Limitations of Copilot Vision

Copilot Vision comes with several restrictions and safeguards:

  • Data deletion: Processed audio, images or text aren’t stored or used to train models
  • Types of websites: For the time being, Microsoft is blocking the feature from working on paywalled and ‘sensitive’ content
  • Allowed websites: The list of allowed sites is determined by category and on a case-by-case basis

Microsoft’s Approach

Microsoft’s cautious approach is partly the product of legal disputes with news outlets. In one ongoing suit, The New York Times alleged that Microsoft let users get around its paywall by serving NY Times articles through the Copilot chatbot on Bing.

  • Machine-readable controls: Microsoft said that Copilot Vision will respect sites’ ‘machine-readable controls on AI,’ like rules that disallow bots from scraping data for AI training
  • Unspecified scope: However, the company hasn’t said precisely which controls Vision will respect
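
One widely used machine-readable control is a site’s robots.txt file. Microsoft hasn’t confirmed that this is the mechanism Vision honors, so the Python sketch below only illustrates how any bot could check such a rule in general. The robots.txt content, the user agent name ExampleAIBot, and the helper is_allowed are hypothetical and chosen purely for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only. "ExampleAIBot" is a made-up
# user agent, not a real Microsoft or Copilot identifier.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/recipes/lasagna"
    # The hypothetical AI bot is blocked from the whole site.
    print(is_allowed(ROBOTS_TXT, "ExampleAIBot", page))
    # Other agents fall under the wildcard rules, which only block /private/.
    print(is_allowed(ROBOTS_TXT, "OtherAgent", page))
```

A site that wants to opt out in this way adds a Disallow rule for the relevant user agent. Whether Vision reads robots.txt, a meta tag, or some other signal is not something the company has stated.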

Feedback and Collaboration

Microsoft has committed to taking feedback from publishers to allay their concerns. Among its collaborators are third-party publishers, who help the company understand how Vision could help people better engage and make decisions on their pages.

Conclusion

Copilot Vision brings AI assistance to the pages you visit in Microsoft Edge: it can analyze text and images, summarize and translate text, and handle tasks like spotlighting discounted products in a store catalog. For now it is a limited, U.S.-only preview that is blocked on paywalled and ‘sensitive’ content, a cautious rollout that is partly the product of Microsoft’s legal disputes with news outlets.

Frequently Asked Questions

  • Q: What is Copilot Vision?
    • A: Copilot Vision is a tool that uses AI to understand and respond to questions about sites you’re visiting using Microsoft Edge
  • Q: How does it work?
    • A: Once you choose to enable Copilot Vision, it sees the page you’re on and reads along with you, so you can talk through the problem you’re facing together
  • Q: What can Copilot Vision do?
    • A: It can answer questions, summarize and translate text, and handle tasks like spotlighting discounted products in a store catalog
