Microsoft boosts AI capabilities with updates to Copilot
On Tuesday, Microsoft revealed an upgraded version of its Copilot chatbot, incorporating new voice and visual features aimed at improving the user experience.
The enhanced Copilot can engage in voice conversations and interpret images. It offers four voice options that users can turn to for brainstorming, quick questions, and emotional support.
Mustafa Suleyman, Microsoft's executive vice president and CEO of AI, characterized Copilot as "in your corner, by your side," with the goal of delivering a seamless and intuitive AI experience.
After OpenAI drew criticism over a chatbot voice that resembled actress Scarlett Johansson, Microsoft hired voice actors to generate training data for the four voice options, ensuring none of them mimics a recognizable figure.
The company is also testing visual capabilities, which let the AI "see" content on a webpage as users interact with it and offer pertinent suggestions without interrupting their workflow. Microsoft says data gathered through the visual feature will be discarded after use, and that the feature will be limited to certain websites for safety.
Additionally, Microsoft has introduced the "Think Deeper" feature, which empowers Copilot to address more intricate queries and reasoning, following the path set by OpenAI's recently updated model aimed at scientific, coding, and mathematical challenges. The "Discover" feature is designed to create a more personalized Copilot experience based on user interactions; however, this feature is not yet available in the EU or Britain due to stricter data protection laws.
By leveraging its $13 billion partnership with OpenAI, Microsoft is bringing generative AI technology to a broader audience. The tech giant now faces heightened competition from key players such as Google, Apple, and Meta, all of whom are embedding AI within their popular platforms to target a wider consumer market.
On the same day, OpenAI introduced new developer tools aimed at making it easier to build AI applications. A standout is a real-time tool that lets developers create AI voice applications with a single set of instructions, simplifying a process that previously required multiple steps: transcribing audio, generating a response, and converting the text back into speech.
As part of this initiative, OpenAI also rolled out a fine-tuning tool that lets developers refine AI models after initial training. The tool allows developers to adjust model responses using images and text, incorporating human feedback so models better distinguish strong responses from weak ones. Fine-tuning with images improves a model's image recognition, which can enhance visual search and object detection in autonomous vehicles.
Frederick R Cook contributed to this report for TROIB News