ChatGPT Set to Receive Video Capabilities

OpenAI's AI chatbot is set to gain the capability to identify objects using a smartphone camera and provide immediate feedback.

Dec 13, 2024 - 13:00

OpenAI's ChatGPT can now process video input from users' smartphone cameras and respond in real time. The feature is available to paying ChatGPT Plus and Pro subscribers, with plans to extend access to enterprise and educational customers next month.

Since its launch in 2022, the AI chatbot has steadily gained new capabilities. Its developers said last year that the GPT-4 large language model could score higher on the SAT than more than 90% of test-takers.

OpenAI introduced this latest capability during a livestream event on Thursday. The feature allows ChatGPT to engage with users based on visual input captured via a smartphone camera or displayed on a screen. For example, users can request assistance in composing a response to messages in an open app or seek real-time advice for various tasks.

In February, OpenAI researchers showcased a tool called ‘Sora,’ which is “able to generate complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background,” all derived from user prompts. The tool can also extend images or videos by adding new elements, as the company demonstrated through examples shared on its social media platforms.

In July, Reuters reported that OpenAI was developing methods to significantly boost the reasoning capabilities of AI models. An anonymous source described these enhancements, which would enable ChatGPT not only to respond to queries but also to carry out “deep research” and actively browse the internet, as a “work in progress.”

The project, referred to as ‘Strawberry,’ aims to improve the AI chatbot's ability to identify common-sense solutions that are often intuitive for humans—an area where ChatGPT and similar models have struggled.

Around the same time, Russian scientists from T-Bank's AI Research Lab and the Moscow-based Artificial Intelligence Research Institute announced the creation of a new AI model dubbed ‘Headless-AD.’ According to its developers, this model can adapt to new tasks and contexts autonomously, without the need for human input, and is already capable of performing five times more actions than it was originally trained to do.

Mathilde Moreau for TROIB News
