OpenAI has introduced a faster, more cost-effective version of the artificial intelligence system that powers its chatbot, ChatGPT, as the start-up seeks to stay ahead of mounting competition. The new model, GPT-4o, was revealed at an event broadcast live on Monday and upgrades the GPT-4 model, which is now more than a year old. The new iteration of the large language model, trained on vast amounts of data sourced from the internet, can process text, audio, and images in real time. The upgraded features will be rolled out over the coming weeks.
The new system can generate an audio response to a spoken query within milliseconds, allowing for a more seamless exchange, according to the company. In a demonstration of the model, Mira Murati, OpenAI’s Chief Technology Officer, held a conversation with the updated ChatGPT using only her voice, and the chatbot replied in kind. The chatbot also translated speech almost instantly from one language to another and, at one point, narrated a section of a story on request.
Speaking to Bloomberg, Murati said: “This represents a significant advancement in user interaction and ease of use. We’re effectively enabling collaboration with tools like ChatGPT.” She added that a range of features previously restricted to paying ChatGPT subscribers will now be available to all users. These include the ability to search the web for answers, converse with the chatbot in different voices, and instruct it to remember details that can be recalled later.
The launch of GPT-4o lands in a fast-moving and ever-changing AI sector, in which GPT-4 remains a benchmark. Several technology companies and emerging start-ups, including Google’s parent company Alphabet, Cohere, and Anthropic, have recently introduced AI models that they claim match or surpass GPT-4 on certain benchmarks.
In an unusual blog post, OpenAI’s CEO, Sam Altman, stated that the original ChatGPT offered a glimpse of how people could use language to engage with computers, but the experience with GPT-4o is “profoundly different”.
“The sensation is akin to AI from cinema, and I still find it astonishing that it is tangible,” he remarked. “Achieving human-like response rates and expressivity proves to be a considerable shift.” GPT-4o – the ‘o’ stands for ‘omni’ – departs from the previous approach of running multiple AI models to process different kinds of input. Instead, it merges voice, text, and vision into a single model, which allows it to outpace its predecessor. Given a visual prompt, for instance, the system can respond with another image. The firm said the new model is twice as fast and considerably more efficient.
“Having three separate models operating in sync introduces substantial latency and interrupts the continuity of the experience,” noted Murati. “Yet, when a single model inherently employs reasoning across audio, text, and vision, all latency is eradicated, and interactions with ChatGPT more closely resemble how we communicate.”
However, the newly introduced model suffered some hitches. The audio cut out repeatedly while the researchers ran their demonstration. And the AI system caught the audience off guard when, after guiding a researcher through solving an algebra equation, it unexpectedly adopted a flirtatious tone, saying, “Impressive outfit you’re sporting there.”
OpenAI is beginning to roll out GPT-4o’s new text and image features to some premium ChatGPT Plus and Team users, and will bring them to corporate customers in due course. The firm intends to release the updated ‘voice mode’ assistant to ChatGPT Plus subscribers in the coming weeks.
Alongside these updates, OpenAI announced that anyone will now be able to use its GPT Store, which features user-created chatbots, ending its exclusivity for paying customers.
In recent weeks, frenzied speculation about OpenAI’s impending launch had become a favourite topic in Silicon Valley’s inner circles. A mysterious chatbot stoked intrigue among AI enthusiasts after appearing on a benchmarking site, where it seemed to rival GPT-4’s capabilities. Altman playfully alluded to the chatbot on X, fuelling speculation that his company was behind it. On Monday, an OpenAI staff member confirmed on X that the mysterious chatbot was indeed GPT-4o.
OpenAI has been actively engaged in the development of a vast line-up of innovations, spanning from voice technology to video software. Bloomberg had earlier reported that the firm was also on track to introduce a search utility for ChatGPT.
The company has also quashed some circulating speculation, stating that it will not be releasing GPT-5 any time soon – a version of the AI model speculated to far surpass existing systems in capability. It further clarified that it would not unveil a new search tool, a service that could rival Google, at Monday’s event. That news lifted Google’s share price.
Even so, in the wake of the event, Altman kept the anticipation alive. “We will be revealing more upcoming stuff soon,” he commented on X, Bloomberg reports.