Company claims OpenAI o1 can “reason” and calls it the most “dangerous” to date
OpenAI has released an update to ChatGPT called OpenAI o1, the first in a new series of AI models designed to complete more complex tasks and solve harder problems in science, coding and math.
Known during development as Strawberry, a preview version of o1 has been made available to existing ChatGPT users. There are two versions: o1-preview for general users, with a query limit of 50 per week, and o1-mini, targeted at developers, with a limit of 50 queries a day.
According to the company, it could be used by health care researchers to annotate cell sequencing data, by physicists to generate complicated mathematical formulas needed for quantum optics and by developers in all fields to build and execute multi-step workflows.
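The "multi-step workflow" idea can be illustrated with a minimal sketch. This is not OpenAI's API; the step functions below are hypothetical stand-ins for model calls, showing only the chaining pattern in which each step's output feeds the next:

```python
# Minimal sketch of a multi-step workflow: plan a task, then execute
# each sub-step in order. The functions are hypothetical placeholders
# for calls to a reasoning model, not real API calls.

def plan(task: str) -> list[str]:
    # Placeholder: a model call would break the task into sub-steps.
    return [f"{task}: step {i}" for i in (1, 2, 3)]

def execute(step: str) -> str:
    # Placeholder: a model call would carry out one sub-step.
    return f"done({step})"

def run_workflow(task: str) -> list[str]:
    # Chain the phases: plan first, then execute each sub-step in order.
    return [execute(s) for s in plan(task)]

results = run_workflow("annotate cell data")
print(results[0])  # → done(annotate cell data: step 1)
```

In a real integration, each placeholder would be replaced by a model request, with the previous step's output passed along as context.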
Self-Training Process
While OpenAI uses terminology like “thinking” and “reasoning” for its new models, o1 represents a step forward in generative AI but is nowhere near the industry’s ultimate goal of creating an artificial general intelligence (AGI).
OpenAI announced in a blog post that o1 and the upcoming models in the series spend more time processing before they respond, using a self-training process to learn new strategies and recognize mistakes.
“In our tests, the next model update performs similarly to PhD students on challenging benchmark tasks in physics, chemistry and biology,” the company claimed. “We also found that it excels in math and coding. In a qualifying exam for the International Mathematics Olympiad (IMO), GPT-4o correctly solved only 13% of problems, while the reasoning model scored 83%.”
However, this early release of o1 has some limits compared with ChatGPT; for example, it cannot browse the web or upload files and images, unlike the chatbot’s current engine, GPT-4o.
As CEO Sam Altman said in a post on X (formerly Twitter), “o1 is still flawed, still limited and it still seems more impressive on first use than it does after you spend more time with it.”
Safety Training
OpenAI has described the new model as its “most dangerous yet,” but this has widely been dismissed as pure marketing. In fact, the company has put in place several safety measures and guardrails, and has trained the model to resist attempts to bypass them, a practice known as jailbreaking. According to the company, GPT-4o scored 22 (on a scale of 0-100) in its internal jailbreaking test, while the o1-preview model scored 84.
The company recently began formalizing agreements with the U.S. and U.K. AI Safety Institutes, granting them early access to a research version of this model in order to establish a process for the research, evaluation and testing of future models before release.
Source: https://aibusiness.com/nlp/openai-chatgpt-model-update-offers-advanced-reasoning