How A.I. Chatbots Like ChatGPT and DeepSeek Reason

In September, OpenAI unveiled a new version of ChatGPT designed to reason through tasks involving math, science and computer programming. Unlike previous versions of the chatbot, this new technology could spend time “thinking” through complex problems before settling on an answer.

Soon, the company said its new reasoning technology had outperformed the industry’s leading systems on a series of tests that track the progress of artificial intelligence.

Now other companies, like Google, Anthropic and China’s DeepSeek, offer similar technologies.

But can A.I. actually reason like a human? What does it mean for a computer to think? Are these systems really approaching true intelligence?

Reasoning just means that the chatbot spends some additional time working on a problem.

“Reasoning is when the system does extra work after the question is asked,” said Dan Klein, a professor of computer science at the University of California, Berkeley, and chief technology officer of Scaled Cognition, an A.I. start-up.

It may break a problem into individual steps or try to solve it through trial and error.

The original ChatGPT answered questions immediately. The new reasoning systems can work through a problem for several seconds — or even minutes — before answering.

In some cases, a reasoning system will refine its approach to a question, repeatedly trying to improve the method it has chosen. Other times, it may try several different ways of approaching a problem before settling on one of them. Or it may go back and check some work it did a few seconds before, just to see if it was correct.

This is kind of like a grade school student who is struggling to find a way to solve a math problem and scribbles several different options on a sheet of paper.
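
For readers who like to see the idea in code, here is a toy sketch of that try-it, check-it pattern in Python. The problem, the candidate answers and the checking step are all invented for illustration; commercial reasoning systems carry out these steps inside the model itself, not in a script like this.

# Toy illustration of trial and error plus self-checking, not how reasoning
# models are actually built. The "problem" is finding an integer solution
# of x * x - 5 * x + 6 == 0.

def is_correct(x):
    # Re-check a candidate answer, like a student re-reading scratch work.
    return x * x - 5 * x + 6 == 0

def solve_by_trial_and_error(candidates):
    scratch_work = []                  # every attempt, like scribbles on paper
    for x in candidates:
        scratch_work.append(x)
        if is_correct(x):              # verify before settling on an answer
            return x, scratch_work
    return None, scratch_work          # nothing worked; report the attempts anyway

answer, attempts = solve_by_trial_and_error(range(-10, 11))
print("answer:", answer, "| attempts made:", len(attempts))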

A reasoning system can potentially reason about anything. But reasoning is most effective when you ask questions involving math, science and computer programming.

You could ask earlier chatbots to show you how they had reached a particular answer or to check their own work. But a reasoning system goes further. It can do these kinds of things without being asked. And it can do them in more extensive and complex ways.

Companies call it a reasoning system because it feels as if it operates more like a person thinking through a hard problem.

For years, these companies relied on a simple concept: The more internet data they pumped into their chatbots, the better those systems performed.

But in 2024, they used up almost all of the text on the internet.

That meant they needed a new way of improving their chatbots. So they started building reasoning systems.

Last year, companies like OpenAI began to lean heavily on a technique called reinforcement learning.

Through this process — which can extend over months — an A.I. system can learn behavior through extensive trial and error. By working through thousands of math problems, for instance, it can learn which methods lead to the right answer and which do not.

It is a little like training a dog. If the system does well, you give it a cookie. If it doesn’t do well, you say, “Bad dog.”
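
As a rough, hand-built illustration of that reward loop, the sketch below pits two made-up "methods" for adding numbers against each other and rewards whichever one gets the right answer. It is a deliberately tiny stand-in for reinforcement learning; real training adjusts billions of model weights rather than two running averages.

import random

# Two invented "methods" compete on simple addition problems. Correct answers
# earn a reward (the cookie), and the loop gradually prefers whichever method
# has the higher average reward.

def sloppy_add(a, b):
    return a + b + random.choice([0, 0, 1])   # right only about two-thirds of the time

def careful_add(a, b):
    return a + b                               # always right

methods = {"sloppy": sloppy_add, "careful": careful_add}
totals = {name: 0.0 for name in methods}       # total reward earned per method
counts = {name: 1 for name in methods}         # times each method was tried

for _ in range(2000):
    # Mostly pick the method with the best average reward, but sometimes explore.
    if random.random() < 0.1:
        name = random.choice(list(methods))
    else:
        name = max(methods, key=lambda m: totals[m] / counts[m])
    a, b = random.randint(0, 9), random.randint(0, 9)
    reward = 1.0 if methods[name](a, b) == a + b else 0.0   # cookie or no cookie
    totals[name] += reward
    counts[name] += 1

for name in methods:
    print(name, "average reward:", round(totals[name] / counts[name], 2))

Run it and the "careful" method ends up with the clearly higher average reward, which is the sense in which the system has learned which approach works.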

(The New York Times sued OpenAI and its partner, Microsoft, in December for copyright infringement of news content related to A.I. systems.)

Reinforcement learning works pretty well in certain areas, like math, science and computer programming. These are areas where companies can clearly define good behavior and bad. Math problems have definitive answers.

Reinforcement learning doesn’t work as well in areas like creative writing, philosophy and ethics, where the distinction between good and bad is harder to pin down. Researchers say this process can generally improve an A.I. system’s performance, even when it answers questions outside math and science.

The system gradually learns which patterns of reasoning lead it in the right direction and which don't, said Jared Kaplan, chief science officer at Anthropic.

Reinforcement learning is not the same thing as reasoning, though. It is the method that companies use to build reasoning systems: the training stage that ultimately allows chatbots to reason.

These systems still make mistakes. Everything a chatbot does is based on probabilities. It chooses a path that is most like the data it learned from, whether that data came from the internet or was generated through reinforcement learning. Sometimes it chooses an option that is wrong or does not make sense.
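
A small sketch can make the probability point concrete. The vocabulary and the numbers below are invented for illustration; a real chatbot computes probabilities like these with a neural network over a vocabulary of tens of thousands of tokens.

import random

# Hand-written probabilities for the next word after "The capital of France is".
next_word_probs = {
    "Paris": 0.90,    # the most likely continuation
    "Lyon": 0.07,
    "Berlin": 0.03,   # a low-probability, wrong option the model can still pick
}

words = list(next_word_probs)
weights = list(next_word_probs.values())

for _ in range(5):
    # random.choices samples according to the weights, so the wrong answer
    # still comes up occasionally, which is one source of chatbot mistakes.
    print(random.choices(words, weights=weights)[0])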

Whether these systems will keep improving, and whether they are approaching true intelligence, are questions that split A.I. experts. These methods are still relatively new, and researchers are still trying to understand their limits. In the A.I. field, new methods often progress very quickly at first, before slowing down.
