Technology

Teaching AI models to say “I’m not sure”

By Admin
Last updated: 2026/04/29 at 5:35 AM

Confidence is persuasive. In artificial intelligence systems, it is often misleading.

Today’s most capable reasoning models share a trait with the loudest voice in the room: They deliver every answer with the same unshakable certainty, whether they’re right or guessing. Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have now traced that overconfidence to a specific flaw in how these models are trained, and developed a method that fixes it without giving up any accuracy.

The technique, called RLCR (Reinforcement Learning with Calibration Rewards), trains language models to produce calibrated confidence estimates alongside their answers. In addition to coming up with an answer, the model thinks about its uncertainty in that answer, and outputs a confidence score. In experiments across multiple benchmarks, RLCR reduced calibration error by up to 90 percent while maintaining or improving accuracy, both on the tasks the model was trained on and on entirely new ones it had never seen. The work will be presented at the International Conference on Learning Representations later this month.
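Calibration error, the quantity RLCR is reported to cut by up to 90 percent, is commonly measured by binning predictions by stated confidence and comparing each bin's average confidence to its empirical accuracy. A minimal sketch of one standard such metric, expected calibration error (ECE), follows; the exact metric and binning used in the paper may differ.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Illustrative ECE: confidences are floats in [0, 1],
    correct are 0/1 outcomes for the corresponding answers."""
    n = len(confidences)
    ece = 0.0
    for b in range(n_bins):
        lo, hi = b / n_bins, (b + 1) / n_bins
        # Half-open bins; the top bin also includes confidence == 1.0.
        idx = [i for i, c in enumerate(confidences)
               if lo <= c < hi or (b == n_bins - 1 and c == 1.0)]
        if not idx:
            continue
        avg_conf = sum(confidences[i] for i in idx) / len(idx)
        accuracy = sum(correct[i] for i in idx) / len(idx)
        # Weight each bin's confidence/accuracy gap by its share of samples.
        ece += len(idx) / n * abs(avg_conf - accuracy)
    return ece

# A model that always says 0.95 but is right half the time scores poorly:
print(expected_calibration_error([0.95] * 10, [1, 0] * 5))  # 0.45
```

A perfectly calibrated model, whose stated confidence matches its hit rate in every bin, scores zero.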

The problem traces to a surprisingly simple source. The reinforcement learning (RL) methods behind recent breakthroughs in AI reasoning, including the training approach used in systems like OpenAI’s o1, reward models for getting the right answer, and penalize them for getting it wrong. Nothing in between. A model that arrives at the correct answer through careful reasoning receives the same reward as one that guesses correctly by chance. Over time, this trains models to confidently answer every question they are asked, whether they have strong evidence or are effectively flipping a coin.

That overconfidence has consequences. When models are deployed in medicine, law, finance, or any setting where users make decisions based on AI outputs, a system that expresses high confidence regardless of its actual certainty becomes unreliable in ways that are difficult to detect from the outside. A model that says “I’m 95 percent sure” when it is right only half the time is more dangerous than one that simply gets the answer wrong, because users have no signal to seek a second opinion.

“The standard training approach is simple and powerful, but it gives the model no incentive to express uncertainty or say ‘I don’t know,’” says Mehul Damani, an MIT PhD student and co-lead author on the paper. “So the model naturally learns to guess when it is unsure.”

RLCR addresses this by adding a single term to the reward function: a Brier score, a well-established measure that penalizes the gap between a model’s stated confidence and its actual accuracy. During training, models learn to reason about both the problem and their own uncertainty, producing an answer and a confidence estimate together. Confidently wrong answers are penalized. So are unnecessarily uncertain correct ones.
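For a single question with a binary outcome, a reward of this shape can be sketched in a few lines: the usual correctness term, minus a Brier penalty on the gap between the model's stated confidence and what actually happened. The function below is an illustration of the idea, not the paper's implementation.

```python
def rlcr_style_reward(correct: bool, confidence: float) -> float:
    """Correctness reward minus a Brier penalty.

    confidence is the model's self-reported probability that its
    answer is right, in [0, 1]. The Brier term (confidence - y)**2
    is zero only when stated confidence matches the outcome exactly.
    """
    y = 1.0 if correct else 0.0
    brier_penalty = (confidence - y) ** 2
    return y - brier_penalty

# Confidently correct is best; confidently wrong is worst:
print(rlcr_style_reward(True, 1.0))   # 1.0
print(rlcr_style_reward(True, 0.5))   # 0.75 -- correct but hedging too much
print(rlcr_style_reward(False, 0.9))  # -0.81 -- confidently wrong
print(rlcr_style_reward(False, 0.1))  # -0.01 -- wrong, but said so
```

Under this reward, a lucky guess delivered at 95 percent confidence no longer pays the same as a carefully reasoned answer: the model maximizes its return only by reporting confidence that tracks its true hit rate.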

The math backs it up: the team proved formally that this type of reward structure guarantees models that are both accurate and well-calibrated. They then tested the approach on a 7-billion-parameter model across a range of question-answering and math benchmarks, including six datasets the model had never been trained on.

The results showed a consistent pattern. Standard RL training actively degraded calibration compared to the base model, making models worse at estimating their own uncertainty. RLCR reversed that effect, substantially improving calibration with no loss in accuracy. The method also outperformed post-hoc approaches, in which a separate classifier is trained to assign confidence scores after the fact. “What’s striking is that ordinary RL training doesn’t just fail to help calibration. It actively hurts it,” says Isha Puri, an MIT PhD student and co-lead author. “The models become more capable and more overconfident at the same time.”

The team also demonstrated that the confidence estimates produced by RLCR are practically useful at inference time. When models generate multiple candidate answers, selecting the one with the highest self-reported confidence, or weighting votes by confidence in a majority-voting scheme, improves both accuracy and calibration as compute scales.
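The confidence-weighted voting idea described above can be sketched simply: each sampled candidate casts a vote for its answer, weighted by its self-reported confidence, and the highest-weight answer wins. Names and structure here are hypothetical, not taken from the paper.

```python
from collections import defaultdict

def confidence_weighted_vote(candidates):
    """candidates: iterable of (answer, confidence) pairs from
    repeated sampling; returns the answer with the largest total
    confidence-weighted vote."""
    weights = defaultdict(float)
    for answer, conf in candidates:
        weights[answer] += conf
    return max(weights, key=weights.get)

# Three moderately confident samples agreeing on "42" outweigh
# one confident outlier (total weight 1.65 vs. 0.9):
print(confidence_weighted_vote(
    [("42", 0.6), ("42", 0.55), ("42", 0.5), ("7", 0.9)]))  # 42
```

Because the weights come from the model's own calibrated confidence, this scheme can only help to the extent that those confidences are trustworthy, which is exactly what RLCR training is meant to provide.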

An additional finding suggests that the act of reasoning about uncertainty itself has value. The researchers trained classifiers on model outputs and found that including the model’s explicit uncertainty reasoning in the input improved the classifier’s performance, particularly for smaller models. The model’s self-reflective reasoning about what it does and doesn’t know contains real information, not just decoration.

Source: Massachusetts Institute of Technology.

