Morocco World News

BridgeBench Shows Top AI Models at 10% Accuracy Despite Strong Reasoning

The results suggest that even when leading AI models are wrong, they can still produce highly convincing explanations, as reflected in consistently high evidence scores.

By Oumaima Moho Amer
April 17, 2026

Casablanca – BridgeBench, a new benchmarking project focused on AI reasoning, has released a ranking that exposes a gap between how confidently models explain answers and how often those answers are correct.

The benchmark tests models on reasoning-heavy tasks and scores them on three metrics. Accuracy measures whether the final answer is correct. Evidence evaluates how well the model supports its reasoning with verifiable steps or sources. The overall score combines both, aiming to reward systems that not only answer but also justify.

In the latest results, xAI’s Grok 4.20 Reasoning model ranks first with a score of 41.8. It records 10.0% accuracy and 89.7% on evidence. OpenAI’s GPT-5.4 follows closely with a score of 40.6, posting the same 10.0% accuracy and a slightly stronger evidence score of 90.6%.

Anthropic’s Claude Opus 4.7 comes third at 40.3, but with lower accuracy at 6.7%, offset by the highest evidence score among the top models at 91.3%.

In fourth place is Grok 4.20, the non-reasoning version, scoring 40.0 with 6.7% accuracy and 89.9% evidence. Claude Opus 4.6 rounds out the top five with a score of 39.6, posting 10.0% accuracy and 86.1% evidence.

Further down, Google’s Gemini 3.1 Pro ranks 15th with a score of 34.3. Its accuracy drops sharply to 3.3%, despite an evidence score of 89.1%.

What makes the ranking striking is not who leads, but how low accuracy remains across all models. Even the top systems answer correctly only about one time in ten.

At the same time, their evidence scores are consistently high, raising questions about what exactly is being measured. If models can produce convincing chains of reasoning while still being wrong most of the time, the benchmark may be capturing fluency more than reliability.
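
To make that gap concrete, the short Python sketch below tabulates the six results quoted in this article and computes, for each model, the difference in percentage points between its evidence score and its accuracy. The gap is an illustrative calculation of our own, not a BridgeBench metric, and the names `REPORTED`, `evidence_accuracy_gap` and `gaps` are hypothetical. The sketch does not try to reproduce the overall score, since the article does not state the formula behind it.

```python
# Figures as quoted in the article: (model, overall score, accuracy %, evidence %).
REPORTED = [
    ("Grok 4.20 Reasoning", 41.8, 10.0, 89.7),
    ("GPT-5.4", 40.6, 10.0, 90.6),
    ("Claude Opus 4.7", 40.3, 6.7, 91.3),
    ("Grok 4.20", 40.0, 6.7, 89.9),
    ("Claude Opus 4.6", 39.6, 10.0, 86.1),
    ("Gemini 3.1 Pro", 34.3, 3.3, 89.1),
]


def evidence_accuracy_gap(accuracy, evidence):
    """Evidence minus accuracy, in percentage points.

    Illustrative only: this is not a metric published by BridgeBench.
    """
    return round(evidence - accuracy, 1)


def gaps(rows):
    """Map each model name to its evidence-accuracy gap."""
    return {name: evidence_accuracy_gap(acc, ev) for name, _score, acc, ev in rows}


if __name__ == "__main__":
    for name, gap in gaps(REPORTED).items():
        print(f"{name}: {gap} points")
```

For every model listed, the gap exceeds 75 percentage points, which is the pattern the ranking makes visible.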

Tags: AI, ChatGPT, Claude, Gemini, Grok