OpenAI Claims to Have Found the Cause of AI Hallucinations 👀
Merge images into a single picture using Nano Banana

Welcome to another edition of Horizon AI.
OpenAI published a new paper asking a simple question with a hard answer: why do LLMs still make things up, and what can we do about it?
Let’s jump into it!
Read Time: 4.5 min
Here's what's new today in Horizon AI:
Did OpenAI Just Solve Hallucinations?
Google Introduces New AI-Powered Capabilities for Gemini, Search, and NotebookLM
AI Tutorial: Merge images into a single picture using Nano Banana
AI Tools to check out
AI Findings/Resources
The Latest in AI and Tech 💡
AI News
OPENAI
Did OpenAI Just Solve Hallucinations?

In the paper, OpenAI defines hallucinations as “plausible but false statements generated by language models,” and acknowledges that despite improvements, hallucinations “remain a fundamental challenge for all large language models.”
Details:
The research suggests hallucinations arise because standard training and evaluation reward guessing over saying “I don’t know.”
When models are graded only on accuracy, correct guesses raise their score, while admitting uncertainty guarantees a zero.
The researchers propose redesigning evaluation metrics to explicitly penalize confident errors and award partial credit for expressing uncertainty, similar to tests that deduct points for wrong guesses.
The paper admits there is still a lot of work to be done: these methods may help reduce hallucinations, but completely solving the problem is still a long way off.
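To see why accuracy-only grading rewards guessing, here is a minimal sketch of the expected-score arithmetic. The function name, the 1-point penalty, and the 25% chance of a correct guess are illustrative assumptions, not values taken from the paper.

```python
def expected_score(p_correct: float, wrong_penalty: float = 0.0,
                   abstain_credit: float = 0.0) -> dict:
    """Expected score of guessing vs. abstaining on one question.

    p_correct: chance that a guess is right (hypothetical value).
    wrong_penalty: points deducted for a confident wrong answer.
    abstain_credit: credit given for answering "I don't know".
    """
    guess = p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty
    return {"guess": guess, "abstain": abstain_credit}


# Accuracy-only grading: a wrong answer costs nothing, abstaining earns nothing.
accuracy_only = expected_score(p_correct=0.25)

# Penalized grading (assumed 1-point deduction for a wrong answer).
penalized = expected_score(p_correct=0.25, wrong_penalty=1.0)

print(accuracy_only)  # guessing scores higher than abstaining
print(penalized)      # guessing now scores lower than abstaining
```

Under the first scheme a long-shot guess still beats "I don't know," which is the incentive the researchers describe. Once wrong answers cost points, abstaining becomes the better choice whenever the model is unsure enough.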
TOGETHER WITH STARROCKS SUMMIT 2025
StarRocks Summit 2025 — Free & Virtual
A day of engineers teaching engineers on Sept 10 (free and virtual). 25+ practical sessions from Coinbase, Intuit, Pinterest, Demandbase, plus Celonis, TRM Labs, Eightfold.
See how teams use StarRocks, the open-source, high-performance analytical database, to:
Hit sub-second queries at PB scale under heavy concurrency
Kill brittle joins and pipelines for fresher data
Cut storage 10× and shrink infra bills
Run Apache Iceberg with warehouse-grade speed
Power AI-driven, customer-facing analytics
Real results:
🔧 Demandbase replaced 49 ClickHouse clusters and slashed storage ~90%.
⚡ Pinterest halved p90 latency; Druid costs down to one-third.
🧊 TRM Labs built an Iceberg-centered lakehouse delivering sub-second insights.
Explore the agenda & claim your free pass 👉 One day only, Sept 10.
Google Introduces New AI-Powered Capabilities for Gemini, Search, and NotebookLM

Google is upgrading three Gemini-powered products: the Gemini app now supports audio files, Search’s AI Mode adds five new languages, and NotebookLM can automatically generate structured reports.
Details:
The Gemini app now supports audio uploads: free users can upload up to 10 minutes of audio and get five free prompts per day, while AI Pro and AI Ultra users can upload up to three hours. Each prompt can include up to 10 files in many formats, even bundled inside ZIPs.
Thanks to Gemini 2.5 integration, Google Search’s AI Mode now supports Hindi, Indonesian, Japanese, Korean, and Brazilian Portuguese.
NotebookLM gains new report styles in over 80 languages, generating study guides, briefing docs, blog posts, flashcards, and quizzes from a user’s uploaded docs and media. The feature should reach all users by the end of this week.
Even with multiple AI updates already released in the past month, Google shows no signs of slowing down.
AI Tutorial
Merge images into a single picture using Nano Banana

Source: @MrDavids1 on X
Go to the Gemini website or mobile app and select the Gemini 2.5 Flash model, or go to Google AI Studio and choose the Nano Banana/gemini-2.5-flash-image-preview model.
Upload your images.
Enter a prompt describing how you’d like them to appear in the final image.
Example: "A model is posing and leaning against a white nissan silvia. He is wearing the following items, the pants are tucked into the sneaker. He is standing against a dark grey background. The car is doing a standstill burnout with orange smoke rising from the back tyres. There is a tiger sitting next to him."

A template could be:
A [subject] is [doing an action or positioned in some way] with [object]. The [subject] is [appearance, clothing, or accessories if relevant]. The background is [description of setting or color]. [Optional: describe object/vehicle action or effect]. There are also [extra characters, animals, or props] next to the [subject].
But you can adjust it further depending on the result you want and the elements you want to include.
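If you reuse the template often, a small helper can assemble the prompt text for you. This is only a sketch: `build_prompt` and its parameter names are hypothetical, and it produces a string to paste into Gemini rather than calling any Google API.

```python
def build_prompt(subject, pose, obj, appearance, background,
                 effect=None, extras=None):
    """Fill the prompt template; effect and extras are optional."""
    sentences = [
        f"A {subject} is {pose} with {obj}.",
        f"The {subject} is {appearance}.",
        f"The background is {background}.",
    ]
    if effect:
        # Free-form sentence describing an object/vehicle action or effect.
        sentences.append(effect)
    if extras:
        sentences.append(f"There are also {extras} next to the {subject}.")
    return " ".join(sentences)


# Example values are made up for illustration.
prompt = build_prompt(
    subject="model",
    pose="leaning",
    obj="a white Nissan Silvia",
    appearance="wearing the items from the uploaded images",
    background="dark grey",
    effect="The car is doing a standstill burnout with orange smoke.",
    extras="a tiger and a lion",
)
print(prompt)
```

Leaving `effect` and `extras` unset simply drops those sentences, which mirrors the optional parts of the template.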
AI Tools to check out
👀 Uxia: Validate your user flows, UX, and UI in seconds with AI.
🌀 Spiral: Analyze customer feedback with AI.
🎶 Mureka: AI music generator that creates unique and customizable songs, lyrics and tracks for any project.
AI Findings/Resources
🛡️ A threat researcher illustrates 5 different prompt injection techniques used to exploit weaknesses or bypass safeguards in AI models.
🤔 How the AI boom is leaving consultants behind
✈️ A United Airlines passenger explained how they fooled the AI chatbot so they could finally speak with a person.
TOGETHER WITH PACASO
Keep This Stock Ticker on Your Watchlist
They’re a private company, but Pacaso just reserved the Nasdaq ticker “$PCSO.”
No surprise the same firms that backed Uber, eBay, and Venmo already invested in Pacaso. What is unique is Pacaso is giving the same opportunity to everyday investors. And 10,000+ people have already joined them.
Created by a former Zillow exec who sold his first venture for $120M, Pacaso brings co-ownership to the $1.3T vacation home industry.
They’ve generated $1B+ worth of luxury home transactions across 2,000+ owners. That’s good for more than $110M in gross profit since inception, including 41% YoY growth last year alone.
And you can join them today for just $2.90/share. But don’t wait too long. Invest in Pacaso before the opportunity ends September 18.
Paid advertisement for Pacaso’s Regulation A offering. Read the offering circular at invest.pacaso.com. Reserving a ticker symbol is not a guarantee that the company will go public. Listing on the NASDAQ is subject to approvals.
The latest in AI and Tech
Alibaba’s Qwen team has unveiled its largest language model yet: Qwen3-Max-Preview (Instruct), featuring over 1 trillion parameters, blazing-fast response times, and API availability.
Benchmarks show it surpasses Alibaba’s previous top model, Qwen3-235B, and competes closely with other high-end LLMs, outperforming Claude Opus 4, Kimi K2, and DeepSeek-V3.1 across multiple tests.
Authors Grady Hendrix and Jennifer Roberson filed a proposed class-action lawsuit accusing Apple of using copyrighted books to train its AI systems without consent, credit, or compensation.
The funding more than doubles the startup’s $4 billion valuation from March. The round was led by Founders Fund and also saw participation from Lux, 8VC, Elad Gil, and other investors.
ASML, the Dutch supplier of advanced chipmaking equipment, is set to become the largest shareholder in French AI startup Mistral. The company is investing €1.3 billion ($1.5 billion) in Mistral’s €1.7 billion (~$2 billion) Series C funding round and is expected to take a board seat.
That’s a wrap!
We'd love to hear your thoughts on today's email! Your feedback helps us improve our content.
Not subscribed yet? Sign up here and send it to a colleague or friend!
See you in our next edition!
Gina 👩🏻‍💻