• Horizon AI
  • Posts
  • How to Run Your Own Free, Offline, and Totally Private AI Chatbot in 5 Minutes 🤖

How to Run Your Own Free, Offline, and Totally Private AI Chatbot in 5 Minutes 🤖

Gemini 3 Deep Think gets major upgrade 💥

In partnership with

Welcome to another edition of Horizon AI,

Running LLMs locally might sound complicated, but it's easier than most people think. In today's issue, we show how you can use LM Studio to download, load, and chat with AI models privately, offline, and for free.

Let’s jump into it!

Read Time: 4.5 min

Here's what's new today in the Horizon AI

  • Gemini 3 Deep Think Gets Major Upgrade

  • GLM-5: The World’s Strongest Open-Source Model

  • AI Tutorial: How to run LLMs locally with LM Studio

  • AI Tools to check out

  • AI Findings/Resources

  • The Latest in AI and Tech 💡

AI News

GOOGLE

Gemini 3 Deep Think Gets Major Upgrade

Google has rolled out a significant upgrade to Gemini's Deep Think reasoning mode to let it "solve modern challenges across science, research, and engineering."

Details:

  • The company worked with scientists and researchers on this update, with the goal of using Deep Think to "tackle tough research challenges" that "often lack clear guardrails or a single correct solution and data is often messy or incomplete."

  • Google reports strong benchmark gains, including setting a new standard on Humanity's Last Exam, a remarkable 84.6% on ARC-AGI-2, an Elo of 3455 on Codeforces, and reaching gold-medal level performance on the International Math Olympiad 2025.

  • The leap in mathematics and competitive coding is joined by boosted performance in chemistry, physics, and other scientific domains.

  • Beyond benchmarks, the new Deep Think is designed "to drive practical applications," enabling researchers to interpret complex data and engineers to model physical systems through code.

This Gemini 3 Deep Think upgrade is available in the Gemini app for Google AI Ultra subscribers and via the Gemini API to "select researchers, engineers and enterprises."

TOGETHER WITH NEMOVIDEO

Stop Scrolling. NemoVideo Finds Viral Ideas And Edits Them Fast

NemoVideo is your video sidekick. It hunts proven viral ideas, breaks them down shot by shot, and shows you the structure behind why they worked. You can talk to it like a teammate—ask for a faster pace, different music feel, tighter captions, or more punch in the first three seconds.

Perfect for creators, freelancers, and small brands who want better videos without living in an editor.

  • Finds viral videos that fit your niche

  • Explains what makes them work

  • Builds a draft you can tweak by chat

  • Adds captions, split-screen, and smart cutaways

Less guessing. Less editing stress. More predictable results. More videos you're proud to post.

ZHIPU

GLM-5: The World’s Strongest Open-Source Model

Chinese AI company Zhipu AI has introduced GLM-5, the fifth generation of its large language model that the company claims matches Claude Opus 4.5 and GPT-5.2 on coding and agent tasks.

Details:

  • The new model doubled the size of its predecessor, GLM-4.7, from 355 billion to 744 billion parameters and increased the training data to 28.5 trillion tokens.

  • GLM-5's benchmarks make it the new most powerful open-source model in the world, according to Artificial Analysis, surpassing Chinese rival Moonshot's new Kimi K2.5 released just two weeks ago.

  • Beyond performance, GLM-5 undercuts competitors on price, at approximately $0.80–$1.00 per million input tokens and $2.56–$3.20 per million output tokens. That’s roughly 6x cheaper on input and nearly 10x cheaper on output than Claude Opus 4.6 ($5/$25).

  • The model weights are available under the MIT license, one of the most permissive open-source licenses.

While benchmark scores don't necessarily translate to real-world performance, the speed of the release is notable: its predecessor, GLM-4.7, was launched less than two months ago. It's also another example of how Chinese AI companies are catching up with proprietary Western rivals at a remarkable pace.

AI Tutorial

How to run LLMs locally with LM Studio

  1. Download LM Studio for your OS (Windows, macOS, or Linux).

  2. In LM Studio, go to the search tab and search for models by name or author (e.g., Llama, Google Gemma, Qwen, Mistral, GLM, etc.). You can even filter based on whether the model fits within the available memory on your current device.

  3. Once you find the one you're looking for, download it.

  4. Open the Chat tab and load the model from the top bar.

  1. Once the model is loaded, you can start a back-and-forth conversation with the model in the Chat tab.

AI Tools to check out

 Fimo: It helps teams build motion-first multi-page websites with AI-powered workflows, collaborative editing, and automated publishing.

📹 Leadde: Generative AI platform for business that transforms your content into professional, multilingual and interactive videos in minutes.

🐟 Fish Audio: Voice generation with emotion control, voice cloning, and pro audio tools.

💼 Resumly: Automates your job search with AI-powered job matching, resume creation, and one-click apply.

🎥 MovArt AI: Turn ideas into cinema, instantly.

TOGETHER WITH YOU.COM

AI is all the rage, but are you using it to your advantage?

Successful AI transformation starts with deeply understanding your organization’s most critical use cases. We recommend this practical guide from You.com that walks through a proven framework to identify, prioritize, and document high-value AI opportunities. Learn more with this AI Use Case Discovery Guide.

AI Findings/Resources

👉 Developer shares how they use Claude Code after 9 months of experience

🍝 "Will smith eating spaghetti" by Seedance 2.0

🔨 Spotify says its best devs stopped coding in December thanks to AI

The latest in AI and Tech

The new model "delivers a substantial leap in generation quality," offering improvements in generating complex scenes with multiple subjects and its ability to follow instructions. It can generate up to 15-second clips with audio, while taking camera movement, visual effects, and motion into account.

Although access is still limited (only available through ByteDance's Dreamina AI platform and through its AI assistant, Doubao), viral videos generated with the technology have been flooding social media for days now, including action scenes of famous actors like Tom Cruise and Brad Pitt, or fictional scenes from shows like Breaking Bad.

The videos gained so much traction that it even caught the attention of major US studios who now demand that the company "immediately cease its infringing activity."

The company describes GPT-5.3-Codex-Spark as a "smaller version" of the GPT-5.3-Codex model released earlier this month. They claim it's their "first model designed for real-time coding."

Ryan Beiermeister, OpenAI’s former vice president of product policy, was dismissed in January after a male colleague accused her of sexual discrimination. Beiermeister denied the allegation and said her termination followed internal disagreements, including concerns about a proposed ChatGPT “adult mode” feature.

The round was led by GIC and Coatue Management, with participation from firms such as Microsoft and Nvidia.

Elon Musk announced that the company implemented a reorganization that "required parting ways with some people" to "improve speed of execution." He also added that the company is "hiring aggressively."

That’s a wrap!

Thanks for sticking with us to the end! Let’s stay connected on LinkedIn and Twitter.

We'd love to hear your thoughts on today's email!

Your feedback helps us improve our content

Login or Subscribe to participate in polls.

Not subscribed yet? Sign up here and send it to a colleague or friend!

See you in our next edition!

Gina 👩🏻‍💻