Mistral AI: New Medium 3.5 Language Model and Cloud Coding Agents

French AI startup Mistral has unveiled the Medium 3.5 language model, along with new cloud features for coding agents and the AI assistant Le Chat.

The Mistral AI logo on a smartphone screen. (Image: jackpress / Shutterstock.com)

An agentic chatbot, cloud-based AI coding, and a powerful new open-weights model: French AI startup Mistral has announced three pieces of news. The centerpiece is the new Mistral Medium 3.5 language model, which can be self-hosted on as few as four GPUs.

Mistral Medium 3.5 has 128 billion parameters and a context window of 256,000 tokens, combining instruction following, reasoning, and coding in a single model. Its distinguishing feature: the computational effort spent on reasoning can be configured per request, so the same model can deliver a fast chat response as well as work through complex agent tasks. Mistral Medium 3.5 thus extends a growing model family: with Mistral 3, the company had introduced four new models at the end of 2025, intended to compete with US and Chinese rivals.
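The per-request reasoning configuration can be sketched as follows. This is a minimal illustration only: the model identifier and the `reasoning_effort` field name are assumptions, not confirmed API details; consult Mistral's API reference for the actual parameter names.

```python
import json

def build_chat_request(prompt: str, reasoning_effort: str = "low") -> str:
    """Build a JSON request body for a chat completion.

    The same model serves both modes: 'low' effort for a fast chat
    reply, 'high' effort for complex agent tasks. The field name
    'reasoning_effort' is a hypothetical placeholder.
    """
    payload = {
        "model": "mistral-medium-3.5",  # assumed model identifier
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": reasoning_effort,
    }
    return json.dumps(payload)

# A quick answer and a deep agent task hit the same model,
# differing only in the configured reasoning effort.
quick = build_chat_request("Summarize this paragraph.", "low")
deep = build_chat_request("Plan a multi-step refactor.", "high")
```

The point of the design is that operators do not need to route between a "fast" and a "smart" model; one deployment covers both workloads.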

The model replaces Devstral 2 in the Vibe CLI and will also become the new standard model in Le Chat, Mistral's AI assistant. Mistral builds on a framework for developing AI agents that the company already introduced last year. The model's weights are published on Hugging Face under a modified MIT license. Via the API, the model costs $1.50 per million input tokens and $7.50 per million output tokens. For comparison: DeepSeek-V4 costs $1.74 per million input tokens in its Pro variant, placing it in a similar price range but offering a significantly larger context window.
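At per-million-token prices, the cost of an individual request is simple arithmetic. The sketch below uses the Mistral Medium 3.5 prices quoted above; the example token counts are arbitrary.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 in_price: float, out_price: float) -> float:
    """Dollar cost of one request, given prices in USD per million tokens."""
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Example: a 20,000-token prompt answered with 2,000 tokens
# at Mistral Medium 3.5's listed rates ($1.50 in / $7.50 out).
cost = request_cost(20_000, 2_000, in_price=1.50, out_price=7.50)
print(f"${cost:.4f}")  # prints $0.0450
```

Note that output tokens dominate the bill at these rates: each generated token costs five times as much as an input token.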


Mistral positions the model primarily for customers who prioritize European data sovereignty, low API costs, and the option for self-hosting over pure benchmark performance. To secure this infrastructure in Europe long-term, Mistral is investing heavily: the company is taking out an $830 million loan for a data center near Paris.

Previously, coding agents like Vibe ran exclusively locally. That is now changing: sessions can run in the cloud, several in parallel, and notify the user when they finish. Running local CLI sessions can be "teleported" to the cloud, including session history, task status, and pending approvals, letting users switch seamlessly between local editing and the cloud.

Each coding session runs in an isolated sandbox. Once the work is done, the agent can automatically open a pull request on GitHub and notify the developer – so they only need to review the result, not every single step.

Also new is a "Work Mode" for the Le Chat assistant. It lets the agent use multiple tools simultaneously and complete complex, multi-stage tasks end to end – for example, reviewing emails and calendars in one pass, combining web research with internal documents, or sending summaries directly to Slack.

Each tool call and the reasoning behind it remains visible, and for actions such as sending a message or changing data, the agent explicitly asks for permission.

(mki)


This article was originally published in German. It was translated with technical assistance and editorially reviewed before publication.