Gemini 2.5 Flash-Lite is Google's fastest and most cost-effective AI model
Gemini 2.5 Flash and Pro are now generally available, but Google is introducing yet another model: Gemini 2.5 Flash-Lite is said to be even faster and cheaper.
(Image: Google)
With Gemini 2.5 Flash and 2.5 Pro, Google presented its most powerful AI models to date just under a month ago. According to the company, these are now generally available, but Google is also presenting another variant of these services. Gemini 2.5 Flash-Lite is Google's fastest and most cost-effective AI model to date, the company says. This version is initially available as a preview.
At Google I/O in mid-May, Google announced further details about Gemini 2.5, alongside a 250-US-dollar AI subscription and the agentic Gemini. Thanks to their performance, Gemini 2.5 Flash and 2.5 Pro are intended to lay the foundation for a new AI era built around a world model that, like a real brain, can do everything. Shortly beforehand, Google had already made a preliminary version of Gemini 2.5 Pro available to developers.
Gemini 2.5 Flash-Lite for developers
Gemini 2.5 Flash and 2.5 Pro are now generally available as stable versions, so that even cautious developers can confidently integrate these AI models into their software, Google writes in a blog post. That is apparently not enough for the company, however: at the same time, Google is expanding the Gemini 2.5 family with a new version called Gemini 2.5 Flash-Lite.
A Flash-Lite version already existed for the predecessor model Gemini 2.0, but Google says it has improved the quality of Gemini 2.5 Flash-Lite in many areas, including coding, mathematics, science, reasoning, and multimodal benchmarks. In particular, latency for many queries is said to have been reduced compared to Gemini 2.0 Flash-Lite and 2.0 Flash, which matters for tasks such as translation and classification.
Fast, inexpensive and good, says Google
The preview version of Gemini 2.5 Flash-Lite (gemini-2.5-flash-lite-preview-06-17) offers the same capabilities that make Gemini 2.5 useful, according to Google DeepMind. That includes reasoning, where the model does not merely reproduce content but links it logically. Its thinking can be switched on with configurable budgets, and the model can be combined with Google Search, used for programming tasks, or fed multimodal input.
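How developers would actually call the preview model is not spelled out in the article; the following is a minimal sketch assuming Google's google-genai Python SDK, with the prompt and the thinking-budget value chosen purely for illustration.

```python
# Minimal sketch using the google-genai Python SDK (pip install google-genai).
# The model ID matches the preview named in the article; prompt and budget
# values are illustrative assumptions.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")  # or Vertex AI credentials

response = client.models.generate_content(
    model="gemini-2.5-flash-lite-preview-06-17",
    contents="Summarize the Gemini 2.5 announcements in two sentences.",
    config=types.GenerateContentConfig(
        # Optional thinking budget in tokens: 0 disables reasoning,
        # larger values allow more internal reasoning before answering.
        thinking_config=types.ThinkingConfig(thinking_budget=512),
        # Ground answers with Google Search, as mentioned above.
        tools=[types.Tool(google_search=types.GoogleSearch())],
    ),
)
print(response.text)
```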
(Image: Google)
Because using Gemini 2.5 Flash-Lite sometimes costs only a third, or an even smaller fraction, of the price of Gemini 2.5 Flash or 2.5 Pro, Google describes it as the most cost-effective AI model in this model family to date. It is also said to be the fastest Gemini model so far, with the quality of its results only slightly behind the full-price models (see table above).
The preview version of Gemini 2.5 Flash-Lite is now available to developers in Google AI Studio and Vertex AI, alongside the now stable Gemini 2.5 Flash and 2.5 Pro. Those two models can also be used in the Gemini app. Google additionally offers specially adapted versions of Gemini 2.5 Flash-Lite and 2.5 Flash in Google Search.
(fds)