Google loosens usage limits for Gemini models after user feedback

After criticism that the usage limits for Gemini models were being reached too quickly, Google is making adjustments. Complex requests now consume less quota.


at 10:33 am CEST


By Andreas Floemer

Following the I/O 2026 developer conference, Google announced changes to the access and usage limits of its Gemini models. Going forward, limits were to be based on the computing power a task requires. Users apparently hit the newly set limits too quickly, forcing the company to make adjustments at short notice.

Josh Woodward, Vice President of Google Labs, Gemini App & AI Studio, announced this in a post on X. “We’ve heard your feedback about hitting limits too quickly on GeminiApp,” he writes. In response, Google is introducing several corrections so that quotas last longer and are more predictable.

Woodward explains that complex inputs with Gemini 3.1 Pro, especially those with large files attached, used up the quota too quickly. Google is therefore now capping the amount of quota a single input can consume so that users get more out of the Pro model.


He also makes clear that users do not pay for errors. “If a request fails, you won't be charged. Our system mistakes are on us, not you.” Quota is consumed only for successfully completed operations, he adds.

According to Woodward, prompts run with Flash-Lite are also free of charge and do not count towards the quota.
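The three rules described above (a cap per input, no charge for failed requests, no charge for Flash-Lite) can be pictured with a small sketch. It is purely illustrative: Google has not published how its accounting works, and the class `Quota`, the method `charge`, the model labels and the cap value of 10 units are all assumptions made for this example.

```python
from dataclasses import dataclass

# All names and numbers here are hypothetical; Google has not published
# the actual accounting rules or values.
FREE_MODELS = {"flash-lite"}   # prompts on these models cost nothing
MAX_UNITS_PER_REQUEST = 10     # assumed cap on what a single input may consume


@dataclass
class Quota:
    remaining: int

    def charge(self, model: str, raw_cost: int, succeeded: bool) -> int:
        """Deduct quota for one request and return the units charged."""
        if not succeeded:
            return 0  # failed requests are not charged
        if model in FREE_MODELS:
            return 0  # Flash-Lite prompts do not count towards the quota
        units = min(raw_cost, MAX_UNITS_PER_REQUEST)  # cap per single input
        units = min(units, self.remaining)            # never go below zero
        self.remaining -= units
        return units
```

Under these assumed values, a Pro request with large attachments that would otherwise cost 50 units is charged only 10, while a failed request or a Flash-Lite prompt leaves the remaining quota unchanged.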


More transparency in usage

He also notes that complex tasks such as deep research require more computing power. Google will therefore be “designing more detailed usage breakdowns and notifications to help you maximize your limits.” Currently, the Gemini dashboard at gemini.google.com/usage offers only a general overview.

The Google manager also points out that Gemini will in future remember which model users select. The selection will be saved for all subsequent sessions and will change only if users adjust it manually or if an upper limit is reached. In the latter case, Gemini switches automatically to a lighter model.

Finally, according to Woodward, Google has fixed a bug that caused “only one or two Omni videos” to use up the quota for “certain users.” Users of Google AI Ultra now get twice as many Omni generations. Woodward adds that Google is still looking for ways to increase the number of Omni generations further.