Cloudflare gives AI agents long-term memory

With Agent Memory, Cloudflare aims to prevent context decay in AI agents and lower costs for developers.


Apr 20, 2026 at 1:04 pm CEST


By Sven Festag

Cloudflare has introduced Agent Memory, a service designed to give AI agents persistent memory. Repeatedly supplying all necessary information as context drives up token consumption. With Agent Memory, AI agents are instead meant to select the relevant information themselves and include it in their prompts to the language models. The service is initially available only as a closed beta.

Agent Memory aims to prevent context decay

In addition to potential cost savings for developers resulting from lower token consumption, the US provider also aims to counteract so-called context decay with Agent Memory. The longer a prompt becomes, the more the speed and reliability of an AI model's responses suffer. Information from the beginning of a conversation can also be lost once it no longer fits into the context window of the respective model.
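The loss of early information can be illustrated with a minimal sketch. It assumes a simple strategy that keeps only the most recent messages that fit into a fixed window and counts words as a crude stand-in for tokens. The function name `fit_to_window` and the sample messages are hypothetical; real models and Cloudflare's service may handle this differently.

```python
def fit_to_window(messages, max_tokens):
    """Keep the most recent messages whose combined size fits the window."""
    kept = []
    used = 0
    # Walk backwards so the newest messages are considered first.
    for message in reversed(messages):
        cost = len(message.split())  # assumption: one word equals one token
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    return list(reversed(kept))


conversation = [
    "Our services use snake_case for all database columns.",
    "Please review the new billing module.",
    "Add a column for the invoice date.",
]

# With a window of 14 "tokens", the naming convention from the start is dropped.
window = fit_to_window(conversation, max_tokens=14)
```

In this example the agent would no longer see the naming convention when it adds the new column, which is the kind of gap a persistent memory layer is meant to close.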

According to a post on the Cloudflare blog, Agent Memory can be used as a persistent storage layer for AI agents hosted locally and in the cloud. Developers can also integrate the service into frameworks that coordinate multiple agents, giving those agents storage that persists across sessions and restarts. Storage profiles can be shared as well, so information only needs to be provided once and can then be used and expanded by multiple agents.

Share and expand information within development teams

Cloudflare mentions integration into a development team's coding agents as a possible use case for Agent Memory. Developers first enter basic information that matters to all agents, such as internal conventions or architectural decisions. All connected agents then use and expand this information.

The service can also be used for agentic code review, where it is meant to remember which suggestions developers reject. With this information, the AI agent can adapt its feedback on program code and provide more relevant suggestions. Agent Memory can also be integrated into simple chatbots to save the message history and retrieve it on request.

Access via Cloudflare Workers and API

Agent Memory distinguishes between immutable facts, events from earlier points in time, current tasks, and instructions such as workflows or runbooks. The service independently updates outdated information and deletes duplicates. Access to the information is provided via an integration with Cloudflare Workers or a REST API.
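The four categories and the automatic clean-up can be pictured with a small data model. This is only a sketch under stated assumptions: the names `MemoryKind`, `Memory`, `store` and the `topic` field are hypothetical, and the rule that only tasks and instructions are overwritten is an assumption for illustration, not Cloudflare's documented behavior.

```python
from dataclasses import dataclass
from enum import Enum


class MemoryKind(Enum):
    FACT = "fact"
    EVENT = "event"
    TASK = "task"
    INSTRUCTION = "instruction"


# Assumption: only these kinds are overwritten when newer information arrives.
REPLACEABLE = {MemoryKind.TASK, MemoryKind.INSTRUCTION}


@dataclass(frozen=True)
class Memory:
    kind: MemoryKind
    topic: str    # hypothetical identifier for what the entry is about
    content: str


def store(memories, new):
    """Return a new list containing `new`, without duplicates or outdated entries."""
    if new in memories:
        return list(memories)  # exact duplicate: nothing to add
    if new.kind in REPLACEABLE:
        # Drop the older entry with the same kind and topic.
        memories = [m for m in memories if (m.kind, m.topic) != (new.kind, new.topic)]
    return list(memories) + [new]


memories = []
memories = store(memories, Memory(MemoryKind.FACT, "db-naming", "Columns use snake_case"))
memories = store(memories, Memory(MemoryKind.FACT, "db-naming", "Columns use snake_case"))
memories = store(memories, Memory(MemoryKind.TASK, "current", "Review the billing module"))
memories = store(memories, Memory(MemoryKind.TASK, "current", "Add an invoice date column"))
```

After these four calls the list holds two entries: the fact, stored once, and the most recent task.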


The interface offers five core operations: ingest for batch processing of conversations, remember for explicit storage, recall for synthesized queries, and list and forget for management and deletion. To cover its entire API surface, Cloudflare recently released cf, a unified command-line tool. It is meant to let developers control all of the provider's services from one central tool and to let AI agents use those services as well.
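How the five operations fit together can be sketched with a local stand-in. The class `InMemoryAgentMemory` and its method signatures are hypothetical and do not represent Cloudflare's client or REST API. In particular, `recall` here only matches shared words, whereas the real service is described as returning synthesized answers.

```python
import itertools


class InMemoryAgentMemory:
    """Local stand-in for illustration only; not the Cloudflare client."""

    def __init__(self):
        self._items = {}
        self._ids = itertools.count(1)

    def remember(self, text):
        """Store one piece of information explicitly and return its id."""
        memory_id = next(self._ids)
        self._items[memory_id] = text
        return memory_id

    def ingest(self, conversation):
        """Store a batch of conversation messages and return their ids."""
        return [self.remember(message) for message in conversation]

    def recall(self, query):
        """Return stored texts that share at least one word with the query."""
        words = set(query.lower().split())
        return [
            text for text in self._items.values()
            if words & set(text.lower().split())
        ]

    def list(self):
        """Return all stored (id, text) pairs in insertion order."""
        return sorted(self._items.items())

    def forget(self, memory_id):
        """Delete one entry; return True if it existed."""
        return self._items.pop(memory_id, None) is not None


memory = InMemoryAgentMemory()
convention_id = memory.remember("Database columns use snake_case")
memory.ingest([
    "User rejected the suggestion to rename variables",
    "User asked for shorter review comments",
])
hits = memory.recall("snake_case columns")
forgotten = memory.forget(convention_id)
```

Here `hits` contains only the naming convention, and after `forget` the two ingested messages remain in the store.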

Registration for the closed beta of Agent Memory is not currently possible, but a waiting list is available. The date for general availability is not yet known.