AV2 saves 30 percent bitrate compared to AV1

Larger superblocks, data-driven transforms, new filters: AOMedia has released the final specification for AV2, the successor to AV1.

Empty cinema hall with AV2 logo

(Image: AOMedia / editing: heise medien)


The Alliance for Open Media (AOMedia) has released the final specification for AV2. The new video codec is intended to eventually replace AV1 as the next-generation, royalty-free format. According to developer evaluations, AV2 achieves approximately 30 percent lower bitrates than AV1 on average at comparable image quality. The specification describes numerous new coding tools, including data-driven transforms, improved entropy coding, new filters for image post-processing, and additional motion prediction methods.

AOMedia was founded by Google, Amazon, Microsoft, Cisco, Netflix, Mozilla, and Intel, among others, to develop open, royalty-free media standards. AV2's predecessor, AV1, is now supported by all major browsers as well as by many streaming platforms and hardware decoders. Unlike current research approaches to neural video codecs, AV2 remains a classic hybrid block-based codec: it predicts image regions, transforms the remaining prediction error, quantizes the result, and finally entropy-codes it. The details are described in the official AV2 specification.
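
The stages of such a hybrid codec can be illustrated with a toy sketch. It is not taken from the AV2 specification: the flat predictor, the 4-point Hadamard transform, and the function names `encode_block` and `decode_block` are assumptions chosen for brevity, and the entropy-coding stage is left out.

```python
# Toy hybrid-codec round trip for a 1-D block of four samples.
# Assumptions: a flat predictor and a 4-point Hadamard transform stand in
# for the far more elaborate prediction and transform tools of a real codec.

H = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [1, -1, 1, -1],
]


def hadamard(values):
    """Apply the (symmetric) 4-point Hadamard matrix to four values."""
    return [sum(h * v for h, v in zip(row, values)) for row in H]


def encode_block(block, predictor, step):
    """Predict, transform and quantize; returns the quantized levels."""
    residual = [sample - predictor for sample in block]
    coeffs = hadamard(residual)
    return [round(c / step) for c in coeffs]


def decode_block(levels, predictor, step):
    """Dequantize, inverse-transform and add the prediction back."""
    coeffs = [level * step for level in levels]
    # H is symmetric and H * H = 4 * I, so the inverse is H / 4.
    residual = [c / 4 for c in hadamard(coeffs)]
    return [round(predictor + r) for r in residual]


block = [100, 102, 104, 106]
for step in (1, 8):
    levels = encode_block(block, predictor=101, step=step)
    print(step, levels, decode_block(levels, predictor=101, step=step))
```

With a step size of 1 the block is reconstructed exactly; a coarser step leaves fewer non-zero levels to store, at the price of a small reconstruction error.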

The most important figure so far comes from an evaluation study accompanying AV2. According to that study, AV2 requires about 30 percent less bitrate than AV1 on average to achieve comparable visual quality. Streaming providers could thus deliver 4K video with significantly less bandwidth, or stream in higher quality at the same bandwidth. However, the results largely come from within the AV2 development circle; independent comparative studies are still pending. The methodology and results are described in the paper "Video Quality Evaluation Methodology and Result of AV2 Compression Performance".
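
What a 30 percent saving means in practice can be worked out with a few lines of arithmetic. The 15 Mbit/s starting point is a purely hypothetical AV1 bitrate for a 4K stream, not a figure from the study, and the helper names are illustrative.

```python
# Back-of-the-envelope arithmetic for the reported average saving.
# Assumption: 15 Mbit/s is a made-up AV1 bitrate, used only as an example.

def bitrate_after_saving(bitrate_mbps, saving=0.30):
    """Bitrate needed after applying a relative saving (0.30 = 30 percent)."""
    return bitrate_mbps * (1.0 - saving)


def gigabytes_per_hour(bitrate_mbps):
    """Convert Mbit/s into decimal gigabytes transferred per hour."""
    return bitrate_mbps * 3600 / 8 / 1000


av1_mbps = 15.0  # hypothetical
av2_mbps = bitrate_after_saving(av1_mbps)
print(f"AV1: {av1_mbps:.1f} Mbit/s, {gigabytes_per_hour(av1_mbps):.2f} GB/h")
print(f"AV2: {av2_mbps:.1f} Mbit/s, {gigabytes_per_hour(av2_mbps):.2f} GB/h")
```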

Among the most striking changes compared to AV1 are the larger coding units. While AV1 uses superblocks of up to 128 × 128 pixels, AV2 supports superblocks of up to 256 × 256 pixels. Combined with extended recursive block partitioning, this lets the encoder group large, low-detail image areas more efficiently, while fine details continue to be encoded in smaller blocks.
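
The principle of recursive partitioning can be sketched with a plain quadtree. This is a simplification under stated assumptions: AV2's extended partitioning allows more block shapes than square quarters, and the flatness test, the threshold, and the minimum block size of 8 are illustrative choices, not values from the specification.

```python
# Quadtree partitioning of one 256 x 256 superblock (simplified sketch).
# Assumptions: square splits only, and "flat" means max - min <= threshold.

def is_flat(frame, x, y, size, threshold):
    """True if all samples in the block differ by at most `threshold`."""
    lo = hi = frame[y][x]
    for row in frame[y:y + size]:
        for value in row[x:x + size]:
            lo = min(lo, value)
            hi = max(hi, value)
    return hi - lo <= threshold


def partition(frame, x, y, size, min_size=8, threshold=4):
    """Return leaf blocks as (x, y, size); split only where detail exists."""
    if size <= min_size or is_flat(frame, x, y, size, threshold):
        return [(x, y, size)]
    half = size // 2
    blocks = []
    for dy in (0, half):
        for dx in (0, half):
            blocks += partition(frame, x + dx, y + dy, half, min_size, threshold)
    return blocks


flat = [[128] * 256 for _ in range(256)]
print(len(partition(flat, 0, 0, 256)))      # a uniform area stays one block

detailed = [row[:] for row in flat]
detailed[0][0] = 255                         # a single bright sample
print(len(partition(detailed, 0, 0, 256)))  # splitting happens only near it
```

The uniform superblock remains a single 256 × 256 block, while the second one is subdivided only around the sample that differs.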

From a technical standpoint, the "data-driven transforms" are among the most interesting innovations. Video codecs convert the error data remaining after prediction into a form that compresses better. Traditionally, methods such as the discrete cosine transform (DCT) have dominated this step. AV2 supplements them with statistically optimized transforms derived from training data, which are intended to represent typical image structures more efficiently. According to the developers, new transform kernels and extended methods for splitting transform blocks are also included. The details can be found in the paper "Transform and Entropy Coding in AV2".
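
The general idea behind a transform learned from data can be shown with a small sketch in the spirit of a Karhunen-Loève transform. How AV2 actually derives and stores its kernels is not covered here; the training blocks, the power-iteration approach, and all function names are assumptions for illustration. The sketch learns only the first basis vector and compares how much signal energy it captures against the constant (DC) basis vector of a DCT.

```python
# Learning one transform basis vector from example residual blocks.
# Assumptions: made-up ramp-like training data; power iteration finds the
# dominant eigenvector of the correlation matrix (first KLT-style basis).
import math


def correlation_matrix(blocks):
    """Average outer product of the training blocks."""
    n = len(blocks[0])
    return [[sum(b[i] * b[j] for b in blocks) / len(blocks) for j in range(n)]
            for i in range(n)]


def dominant_basis(matrix, iterations=100):
    """Power iteration: returns the unit-length dominant eigenvector."""
    n = len(matrix)
    v = [1.0] * n
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def energy_fraction(block, basis):
    """Share of the block's energy captured by a single coefficient."""
    coeff = sum(b * x for b, x in zip(block, basis))
    return coeff * coeff / sum(x * x for x in block)


training = [[0, 1, 2, 3], [0, 2, 4, 6], [1, 2, 3, 5], [0, 3, 6, 8], [1, 3, 5, 8]]
learned = dominant_basis(correlation_matrix(training))
dc = [0.5, 0.5, 0.5, 0.5]  # first basis vector of a 4-point DCT

test_block = [0, 2, 4, 7]
print(f"DC basis:      {energy_fraction(test_block, dc):.3f}")
print(f"learned basis: {energy_fraction(test_block, learned):.3f}")
```

For residuals that resemble the training data, the learned vector concentrates more energy in a single coefficient, which leaves less for the remaining coefficients to encode.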

The developers have also expanded the entropy coding. This is the final stage of actual compression and determines how efficiently the already transformed data can be stored as a bitstream. AV2 uses additional context models and finer probability estimates to better exploit statistical relationships between image data. While such changes may seem unspectacular at first glance, they often contribute significantly to bitrate savings.
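
Why context models help can be seen in a toy example. It is not AV2's entropy coder: the binary symbols, the count-based probability estimate, and the choice of the previous bit as context are assumptions for illustration. The sketch sums the ideal code length, -log2(p), that an arithmetic coder would approach.

```python
# Ideal code length of a bit sequence with and without a context model.
# Assumptions: binary symbols, simple adaptive counts, previous bit as context.
import math


class AdaptiveBit:
    """Adaptive probability estimate for one binary symbol."""

    def __init__(self):
        self.counts = [1, 1]  # start with equal probability for 0 and 1

    def cost_and_update(self, bit):
        p = self.counts[bit] / sum(self.counts)
        self.counts[bit] += 1
        return -math.log2(p)  # ideal cost in bits for this symbol


def coded_bits(bits, use_context):
    """Total ideal cost; with context, one model per previous-bit value."""
    models = {}
    total = 0.0
    previous = 0
    for bit in bits:
        context = previous if use_context else 0
        model = models.setdefault(context, AdaptiveBit())
        total += model.cost_and_update(bit)
        previous = bit
    return total


bits = ([0] * 20 + [1] * 20) * 5  # long runs: neighbours are correlated
print(f"without context: {coded_bits(bits, False):.1f} bits")
print(f"with context:    {coded_bits(bits, True):.1f} bits")
```

Because each bit usually equals its predecessor, the context-conditioned models become confident quickly and the estimated cost drops well below one bit per symbol.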

Further innovations concern motion prediction and compensation. Modern video codecs rarely store complete individual frames; instead, they primarily describe the changes between consecutive frames. The more precisely a codec models motion, the less residual data it needs to transmit. New methods include tools that track motion over multiple reference frames or generate virtual intermediate frames for motion prediction. The AV2 specification defines additional prediction tools and extended motion models for this purpose.
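
The basic principle of motion-compensated prediction can be sketched as a block-matching search. This is a textbook full search with the sum of absolute differences (SAD) as the cost, not one of the new AV2 tools; the synthetic frames and the function names are assumptions for illustration.

```python
# Full-search block matching against one reference frame (textbook sketch).
# Assumptions: integer-pixel motion, SAD cost, tiny synthetic frames.

def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block and a shifted reference."""
    return sum(abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
               for y in range(size) for x in range(size))


def find_motion_vector(cur, ref, bx, by, size, search):
    """Return (cost, dx, dy) of the best match within +/- `search` pixels."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= by + dy and by + dy + size <= height and
                      0 <= bx + dx and bx + dx + size <= width)
            if not inside:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


# Reference frame with a simple gradient; the current frame shows the same
# content displaced by two pixels horizontally and one pixel vertically.
ref = [[7 * x + 13 * y for x in range(16)] for y in range(16)]
cur = [[ref[min(y + 1, 15)][min(x + 2, 15)] for x in range(16)] for y in range(16)]

print(find_motion_vector(cur, ref, bx=4, by=4, size=4, search=3))
```

When the search finds an exact match, the residual for that block is zero and only the motion vector has to be transmitted.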

Also new are additional in-loop filters for post-processing decoded images. These intervene during the decoding process and aim to reduce artifacts without losing fine image details. New methods include Cross-Component Sample Offset (CCSO), which uses luminance information to correct color errors, and Guided Detail Filter (GDF), which aims to better preserve image details during smoothing.

Additionally, there is support for Intra Block Copy (IBC). This allows a codec to use image regions from the same frame as a reference, rather than relying exclusively on other frames in the video sequence. This method is particularly suitable for screen recordings, remote desktop applications, cloud gaming, or presentations where many identical elements repeat within a single image. Similar methods are already known from other modern video codecs like VVC.
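
The idea can be sketched as a search for a matching block inside the same frame. The restriction to already decoded areas is modeled in a simplified way here (blocks fully above, or fully to the left and not starting lower than the current block); the actual rules in AV2 are more detailed, and the frame contents and function names are assumptions for illustration.

```python
# Intra block copy: find a reference block within the same frame (sketch).
# Assumptions: simplified "already decoded" rule, SAD cost, synthetic frame.

def block_sad(frame, ax, ay, bx, by, size):
    """Sum of absolute differences between two blocks of the same frame."""
    return sum(abs(frame[ay + y][ax + x] - frame[by + y][bx + x])
               for y in range(size) for x in range(size))


def find_block_vector(frame, bx, by, size):
    """Return (cost, dx, dy) of the best earlier block, or None if none exists."""
    width = len(frame[0])
    best = None
    for ry in range(0, by + 1):
        for rx in range(0, width - size + 1):
            above = ry + size <= by   # entirely above the current block
            left = rx + size <= bx    # entirely to the left of it
            if not (above or left):
                continue
            cost = block_sad(frame, rx, ry, bx, by, size)
            if best is None or cost < best[0]:
                best = (cost, rx - bx, ry - by)
    return best


# Screen-like content: the same 4 x 4 "glyph" appears twice in one frame.
glyph = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
frame = [[0] * 12 for _ in range(8)]
for y in range(4):
    for x in range(4):
        frame[y][x] = glyph[y][x]
        frame[4 + y][4 + x] = glyph[y][x]

print(find_block_vector(frame, bx=4, by=4, size=4))
```

The second occurrence of the glyph is found with zero cost, so it could be represented by a block vector pointing at the first one instead of being encoded again.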

With these efficiency gains, AV2 moves significantly closer to the compression efficiency of VVC (H.266), currently the most efficient standardized video codec. Whether AV2 will match or even surpass VVC in practice cannot yet be reliably assessed, however, because independent comparative measurements and mature encoder implementations are still lacking.

Moreover, the specification only marks the beginning of the format's actual rollout. Market success will depend on capable encoders, hardware support in GPUs and SoCs, and integration into browsers and streaming platforms. Experience with AV1 shows that several years can pass between the publication of a specification and its widespread adoption. AOMedia provides further information on the project page for AV2.

(fo)

This article was originally published in German. It was translated with technical assistance and editorially reviewed before publication.