Amazon S3 gets new bucket type and metadata management

Amazon is adding new features to its S3 storage service at its re:Invent conference. The focus is on data management.


Amazon is introducing a new bucket type for tables in its S3 cloud object storage and is presenting further new features at the ongoing AWS re:Invent user conference in Las Vegas. According to Amazon, the new "S3 Tables" are optimized for data analysis and process queries up to three times faster than self-managed tables. They build on the open table format Apache Iceberg. Such formats offer database-like functionality and form an abstraction layer over data lakes, which makes large data sets manageable.

Because the buckets use the Iceberg format, queries can be run from AWS services and third-party applications. Users can use standard SQL or Iceberg-specific features. The table buckets also include policy-based maintenance, which can be used to manage snapshots and delete unreferenced data. An integration of S3 Tables with the AWS Glue Data Catalog is available as a preview. It lets users query, transfer and visualize data with AWS analytics services.

Amazon has also released a preview version of S3 Metadata. The service collects metadata from objects that are uploaded to a bucket and writes it to a queryable, read-only table. The stored metadata includes system-defined attributes, such as the size and origin of the object. User-defined properties, such as article numbers or ratings, can also be stored.

This allows a user to search for files with defined properties, such as logs from a specific time period or images with a specific file size. If data changes within a bucket, the service writes the updated metadata to the table within a few minutes, so all modifications to the data can be tracked in this table. S3 stores all metadata tables in a table bucket. In addition to the connection to the AWS analytics services, there is also an integration with Amazon Bedrock. AI-generated videos can thus be linked to metadata that provides information on the time of creation and the model used.
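
The two searches mentioned above (logs from a time period, images above a certain size) can be illustrated with a small sketch. It uses Python's built-in sqlite3 module purely as a stand-in for a query engine; the table name, column names and sample rows are hypothetical and do not reflect the actual S3 Metadata schema.

```python
import sqlite3

# Stand-in for a read-only metadata table. The table name, columns and rows
# are hypothetical illustrations, not the actual S3 Metadata schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE object_metadata ("
    "key TEXT, size_bytes INTEGER, last_modified TEXT, content_type TEXT)"
)
conn.executemany(
    "INSERT INTO object_metadata VALUES (?, ?, ?, ?)",
    [
        ("logs/app-2024-11-30.log", 48_211, "2024-11-30T23:59:10Z", "text/plain"),
        ("logs/app-2024-12-02.log", 51_902, "2024-12-02T23:59:12Z", "text/plain"),
        ("images/cat.png", 2_400_000, "2024-12-01T08:15:00Z", "image/png"),
        ("images/icon.png", 4_096, "2024-12-03T10:00:00Z", "image/png"),
    ],
)


def logs_between(start, end):
    """Return keys under logs/ modified in the half-open interval [start, end)."""
    # ISO 8601 timestamps in one fixed format compare correctly as strings.
    rows = conn.execute(
        "SELECT key FROM object_metadata "
        "WHERE key LIKE 'logs/%' AND last_modified >= ? AND last_modified < ? "
        "ORDER BY key",
        (start, end),
    )
    return [row[0] for row in rows]


def images_larger_than(min_bytes):
    """Return keys of image objects whose size exceeds min_bytes."""
    rows = conn.execute(
        "SELECT key FROM object_metadata "
        "WHERE content_type LIKE 'image/%' AND size_bytes > ? "
        "ORDER BY key",
        (min_bytes,),
    )
    return [row[0] for row in rows]


print(logs_between("2024-12-01T00:00:00Z", "2024-12-03T00:00:00Z"))
print(images_larger_than(1_000_000))
```

Against a real metadata table, the same kind of SQL filter would be sent to an Iceberg-compatible query engine instead of a local database.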

In addition, the Storage Browser for S3 is now available to all customers. The open-source front-end component can be integrated into a web application and adapted to its user interface. There it serves as a data management interface for selected end users, for example employees or customers. They can access, search, copy and download files in S3 from within the application, within the limits of their permissions. The Storage Browser can be used in web applications that were developed in React or a framework based on it. AWS Amplify and its UI React packages must also be installed.

For upload requests, Amazon relies on new measures in S3 to protect data integrity. The AWS SDKs now automatically calculate CRC-based checksums for uploads during data transfer. S3 only accepts and stores the objects after verifying the checksums. This is intended to ensure the integrity of the data during transmission. For multi-part uploads, there is now a checksum for the entire object. In addition to the previously used MD5 method, Amazon also offers the algorithms CRC64NVME, CRC32, CRC32C, SHA-1 and SHA-256 for calculating the checksums.
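
Why a CRC lends itself to a checksum over an entire multi-part object can be shown with a minimal sketch using only the Python standard library. It is not how the AWS SDKs are implemented; the helper names, the part size and the payload are hypothetical, and the base64 encoding of the 4-byte big-endian value is an assumption about how such checksums are usually transported.

```python
import base64
import zlib


def crc32_b64(crc):
    """Encode a 32-bit CRC as base64 of its four big-endian bytes."""
    return base64.b64encode(crc.to_bytes(4, "big")).decode("ascii")


def full_object_crc32(parts):
    """Compute one CRC32 over all parts by carrying the running value forward."""
    crc = 0
    for part in parts:
        crc = zlib.crc32(part, crc)
    return crc


# Hypothetical payload, split into fixed-size parts as a multi-part upload would be.
payload = b"example payload " * 1000
part_size = 4096
parts = [payload[i:i + part_size] for i in range(0, len(payload), part_size)]

whole = zlib.crc32(payload)          # checksum of the object in one pass
streamed = full_object_crc32(parts)  # checksum accumulated part by part

print(crc32_b64(whole), crc32_b64(streamed), whole == streamed)
```

Because the running CRC value can be carried from one part to the next, the sender never needs the whole object in memory to obtain the same checksum that a single pass over the complete data would produce.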


The Storage Browser for S3 is already available in all regions, and the checksum calculation for data transfer is to be rolled out everywhere in the coming weeks. Customers using German servers still have to wait for the new table buckets and the preview of S3 Metadata, as these features are currently only available in the USA. Further information on the new table buckets, the Storage Browser, the use of checksums and the metadata preview can be found in the AWS announcements.

(sfe)


This article was originally published in German. It was translated with technical assistance and editorially reviewed before publication.