39C3: When Molecules Become Cryptographic Functions

Chemist Anne Lüscher showed at 39C3 how synthetic DNA can be used for data storage and tamper-proof authentication.

Anne Lüscher during her 39C3 talk

(Image: media.ccc.de, CC BY 4.0)

DNA is commonly known as the blueprint of life. At the 39th Chaos Communication Congress, however, chemist Anne Lüscher deliberately took the molecule out of its biological context and treated it as what it also is from an information technology perspective: an extremely dense, stable, and surprisingly manageable information carrier. In her talk "Chaos Communication Chemistry: DNA security systems based on molecular randomness", she explained why synthetic DNA is particularly suitable for data storage and security architectures – and why RNA plays hardly any role in this.

From a digital perspective, DNA is easy to read, according to Lüscher: four bases, clear pairing rules, sequential storage. "Just like with digital information, DNA stores data in a sequence, and basically, we just need to translate between base two and base four. We can simply assign two bits per base and thus translate back and forth between digital or binary information and DNA."
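The translation between base two and base four can be illustrated with a minimal Python sketch. The specific bit-to-base assignment below (00 to A, 01 to C, 10 to G, 11 to T) and the function names are assumptions chosen for illustration; the talk only states that two bits are assigned per base. Practical encodings also add error correction and avoid problematic sequences, which this sketch ignores.

```python
# Hypothetical mapping: the talk only says "two bits per base",
# not which bit pair belongs to which base.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Translate bytes into a base sequence, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Translate a base sequence back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    strand = encode(b"39C3")
    print(strand)          # four bases per byte
    print(decode(strand))  # round trip back to the input bytes
```

Each byte becomes exactly four bases, so the strand length is always four times the number of input bytes.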

The physical properties matter even more, however. As a storage medium, DNA combines enormous information density with long-term stability – under suitable conditions, over periods far exceeding those of current storage media. That the genome of an approximately 700,000-year-old horse bone could still be read is less a biological curiosity than a technical argument. In the lab, such conditions can be created artificially, for example by encapsulating the DNA in tiny glass beads.

There is also an aspect that is becoming increasingly important in computer science: parallelism. Molecular systems do not work sequentially but massively in parallel. "Because when you think of a tiny drop of water – there are so many molecules in it, and in the case of DNA, each of these molecules can potentially be its own processor, performing calculations independently and simultaneously, independent of the others. And this opens up possibilities for parallel operations that are not possible with traditional computer technology."

The question of RNA arises, not least due to its prominent role in medicine. In the Q&A session, Lüscher explained that there are clear technical reasons against it: RNA is single-stranded and chemically unstable. An additional hydroxyl group makes it particularly susceptible to hydrolysis, which rules it out for applications where data needs to be preserved over long periods. DNA, on the other hand, is double-stranded, robust, and accompanied by a tool ecosystem that has evolved over decades: synthesis, PCR, sequencing, and targeted manipulation are established and reliably available. For other biomolecules like proteins, these direct tools are largely missing – for example, a protein cannot be directly copied by another protein.

Large players like Microsoft and Seagate have also established their own teams for DNA data storage, Lüscher reported. Advances in random access, error correction, and optimized coding through epigenetic methods have been achieved. Nevertheless, most realized projects so far have been more in the realm of art and PR – for example, storing the music of the band Massive Attack in DNA, which was then mixed into spray paint for an album cover.

DNA becomes particularly interesting where randomness comes into play, Lüscher explained. "In a single reaction, by combining the four bases, we can generate enormous amounts of randomness in a single reaction environment. And here you see some numbers. We can generate hundreds of petabytes of randomness for under 100 Euros." This randomness is practically impossible to reconstruct, whether algorithmically or through re-synthesis. Based on this, so-called Chemical Unclonable Functions (CUFs) can be realized: random DNA pools that are not fully known or copyable, but can be specifically "queried."

The principle works via PCR with defined primers, according to Lüscher. These primers search the pool for matching sequences, bind there, and copy the intervening section. The result is specific to the combination of pool and primer pair – reproducible, but not predictable or reversible. Similar to Physical Unclonable Functions (PUFs), this creates a system that behaves like a cryptographic hash function but is based on a chemical rather than a mathematical foundation.
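The challenge-response idea can be sketched as a toy simulation in Python. Everything here is a simplifying assumption made for illustration: the pool is a small list of pseudo-random strings (a real pool is vastly larger and not generated from a seed), primer binding is modeled as exact substring matching on a single strand (real PCR involves complementary strands and tolerates imperfect binding), and the final SHA-256 step is only a convenient way to condense the result, not part of the chemistry described in the talk. The names `make_pool`, `amplify`, and `respond` are hypothetical.

```python
import hashlib
import random


def make_pool(n_strands: int, length: int, seed: int) -> list[str]:
    """Toy stand-in for a random DNA pool (seeded only to make the demo repeatable)."""
    rng = random.Random(seed)
    return ["".join(rng.choices("ACGT", k=length)) for _ in range(n_strands)]


def amplify(pool: list[str], fwd: str, rev: str) -> list[str]:
    """Return the sections lying between a forward and a reverse primer site."""
    amplicons = []
    for strand in pool:
        start = strand.find(fwd)
        if start == -1:
            continue  # forward primer finds no binding site on this strand
        end = strand.find(rev, start + len(fwd))
        if end == -1:
            continue  # no reverse primer site downstream
        amplicons.append(strand[start + len(fwd):end])
    return amplicons


def respond(pool: list[str], fwd: str, rev: str) -> str:
    """Condense the amplified sections into one response string."""
    amplicons = sorted(amplify(pool, fwd, rev))  # order in the pool is irrelevant
    return hashlib.sha256("|".join(amplicons).encode()).hexdigest()


if __name__ == "__main__":
    pool = make_pool(n_strands=2000, length=60, seed=39)
    print(respond(pool, "ACGT", "TTGA"))  # same pool and primers: same response
    print(respond(pool, "ACGA", "TTGA"))  # a different challenge
```

In this model the primer pair plays the role of the challenge and the set of copied sections the role of the response: repeating a query on the same pool reproduces the response, while predicting it without access to the pool would require knowing the pool's contents.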

In contrast to classical PUFs, these systems are not tied to a single physical object, Lüscher emphasized. Through chemical processes, identical copies of the random pools can be produced without knowing their exact composition. Subsequently, these copies can be "locked" so that they can no longer be replicated. This defines the number of possible queries in advance, and multiple users can use the same pool for decentralized applications – for example, for mutual authentication or joint key generation.

DNA can also be integrated into materials, Lüscher explained. Embedded in paints, plastics, or 3D printing filaments, it enables object-bound metadata with extremely long durability. One research project, for example, stored an STL print file in DNA, integrated it into the printing filament, and produced a plastic rabbit from it. From a tiny piece of the ear, the DNA could be extracted, and the rabbit could be printed again. "And it also has some practical applications. Because when you think about objects with a very long lifespan, like buildings or public infrastructure, it can be really difficult to retain the data and metadata for these objects over a longer period. And in this way, we could solve this by simply integrating this information directly into the building materials."

Specific applications for CUFs range from the authentication of artworks to the counterfeit protection of medicines. A tiny sliver of material is sufficient to read out a unique chemical signature and compare it with a reference. Since the pools are neither fully sequenceable nor synthetically reproducible, an attack would be extremely complex: chemical modification prevents the usual sequencing preparation, and even with successful sequencing, the targeted re-synthesis of all sequences would cost billions.
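The comparison of a read-out signature with a reference could look roughly like the following Python sketch. It assumes, purely for illustration, that a signature is represented as a set of sequences and that a match is decided by set overlap (Jaccard similarity) against a threshold; the talk does not specify how the comparison is actually performed, and the threshold value and function names are hypothetical.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap between two sequence sets: 1.0 if identical, 0.0 if disjoint."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def authenticate(observed: list[str], reference: list[str],
                 threshold: float = 0.8) -> bool:
    """Accept the sample if its signature overlaps enough with the reference.

    The threshold is an illustrative assumption; it stands in for whatever
    tolerance a real read-out would need for sequencing noise.
    """
    return jaccard(set(observed), set(reference)) >= threshold


if __name__ == "__main__":
    reference = ["ACGTAC", "TTGACA", "GGCATT", "CCATGA", "ATATCG"]
    genuine = list(reference)                 # same signature read again
    forged = ["AAAAAA", "CCCCCC", "GGGGGG"]   # unrelated sequences
    print(authenticate(genuine, reference))
    print(authenticate(forged, reference))
```

A tolerance-based comparison is used here instead of strict equality because a physical read-out is unlikely to be bit-for-bit identical between runs; that design choice is an assumption of the sketch, not a claim about the method presented.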

Despite the potential, Lüscher's view remained realistic. "But for these operations, i.e., a single challenge-response per pass, it currently takes a few hours, and then we have to sequence the results, which takes a few more hours. So, if you want to authenticate a medicine, you would essentially have to wait a day. That's the current state."

The real value, according to Anne Lüscher, lies in the perspective: thinking of chemistry as information science and viewing physical systems with a digital eye. DNA is not presented as a replacement for silicon but as a supplement – where durability, density, randomness, and physical non-clonability are crucial. The field requires expertise from various disciplines: people with laboratory experience as well as those with a hacker mindset who are willing to tackle these challenges.

(vza)

This article was originally published in German. It was translated with technical assistance and editorially reviewed before publication.