DNA has an incredible capability to store information. Now, thanks to a simple cipher, DNA can be manipulated to act as a storage system for digital data.
Archiving data matters more than ever in today’s world, where information is generated at an increasing pace. From economic trends such as GDP figures to literary classics like Shakespeare’s sonnets, there is a vast amount of information that needs to be stored and preserved, and the list keeps growing every day.
However, there are two fundamental issues with archiving huge amounts of data: first, the sheer volume of information, and second, storing data in a format that will remain universal over long periods of time.
This is where DNA comes in. The idea of storing information in DNA struck scientists Ewan Birney and Nick Goldman of the European Bioinformatics Institute, over a few beers at a pub. They were discussing the issue of trying to cut down the costs associated with maintaining a vast archival unit of hard drives, which takes up a lot of space and electricity.
Nature has an easy answer to this problem. DNA stores information to create a multicellular organism from a single cell; it performs this task using a minimum amount of space, and in a manner that preserves the information in a universal format for long periods of time.
Computers store information in binary, as a series of 0’s and 1’s. DNA stores information as a sequence of four nucleotide bases: adenine, thymine, cytosine, and guanine, abbreviated A, T, C, and G respectively. Just as combinations of 0’s and 1’s can represent a myriad of images, games, sounds, text, and videos, combinations of the four bases A, C, T, and G spell out the instructions for forming every single cell in the body.
To store digital data in the bases of DNA, Birney and Goldman used a system that stored a byte (a sequence of eight 1’s or 0’s) as five DNA letters. To reduce errors when the DNA is written and read back, they constructed strings of DNA letters that had no adjacent repeats, meaning the same letter never appears twice in a row. Every stream of data was split into fragments of exactly 117 letters, each carrying indexing information that indicates where the fragment belongs in the overall code.
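The no-repeat idea can be illustrated with a minimal sketch. This is not the researchers' actual scheme: it assumes a fixed six base-3 digits per byte (five base-3 digits cover only 243 values, fewer than the 256 a byte can take), an arbitrary starting letter, and hypothetical function names (`byte_to_trits`, `encode`, `decode`). Each base-3 digit selects one of the three letters that differ from the previous letter, so a repeat can never occur.

```python
BASES = "ACGT"
TRITS_PER_BYTE = 6  # assumption: 3**6 = 729 values, enough for one byte


def byte_to_trits(value, width=TRITS_PER_BYTE):
    """Return value (0-255) as `width` base-3 digits, most significant first."""
    digits = []
    for _ in range(width):
        value, digit = divmod(value, 3)
        digits.append(digit)
    return digits[::-1]


def encode(data, start="A"):
    """Map bytes to DNA letters so that no letter follows itself."""
    prev = start  # assumed reference letter; it is not part of the output
    letters = []
    for byte in data:
        for trit in byte_to_trits(byte):
            # Only the three letters that differ from the previous one are allowed.
            choices = [b for b in BASES if b != prev]
            prev = choices[trit]
            letters.append(prev)
    return "".join(letters)


def decode(dna, start="A"):
    """Invert encode(): recover the base-3 digits, then rebuild each byte."""
    prev = start
    trits = []
    for base in dna:
        choices = [b for b in BASES if b != prev]
        trits.append(choices.index(base))
        prev = base
    out = bytearray()
    for i in range(0, len(trits), TRITS_PER_BYTE):
        value = 0
        for trit in trits[i:i + TRITS_PER_BYTE]:
            value = value * 3 + trit
        out.append(value)
    return bytes(out)
```

Calling `decode(encode(b"DNA"))` returns the original bytes, and the encoded string never contains the same letter twice in a row. Splitting the output into fixed-length, indexed fragments is left out of this sketch.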
Another advantage of DNA storage is that it avoids the problems caused by rapidly changing technology. Recall the floppy disk, once the most efficient portable storage medium. If any important data were found on these disks today, it would essentially be lost, because few computers can still read them.
On the other hand, DNA will always hold importance—even if the mechanisms to access information change. One could leave a vial with DNA in a time capsule, and 500 years later, it would still be readable and accessible by future generations.
A research team led by George Church and Sriram Kosuri from the Harvard Wyss Institute set a world record in data storage by storing 700 terabytes (TB) of information in a gram of DNA. To put that in perspective, one would need 151 kilos of 3 TB hard disks to store the same amount of information. Essentially, they had beaten the previous information storage density record by a factor of more than a thousand.
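The hard-disk comparison can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted above; the variable names are illustrative, and the per-drive mass is derived from those figures rather than taken from any manufacturer's specification.

```python
DNA_CAPACITY_TB = 700       # terabytes stored in one gram of DNA
DRIVE_CAPACITY_TB = 3       # capacity of one hard disk
TOTAL_DRIVE_MASS_KG = 151   # total mass of disks quoted for the same capacity

# Number of 3 TB disks needed to hold 700 TB.
drives_needed = DNA_CAPACITY_TB / DRIVE_CAPACITY_TB

# Mass per disk implied by the quoted total.
mass_per_drive_kg = TOTAL_DRIVE_MASS_KG / drives_needed

# Grams of hard disk needed per gram of DNA for the same data.
mass_ratio = TOTAL_DRIVE_MASS_KG * 1000 / 1

print(f"{drives_needed:.0f} disks, {mass_per_drive_kg:.2f} kg each, "
      f"{mass_ratio:,.0f} g of disk per g of DNA")
```

By mass, the quoted figures imply roughly 233 disks at about 0.65 kg each.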
Currently, the costs associated with DNA storage are estimated to be fairly high, at $12,400 per megabyte to write the data and $220 per megabyte to read it, but these costs are falling significantly faster than those of other electronics. Benefits of this system, such as the one-time writing cost, are expected to drive the increased use of DNA storage systems.
This technology has one more interesting application: the DNA used to store data could very well be the DNA in your skin. Because skin cells have a short lifespan, data stored within this DNA would survive for only a short time. This would allow secure transmission of sensitive information, with the assurance that it would be destroyed soon after the recipient had seen it.
Looking to the future, DNA may no longer play just a biological role in our lives. Soon it could be cheaper for companies to keep DNA archives, rather than a warehouse full of hard drives.