From DVD to DNA: Next-generation DNA Data Storage

Chris Tachibana

A book written in oligonucleotides demonstrates an in vitro, DNA-based system for archiving and retrieving information.

DNA is an ideal data storage material. It can be stable for millennia and readable when damaged. Unlike floppy disks, the equipment for reading and writing DNA data—polymerases and nucleotides—won’t be obsolete anytime soon.

A next-generation system for archiving information in DNA just passed a proof-of-concept test. A book of 53,426 words, 11 images, and a JavaScript program was encoded into DNA and successfully read again afterward. The book was the HTML version of a draft text on synthetic biology by George Church, professor at Harvard Medical School and lead author of the Science Express report (1). Instead of a classic such as Moby Dick, the research team chose Church’s book, says Sriram Kosuri, co-author of the study and staff scientist at Harvard University’s Wyss Institute for Biologically Inspired Engineering, because it contained HTML tags and other “modern formatting.”

Comparison of data storage methods. Reprinted with permission from AAAS.

The book was divided into 96-bit chunks of digital data that were converted into the sequences of nearly 55,000 oligonucleotides, using A or C for zero and G or T for one. This flexibility in coding meant that sequences with secondary structure or repeats, which are difficult to sequence, could be avoided. Each oligonucleotide was synthesized with 96 nucleotides of data, flanking sequences for amplification and sequencing, and a barcode indicating its location in the book, similar to page and line numbers. The book was now storable in a test tube. Reading the book was slightly more difficult than operating a Kindle e-reader. The oligonucleotides were amplified by PCR and sequenced using the flanking regions. Using only sequences with the correct length and a perfect barcode, the researchers reassembled and decoded the information, finding only 10 errors among the 5.27 million bits.
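The one-bit-per-base scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the 19-bit barcode width, the function names, and the rule for choosing between the two allowed bases (never repeat the previous base) are all assumptions made for this sketch, and the flanking amplification sequences are left out. The 96-bit chunk size and the A/C and G/T mapping come from the article; at 96 bits per chunk, 5.27 million bits works out to roughly 54,900 chunks, consistent with "nearly 55,000" oligonucleotides.

```python
ZERO_BASES = "AC"   # either base encodes bit 0 (from the article)
ONE_BASES = "GT"    # either base encodes bit 1 (from the article)
CHUNK_BITS = 96     # data bits per oligonucleotide (from the article)
ADDRESS_BITS = 19   # assumed barcode width; the article does not state it


def bits_to_dna(bits: str) -> str:
    """Map each bit to a base, never repeating the previous base."""
    bases = []
    for bit in bits:
        first, second = ZERO_BASES if bit == "0" else ONE_BASES
        # Two bases are allowed per bit, so one always differs from the last.
        bases.append(second if bases and bases[-1] == first else first)
    return "".join(bases)


def dna_to_bits(seq: str) -> str:
    """Recover the bits; which of the two allowed bases was used is irrelevant."""
    return "".join("0" if base in ZERO_BASES else "1" for base in seq)


def encode(data: bytes) -> list[str]:
    """Split data into 96-bit chunks, each prefixed with a binary address."""
    bits = "".join(format(byte, "08b") for byte in data)
    bits += "0" * (-len(bits) % CHUNK_BITS)  # zero-pad the final chunk
    oligos = []
    for number, start in enumerate(range(0, len(bits), CHUNK_BITS)):
        address = format(number, f"0{ADDRESS_BITS}b")
        oligos.append(bits_to_dna(address + bits[start:start + CHUNK_BITS]))
    return oligos


def decode(reads: list[str], n_bytes: int) -> bytes:
    """Keep only correct-length reads, order them by address, rebuild the bytes."""
    chunks = {}
    for read in reads:
        if len(read) != ADDRESS_BITS + CHUNK_BITS or not set(read) <= set("ACGT"):
            continue  # discard malformed reads
        bits = dna_to_bits(read)
        chunks[int(bits[:ADDRESS_BITS], 2)] = bits[ADDRESS_BITS:]
    stream = "".join(chunks[number] for number in sorted(chunks))
    return bytes(int(stream[i:i + 8], 2) for i in range(0, n_bytes * 8, 8))


message = b"DNA is an ideal data storage material."
oligos = encode(message)
# Reads come back from sequencing in no particular order; the address restores it.
restored = decode(list(reversed(oligos)), len(message))
```

The base-selection rule here only prevents runs of a single base. Avoiding longer repeats or secondary structure, as the article describes, would need a more careful choice between the two bases available for each bit.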

The main advantage of this system over earlier DNA storage mechanisms is that it is entirely in vitro, avoiding time-consuming cloning in bacteria. However, the new system is neither rewritable nor searchable: changing the information requires re-synthesis, and seeking requires sequencing. So, don’t expect your library in a microtiter plate soon. The project was mainly to show how DNA could be used for high-density, long-lived information archiving, said Kosuri, calling the study “a good starting point for thinking about alternative storage mechanisms.”

As we accumulate digital information, explained Kosuri, we need better archiving technologies. Three-dimensional polymers such as DNA store information much more densely than formats that encode data on a surface. A DNA-based information storage system can be 10 orders of magnitude more dense (measured as log10 bits per mm3) than a CD. Kosuri estimates that a petabyte of information (1000 terabytes) can be stored in less than 1.5 milligrams of DNA.
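As a back-of-the-envelope check, Kosuri's estimate can be turned into a storage density per gram. The short Python sketch below uses only the two figures quoted above, a petabyte and 1.5 milligrams; the function name and the 8-bits-per-byte conversion are the only additions.

```python
import math


def bits_per_gram(total_bytes: float, dna_grams: float) -> float:
    """Storage density implied by a byte count and a mass of DNA."""
    return total_bytes * 8 / dna_grams


PETABYTE = 1000 * 10**12  # 1,000 terabytes, as defined in the article
density = bits_per_gram(PETABYTE, 1.5e-3)  # 1.5 milligrams expressed in grams
print(f"{density:.2e} bits per gram (log10 = {math.log10(density):.1f})")
```

Because the estimate is "less than 1.5 milligrams," the result of about 5.3 × 10^18 bits per gram is a lower bound on the implied density.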

Besides advancing information storage, the study was groundbreaking because the senior investigator did the bench work. “It was a bit of a role reversal for us, but [Church] wanted to get back in the lab, so he did most of the work,” said Kosuri. Church said doing the PCR and sequencing preparation was “like riding a bicycle. It all came back.”

1. Church, G.M., Y. Gao, and S. Kosuri. 2012. Next-Generation Digital Information Storage in DNA. Science Express, 16 August 2012, page 1. doi:10.1126/science.1226355

Keywords:  DNA storage