A new MIT technique for storing DNA files in capsules promises to make later retrieval easier
Data storage in DNA already looked promising back in 2007, and in the years since it has drawn wider interest, including from Microsoft, which has gone on to develop a machine that automates the process. That interest is far from fading, and now a team of researchers has devised a new process for storing and retrieving information in DNA.
The search for a practical (and above all affordable) way to store information in this biomolecule is driven by its advantages: data takes up far less space and promises to last a very long time. Cost is one obstacle. Another is retrieving stored files, which is not yet as simple or practical as researchers would like, and that is where this new technique could bring improvements.
Fluorescence and Boolean logic to find files faster
The idea of storing data in DNA is based on encoding bits of information (zeros and ones) in DNA sequences. DNA (deoxyribonucleic acid) is a chain of small units called nucleotides, each identified by one of its parts, the nitrogenous base, which gives it a letter (the initial of that base): A (adenine), T (thymine), C (cytosine), and G (guanine). Each link in the chain is one of these letters, and their order determines the genes and which protein is expressed, hence the genetic code.
Broadly speaking, then, it is a matter of translating binary zeros and ones into A, T, C and G, taking advantage of the fact that one nucleotide is worth roughly two bits. Since those two bits occupy roughly 1 cubic nanometer, an exabyte of data (for example, from a data center) stored in DNA would fit more or less in the palm of a hand.
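To make the two-bits-per-nucleotide idea concrete, here is a minimal Python sketch. The specific mapping (00 to A, 01 to C, 10 to G, 11 to T) is an assumption chosen for illustration, not the scheme used by the MIT team; real encodings add error correction and avoid problematic sequences. The volume helper simply applies the article's rough figure of 1 cubic nanometer per nucleotide.

```python
# Hypothetical 2-bit mapping, for illustration only; not the researchers' scheme.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a nucleotide string, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Reverse of encode: read two bits per base and regroup into bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def volume_mm3(num_bytes: float, nm3_per_base: float = 1.0) -> float:
    """Rough raw volume, assuming about 1 cubic nanometer per nucleotide."""
    bases = num_bytes * 8 / 2          # two bits per nucleotide
    return bases * nm3_per_base * 1e-18  # 1 nm^3 = 1e-18 mm^3


if __name__ == "__main__":
    strand = encode(b"GIF")
    print(strand, decode(strand))
    print(volume_mm3(1e18))  # one exabyte, in cubic millimeters
```

Under these assumptions, three bytes become a 12-base strand, and an exabyte works out to a few cubic millimeters of raw DNA, before counting any capsule or packaging material.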
The method is promising, but the two points mentioned above remain the main bottlenecks: cost, and retrieving a specific file from among all those that have been stored. For now, retrieval resembles what the cell's own machinery is thought to do: it relies on recognizing a marker (a primer) to find and amplify the desired sequence. But this can lead to confusion between the primer and sequences from other files and, as MIT explains, the process consumes much of the DNA in the sample (and requires many enzymes).
Against this backdrop, the MIT researchers have published in Nature Materials a new technique for recovering files in a more practical way, by encapsulating the DNA in silica particles. Each of these particles is labeled with a barcode-like DNA sequence that corresponds to the content of the file.
The researchers encoded 20 images in DNA molecules of about 3,000 nucleotides each, which is equivalent to about 100 bytes (although, as described, the capsules could hold up to 1 GB). Each file was tagged with clear labels, such as “cat” or “airplane”, so that primers corresponding to those labels could be used to find it.
These primers were labeled with fluorescent or magnetic particles, making it easier to identify matches with the desired DNA sequence. As the researchers explain, this allows the desired DNA fragment (and therefore the file) to be recovered without damaging the rest of the DNA. It also allows searches similar to how the Google search engine handles images, since it supports Boolean operators: a query like “animal AND white” returns “cat” as the result.
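The Boolean search described above can be pictured with a small Python sketch. It models only the selection logic, not the lab procedure: each capsule is reduced to a set of barcode labels, and a query keeps the capsules that carry every required label and none of the excluded ones. The catalog contents, the label names beyond “cat”, “airplane”, “animal” and “white”, and the `search` function are all hypothetical.

```python
# Hypothetical capsule catalog: file name -> set of barcode labels.
CAPSULES = {
    "cat": {"animal", "white", "domestic"},
    "tiger": {"animal", "orange", "wild"},
    "airplane": {"vehicle", "white", "flying"},
}


def search(capsules, all_of=(), none_of=()):
    """Return names of capsules carrying every label in all_of and none in none_of."""
    wanted, banned = set(all_of), set(none_of)
    return sorted(
        name
        for name, labels in capsules.items()
        if wanted <= labels and not (banned & labels)
    )


if __name__ == "__main__":
    print(search(CAPSULES, all_of=["animal", "white"]))       # animal AND white
    print(search(CAPSULES, all_of=["white"], none_of=["animal"]))  # white AND NOT animal
```

With this made-up catalog, “animal AND white” selects only the cat file, mirroring the example in the text; in the lab, each label match would correspond to a fluorescent or magnetic probe binding to a capsule's barcode.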
A small step toward promising but still expensive storage
Another advantage of this new technique is that existing laboratory instruments and techniques can be used for sequencing, amplifying and otherwise handling the DNA. Mark Bathe, one of the researchers on the project, sees it being useful in the future for information that is not accessed on a regular basis.
Nevertheless, cost is still an issue. According to MIT, it would currently cost 1 trillion dollars to store a petabyte of data (1 million GB), and Bathe estimates it will be at least a decade (or two) before DNA begins to be cost-competitive with magnetic storage.
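To put that figure in more familiar units, the arithmetic can be sketched in a few lines of Python. The only inputs are the numbers quoted above (1 trillion dollars per petabyte, 1 million GB per petabyte); the function name is just for illustration.

```python
COST_PER_PETABYTE_USD = 1e12   # figure quoted in the article
GB_PER_PETABYTE = 1_000_000    # decimal units, as in the article


def cost_usd(gigabytes: float) -> float:
    """Scale the quoted per-petabyte cost linearly to a given size in GB."""
    return COST_PER_PETABYTE_USD / GB_PER_PETABYTE * gigabytes


if __name__ == "__main__":
    print(cost_usd(1))      # one gigabyte
    print(cost_usd(0.001))  # one megabyte
```

At the quoted rate, that comes to about 1 million dollars per gigabyte, or roughly 1,000 dollars per megabyte, assuming the cost scales linearly with size.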
So it remains to be seen whether, as the processes get cheaper as predicted, more efficient systems are also found in terms of write speed. Meanwhile, what we do know for sure is that GIFs can be stored in DNA, so GIF culture is saved.
Image | Vectorjuice (Freepik)