DNA Sequencing in the Pharma Industry Demands Cloud Computing for Next-Level Data Management

Exponential growth in computing power has driven up the amount of scientific data produced, managed and analyzed over the last decade, turning biology into a data-intensive science, according to a new report by healthcare experts GBI Research.

The report states that the advent of genomics will change our understanding of biology and human disease, but that cloud computing must step up to store and share the enormous volume of data involved.

Research in the pharmaceutical industry has moved towards next-generation sequencing, and research centers all over the globe are generating thousands of gigabytes of DNA sequences. More than 10,000 human genomes had been completely sequenced by the end of 2011, and it is estimated that more than a million could be sequenced by 2015. In addition to genome sequencing, studying whole-genome expression data also reveals information on the normal and diseased states of the human body. Although large amounts of genomic data, coupled with other clinical and biological texts, are easily available for download, no conceptual framework currently exists to integrate all of it. This is where cloud computing can help.
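To put the sequencing figures above in perspective, the short Python sketch below works out the growth rate they imply. It assumes, purely for illustration, that the number of sequenced genomes grows smoothly and exponentially over four years, from a round 10,000 at the end of 2011 to a round 1,000,000 in 2015. The four-year span and the round starting and ending values are assumptions made here, not figures from the GBI Research report.

```python
import math

# Round figures quoted in the text above.
genomes_end_2011 = 10_000
genomes_2015 = 1_000_000

# Assumption (not from the report): smooth exponential growth over four years.
years = 4

# Yearly multiplier needed to get from the first figure to the second.
annual_factor = (genomes_2015 / genomes_end_2011) ** (1 / years)

# Time needed for the number of sequenced genomes to double at that rate.
doubling_months = 12 * math.log(2) / math.log(annual_factor)

print(f"Implied annual growth factor: {annual_factor:.2f}x")   # 3.16x
print(f"Implied doubling time: {doubling_months:.1f} months")  # 7.2 months
```

Under these assumptions, the number of sequenced genomes would roughly triple every year, doubling about every seven months. Storage and analysis capacity would have to keep pace with that rate.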

A biomedical cloud holding large amounts of publicly available data on biology, medicine, technology and healthcare could be accessed over a secure platform, by individuals on personal devices and by companies through large data centers. The cloud could also enable the use of software programs such as Crossbow, which is capable of analyzing an entire human genome in a single day.

Global pharmaceutical company Merck has used cloud computing since 2003, one of the earliest uses of cloud computing platforms by a life sciences company. Intensive drug research generated massive amounts of genotype and gene expression data, and Merck built one of the largest computer networks in the pharmaceutical industry to deal with it. With the eventual advent of next-generation sequencing, Merck examined the option of the cloud service that Amazon had just launched.

In early 2009, when Merck shut down its genomics operations, the data it had generated was inherited by Sage Bionetworks, a not-for-profit, open-access medical research organization. Sage is now exploring other cloud computing services, as sequencing data continues to grow exponentially. This transfer of scientific knowledge from a pharma giant to a charitable research body represents an exciting movement in the medical field, and a biomedical cloud with shared genomic data would further this communal element of medical discovery.