The Personal Proteome: What It Can Tell Us and How Current Advances in Technology Could Make the Proteome an Everyday Measurement

Featured Article

 The Personal Proteome: What It Can Tell Us and How Current Advances in Technology Could Make the Proteome an Everyday Measurement

Public interest in consumer-level personalized health monitoring is ever increasing. With access to wearables that measure a myriad of physical metrics to services that offer to probe one’s DNA, the options are seemingly endless. Physicians routinely measure metabolites,1 lipids, and cell counts, and visually inspect tissue samples in order to make a personalized assessment of patients. However, missing from this type of monitoring are the very biological components responsible for orchestrating the day-to-day chemistry of our lives: proteins.

Proteins sit between our DNA blueprints and the metabolic end products. They are responsible for the careful breakdown of all we ingest, the systematic assembly of cellular components, and the complex messaging system our bodies rely on to function in a coordinated manner. Yet, we do not measure them routinely in a complete picture, but rather infrequently and individually. This is like looking at a single tax return to understand a country’s GDP. To take the analogy a step further, genetic testing gives us a chance to reveal the underlying tax code, but there currently lacks a method to survey the hundreds of proteins that define the overall economics of our well-being. Although there is plenty of evidence that genomic factors have an important impact on assessing the risk of common diseases, to date there has been little use of this information in prevention. Perhaps proteomic factors can influence prevention, but first we have to begin collecting this data, ideally from longitudinal sampling across large groups of people, to determine if it is possible.

What is the personal proteome?

The personal proteome is simply the simultaneous analytical measurement of proteins found within the human blood plasma.2 Samples can be collected using a variety of methods, ranging from typical several-milliliter venipuncture collections to microliters collected from a simple finger prick, akin to what diabetics do to test blood glucose levels. Plasma from these collections can be processed through standardized bench protocols3 to transform proteins into smaller peptides and prepare the sample for analysis by liquid chromatography/mass spectrometry (LC/MS). Utilizing standardized analytical methods, the proteome is essentially digitized to record peptide abundance and amino acid sequence. As shown in Figure 1, the personal proteome contains an abundance of biologically rich information that can elucidate characteristics of the underlying genetics based on observation of known protein variations, inform about current biological state, and provide markers for changes observed over time.

ImageFigure 1 –  LC/MS molecular feature map, where the horizontal axis rep- resents by mass over charge in units of Daltons, at a resolution of 0.001 Daltons, and the vertical axis represents chromatographic elution time in minutes at a resolution of 1 second. The dark background represents minimally observed signal, while the bright yellow represents maximally observed signal. The observation of individually bright points (molecular features) can be associated to distinct peptides, with a secondary dataset (not shown) detailing the amino acid composition. On close inspection for every molecular feature, the isotopic contributions of carbon-13 are observable.

What the personal proteome can tell us

Proteomics can tell us about state, whereas genomics tells us about fate. As an example, genetics can only account for the morphology of the insect Rhopalocera as being either a caterpillar or a butterfly, while proteomics can determine which one it currently is.

Human plasma contains proteins from most, if not all, tissues in the body.It is the highway by which the body functions and is collected every day in doctors’ offices and in testing centers all over the globe, yet the requirement for analytically measuring the proteome is a mere 10 microliters. From this sample, thousands of proteins can be simultaneously measured in both abundance and composition, offering a potential for several-thousand single marker analytes to track, and a near infinite number of combinations that could be used to discern anything from age to the more practical observation of early-onset diseases.

Aside from the obvious accounting of proteins found within plasma and the semiquantitative measure each of them can impart either alone or over time, there are thousands of single amino acid polymorphisms related to diseases and disorders.5 To this last point, the natural variation of protein amino acid sequences seems to elude the majority of current scientific research in the field of high-throughput proteomics, which is focused on the quantitation of known proteins, rather than the characterization and composition. This is a bit like trying to determine if life exists in the universe based only on the brightness of a star, ignoring whether or not a planet orbits within a comfortable distance from the star, or the measurement of a planet’s atmospheric chemistry.

How advances in technology could make the proteome an everyday measurement

The ability to access the personal proteome is increasingly becoming a reality through a combination of technological advancements in several areas. Starting with sample collection, advances in integrated microsampling devices that can collect and stabilize samples into a single device lower the barrier to sample collection and open the possibility for self-collection without the need for trained medical staff. Furthermore, these advances make longitudinal sampling from the sample person a possibility, particularly if sample collection could be accomplished at home by the person providing the sample.

While a decade ago, proteomics was a labor-intensive bench exercise that required dozens of laboratory personnel to accomplish with any degree of throughput, this process can be more effectively scaled through automation. In addition, materials and reagents required in the processing workflow have become well-standardized and suited to higher throughputs. Additionally, as advances in cloud computing have facilitated the automation and reproducibility of the analysis, the costs associated with big-data biology have dropped considerably.

The authors estimate that over 100 individual samples, roughly two 96-well plates containing 60 samples, along with QC and normalization standards can be processed on a single LC/MS instrument per day, thus defining a “production line.” While this production line forms a distinct unit to scale, the actual bottleneck is the serial operation of the LC/MS, not the massive parallelization of the automated liquid handlers that can scale sample preparation without the need to add technical staff. It is this component that mainly drives the higher cost of a lower-throughput lab. Figure 2 illustrates the higher costs associated with lower throughputs, and hints at fairly aggressive price points achievable with higher throughputs. In a relatively small lab, a single technician may be responsible for several aspects of the sample preparation and data collection process, and asynchronous maintenance and operations are not possible. In a larger lab, i.e., one that is able to process 1000 samples per day or more, lab technicians can be dedicated to specific tasks and work in parallel to increase overall efficiency.

ImageFigure 2 – Estimated costs per sample as a function of laboratory throughput. The dark blue line indicates the median of the cost estimate, and the light blue lines the 90% CI of the estimate. As a simple validation, the current cost estimates for small study sizes are in line with current market prices.

Given the technological advances that make production-level high-throughput proteomics a possibility, and the cost savings associated with economies of scale, the personal proteome has real potential to become a daily assessment tool that could benefit individuals needing to track disease progression or recovery outcome. While this seems tantalizing enough, the biggest advances will come from collecting the proteomes of a million individuals longitudinally, and allow data-mining and machine-learning algorithms to uncover the biological patterns that can help to signal and prevent disease.

References

  1. Bouatra, S.; Aziat, F. et al. The human urine metabolome. PloS One 2013, 8(9), e73076.
  2. Farrah, T.; Deutsch, E.W. et al. A high-confidence human plasma proteome reference set with estimated concentrations in PeptideAtlas. Mol. Cell. Proteomics 2011, 10(9), M110.006353.
  3. Issaq, H.J.; Xiao, Z. et al. Serum and plasma proteomics. Chem. Rev. 2007, 107(8), 3601–20.
  4. Anderson, N.L. and Norman G. Anderson, N.G. The human plasma proteome: history, character, and diagnostic prospects. Mol. Cell. Proteomics 2002, 1(11), 845–67.
  5. de Beer, T.A.P.; Laskowski, R.A. et al. Amino acid changes in disease-associated variants differ radically from variants observed in the 1000 genomes project dataset. PLoS Computational Biol. 2013, 9(12), e1003382.

 Jeff Jones, Ph.D., is CEO, and Ryan Benz, Ph.D., is CDS, SoCal Bioinformatics Inc., 2029 Verdugo Blvd. #739, Montrose, CA 91020-1626, U.S.A.; tel.: 323-285-0745; e-mail: [email protected]www.socalbioinformatics.com

Related Products