Ask the Scientist: The Present and Future of Proteomics

In the last two decades of scientific innovation, the field of proteomics has not seen the kind of advances that other disciplines have enjoyed.

Proteomics is still very much evolving. According to Asim Siddiqui, VP of Research and Technology Development at Seer, we’ve just begun to develop tools that enable a deep, unbiased look at the proteome, and every day we’re learning how proteoforms influence human health.

In this Ask the Scientist, Siddiqui gives a broad look at the present—and future—of proteomics, including current bottlenecks, analytical challenges and opportunities, single-molecule innovations and more. 

Q: Over the past 15 years, the fields of genomics and transcriptomics have undergone incredible advancements. In your opinion, why did that technology explosion evade the field of proteomics?

A: There are two main reasons why genomics advanced more rapidly than proteomics. First, proteomics was limited by the available molecular biology tools in a way that genomics wasn’t. There are several natural processes for manipulating DNA, including replication, which led to the development of PCR amplification. When you can make many copies of a molecule, it is easier and faster to study than a single molecule, because you can average the signal across a large population of identical copies, yielding a more robust measurement. There is no analogous way to copy protein molecules.
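
As a rough illustration of this averaging effect, here is a minimal sketch with assumed toy numbers (the noise model and values are illustrative, not drawn from the interview): it simulates noisy single-molecule reads and shows that averaging across many identical copies shrinks the measurement error roughly with the square root of the number of copies.

# Illustrative only: why averaging reads across many copies of a molecule gives
# a more robust measurement than a single-molecule read. Toy noise model assumed.
import random

def measure(true_signal=1.0, noise_sd=0.5):
    # One noisy read of a single molecule (Gaussian noise is an assumption).
    return random.gauss(true_signal, noise_sd)

def averaged_measurement(n_copies):
    # Average the reads across n identical copies of the molecule.
    return sum(measure() for _ in range(n_copies)) / n_copies

random.seed(0)
for n in (1, 100, 10_000):
    reps = [averaged_measurement(n) for _ in range(200)]
    mean = sum(reps) / len(reps)
    spread = (sum((r - mean) ** 2 for r in reps) / len(reps)) ** 0.5
    print(f"{n:>6} copies -> typical error ~ {spread:.3f}")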

The second factor that has made proteins more difficult to study than genes is their sheer diversity. Four DNA bases make up the code for the approximately 20,000 genes within the human genome, while 20 amino acids are the building blocks of proteins. Once post-translational modifications are made, there are about one million distinct protein variants, or proteoforms, within a single cell type. Multiply that by the number of cell types within a person, and again by the protein variation between individuals, and the proteome grows exponentially. With so much diversity, there are proteins we haven’t even discovered, let alone functionally characterized. In fact, it is estimated that of the nearly one billion genetic variants identified to date, less than 0.2% have been functionally characterized.
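
A quick back-of-the-envelope calculation makes this scale concrete; the 10-letter sequence length used for comparison is an arbitrary choice, while the variant counts are the figures cited above.

# Rough arithmetic behind the diversity argument; the sequence length k is an
# arbitrary illustration, and the variant counts are the figures cited above.
dna_alphabet, protein_alphabet = 4, 20
k = 10  # compare 10-letter stretches of DNA vs. protein
print(f"distinct DNA {k}-mers:     {dna_alphabet ** k:,}")      # ~1 million
print(f"distinct peptide {k}-mers: {protein_alphabet ** k:,}")  # ~10 trillion

variants_identified = 1_000_000_000   # "nearly one billion" genetic variants
characterized_fraction = 0.002        # "less than 0.2%"
print(f"functionally characterized so far: fewer than "
      f"{variants_identified * characterized_fraction:,.0f}")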

Q: What are the current bottlenecks of proteomics?

A: Proteomics has yet to reach its full potential because the field’s research tools are inefficient at addressing this complexity, at both the data collection and data analysis stages. Processing samples in a way that enables a deep, unbiased look at the proteome is incredibly laborious and time-consuming. In addition, doing so at sufficient scale without compromising this depth is also challenging. And once that proteomic information is procured, it’s computationally challenging to derive meaning from that complex data.

Q: How do you envision overcoming these obstacles on a large scale within the entire field of proteomics?

A: There are three main approaches that researchers are using to improve proteomics. The first involves grouping similar molecules. If you can separate peptides and proteins that are distinct and group those that are similar, you can present each group of similar proteins and peptides to a detector, such as a mass spectrometer, at the same time. This is the classical approach taken by liquid chromatography-mass spectrometry (LC-MS), where LC-based separation precedes detection so that fractions can be analyzed individually.

The second approach is single-molecule focused. It’s technically challenging, but some technologies in development aim to identify each protein molecule individually. This method is challenged by the difficulty of signal processing at the single-molecule level.

The third approach uses indirect measurements to study proteins. You can tag a protein with an antibody, and then measure the abundance of the antibody rather than the protein itself. This method is limited by the fact that you never get a direct measurement of the protein. Additionally, because of the sheer diversity of the proteome, you cannot tag novel, individual proteoforms; we cannot make tags for protein molecules we don’t yet know exist. We know that individual proteoforms are functionally relevant to understanding the human body, so proteomic analysis needs to be able to home in at the proteoform level, rather than just at the protein level. A deep, unbiased approach to identifying proteoforms at the peptide level is required to overcome these challenges, and this is precisely the premise upon which Seer was founded.

Q: How specifically does Seer contribute to bringing about the next frontier of proteomics?

A: Seer’s technology uses mass spectrometry and fundamentally improves upon the challenge of grouping similar peptides and proteins. Grouping proteins by liquid chromatography can be difficult due to the vast dynamic range of protein abundance in a given sample, such as human plasma. A highly abundant peptide may elute at the same time as a low-abundance species, drowning out the signal from the lower-abundance protein. The current standard for addressing this issue is a process called fractionation, which separates peptides in an orthogonal manner. However, this approach is incredibly complex, time-consuming and laborious, limiting the size of studies. In fact, we believe that prior to Seer, the largest study of this type included only 40 or 50 samples.

Seer has developed a proprietary, automated, nanoparticle-based technology that is faster and easier to run than fractionation. Each nanoparticle has a unique functionalization on its surface, which means it attracts the subset of proteins with an affinity for that functionalization. This affinity compresses the dynamic range of the proteome, much as a zip file compresses data, sampling across the range of proteins present in a highly reproducible way and enabling the detector to observe low-abundance proteins more easily. As we access more and more of the proteome and studies scale, data analysis will become a bottleneck for the field, so Seer has made great strides in data processing to enable studies at scale to run more seamlessly. We’ve scaled our computing in parallel with our processing, enabling deep, unbiased analyses that are orders of magnitude larger than what’s been done previously, at a fraction of the time and cost.
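
The compression idea can be pictured with a toy saturation-binding model. This is an assumed simplification for intuition only, not a description of Seer’s proprietary nanoparticle chemistry: a capture surface with finite capacity saturates for very abundant proteins while still capturing rarer ones roughly in proportion to their concentration, narrowing the spread before anything reaches the detector.

# Toy illustration of dynamic-range compression via saturating surface binding.
# Langmuir-style model assumed for intuition only; not Seer's actual chemistry.
def bound_amount(concentration, capacity=1.0, kd=1e-3):
    # Captured amount rises with concentration but plateaus at the capacity.
    return capacity * concentration / (kd + concentration)

abundances = {"high-abundance protein": 1.0,
              "mid-range protein": 1e-4,
              "low-abundance protein": 1e-8}
captured = {name: bound_amount(c) for name, c in abundances.items()}

input_range = max(abundances.values()) / min(abundances.values())
output_range = max(captured.values()) / min(captured.values())
print(f"dynamic range before capture: {input_range:,.0f}x")   # 100,000,000x
print(f"dynamic range after capture:  {output_range:,.0f}x")  # roughly 100,000x

In this toy model, an eight-orders-of-magnitude spread in input abundance collapses to roughly five orders of magnitude after capture, which is the sense in which such sampling "compresses" the dynamic range.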

Q: How important is it for us, as a human species, to significantly advance our understanding of biology, health, and disease?

A: Biology happens at the level of the proteome. The genome is like the director, operating behind the scenes to determine who appears on stage, but proteins are the actors. To understand disease mechanisms and processes such as aging, we need to understand biology at the protein level. We do this by studying proteins and how they change over time, and how they fluctuate when someone is healthy versus when they’re experiencing a disease. Understanding these underlying biological mechanisms will give us novel insights into early detection of disease, new interventions and therapies, and even cures.

Q: What has recent published research using Seer's novel technology revealed?

A: Working in non-small cell lung cancer, we found evidence of novel protein isoforms that potentially differentiate between cancer and healthy states, pointing to disease mechanisms. We also conducted a study in Alzheimer’s disease patients that combined proteomics and genomics, generating 4,706 protein groups and an additional ~1,400 variant peptides. This study yielded a classifier separating control patients from disease patients, along with a set of protein quantitative trait loci (pQTLs) that may provide biological insight into the mechanisms underlying Alzheimer’s disease.

 
