Accelerating Biomarker Development in Translational Cancer Research: How the Latest Proteomics Workflows Offer Enhanced Reproducibility and Scalability

The worldwide cancer burden rose to 18.1 million new cases and 9.6 million deaths in 2018, according to the latest estimates from the World Health Organization [1]. As the global population ages, cancer incidence is expected to continue rising, with the number of new cases predicted to increase by a staggering 61.7% by 2040 [2].

Against this backdrop, the development of improved strategies for cancer prevention, diagnosis and treatment is a key priority for cancer researchers. However, a major hurdle in the battle against cancer is the considerable heterogeneity in disease progression and clinical presentation between individuals, resulting in large variability in patient response to ‘one-size-fits-all’ treatments. Therefore, many cancer research laboratories are working towards identifying protein biomarkers that can be used to inform treatment decisions designed around the needs of individual patients. Ultimately, it is hoped that the clinical use of these biomarkers will facilitate early diagnosis, improve monitoring and increase the potential for more effective personalized treatments.

Recent advances in the analytical capabilities of translational proteomics workflows are accelerating the identification of promising protein biomarkers. In particular, improvements in the sensitivity, accuracy and mass resolution of modern liquid chromatography-mass spectrometry (LC-MS) systems are now supporting the reliable identification and quantification of large numbers of proteins and peptides present in complex biological samples, leading to significant advances in biomarker discovery.

While biomarker discovery has accelerated over the last decade, progress has been slower in the verification and validation studies required in the later stages of the translational proteomics pipeline. A key challenge here has been the need to develop workflows suitable for large-cohort proteomics studies. With international cooperation in oncology research increasingly encouraged by initiatives such as the US Cancer Moonshot, these studies are often conducted across multiple sites in different geographical locations, raising additional requirements in terms of reproducibility, transferability and scalability. Ensuring consistency between sites has traditionally been a significant hurdle: workflows must be streamlined and fully standardized, with experimental steps well-defined all the way from sample preparation to data analysis. Furthermore, processes must be sufficiently robust to tolerate differences in operators, biological samples and reagent batches, and to ensure consistency across experiments and sites.

The considerable difficulty in developing proteomics workflows to meet these requirements has historically held back progress in translating promising biomarkers to the clinic. However, recent advances in next-generation LC-MS workflows are now providing solutions to the challenges around reproducibility and scalability. In this article, we discuss how the latest MS techniques are improving the accuracy, consistency and throughput of methods used throughout the translational proteomics pipeline, and highlight how a recent Cancer Moonshot study demonstrates the effectiveness of the latest workflows for multi-site studies involving large numbers of patients.

Supporting scalability and reproducibility with advanced label-free data-dependent acquisition workflows

In the earliest stages of translational biomarker development, scalability and inter-site reproducibility are not pressing concerns, as initial discovery studies typically involve the analysis of a limited number of patient samples. As such, many laboratories use small-scale multiplexing tools such as tandem mass tags (TMTs) to accelerate the identification of promising protein biomarkers by allowing multiple samples to be analyzed simultaneously.
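To make the multiplexing idea concrete, the short Python sketch below shows how an isobaric-labeling experiment reads out several samples from a single spectrum: each TMT channel corresponds to one labeled sample, so the number of samples per run is capped by the plex size of the kit. The channel names follow the standard TMT reporter convention, but the intensities are invented for illustration.

# Hedged sketch of isobaric (TMT-style) readout: in a multiplexed run,
# each MS2 spectrum carries one reporter ion per labeled sample, so a
# single injection quantifies up to "plex-size" samples at once.
# Channel names follow the TMT reporter convention; intensities are invented.

reporter_intensities = {
    "126":  1.1e5, "127N": 0.9e5, "127C": 1.4e5,
    "128N": 1.0e5, "128C": 0.8e5, "129N": 1.2e5,
}

total = sum(reporter_intensities.values())
for channel, intensity in reporter_intensities.items():
    print(f"channel {channel}: {intensity / total:.1%} of reporter signal")

A single injection thus quantifies as many samples as there are channels, which is what makes TMT so efficient for small discovery cohorts, and what limits it for larger ones.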

Although TMT workflows have proven to be very effective in identifying promising biomarker candidates in discovery-stage research, these approaches may not provide a scalable solution for the larger-cohort proteomics studies used for biomarker verification and validation. Given the vast number of patient samples to analyze, the proteomics workflows employed in these latter stages of the translational pipeline require far greater throughput and scalability. 

Label-free data-dependent acquisition (DDA) methods are frequently used to meet these challenges, enabling researchers to compare the relative abundance of proteins across multiple LC-MS/MS experiments without employing isotopic tags. In contrast to TMT workflows, where the number of samples that can be included in each run is limited by the plexing capacity of the kit, label-free DDA workflows support comparisons across an effectively unlimited number of samples.
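As a minimal sketch of this contrast, the Python snippet below compares MS1-level protein intensities across label-free runs; the sample names and intensity values are invented, and a real workflow would use integrated peak areas after retention-time alignment and normalization. The key point is simply that new runs can be appended indefinitely.

# Minimal sketch of label-free relative quantification across runs.
# Intensities are illustrative; real workflows use integrated MS1 peak
# areas after retention-time alignment and normalization.

# Summed MS1 peptide intensities for one protein, per LC-MS/MS run.
runs = {
    "patient_001": 1.8e7,
    "patient_002": 2.4e7,
    "patient_003": 0.9e7,
    # ...any number of additional runs can be appended; there is no
    # fixed "plex" limit as with isobaric labeling.
}

reference = "patient_001"

for run, intensity in runs.items():
    fold_change = intensity / runs[reference]
    print(f"{run}: {fold_change:.2f}x relative to {reference}")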

These scalable workflows rely on performing tandem MS analysis on only the most abundant precursor ions in each survey scan. This allows researchers to minimize the analysis of redundant precursors, extending proteome coverage and improving throughput and efficiency. However, because each sample is measured individually with this technique, inconsistencies in sample preparation or instrument performance tend to generate greater variability. Consequently, reproducibility has traditionally been a limitation of label-free DDA experiments, and these studies therefore typically require more repeat measurements [3]. This is a key concern for biomarker verification and validation studies, which are often conducted across multiple sites in different geographical locations: the proteomics workflows used must not only be scalable and high-throughput, but must also generate reproducible results that are consistent between laboratories.
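The acquisition logic itself can be illustrated in a few lines. The hedged Python sketch below mimics a "top-N" DDA cycle: from each simulated MS1 survey scan, only the N most intense precursors are selected for fragmentation, with a simple exclusion set standing in for the dynamic exclusion that real instruments use to avoid re-sequencing abundant precursors. All m/z and intensity values are invented.

# Hedged sketch of top-N precursor selection in DDA, with a simplified
# stand-in for dynamic exclusion. Values are invented for illustration.

def select_precursors(survey_scan, top_n, excluded):
    """Pick the top_n most intense precursors not on the exclusion list."""
    candidates = [(mz, inten) for mz, inten in survey_scan if mz not in excluded]
    candidates.sort(key=lambda p: p[1], reverse=True)  # most intense first
    return [mz for mz, _ in candidates[:top_n]]

# Simulated MS1 survey scan: (m/z, intensity) pairs.
scan = [(445.12, 9.1e6), (512.30, 3.3e6), (623.84, 7.8e6),
        (702.41, 1.2e6), (815.55, 5.0e6)]

excluded = set()
for cycle in range(2):
    picked = select_precursors(scan, top_n=3, excluded=excluded)
    print(f"cycle {cycle + 1}: fragmenting {picked}")
    excluded.update(picked)  # exclude sequenced precursors in later cycles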

‘DDA+’ workflows combine the latest advances in LC-MS instrumentation and software to achieve more robust standardization and improved reproducibility. For example, increasingly precise capillary flow high-performance LC technologies are enhancing separation performance, improving the consistency of data generated. Furthermore, the latest high-resolution accurate mass (HRAM) MS instruments now offer improvements in sampling depth and sequencing speed, and can achieve higher sensitivity, precision and accuracy, with greater run-to-run reproducibility. When incorporated into DDA+ workflows, these advanced technologies can be used to generate high-quality and consistent data in large-scale proteomics studies. 

Data-independent acquisition workflows enhance proteome coverage while improving reproducibility

While DDA+ workflows offer an effective and scalable solution for biomarker verification and validation, it can sometimes prove challenging to achieve sufficient analytical sensitivity with these methods. In such situations, data-independent acquisition (DIA) can better meet the needs of translational proteomics workflows. Because all peptide precursors within narrow, consecutive mass-to-charge (m/z) windows are fragmented and analyzed, this method better supports the quantification of low-abundance proteins. However, given the diversity of proteins typically present in biological samples, the technique produces highly complex spectra, which can prove difficult to analyze.
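For intuition, the short Python sketch below generates the kind of consecutive, fixed-width isolation window scheme DIA relies on; the 500-900 m/z range and 20 m/z window width are arbitrary example values, not parameters of any particular instrument method.

# Illustrative sketch of a DIA isolation window scheme: the instrument
# steps through consecutive m/z windows and fragments everything in each
# window, rather than picking individual precursors as in DDA.
# Range and width below are arbitrary example values.

def dia_windows(mz_start, mz_end, width):
    """Yield consecutive (lower, upper) isolation windows covering the range."""
    lower = mz_start
    while lower < mz_end:
        upper = min(lower + width, mz_end)
        yield (lower, upper)
        lower = upper

for low, high in dia_windows(mz_start=500.0, mz_end=900.0, width=20.0):
    print(f"isolate and fragment all precursors in {low:.0f}-{high:.0f} m/z")

Narrowing the window width reduces the number of co-isolated precursors per window, and therefore the spectral complexity, at the cost of more windows per acquisition cycle.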

The latest high-resolution MS1 DIA (HRMS1-DIA) workflows are helping to address this challenge by using much narrower acquisition windows to facilitate the deconvolution of feature-rich spectra. Employing hybrid quadrupole-Orbitrap mass analyzers, which offer excellent resolution, these workflows set a new standard in quantitative sensitivity, accuracy and precision, allowing comprehensive protein profiling of complex clinical samples. By extending proteome coverage while improving data fidelity, HRMS1-DIA workflows are facilitating more reproducible profiling in translational proteomics studies.

The ability to improve reproducibility is a key advantage for biomarker verification and validation, since these studies are so frequently conducted across multiple laboratories. Indeed, encouraging evidence is now available to demonstrate the reproducibility, transferability and scalability of the latest HRMS1-DIA workflows in a multi-site context. A recent large-cohort, international Cancer Moonshot study benchmarked a high-resolution MS1-based quantitative DIA workflow employing online capillary LC coupled with a Thermo Scientific Q Exactive HF hybrid quadrupole-Orbitrap mass spectrometer [4]. The one-hour workflow was applied to profile the proteomes of human, yeast and E. coli samples across eleven different laboratories, following standardized procedures and using identical instrument platforms and software. Data from each laboratory were processed individually and then combined to assess the reproducibility of protein identification and quantification between laboratories and over time.

The study produced consistent and reproducible data across the eleven sites and over seven consecutive days, highlighting the robustness of the workflow. The proportion of protein groups identified and quantified in common exceeded 80%, both between different laboratories and across different days at the same site. Since this method achieved deep proteome coverage with such consistent results, standardized DIA workflows employing the latest MS instrumentation and software can be expected to significantly accelerate progress in biomarker verification and validation.
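The between-site comparison at the heart of such a study can be expressed simply. The Python sketch below computes a pairwise overlap fraction for the protein groups identified at different sites; the accession numbers are invented, and the study's own overlap metric may be defined differently (for example, relative to per-site totals rather than the union).

# Hedged sketch of a between-laboratory overlap calculation, using
# invented protein-group identifiers rather than data from the study.
from itertools import combinations

identified = {
    "site_A": {"P04637", "P38398", "Q9Y6K9", "P01308"},
    "site_B": {"P04637", "P38398", "Q9Y6K9", "O15350"},
    "site_C": {"P04637", "P38398", "P01308", "O15350"},
}

for a, b in combinations(identified, 2):
    common = identified[a] & identified[b]
    union = identified[a] | identified[b]
    overlap = len(common) / len(union)  # Jaccard-style overlap fraction
    print(f"{a} vs {b}: {overlap:.0%} of protein groups in common")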

Conclusion

While the field of biomarker discovery is progressing rapidly, the challenges around reproducibility, transferability and scalability in biomarker validation and verification workflows still cause a major bottleneck in the translational pipeline. Recent advances in high-throughput HRMS1-DIA workflows are now enabling scientists to overcome these challenges, supporting the generation of reliable results that are reproducible between different laboratories. These workflows are set to hasten progress along the path between discovery and translational research, helping to realize the aims of the Cancer Moonshot by accelerating developments in cancer treatment, early diagnosis and prevention.


Yue Xuan is a senior product marketing manager at Thermo Fisher Scientific.

References

1. World Health Organization Press Release, 12 September 2018. https://www.who.int/cancer/PRGlobocanFinal.pdf.

2. Cancer Research UK, Worldwide cancer incidence statistics. https://www.cancerresearchuk.org/health-professional/cancer-statistics/worldwide-cancer/incidence#heading-One (Accessed August 2019).

3. L. Anderson, “Six decades searching for meaning in the proteome,” J. Proteomics, vol. 107, pp. 24–30, 2014.

4. Y. Xuan et al., “Advancing Mass Spectrometry-Based Large-Cohort Proteomics for Precision Medicine – An International Cancer Moonshot Multi-Site Study,” poster presented by Thermo Fisher Scientific, 2018. https://assets.thermofisher.com/TFS-Assets/CMD/posters/po-65231-ms-international-cancer-moonshot-multiple-site-study-asms2018-po65231-en.pdf (Accessed August 2019).

