Title: Towards a Metadata Model for Mass-Spectrometry Based Clinical Proteomics
Volume: 7
Issue: 3
Author(s): John Springer, Fan Zhang, Peter Hussey, Charles Buck, Fred Regnier and Jake Chen
Affiliation:
Keywords: Clinical proteomics, OWL, RDF, semantic web, Metadata Model, Mass-Spectrometry, proteomic biomarkers, CPAS, Cancer Biomedical Informatics Grid (caBIG).
Abstract: Recent proteomics studies of clinical samples have generated substantial interest. Aided by advances in
analytical chemistry and bioinformatics, clinical proteomics has become a driving force behind molecular biomarker
development. However, it is still difficult to manage and interpret large amounts of clinical proteomics data due to data
integration challenges. The lack of practical metadata representation standards has prevented sharing and interpretation of
mass spectrometry experimental results derived from different experimental conditions or different proteomics labs, and
ultimately this absence has resulted in missed opportunities for proteomic biomarker discovery. Therefore, in this paper,
we describe methods for deploying Semantic Web technologies to design an ontology using OWL for clinical proteomics
information and to manage such information using various mechanisms, such as CPAS. We developed a practical
proteomics experimental metadata model using Semantic Web technologies and show how this model can be
integrated with current proteomics data analysis software systems. We also show how systems employing the
metadata model can begin to enable inter-laboratory sharing and analysis of clinical proteomics data, and we
discuss how these tools and techniques have aided proteomic biomarker discovery
studies. Our work reflects an approach to adopting a Cancer Biomedical Informatics Grid (caBIG) compliant software
system through the use of an ontology-based metadata model. This effort is the first step in a larger initiative toward an
ontology-based, standards-driven approach to large-scale inter-laboratory proteomics data integration and analysis,
with the overarching goal of discovering proteomic biomarkers.