We have developed an algorithm called Q5 for probabilistic
classification of healthy vs. diseased whole serum samples using mass
spectrometry. The algorithm employs Principal Components Analysis
(PCA) followed by Linear Discriminant Analysis (LDA) on whole-spectrum
Surface-Enhanced Laser Desorption/Ionization Time of Flight
(SELDI-TOF) Mass Spectrometry (MS) data, and is demonstrated on four
real datasets from complete, complex SELDI spectra of human blood
serum. Q5 is a closed-form, exact solution to the problem of
classifying complete mass spectra of a complex protein
mixture. Q5 employs a probabilistic classification algorithm built
upon a dimension-reduced linear discriminant analysis. Our solution
is computationally efficient; it is non-iterative and computes the
optimal linear discriminant using closed-form equations. The optimal
discriminant is computed and verified on these four datasets, and
replicate experiments with different training/testing splits of each
dataset are used to verify the robustness of the algorithm. The
probabilistic classification method achieves sensitivity,
specificity, and positive predictive values above 97% on three ovarian
cancer datasets and one prostate cancer dataset. The Q5 method
outperforms previous full-spectrum classification techniques for
complex samples, and can provide clues as to the molecular
identities of differentially expressed proteins and peptides.
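
The PCA-then-LDA pipeline described above can be illustrated with a
minimal NumPy sketch. This is not the Q5 implementation. The function
names, the n_components parameter, and the use of a pooled 1-D Gaussian
model with Bayes' rule to obtain class probabilities are all assumptions
made for illustration; the abstract does not specify these details.
Every step is non-iterative: PCA comes from one SVD, and the
discriminant from one linear solve.

```python
import numpy as np


def fit_pca_lda(X, y, n_components):
    """Fit PCA followed by two-class Fisher LDA, all in closed form.

    X: (n_samples, n_features) spectra, one intensity per m/z bin.
    y: (n_samples,) labels, 0 = healthy, 1 = disease.
    n_components: number of principal components kept (illustrative choice).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    mean = X.mean(axis=0)
    Xc = X - mean

    # PCA via thin SVD; rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T            # (n_features, n_components)
    Z = Xc @ P                         # samples in PCA coordinates

    # Fisher LDA: w = pinv(Sw) (m1 - m0), Sw = within-class scatter.
    Z0, Z1 = Z[y == 0], Z[y == 1]
    m0, m1 = Z0.mean(axis=0), Z1.mean(axis=0)
    Sw = (Z0 - m0).T @ (Z0 - m0) + (Z1 - m1).T @ (Z1 - m1)
    w = np.linalg.pinv(Sw) @ (m1 - m0)

    # Map the discriminant back to the original m/z axis and normalise.
    d = P @ w
    d /= np.linalg.norm(d)

    # Assumption: projected scores of each class are Gaussian with a
    # shared (pooled) standard deviation.
    s = Xc @ d
    mu0, mu1 = s[y == 0].mean(), s[y == 1].mean()
    ss = ((s[y == 0] - mu0) ** 2).sum() + ((s[y == 1] - mu1) ** 2).sum()
    sd = np.sqrt(ss / (len(y) - 2))
    prior1 = np.mean(y == 1)
    return {"mean": mean, "direction": d, "mu": (mu0, mu1),
            "sd": sd, "prior": (1.0 - prior1, prior1)}


def predict_proba(model, X):
    """Return P(disease | spectrum) for each row of X via Bayes' rule."""
    s = (np.asarray(X, dtype=float) - model["mean"]) @ model["direction"]
    log_post = np.stack([
        np.log(model["prior"][c]) - 0.5 * ((s - model["mu"][c]) / model["sd"]) ** 2
        for c in (0, 1)
    ])                                  # shape (2, n_samples)
    log_post -= log_post.max(axis=0)    # stabilise before exponentiating
    post = np.exp(log_post)
    return post[1] / post.sum(axis=0)
```

In this sketch the fitted "direction" vector has one weight per m/z bin,
so bins with large absolute weight are the ones that most separate the
two classes. This is one plausible way a linear discriminant could point
to candidate differentially expressed peaks.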