Greiner Lab Websites

 [last update: June 2014]







PSSP: Patient-Specific Survival Prediction

There are many tools for survival analysis, e.g., for separating subpopulations (log-rank statistics for Kaplan-Meier curves) or for finding features that appear relevant for distinguishing survival time (the Cox proportional hazards model). However, these tools were not designed to predict an individual subject’s specific survival time based on all of that subject’s characteristics. Our Patient-Specific Survival Prediction (PSSP) addresses this challenge. We have found that PSSP works very effectively over many databases.
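To make the contrast concrete, here is a minimal sketch of the Kaplan-Meier estimator mentioned above. It produces one survival curve for a whole group rather than a prediction for an individual, which is the gap PSSP targets. This is not PSSP itself; the function name and the toy data are assumptions made for illustration.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival curve for one group.

    times  -- follow-up time for each subject
    events -- 1 if the event (e.g. death) was observed at that time,
              0 if the subject was censored
    Returns a list of (time, estimated survival probability) pairs,
    one per distinct time at which at least one event occurred.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Both deaths and censored subjects leave the risk set.
        n_at_risk -= removed
    return curve


# Hypothetical toy cohort: five subjects, two of them censored.
toy_curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

Every subject in the group shares the same curve, regardless of individual characteristics.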


CFM-ID: Analysis of MS/MS Spectra

CFM-ID provides a method for accurately and efficiently identifying metabolites in spectra generated by electrospray tandem mass spectrometry (ESI-MS/MS). The program uses Competitive Fragmentation Modeling to produce a probabilistic generative model for the MS/MS fragmentation process, and machine learning techniques to adapt the model parameters from data.


Bayesil: Interpreting NMR Spectra (biofluids)

Bayesil is a web system that automatically identifies and quantifies metabolites from 1D 1H NMR spectra of complex mixtures, including biofluids such as ultra-filtered plasma, serum or cerebrospinal fluid. For Bayesil to perform optimally, the NMR spectra must be collected in a standardized fashion (see How To Collect NMR Spectra for Bayesil). Bayesil first automatically performs all spectral processing steps, including Fourier transformation, phasing, solvent filtering, chemical shift referencing, baseline correction and reference line shape convolution. It then deconvolutes the resulting NMR spectrum using a reference spectral library, which here contains the signatures of more than 60 metabolites (see here for a list). This deconvolution process determines both the identity and quantity of the compounds in the biofluid mixture. Extensive testing shows that Bayesil meets or exceeds the performance of highly trained human experts.
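The idea of deconvolution against a reference library can be sketched as follows. This is a deliberately simplified stand-in, not Bayesil's actual algorithm: it assumes the mixture spectrum is a non-negative weighted sum of reference spectra and recovers the weights by coordinate-descent non-negative least squares. The peak shapes, peak positions and metabolite names are hypothetical.

```python
def lorentzian_peak(ppm, center, half_width=0.02):
    """Idealised NMR peak shape (assumption: pure Lorentzian, height 1)."""
    return half_width ** 2 / ((ppm - center) ** 2 + half_width ** 2)


def make_spectrum(axis, peak_centers):
    """Synthetic reference spectrum: a sum of unit-height peaks."""
    return [sum(lorentzian_peak(x, c) for c in peak_centers) for x in axis]


def fit_concentrations(mixture, references, sweeps=200):
    """Non-negative least squares by cyclic coordinate descent.

    Finds weights w >= 0 that minimise the squared difference between
    `mixture` and the weighted sum of the `references` spectra.
    """
    weights = [0.0] * len(references)
    residual = list(mixture)  # mixture minus the current reconstruction
    norms = [sum(v * v for v in ref) for ref in references]
    for _ in range(sweeps):
        for j, ref in enumerate(references):
            if norms[j] == 0.0:
                continue
            step = sum(r * v for r, v in zip(residual, ref)) / norms[j]
            new_weight = max(0.0, weights[j] + step)  # clip at zero
            delta = new_weight - weights[j]
            if delta != 0.0:
                residual = [r - delta * v for r, v in zip(residual, ref)]
                weights[j] = new_weight
    return weights


# Hypothetical three-compound library on a 0-10 ppm axis.
axis = [i * 0.01 for i in range(1000)]
library = {
    "metabolite_A": make_spectrum(axis, [2.0, 3.5]),
    "metabolite_B": make_spectrum(axis, [5.0]),
    "metabolite_C": make_spectrum(axis, [7.2]),
}
# Mixture containing A and B only; C is absent.
mixture = [2.0 * a + 0.5 * b
           for a, b in zip(library["metabolite_A"], library["metabolite_B"])]
estimates = dict(zip(library, fit_concentrations(mixture, list(library.values()))))
```

A weight near zero indicates that the compound is absent, so the same fit yields both identity and quantity. Real spectra add noise, shifting peaks and baseline effects that this sketch ignores.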


Proteome Analyst (PA)

Proteome Analyst (PA) is a publicly available, high-throughput, Web-based system for predicting various properties of each protein in an entire proteome. Using machine-learned classifiers, PA can predict, for example, the GeneQuiz general function and Gene Ontology (GO) molecular function of a protein. In addition, PA is currently the most accurate and most comprehensive system for predicting subcellular localization, the location within a cell where a protein performs its main function. Two other capabilities of PA are notable. First, PA can create a custom classifier to predict a new property, without requiring any programming, based on labeled training data (i.e., a set of examples, each with the correct classification label) provided by a user. PA has been used to create custom classifiers for potassium-ion channel proteins and other general-function ontologies. Second, PA provides a sophisticated explanation feature that shows why one prediction is chosen over another. The PA system produces a Naïve Bayes classifier, which is amenable to a graphical and interactive approach to explanations for its predictions; transparent predictions increase the user’s confidence in, and understanding of, PA.
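The reason a Naïve Bayes classifier lends itself to explanation is that its score for each class is a sum of one term per feature, so each feature's contribution can be shown separately. The sketch below illustrates this with a small multinomial Naïve Bayes model. It is not PA's implementation; the keyword features, labels and function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Fit a multinomial Naive Bayes model.

    examples -- list of (tokens, label) pairs, where tokens is a list
                of keyword features describing one protein.
    """
    label_counts = Counter(label for _, label in examples)
    token_counts = defaultdict(Counter)
    vocabulary = set()
    for tokens, label in examples:
        token_counts[label].update(tokens)
        vocabulary.update(tokens)
    return label_counts, token_counts, vocabulary


def explain(model, tokens):
    """Per-class breakdown of log-probability contributions.

    Returns {label: {"(prior)": ..., token: ...}}; summing the values
    for a label gives that label's (unnormalised) log score.
    """
    label_counts, token_counts, vocabulary = model
    total = sum(label_counts.values())
    report = {}
    for label, n in label_counts.items():
        contributions = {"(prior)": math.log(n / total)}
        denom = sum(token_counts[label].values()) + len(vocabulary)
        for tok in tokens:
            if tok not in vocabulary:
                continue  # unseen tokens carry no evidence either way
            # Laplace-smoothed estimate of P(token | label).
            p = (token_counts[label][tok] + 1) / denom
            contributions[tok] = contributions.get(tok, 0.0) + math.log(p)
        report[label] = contributions
    return report


def predict(model, tokens):
    """Label with the highest total log score."""
    report = explain(model, tokens)
    return max(report, key=lambda label: sum(report[label].values()))


# Hypothetical labeled training data.
model = train([
    (["kinase", "ATP-binding"], "cytoplasm"),
    (["kinase", "phosphorylation"], "cytoplasm"),
    (["transmembrane", "channel"], "membrane"),
    (["transmembrane", "receptor"], "membrane"),
])
breakdown = explain(model, ["transmembrane", "channel"])
```

Comparing the entries of `breakdown["membrane"]` and `breakdown["cytoplasm"]` token by token shows which features favored one prediction over the other.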



Intelligent Diabetes Management (IDM / EASI)

A typical patient with Type 1 diabetes must give him/herself insulin injections several times a day to keep his/her blood glucose level (BG) in an acceptable range. The amount of each injection (for each type of insulin) depends on a formula that uses his/her current BG as well as other factors, including the amount of carbohydrates that s/he is about to consume, the anticipated exercise, and previous BG levels and responses. As this specific formula can vary from patient to patient (and, for a single patient, over time), patients maintain a "Diabetes Diary" (DD) that records all of this information (glucose readings, insulin dose, carbohydrate intake and exercise) at several times each day. The patient's diabetes team periodically examines this DD to adjust the formula for this patient. Unfortunately, this important feedback may be infrequent (or perhaps even unavailable for patients in 3rd-world countries).
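For illustration only, a patient-specific dosing formula of this kind might be sketched as below. This is a simplified textbook-style form with a meal term and a correction term, not the formula used by this project or by any particular diabetes team. It omits exercise and BG history, and every parameter name and value is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DoseParameters:
    """Patient-specific settings that a diabetes team would adjust."""
    carb_ratio: float         # grams of carbohydrate covered by 1 unit
    correction_factor: float  # BG drop (mmol/L) expected from 1 unit
    target_bg: float          # desired BG (mmol/L)


def suggested_dose(params, current_bg, carbs_g):
    """Meal term plus correction term, never below zero units."""
    meal_term = carbs_g / params.carb_ratio
    correction_term = (current_bg - params.target_bg) / params.correction_factor
    return max(0.0, meal_term + correction_term)


# Hypothetical parameter values, chosen only to make the arithmetic easy.
example_params = DoseParameters(carb_ratio=10.0, correction_factor=2.0, target_bg=6.0)
example_dose = suggested_dose(example_params, current_bg=10.0, carbs_g=60.0)
```

Adjusting the formula for a patient then amounts to changing the fields of `DoseParameters` in light of the diary records.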

We are developing a tool that can automate this adjustment process: given relevant patient information (age, gender, ...) and his/her recent DD, modify the current parameters of the formula. Our initial task is to replicate the suggestions of the health care team. Later, we will explore ways to improve this process, to provide suggestions that better keep the patient's BG in the acceptable range, using techniques from Machine Learning, especially Reinforcement Learning.

Still under development -- only available with explicit permission.

The website:

The associated cell-phone app:



Two web apps for illustrating Artificial Intelligence ideas: one for Decision Trees (learning) and one for IDA* (search)

The website: 

See also