Medical Tasks

Over many years, AICML researchers (including PIs, PDFs, students and research programmers) have worked closely with teams of medical researchers, exploring ways to use patient data to produce classifiers that make accurate predictions about future patients.

These projects use various kinds of information about a patient to predict some relevant property of that patient.

We seek ways to "learn" these predictors from historical data, often augmented with other prior biological information (such as metabolic or signaling pathways).

See also Amii pages on: enhancing cancer care; fMRI-based analysis


Medical Tasks

Predicting Characteristics of Kidney Transplants - retired

Our colleagues at ATAGC (the Alberta Transplant Applied Genomics Centre) are trying to better understand why some transplants respond well while others do not. To obtain the relevant information, they have performed many hundreds of kidney transplants, in both mice and humans, and have been tracking the outcomes, recording both histological information and gene expression values (microarrays) of biopsies. These include both protocol biopsies (obtained just after the transplant) and biopsies for cause (obtained after patients develop some symptoms). This team has also defined a set of "coherent gene sets", called Pathogenesis-Based Transcript Sets (PBTs), which are intended to summarize the 50K expression values in a microarray using just a few dozen values.
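As a rough illustration of how gene sets can compress a microarray into a few values, the sketch below scores each set by the mean expression of its member genes. The averaging rule, the function name `summarize_by_gene_sets`, and the toy identifiers are all assumptions made for this example; the text above does not say how PBT scores are actually computed.

```python
from statistics import mean

def summarize_by_gene_sets(expression, gene_sets):
    """Collapse per-gene expression values into one score per gene set.

    expression: dict mapping gene/probe ID -> expression value for one biopsy.
    gene_sets:  dict mapping set name -> list of gene/probe IDs.
    Returns a dict mapping set name -> mean expression of the members that
    were measured; sets with no measured member are skipped.
    """
    scores = {}
    for name, members in gene_sets.items():
        values = [expression[g] for g in members if g in expression]
        if values:
            scores[name] = mean(values)
    return scores

# Hypothetical toy data: a real microarray has about 50K values,
# and real PBT definitions are not reproduced here.
biopsy = {"g1": 2.0, "g2": 4.0, "g3": 1.0, "g4": 3.0}
toy_sets = {"set_A": ["g1", "g2"], "set_B": ["g3", "g4", "g_missing"]}

scores = summarize_by_gene_sets(biopsy, toy_sets)
print(scores)
```

A downstream classifier would then be trained on these few set-level scores rather than on the full expression vector.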

Relevant Technologies: Histology, Microarrays, Clinical

Active Collaborative Projects:


Relevant Publications:

Predictions for Cancer Patients

Our colleagues at the Cross Cancer Institute (part of Alberta Health Services) are trying to better understand various cancers and, in particular, to determine which patients should receive which treatment. This team has amassed a wealth of important data: gene expression values (microarray) on flash-frozen tumor specimens from patients, and SNP (single nucleotide polymorphism) profiles from hundreds of patients and controls.

Relevant Technologies: Microarrays, SNPs

Active Collaborative Projects:

Previous Projects:


Joint Grants (with Drs. Damaraju and Mackey)

Relevant Publications:

Producing and Analyzing Metabolomic Profiles

Metabolites are organic compounds that are used or produced during metabolism. An organism's metabolism may change under different conditions. When a urine sample is analyzed with NMR spectroscopy, it yields a metabolomic signature characteristic of that sample. This signature can help diagnose diseases and can help doctors choose the best treatment.

Relevant Technologies: Metabolomics


Active Projects:

Joint Grants:

Relevant Publications:

Brain Tumour Analysis Project (BTAP) - retired

Our colleagues at the Cross Cancer Institute (part of the Alberta Cancer Board) are analyzing brain tumour patient data in order to better understand tumour behaviours, treatment effects, and likely patient outcomes, with the aim of using this analysis to design more efficient and effective treatments for patients. This team has assembled hundreds of expert-labeled Magnetic Resonance (MR) patient scans.

Relevant Technologies: Image Analysis

Active Collaborative Projects (please also see the project website):

Partners (see the full list):

Joint Grants:

Relevant Publications:

Intelligent Diabetes Management

A typical patient with Type I diabetes must give him/herself insulin injections several times a day, to keep his/her blood glucose level (BG) in an acceptable range. The amount of each injection (for each type of insulin) depends on a formula that uses his/her current BG, as well as other factors, including the amount of carbohydrates that s/he is about to consume, the anticipated exercise, and previous BG levels and responses. As this specific formula can vary from patient to patient (and for a single patient, over time), patients maintain an "extended glucose log" (EGL) that records all of this information (glucose readings, insulin dose, carbohydrate intake and exercise) at several times each day. The patient's diabetes team will periodically examine this EGL to adjust the formula for this patient. Unfortunately, the team's time is limited, which means this important feedback may be infrequent (or perhaps even unavailable for patients in 3rd-world countries).

Our goal is a tool that can automate this adjustment process: given relevant patient information (age, gender, BMI, ...) and his/her recent EGL, modify the current parameters of the formula. Our initial task is to replicate the suggestions of the health care team. Later, we will explore ways to improve this process, using techniques from Artificial Intelligence to provide suggestions that better keep the patient's BG in the acceptable range.
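To make the idea of a patient-specific formula with adjustable parameters more concrete, here is a minimal sketch of one EGL entry and a simple dose calculation. Everything in it is an assumption for illustration: the field names, the parameter names, the numeric values, and the form of the formula (a meal term plus a correction term, scaled down when exercise is anticipated) are not taken from the project and are not clinical guidance.

```python
from dataclasses import dataclass

@dataclass
class EGLEntry:
    """One row of a hypothetical extended glucose log (EGL)."""
    time: str             # when the reading was taken, e.g. "08:00"
    bg: float             # blood glucose reading (mmol/L assumed)
    carbs_g: float        # carbohydrates about to be consumed, in grams
    exercise: bool        # whether exercise is anticipated
    insulin_units: float  # dose the patient actually recorded

@dataclass
class DoseParams:
    """Assumed patient-specific parameters that a diabetes team could adjust."""
    target_bg: float        # desired BG level
    carb_ratio: float       # grams of carbohydrate covered by one unit
    sensitivity: float      # expected BG drop per unit of insulin
    exercise_factor: float  # multiplier applied when exercise is anticipated

def suggested_dose(entry: EGLEntry, p: DoseParams) -> float:
    """Illustrative formula: meal term plus correction term, scaled for exercise."""
    meal = entry.carbs_g / p.carb_ratio
    correction = max(0.0, (entry.bg - p.target_bg) / p.sensitivity)
    dose = meal + correction
    if entry.exercise:
        dose *= p.exercise_factor
    return round(dose, 1)

# Made-up parameter values and log entry, for illustration only.
params = DoseParams(target_bg=6.0, carb_ratio=10.0, sensitivity=2.0, exercise_factor=0.8)
entry = EGLEntry(time="08:00", bg=10.0, carbs_g=60.0, exercise=False, insulin_units=7.0)

dose = suggested_dose(entry, params)
print(dose, "suggested vs", entry.insulin_units, "recorded")
```

Under this framing, the adjustment task amounts to choosing new values for the fields of `DoseParams` based on how BG responded to past doses recorded in the log.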


More information can be found at the Project Website

Technologies Utilized


Microarrays

Microarrays are a way to assay a large amount of biological material using high-throughput screening methods. We have analyzed a variety of microarray data, including DNA microarrays.

Single Nucleotide Polymorphisms

Single Nucleotide Polymorphisms (SNPs) are variations in single nucleotides in DNA sequences. Analyzing SNP data usually means dealing with very large feature sets, which in turn often calls for dimensionality reduction.
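One simple way to shrink a large SNP feature set is to keep only the SNPs that vary most across individuals. The sketch below does this with a variance filter. It is only one of many possible reduction techniques and is not necessarily the one used in these projects. The 0/1/2 allele-count encoding, the function names, and the toy matrix are assumptions for the example.

```python
from statistics import pvariance

def top_variance_snps(genotypes, k):
    """Return the column indices of the k SNPs with the highest variance.

    genotypes: list of rows, one per individual; each entry is assumed to be
    a minor-allele count (0, 1 or 2) for one SNP.
    """
    n_snps = len(genotypes[0])
    variances = [pvariance([row[j] for row in genotypes]) for j in range(n_snps)]
    ranked = sorted(range(n_snps), key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:k])

def project(genotypes, keep):
    """Keep only the selected SNP columns for every individual."""
    return [[row[j] for j in keep] for row in genotypes]

# Toy genotype matrix: 4 individuals x 4 SNPs (real studies have far more SNPs).
toy = [
    [0, 2, 1, 0],
    [0, 0, 1, 1],
    [0, 2, 1, 0],
    [0, 0, 1, 2],
]

keep = top_variance_snps(toy, 2)
reduced = project(toy, keep)
print(keep, reduced)
```

SNPs that are identical across all individuals (columns 0 and 2 here) carry no information for distinguishing patients, so dropping them first is a cheap step before any more elaborate method.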


Metabolomics

Metabolomics is the large-scale study of small molecules in the body. It raises many data processing issues, which depend on the technology used to analyze samples (e.g. NMR, MS) and on the type of body fluid being analyzed (e.g. urine, blood).

Image Analysis

Image Analysis is the automated extraction of useful information from imaging data.


Histology

Histology is the study of the microscopic structure of tissues and their cells. We have analyzed histological data of breast cancer patients using machine learning techniques.

Subcellular Localization

Subcellular Localization refers to the location within the cell where a protein does the majority of its work.

Clinical Features

Clinical Features are those gathered from clinical studies, and describe high-level characteristics of patients (e.g. age, sex). These descriptors are often included with other analyses (e.g. Microarray, SNP), and provide valuable information about the patient's health.

Prediction Tasks

Foundational Machine Learning Issues

Dealing with the above projects requires addressing many fundamental questions in the field of machine learning.