Clinical Persona performs a variety of predictive modeling projects in research and clinical settings.
The projects include but are not limited to the following key types of problems:
Predicting outcome of disease in the absence of treatment, or under a standard of care. This is also known as prognosis.
Predicting treatment benefit. This analysis generally involves Randomized Controlled Trial results, and identifies patients most likely to benefit from a new treatment.
Predicting disease susceptibility in asymptomatic populations. This type of analysis is, ideally, designed to lead to a screening test.
Classifying disease (diagnosis). An example is identifying molecular subtypes, associated with different clinical outcomes, within an otherwise homogeneous disease. Another is identifying the primary site of a cancer in a case with ambiguous pathology.
Protein classification. For example, this type of prediction may pinpoint active sites in protein structures, thereby helping focus the search for targets during preclinical development.
Bladder Cancer Diagnosis and Recurrence
Clinical Persona developed an algorithm for detecting bladder cancer in patients with hematuria or a history of bladder cancer. The assay uses cell RNA expression in urine samples. It achieved a cross-validation AUROC of 0.85.
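For readers unfamiliar with the metric, the sketch below shows one common way a cross-validation AUROC can be estimated: each sample is scored by a model trained without it, and a single AUROC is computed from the pooled out-of-fold scores. This is an illustration only, not the algorithm used in the assay. The nearest-centroid scorer, the function names and the synthetic data are all assumptions made for the example.

```python
import random
from typing import List, Sequence


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(y: Sequence[int], k: int, seed: int = 0) -> List[List[int]]:
    """Split sample indices into k folds with roughly equal class balance."""
    rng = random.Random(seed)
    folds: List[List[int]] = [[] for _ in range(k)]
    for label in (0, 1):
        idx = [i for i, t in enumerate(y) if t == label]
        rng.shuffle(idx)
        for position, i in enumerate(idx):
            folds[position % k].append(i)
    return folds


def centroid(rows: Sequence[Sequence[float]]) -> List[float]:
    """Feature-wise mean of a non-empty set of samples."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def cross_validated_auroc(X: Sequence[Sequence[float]], y: Sequence[int],
                          k: int = 5, seed: int = 0) -> float:
    """Pooled out-of-fold AUROC for a nearest-centroid scorer (a stand-in model).

    Assumes each class has at least two samples so every training split
    contains both classes.
    """
    scores = [0.0] * len(y)
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train = [i for i in range(len(y)) if i not in held_out]
        pos_c = centroid([X[i] for i in train if y[i] == 1])
        neg_c = centroid([X[i] for i in train if y[i] == 0])
        for i in test_idx:
            d_pos = sum((a - b) ** 2 for a, b in zip(X[i], pos_c))
            d_neg = sum((a - b) ** 2 for a, b in zip(X[i], neg_c))
            scores[i] = d_neg - d_pos  # higher means closer to the positive centroid
    return auroc(y, scores)


# Synthetic data (hypothetical): 20 positives shifted by 1.0, 20 negatives at 0.0.
rng = random.Random(42)
y = [1] * 20 + [0] * 20
X = [[rng.gauss(1.0 if label else 0.0, 1.0) for _ in range(3)] for label in y]
cv_auc = cross_validated_auroc(X, y, k=5)
print(f"cross-validated AUROC: {cv_auc:.2f}")
```

Because every score comes from a model that never saw the sample it is scoring, the estimate is less optimistic than an AUROC computed on the training data itself.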
In 2014, Dr. Buturovic was the lead author of a study on predicting the locations of functional sites in unannotated protein structures using the molecules' physicochemical features. The methods were implemented at San Francisco State University and are now available as a WebFeature service hosted at Stanford University.
An analysis of 3318 Rheumatoid Arthritis patients from the Wellcome Trust Case Control Consortium revealed a potential Single Nucleotide Polymorphism (SNP)-based signature with an Area Under the Curve of 0.92. Though not yet adequate for screening use, the pilot project demonstrated the power of combining a large number of SNP measurements in an intelligent machine learning predictor.
The goal of this project was to predict response to a targeted cancer treatment in a population of Patient-Derived Xenograft mice. The inputs were Affymetrix GeneChip expression measurements of the implanted tumors, and the output was a binary indicator of drug response. The results suggested that at least 200 genomic markers were required to achieve a potentially clinically useful level of predictive performance, as indicated by the sensitivity and specificity of the best classifier.
How it works
For all projects, the client supplies:
Genomic and/or clinical data (independent variables). Taking cancer prognosis as an example, genomic measurements could be gene expression or SNP measurements from a suitable tissue sample. Clinical data could be variables such as age, sex, ECOG status, prior treatment, blood tests, chromosomal abnormalities, etc. For HIV, the genomic measurements could be the sequences of relevant viral proteins.
Outcome of interest. For example, this could be a binary variable (responder/non-responder; active protein site/inactive site) or continuous (RECIST criteria; overall survival; remission duration; HIV load; etc.).
Predictive performance criteria. This could be sensitivity and specificity of the classifier, hazard ratio for overall survival between predicted responders and non-responders, Area Under Curve, precision vs. recall, etc.
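Several of the performance criteria listed above can be computed directly from a classifier's outputs. The following is a minimal sketch in plain Python, assuming binary labels coded as 1 (for example, responder) and 0 (non-responder). The function names, the 0.5 decision threshold and the toy data are illustrative assumptions, not part of any Clinical Persona deliverable.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int],
                            y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels, where 1 is positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive receives a higher score than a randomly chosen negative
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (made-up numbers): three responders, three non-responders.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = auroc(y_true, scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUROC={auc:.3f}")
```

Sensitivity and specificity depend on the chosen threshold, while the AUROC summarizes performance across all thresholds, which is one reason a project may specify both.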
For research projects, Clinical Persona typically delivers a feasibility report. It consists of a document (PDF file) written as a scientific manuscript, and a PowerPoint presentation highlighting the main points. The key conclusions relate to the feasibility of developing a predictor that meets the client's requirements.
For clinical/production projects, the goal is delivery of a signature (predictive model) which could potentially be taken to product development, for example as a component of a companion diagnostic test. In these situations, Clinical Persona delivers software that takes as input the genomic and other predictor variables and produces a predictive measure (for example, estimated probability of disease recurrence, or estimated survival time) suitable for the end user. This type of project requires close collaboration between Clinical Persona and the client to maximize the likelihood of successful product development.
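To make the idea of delivered signature software concrete, here is a minimal sketch of what such an interface could look like: a function of the predictor variables that returns an estimated probability. It assumes a simple logistic model. The class name, predictor names and coefficients are hypothetical and chosen only for illustration; an actual signature may use a different model family entirely.

```python
import math
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Signature:
    """Hypothetical linear signature: an intercept plus one weight per predictor."""
    intercept: float
    weights: Dict[str, float]

    def predict_probability(self, predictors: Dict[str, float]) -> float:
        """Map predictor values to an estimated probability via the logistic function."""
        missing = set(self.weights) - set(predictors)
        if missing:
            raise ValueError(f"missing predictors: {sorted(missing)}")
        score = self.intercept + sum(
            weight * predictors[name] for name, weight in self.weights.items()
        )
        return 1.0 / (1.0 + math.exp(-score))


# Made-up coefficients and patient values, for illustration only.
signature = Signature(
    intercept=-1.0,
    weights={"gene_a": 0.8, "gene_b": -0.5, "age": 0.02},
)
patient = {"gene_a": 2.0, "gene_b": 1.0, "age": 50.0}
probability = signature.predict_probability(patient)
print(f"estimated probability of recurrence: {probability:.2f}")
```

Rejecting inputs with missing predictors, rather than silently substituting a default, is a deliberate choice in this sketch: in a diagnostic setting an incomplete sample should be flagged, not scored.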
Following are examples of machine learning-based predictive modeling projects we have conducted.
Disease classification (diagnosis)
The founders developed predictive algorithms for two FDA-cleared Tissue of Origin (TOO) Tests using over a thousand microarray RNA expression measurements. The FFPE variant of the TOO test is in clinical use, offered by Cancer Genetics Inc.
 Monzon FA, Lyons-Weiler M, Buturovic LJ et al. (2009) Multicenter validation of a 1,550-gene expression profile for identification of tumor tissue of origin. Journal of Clinical Oncology 27 (15): pp. 2503-2508.
Analysis of survival data from a Randomized Clinical Trial revealed a genomically defined subpopulation of patients who experienced three-month Progression Free Survival (PFS), with a hazard ratio of 0.49. In contrast, unselected patients experienced one-month PFS. Though not clinically adequate, the results were nevertheless intriguing. The winning model used 100 gene expression markers.
NEJM SPRINT Challenge: Precision Medicine for Hypertension
The SPRINT Data Analysis Challenge sought novel, clinically useful findings from the SPRINT clinical trial, which compared intensive and standard treatments for hypertension. We developed SafeSPRINT, machine learning software that suggests whether a hypertension patient should receive the intensive or the standard treatment.