Representation learning for health with multi-modal clinical data

Seminar
Thursday, February 09, 2017
5:00 AM to 6:00 AM
POB 2.402
Free and open to the public

The explosion of clinical data provides an exciting new opportunity to use machine learning to discover new and impactful clinical information. Among the problems that can be addressed are establishing the value of treatments and interventions in heterogeneous patient populations, stratifying risk for clinical endpoints, and investigating the benefit of specific practices or behaviors. However, there are many challenges to overcome. First, clinical data are noisy, sparse, and irregularly sampled. Second, many clinical endpoints (e.g., the time of disease onset) are ambiguous, resulting in ill-defined prediction targets.

In this talk, I will discuss the need for practical, evidence-based medicine, and the challenges of creating multi-modal representations for prediction targets that vary both spatially and temporally. I will present recent work that addresses learning good representations across clinical applications that deal with missing and noisy data. I will discuss work using the electronic medical records of over 30,000 intensive care patients from the MIMIC-III dataset to predict both mortality and clinical interventions, as well as work from a non-clinical setting that uses non-invasive wearable data to detect harmful voice patterns and the presence of pathological physiology. To our knowledge, classification results on these tasks are better than those of previous work. Moreover, the learned representations hold intuitive meaning, both as topics inferred from narrative notes and as latent autoregressive states over vital signs. The learned representations capture higher-level structure and dependencies between multi-modal time series data and multiple time-varying targets.

Speaker

Marzyeh Ghassemi

MIT’s Computer Science and Artificial Intelligence Lab

Marzyeh Ghassemi is a PhD student in the Clinical Decision Making Group (MEDG) at MIT’s Computer Science and Artificial Intelligence Lab (CSAIL), supervised by Dr. Peter Szolovits. Her research focuses on machine learning with clinical data to predict and stratify relevant human risks, encompassing unsupervised learning, supervised learning, and structured prediction. Marzyeh’s work has been applied to estimating the physiological state of patients during critical illnesses, modelling the need for a clinical intervention, and diagnosing phonotraumatic voice disorders from wearable sensor data.