Predicting Patients' Functional Status from Clinical Notes and Diagnoses

(Wojtusiak, Giang, Alemi, Oz)

Assessing the functional status of residents in nursing homes and medical foster homes is a time-consuming and costly process. It requires assessment by a registered nurse specifically trained to perform it, in consultation with other members of the interdisciplinary team. The status is usually assessed using a standardized form called the Minimum Data Set (MDS). The MDS resident assessment is conducted at admission, readmission, and discharge, quarterly, and whenever there is a significant change in condition. The MDS has nearly 400 data elements, covering cognitive function, physical functioning, continence, preferences for routine and activity, psychosocial well-being, mood state, disease diagnoses, health conditions, and nutritional status. This project concerns predicting patients' functional status as measured by the Barthel Index, which consists of 10 data elements and can be considered a simplified version of the MDS.

The investigated approach applies a set of machine learning methods to each patient's history, which is represented by a set of diagnoses and by clinical notes. The predictions obtained from these two data sources are then integrated to provide the final predictions. Patients' past diagnoses are provided in a time-stamped structured database. Clinical notes are retrieved from the 6 months prior to the assessment of functional status.
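The note retrieval step can be illustrated with a minimal sketch. It assumes notes are available as (date, text) pairs and approximates "6 months" as 183 days; the function name `notes_in_window`, the data layout, and the choice to include notes written on the assessment date are all hypothetical and not taken from the project.

```python
from datetime import date, timedelta


def notes_in_window(notes, assessment_date, window_days=183):
    """Return texts of notes dated within `window_days` before the assessment.

    Assumptions (not from the project): notes are (date, text) pairs,
    6 months is approximated as 183 days, and notes written on the
    assessment date itself are included.
    """
    start = assessment_date - timedelta(days=window_days)
    return [text for note_date, text in notes
            if start <= note_date <= assessment_date]


# Illustrative, made-up notes for one patient.
notes = [
    (date(2014, 11, 1), "older note, outside the window"),
    (date(2015, 3, 15), "needs assistance with transfers"),
    (date(2015, 6, 30), "independent in feeding"),
    (date(2015, 7, 2), "written after the assessment"),
]
selected = notes_in_window(notes, assessment_date=date(2015, 6, 30))
```

Here `selected` keeps only the two notes dated between the start of the window and the assessment date.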

The specific methods used in the project include:

- AQ21 rule learning for analyzing structured data

- Guided Bayesian approach for analyzing clinical notes

- Random forest-based selection of relevant diagnoses and parts of notes

- Mapping of structured data and notes onto concepts within the Unified Medical Language System (UMLS)


The general architecture of the system is presented in the figure below. A set of independent classifiers predicts scores for all elements of the Barthel Index. These scores are then aggregated to obtain the Barthel Score. In parallel, a regression model is applied to predict the Barthel Score directly. Finally, the Guided Bayesian approach is applied to clinical notes to obtain the score independently of the structured data. All scores are averaged to obtain the resulting Barthel Score for a given patient.
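The final combination step can be sketched as follows. This is only an illustration under stated assumptions: per-element predictions are aggregated by a plain sum, the three resulting scores are combined by an unweighted mean, and the element names, function names, and numeric values are hypothetical rather than taken from the project.

```python
def aggregate_element_scores(element_scores):
    """Aggregate per-element predictions into one score.

    Assumption: aggregation is a plain sum of the element scores.
    """
    return sum(element_scores.values())


def combine_barthel_predictions(element_scores, regression_score, notes_score):
    """Average the three independently obtained Barthel Score estimates."""
    estimates = [
        aggregate_element_scores(element_scores),  # from per-element classifiers
        regression_score,                          # from the direct regression model
        notes_score,                               # from the notes-based model
    ]
    return sum(estimates) / len(estimates)


# Hypothetical predictions for one patient (illustrative values only).
element_scores = {
    "feeding": 10, "bathing": 5, "grooming": 5, "dressing": 10, "bowels": 10,
    "bladder": 10, "toilet_use": 10, "transfers": 15, "mobility": 15, "stairs": 10,
}
final_score = combine_barthel_predictions(
    element_scores, regression_score=95.0, notes_score=90.0
)
```

An unweighted mean treats the three estimates as equally reliable; a weighted average would be a natural variation if one source proved more accurate, but the description above does not mention weights.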



The project is carried out in collaboration with, and funded in part by, the Department of Veterans Affairs.


For references, see the publications section.

MLI Copyright © 2015 Machine Learning and Inference Laboratory
College of Health and Human Services, George Mason University
4400 University Dr, MSN 1J3, Fairfax, VA 22030, U.S.A