Research

General areas of my research include development and analysis of machine learning and artificial intelligence methods, as well as application of intelligent and computational methods in health.

I work on several research projects related to these areas. Many of the projects intersect, as some are more focused on (M)ethods and others on (A)pplications. The list below consists of selected recent and current projects. There is no specific order to the listing, although older projects tend to be later in the list.

(M) Machine Learning that Makes Sense in Health follows the early concepts of natural induction, a machine learning paradigm in which results of learning should be natural to people. This means that the machine learning process, the created models, and the results of individual predictions need to be transparent and grounded in existing domain knowledge. The work also includes investigation of criteria for reporting results of ML modeling to fully explain what models do, as well as model comparison. It follows a simple framework in which ML methods can be described as inputs, algorithms, and outputs.

(MA) Integrated platform for bruise detection and analysis is a large interdisciplinary project that aims at helping victims of violence by developing technology that can equitably and fairly detect and assess bruises. The work in collaboration with Dr. Scafide (nursing) and Dr. Lattanzi (engineering) combines deep learning technology to analyze images, with data fusion methods and mobile technologies within an advanced platform.

(A) Social distancing and Symptom Modeling aims at analyzing movement data from GPS trackers along with daily symptom reports in order to understand how people move during pandemics.

(MA) Patient Functional Improvement Decline prediction is important for clinicians, caregivers, health administrators, and, of course, patients. The project uses novel approaches to temporal machine learning to construct and predict trajectories of patients’ functional decline. One example of a constructed tool is the Computational Barthel Index, https://hi.gmu.edu/cbit

(MA) Technology-based contact prediction. Inspired by the extraordinary need to track people during the COVID-19 pandemic, this work focuses on the use of enterprise Wi-Fi network data. The data are used to reconstruct the locations and movements of individuals and to backtrack their locations when contact tracing is needed. Both methodological and practical aspects of the work are investigated. The project also integrates other sources of data such as EHRs, course registrations, known office locations, access card entries, etc.
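
As a rough illustration of the idea, and not the project's actual method, the sketch below flags pairs of people whose Wi-Fi sessions on the same access point overlap in time. The record layout (person, access point, start and end in minutes since midnight), the function name `find_contacts`, and the 15-minute threshold are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations

def find_contacts(sessions, min_overlap=15):
    """Return {(person_a, person_b, access_point): overlap_minutes}.

    `sessions` is a list of (person, access_point, start, end) tuples,
    a hypothetical layout chosen for this sketch.
    """
    # Group sessions by access point so only co-located sessions are compared.
    by_ap = defaultdict(list)
    for person, ap, start, end in sessions:
        by_ap[ap].append((person, start, end))

    overlaps = defaultdict(int)
    for ap, visits in by_ap.items():
        for (p1, s1, e1), (p2, s2, e2) in combinations(visits, 2):
            if p1 == p2:
                continue  # two sessions of the same person are not a contact
            shared = min(e1, e2) - max(s1, s2)
            if shared > 0:
                a, b = sorted((p1, p2))
                overlaps[(a, b, ap)] += shared

    # Keep only pairs that shared an access point long enough.
    return {key: minutes for key, minutes in overlaps.items()
            if minutes >= min_overlap}

# Made-up sessions for illustration only.
example_sessions = [
    ("alice", "AP-1", 540, 600),
    ("bob", "AP-1", 570, 630),
    ("carol", "AP-1", 595, 640),
    ("alice", "AP-2", 610, 660),
    ("carol", "AP-2", 650, 700),
]
contacts = find_contacts(example_sessions)
```

Sharing an access point is only a coarse proxy for physical proximity, which is one reason additional data sources such as those listed above would matter in practice.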

Some older projects, many of which are still active:

(M) Generation of Synthetic Patient Data uses machine learning methods to learn patterns from data and uses these patterns to generate new realistic synthetic data. Learn more about intelligent Patient Data Generator (iPDG). The work has been extended by Dr. Mojtaba Zare in his PhD dissertation.

(M) Rule learning includes development of machine learning algorithms for deriving accurate and transparent attributional rules from data and background knowledge. The novelty of the methods is in their “understanding” of concepts which are linked to domain ontologies. In principle, the method knows how attributes and their values are related. For example, in the medical domain it can derive from UMLS (a large medical ontology) the knowledge that an ICD code corresponds to a diagnosis or a procedure, and how it relates to an HCPCS code representing a treatment within the data. Several other forms of incorporating semantic information into the machine learning process are also investigated. By doing so, the method can arrive at results that are potentially more accurate and more natural to domain experts, and thus have a higher chance of being accepted.

(M) Learning from aggregated data is a novel approach to machine learning in which individual data points are replaced with aggregated summaries, which are provided as input for learning. This type of data can be present when learning from published medical results, in which only statistical summaries of cohorts of patients are available, and in distributed analysis of massive datasets, in which a single node cannot process all the data and the aggregated summaries/models can be shared among nodes. The methods are investigated for learning rules and other representations. Some investigated applications include prediction of liver complications in patients with metabolic syndrome.
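
One small ingredient of the distributed setting can be sketched as follows: each node shares only a count, mean, and variance, and the summaries are pooled into the statistics of the combined data without exchanging raw records. This is an illustrative sketch under simple assumptions, not the rule-learning method itself; the names `summarize` and `pool` and the numbers are hypothetical.

```python
from statistics import fmean, pvariance

def summarize(values):
    """Reduce raw values to the summary a node would share: (n, mean, variance)."""
    return (len(values), fmean(values), pvariance(values))

def pool(summaries):
    """Combine per-node (n, mean, population variance) summaries exactly."""
    total = sum(n for n, _, _ in summaries)
    mean = sum(n * m for n, m, _ in summaries) / total
    # Within-node variance plus the spread of node means around the pooled mean.
    var = sum(n * (v + (m - mean) ** 2) for n, m, v in summaries) / total
    return (total, mean, var)

# Made-up measurements held by two nodes; only the summaries are shared.
node_a = [1.0, 2.0, 4.0]
node_b = [3.0, 5.0, 7.0, 9.0]
pooled = pool([summarize(node_a), summarize(node_b)])
```

The pooled result matches what `summarize` would return on the concatenated data, which is what makes summaries a usable input when the raw records cannot be moved.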

(MA) Opioid Abuse Trajectory Prediction uses temporal clustering and prediction methods for early detection of patients at risk for high opioid use. Claims and prescription data undergo spatiotemporal clustering to detect common use trajectories. Supervised learning is then used to predict which trajectories a person is likely to belong to.

(MA) Medical claim payments prediction is an important practical problem in revenue cycle management of hospitals and private practices. The work, in collaboration with Jay Shiver (GMU) and Ron Ewald (Inova), aims at discovering patterns that describe claims for provided services which are partially or entirely denied. The project also includes methodological work on unsupervised labeling of data for supervised learning, and on combining classification and regression learning.

(A) Treatment options selection and classification for prostate cancer patients is part of a larger project whose goal is to compare selection of treatment options among prostate cancer patients. In this work, machine learning methods are applied to predict mortality, as well as to create homogeneous groups of patients for whom disparities in treatment selection can be investigated.

(M) Learnable evolution model is a novel non-Darwinian evolutionary method that uses machine learning to guide the evolutionary optimization process. Instead of randomly searching the space of possible solutions, LEM hypothesizes why some candidate solutions perform better than others and uses these hypotheses to create new solutions likely to perform better.
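
The loop described above can be illustrated with a minimal sketch. It is not the LEM implementation: the "hypothesis" here is just a per-dimension interval covering the better-performing candidates, a deliberately crude stand-in for a real learner, and the objective `sphere`, the function names, and all parameter values are assumptions chosen for the example.

```python
import random

def sphere(x):
    """Toy objective to minimize: sum of squares, with its optimum at the origin."""
    return sum(v * v for v in x)

def learn_box(high_group):
    """Hypothesis about good candidates: the interval they span in each dimension."""
    dims = len(high_group[0])
    return [(min(c[d] for c in high_group), max(c[d] for c in high_group))
            for d in range(dims)]

def lem_minimize(objective, dims=3, pop_size=40, generations=30,
                 top_frac=0.3, seed=0):
    rng = random.Random(seed)
    population = [[rng.uniform(-5.0, 5.0) for _ in range(dims)]
                  for _ in range(pop_size)]
    history = []  # best objective value seen at the start of each generation
    for _ in range(generations):
        population.sort(key=objective)
        history.append(objective(population[0]))
        # Split off the high-performing group and learn what describes it.
        n_high = max(2, int(top_frac * pop_size))
        high_group = population[:n_high]
        box = learn_box(high_group)
        # Instantiate new candidates that satisfy the learned hypothesis.
        children = [[rng.uniform(lo, hi) for lo, hi in box]
                    for _ in range(pop_size - n_high)]
        population = high_group + children
    population.sort(key=objective)
    history.append(objective(population[0]))
    return population[0], history

best, history = lem_minimize(sphere)
```

Because the high-performing group is carried over unchanged, the best value in `history` never gets worse from one generation to the next; a more expressive learner would replace `learn_box` without changing the surrounding loop.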

(MA) Non-Darwinian evolutionary optimization for engineering design investigates theoretical and practical aspects of applying the learnable evolution model to solve hard optimization problems in engineering. In a project supported by the National Institute of Standards and Technology, LEM is applied to optimize heat exchangers (ISHED system).

(MA) Autonomous learning and optimization in intelligent transportation logistics is a project at the University of Bremen, Germany, whose goal is to enable learning capabilities in distributed logistic systems. Although focused on transportation logistics, the project resulted in more general algorithms that allow for learning in autonomous distributed environments.

(M) Inferential theory of learning is a theoretical framework created by R.S. Michalski that describes the learning process as a set of operations (called transmutations). The theory has recently been extended to describe operations performed within the learnable evolution model.

(A) Prediction of possible claims resulting from reported medical errors investigates the possibility of using machine learning methods to predict which reported medical errors/near misses result in claims or lawsuits. It is part of a larger project led by Lorens Helmchen (GMU) in collaboration with Inova Health System whose goal is to improve reporting, and ultimately reduce error rates and improve patient care.

Most of the projects involve my current and past PhD and MS students: Dr. Che Ngufor, Talha Oz, Kat Irvin, Bo Yu, Chris Jose, Hedyeh Mobahi, Reynaneh Morgharab Nia, Eman Elashkar, Dr. Mojtaba Zare, Dr. Negin Asadzadehzanjani, Naren Durbha, Fatemah Aloudah, Wid Yamani, Wejdan Bagais and others. To learn about research interests of the Machine Learning and Inference Laboratory, please click here. The website includes more detailed descriptions of several of the mentioned projects. You can find additional information in my publications, or general MLI publications. If you have further questions, feel free to contact me.

I am grateful for the funding support that made this research possible. Among the supporting agencies are: National Institute of Standards and Technology, Department of Veterans Affairs, Mason-Inova fund, GMU provost office, National Science Foundation, Robert Wood Johnson Foundation, Healthcare Risk Management and Patient Safety, Cochrane Collaboration group at GMU, German Science Foundation, Alzheimer’s & Related Diseases Research Award Fund, National Institutes of Health, National Institute of Justice and others.