Informative Bayesian Learning and Data-Acquiring Through Limited Knowledge

Our research in this theme focuses on enabling modeling, learning, design, and decision-making in limited-knowledge domains such as medicine and cyber-physical systems, where learning must proceed under constraints such as few expert demonstrations, limited or missing knowledge of the system dynamics, and substantial sources of uncertainty. This includes developing highly scalable, efficient, and reliable tools for inverse learning, data acquisition, and multi-task/multi-domain learning in these domains.

Statistical Signal Processing

Many practical systems consist of several components that interact with each other dynamically and are often monitored through noisy data acquired by multiple sensors. Examples include systems in metagenomics, healthcare, smart grids, smart cities, social networks, and manufacturing. The limited scalability of existing dynamical models, such as hidden Markov models and partially observable Markov decision processes, hinders proper modeling and analysis of these large-scale systems. Our research aims to exploit the structure of these systems to develop highly scalable and efficient signal models and related tools for a wide range of practical problems.

Data-Driven and Model-Based Experimental Design

Design and decision-making are pervasive in scientific and industrial endeavors: scientists aim to gain insights into physical and social phenomena, engineers design machines that execute tasks more efficiently, and biologists seek to discover new drugs to fight diseases. Design in most practical domains is fraught with choices that are often expensive, complex, and high-dimensional, with interactions and uncertainties that make them difficult for individuals to reason about. Our research aims to provide a new perspective on experimental design and to develop model-based and data-driven experimental design techniques capable of scalable, efficient, and reliable design and decision-making.

Genomics and Metagenomics

Microbial communities and their hosts play a key role in many domains, including protecting humans and plants against diseases and developing the next generation of biofuels and biological remediation systems needed for sustainable growth. Gaining a deep understanding of the fundamental biology of these systems is the key to harnessing their potential. Advances in high-throughput multi-omic techniques such as metagenomics, metatranscriptomics, exometabolomics, and proteomics allow us to capture multiple snapshots of these complex biological processes at once. These snapshots create large-scale, high-dimensional datasets of omics features (e.g., microbial species, microbial genes, proteins, and small molecules). Our research aims to develop methods and related tools that characterize the temporal component of these datasets and capture the dynamical behavior of microbial communities across various omics data.

Funded Projects

Army Research Office — $560K
“Scalable and Reliable Optimization of Expensive Multi-Modal Functions: A Bayesian Perspective” (Single-PI, Date: 7/2021 – 6/2025)

National Science Foundation (NSF) — $175K
“Informative Bayesian Learning and Data Gathering Through Expert-Acquired Data” (Single-PI, Date: 7/2020 – 6/2022)

The George Washington University, University Facilitating Fund (UFF) — $20K
“Machine Learning for Scalable, Reliable and Online Design and Decision Making” (Single-PI, Date: 7/2020 – 6/2021)