This page is updated annually. Some projects may already be taken, and new projects may be available. The projects below give an indication of the types of projects available in each lab, but please browse faculty web pages and contact professors directly to discuss current opportunities.
View Rotation Projects by Faculty: BISB or BMI
Also see the Division of Biomedical Informatics projects page.
Labs with BMI Rotation Projects
Tiffany Amariuta | Halıcıoğlu Data Science Institute
Ferhat Ay | La Jolla Institute for Immunology
Vineet Bafna | Computer Science and Engineering
Vikas Bansal | Pediatrics
Matteo D'Antonio | Biomedical Informatics
Kelly Frazer | Pediatrics
Lilia Iakoucheva | Psychiatry
Rob Knight | Pediatrics
Jejo Koola | Biomedical Informatics
Tsung-Ting Kuo | Biomedical Informatics
Amit Majithia | School of Medicine
Amy Sitapati | Biomedical Informatics
Benjamin Smarr | Bioengineering
Yingxiao (Peter) Wang | Bioengineering
Dominik Wodarz | Biological Sciences
Rose Yu | Computer Science and Engineering
Tiffany Amariuta | Halıcıoğlu Data Science Institute
We are a statistical genetics lab focusing on developing methods to study complex traits and polygenic diseases across global populations, with a specific focus on minority groups that have been underrepresented in the fields of genetics and genomics. We are interested in developing novel multi-ancestry statistical methods for fine-mapping disease genes and their cell types of action. The goal of this research is to identify targets for gene-based therapeutics.
-
Mapping the genetic architecture of polygenic disease, complex traits, and gene expression levels
The Amariuta Lab is happily accepting PhD rotation students. We have a variety of predefined projects but are also open to student-led projects and ideas that fall within the general scope of research in our lab. One potential rotation project involves the investigation of different ways to estimate gene expression heritability (i.e., the proportion of gene expression variance that can be explained by genetics), including using cutting-edge fine-mapping algorithms and single-cell RNA-sequencing data. A second potential project involves developing a new method to map the cell-type-specificity of the genetic component of gene expression regulation, which is often confounded by strong patterns of co-expression and co-regulation across genes and other cell types. A third potential project involves mapping the causal genes underlying putative causal tissues and cell types mapped via our previously published method called Tissue Co-regulation SCore regression (TCSC). A fourth potential project involves identifying putative causal tissues and cell types underlying the genetic correlation and hence pleiotropy of immune-mediated diseases. Lastly, we are also beginning to explore deep learning models that use sequence data to predict gene expression levels in novel ways, and we would welcome students who wish to gain skills in any of these areas.
Ferhat Ay | La Jolla Institute for Immunology
We are interested in the analysis and modeling of the three-dimensional chromatin structure from high-throughput sequencing experiments. We develop methods grounded in statistics, machine learning, optimization, and graph theory to understand how changes in the 3D genome affect cellular outcomes such as development, differentiation, and gene expression. We have ongoing interests in the systems-level analysis and reconstruction of regulatory networks, inference of enhancer-promoter contacts, predictive models of gene expression, and integration of three-dimensional chromatin structure with one-dimensional epigenetic measurements in the context of cancer, malaria, asthma, and several autoimmune diseases.
-
Integrative analysis of multi cell-type gene expression and epigenomic data in tumor immune response
This project will focus on developing regulatory network inference methods for the joint analysis of gene expression and histone modification data from several different types of tumor-infiltrating lymphocytes, which are gathered from a cohort of patients with solid tumors.
-
Predictive and comparative modeling of epigenetic gene regulation in different human immune cell types
The goal of this project is to model the natural variation in gene expression across many immune cell types using an already established database at LJI (https://dice-database.org) and to identify cell type-specific epigenetic regulators of important immune genes.
-
Statistical methods for inferring functional DNA-DNA contacts from Hi-C and HiChIP/PLAC-seq data
This project focuses on developing computational tools for better analysis of the wealth of data from chromosome conformation capture assays with the ultimate goal of inferring functional chromatin contacts such as those between enhancers and promoters.
Vineet Bafna | Computer Science and Engineering
Our lab is focused on the design and implementation of algorithms for biological data interpretation. Within this broad framework, we have a number of open projects relating to problems in proteomics (interpretation of mass spectrometry data), genetics, and genomics. The projects listed below are a small sampling of available projects. Interested students should have taken a class in algorithm design and have some facility with machine learning approaches.
-
Extrachromosomal DNA analysis
Extrachromosomal DNA (ecDNA) formation is an important pathological condition found in nearly a third of cancers and all cancer subtypes. Our lab is developing computational tools to characterize the structural and functional properties of ecDNA and related focal amplifications.
Interested students should want to learn about, design, and implement graph algorithms, and should commit to taking my winter class, CSE280A.
Vikas Bansal | Pediatrics
Research in our lab is focused on developing computational methods and tools for variant calling in human genomes and using these tools for disease association studies. We focus on challenging variant types such as haplotypes and variants in repetitive regions and work with both short-read (Illumina) and long-read sequencing technologies.
-
Duplicated genes and association with disease
Hundreds of genes in the human genome are duplicated, and many are known to be associated with human diseases. However, the short read lengths of current sequencing technologies make the analysis of such genes difficult. We have developed novel tools to genotype the copy number of duplicated genes using whole-genome sequencing. The goal of this project is to analyze large-scale sequencing datasets (using cloud computing platforms) for Mendelian and complex human diseases to identify novel disease associations.
-
Haplotype-based variant calling using long-read sequencing
Long-read sequencing technologies have the potential to overcome some of the key limitations of short-read sequencing, particularly in long repetitive regions of the human genome, but require the development of new algorithms. We have previously developed computational methods for variant calling (Longshot, Nature Communications 2019) and read mapping in segmental duplications (Duplomap, Nucleic Acids Research 2020) using long-read sequencing technologies. The goal of this project is to implement a haplotype-based model for variant calling using long reads that automatically identifies genomic regions that can be called with high confidence.
Joseph Califano | Surgery
-
Chromatin dysregulation and DNA methylation at transcription start sites associated with transcriptional repression in cancers
-
Genetic and epigenetic analysis of HPV-positive and HPV-negative Oropharyngeal Squamous Cell Carcinoma
The Lab of Dr. Joseph Califano, under the sponsorship of the Japan Society for the Promotion of Science (JSPS), will conduct collaborative research on a new strategy for the treatment of HPV-associated oropharyngeal cancer based on comprehensive epigenetic analysis. This year, we are proud to congratulate Postdoctoral Researcher Dr. Takuya Nakagawa for his work entitled “Genetic and epigenetic analysis of HPV-positive and HPV-negative Oropharyngeal Squamous Cell Carcinoma.” Dr. Nakagawa graduated from Chiba University School of Medicine in Japan, where he also received the Medical Pharmacy Director’s Award.
Christine Cheng | Psychiatry
Dr. Cheng’s research focuses on transcriptional regulatory networks and aims to develop a comprehensive understanding of how aberrant regulatory circuits contribute to human disease. Dr. Cheng’s laboratory is particularly interested in understanding transcriptional and epigenetic regulation of the interplay between the immune system and central nervous system in neurodegenerative diseases, substance use disorder, and HIV infection. Current projects focus on applying single-cell transcriptomics and epigenetics assays to characterize Alzheimer’s disease, HIV, and opioid use disorder patient samples, with the goal of finding diagnostic markers and therapeutic targets. Dr. Cheng’s lab has also developed 3D brain organoid models for Alzheimer’s disease and HIV infection. Dr. Cheng received her M.S. degree in Computer Science from Stanford University and her Ph.D. degree in Bioinformatics and Systems Biology from the University of California, San Diego. After completing her doctoral study, Dr. Cheng did her postdoctoral training at the Broad Institute of MIT and Harvard.
-
3D brain organoid model of Alzheimer’s disease revealed by single cell transcriptomics
We developed a novel tau propagation model using a 3D spheroid model that rapidly develops tau pathology and neurodegeneration in just three weeks. Single-cell transcriptomics of the model reveals cell-type-specific changes that resemble transcriptomic signatures from Alzheimer’s disease postmortem brain.
-
Single cell transcriptomics and epigenetics of human Alzheimer’s disease brain
To understand cell-type-specific vulnerability in Alzheimer’s disease, we use snRNA-seq to characterize human brain tissues from Alzheimer’s disease patients across different brain regions.
-
Single cell transcriptomics and epigenetics of the opioid use disorder and HIV syndemic in the human brain
As part of the NIH NIDA SCORCH consortium, we will dissect the dysregulated molecular circuitry in the brains of individuals with opioid use disorder and/or HIV infection. This project aims to identify genes that contribute to opioid use disorder and HIV-associated neurocognitive disorders. These approaches could lead to novel gene therapies to control and perhaps reverse the relentless disease state. We are in the process of generating snRNA-seq and snATAC-seq profiles from more than 300 patient samples across 3 different brain regions.
-
Single Cell Transcriptomics of the Cocaine Use Disorder in the Context of HIV
As part of the NIH NIDA SCORCH consortium, we will dissect the dysregulated molecular circuitry in the brains of individuals with cocaine use disorder and/or HIV infection. This project will focus on understanding how neurovasculature and neuroimmune cells contribute to cocaine use disorder and HIV-associated neurocognitive disorders. We will be generating snRNA-seq and snATAC-seq profiles from more than 300 patient samples across 3 different brain regions.
Matteo D'Antonio | Biomedical Informatics
Our goal is to understand the associations between genetic variation and human disease. As part of the Center for Admixture Science and Technology (CAST), we work with large genetic datasets, such as the UK Biobank, GTEx, All of Us (AoU), and the Million Veteran Program (MVP), to characterize the associations between genetic variation and disease in global and local ancestry-aware settings.
-
Ancestry-aware genome-wide association studies
For the past 500 years, the American continent has been the site of ongoing mixing of Europeans, Native Americans, Africans and Asians, resulting in a significant percentage of Americans carrying ancestry from outside their self-identified race. However, genome-wide association studies (GWAS) and polygenic risk score (PRS) calculations have been performed and optimized on individuals of European descent, with minor exceptions. As part of CAST (Center for Admixture Science and Technology), we are developing methods to perform GWAS and calculate PRS on diverse and admixed populations, using large diverse cohorts such as All of Us and the Million Veteran Program.
Kelly Frazer | Pediatrics
Welcome to the Frazer Lab! We are using two complementary approaches to achieve our goal of identifying and characterizing functional human genetic variants. Our first approach utilizes iPSCORE, a resource that was generated to enable both familial and association-based genetic studies of molecular and physiological phenotypes in induced pluripotent stem cells (iPSCs) and derived cell types. Our second approach involves conducting association studies in well-characterized cohorts with the goal of identifying variants that play roles in human disease and assessing their contributions to disease pathogenesis, progression, and prognosis.
-
Investigate fetal-specific cardiac regulatory variants and their overlap with cardiac GWAS lead variants
We have derived iPSC-derived cardiovascular progenitor cells (iPSC-CVPCs) from 180 individuals and showed that their transcriptomes are more similar to fetal heart than to adult cardiac tissues. Our goal is to leverage these data in combination with WGS to perform eQTL analyses. We plan to assess whether fetal-specific eQTLs are associated with complex adult cardiac traits by colocalizing eQTLs with summary statistics from GWAS of cardiac traits. Our preliminary analyses show that eQTLs in iPSC-CVPCs identify cardiac disease GWAS variants that are active in the fetal but not adult heart, indicating that they play a role in development. Our findings provide genetic evidence supporting the fetal origins of cardiovascular disease hypothesis and highlight the importance of investigating genetic associations across stages of development (i.e., fetal and adult tissues) to fully understand the genetic underpinnings of complex traits and disease. We are looking for rotation students to conduct QTL analyses using large ATAC-seq and H3K27ac ChIP-seq datasets generated from the iPSC-CVPCs.
Lilia Iakoucheva | Psychiatry
The lab has a variety of bioinformatics projects aimed at improving understanding of the functional impact of autism risk mutations derived from exome and whole genome sequencing of patients. We created mouse models carrying some of these mutations using CRISPR/Cas9, and also produced patient-derived cerebral organoids with autism risk mutations. We performed bulk RNA-seq from various brain regions or time periods in these models. Gene-level analyses of the RNA-seq data have been completed (manuscripts in preparation). We are now pursuing isoform-level analyses of these data to better understand the functional impact of autism risk mutations on the splicing isoform transcriptome.
-
Isoform transcriptome of Cul3-HET mouse model
The project deals with constructing isoform-level co-expression and protein interaction networks for predicting the functional impact of mutations in the high-risk autism gene Cul3. We have collected RNA-seq and TMT-proteomics data from various brain regions of the Cul3+/- transgenic mouse. We aim to integrate isoform-level RNA-seq data with quantitative proteomic (peptide-level) data from the same samples to understand the impact of Cul3 mutation.
-
Isoform transcriptome of patient-derived cerebral organoids from 16p11.2 CNV carriers with autism
Copy number variants (CNVs) represent significant risk factors for Autism Spectrum Disorders (ASD). One of the most frequent CNVs involved in ASD is a deletion or duplication of the 16p11.2 CNV locus, spanning 29 protein-coding genes. Despite the progress in linking 16p11.2 genetic changes with phenotypic abnormalities (macrocephaly and microcephaly) in patients and model organisms, the specific molecular pathways impacted by this CNV remain unknown. We generated bulk RNA-seq and TMT proteomic data from patient-derived cerebral organoids (3 deletion, 3 duplication and 3 control patients). The goal of the project is to analyze isoform-level RNA-seq data, as well as proteomics data, to investigate the functional impact of the 16p11.2 CNV.
Rob Knight | Pediatrics
The Knight lab has broad interests in the human microbiome, the collection of trillions of microbes that inhabits our bodies. We are especially interested in developing techniques to read out these complex microbial communities and in using the resulting data to understand human health and the links between humans and the environment, and to prevent and cure disease. We offer a fast-paced environment with many collaborative opportunities on different projects.
-
Machine Learning for the Microbiome
We have amassed a database of microbial DNA sequences from hundreds of thousands of biological specimens. Understanding how differences among these specimens relate to disease requires a range of machine learning and multivariate statistical approaches. There are many opportunities ranging from entry-level (benchmarking classifier performance on specific sample sets) to extremely challenging (using deep learning to infer the structure of global sample set relationships).
-
Multi-omics integration
An increasing need is to integrate data from different "omics" levels, e.g. genomes, metagenomes, metatranscriptomes, metaproteomes, metabolomes, immunological profiling, etc., into a single coherent picture separating healthy and disease states. Improved methods for performing this task, either directly or via intermediate representations such as mapping to metabolic and regulatory pathways, are essential for improving understanding. Projects in this category range from simple (testing where existing techniques like correlation networks or Procrustes analysis do/don't connect two specific data layers) to challenging (using transfer learning to integrate heterogeneous data layers and improve the underlying network annotation). An especially exciting emerging research direction here is XAI (explainable artificial intelligence), which can provide a better justification for a specific classification or suggestion in clinical applications.
-
Optimizing microbiome algorithms
Many algorithms used in microbiome studies, especially in metagenomic assembly, are extremely computationally expensive. Opportunities exist for either exploiting new hardware architectures to accelerate existing algorithms or developing new approximate algorithms to tackle problems in the workflow, including inferring taxonomy and function from DNA sequence data, genome and metagenome assembly and annotation, computing community distance metrics from sparse compositional data, and high-level analyses of hundreds of thousands of microbiomes. Again, these projects range from entry level (comparing results of two multiple sequence alignment techniques for subsequent community analysis) to advanced (using non-von Neumann architectures to perform pattern classification in real time at the whole community level for disease detection).
Jejo Koola | Biomedical Informatics
Dr. Koola is a physician scientist specializing in Biomedical Informatics and Hospital Medicine. He specializes in the area of big data machine learning for predictive analytics. In particular, he is interested in using electronic health records to improve care delivery, particularly for patients with advanced liver disease. Using risk prediction models in a healthcare context requires understanding of: (i) the healthcare system of intended use; (ii) risk model building; (iii) risk model assessment; and (iv) risk model re-calibration. Additionally, Dr. Koola is interested in visual analytics, data modeling, and health services research.
-
Designing the "Green Button" informatics consult service using big data analytics for personalized medicine
In 2012 the Institute of Medicine released a set of desiderata for a learning healthcare system, where evidence informs practice and practice informs evidence. Though the randomized clinical trial (RCT) serves as the gold standard for informing clinical decisions, flaws exist in terms of achieving recruitment, overly stringent inclusion/exclusion criteria, and lack of patient-centered decision making. Observational cohort studies have grown as an important complement to RCTs, allowing comparative effectiveness research and patient-centered trials. The surge of Electronic Health Records (EHRs) and the resulting zettabyte of data allow us to realize this vision for the first time. Despite the growth of observational cohort studies, challenges still remain in bringing the knowledge from the bench to the bedside; moreover, model performance degrades when a model is used in a cohort outside the one in which it was developed.
To ameliorate these difficulties, we propose to launch and study a novel “informatics consult” service. The service would allow clinicians, when no clear evidence-based guidelines exist regarding care decisions, to query the UCSD clinical data warehouse by identifying patients similar to the index case. First proposed in the seminal “Green Button” paper by Longhurst et al., such a system would leverage our ability to truly deliver personalized, patient-centered care. Small-scale, limited efforts have been put into practice to answer questions regarding treatment of melanoma and systemic lupus erythematosus complications. We note, however, the opportunity for a much larger service with broad impact, starting with insights borne of data from UCSD and potentially mining insights from the entire state-wide UC Health data warehouse.
We note several novel challenges to this proposed system: (i) Performing semi-automated phenotyping so that we can identify clinical outcomes of interest. (ii) Identifying patients that are similar to the index patient (often called clustering). (iii) Incorporating automated, computable search regarding guideline-recommended care. (iv) Performing visual analytics to understand similarity of cohorts. (v) Communicating probability and statistical information to healthcare professionals so they can effectively manage uncertainty.
Student responsibilities:
- Participate in project meetings
- Help design one of several possible algorithms/interfaces:
a. patient clustering algorithm using unsupervised learning
b. visual analytic interface for describing similar cohort of patients
c. visual analytic interface to help communicate statistical risk
-
Integrating patient reported outcomes into the electronic health record to improve cardiovascular care.
Unhealthy dietary choices (a lack of nutritious foods and an excess of unhealthy food) were shown to be the major contributor to the 400,000 U.S. deaths in 2015 from cardiovascular diseases (CVD). Eating more nuts, vegetables, and whole grains, and less salt and trans fats, could save tens of thousands of lives in the U.S. each year. Obesity is one critical outcome of poor diet, which also contributes to heightened CVD risk. Thousands of smartphone apps are available to download for weight loss, but these apps primarily focus on caloric intake rather than the overall quality of diet and lifestyle critical for CVD prevention.
Mobile Health (mHealth) applications also have not been systematically tested for their effectiveness and are criticized for not having an evidence-based foundation. In this study, we adapt the design of mHeart to communicate automatically with the UCSD Electronic Health Record to help healthcare providers have access to psychosocial aspects of a patient's care outside of the direct hospital system. In particular, the provider will be able to view logs of patient activity, dietary choices, and other lifestyle choices. The provider will also be able to send feedback to the patient to alter behavior.
Student opportunities:
- Help modify smartphone app to make use of healthcare connection protocols like Apple HealthKit and Google Fit
- Understand interfaces that communicate with electronic health records (like FHIR)
- Help design point-to-point interface between smartphone app and electronic health record data, which is presented to provider
- Participate in meetings designing pilot study to test app performance
-
Systematic review and meta-analysis of hospital readmission for patients with cirrhosis.
Patients with cirrhosis, a late stage of chronic liver disease, are at increased risk of hospitalization and hospital readmission. Although several studies have looked at models for predicting readmission for patients with cirrhosis, they are limited by small sample sizes, limited candidate predictor variables, and limited evaluation of discrimination and calibration. A systematic review and meta-analysis of available evidence can help shed new light on the problem, and help identify modifiable risk factors.
Student responsibilities:
- Understand the basics of a systematic review
- Perform literature review
- Abstract necessary information in case report forms and help perform meta-analysis
- Help write manuscript
Tsung-Ting Kuo | Biomedical Informatics
Dr. Tsung-Ting Kuo is an Assistant Professor of Medicine in the University of California San Diego (UCSD) Health Department of Biomedical Informatics (DBMI). He mainly conducts biomedical, healthcare, and genomic studies based on blockchain and predictive modeling. His research focuses on blockchain technologies, machine learning, and natural language processing.
-
Developing privacy-preserving predictive modeling algorithms on blockchain networks
Predictive modeling can advance research, facilitate quality improvement initiatives, and substantiate research results, especially when data from multiple healthcare systems can be included. However, current state-of-the-art privacy-preserving predictive modeling frameworks are still centralized; in other words, the models from distributed sites are integrated in a central server to build a global model. This centralization carries several risks, e.g., a single point of failure at the central server. To improve the security and robustness of predictive modeling frameworks, we will develop and implement novel and advanced algorithms on decentralized blockchain networks (a distributed ledger/database technology adopted by crypto-currencies such as Bitcoin and Ethereum) to build better models. The outcome will be algorithms that improve the predictive power of data from multiple healthcare systems through a distributed system. Selected references: PMID 36402113, 34923447, and 31943009.
Amit Majithia | School of Medicine
Insulin resistance is a major cause of the epidemic diseases of our society: diabetes, heart attacks, strokes, and fatty liver. Our goal is to understand who develops insulin resistance, how, and why. We use longitudinal health records, functional genomics, and human genetics to address this goal. In addition to this discovery science, we focus on clinical translation by developing ML/AI-driven clinical tools to interpret large scale genetic, genomic, and longitudinal health data to diagnose and treat people with diseases related to insulin resistance.
-
aiDose: A foundation model for clinical decision support in hospitalized patients with diabetes
Hospitalized patients with diabetes require daily management by a team of clinicians that must make daily insulin dosing decisions. While each decision is usually made according to standard algorithms, hundreds of such decisions must be made daily by a few providers, leading to errors and burnout. This project will address this clinical burden by developing aiDose, a foundation model for inpatient diabetes management that can incorporate medical records and previous clinician notes to provide decision support for inpatient diabetes services.
-
Longitudinal metabolic analysis of Diabetes Prevention Program (DPP) participants to identify patient subgroups with differential micro and macrovascular complication risk
Type 2 Diabetes (T2D) complications cause morbidity and mortality, but occur heterogeneously among those at risk and thus are difficult to predict. Previous studies to identify individuals at risk of diabetic complications focus on single timepoint data for a few features and do not examine phenotypic variables over time. This project will analyze multiple longitudinal clinical phenotypes to identify clusters of individuals at risk of diabetic complications.
-
Pre-adipocyte cell fate reprogramming
Adipose tissue secretes cytokines to regulate essential functions, but comprehensive study is prevented by difficult-to-access depots such as visceral epicardial adipose tissue and differences between donor sources and methods to generate cell lines. This project will start with an isogenic population of cultured human preadipocytes and reprogram them to specific types of adipocytes using a combinatorial process analogous to that of induced pluripotent stem cells. This project includes working with single cell (sc)-RNAseq and sc-ATACseq datasets using XGBoost and other models.
Amy Sitapati | Biomedical Informatics
The Sitapati Lab is an operational & translational space with expertise in the following domains: (1) clinical informatics, (2) population health (i.e. registries, outreach), (3) quality informatics, (4) vital records informatics, (5) NIH All of Us researcher workbench. The lab includes teams from Information Services at UCSD, the CalIVRS team, Quality and Patient Safety, and the Population Health Services Organization to name a few.
-
EMR based registries
Our IS Population Health team typically builds registries in 90-day cycles that meet organizational needs and mission. These vary but require workflow, data cleaning/mapping, and creation of metrics.
-
QIP: Public health quality informatics
UCSD has an active quality improvement program that advances patient health. Within the program there are opportunities to improve data quality, outreach campaigns, and outcomes measurement as part of quality informatics. Most projects would last 6-24 months.
-
Vital Records Informatics
Advanced processes that aim to modernize vital records for public health purposes, such as interoperability, usability, and accessibility, are needed. Projects that evaluate the current state, with a description of the future state and a manuscript as primary outcomes, could help advance the field.
Benjamin Smarr | Bioengineering
My research focuses on time series analysis in biological systems, with an emphasis on practical information extraction for translational applications. The lab is divided into applications and approaches, though these all serve each other, and students collaborate routinely. Indeed, a positive attitude and an eagerness to support one another is requisite in the lab.
Applications include, but are not limited to: illness detection, prediction, and recovery monitoring; pregnancy detection and outcome forecasting; mental health monitoring; defining sleep in the body (as opposed to EEG); diabetes forecasting; and carbon footprint optimization of distributed computer systems.
Approaches include, but are not limited to: multimodal time series information extraction; differentiating multiple outcome types from random assortment; reduction of high dimensional spaces with modality, individual, and time series components; explicable machine learning model development; non-stationary signal analysis; and novel approaches to diversity mapping and phenotyping from physiology and behavior data.
I seek to find a fit with each individual and the lab’s ongoing projects; no one comes in and is just given marching orders. You’ll do better work when it’s the work that you actually want to do!
-
COVID-19 recovery monitoring
Last Updated:Some individuals seem to have lingering or failed recoveries after COVID-19 infections. Students comfortable with basic programming or data science skills are encouraged to enhance our description of recovery profiles from TemPredict, and search for features that can contribute to pre-recovery classification.
-
Diversity within physiological data
Last Updated:Algorithms tend to be one size fits all, whereas people are similar or dissimilar in complex and unmapped ways. Help map differences in normal routines, as well as in illness and recovery trajectories. These might arise from known demographic information or co-morbid conditions (diabetes, pregnancy, etc.), or represent different patterns in illness associated with unknown or latent variables.
-
Improving women’s health outcomes
Last Updated:We have shown repeatedly in humans and animal models that females are as tractable with statistics as males (actually, often more so). Yet female physiology remains inappropriately understudied. Help us refine algorithms, map changes like pregnancy and menopause, and explore diversity within as well as across traditional sex categories.
Yingxiao (Peter) Wang | Bioengineering
Our research focuses on molecular engineering for cellular imaging and reprogramming, and image-based bioinformatics, with applications in stem cell differentiation and cancer treatment.
-
Image-based reconstruction of biochemical networks in live cells
Last Updated:Fluorescence resonance energy transfer (FRET)-based biosensors have been widely used in live-cell imaging to accurately visualize specific biochemical activities. We have developed the Fluocell image analysis software package to efficiently and quantitatively evaluate intracellular biochemical signals in real time, and to provide statistical inference on the biological implications of the imaging results. However, important questions arise on how to use these results to reconstruct the quantitative parameters of the underlying biochemical networks, which determine cellular functions and ultimately cell fates. In this rotation project, we will integrate optimization-based machine learning approaches with biochemical network models to seek answers to these questions, with applications in overcoming drug resistance in cancer treatment.
-
Intelligent Diagnosis of Infectious Diseases by Deep Learning
Last Updated:The diagnosis of infectious diseases often requires tissue biopsy and microscopic examination by pathologists, which is time-consuming, labor-intensive, and error-prone. To develop a software-assisted system for identifying microorganisms in digital images, we use convolutional neural networks and transfer learning to train and validate an intelligent software system for the classification of pathology slides. The goal of this project is to provide a diagnosis of pathogens with high efficiency and accuracy. Students will work in an interdisciplinary team, collecting and labelling imaging data, developing deep-learning-based algorithms and user interfaces, and characterizing and optimizing the accuracy and functionality of the software package.
Dominik Wodarz | Biological Sciences
We study mathematical and computational models of biomedical processes, with a focus on infection, the immune system, and cancer. We also study mathematical models of evolutionary processes and develop evolutionary theory. We aim to couple mathematical modeling work with data from the relevant fields through collaborations with experimental and clinical laboratories.
-
Evolutionary theory
Last Updated:This project develops basic evolutionary theory, with relevance to biomedical applications. For example, we study the evolution and emergence of mutants in spatially structured populations under various assumptions. Much remains to be discovered about the principles of mutant evolution in structured populations, and this has important applications for cancer biology and cancer therapy, since most tumors grow as a mass of cells with strong spatial structure.
-
Mathematical models of in vivo virus dynamics
Last Updated:The project will be concerned with mathematical models of virus replication within hosts, and the interactions of viruses with immune responses. Much of this modeling work is concerned with human immunodeficiency virus (HIV), due to the availability of experimental and clinical data. Topics include the evolution of HIV within hosts, the effect of spatial lymphoid tissue structure on HIV dynamics and evolution, and the dynamics of HIV during antiretroviral therapy in relation to the latent viral reservoir.
-
Mathematical Oncology
Last Updated:This project is concerned with mathematical models of cancer initiation, cancer progression, and cancer therapy. This involves mathematical models of tissue stem cell dynamics, clonal cellular evolution in tissues during aging in relation to the development of cancer, and evolutionary models of drug resistance in cancers. Hematological malignancies are a major focus of this work. With respect to therapies and drug resistance, this work involves the use of mathematical models with patient-specific parameters to make personalized predictions about treatment outcome.
Rose Yu | Computer Science and Engineering
My research interests lie primarily in machine learning, especially for large-scale spatiotemporal data. I am generally interested in deep learning, optimization, and spatiotemporal reasoning. I am particularly excited about the interplay between physics and machine learning. My work has been applied to learning dynamical systems in sustainability, health and physical sciences.
-
Automatic Blood Pressure Control with Machine Learning
Last Updated:This project seeks to develop novel deep learning methods to forecast and control patients' blood pressure using large-scale sensor data from artificial heart pumps.