Biomedical Informatics Graduate Rotation Projects

This page is updated annually. Some projects may already be taken, and new projects may be available. The projects below give an indication of the types of projects available in each lab, but please browse faculty web pages and contact professors directly to discuss current opportunities.

View Rotation Projects by Faculty: BISB or BMI

Also see the Division of Biomedical Informatics projects page.

Labs with BMI Rotation Projects

Tiffany Amariuta | Halıcıoğlu Data Science Institute

tamariutabartell@ucsd.edu

Ferhat Ay | La Jolla Institute for Immunology

Vineet Bafna | Computer Science and Engineering

vbafna@ucsd.edu

Vikas Bansal | Pediatrics

Joseph Califano | Surgery

Christine Cheng | Psychiatry

Kelly Frazer | Pediatrics

kafrazer@health.ucsd.edu

Lilia Iakoucheva | Psychiatry

Rob Knight | Pediatrics

Jejo Koola | Biomedical Informatics

Hojun Li | Pediatrics

hojun@health.ucsd.edu

Amit Majithia | School of Medicine

Pia Pannaraj | Pediatrics

ppannaraj@ucsd.edu

Amy Sitapati | Biomedical Informatics

Benjamin Smarr | Bioengineering

Yingxiao (Peter) Wang | Bioengineering

Dominik Wodarz | Biological Sciences

Rose Yu | Computer Science and Engineering

roseyu@ucsd.edu

Tiffany Amariuta | Halıcıoğlu Data Science Institute

tamariutabartell@ucsd.edu | Lab

We are a statistical genetics lab focusing on developing methods to study complex traits and polygenic diseases across global populations, with a specific focus on minority groups that have been underrepresented in the fields of genetics and genomics. We are interested in developing novel multi-ancestry statistical methods for fine-mapping disease genes and their cell types of action. The goal of this research is to identify targets for gene-based therapeutics.

Mapping the genetic architecture of polygenic disease, complex traits, and gene expression levels
Last Updated: 10/07/2024
The Amariuta Lab is happily accepting PhD rotation students. We have a variety of predefined projects but are also open to student-led projects and ideas that fall within the general scope of research in our lab. One potential rotation project involves investigating different ways to estimate gene expression heritability (i.e., the proportion of gene expression variance that can be explained by genetics), including using cutting-edge fine-mapping algorithms and single-cell RNA-sequencing data. A second potential project involves developing a new method to map the cell-type specificity of the genetic component of gene expression regulation, which is often confounded by strong patterns of co-expression and co-regulation across genes and other cell types. A third potential project involves mapping the causal genes underlying putative causal tissues and cell types mapped via our previously published method, Tissue Co-regulation SCore regression (TCSC). A fourth potential project involves identifying putative causal tissues and cell types underlying the genetic correlation, and hence pleiotropy, of immune-mediated diseases. Lastly, we are beginning to explore deep learning models that use sequence data to predict gene expression levels in novel ways, and we welcome students who wish to gain skills in any of these areas.
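
As a point of reference for the first project, the heritability of a gene's expression is commonly written as the share of total expression variance attributable to genetic effects. The symbols below are generic textbook notation, not notation taken from the lab's own methods:

```latex
h^2_{g} = \frac{\sigma^2_{g}}{\sigma^2_{g} + \sigma^2_{e}}
```

Here \sigma^2_{g} is the variance in a gene's expression explained by genetic variation and \sigma^2_{e} is the remaining (environmental and noise) variance, so h^2_{g} lies between 0 and 1.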

Ferhat Ay | La Jolla Institute for Immunology

ferhatay@lji.org | Lab

We are interested in the analysis and modeling of three-dimensional chromatin structure from high-throughput sequencing experiments. We develop methods grounded in statistics, machine learning, optimization, and graph theory to understand how changes in the 3D genome affect cellular outcomes such as development, differentiation, and gene expression. We have ongoing interests in the systems-level analysis and reconstruction of regulatory networks, inference of enhancer-promoter contacts, predictive models of gene expression, and integration of three-dimensional chromatin structure with one-dimensional epigenetic measurements in the context of cancer, malaria, asthma, and several autoimmune diseases.

Integrative analysis of multi cell-type gene expression and epigenomic data in tumor immune response
Last Updated: 04/15/2018
This project will focus on developing regulatory network inference methods for the joint analysis of gene expression and histone modification data from several different types of tumor infiltrating lymphocytes, which are gathered from a cohort of patients with solid tumors.
Predictive and comparative modeling of epigenetic gene regulation in different human immune cell types
Last Updated: 04/15/2018
The goal of this project is to model the natural variation in gene expression across many immune cell types using an already established database at LJI (https://dice-database.org) and to identify cell type-specific epigenetic regulators of important immune genes.
Statistical methods for inferring functional DNA-DNA contacts from Hi-C and HiChIP/PLAC-seq data
Last Updated: 04/15/2018
This project focuses on developing computational tools for better analysis of the wealth of data from chromosome conformation capture assays with the ultimate goal of inferring functional chromatin contacts such as those between enhancers and promoters.

Vineet Bafna | Computer Science and Engineering

vbafna@ucsd.edu | Profile | Lab

Our lab is focused on the design and implementation of algorithms for biological data interpretation. Within this broad framework, we have a number of open projects relating to problems in proteomics (interpretation of mass spectrometry data), genetics, and genomics. The projects listed below are a small sampling of available projects. Interested students should have taken a class in algorithm design and have some facility with machine learning approaches.

Extrachromosomal DNA analysis
Last Updated: 10/07/2024
Extrachromosomal DNA (ecDNA) formation is an important pathological condition found in nearly a third of cancers and all cancer subtypes. Our lab is developing computational tools to characterize the structural and functional properties of ecDNA and related focal amplifications.

Interested students should want to learn about, design, and implement graph algorithms, and should commit to taking my winter class, CSE280A.

Vikas Bansal | Pediatrics

vibansal@ucsd.edu | Lab

Research in our lab is focused on developing computational methods and tools for variant calling in human genomes and using these tools for disease association studies. We focus on challenging variant types such as haplotypes and variants in repetitive regions and work with both short-read (Illumina) and long-read sequencing technologies.

Duplicated genes and association with disease
Last Updated: 06/11/2021
Hundreds of genes in the human genome are duplicated, and many are known to be associated with human diseases. However, the short read lengths of current sequencing technologies make the analysis of such genes difficult. We have developed novel tools to genotype the copy number of duplicated genes using whole-genome sequencing. The goal of this project is to analyze large-scale sequencing datasets (using cloud computing platforms) for Mendelian and complex human diseases to identify novel disease associations.
Haplotype-based variant calling using long-read sequencing
Last Updated: 06/11/2021
Long-read sequencing technologies have the potential to overcome some of the key limitations of short-read sequencing, particularly in long repetitive regions of the human genome, but require the development of new algorithms. We have previously developed computational methods for variant calling (Longshot, Nature Communications 2019) and read mapping in segmental duplications (Duplomap, Nucleic Acids Research 2020) using long-read sequencing technologies. The goal of this project is to implement a haplotype-based model for variant calling using long reads that automatically identifies genomic regions that can be called with high confidence.

Joseph Califano | Surgery

jcalifano@health.ucsd.edu | Profile | Lab

Chromatin dysregulation and DNA methylation at transcription start sites associated with transcriptional repression in cancers
Last Updated: 07/13/2021
Genetic and epigenetic analysis of HPV-positive and HPV-negative Oropharyngeal Squamous Cell Carcinoma
Last Updated: 07/13/2021
The Lab of Dr. Joseph Califano, under the sponsorship of the Japan Society for the Promotion of Science (JSPS), will conduct collaborative research on a new strategy for the treatment of HPV-associated oropharyngeal cancer based on comprehensive epigenetic analysis. This year, we are proud to congratulate postdoctoral researcher Dr. Takuya Nakagawa for his work entitled “Genetic and epigenetic analysis of HPV-positive and HPV-negative Oropharyngeal Squamous Cell Carcinoma.” Dr. Nakagawa graduated from Chiba University School of Medicine in Japan, where he also received the Medical Pharmacy Director’s Award.

Christine Cheng | Psychiatry

csc003@ucsd.edu | Profile

Dr. Cheng’s research focuses on transcriptional regulatory networks and aims to develop a comprehensive understanding of how aberrant regulatory circuits contribute to human disease. Dr. Cheng’s laboratory is particularly interested in understanding transcriptional and epigenetic regulation of the interplay between the immune system and the central nervous system in neurodegenerative diseases, substance use disorder, and HIV infection. Current projects focus on applying single-cell transcriptomics and epigenetics assays to characterize Alzheimer’s disease, HIV, and opioid use disorder patient samples, with the goal of finding diagnostic markers and therapeutic targets. Dr. Cheng’s lab has also developed 3D brain organoid models for Alzheimer’s disease and HIV infection. Dr. Cheng received her M.S. degree in Computer Science from Stanford University and her Ph.D. degree in Bioinformatics and Systems Biology from the University of California, San Diego. After completing her doctoral study, Dr. Cheng did her postdoctoral training at the Broad Institute of MIT and Harvard.

3D brain organoid model of Alzheimer’s disease revealed by single cell transcriptomics
Last Updated: 08/03/2022
We developed a novel tau propagation model using a 3D spheroid model that rapidly develops tau pathology and neurodegeneration in just three weeks. Single-cell transcriptomics of the model reveals cell-type-specific changes that resemble transcriptomic signatures from Alzheimer’s disease postmortem brain.
Single cell transcriptomics and epigenetics of human Alzheimer’s disease brain
Last Updated: 08/03/2022
To understand cell-type-specific vulnerability in Alzheimer’s disease, we use snRNA-seq to characterize human brain tissues from Alzheimer’s disease patients across different brain regions.
Single cell transcriptomics and epigenetics of the opioid use disorder and HIV syndemic in the human brain
Last Updated: 08/03/2022
As part of the NIH NIDA SCORCH consortium, we will dissect the dysregulated molecular circuitry in the brains of individuals with opioid use disorder and/or HIV infection. This project aims to identify genes that contribute to opioid use disorder and HIV-associated neurocognitive disorders. These approaches could lead to novel gene therapies to control and perhaps reverse the relentless disease state. We are in the process of generating snRNA-seq and snATAC-seq profiles from more than 300 patient samples across 3 different brain regions.
Single Cell Transcriptomics of the Cocaine Use Disorder in the Context of HIV
Last Updated: 08/03/2022
As part of the NIH NIDA SCORCH consortium, we will dissect the dysregulated molecular circuitry in the brains of individuals with cocaine use disorder and/or HIV infection. This project will focus on understanding how neurovasculature and neuroimmune cells contribute to cocaine use disorder and HIV-associated neurocognitive disorders. We will be generating snRNA-seq and snATAC-seq profiles from more than 300 patient samples across 3 different brain regions.

Kelly Frazer | Pediatrics

kafrazer@health.ucsd.edu | Profile | Lab

Welcome to the Frazer Lab! We are using two complementary approaches to achieve our goal of identifying and characterizing functional human genetic variants. Our first approach uses iPSCORE, a resource that was generated to enable both familial and association-based genetic studies of molecular and physiological phenotypes in induced pluripotent stem cells (iPSCs) and derived cell types. Our second approach involves conducting association studies in well-characterized cohorts with the goal of identifying variants that play roles in human disease and assessing their contributions to disease pathogenesis, progression, and prognosis.

Investigate fetal-specific cardiac regulatory variants and their overlap with cardiac GWAS lead variants
Last Updated: 11/19/2019
We have derived iPSC-derived cardiovascular progenitor cells (iPSC-CVPCs) from 180 individuals and showed that their transcriptomes are more similar to fetal heart than to adult cardiac tissues. Our goal is to leverage these data in combination with WGS to perform eQTL analyses. We plan to assess whether fetal-specific eQTLs are associated with complex adult cardiac traits by colocalizing eQTLs with summary statistics from GWAS of cardiac traits. Our preliminary analyses show that eQTLs in iPSC-CVPCs identify cardiac disease GWAS variants that are active in the fetal but not the adult heart, indicating that they play a role in development. Our findings provide genetic evidence supporting the fetal origins of cardiovascular disease hypothesis and highlight the importance of investigating genetic associations across stages of development (i.e., fetal and adult tissues) to fully understand the genetic underpinnings of complex traits and disease. We are looking for rotation students to conduct QTL analyses using large ATAC-seq and H3K27ac ChIP-seq datasets generated from the iPSC-CVPCs.
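
For orientation, a QTL analysis of this kind typically asks, for each variant-feature pair, whether genotype dosage predicts a molecular phenotype such as expression or chromatin accessibility. The following is a minimal sketch under simplifying assumptions: a single variant coded as 0/1/2 alternate-allele dosage, a plain least-squares fit with no covariates or multiple-testing correction, and made-up data. The function name `qtl_slope` is hypothetical and does not refer to any tool used by the lab.

```python
def qtl_slope(dosages, phenotype):
    """Least-squares slope and intercept of a phenotype on genotype dosage."""
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(phenotype) / n
    # Sum of squared deviations of the genotype; zero means no variation to test.
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    if sxx == 0:
        raise ValueError("variant is monomorphic in this sample")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, phenotype))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: six individuals, alternate-allele dosage and a
# normalized molecular phenotype (e.g., expression or peak signal).
dosages = [0, 1, 2, 0, 1, 2]
expression = [1.0, 1.5, 2.0, 1.0, 1.5, 2.0]
slope, intercept = qtl_slope(dosages, expression)
```

A nonzero slope suggests that each additional copy of the alternate allele shifts the phenotype; real analyses would add covariates, significance testing, and correction across many variants.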

Lilia Iakoucheva | Psychiatry

lilyak@ucsd.edu | Profile | Lab

The lab has a variety of bioinformatics rotation projects aimed at improving understanding of the functional impact of autism risk mutations. We are using iPSC-derived human brain organoid models to investigate the dysregulated cellular and molecular pathways behind these mutations. We are collecting single-cell RNA-seq and single-cell ATAC-seq (10x multiome) data from organoids with ASD mutations. We also have several projects aimed at investigating the impact of environmental chemicals (PFAS) on fetal brain development using organoid models. Currently, we are looking for rotation students who have previous experience with scRNA-seq and/or scATAC-seq data analysis.

Investigating changes in cell type proportions, gene expression and gene regulation impacted by the 16p11.2 copy number variants
Last Updated: 03/18/2025
Copy number variants (CNVs) represent significant risk factors for Autism Spectrum Disorders (ASD). One of the most frequent CNVs involved in ASD is a deletion or duplication of the 16p11.2 CNV locus, spanning 29 protein-coding genes. Despite progress in linking 16p11.2 genetic changes with phenotypic abnormalities (macrocephaly and microcephaly) in patients and model organisms, the specific molecular pathways impacted by this CNV remain unknown. We generated 10x multiome data from brain organoids carrying 16p11.2 deletions and duplications at various time points. The goal of this rotation project is to analyze these data to discover pathways and gene regulatory networks dysregulated by the 16p11.2 CNV. The student is required to be familiar with single-cell RNA-seq data analysis.
Investigating the impact of Cul3 mutations in brain organoids
Last Updated: 03/18/2025
Cul3 (Cullin 3) ubiquitin ligase is one of the genes implicated in Autism Spectrum Disorders. We generated brain organoids with Cul3 haploinsufficiency and collected 10x multiome data on these organoids. The goal of the rotation project is to analyze these data to identify cell types, genes, and chromatin states impacted by this ASD mutation. The student is required to be familiar with single-cell RNA-seq data analysis.

Rob Knight | Pediatrics

rknight@ucsd.edu | Lab

The Knight lab has broad interests in the human microbiome, the collection of trillions of microbes that inhabits our bodies, especially in developing techniques to read out these complex microbial communities and use the resulting data to understand human health, links between humans and the environment, and to prevent and cure disease. We offer a fast-paced environment with many collaborative opportunities on different projects.

Machine Learning for the Microbiome
Last Updated: 09/05/2017
We have amassed a database of microbial DNA sequences from hundreds of thousands of biological specimens. Understanding how variation in these microbial communities relates to disease requires a range of machine learning and multivariate statistical approaches. There are many opportunities ranging from entry-level (benchmarking classifier performance on specific sample sets) to extremely challenging (using deep learning to infer the structure of global sample set relationships).
Multi-omics integration
Last Updated: 09/05/2017
There is an increasing need to integrate data from different "omics" levels, e.g. genomes, metagenomes, metatranscriptomes, metaproteomes, metabolomes, immunological profiling, etc., into a single coherent picture separating healthy and disease states. Improved methods for performing this task, either directly or via intermediate representations such as mapping to metabolic and regulatory pathways, are essential for improving understanding. Projects in this category range from simple (testing where existing techniques like correlation networks or Procrustes analysis do or don't connect two specific data layers) to challenging (using transfer learning to integrate heterogeneous data layers and improve the underlying network annotation). An especially exciting emerging research direction here is XAI (explainable artificial intelligence), which can provide a better justification for a specific classification or suggestion in clinical applications.
Optimizing microbiome algorithms
Last Updated: 09/05/2017
Many algorithms used in microbiome studies, especially in metagenomic assembly, are extremely computationally expensive. Opportunities exist either to exploit new hardware architectures to accelerate existing algorithms or to develop new approximate algorithms. Target problems in the workflow include inferring taxonomy and function from DNA sequence data, genome and metagenome assembly and annotation, computing community distance metrics from sparse compositional data, and high-level analyses of hundreds of thousands of microbiomes. Again, these projects range from entry-level (comparing the results of two multiple sequence alignment techniques for subsequent community analysis) to advanced (using non-von Neumann architectures to perform pattern classification in real time at the whole-community level for disease detection).

Jejo Koola | Biomedical Informatics

jkoola@ucsd.edu | Profile

Dr. Koola is a physician scientist specializing in Biomedical Informatics and Hospital Medicine. His work centers on big data machine learning for predictive analytics. In particular, he is interested in using electronic health records to improve care delivery, especially for patients with advanced liver disease. Using risk prediction models in a healthcare context requires understanding of: (i) the healthcare system of intended use; (ii) risk model building; (iii) risk model assessment; and (iv) risk model re-calibration. Additionally, Dr. Koola is interested in visual analytics, data modeling, and health services research.

Designing the "Green Button" informatics consult service using big data analytics for personalized medicine
Last Updated: 09/07/2017
In 2012 the Institute of Medicine released desiderata for a learning healthcare system, where evidence informs practice and practice informs evidence. Though the randomized clinical trial (RCT) serves as the gold standard for informing clinical decisions, it has flaws: difficulty achieving recruitment, overly stringent inclusion/exclusion criteria, and a lack of patient-centered decision making. Observational cohort studies have grown as an important complement to RCTs, allowing comparative effectiveness research and patient-centered trials. The surge of Electronic Health Records (EHR) and the resulting zettabyte of data allows us to realize this vision for the first time. Despite the growth of observational cohort studies, challenges remain in bringing knowledge from bench to bedside; moreover, model performance degrades when a model is used in a cohort outside the one in which it was developed.

To ameliorate these difficulties, we propose to launch and study a novel “informatics consult” service. When no clear evidence-based guidelines exist regarding care decisions, the service would allow clinicians to query the UCSD clinical data warehouse by identifying patients similar to the index case. First proposed in the seminal “Green Button” paper by Longhurst et al., such a system would leverage our ability to truly deliver personalized, patient-centered care. Small-scale, limited efforts have been put into practice to answer questions regarding treatment of melanoma and systemic lupus erythematosus complications. We note, however, the opportunity for a much larger service with broad impact, starting with insights borne of data from UCSD and potentially mining insights from the entire state-wide UC Health data warehouse.

We note several novel challenges to this proposed system: (i) performing semi-automated phenotyping so that we can identify clinical outcomes of interest; (ii) identifying patients who are similar to the index patient (often called clustering); (iii) incorporating automated, computable search of guideline-recommended care; (iv) performing visual analytics to understand similarity of cohorts; and (v) communicating probability and statistical information to healthcare professionals so they can effectively manage uncertainty.

Student responsibilities:
1. Participate in project meetings
2. Help design one of several possible algorithms/interfaces:
  a. patient clustering algorithm using unsupervised learning
  b. visual analytic interface for describing similar cohort of patients
  c. visual analytic interface to help communicate statistical risk
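
To make the "patients similar to the index case" idea concrete, here is a minimal sketch that swaps in a simple nearest-neighbor search on standardized features rather than a full clustering algorithm. Everything in it is an assumption for illustration: the feature set (age, a MELD score, serum sodium), the values, and the function names `standardize` and `most_similar` are hypothetical and are not taken from the proposed service or the UCSD data warehouse.

```python
import math


def standardize(rows):
    """Z-score each column so features on different scales are comparable."""
    n_cols = len(rows[0])
    means = [sum(r[j] for r in rows) / len(rows) for j in range(n_cols)]
    stds = []
    for j in range(n_cols):
        var = sum((r[j] - means[j]) ** 2 for r in rows) / len(rows)
        # Fall back to 1.0 for constant columns to avoid dividing by zero.
        stds.append(math.sqrt(var) or 1.0)
    return [[(r[j] - means[j]) / stds[j] for j in range(n_cols)] for r in rows]


def most_similar(index_row, cohort, k=2):
    """Return indices of the k cohort rows closest to the index patient."""
    scaled = standardize(cohort + [index_row])
    index_scaled = scaled[-1]
    distances = [
        (math.dist(index_scaled, row), i) for i, row in enumerate(scaled[:-1])
    ]
    return [i for _, i in sorted(distances)[:k]]


# Hypothetical features per patient: age (years), MELD score, sodium (mmol/L).
cohort = [
    [54, 12, 138],
    [61, 22, 130],
    [47, 9, 141],
    [63, 24, 129],
]
index_patient = [62, 23, 131]
neighbors = most_similar(index_patient, cohort, k=2)
```

Standardizing before measuring distance keeps a wide-ranging feature from dominating the comparison; a production system would also need to handle missing data, categorical variables, and a principled choice of distance.
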
Integrating patient reported outcomes into the electronic health record to improve cardiovascular care.
Last Updated: 09/07/2017
Unhealthy dietary choices (a lack of nutritious foods and an excess of unhealthy food) were shown to be the major contributor to the 400,000 U.S. deaths in 2015 from cardiovascular diseases (CVD). Eating more nuts, vegetables, and whole grains, and less salt and trans fats, could save tens of thousands of lives in the U.S. each year. Obesity is one critical outcome of poor diet, which also contributes to heightened CVD risk. Thousands of smartphone apps are available to download for weight loss, but these apps primarily focus on caloric intake rather than the overall quality of diet and lifestyle critical for CVD prevention.

Mobile Health (mHealth) applications also have not been systematically tested for their effectiveness and are criticized for not having an evidence-based foundation. In this study, we adapt the design of mHeart to communicate automatically with the UCSD Electronic Health Record so that healthcare providers have access to psychosocial aspects of a patient's care outside of the direct hospital system. In particular, the provider will be able to view logs of patient activity, dietary choices, and other lifestyle choices. The provider will also be able to send feedback to the patient to alter behavior.

Student opportunities:
1. Help modify smartphone app to make use of healthcare connection protocols like Apple HealthKit and Google Fit
2. Understand interfaces that communicate with electronic health records (like FHIR)
3. Help design point-to-point interface between smartphone app and electronic health record data, which is presented to the provider
4. Participate in meetings designing pilot study to test app performance
Systematic review and meta-analysis of hospital readmission for patients with cirrhosis.
Last Updated: 09/07/2017
Patients with cirrhosis, a late stage of chronic liver disease, are at increased risk of hospitalization and hospital readmission. Although several studies have looked at models for predicting readmission for patients with cirrhosis, they are limited by small sample sizes, limited candidate predictor variables, and limited evaluation of discrimination and calibration. A systematic review and meta-analysis of available evidence can help shed new light on the problem, and help identify modifiable risk factors.

Student responsibilities:
1. Understand the basics of a systematic review
2. Perform literature review
3. Abstract necessary information in case report forms and help perform meta-analysis
4. Help write manuscript
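
As a rough illustration of the pooling step in a meta-analysis, the sketch below computes a fixed-effect, inverse-variance weighted estimate. The effect sizes and standard errors are invented for the example, the function name `pooled_fixed_effect` is hypothetical, and a real analysis of readmission studies might well call for a random-effects model and heterogeneity checks instead.

```python
import math


def pooled_fixed_effect(estimates, std_errors):
    """Inverse-variance weighted pooled estimate and its standard error."""
    # Each study is weighted by the inverse of its variance (1 / SE^2).
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled_se


# Hypothetical log odds ratios and standard errors from three studies.
log_ors = [0.40, 0.10, 0.25]
std_errs = [0.20, 0.10, 0.20]
pooled, se = pooled_fixed_effect(log_ors, std_errs)
# Approximate 95% confidence interval on the log odds ratio scale.
ci = (pooled - 1.96 * se, pooled + 1.96 * se)
```

More precise studies (smaller standard errors) pull the pooled estimate toward their own result, which is why the second study dominates in this toy example.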

Hojun Li | Pediatrics

hojun@health.ucsd.edu | Profile | Lab

The Li lab is focused on using single cell resolution profiling of the continuum of stem and progenitor cell states in hematopoiesis to identify basic mechanisms and novel treatments for blood-based disorders.

Single cell dissection of blood cell formation and regeneration
Last Updated: 04/22/2025
The Li lab has generated multiple single cell datasets of normal, perturbed, and regenerative hematopoiesis that serve as a launching pad for exploration of novel bioinformatic approaches to reveal biology relevant to understanding and treating blood disorders. Rotation projects leveraging existing high-value datasets are available for prospective graduate students.

Amit Majithia | School of Medicine

amajithia@ucsd.edu | Profile | Lab

Insulin resistance is a major cause of the epidemic diseases of our society: diabetes, heart attacks, strokes, and fatty liver. Our goal is to understand who develops insulin resistance, how, and why. We use longitudinal health records, functional genomics, and human genetics to address this goal. In addition to this discovery science, we focus on clinical translation by developing ML/AI-driven clinical tools to interpret large scale genetic, genomic, and longitudinal health data to diagnose and treat people with diseases related to insulin resistance.

Longitudinal metabolic analysis of Diabetes Prevention Program (DPP) participants to identify patient subgroups with differential micro and macrovascular complication risk
Last Updated: 11/01/2024
Type 2 Diabetes (T2D) complications cause morbidity and mortality, but occur heterogeneously among those at risk and thus are difficult to predict. Previous studies to identify individuals at risk of diabetic complications focus on single timepoint data for a few features and do not examine phenotypic variables over time. This project will analyze multiple longitudinal clinical phenotypes to identify clusters of individuals at risk of diabetic complications.
Pre-adipocyte cell fate reprogramming
Last Updated: 11/01/2024
Adipose tissue secretes cytokines to regulate essential functions, but comprehensive study is prevented by difficult-to-access depots such as visceral epicardial adipose tissue and differences between donor sources and methods to generate cell lines. This project will start with an isogenic population of cultured human preadipocytes and reprogram them to specific types of adipocytes using a combinatorial process analogous to that of induced pluripotent stem cells. This project includes working with single cell (sc)-RNAseq and sc-ATACseq datasets using XGBoost and other models.

Pia Pannaraj | Pediatrics

ppannaraj@ucsd.edu | Profile | Lab

Our research focuses on understanding how human milk, the gut microbiome, immunizations, and repeated infections impact infant immune development and protection from infectious diseases. Studies include integration of transcriptomics, metagenomics, and proteomics with immunological data.

Mother-Infant Multi-omics Study
Last Updated: 11/17/2025
Multiple projects are available with longitudinal data ready to be analyzed, including metagenomic, transcriptomic, and proteomic data associated with clinical and immunologic profiles from human studies.

Amy Sitapati | Biomedical Informatics

asitapati@ucsd.edu | Profile | Lab

The Sitapati Lab is an operational & translational space with expertise in the following domains: (1) clinical informatics, (2) population health (i.e. registries, outreach), (3) quality informatics, (4) vital records informatics, (5) NIH All of Us researcher workbench. The lab includes teams from Information Services at UCSD, the CalIVRS team, Quality and Patient Safety, and the Population Health Services Organization to name a few.

EMR based registries
Last Updated: 07/26/2023
Our IS Population Health team typically builds registries in 90-day cycles that meet organizational needs and mission. These vary but require workflow, data cleaning/mapping, and creation of metrics.
QIP: Public health quality informatics
Last Updated: 07/26/2023
UCSD has an active quality improvement program that advances patient health. Within the program there are opportunities to improve data quality, outreach campaigns, and outcomes measurement as part of quality informatics. Most projects would last 6-24 months.
Vital Records Informatics
Last Updated: 07/26/2023
Advanced processes that aim to modernize vital records for public health purposes, such as interoperability, usability, and accessibility, are needed. Projects that evaluate the current state, with a description of the future state and a manuscript as primary outcomes, could help advance the field.

Benjamin Smarr | Bioengineering

bsmarr@ucsd.edu | Profile | Lab

My research focuses on time series analysis in biological systems, with an emphasis on practical information extraction for translational applications. The lab is divided into applications and approaches, though these all serve each other, and students collaborate routinely. Indeed, a positive attitude and an eagerness to support one another are requisite in the lab.

Applications include, but are not limited to: illness detection, prediction, and recovery monitoring; pregnancy detection and outcome forecasting; mental health monitoring; defining sleep in the body (as opposed to EEG); diabetes forecasting; and carbon footprint optimization of distributed computer systems.

Approaches include, but are not limited to: multimodal time series information extraction; differentiating multiple outcome types from random assortment; reduction of high-dimensional spaces with modality, individual, and time series components; explicable machine learning model development; non-stationary signal analysis; and novel approaches to diversity mapping and phenotyping from physiology and behavior data. I seek to find a fit between each individual and the lab’s ongoing projects; no one comes in and is just given marching orders. You’ll do better work when it’s the work that you actually want to do!

COVID-19 recovery monitoring
Last Updated: 04/25/2023
Some individuals seem to have lingering or failed recoveries after COVID-19 infections. Students comfortable with basic programming or data science skills are encouraged to enhance our description of recovery profiles from TemPredict, and search for features that can contribute to pre-recovery classification.
Diversity within physiological data
Last Updated: 04/25/2023
Algorithms tend to be one size fits all, whereas people are similar or dissimilar in complex and unmapped ways. Help map differences in normal routines, as well as in illness and recovery trajectories. These might arise from known demographic information or co-morbid conditions (diabetes, pregnancy, etc.), or represent different patterns in illness associated with unknown or latent variables.
Improving women’s health outcomes
Last Updated: 04/25/2023
We have shown repeatedly in humans and animal models that females are as tractable with statistics as males (actually, often more so). Yet female physiology remains inappropriately understudied. Help us refine algorithms, map changes like pregnancy and menopause, and explore diversity within as well as across traditional sex categories.

Yingxiao (Peter) Wang

Yingxiao (Peter) Wang
| Bioengineering

yiw015@ucsd.edu | Profile | Lab

Our research focuses on molecular engineering for cellular imaging and reprogramming, and image-based bioinformatics, with applications in stem cell differentiation and cancer treatment.

Image-based reconstruction of biochemical networks in live cells
Last Updated: 04/20/2018
Fluorescence resonance energy transfer (FRET)-based biosensors have been widely used in live-cell imaging to accurately visualize specific biochemical activities. We have developed the Fluocell image analysis software package to efficiently and quantitatively evaluate the intracellular biochemical signals in real-time, and to provide statistical inference on the biological implications of the imaging results. However, important questions arise on how to use these results to reconstruct the quantitative parameters in the underlying biochemical networks, which determine cellular functions and ultimately their fates. In this rotation project, we will integrate optimization-based machine learning approaches with biochemical network models to seek answers to these questions, with applications in cancer treatment against drug resistance.
Intelligent Diagnosis of Infectious Diseases by Deep Learning
Last Updated: 04/20/2018
The diagnosis of infectious diseases often requires tissue biopsy and microscopic examination by pathologists, which is time-consuming, labor-intensive, and error-prone. To develop a software-assisted system for identifying microorganisms in digital images, we use convolutional neural networks and transfer learning to train and validate an intelligent software system for the classification of pathology slides. The goal of this project is to provide a diagnosis of pathogens with high efficiency and accuracy. Students will work in an interdisciplinary team, collecting and labelling imaging data, developing deep-learning-based algorithms and user interfaces, and characterizing and optimizing the accuracy and functionality of the software package.

Dominik Wodarz

Dominik Wodarz
| Biological Sciences

dwodarz@ucsd.edu | Profile

We study mathematical and computational models of biomedical processes, with a focus on infection, the immune system, and cancer. We also study mathematical models of evolutionary processes and develop evolutionary theory. We aim to couple mathematical modeling work with data from the relevant fields through collaborations with experimental and clinical laboratories.

Evolutionary theory
Last Updated: 12/11/2023
This project develops basic evolutionary theory, with relevance to biomedical applications. For example, we study the evolution and emergence of mutants in spatially structured populations under various assumptions. Much remains to be discovered about the principles of mutant evolution in structured populations, and this has important applications for cancer biology and cancer therapy, since most tumors grow as a mass of cells with strong spatial structure.
Mathematical models on in vivo virus dynamics
Last Updated: 12/11/2023
The project will be concerned with mathematical models of virus replication within hosts, and the interactions of viruses with immune responses. Much of this modeling work is concerned with human immunodeficiency virus (HIV), due to the availability of experimental and clinical data. Topics include the evolution of HIV within hosts, the effect of spatial lymphoid tissue structure on HIV dynamics and evolution, and the dynamics of HIV during antiretroviral therapy in relation to the latent viral reservoir.
Mathematical Oncology
Last Updated: 12/11/2023
This project is concerned with mathematical models of cancer initiation, cancer progression, and cancer therapy. This involves mathematical models of tissue stem cell dynamics, clonal cellular evolution in tissues during aging in relation to the development of cancer, and evolutionary models of drug resistance in cancers. Hematological malignancies are a major focus of this work. With respect to therapies and drug resistance, this work involves the use of mathematical models with patient-specific parameters to make personalized predictions about treatment outcome.

Rose Yu

Rose Yu
| Computer Science and Engineering

roseyu@ucsd.edu | Profile | Lab

My research interests lie primarily in machine learning, especially for large-scale spatiotemporal data. I am generally interested in deep learning, optimization, and spatiotemporal reasoning. I am particularly excited about the interplay between physics and machine learning. My work has been applied to learning dynamical systems in sustainability, health and physical sciences.

Automatic Blood Pressure Control with Machine Learning
Last Updated: 07/13/2021
This project seeks to develop novel deep learning methods to forecast and control patients' blood pressure using large-scale sensor data from artificial heart pumps.
