I'm new to UCSD, so my lab is still taking shape. My research focuses on time series analysis in biological systems, with an emphasis on practical information extraction for translational applications. Currently the main project is TemPredict, in which we have gathered wearable device data from 50K people, coupled with over 1 million daily symptom reports. Collection is ongoing, as is the addition of 20K antibody tests for key participants to improve classification certainty from symptom and diagnosis self-reports. This massive data set is being used to identify signs of COVID-19 onset, progression, and recovery, as well as to compare COVID-19 progression profiles with other conditions that arose over 2020 in these individuals. We believe this is the largest public data set of time series physiology data ever accumulated, so there are many additional uses for the creative, physiology- and data-minded student or collaborator.
Quantitative Foundations of Computational Biology
Key to Computational Biology are the approaches and algorithms for processing, analyzing, and modeling knowledge, information, and data that are relevant to research questions from across the life sciences. The development of mathematical and computational methods with probabilistic, statistical, combinatorial, or heuristic foundations continues to drive innovation in Bioinformatics and Systems Biology.
We are interested in developing scalable methods for performing epidemiological analyses of large viral (primarily HIV) sequence and phylogenetic datasets. Topics of interest include large-scale phylogenetic analyses, developing novel models of sequence and tree evolution, performing epidemiological simulation experiments, and developing methods for predicting epidemic outcomes.
Research in the Mesirov Lab focuses on cancer genomics, applying machine-learning methods to functional data derived from patient tumors. The lab analyzes these molecular data to determine the underlying biological mechanisms of specific tumor subtypes, to stratify patients according to their relative risks of relapse, and to identify candidate compounds for new treatments. The overall goal is to treat patients as individuals, with care specific to their tumors. Importantly, the lab is committed to the development of practical, accessible software tools to bring these methods to the general biomedical research community.
The McVicker laboratory aims to understand how chromatin state and organization are encoded by the human genome. Our approach to this problem is to exploit naturally occurring human genetic variation to identify sequence variants that disrupt chromatin function. We are currently focused on chromatin within immune cells and we are also interested in how variants that affect chromatin and gene regulation lead to disease risk. The problems that we work on often require the development of sophisticated computational and statistical methods that can extract subtle signals from noisy experimental data.
Our lab specializes in reconstructing evolutionary histories (phylogenies) from large-scale datasets and in applying phylogenetic methods to downstream analyses. Large-scale datasets include those with many genes and those with many species, and we aim for high accuracy and scalability at the same time. Many projects in this area are available, some of which are described below, but students can contact me to start on other projects as well.
We are interested in the interface of theoretical computer science and systems biology. By thinking computationally about the goals, constraints, and algorithmic strategies used by biological systems, we hope to advance both computer science (by developing new bio-inspired algorithms) and biology (by raising testable hypotheses and developing theory and models to predict system behavior).