Biomedical Informatics Graduate Rotation Projects

This page is updated annually. Some projects may already be taken, and new projects may be available. The projects below give an indication of the types of projects available in each lab, but please browse faculty web pages and contact professors directly to discuss current opportunities.

Labs with BMI Rotation Projects

Vikas Bansal,

Robert El-Kareh,

Yoav Freund,

Christopher Glass,

Chun-Nan Hsu,

Jina Huh,

Trey Ideker,

Rob Knight,

Scott Rifkin,

Michael Rosenfeld,

Jonathan Sebat,

Gene Yeo,

Vikas Bansal | Pediatrics

Email Contact: vibansal [at] ucsd.edu

Research in our lab focuses on human genetic variation and its role in human disease, studied with high-throughput sequencing technologies. We develop tools for discovering DNA sequence variants and identifying disease-associated variants. Projects involve the development of computational and statistical methods, analysis of DNA sequencing data, and genetic association analysis.

Project: Identifying genetic variants associated with risk of diabetes

When: Any Quarter
Last updated: 08/12/2017

The goal of this project is to utilize large-scale exome and whole-genome sequence datasets to identify genetic variants associated with risk of type 2 diabetes. Our focus is on genes previously associated with risk for diabetes in humans and model organisms. We have recently identified a novel low-frequency variant that causes early-onset diabetes in a specific population.

Vikas Bansal | Pediatrics

Email Contact: vibansal [at] ucsd.edu

Project: Loss-of-function variants and RNA splicing

When: Any Quarter
Last updated: 08/12/2017

Loss-of-function variants represent the most deleterious class of variants in the human genome. Predicting whether a variant will lead to loss of function is not always straightforward. Some variants annotated as loss-of-function by in silico prediction tools can be 'rescued' by alternative splicing. Conversely, variants annotated as benign may cause loss of function by activating cryptic splice sites. Our goal is to leverage large-scale exome and RNA-seq datasets to identify variants that have an 'atypical' impact on gene transcription.

Vikas Bansal | Pediatrics

Email Contact: vibansal [at] ucsd.edu

Project: Variant detection in duplicated genes in the genome

When: Any Quarter
Last updated: 08/12/2017

A significant fraction of the human genome consists of repetitive or duplicated sequences that are problematic for short-read sequencing technologies since reads generated from such regions cannot be uniquely aligned to the reference genome. Hundreds of duplicated genes are associated with rare and complex human diseases. Our goal is to develop variant calling methods that enable the detection of variants in duplicated genes using sequencing methods and technologies that contain long-range information.

Robert El-Kareh | School of Medicine

Email Contact: relkareh [at] ucsd.edu

Project: Development of a Research Electronic Health Record for Clinical Decision Support Studies

When: Fall 2011, Winter 2012, Spring 2012

Creation of effective clinical decision support tools has the potential to significantly improve the quality of care delivered within our healthcare system. However, developing and testing prototypes of these tools requires access to realistic electronic health record (EHR) environments. This process has often involved prohibitively long turnaround times due to time and resource constraints of healthcare information systems groups. For research and educational purposes, these barriers could be avoided by creating an investigator-controlled research EHR and populating it with realistic clinical data. Such a system could enable researchers and students to develop a wide range of novel and innovative clinical decision support tools much more rapidly.

Aims:

  1. Install a sophisticated, open-source EHR (OpenMRS)
  2. Populate this EHR with deidentified data from a rich clinical database (MIMIC II)
  3. Develop one or more prototype clinical decision support tools within this environment

Yoav Freund | Computer Science and Engineering

Email Contact: yfreund [at] ucsd.edu

Project: Digital Mouse Brain Atlas

When: Summer 2013, Any Quarter
Last updated: 08/26/2013

This project is in collaboration with the Kleinfeld Lab (http://physics.ucsd.edu/neurophysics/) and the Mitra Lab (http://brainarchitecture.org/mouse/about).

The idea is to use a combination of machine learning and computer vision algorithms to create a digital atlas of the mouse brain.

This will require developing detectors for landmarks that exist in a majority of the brains. Because the data set is large (tens of terabytes), the work will involve using a Hadoop cluster.

Requirements: Python, computer vision, machine learning/statistics.
 

Christopher Glass | Cellular and Molecular Medicine

Email Contact: ckg [at] ucsd.edu

Dr. Glass's primary interest is in understanding the transcriptional mechanisms that regulate the development and function of macrophages. Macrophages play key roles in immunity, wound repair, development and tissue homeostasis. Dysregulation of macrophage functions contributes to a broad spectrum of human diseases, including atherosclerosis, diabetes, neurodegenerative diseases, and cancer. A major effort of the Glass laboratory is to use genomics assays and associated bioinformatics approaches to understand how macrophage gene expression programs are established and how they are influenced by different tissue environments and disease. An important concept to emerge from these studies is that enhancers can be exploited to deduce the transcription factors and upstream signaling pathways that drive context-specific transcriptional outputs. BMI students are welcome to perform rotations and can select projects from current areas of active investigation.

Project: Natural genetic variation and macrophage gene expression

When: Any Quarter
Last updated: 07/26/2017

Many lines of evidence, including genome-wide association studies, indicate that non-coding genetic variation plays a major role in determining phenotypic diversity. We were among the first laboratories to define the impact of natural genetic variation on enhancer selection and function (Heinz et al., Nature 2013 PMID 24121437), but at present it remains difficult to predict the impact of non-coding variation on gene expression. In a novel and ambitious effort, we systematically characterized the genome-wide patterns of mature RNA (RNA-seq), nascent RNA (GRO-seq), transcriptional initiation (5’GRO-seq), histone modifications and binding profiles of lineage-determining and signal-dependent transcription factors (ChIP-seq), DNA methylation (bisulfite sequencing), and chromatin conformation (HiC, capture HiC and PLAC-seq) in resting and activated macrophages derived from 5 different inbred strains of mice, which together provide ~60 million single nucleotide variants, ~6 million InDels and several hundred thousand structural variants. This data set provides a unique resource for investigating the impact of non-coding variants on transcription factor binding, enhancer activation and target gene expression. We are currently developing new computational methods for analyzing these data, with the goal of explaining the effects of non-coding mutations and predicting patterns of gene expression in new mouse strains. Related projects are investigating how genetic variation between selected mouse strains relates to their different susceptibilities to metabolic and cardiovascular disease. This general project area is both challenging and open-ended, and there is a wide range of directions that rotation projects could take. For example, recent rotation students have implemented machine learning approaches to investigate how sequence variants affect collaborative binding between lineage-determining transcription factors.

Christopher Glass | Cellular and Molecular Medicine

Email Contact: ckg [at] ucsd.edu

Project: Nature and nurture of microglia

When: Any Quarter
Last updated: 07/26/2017

Each population of tissue-resident macrophages exhibits a distinct pattern of gene expression that is tuned to the developmental and homeostatic needs of that tissue. For example, brain macrophages called microglia produce factors that are trophic for neurons and monitor synapses, functions that require a brain-specific program of gene expression. A key question is how this tissue-specific program of gene expression is achieved. Through analysis of gene expression and enhancer landscapes, we obtained evidence that the microglia-specific molecular phenotype results from instructive signals in the brain that direct the activation of microglia-specific enhancers (Gosselin et al., Cell 2014 PMID 25480297). Of particular interest, delineation of the gene expression patterns and enhancer landscapes of human microglia revealed that a substantial fraction of the genes associated with non-coding GWAS risk alleles are preferentially or exclusively expressed in microglia, and many depend on the brain environment (Gosselin et al., Science 2017 PMID 28546318). These findings raise several important questions that are under active investigation, including which environmental factors dictate the brain-specific program of gene expression and how human genetic variants affect the regulation of genes linked to neurodegenerative disease. We are taking a multi-disciplinary approach that includes studies of in vivo mouse models, in vitro human iPSC-derived microglia, genomic assays of microglia nuclei derived from control and Alzheimer’s disease brains, and direct analyses of the relation of genotype to gene expression in a growing RNA-seq database derived from purified human microglia. As an example, a recent rotation project investigated whether there is any relationship between the gene expression patterns of circulating monocytes (white blood cells that can differentiate into macrophages in tissues) and those of microglia from the same individual.

Olivier Harismendy | Pediatrics

Email Contact: oharismendy [at] ucsd.edu

The Oncogenomics laboratory is located in the Moores Cancer Center. Its research program focuses on identifying genetic and epigenetic markers for cancer prevention and progression as well as drug response. The laboratory is a hybrid laboratory, combining wet-lab techniques and bioinformatics analysis to study cancer samples from patients and animal models of cancer. The laboratory is also an important partner for multiple principal investigators at the Moores Cancer Center, collaborating on the design, analysis and interpretation of their genomic experiments.

Project: Development of Genomics Virtual Machines in HIPAA compliant cloud

When: Any Quarter
Last updated: 06/09/2016

Genetic information is considered protected health information (PHI), so the highest security standards must be applied to its storage, analysis and sharing. The Oncogenomics laboratory uses the state-of-the-art iDASH compute cloud for its main computation. We therefore participate in the development of optimal workflows and virtual machines for the analysis of patient-derived genomic datasets such as whole exomes, whole genomes, RNA-seq or genotyping arrays.

In this project we will develop robust provisioning methods to establish virtual machines capable of running popular human genomic analysis workflows. We will benchmark these machines and workflows and convert some of them into standard recipes for production-grade, reproducible genomic analysis.

Olivier Harismendy | Pediatrics

Email Contact: oharismendy [at] ucsd.edu

Project: Genetics and epigenetics of cisplatin resistance

When: Any Quarter
Last updated: 06/09/2016

Cisplatin (cDDP) is the most commonly used chemotherapeutic drug, but most cancers eventually become resistant, leading to tumor recurrence. Several biological processes may modulate cDDP sensitivity: drug import, export and detoxification, DNA repair, and apoptosis. Drug resistance is transmitted to daughter cells, and resistant cell lines can be built up in vitro using sequential treatments. We are interested in identifying the genetic mutations that mediate this resistance. For this, we have derived resistant cell lines from single clones of a cDDP-sensitive ovarian cancer cell line. Using exome sequencing as well as targeted sequencing, we propose to determine mutations in genes and pathways that drive drug resistance. We will then extend the findings to the TCGA samples, using time to recurrence as an indicator of drug sensitivity.

Olivier Harismendy | Pediatrics

Email Contact: oharismendy [at] ucsd.edu

Project: The role of inherited variation in cancer somatic landscape

When: Any Quarter
Last updated: 06/09/2016

The role of germline, or inherited, variation in cancer has been studied in selected families and has led to the identification of dominant genetic variants responsible for cancer syndromes. Similarly, rare recessive variants with lower penetrance are responsible for the increased risk of breast and ovarian cancer (BRCA1/2). More common variants in the population have also been identified through GWAS, which has revealed multiple SNPs associated with a modest increase in cancer risk. Despite these advances, many variants of intermediate allelic frequency in the population, or carried by patients with undocumented family history, remain variants of unknown significance (VUS) and may still play a role in tumor development. In addition, the contribution of variants located outside the coding region has been underexplored and can now be reexamined in the light of recent maps of the regulatory landscape. The long-term goal of this research is to use germline genetic variation in cancer prevention and care to better stage patients or predict their response to treatment.

We propose to identify the germline variants in UCSD Cancer Center patients (targeted gene panel) as well as in the public TCGA/ICGC datasets (whole genomes). We will then test these variants, alone or in combination, to identify the ones that impact cancer onset, the tumor somatic landscape or tissue-specific regulatory networks. The project will involve processing of high-throughput sequencing data, population genetics and statistical analysis in a HIPAA-compliant cloud-computing environment.

Chun-Nan Hsu | Biomedical Informatics

Email Contact: chunnan [at] ucsd.edu

Project: Electronic phenotyping

When: Any Quarter
Last updated: 09/29/2014

As more medical record data become available in electronic format, reusing these data for clinical research and healthcare quality improvement has become an important research topic. Selecting patients whose electronic medical records satisfy certain phenotypic conditions may require understanding and disambiguating free text in narrative notes. The project will develop algorithmic selection capabilities that can be used to enhance diagnostic decision-making.
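To make the free-text problem concrete, here is a minimal sketch of why a plain keyword search is not enough: a note that says "denies diabetes" mentions the term but should not select the patient. The negation cues, function names and example notes are all hypothetical and chosen for illustration only; this is a deliberately crude rule, not the method the project will develop.

```python
import re

# Hypothetical, deliberately small list of negation cues (an assumption).
NEGATION_CUES = ("no", "denies", "without", "negative for")


def mentions_condition(note, term):
    """Return True if some sentence mentions `term` with no negation cue before it."""
    for sentence in re.split(r"[.;\n]+", note.lower()):
        position = sentence.find(term.lower())
        if position == -1:
            continue  # term not mentioned in this sentence
        preceding = sentence[:position]
        negated = any(
            re.search(rf"\b{re.escape(cue)}\b", preceding) for cue in NEGATION_CUES
        )
        if not negated:
            return True
    return False


def select_patients(notes_by_patient, term):
    """Return the sorted IDs of patients whose note affirms the condition."""
    return sorted(
        patient_id
        for patient_id, note in notes_by_patient.items()
        if mentions_condition(note, term)
    )


# Made-up narrative notes, for illustration only.
notes = {
    "patient_1": "History of type 2 diabetes. Takes metformin daily.",
    "patient_2": "Patient denies diabetes. No chest pain.",
    "patient_3": "No fever. Diabetes diagnosed in 2010.",
}

print(select_patients(notes, "diabetes"))
```

A keyword search alone would also return patient_2. Real notes contain abbreviations, family history, hedged statements and misspellings that a rule this simple does not handle.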

Jina Huh | School of Medicine

Email Contact: jinahuh [at] ucsd.edu

My lab works in the fields of human-computer interaction, mobile health, social media, and underserved populations.

Project: InfoMediator: Weaving clinical expertise in online health communities

When: Any Quarter
Last updated: 07/21/2017

I am looking for a student who can develop browser-based, real-time text classifiers that determine what kind of information a forum post in an online health community needs, as soon as it is posted. The classification model (features and algorithms) already exists; I need the model implemented and its performance improved. For further information, please see:
http://jinahuh.net/infomediator-weaving-clinical-expertise-in-peer-patient-conversations/

Jina Huh | School of Medicine

Email Contact: jinahuh [at] ucsd.edu

Project: Physical activity improvement in elementary school kids through UNICEF bands

When: Fall 2017
Last updated: 07/21/2017

The protégé effect refers to kids changing their behavior more effectively when they do so for someone else (e.g., a friend, pet, or parent).

UNICEF bands are designed so that when kids wear them and are physically active, the collected data prompt funders to give needed goods, such as food, shelter, and school items, to kids in developing countries.

We are introducing UNICEF bands, physical activity trackers that kids wear on their wrists, to elementary schools. We are collecting data at baseline (before the kids wear the bands) and after they start wearing them, to test whether wearing the bands improves participation in their running clubs at school.

Students interested in this project will participate in data collection, data analysis, and publication of the research.

Jina Huh | School of Medicine

Email Contact: jinahuh [at] ucsd.edu

Project: SHINE-L: Improving Latino families' wellness through acoustic sensing

When: Any Quarter
Last updated: 07/21/2017

This project is funded by the NSF Smart and Connected Health program. We are working with Latino families to prevent child obesity through sensing and visualizing behavioral routines at home. For more information, please see:
http://jinahuh.net/fresh-improving-family-routines/
 

Lilia Iakoucheva | Psychiatry

Email Contact: lilyak [at] ucsd.edu

The lab has a variety of bioinformatics projects aimed at improving understanding of the functional impact of autism mutations identified by exome and genome sequencing of patients. We build spatio-temporal gene co-expression and protein interaction networks for psychiatric diseases, and we use these networks to generate testable hypotheses about the mechanisms of disease. We also test these hypotheses experimentally in the lab, thereby adding a translational aspect to our work.

Project: Evaluating the effect of splicing mutations on isoform networks in autism

When: Any Quarter
Last updated: 07/21/2017

The project involves constructing isoform-level co-expression and protein interaction networks to predict the functional impact of de novo splice site mutations found in patients with autism spectrum disorder (ASD). Hundreds of de novo splice site mutations have been identified in ASD patients, but no disease mechanism has been established for any of them. We will build and analyze isoform-level networks of brain co-expressed and physically interacting proteins; map de novo ASD mutations onto these networks to predict their functional impact; and validate the disrupted networks and pathways using CRISPR/Cas technology in neuronal and animal models. This project will discover and characterize cellular and molecular processes that are disrupted by de novo splice site ASD mutations.

Lilia Iakoucheva | Psychiatry

Email Contact: lilyak [at] ucsd.edu

Project: Integrative functional genomic study of pathways impacted by recurrent autism CNV

When: Any Quarter
Last updated: 07/21/2017

Copy number variants (CNVs) represent significant risk factors for Autism Spectrum Disorders (ASD). One of the most frequent CNVs involved in ASD is a deletion or duplication of the 16p11.2 locus, which spans 29 protein-coding genes. Despite progress in linking 16p11.2 genetic changes with phenotypic abnormalities (macrocephaly and microcephaly) in patients and model organisms, the specific molecular pathways impacted by this CNV remain unknown. To test the hypothesis that RhoA signaling is disrupted by this CNV, we will generate KCTD13 and CUL3 mouse models using the CRISPR/Cas9 system and investigate dysregulated molecular pathways using RNA-seq at various stages of the developing mouse fetal brain.

Trey Ideker | School of Medicine

Email Contact: tideker [at] ucsd.edu

The overall objective of the Ideker Laboratory is to develop an artificially intelligent model of the cell able to translate a patient's data into precision diagnosis and treatment. Towards this goal, we are developing methods that learn how to structure cell models directly from genomics data sets. For this purpose, we run an experimental facility for systematic measurement of gene and protein interaction networks. A second big challenge is to work out the functional logic by which these models process information, e.g., from genotype to phenotype. Here too, we have made recent progress, but much remains to be done before we have a cell model capable of making robust predictions about patients. As supporting software, we are developers of Cytoscape, a popular platform for visualization and modeling of biological networks, which is supported by a consortium of many labs including our own (http://www.cytoscape.org/).

Project: Using a hierarchical cellular model to analyze tumor genetic mutations

When: Fall 2017, Spring 2018
Last updated: 08/04/2017

The student will explore whether a hierarchical model we have recently constructed for predicting growth of simple cells can be translated to predict aggressiveness of human cancer. The model will be provided, along with access to tumor exomes from both public and internal sources. The goal is to determine, over a 10 week rotation, whether and to what extent the model can be used to analyze a patient's exome. If so, this project could be readily developed into a PhD thesis.

Prerequisites: Computer programming or scripting skills; some knowledge of genomic biology.

Trey Ideker | School of Medicine

Email Contact: tideker [at] ucsd.edu

Project: Computing a minimal set of genes required for life

When: Fall 2017, Spring 2018
Last updated: 08/04/2017

A long-standing question in biology is how many (and which) genes are required for life. This essential core set of genes, or minimal genome, makes up the cell's “life support system” or “chassis and power supply” on which more complex functions and processes are built. This set of genes is of keen interest in the field of synthetic biology, which aims to synthesize the complete minimal genome of an organism and add functions to this genome for biotechnological, pharmaceutical and agricultural ends. This project will attempt to use our whole-cell model of the networks and pathways in a cell to predict which genes and gene combinations are essential for life and, conversely, which genes and gene combinations can be removed. If successful, this project will be able to predict minimal genomes for synthesis and testing. It will also address whether there actually is a single “minimal genome” or whether there exist many different configurations, all of which are at or near the global minimum.
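The last question, whether one minimal genome exists or many, can be illustrated with a toy calculation. The sketch below assumes a hypothetical mapping from genes to the essential functions they provide and enumerates, by brute force, every smallest gene set that still covers all required functions. The gene names, function labels and the set-cover framing are illustrative assumptions, not the lab's whole-cell model.

```python
from itertools import combinations


def minimal_gene_sets(gene_functions, required):
    """Return all smallest gene subsets whose combined functions cover `required`."""
    genes = sorted(gene_functions)
    for size in range(len(genes) + 1):
        hits = [
            set(subset)
            for subset in combinations(genes, size)
            if required <= set().union(*(gene_functions[g] for g in subset))
        ]
        if hits:
            return hits  # every covering subset at the smallest possible size
    return []  # no subset covers the required functions


# Hypothetical toy genome: geneC and geneD are redundant for energy production.
gene_functions = {
    "geneA": {"replication"},
    "geneB": {"translation"},
    "geneC": {"energy"},
    "geneD": {"energy"},
}
required = {"replication", "translation", "energy"}

for gene_set in minimal_gene_sets(gene_functions, required):
    print(sorted(gene_set))
```

In this toy example two different three-gene configurations are equally minimal, because either geneC or geneD can supply the energy function. Exhaustive enumeration grows exponentially with the number of genes, so it is only workable for very small examples like this one.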

Prerequisites: Computer programming or scripting skills.
Optional: Experimental laboratory skills, which would allow the student to test model predictions.

Trey Ideker | School of Medicine

Email Contact: tideker [at] ucsd.edu

Project: Development of a software pipeline for generating cell function hierarchies from genomic data

When: Fall 2017, Spring 2018
Last updated: 08/04/2017

We have developed algorithms (NeXO and CliXO) that use systematic datasets to organize genes into a gene ontology reflecting the hierarchical organization of cellular structures and molecular pathways in the cell. Currently these algorithms are coded in Python; however, a user-friendly and expandable interface would allow end-users to quickly build and update gene ontologies from new data sets. Coding this interface is the main goal of the rotation. If successful, this tool could seed a thesis project to construct a gene ontology for a particular cellular process (e.g. DNA damage response) or disease (e.g. cancer) of interest.

Prerequisites: Computer programming or scripting skills; some knowledge of genomic biology.

Trey Ideker | School of Medicine

Email Contact: tideker [at] ucsd.edu

Project: Experimental mapping of the DNA damage response

When: Fall 2017, Spring 2018
Last updated: 08/04/2017

Cell colonies on agar grow in a near-linear fashion, with growth rates that reflect their "fitness". The laboratory has developed an experimental platform that makes continuous measurements of growth rates via time-lapse image capture of thousands of specific genetic mutant strains, enabling us to determine the relevance of every gene in the response to stimuli such as DNA damage via radiation or chemotherapy. During the rotation the student will grow ~50,000 cell colonies in parallel under intermittent radiation exposure and capture their growth curves from digital images. The project includes working in MATLAB to analyze growth curves and elucidate DNA damage response pathways. If successful, the project could be developed into a thesis that uses these data to construct a hierarchical model of DNA damage responses.
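As a rough illustration of how a growth rate could be read off a near-linear growth curve, the sketch below fits a least-squares line to colony size over time and compares a mutant's slope to a reference strain. It is written in Python rather than MATLAB purely for illustration. The measurements, the function names and the ratio used as relative fitness are assumptions, not the laboratory's analysis pipeline.

```python
def growth_rate(times, sizes):
    """Least-squares slope of colony size against time."""
    n = len(times)
    if n < 2 or n != len(sizes):
        raise ValueError("need at least two paired measurements")
    mean_t = sum(times) / n
    mean_s = sum(sizes) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be equal")
    sxy = sum((t - mean_t) * (s - mean_s) for t, s in zip(times, sizes))
    return sxy / sxx


def relative_fitness(mutant_rate, reference_rate):
    """Growth rate of a mutant expressed as a fraction of the reference rate."""
    return mutant_rate / reference_rate


# Made-up time-lapse readings: hours since plating and colony area in pixels.
hours = [0, 2, 4, 6, 8]
reference_strain = [10, 20, 30, 40, 50]
mutant_strain = [10, 15, 20, 25, 30]

reference_rate = growth_rate(hours, reference_strain)
mutant_rate = growth_rate(hours, mutant_strain)
print(reference_rate, mutant_rate, relative_fitness(mutant_rate, reference_rate))
```

Here the mutant grows at half the reference rate. With real images, colony area would first have to be segmented from each frame, and the fit would likely need to tolerate noise and the effect of each radiation exposure.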

Prerequisites: Prior experience in a genetics or biochemistry experimental laboratory.

Rob Knight | Pediatrics

Email Contact: robknight [at] ucsd.edu

The Knight lab has broad interests in the human microbiome, the collection of trillions of microbes that inhabit our bodies. We are especially interested in developing techniques to read out these complex microbial communities and in using the resulting data to understand human health and the links between humans and the environment, and to prevent and cure disease. We offer a fast-paced environment with many collaborative opportunities on different projects.

Project: Machine Learning for the Microbiome

When: Any Quarter
Last updated: 09/05/2017

We have amassed a database of microbial DNA sequences from hundreds of thousands of biological specimens. Understanding how differences among these microbial communities relate to disease requires a range of machine learning and multivariate statistical approaches. There are many opportunities, ranging from entry-level (benchmarking classifier performance on specific sample sets) to extremely challenging (using deep learning to infer the structure of global sample set relationships).

Project: Multi-omics integration

When: Any Quarter
Last updated: 09/05/2017

An increasing need is to integrate data from different "omics" levels, e.g. genomes, metagenomes, metatranscriptomes, metaproteomes, metabolomes, immunological profiling, etc., into a single coherent picture separating healthy and disease states. Improved methods for performing this task, either directly or via intermediate representations such as mapping to metabolic and regulatory pathways, are essential for improving understanding. Projects in this category range from simple (testing where existing techniques like correlation networks or Procrustes analysis do or do not connect two specific data layers) to challenging (using transfer learning to integrate heterogeneous data layers and improve the underlying network annotation). An especially exciting emerging research direction here is XAI (explainable artificial intelligence), which can provide a better justification for a specific classification or suggestion in clinical applications.

Project: Optimizing microbiome algorithms

When: Any Quarter
Last updated: 09/05/2017

Many algorithms used in microbiome studies, especially in metagenomic assembly, are extremely computationally expensive. Opportunities exist either for exploiting new hardware architectures to accelerate existing algorithms or for developing new approximate algorithms to tackle problems in the workflow, including inferring taxonomy and function from DNA sequence data, genome and metagenome assembly and annotation, computing community distance metrics from sparse compositional data, and high-level analyses of hundreds of thousands of microbiomes. Again, these projects range from entry-level (comparing the results of two multiple sequence alignment techniques for subsequent community analysis) to advanced (using non-von Neumann architectures to perform pattern classification in real time at the whole-community level for disease detection).
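
To make "community distance metrics from sparse compositional data" concrete, here is a minimal sketch of one widely used measure, the Bray-Curtis dissimilarity, computed directly on sparse count profiles. The project description does not name a specific measure, so this choice is an assumption made for illustration, and the taxon names and counts are hypothetical.

```python
from typing import Dict


def bray_curtis(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Bray-Curtis dissimilarity between two sparse count profiles.

    Each profile maps a taxon (or other feature) name to a non-negative
    count; taxa missing from a profile are treated as zero, so only the
    observed entries are ever stored or visited.
    """
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        raise ValueError("both samples are empty")
    diff = sum(abs(a.get(k, 0.0) - b.get(k, 0.0)) for k in a.keys() | b.keys())
    return diff / total


# Hypothetical samples: taxon name -> read count.
sample_1 = {"Bacteroides": 1, "Prevotella": 3}
sample_2 = {"Faecalibacterium": 2, "Prevotella": 3}

print(bray_curtis(sample_1, sample_2))
```

The value is 0 for identical profiles and 1 for profiles that share no taxa. Computing it for every pair among hundreds of thousands of samples scales quadratically in the number of samples, which is one reason approximate or hardware-accelerated approaches become attractive.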

Lucila Ohno-Machado | School of Medicine

Email Contact: machado [at] ucsd.edu

Project: Bioinformatics Rotation Projects Available

When: Any Quarter
Last updated: 08/21/2012

Scott Rifkin | Biological Sciences

Email Contact: sarifkin [at] ucsd.edu

The Rifkin laboratory studies how environmental, genetic, and stochastic variation interact to generate phenotypic variation and thereby mold the course of evolution. We use yeasts and nematodes as model organisms and work primarily at the level of gene regulatory and signal transduction networks.

Project: Constraints and selection that drive gene family evolution

When: Winter 2018, Spring 2018, Summer 2018
Last updated: 08/02/2017

High-quality genome sequences that cover entire genera are increasingly available. This facilitates fine-scale investigations of the forces that drive protein evolution. We are using the Caenorhabditis (roundworm) genus and the Saccharomyces (yeast) genus to study the role of selection and constraint on gene family evolution in the context of developmental and physiological networks. A rotation project would include phylogenetic analyses of gene families, evolutionary tests for selection and constraint, and integration with mechanistic and functional data from systems biology.

Project: Statistics on the morphogenesis of hybrid inviability

When: Winter 2018, Spring 2018, Summer 2018
Last updated: 08/02/2017

Hybrids between different species do not, as a rule, do well. We are using microscopy, image informatics, and statistics to understand the developmental biology of hybrid incompatibility. We are imaging the complete development (3D position of every nucleus over time) of hybrid worm embryos to determine whether there are particularly sensitive parts of development where genomes of different species prove especially incompatible, even if other parts of development proceed properly. A rotation project on this topic would include image informatics for processing the microscopy images and statistical investigation of variation in this complicated trait.

Michael Rosenfeld | School of Medicine

Email Contact: mrosenfeld [at] ucsd.edu

Lab Location: CMM-West, Rm. 345

Lab Phone: 858-534-5858

Lab Composition and Activities: Five graduate students from several programs, a talented group of enthusiastic (and helpful) postdoctoral fellows, and a full-time laboratory manager. We have one general laboratory meeting, one graduate student-only meeting, and one personal meeting each week. We also have weekly joint lab meetings with two other labs.

Research Interests: Our central laboratory focus this year is to continue to use global genomic approaches to uncover and investigate the “enhancer code” controlled by new, previously unappreciated pathways that integrate the genome-wide response to permit proper development and homeostasis, and that also function in disease and senescence. We have investigated these events in differentiated cells, neuronal development, stem cells, and cancer. Our biological focus is on molecular mechanisms of the “enhancer code” regulating learning and memory, aggressive prostate and breast cancer, and the underlying events of senescence/aging. Epigenomic events studied include non-histone methylation events and non-coding RNAs. We are investigating these events in development, breast and prostate cancers, and inflammation-based disease, including degenerative CNS disease and diabetes. The emerging importance of non-coding RNAs and regulation of nuclear architecture is rapidly altering our concepts of homeostasis and disease. Our laboratory is “Seq-ing” (RIP-seq, ChIP-seq, RNA-seq, GRO-seq, CLIP-seq, ChIRP-seq, and a new “FISH-seq”) for open-ended discovery of long-distance genome interactions, to uncover new “rules” of regulated gene transcriptional programs and new roles for lncRNAs in the biology of normal, cancer, and aging cells and of neuro-affective disorders. Coupling this with chemical library screens, we hope to introduce new types of therapies based on targeting specific gene enhancers, histone protein readers and writers, and lncRNAs for cancers and other diseases. Recent surprising findings include novel roles of lncRNAs in prostate and breast cancer, a connection between DNA damage repair/transcription and replication, and unexpected roles of enhancer RNAs.

Current interests include:

  • The “enhancer code,” epigenomics, and transcriptional regulatory mechanisms.
  • Roles of ncRNAs in enhancer function in signal-dependent genomic relocation and in establishing subnuclear architecture.
  • Mechanisms of signal-induced tumor chromosomal translocation events and new chemical screens for inhibitors for breast and prostate cancer.
  • The “enhancer code” for regulation of learning and memory, including Reelin-regulated enhancers.
  • Linkage of DNA damage/repair and transcription.
  • Retinoic acid regulation of Pol III-transcribed DNA repeats in maintenance of the stem cell state, in neuronal differentiation, and in senescence.
  • Molecular mechanisms of prevalent disease-associated sequence variations (GWAS) in disease susceptibility loci.
  • “Epigenomics” in neuronal differentiation, cancer, diabetes, and degenerative brain disease.
  • Answering the question of when and how enhancers arise and become functional (stem cells to mature cell types).

Project: Bioinformatics Rotation Projects

When: Any Quarter
Last updated: 08/12/2013

Potential projects include:

  • Projects employing genome-wide technologies, including ChIP-seq, GRO-seq, CLIP-seq, RNA-seq, and ChIRP-seq, to elucidate molecular mechanisms of regulated enhancer lncRNA actions in cancer and stem cells;
  • Roles and mechanisms of enhancer actions in prostate and breast cancers;
  • Enhancer-based model of neurodevelopment and CNS disorders;
  • New mechanisms of long non-coding RNAs dictating physiological gene regulation in cancer transcriptional programs;
  • Understanding subnuclear structures: Roles of relocation of transcription units between subnuclear architectural structures in regulated gene expression;
  • Chemical library screens to gene signature and translocation responses as an approach toward new cancer therapeutic reagents;
  • Roles of epigenomic regulators and expression of DNA repeats in stem cells, neuronal differentiation and in senescence.

Jonathan Sebat | Cellular and Molecular Medicine

Email Contact: jsebat [at] ucsd.edu

Our laboratory is interested in how rare and de novo mutations in the human genome contribute to patterns of genetic variation and risk for disease in humans. To this end, we are developing novel approaches to gene discovery that are based on advanced technologies for the detection of rare variants, including studies of copy number variation (CNV) and deep whole genome sequencing (WGS). Our goal is to identify genes related to psychiatric disorders and determine how genetic variants impact the function of genes and corresponding cellular pathways.

Project: Determining the effect of autism mutations on development of the head and face

When: Any Quarter
Last updated: 06/09/2016

We have collected whole genome sequence data and 3D digital images of the head and face from a set of 300 autism families. This project will examine quantitative measurements of facial features in autism patients and sibling controls and determine the degree to which specific mutations affect craniofacial structure. We will apply unsupervised clustering of genetic and phenotype data to define diagnostic subgroups of patients.

Project: Determining the frequency of spontaneous reversion in the human genome

When: Any Quarter
Last updated: 06/09/2016

Structural variants (SVs) in the human genome are poorly ascertained in genome-wide association studies (GWAS). Tandem duplications in particular are not efficiently tagged by adjacent SNPs. The reasons for this are not known. We hypothesize that SVs, once formed, create local instability resulting in a high rate of spontaneous reversion. This project will directly determine the rates of spontaneous reversion in whole genomes of 300 trio families. In addition, we will examine the local patterns of genetic variation adjacent to SVs to infer the occurrence of reversion events.

Project: Identifying human essential genes by deletion mapping of a large population

When: Any Quarter
Last updated: 06/09/2016

Studies of genetic variation in large populations make it possible to determine the degree of natural selection acting on specific sequences. Our lab has mapped structural variation (SV, including deletions and duplications) in large samples (N>100,000). By generating a null model based on regional patterns of SV, we propose to identify sequences that deviate dramatically from expectations. Sequences that display extreme deviation are likely to be genes that are essential for life.
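
One simple way to picture the null-model idea is sketched below: take a regional deletion rate, scale it by a sequence's length to get an expected count, and ask how surprising the observed count is. The Poisson form of the null, the function names, and the example numbers are all assumptions made for illustration; the project description does not specify how the null model is built.

```python
import math


def expected_deletions(regional_rate_per_kb: float, length_kb: float) -> float:
    """Expected deletion count for a sequence under a uniform regional rate."""
    if regional_rate_per_kb < 0 or length_kb < 0:
        raise ValueError("rate and length must be non-negative")
    return regional_rate_per_kb * length_kb


def poisson_lower_tail(observed: int, expected: float) -> float:
    """P(X <= observed) for X ~ Poisson(expected)."""
    if observed < 0 or expected < 0:
        raise ValueError("observed and expected must be non-negative")
    term = math.exp(-expected)  # P(X = 0)
    total = term
    for k in range(1, observed + 1):
        term *= expected / k    # P(X = k) derived from P(X = k - 1)
        total += term
    return min(total, 1.0)


# Hypothetical gene: 20 kb long, in a region averaging 0.5 deletions per kb
# across the cohort, with no deletions actually observed.
expected = expected_deletions(regional_rate_per_kb=0.5, length_kb=20.0)
depletion_p = poisson_lower_tail(observed=0, expected=expected)
print(f"expected {expected:.1f} deletions, P(X <= 0) = {depletion_p:.2e}")
```

A very small lower-tail probability marks a sequence as depleted of deletions relative to its neighborhood, which is the kind of extreme deviation the project associates with essential genes. A genome-wide scan would also need multiple-testing correction and a check that the Poisson assumption fits the data.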

Yingxiao (Peter) Wang | Bioengineering

Email Contact: yiw015 [at] eng.ucsd.edu

Our research focuses on molecular engineering for cellular imaging and reprogramming, and image-based bioinformatics, with applications in stem cell differentiation and cancer treatment.

Project: Image-based reconstruction of biochemical networks in live cells

When: Any Quarter
Last updated: 06/02/2016

Fluorescence resonance energy transfer (FRET)-based biosensors have been widely used in live-cell imaging to accurately visualize specific biochemical activities. We have developed the Fluocell image analysis software package to efficiently and quantitatively evaluate intracellular biochemical signals in real time, and to provide statistical inference on the biological implications of the imaging results. However, important questions arise about how to use these results to reconstruct the quantitative parameters of the underlying biochemical networks, which determine cellular functions and ultimately cell fates. In this rotation project, we will integrate optimization-based machine learning approaches with biochemical network models to seek answers to these questions, with applications in cancer treatment against drug resistance.

Gene Yeo | Cellular and Molecular Medicine

Email Contact: geneyeo [at] ucsd.edu

We have a wide scope of projects, ranging from developing novel algorithms for studying RNA processing in disease, development, and personalized medicine to analyzing single-cell RNA-seq data.

Project: Single-cell RNA-seq analysis

When: Any Quarter
Last updated: 06/02/2016

We have projects that deal with developing new algorithms for single-cell RNA-seq analysis pertaining to studying heterogeneity in complex mixtures of cells upon environmental challenges.