Bioinformatics and Systems Biology Rotation Projects

This page is updated annually. Some projects may already be taken, and new projects may be available. The projects below give an indication of the types of projects available in each lab, but please browse faculty web pages and contact professors directly to discuss current opportunities.

Labs with BISB Rotation Projects

Ruben Abagyan

Vineet Bafna

Nuno Bandeira

Vikas Bansal

Steve Briggs

Pieter Dorrestein

Joseph Ecker

Charles Elkan

Kelly Frazer

Yoav Freund

Kyle Gaulton

Christopher Glass

Joseph Gleeson

Lawrence Goldstein

Melissa Gymrek

Nan Hao

Jeff Hasty

Vivian Hook

Trey Ideker

Amy Kiger

Richard Kolodner

Julie Law

  • Julie Law, Salk Institute for Biological Studies

Prashant Mali

Andrew McCammon

Graham McVicker

Siavash Mirarab

Saket Navlakha

Pavel Pevzner

Anjana Rao

Bing Ren

  • Bing Ren, Cellular and Molecular Medicine

Doug Richman

Scott Rifkin

Michael Rosenfeld

Julian Schroeder

Dorothy Sears

Jonathan Sebat

Susan Taylor

Glenn Tesler

Wei Wang

Roy Wollman

Gene Yeo

  • Gene Yeo, Cellular and Molecular Medicine

Kun Zhang

Sheng Zhong

Ruben Abagyan | School of Pharmacy

Email Contact: rabagyan [at] ucsd.edu

Our field of research is computational structural chemical biology, pharmacology, toxicology, and structure-based drug discovery. We develop mathematical and computational methods for molecular mechanics, docking, visualization and cheminformatics, and apply them. We also maintain several web servers and our own large-scale clusters for massively parallel calculations. See our website http://ablab.ucsd.edu for more details.

Project: Building a mixed docking/pharmacophore based engine for activity prediction

When: Any Quarter
Last updated: 08/16/2013

The lab has now built a collection of about 1000 pocket ensembles. Computationally docking a new compound to these ensembles and scoring it will predict the compound's activity. The project involves building mixed models in which the docking score weights are optimized and the docking score is complemented with a different kind of score based on the continuous pharmacophoric models. The resulting models will be tested for their ability to predict a selectivity profile of the test compounds. This project will involve cluster or distributed computing.
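As a rough illustration of what a "mixed model" could look like, the sketch below combines a docking score and a pharmacophore score through a single weight and picks the weight by grid search on AUC. It is only a toy under stated assumptions: both scores are taken to be rescaled so that higher means more likely active, the function names are hypothetical, and the data are invented. It is not the lab's actual scoring engine.

```python
def auc(scores, labels):
    """Rank-based AUC: chance that a random active outscores a random inactive."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mixed_score(docking, pharmacophore, w):
    """Convex combination of the two scores (assumed: higher = more active)."""
    return w * docking + (1.0 - w) * pharmacophore


def fit_weight(compounds, labels, steps=20):
    """Grid-search the weight w in [0, 1] that maximizes AUC on training data."""
    best_w, best_auc = 0.0, -1.0
    for i in range(steps + 1):
        w = i / steps
        scores = [mixed_score(d, p, w) for d, p in compounds]
        current = auc(scores, labels)
        if current > best_auc:
            best_w, best_auc = w, current
    return best_w, best_auc


# Invented (docking, pharmacophore) scores for six compounds; 1 = active.
compounds = [(0.9, 0.2), (0.8, 0.7), (0.3, 0.9), (0.2, 0.1), (0.4, 0.3), (0.1, 0.6)]
labels = [1, 1, 1, 0, 0, 0]

docking_only_auc = auc([d for d, _ in compounds], labels)
best_w, best_auc = fit_weight(compounds, labels)
print(f"docking only AUC={docking_only_auc:.3f}, mixed AUC={best_auc:.3f} at w={best_w:.2f}")
```

In this toy data neither score alone separates actives from inactives, while an intermediate weight does, which is the motivation for complementing one score with the other.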

Project: Data visualization: 3D trees

When: Any Quarter
Last updated: 08/16/2013

This project will focus on the development of a server for a new kind of visualization of large data sets. Feel free to contact me for further details.

Project: Development of activity and toxicity models related to anti-targets

When: Any Quarter
Last updated: 08/16/2013

Many chemicals and metabolites can cause serious adverse effects of the kind listed on drug labels. We have gradually accumulated a large body of data relating some of these activities to binding to specific proteins. This project will involve the development and optimization of so-called continuous pharmacophoric models, which can be built without knowledge of the three-dimensional structure of the target. The models will be built for several key anti-targets: hERG, PPARg, ESR, D2, 5HT2c and 5HT2b, M1, etc. The models will be tested against the known therapeutics and their metabolites.

Vineet Bafna | Computer Science and Engineering

Email Contact: vbafna [at] ucsd.edu

Our lab is focused on the design and implementation of algorithms for biological data interpretation. Within this broad framework, we have a number of open projects relating to problems in proteomics (interpretation of mass spectrometry data), genetics, and genomics. The projects listed below are a small sampling of available projects. Interested students should have taken a class in algorithm design and have some facility with machine learning approaches.

Project: Genome query

When: Fall 2011, Winter 2012
Last updated: 08/23/2011

We are developing a tool-kit for archiving and querying genomic information. Part of the project involves developing a compiler for interpreting the query language.

Project: MS imaging

When: Fall 2011, Winter 2012
Last updated: 08/23/2011

MS imaging is a technique for visualizing proteins in their spatial context. We have projects on analyzing MSI data, clustering, identifying signatures of interest, and identifying peptides.

Nuno Bandeira | Computer Science and Engineering

Email Contact: nbandeira [at] ucsd.edu

The Bandeira lab develops novel computational mass spectrometry approaches for the discovery and characterization of biomarker metabolites, proteins, post-translational modifications and protein-protein interactions, with the ultimate goal of substantially improving the capabilities of proteomics discovery pipelines towards the development of novel drug therapeutics.

Project: Computational Mass Spectrometry

When: Fall 2011, Winter 2012, Spring 2012, Summer 2012
Last updated: 08/22/2011

Several rotation projects are available, each requiring a different mix of algorithmic, programming and data analysis skills:

  1. Characterization of protein sequence polymorphisms in drug-resistant TB strains.
  2. Categorization of post-translationally modified opioid neuropeptides.
  3. Discovery of novel post-translational modifications in human cancer.
  4. Early detection of biosimilar precursors of current and possible therapeutic drugs, including proteomic analysis of monoclonal antibodies and venom proteins.

Vikas Bansal | Pediatrics

Email Contact: vibansal [at] ucsd.edu

Research in our lab is focused on human genetic variation and understanding its role in human disease using a combination of computational algorithms, statistical analysis and data from high-throughput DNA sequencing technologies. We have a number of projects available for rotation students that have the potential to become thesis projects. Students with an interest in developing new algorithms/methods and analyzing high-throughput DNA sequencing data are welcome to contact us.

Project: PCR duplication rate in high-throughput DNA sequencing experiments

When: Winter 2016, Spring 2016, Summer 2016, Any Quarter
Last updated: 01/13/2016

PCR amplification is an important step in the preparation of DNA sequencing libraries for high-throughput sequencers. Estimating the true PCR duplication rate is important to avoid bias in downstream analyses such as variant calling (DNA-seq), estimating gene expression levels (RNA-seq) and detection of allele-specific binding (ChIP-seq). In this project, the goal is to estimate and analyze the PCR duplication rate using statistical methods that utilize genetic variation present in an individual human genome.
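One way to see how genetic variation can inform a duplication estimate is sketched below. It assumes an idealized setting that is not taken from the project description: reads flagged as duplicates (same mapping coordinates) are compared at heterozygous sites, true PCR duplicates always carry the same allele, and two independent fragments carry different alleles half of the time. Under those assumptions the observed discordant fraction f satisfies f = (1 - d) / 2, so d = 1 - 2f. The function name and input format are hypothetical, and sequencing errors and allelic bias are ignored.

```python
def estimate_pcr_duplicate_fraction(allele_pairs):
    """Estimate the share of flagged duplicate pairs that are true PCR duplicates.

    allele_pairs: list of (allele_a, allele_b) seen at a heterozygous site
    for two reads that share mapping coordinates (hypothetical input format).
    """
    if not allele_pairs:
        raise ValueError("need at least one read pair covering a heterozygous site")
    discordant = sum(1 for a, b in allele_pairs if a != b)
    f = discordant / len(allele_pairs)
    # Independent fragments disagree half of the time, true duplicates never do.
    return max(0.0, 1.0 - 2.0 * f)


# Invented data: 100 flagged pairs, 10 of which carry different alleles.
pairs = [("A", "A")] * 90 + [("A", "G")] * 10
estimate = estimate_pcr_duplicate_fraction(pairs)
print(f"estimated true PCR duplicate fraction: {estimate:.2f}")
```

The point of the sketch is that pairs with identical coordinates but different alleles cannot be PCR copies of one molecule, which is what lets heterozygous sites separate true duplicates from coincidental ones.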

Project: Small insertion/deletion variants in the human genome

When: Winter 2016, Spring 2016, Summer 2016, Any Quarter
Last updated: 01/13/2016

Short insertions/deletions (indels) represent the second most frequent form of variation in the human genome and are frequently the causal mutation in rare Mendelian disorders. Short indels are over-represented in low-complexity regions of the genome that are prone to errors during library preparation and sequencing and to misalignment after sequencing. We currently lack an accurate estimate of the number of short indels in an individual human genome. The goal of this project is to develop new methods for the accurate detection of indels in an individual human genome and to build a gold standard set of indels using multiple types of DNA sequencing datasets, including long-read PacBio and Hi-C.

Project: Variant calling in difficult regions of the genome using Hi-C data

When: Winter 2016, Spring 2016, Summer 2016, Any Quarter
Last updated: 01/13/2016

Hi-C technology, a method originally developed to study the 3-D architecture of genomes, has found many applications in the assembly and analysis of genetic variation in human genomes. We have previously published a method to generate chromosomal-scale haplotypes using whole-genome Hi-C data (Nature Biotech. 31, 1111–1118, 2013). In this project, the goal is to utilize Hi-C read pair data to identify mutations in regions of the genome that are not accessible using standard DNA sequencing approaches. This project potentially involves both computational methods development and generation of sequence data using the Hi-C method.

Steve Briggs | Biological Sciences

Email Contact: sbriggs [at] ucsd.edu

We model relationships between the proteotypes and the phenotypes of cells/organisms, with an emphasis on innate immunity in plants. Proteotypes are measured using custom methodology for high-throughput proteomics based on mass spectrometry. Students have an opportunity to integrate training in bioinformatics with chemistry and biology.

Project: Plant systems biology

When: Any Quarter
Last updated: 07/15/2014

We have generated the most comprehensive quantitative proteomics description of a higher eukaryote. The data include protein abundance and phosphorylation levels of 20,000 proteins across 34 tissues and stages of development in maize (corn), and they are paired with RNAseq data. Projects are available for students to use machine learning (random generalized linear models) to model the relationships between regulators and targets. For example, we are comparing the protein abundance and phosphorylation levels of transcription factors to the mRNA abundance of all genes. There are many other types of regulators whose roles can be explored using this dataset. We evaluate modeled regulatory relationships using transgene expression in protoplasts. Models are assembled into networks and integrated with pathway annotations to predict phenotypic consequences of synthetic genes. These predictions are tested in transgenic plants. Students of bioinformatics and systems biology who wish to integrate computational with experimental training in chemistry and biology are especially encouraged to apply.
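To make the regulator-to-target comparison concrete, here is a deliberately simplified sketch that ranks candidate regulators by the Pearson correlation between their protein abundance and a target gene's mRNA abundance across tissues. Plain correlation is swapped in here for illustration only; it is far simpler than the random generalized linear models mentioned above. The names `TF_A` and `TF_B` and all numbers are invented.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx and syy else 0.0


def rank_regulators(regulator_profiles, target_profile):
    """Sort regulators by absolute correlation with the target, strongest first."""
    scored = [(name, pearson(profile, target_profile))
              for name, profile in regulator_profiles.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)


# Invented abundances across five tissues.
regulators = {
    "TF_A": [1.0, 2.0, 3.0, 4.0, 5.0],   # protein abundance of a hypothetical factor
    "TF_B": [5.0, 1.0, 4.0, 2.0, 3.0],
}
target_mrna = [2.0, 4.0, 6.0, 8.0, 10.5]

ranked = rank_regulators(regulators, target_mrna)
for name, r in ranked:
    print(f"{name}: r = {r:+.3f}")
```

A ranking like this only suggests candidate relationships; as the description notes, modeled relationships are then evaluated experimentally.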

Pieter Dorrestein | School of Pharmacy

Email Contact: pdorrestein [at] ucsd.edu

Project: Mapping the mass spectrometry chemical space for structure

When: Open
Last updated: 08/31/2010

Chemical space is enormous; however, molecules that are chemically related behave similarly in mass spectrometry. Mass spectrometry generates numerical output, and these numbers can be correlated. In this project we want to build a comparative network to generate a predictive structural classification. If successful, this project will transform therapeutic discovery and the investigation of signalling molecules, and will benefit disease diagnosis.
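The idea of correlating spectra and linking similar ones into a network can be sketched as follows. This is a minimal illustration, assuming spectra are given as lists of (m/z, intensity) peaks, binned to a fixed width and compared with plain cosine similarity. The bin width, the 0.7 threshold, the spectrum names and the peak values are all assumptions made for the example, and the sketch ignores refinements such as precursor mass shifts.

```python
import math
from itertools import combinations


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins."""
    binned = {}
    for mz, intensity in peaks:
        key = int(mz // bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine(a, b):
    """Cosine similarity between two binned spectra stored as dicts."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_network(spectra, threshold=0.7):
    """Return edges (name_u, name_v, similarity) for pairs at or above the threshold."""
    binned = {name: bin_spectrum(peaks) for name, peaks in spectra.items()}
    edges = []
    for u, v in combinations(sorted(binned), 2):
        similarity = cosine(binned[u], binned[v])
        if similarity >= threshold:
            edges.append((u, v, similarity))
    return edges


# Invented spectra: A and B share fragment masses, C does not.
spectra = {
    "A": [(100.1, 10.0), (150.2, 50.0), (200.3, 100.0)],
    "B": [(100.2, 12.0), (150.1, 45.0), (200.4, 90.0)],
    "C": [(120.5, 80.0), (310.0, 100.0)],
}
edges = build_network(spectra)
for u, v, similarity in edges:
    print(f"{u} -- {v}: cosine = {similarity:.3f}")
```

Connected groups in such a network are candidates for sharing structural features, which is the starting point for a structural classification.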

Project: Predictable disease prognosis network interactions of microbial interactions

When: Open
Last updated: 08/31/2010

For every human cell there are 10 microbes. Microbes are important not only for our metabolism but also for the proliferation and control of diseases. Our lab has been developing imaging mass spectrometry based tools to understand how neighboring microbes control the proliferation of pathogens. In this project we want to come up with a predictive model of the signalling molecules and microbial interactions.

Joseph Ecker | Biological Sciences

Email Contact: ecker [at] salk.edu

"Nature vs. nurture, genes vs. environment: what is more important? My group is interested in understanding the roles of genetic and 'epigenetic' processes in cell growth and development. By understanding how the genome and epigenome talk to one another, we hope to be able to untangle the complexity of gene regulatory processes that underlie development and disease in plants and humans." Although the human genome sequence lists almost every single DNA base of the roughly 3 billion bases that make up a human genome, it doesn't tell biologists much about how its function is regulated. That job belongs to the epigenome, the layer of genetic control beyond the regulation inherent in the sequence of the genes themselves. Being able to study the epigenome in its entirety promises a better understanding of how genome function is regulated in health and disease, as well as how gene expression is influenced by diet and the environment. One of the ways epigenetic signals can tinker with genetic information is through DNA methylation, a chemical modification of one letter, C (cytosine), of the four letters (A, G, C, and T) that comprise our DNA. In the last couple of years, Ecker's laboratory started to zero in on genomic methylation patterns, which are essential for normal development and associated with a number of key cellular processes, including carcinogenesis. To ascertain how the epigenome of a differentiated cell differs from the epigenome of a pluripotent stem cell, his team used an ultra high-throughput methodology to determine precisely whether or not each C in the genome is methylated and to layer the resulting epigenomic map upon the exact genome it regulates. The study revealed a highly dynamic, yet tightly controlled, landscape of chemical signposts known as methyl groups and resulted in the first detailed map of the human epigenome, comparing the epigenomes of human embryonic stem cells and differentiated connective cells from the lung called fibroblasts.
The head-to-head comparison brought to light a novel DNA methylation pattern unique to stem cells, which may explain how stem cells establish and maintain their pluripotent state. Now that they are able to create high-resolution maps of the human epigenome, Ecker's group will begin to examine how it changes during normal development as well as in a variety of disease states.

Project: Epigenetic Variation and Inheritance

When: Any Quarter

The goal of this project is to understand the degree of epigenetic and genetic variation that occurs in Arabidopsis (a reference organism for all plants). The rotation would involve characterization of DNA methylation, transcription and chromatin patterns in hundreds of individual strains collected from around the world. The project is expected to reveal new roles of epigenetic inheritance in many biological processes, including some that affect plant fitness and adaptation to climate.

Project: Human ES/iPS Epigenomes

When: Any Quarter

Our aim is to understand the contribution of the epigenome in reprogramming and differentiation of induced pluripotent stem (iPS) cells. We are using novel sequencing approaches to study reprogramming of the epigenomes of somatic cells to an ES-like state and the degree to which epigenomic reprogramming affects iPSC potential. Computational and experimental studies can be combined in this training experience. This project is a collaboration with Ron Evans' laboratory.

Project: Mouse Brain Methylome

When: Any Quarter

Epigenetic regulation of gene transcription, specifically when related to changes in DNA-methylation patterns (methylome), is a plausible mechanism underlying long-term environmental contributions to neuropsychiatric disorders. For example, pharmacological or environmentally-induced methylome alterations may lead to the silencing or aberrant activation of genes involved in the postnatal maturational process of brain circuitry, leading to functional and behavioral alterations appearing when the system reaches maturity. The proposed rotation project will contribute to creating complete maps of mouse brain methylomes, at the tissue and cell-type levels, during the period of postnatal development until adulthood. The goal is to delineate the methylome changes and transcriptional consequences produced by two developmental manipulations that lead to aberrant behaviors in adulthood, at the neuronal and peripheral tissue level. These maps will serve as reference methylome and transcriptome databases that can be consulted in relation to neuropsychiatric disorders with known and unknown developmental origins. This work is a collaboration with Marga Behrens/Terry Sejnowski.

Charles Elkan | Computer Science and Engineering

Email Contact: celkan [at] ucsd.edu

Project: Bioinformatics Rotation Projects Available in Machine Learning

When: Any Quarter

Students are welcome for computational rotation projects that apply methods from machine learning and statistics to problems in sequence analysis, structure prediction, and data and text mining.

Specific possible tasks include:

  1. Identifying and disambiguating all mentions of transport proteins in Medline abstracts.
  2. Predicting contacts between amino acids in globular proteins.
  3. Applying so-called "topic models" to gene expression data.
  4. Predicting entry of HIV into cerebrospinal fluid, based on longitudinal patient characteristics.

These projects are all somewhat speculative and high-risk, high-reward, and they all require programming skills combined with mathematical ability. They are based closely on high-quality recent research, but they demand innovation.

Kelly Frazer | Pediatrics

Email Contact: kafrazer [at] ucsd.edu

Project: Gene-based association of exome data to identify rare variants associated with venous thromboembolism

When: Fall 2012, Winter 2013, Spring 2013

Common diseases are likely influenced by rare genetic variation of moderate to high effect. It has been a challenge, however, to assay rare variants effectively and perform association tests, because a variant may be present in only one or a few individuals. Recently, targeted sequencing of all exons in the human genome has allowed for the comprehensive characterization of variation in protein coding regions. This has resulted in the ability to sequence hundreds of individuals and perform large-scale genetic association studies to identify inherited variation responsible for disease. New statistical analyses allow for the evaluation of many rare variants as a group, resulting in effective tests for gene-based association. We have generated exome sequence data for an 820-person case-control study to investigate the genetic basis of venous thromboembolism (VTE), an important clotting disorder that affects 1,000,000 people in the US every year. For this project, we will compare and contrast a spectrum of recently developed statistical analyses that test whether groups of rare variants are associated with disease and apply them to identify rare inherited variation in VTE.
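As one simple example of evaluating rare variants as a group, the sketch below collapses all rare variants in a gene into a single carrier indicator per person and compares carrier counts between cases and controls with a two-sided Fisher exact test. This collapsing (burden) approach is only one of the method families the project would compare. The gene names, sample sizes and carrier sets are invented, and the function names are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x carriers among the first row.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Sum all tables at least as unlikely as the observed one (small float tolerance).
    return sum(prob(x) for x in range(low, high + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


def burden_test(carriers_by_gene, n_cases, n_controls):
    """carriers_by_gene: gene -> (case IDs with >=1 rare variant, control IDs likewise)."""
    results = {}
    for gene, (case_carriers, control_carriers) in carriers_by_gene.items():
        a, c = len(case_carriers), len(control_carriers)
        results[gene] = fisher_exact_two_sided(a, n_cases - a, c, n_controls - c)
    return results


# Invented example with 10 cases and 10 controls.
carriers = {
    "GENE_A": ({"case1", "case2", "case3", "case4", "case5", "case6"}, {"ctrl1"}),
    "GENE_B": ({"case1", "case7"}, {"ctrl2", "ctrl3"}),
}
p_values = burden_test(carriers, n_cases=10, n_controls=10)
for gene, p in sorted(p_values.items()):
    print(f"{gene}: p = {p:.4f}")
```

Collapsing gains power when most rare variants in a gene act in the same direction; other tests in the comparison would relax that assumption.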

Project: Identification of aberrant splicing events in Chronic lymphocytic leukemia

When: Fall 2012, Winter 2013, Spring 2013

Mutation of splicing factor SF3B1 is one of the most common somatic aberrations in chronic lymphocytic leukemia (CLL). We have performed whole transcriptome sequencing on several CLL clinical samples, both wild type and mutated for SF3B1, and are interested in identifying aberrant or novel splicing events. This project will require evaluating existing analytical methods for transcript isoform discovery and characterization, and will provide an opportunity to develop new approaches to characterize splicing aberrations.

Project: Tumor Subtype Discovery by Integrated Datatype Analysis

When: Fall 2012, Winter 2013, Spring 2013

TCGA (http://cancergenome.nih.gov) is a large NIH-driven effort to molecularly profile 500 tumor samples for 20 different cancer types using a range of technologies. The effort is producing, for each of the 20x500=10,000 tumor samples, profiles of DNA somatic variants, copy number alterations, and mRNA and miRNA expression. Key aims of the effort are to discover and molecularly define the different subtypes inherent to each cancer and to gain detailed understanding about pathogenesis. While recent publications have reported on important developments regarding these key aims, there is still a vast amount of data to be analyzed and numerous insights to be gained. In this project we will develop an analysis technique based on anomaly detection that will naturally integrate the heterogeneous data types, reveal the (potentially overlapping) bases for different subtypes, and decompose the relatedness among the defining molecular characteristics of each subtype. From the latter we expect to gain important insights regarding molecular drivers of disease. Students working on this project will gain experience in algorithm development and in the main data types of cancer genomics.

Yoav Freund | Computer Science and Engineering

Email Contact: yfreund [at] ucsd.edu

Project: Digital Mouse Brain Atlas

When: Summer 2013, Any Quarter
Last updated: 08/26/2013

This project is in collaboration with the Kleinfeld Lab (http://physics.ucsd.edu/neurophysics/) and the Mitra Lab (http://brainarchitecture.org/mouse/about).

The idea is to use a combination of machine learning and computer vision algorithms to create a digital atlas of the mouse brain.

This would require developing detectors of landmarks that exist in a majority of the brains. As the data size is large (tens of terabytes), the work will involve using a Hadoop cluster.

Requirements: Python, computer vision, machine learning/statistics.

Kyle Gaulton | Pediatrics

Email Contact: kgaulton [at] ucsd.edu

The Gaulton lab integrates genome-wide genetic and epigenomic data to understand the genetic basis and molecular mechanisms of type 1 and type 2 diabetes risk.

Project: Genetic and epigenomic fine-mapping of diabetes risk loci

When: Any Quarter
Last updated: 05/27/2016

This rotation project involves dense genetic fine-mapping of diabetes risk loci, integrating fine-mapping data with large-scale genomic and epigenomic maps using published and novel models to identify causal variants, cell types and networks, and applying these predictive models to identify additional diabetes risk loci. 
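For readers unfamiliar with fine-mapping, the following sketch shows one widely used baseline: convert each variant's effect estimate and standard error into an approximate Bayes factor, normalize the Bayes factors into posterior probabilities, and collect the smallest set of variants reaching a target coverage. It assumes a single causal variant per locus, equal prior probability for every variant, and a prior effect variance of 0.04. The variant identifiers and summary statistics are invented, and this is not presented as the lab's own model.

```python
import math


def approx_bayes_factor(beta, se, prior_var=0.04):
    """Approximate Bayes factor in favour of association (Wakefield-style)."""
    v = se * se
    z_squared = (beta / se) ** 2
    shrink = prior_var / (v + prior_var)
    return math.sqrt(1.0 - shrink) * math.exp(0.5 * z_squared * shrink)


def credible_set(variants, coverage=0.99):
    """variants: id -> (beta, se). Returns (posterior probabilities, credible set)."""
    bayes_factors = {vid: approx_bayes_factor(b, s) for vid, (b, s) in variants.items()}
    total = sum(bayes_factors.values())
    posterior = {vid: bf / total for vid, bf in bayes_factors.items()}
    chosen, cumulative = [], 0.0
    for vid in sorted(posterior, key=posterior.get, reverse=True):
        chosen.append(vid)
        cumulative += posterior[vid]
        if cumulative >= coverage:
            break
    return posterior, chosen


# Invented summary statistics for three variants at one locus.
locus = {
    "var1": (0.30, 0.03),
    "var2": (0.15, 0.03),
    "var3": (0.03, 0.03),
}
posterior, chosen = credible_set(locus)
print("99% credible set:", chosen)
```

Functional annotation enters such a scheme by replacing the equal priors with annotation-informed ones, which is where epigenomic maps become useful.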

Project: Predicting causal genes at diabetes risk loci

When: Any Quarter
Last updated: 05/27/2016

This rotation project involves development of novel methods for integrating genetic association data with epigenomic annotation, expression QTLs and chromatin QTLs to predict causal genes of diabetes risk variants.

Project: Predicting genome-wide pleiotropic effects of diabetes risk variants

When: Any Quarter
Last updated: 05/27/2016

This rotation project involves development of novel mixture model approaches to predicting and quantifying the extent of pleiotropy among diabetes risk variants genome-wide.

Christopher Glass | Cellular and Molecular Medicine

Email Contact: cglass [at] ucsd.edu

Project: Bioinformatics Rotation Projects Available

When: Any Quarter
Last updated: 08/31/2009

Research Interests: Dr. Glass’ laboratory investigates transcriptional mechanisms that regulate the development and function of the macrophage, a cell that plays key roles in immunity and inflammatory diseases. Current efforts are to determine the biochemical and biological roles of sequence-specific transcription factors and their associated co-regulators at a genome-wide scale. A combination of biochemical, cellular and genetic model systems is used, incorporating macrophage-specific knockouts, microarray technologies, massively parallel sequencing and bioinformatics approaches, to unravel the contributions of specific factors to the development of specialized macrophage functions in immunity and the pathogenesis of inflammatory diseases.

Bioinformatics rotation projects focus on analysis of data derived from the application of chromatin immunoprecipitation coupled to massively parallel sequencing (ChIP-Seq) to define genome-wide locations of transcription factors that control specific aspects of macrophage biology. This information is integrated with corresponding high throughput transcriptomic data to develop testable models for transcriptional circuits that underlie biological responses.

Joseph Gleeson | Neurosciences (School of Medicine)

Joseph Gleeson | Neurosciences (School of Medicine)

Email Contact: jogleeson [at] ucsd.edu

Our lab is interested in how the human brain is assembled. We use genetic strategies to explore causes of human disease and then link these genes to function using model systems. We are focused on diseases like mental retardation, epilepsy and autism, where brain development is disrupted. We have two full-time bioinformaticians (a graduate student and a staff programmer) developing computational solutions, database designs and machine learning paradigms for large datasets in the area of genome sequencing.

Project: Develop dynamic query system for supporting phenotype mining in genetic studies

When: Any Quarter
Last updated: 08/13/2012

Rady Children's Hospital San Diego is the second largest children's hospital in the country and has recently adopted an advanced electronic medical record system. The goal of this project is to develop code to search the clinical data archive to identify patients appropriate for clinical research in collaboration with other clinical bioinformatics groups on campus.

  • Integrate hospital electronic medical record with an open-source environment for data mining to create a query system aimed at supporting identification of patients for research.
  • Combine different conditions relative to the electronic medical record data (e.g., the presence of a particular pathology) to identify those most appropriate for study.
  • Develop a multidimensional database to store/retrieve clinical data and update dynamically.

Joseph Gleeson | Neurosciences (School of Medicine)

Email Contact: jogleeson [at] ucsd.edu

Our lab is interested in how the human brain is assembled. We use genetic strategies to explore causes of human disease and then link these genes to function using model systems. We are focused on diseases like mental retardation, epilepsy and autism, where brain development is disrupted. We have two full-time bioinformaticians (a graduate student and a staff programmer) developing computational solutions, database designs and machine learning paradigms for large datasets in the area of genome sequencing.

Project: Informatics approaches in deciphering exomes

When: Any Quarter
Last updated: 08/13/2012

We are in the middle of analyzing exome data from an accruing group of >1500 individuals with brain disorders, and are looking to develop better algorithms to uncover pathogenic mutations.

  • Develop code to sift through the 30 megabases of sequence per patient.
  • Identify novel genes involved in disorders like epilepsy and autism.
  • Develop platforms for storage, visualization, annotation, comparison and recovery of these large datasets.
  • Improve ability to predict patient outcomes and identify "actionable items".
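A minimal sketch of what "sifting" an exome might look like, assuming each patient's variants have already been called and annotated. The `Variant` fields, the consequence labels, and the frequency cutoff are hypothetical choices for illustration, not the lab's pipeline.

```python
from dataclasses import dataclass

@dataclass
class Variant:
    gene: str
    consequence: str      # e.g. "missense", "stop_gained", "synonymous"
    population_af: float  # allele frequency in a reference population

# Hypothetical set of consequence labels treated as protein-altering.
DAMAGING = {"missense", "stop_gained", "frameshift", "splice_site"}

def candidate_variants(variants, max_af=0.001):
    """Keep rare variants whose annotated consequence alters the protein."""
    return [v for v in variants
            if v.population_af <= max_af and v.consequence in DAMAGING]

def genes_hit_in_multiple_patients(per_patient, min_patients=2):
    """Count, per gene, how many patients carry at least one candidate variant."""
    counts = {}
    for variants in per_patient.values():
        for gene in {v.gene for v in candidate_variants(variants)}:
            counts[gene] = counts.get(gene, 0) + 1
    return {g: n for g, n in counts.items() if n >= min_patients}
```

Genes that recur across unrelated patients with the same disorder are natural candidates for follow-up, although a real analysis would also have to model inheritance and gene-level mutation rates.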

Lawrence Goldstein | Cellular and Molecular Medicine

Lawrence Goldstein | Cellular and Molecular Medicine

Email Contact: lgoldstein [at] ucsd.edu

Project: Stem Cell GWAS in Alzheimer's

When: Any Quarter
Last updated: 04/19/2012

Genome-wide association studies (GWAS) have identified polymorphic variants in many new genes linked to late-onset sporadic Alzheimer's disease (SAD). Human stem cells from patients with disease give us the opportunity to examine the connection between these variants and phenotype at the individual and cellular level. Currently we are investigating the role of the sortilin-related receptor 1 (SORL1) gene in the pathogenesis of SAD. Variants in the 5' and 3' regions of SORL1 have been linked to SAD and may lead to defective gene expression, which contributes to amyloid beta production and neuronal death in the disease. However, the variants identified by GWAS are common and not likely to be causative mutations. One important project is to use bioinformatic tools to probe the genomic region around the GWAS-associated risk variants to identify candidate rare polymorphisms, which may have an effect on phenotype. Once these variants are identified, we can genotype stem cell lines for these polymorphisms and design experiments to determine the contribution of these variants to Alzheimer's disease.

Melissa Gymrek | Computer Science and Engineering

Melissa Gymrek | Computer Science and Engineering

Email Contact: mgymrek [at] ucsd.edu

Our overall goal is to understand complex genetic variants that underlie human disease. We are particularly interested in repetitive DNA variants known as short tandem repeats (STRs) as a model for complex variation. Our work focuses on developing computational tools for analyzing and visualizing complex variation from large-scale sequencing data and applying these tools to learn about the contribution of repetitive variation to human disease.

Project: Analyzing repeat expansions in human genomes

When: Any Quarter
Last updated: 09/29/2016

Tandem repeats (TRs) have been implicated in more than 40 Mendelian diseases, such as Fragile X Syndrome and Huntington's Disease. While a variety of tools can now profile short TRs from next generation sequencing data, these approaches do not readily extend to longer TRs that cannot be spanned by a single sequencing read. In this project, we will model properties of read alignments at long TRs in order to build a statistical framework to detect expanded repeats in the genome. We will also evaluate the ability of long read (e.g. Nanopore, PacBio) and synthetic long read technologies (e.g. 10X Genomics) to capture long repeats.

Melissa Gymrek | Computer Science and Engineering

Email Contact: mgymrek [at] ucsd.edu

Our overall goal is to understand complex genetic variants that underlie human disease. We are particularly interested in repetitive DNA variants known as short tandem repeats (STRs) as a model for complex variation. Our work focuses on developing computational tools for analyzing and visualizing complex variation from large-scale sequencing data and applying these tools to learn about the contribution of repetitive variation to human disease.

Project: Genome-wide association studies of short tandem repeats

When: Any Quarter
Last updated: 09/29/2016

Genome-wide association studies (GWAS) have uncovered thousands of genetic loci associated with human phenotypes. However, in most cases these loci fail to explain the majority of trait heritability. Importantly, GWAS have focused largely on simple single nucleotide polymorphisms (SNPs), and are therefore likely to miss important contributions from more complex types of variants. In this project we will (1) measure the power of SNP-based GWAS to detect repeat associations, (2) evaluate the ability of haplotype-based tests to identify underlying repeat associations, and (3) use these results to inform a method to perform STR association tests.
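One simple form an STR association test could take, sketched here under assumptions that are not from the project description: each individual's genotype is summarized as a "dosage" (the sum of the two repeat lengths), and the phenotype is regressed on that dosage by ordinary least squares. The function names are hypothetical.

```python
def dosage(genotype):
    """Summarize a diploid STR genotype as the sum of its two repeat lengths."""
    return genotype[0] + genotype[1]

def str_association(dosages, phenotypes):
    """Least-squares slope and squared correlation of phenotype on STR dosage."""
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    syy = sum((y - mean_y) ** 2 for y in phenotypes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, phenotypes))
    slope = sxy / sxx            # change in phenotype per repeat unit
    r2 = (sxy * sxy) / (sxx * syy)  # fraction of phenotypic variance explained
    return slope, r2
```

Unlike a biallelic SNP, an STR can have many alleles, so a linear dosage model is only one choice; it would miss non-linear relationships between repeat length and phenotype.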

Melissa Gymrek | Computer Science and Engineering

Email Contact: mgymrek [at] ucsd.edu

Our overall goal is to understand complex genetic variants that underlie human disease. We are particularly interested in repetitive DNA variants known as short tandem repeats (STRs) as a model for complex variation. Our work focuses on developing computational tools for analyzing and visualizing complex variation from large-scale sequencing data and applying these tools to learn about the contribution of repetitive variation to human disease.

Project: Phasing and imputing short tandem repeats

When: Any Quarter
Last updated: 09/29/2016

Imputation is a vital step of genome-wide association studies. It leverages the correlation structure in the genome induced by recombination to learn about genome-wide polymorphisms by only genotyping a small subset of variants. While imputation of single nucleotide polymorphisms (SNPs) has proven to be quite robust, statistical phasing and imputation of tandem repeats (TRs) in unrelated samples is challenging, largely because TRs and SNPs have diminished linkage disequilibrium due to the fast mutation rates and high prevalence of recurrent mutations in TRs. The goal of this project is to evaluate phasing and imputation techniques at TRs by leveraging two types of data: (1) family-based data, which allow tracing inheritance patterns to infer phase, and (2) long read and synthetic long read technologies, which allow physical phasing of TRs with nearby SNPs.
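To illustrate how family data can resolve phase, the sketch below assigns a child's two TR alleles to the maternal and paternal haplotypes from a parent-child trio. It assumes strict Mendelian inheritance with no mutation, which is a strong simplification given the high TR mutation rates noted above; the function name and genotype representation (unordered pairs of repeat lengths) are hypothetical.

```python
def phase_child(child, mother, father):
    """Return (maternal_allele, paternal_allele), or None if the trio is
    ambiguous or inconsistent with Mendelian inheritance.

    Each genotype is an unordered pair of repeat lengths.
    """
    a, b = child
    options = set()
    if a in mother and b in father:
        options.add((a, b))
    if b in mother and a in father:
        options.add((b, a))
    # Exactly one consistent assignment means the phase is resolved.
    return options.pop() if len(options) == 1 else None
```

For example, a child with alleles (10, 12), a mother with (10, 11) and a father with (12, 13) phases unambiguously, whereas a trio in which all three individuals carry (10, 12) does not.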

Nan Hao | Biological Sciences

Nan Hao | Biological Sciences

Email Contact: nhao [at] ucsd.edu

Our laboratory aims to obtain a quantitative and predictive understanding of how complex biological systems operate and function. We focus on two complementary directions: systems biology analysis to deconstruct natural systems and synthetic biology to build artificial systems analogous to natural systems.

Project: Understanding the design principles of gene regulatory networks

When: Fall 2013, Winter 2014, Spring 2014
Last updated: 10/29/2013

Understanding the design principles of gene regulatory networks using systems and synthetic biology approaches. In particular, we will study the function of parallel transcription factor systems in single cells.

Nan Hao | Biological Sciences

Email Contact: nhao [at] ucsd.edu

Our laboratory aims to obtain a quantitative and predictive understanding of how complex biological systems operate and function. We focus on two complementary directions: systems biology analysis to deconstruct natural systems and synthetic biology to build artificial systems analogous to natural systems.

Project: Understanding the feedback regulation of the MAPK system

When: Fall 2013, Winter 2014, Spring 2014
Last updated: 10/29/2013

We will use the MAPK system in yeast as a model to study the function of feedback regulation, in particular, how feedback loops contribute to signal specificity and cellular information encoding. We will combine single-cell imaging analysis, quantitative biochemical assays, and computational modeling in the project.

Olivier Harismendy | Pediatrics

Olivier Harismendy | Pediatrics

Email Contact: oharismendy [at] ucsd.edu

The Oncogenomics laboratory is located in the Moores Cancer Center. Its research program is focused on the identification of genetic and epigenetic markers for cancer prevention and progression as well as drug response. The laboratory is a humid laboratory, combining both wet-lab techniques and bioinformatics analysis to study cancer samples from patients and animal models of cancer. The laboratory is also an important partner for multiple principal investigators at the Moores Cancer Center, collaborating on the design, analysis and interpretation of their genomic experiments.

Project: Development of Genomics Virtual Machines in HIPAA compliant cloud

When: Any Quarter
Last updated: 06/09/2016

Genetic information is considered protected health information (PHI) and as a consequence the highest security standards need to be applied for its storage, analysis and sharing. The oncogenomics laboratory uses the state-of-the-art iDASH compute cloud for its main computation. As a consequence, we participate in the development of optimal workflows and virtual machines for the analysis of patient-derived genomic datasets such as whole exomes, whole genomes, RNA-seq or genotyping arrays.

In this project we will develop robust provisioning methods to establish virtual machines capable of running popular human genomic analysis workflows. We will benchmark these machines and workflows and convert some of them into standard recipes for production-grade, reproducible genomic analysis.

Olivier Harismendy | Pediatrics

Email Contact: oharismendy [at] ucsd.edu

The Oncogenomics laboratory is located in the Moores Cancer Center. Its research program is focused on the identification of genetic and epigenetic markers for cancer prevention and progression as well as drug response. The laboratory is a humid laboratory, combining both wet-lab techniques and bioinformatics analysis to study cancer samples from patients and animal models of cancer. The laboratory is also an important partner for multiple principal investigators at the Moores Cancer Center, collaborating on the design, analysis and interpretation of their genomic experiments.

Project: Genetics and epigenetics of cisplatin resistance

When: Any Quarter
Last updated: 06/09/2016

Cisplatin (cDDP) is the most commonly used chemotherapeutic drug, but most cancers eventually become resistant, leading to tumor recurrence. Several biological processes may modulate cDDP sensitivity: drug import, export, detoxification, DNA repair, and apoptosis. Drug resistance is transmitted to daughter cells, and one can build up resistant cell lines in vitro using sequential treatments. We are interested in identifying the genetic mutations that mediate this resistance. For this, we have derived resistant cell lines from single clones of a cDDP-sensitive ovarian cancer cell line. Using exome sequencing as well as targeted sequencing, we propose to determine mutations in genes and pathways that drive drug resistance. We will then expand the findings to the TCGA samples, using time to recurrence as an indicator of drug sensitivity.

Olivier Harismendy | Pediatrics

Email Contact: oharismendy [at] ucsd.edu

The Oncogenomics laboratory is located in the Moores Cancer Center. Its research program is focused on the identification of genetic and epigenetic markers for cancer prevention and progression as well as drug response. The laboratory is a humid laboratory, combining both wet-lab techniques and bioinformatics analysis to study cancer samples from patients and animal models of cancer. The laboratory is also an important partner for multiple principal investigators at the Moores Cancer Center, collaborating on the design, analysis and interpretation of their genomic experiments.

Project: The role of inherited variation in cancer somatic landscape

When: Any Quarter
Last updated: 06/09/2016

The role of germline or inherited variation in cancer has been studied in selected families and led to the identification of genetic variants that are dominant and responsible for cancer syndromes. Similarly, rare recessive variants with lower penetrance are responsible for the increased risk in breast and ovarian cancer (BRCA1/2). More common variants in the population have also been identified through GWAS, and have revealed multiple SNPs associated with a modest increase in cancer risk. Despite these advances, multiple variants of intermediate allelic frequency in the population, or carried by patients with undocumented family history, still remain variants of unknown significance (VUS) and can still play a role in tumor development. In addition, the contribution of variants located outside of the coding region has been underexplored and can now be reexamined in the light of recent maps of the regulatory landscape. The long-term goal of this research is to utilize germline genetic variation in cancer prevention and care to better stage patients or predict their response to treatment.

We propose to identify the germline variants in the UCSD Cancer Center patients (targeted gene panel) as well as in the public TCGA/ICGC datasets (whole genomes). We will then test these variants, alone or in combination, to identify the ones that impact cancer onset, the tumor somatic landscape or tissue-specific regulatory network. The project will involve processing of high throughput sequencing data, population genetics and statistical analysis, in a HIPAA compliant cloud-computing environment.

Jeff Hasty | Biological Sciences

Jeff Hasty | Biological Sciences

Email Contact: jmhasty [at] ucsd.edu

Project: Modeling the dynamics of gene regulation

When: Any Quarter
Last updated: 09/07/2011

This rotation project will involve two questions in the area of modeling the dynamics of gene regulation. We have generated fluorescence data from a system where a promoter is driven periodically with an external inducer and the resulting gene expression has been measured using GFP. The first question involves the deduction of a mesoscopic dynamical model from the data, and how this approach can be generalized to characterize a library of promoters for use in constructing genetic circuits. The second question is how the data and the resulting mesoscopic model can be used to constrain a large set of parameters that define a microscopic model.
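As a toy illustration of a periodically driven promoter (an assumed model, not the one the lab derives from its data), the sketch below integrates a one-variable equation dG/dt = alpha * u(t) - gamma * G, where u(t) is a square-wave inducer, alpha is a production rate and gamma is a combined degradation/dilution rate. All names and parameter values are hypothetical.

```python
def square_wave(t, period):
    """Inducer is on during the first half of each period, off otherwise."""
    return 1.0 if (t % period) < period / 2 else 0.0

def simulate(alpha, gamma, period, t_end, dt=0.01):
    """Forward-Euler integration of dG/dt = alpha * u(t) - gamma * G, G(0) = 0.

    Returns the list of G values after each time step.
    """
    steps = int(round(t_end / dt))
    g = 0.0
    trace = []
    for i in range(steps):
        t = i * dt
        g += dt * (alpha * square_wave(t, period) - gamma * g)
        trace.append(g)
    return trace
```

Fitting alpha and gamma so that such a trace matches a measured GFP time course is one way a coarse model of this kind could be characterized per promoter; the amplitude and phase lag of the response both depend on how the driving period compares with 1/gamma.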

Vivian Hook | School of Pharmacy

Vivian Hook | School of Pharmacy

Email Contact: vhook [at] ucsd.edu

Project: Bioinformatics of Mass Spectrometry Peptide Spectral Libraries for Identification of Neuropeptides

When: Winter 2011, Spring 2011, Summer 2011
Last updated: 09/22/2010

Identification of small peptides by mass spectrometry is a distinct discipline compared to protein identification. Small ‘peptides’ in biological tissues encompass a large range of amino acid lengths from about 3-40 residues, whereas ‘tryptic peptides’ derived from proteins have a narrow range of length of about 7/8-20 residues. Current mass spectrometry instrumentation and bioinformatics are not well designed for ‘peptides’, but have been designed for ‘proteins’ that are identified by tryptic digests. The bioinformatic challenge in this project is to design new algorithms based on peptide spectral mass spectrometry data for identification of peptides of unusual lengths. This project has tremendous relevance to all biological systems, because we are now missing identification of an entire class of peptides. The computational peptide spectral library has high impact for the biological and biomedical sciences. This project is being conducted as a collaboration with the laboratories of Dr. Vivian Hook and Dr. Nuno Bandeira.

Vivian Hook | School of Pharmacy

Email Contact: vhook [at] ucsd.edu

Project: Neuropeptidomic Profiles in the Brain and Neurological Diseases

When: Winter 2011, Spring 2011, Summer 2011
Last updated: 09/22/2010

Multiple peptide neurotransmitters are essential for cell-cell communication among neuronal cells of the nervous system for regulation of brain and physiological organ systems. The bioinformatic challenge to address in current neuropeptide research is how to define the profiles of neuropeptides (neuropeptidomes) utilized in health and diseases of the nervous system. These ‘neuropeptidomes’ will be investigated by nano-LC-MS/MS mass spectrometry with particular attention to organization of bioinformatic tools to optimize peptide identifications. An example of this research was published this year (Gupta et al., 2010). This research is conducted as a collaboration by the laboratories of Dr. Vivian Hook, Dr. Pavel Pevzner, and Dr. Nuno Bandeira.

Terence Hwa | Physics

Terence Hwa | Physics

Email Contact: thwa [at] ucsd.edu

Terence Hwa, Departments of Physics and Molecular Biology

The Hwa lab (a.k.a. the Quantitative Microbiology Lab) uses a combination of experimental and theoretical approaches to elucidate the organizational principles of living systems. The goal is to quantitatively characterize the physiological behaviors and understand how they arise in terms of the underlying molecular interactions. Our lab focuses on the bacterium E. coli, because it is perhaps the best characterized in terms of molecular components and interactions. But we do also study higher organisms together with collaborating labs. Please visit our lab webpage (http://matisse.ucsd.edu) for further information.

Project: Quantitative studies of bacterial physiology

When: Any Quarter
Last updated: 08/26/2013

An outstanding challenge in making biology quantitative and predictive is how to deal with the millions or even billions of missing parameters that describe the underlying molecular interactions. In recent years, our lab pioneered a top-down approach which exploited a number of phenomenological laws to accurately predict the physiological responses of bacteria to environmental and genetic changes (e.g., nutrients, antibiotics, heterologous protein expression) [DOI: 10.1126/science.1192588]. Furthermore, insight from this quantitative physiological approach is able to pinpoint key missing molecular interactions in long-studied biological processes [DOI: 10.1038/nature12446]. The lab has a number of projects further extending this basic approach to a variety of problems in microbiology, including growth transitions, stress response, antibiotic resistance, and biofilm formation.

Lilia Iakoucheva | Psychiatry

Lilia Iakoucheva | Psychiatry

Email Contact: lilyak [at] ucsd.edu

The lab has a variety of bioinformatics projects aimed at improving understanding of the functional impact of autism mutations derived from exome and genome sequencing of patients. We build spatio-temporal gene co-expression and protein interaction networks for psychiatric diseases and we use these networks to generate testable hypotheses about the mechanisms of disease. We also test these hypotheses experimentally in the lab, thereby adding a translational aspect to our work.

Project: Evaluating the effect of splicing mutations on isoform networks in autism

When: Any Quarter
Last updated: 07/12/2016

The project deals with constructing the isoform-level co-expression and protein interaction networks for predicting functional impact of the de novo splice site mutations from the patients with autism spectrum disorder (ASD). Hundreds of splice site de novo mutations are currently identified in the ASD patients, but not a single disease mechanism is established for any of these mutations. We will build and analyze isoform-level networks of brain co-expressed and physically interacting proteins; map de novo ASD mutations onto isoform-level networks to predict their functional impact; and validate the disrupted networks and pathways using CRISPR/Cas technology in neuronal and animal models. This project will discover and characterize cellular and molecular processes that are disrupted by the de novo splice site ASD mutations.

Lilia Iakoucheva | Psychiatry

Email Contact: lilyak [at] ucsd.edu

The lab has a variety of bioinformatics projects aimed at improving understanding of the functional impact of autism mutations derived from exome and genome sequencing of patients. We build spatio-temporal gene co-expression and protein interaction networks for psychiatric diseases and we use these networks to generate testable hypotheses about the mechanisms of disease. We also test these hypotheses experimentally in the lab, thereby adding a translational aspect to our work.

Project: Integrative functional genomic study of pathways impacted by recurrent autism CNV

When: Any Quarter
Last updated: 07/12/2016

Copy number variants (CNVs) represent significant risk factors for Autism Spectrum Disorders (ASD). One of the most frequent CNVs involved in ASD is a deletion or duplication of the 16p11.2 CNV locus, spanning 29 protein-coding genes. Despite the progress in linking 16p11.2 genetic changes with the phenotypic abnormalities (macrocephaly and microcephaly) in patients and model organisms, the specific molecular pathways impacted by this CNV remain unknown. To test the hypothesis that RhoA signaling is disrupted by this CNV, we will generate KCTD13 and CUL3 mouse models using the CRISPR/Cas9 system and investigate dysregulated molecular pathways using RNA-seq at various stages of the developing mouse fetal brain.

Trey Ideker | School of Medicine

Trey Ideker | School of Medicine

Email Contact: tideker [at] ucsd.edu

The overall objective of the Ideker Laboratory is to develop an artificially intelligent model of the cell able to translate a patient's data into precision diagnosis and treatment. Towards this goal, we are developing methods that learn how to structure cell models directly from genomics data sets.

For this purpose, we run an experimental facility for systematic measurement of gene and protein interaction networks.

A second big challenge is to work out the functional logic by which these models process information, e.g., from genotype to phenotype. Here too, we have made recent progress, but much remains to be done before we have a cell model capable of making robust predictions about patients. As supporting software, we are developers of Cytoscape, a popular platform for visualization and modeling of biological networks which is supported by a consortium of many labs including our own (http://www.cytoscape.org/).

Project: Computing a minimal set of genes required for life

When: Fall 2016, Spring 2017
Last updated: 07/13/2016

A long-standing question in biology is how many (and which) genes are required for life. This essential core set of genes, or minimal genome, makes up the cell's “life support system” or “chassis and power supply” on which more complex functions and processes are built. This set of genes is of keen interest in the field of synthetic biology, which aims to synthesize the complete minimal genome of an organism and add additional functions to this genome for biotechnological, pharmaceutical and agricultural ends. This project will attempt to use our whole-cell model of the networks and pathways in a cell to predict which genes and gene combinations are essential for life and, conversely, which genes and gene combinations can be removed. If successful, this project will be able to predict minimal genomes for synthesis and testing. It will also address whether there actually is a single “minimal genome” or whether there exist many different configurations all of which are near or at the global minimum.

Prerequisites: Computer programming or scripting skills.
Optional: Experimental laboratory skills, which would allow the student to test model predictions.

Trey Ideker | School of Medicine

Email Contact: tideker [at] ucsd.edu

The overall objective of the Ideker Laboratory is to develop an artificially intelligent model of the cell able to translate a patient's data into precision diagnosis and treatment. Towards this goal, we are developing methods that learn how to structure cell models directly from genomics data sets.

For this purpose, we run an experimental facility for systematic measurement of gene and protein interaction networks.

A second big challenge is to work out the functional logic by which these models process information, e.g., from genotype to phenotype. Here too, we have made recent progress, but much remains to be done before we have a cell model capable of making robust predictions about patients. As supporting software, we are developers of Cytoscape, a popular platform for visualization and modeling of biological networks which is supported by a consortium of many labs including our own (http://www.cytoscape.org/).

Project: Development of a software pipeline for generating cell function hierarchies from genomic data

When: Fall 2016, Spring 2017
Last updated: 07/13/2016

We have developed algorithms (NeXO and CliXO) by which systematic datasets are used to organize genes into a gene ontology, reflecting the hierarchical organization of cellular structures and molecular pathways in the cell. Currently these algorithms are coded in Python; however, a user-friendly and expandable interface would allow end-users to quickly build and update gene ontologies from new data sets. Coding of this interface is the main goal of this rotation. If successful, this tool could seed a thesis project to construct a gene ontology for a particular cellular process (e.g. DNA damage response) or disease (e.g. cancer) of interest.

Prerequisites: Computer programming or scripting skills; some knowledge of genomic biology.

Trey Ideker | School of Medicine

Email Contact: tideker [at] ucsd.edu

The overall objective of the Ideker Laboratory is to develop an artificially intelligent model of the cell able to translate a patient's data into precision diagnosis and treatment. Towards this goal, we are developing methods that learn how to structure cell models directly from genomics data sets.

For this purpose, we run an experimental facility for systematic measurement of gene and protein interaction networks.

A second big challenge is to work out the functional logic by which these models process information, e.g., from genotype to phenotype. Here too, we have made recent progress, but much remains to be done before we have a cell model capable of making robust predictions about patients. As supporting software, we are developers of Cytoscape, a popular platform for visualization and modeling of biological networks which is supported by a consortium of many labs including our own (http://www.cytoscape.org/).

Project: Experimental mapping of the DNA damage response

When: Fall 2016, Spring 2017
Last updated: 07/13/2016

Cell colonies on agar grow in a near linear fashion with growth rates reflective of their "fitness". The laboratory has developed an experimental platform that can make continuous measurements of growth rates via time-lapse image capture of thousands of specific genetic mutant strains, enabling us to determine the relevance of every gene in the response to stimuli such as DNA damage via radiation or chemotherapy. During the rotation the student will grow ~50,000 cell colonies in parallel and capture their growth curves using digital images and intermittent radiation exposure. The project includes working in Matlab for the analysis of growth curves and the elucidation of DNA damage response pathways. If successful, the project could be developed into a thesis which uses these data to construct a hierarchical model of DNA damage responses.

Prerequisites: Prior experience in a genetics or biochemistry experimental laboratory.

Trey Ideker | School of Medicine

Email Contact: tideker [at] ucsd.edu

The overall objective of the Ideker Laboratory is to develop an artificially intelligent model of the cell able to translate a patient's data into precision diagnosis and treatment. Towards this goal, we are developing methods that learn how to structure cell models directly from genomics data sets.

For this purpose, we run an experimental facility for systematic measurement of gene and protein interaction networks.

A second big challenge is to work out the functional logic by which these models process information, e.g., from genotype to phenotype. Here too, we have made recent progress, but much remains to be done before we have a cell model capable of making robust predictions about patients. As supporting software, we are developers of Cytoscape, a popular platform for visualization and modeling of biological networks which is supported by a consortium of many labs including our own (http://www.cytoscape.org/).

Project: Improving the construction of gene ontologies from data

When: Fall 2016, Spring 2017
Last updated: 07/13/2016

While the manually curated Gene Ontology (GO) is widely used, inferring a GO directly from -omics data is a compelling new problem. Recently, we have shown that GO can be inferred directly from molecular data. However, our previous methods use heuristic algorithms with problems such as:

  1. The parameters are application-dependent and must be adapted by hand.
  2. The methods are greedy and it is hard to prove or verify their correctness in theory.
  3. The memory consumption is large (10-15 GB memory footprint), resulting in slow run-times for large datasets.

The aim of this project is to replace the original heuristic objective function with a new mathematical one with an explicit form. It also includes developing a new efficient optimization algorithm (based on Integer Linear Programming) to solve the new objective function accurately.

Prerequisites: Prior coursework or research activity in computer algorithms; Computer programming or scripting skills.

Trey Ideker | School of Medicine

Email Contact: tideker [at] ucsd.edu

The overall objective of the Ideker Laboratory is to develop an artificially intelligent model of the cell able to translate a patient's data into precision diagnosis and treatment. Towards this goal, we are developing methods that learn how to structure cell models directly from genomics data sets.

For this purpose, we run an experimental facility for systematic measurement of gene and protein interaction networks:

A second big challenge is to work out the functional logic by which these models process information, e.g., from genotype to phenotype. Here too, we have made recent progress

but much remains to be done before we have a cell model capable of making robust predictions about patients. As supporting software, we are developers of Cytoscape, a popular platform for visualization and modeling of biological networks which is supported by a consortium of many labs including our own (http://www.cytoscape.org/).

Project: Using a hierarchical cellular model to analyze tumor genetic mutations

When: Fall 2016, Spring 2017
Last updated: 07/13/2016

The student will explore whether a hierarchical model we have recently constructed for predicting growth of simple cells can be translated to predict aggressiveness of human cancer. The model will be provided, along with access to tumor exomes from both public and internal sources. The goal is to determine, over a 10-week rotation, whether and to what extent the model can be used to analyze a patient's exome. If so, this project could be readily developed into a PhD thesis.

Prerequisites: Computer programming or scripting skills; some knowledge of genomic biology.

Suckjoon Jun | Physics

Email Contact: jun [at] ucsd.edu

We are interested in fundamental questions in biology, and we bring tools and ways of thinking from physical sciences and engineering. Researchers in our lab are proficient in both experiment and theory.

Project: Single-cell Physiology

When: Fall 2013, Winter 2014, Spring 2014
Last updated: 08/11/2013

We will use the “mother machine,” a microfluidic continuous cell culture device developed in our lab, to follow thousands of individual cells for hundreds of consecutive generations. We will analyze the images using high-throughput image analysis methods that use advanced algorithms. We will focus on the single-cell physiology in the context of cell size control, cell cycle control, and cell death.

Amy Kiger | Biological Sciences

Email Contact: akiger [at] biomail.ucsd.edu

Cells must continuously maintain integrity and compartmentalization while meeting demands for cellular remodeling throughout development, immunity, aging and disease. Using functional genomics, genetics and cell biological approaches in the fruit fly, Drosophila, we are studying the central roles of membrane regulation in dynamic cell structure. We have identified novel endocytosis and autophagy membrane trafficking pathways that control macrophage and muscle remodeling, with relevance to human disease. Current projects in the lab aim to discover new mechanisms of cellular remodeling through functional genomic and proteomic approaches, and to better understand the pathway networks and dynamics during cellular remodeling.

Project: Autophagy protein and functional networks

When: Any Quarter
Last updated: 08/15/2012

  • Expand on our ongoing co-immunoprecipitation and mass spectrometry datasets to identify protein-protein interactions involved in autophagy.
  • In collaboration with the Ideker lab and the SDCSB Network Assembly Core, analyze coIP results and incorporate functional data into an ‘autophagy network’.
  • Test new insights predicted from the network using in vivo autophagy assays.
  • Future directions include a hierarchical analysis of dynamic protein-protein interactions in an autophagy timecourse, and RNAi screens of the autophagy network and for new autophagy gene functions.

Amy Kiger | Biological Sciences

Email Contact: akiger [at] biomail.ucsd.edu

Cells must continuously maintain integrity and compartmentalization while meeting demands for cellular remodeling throughout development, immunity, aging and disease. Using functional genomics, genetics and cell biological approaches in the fruit fly, Drosophila, we are studying the central roles of membrane regulation in dynamic cell structure. We have identified novel endocytosis and autophagy membrane trafficking pathways that control macrophage and muscle remodeling, with relevance to human disease. Current projects in the lab aim to discover new mechanisms of cellular remodeling through functional genomic and proteomic approaches, and to better understand the pathway networks and dynamics during cellular remodeling.

Project: High-throughput image analysis of cell morphology

When: Any Quarter
Last updated: 08/15/2012

  • In collaboration with the Tsimring lab at the BioCircuits Institute, optimize newly developed machine learning image analysis algorithms to quantify cell shape and cell shape changes.
  • Conduct new RNAi screens to test optimized image analysis algorithms, and employ established methodology to identify new and modifying (enhancer/suppressor) gene functions in cellular remodeling.
  • Future directions for long-term projects include use of successful image analysis algorithms in other applications, development of related methods for dynamic analysis of cell shape changes, and/or the analysis of large-scale RNAi screen image data into functional networks.

Richard Kolodner | School of Medicine

Email Contact: rkolodner [at] ucsd.edu

Research in the Kolodner lab is primarily directed at studying the genetic and biochemical mechanisms of genetic recombination, DNA repair and suppression of spontaneous mutations, primarily using Saccharomyces cerevisiae as a model system. Work in S. cerevisiae falls into two interrelated areas: 1) the analysis of the proteins and genes that function in DNA mismatch repair; and 2) elucidation of the pathways that prevent translocations and other types of gross chromosomal rearrangements, and the analysis of the proteins that function in these pathways. We also investigate the genetics of cancer susceptibility and development, following on previous studies showing that a common cancer susceptibility syndrome, Lynch Syndrome (hereditary non-polyposis colorectal cancer), is due to inherited defects in DNA mismatch repair genes. This work is focused on understanding whether genes that prevent genome instability act as tumor suppressor genes in mice and humans and whether it is possible to develop therapeutics that target the genome instability seen in cancer.

Project: Analysis of synthetic lethality in mammalian cells

When: Fall 2011, Winter 2012, Spring 2012, Summer 2012
Last updated: 09/03/2011

A number of genetic defects that cause increased cancer susceptibility, such as mutations in the breast cancer susceptibility gene BRCA2, cause defects in DNA repair and lead to increased genome instability. We have used yeast to identify genes that, when mutated, cause increased genome instability, and have also identified other genes, so-called synthetic lethal genes, in which mutations specifically kill cells that contain a mutation causing increased genome instability. We are using RNAi knockdown and chemical genetic approaches with human tumor cell lines containing defined genetic defects to determine if the same synthetic lethal relationships can be demonstrated in human tumor cells. This project will provide experience in basic molecular biology, yeast genetics and mammalian genetic approaches.

Sakai, W, Swisher, EM, Jacquemont, C, Venkatapoorna Chandramohn, K, Couch, FJ, Langdon, SP, Wurz, K, Higgins, J, Villegar, E, and Taniguchi, T. Functional restoration of BRCA2 protein by secondary BRCA2 mutations in BRCA2-mutated ovarian carcinoma. Cancer Res. 2009;69:6381-6.

Richard Kolodner | School of Medicine

Email Contact: rkolodner [at] ucsd.edu

Research in the Kolodner lab is primarily directed at studying the genetic and biochemical mechanisms of genetic recombination, DNA repair and suppression of spontaneous mutations, primarily using Saccharomyces cerevisiae as a model system. Work in S. cerevisiae falls into two interrelated areas: 1) the analysis of the proteins and genes that function in DNA mismatch repair; and 2) elucidation of the pathways that prevent translocations and other types of gross chromosomal rearrangements, and the analysis of the proteins that function in these pathways. We also investigate the genetics of cancer susceptibility and development, following on previous studies showing that a common cancer susceptibility syndrome, Lynch Syndrome (hereditary non-polyposis colorectal cancer), is due to inherited defects in DNA mismatch repair genes. This work is focused on understanding whether genes that prevent genome instability act as tumor suppressor genes in mice and humans and whether it is possible to develop therapeutics that target the genome instability seen in cancer.

Project: High throughput yeast genetics analysis of pathways that prevent genome instability

When: Fall 2011, Winter 2012, Spring 2012, Summer 2012
Last updated: 09/03/2011

We have developed assays that allow us to measure the rates of accumulation of genome rearrangements such as translocations in the yeast Saccharomyces cerevisiae as well as different assays that allow us to measure DNA damage responses that underlie the formation of genome rearrangements, such as a fluorescence based assay for phosphorylation-dependent degradation of Sml1 that can be followed microscopically. We have adapted these assays for use in a high-throughput robotic mating scheme against a selected subset of the yeast deletion collection to systematically generate single and double mutants containing these assays. Analysis of these mutants will allow elucidation of the genetic networks that prevent genome instability as well as those that function in the DNA damage response. Variations on this project include performing a global mutant analysis, investigating specific pathways such as double strand break repair and their interacting pathways, and bioinformatics analysis of the genetic data for network construction and using next generation sequencing methods for characterizing the structure of genome rearrangements. This project will provide experience in basic molecular biology, yeast genetics, basic cell biology methods, bioinformatics network construction, next generation DNA sequencing and the use of robotic methods.

Putnam, CD, Hayes, TK, and Kolodner, RD. Specific pathways prevent duplication-mediated genome rearrangements. Nature. 2009;460:984-989.

Julie Law | Salk Institute for Biological Studies

Email Contact: jlaw [at] salk.edu

In eukaryotic organisms, DNA is organized into chromatin, a combination of DNA and specialized packaging proteins called histones, which serves as a backdrop for a vast array of essential cell biology including DNA replication, transcription, recombination and repair. However, chromatin is not uniform. An extra layer of information is added at precise genomic locations via the covalent attachment of various chemical tags, termed chromatin modifications, to DNA and/or histones. These modifications alter the landscape of the genome and play critical roles in chromatin biology by communicating with the rest of the cellular machinery through mechanisms that remain poorly understood. Gaining insight into this aspect of chromatin function will be critical not only in understanding normal biological processes but also in understanding how alterations in chromatin modifications can lead to developmental defects and disease. To begin investigating these complex and diverse facets of chromatin, my lab will focus on the characterization of proteins, termed “chromatin readers”, which bind specific DNA or histone modifications, and thus offer an excellent tool to begin dissecting the events occurring downstream of chromatin modifications. The plant model Arabidopsis thaliana provides an ideal system to study the effects of chromatin modifications: it is genetically malleable, highly amenable to genome-wide analyses and tolerant of dramatic changes in its chromatin landscape. Furthermore, many of the pathways involved in the establishment, maintenance, and removal of chromatin modifications are conserved between plants and mammals. Thus, understanding how chromatin modifications influence cellular processes in plants will also be informative for mammalian systems.

Project: Integrating chromatin readers, the epigenetic landscape, and specific cellular outputs

When: Any Quarter
Last updated: 11/20/2013

The genome-wide distribution of chromatin binding proteins is influenced by many factors including the local chromatin environment, other chromatin modifications, and the underlying DNA sequence. Once recruited to chromatin, such proteins often further modify chromatin structure and function, setting into motion a complex and dynamic interplay between chromatin and specific cellular machinery that is required to orchestrate diverse biological processes. We are interested in integrating these genomic and epigenomic features, as assessed using next generation sequencing technology, within the context of specific chromatin readers to gain insight into their biological functions on a genome-wide scale. Several projects along these lines are available.

Prashant Mali | Bioengineering

Email Contact: pmali [at] ucsd.edu

Project: Analysis of gene network perturbations in human cells

When: Any Quarter
Last updated: 10/24/2014

In this project we will develop new methods for perturbation and analysis of gene networks in pluripotent stem cells using the CRISPR-Cas systems. The student will have an opportunity to understand the experimental procedures for human genome engineering, to learn the state-of-the-art methods for bioinformatics analysis ranging from computational reagent design to next generation sequencing to network modeling, and to also develop innovative strategies for enhanced multiplexed reverse genetic screens in human cells.

Andrew McCammon | Chemistry and Biochemistry

Email Contact: jmccammon [at] ucsd.edu

Project: Computer-aided Drug Discovery

When: Winter or Spring, depending on desk availability
Last updated: 04/06/2012

The McCammon group conducts a very wide range of research activities, from the deeply biological (studies of protein and nucleic acid targets for drugs for infectious diseases, studies of protein kinase regulation, etc.) to the development of mathematical and physical methods for simulating biological processes (development of methods for solving partial differential equations, exploring the role of hydrodynamic interactions in protein-protein association, etc.). All of this work involves the use of computers; we do no experimental work in the traditional sense, but we have extensive collaborations with experimental labs at UCSD, The Scripps Research Institute, The Salk Institute, and elsewhere. A more complete perspective can best be obtained by visiting the McCammon group website (http://mccammon.ucsd.edu/).

Participants in the Bioinformatics Graduate Program who have completed their Ph.D.’s in the McCammon group have often focused on computer-aided drug discovery. Rotations can typically be arranged that involve working on aspects of ongoing projects. Current drug discovery work includes efforts to identify compounds that might be effective as antiviral agents (for HIV/AIDS or influenza), anti-trypanosomal agents (for African sleeping sickness or Chagas disease), etc. Our previous work has facilitated the discoveries of the HIV-1 protease inhibitor nelfinavir (approved by the FDA in 1997) and the first-in-class HIV-1 integrase inhibitor raltegravir (approved by the FDA in 2007).

Andrew McCulloch | Bioengineering

Email Contact: amcculloch [at] ucsd.edu

Project: Systems Biology of Hypoxia Tolerance and Susceptibility

When: Fall 2011, Winter 2012, Spring 2012, Summer 2012, Any Quarter
Last updated: 04/05/2012

This project combines bioinformatics and systems biology to design experiments, analyze high-throughput/high-content data (on DNA sequence, gene expression, protein-DNA interactions and metabolomics), reconstruct network models and develop new hypotheses on the molecular mechanisms of hypoxia susceptibility and tolerance in Drosophila, mouse and humans. This project includes the opportunity to interact with investigators from several labs and to participate in experimental studies and data acquisition as well as performing systems biology analysis. Individual rotation projects will address smaller questions within this overall theme based on the specific interests of the student.

Graham McVicker | Salk Institute for Biological Studies

Email Contact: gmcvicker [at] salk.edu

The McVicker laboratory aims to understand how chromatin state and organization are encoded by the human genome. Our approach to this problem is to exploit naturally occurring human genetic variation to identify sequence variants that disrupt chromatin function. We are currently focused on chromatin within immune cells and we are also interested in how variants that affect chromatin and gene regulation lead to disease risk. The problems that we work on often require the development of sophisticated computational and statistical methods that can extract subtle signals from noisy experimental data.

Project: A genetic approach to identifying enhancer-promoter interactions

When: Any Quarter
Last updated: 02/26/2016

Enhancers activate genes from a distance, but little is understood about how they find the correct promoters to regulate. Enhancers will often activate non-adjacent genes and will 'skip-over' the promoters of nearby genes. The goal of this project is to develop a method to identify genetic variants that jointly affect enhancer activity and transcription from promoters (i.e. joint enhancer-promoter QTLs). This will allow us to determine which enhancers regulate which genes and learn about the sequences that are involved in enhancer-promoter interaction.

Graham McVicker | Salk Institute for Biological Studies

Email Contact: gmcvicker [at] salk.edu

The McVicker laboratory aims to understand how chromatin state and organization are encoded by the human genome. Our approach to this problem is to exploit naturally occurring human genetic variation to identify sequence variants that disrupt chromatin function. We are currently focused on chromatin within immune cells and we are also interested in how variants that affect chromatin and gene regulation lead to disease risk. The problems that we work on often require the development of sophisticated computational and statistical methods that can extract subtle signals from noisy experimental data.

Project: Discovery of genetic variants that disrupt enhancer-promoter looping

When: Any Quarter
Last updated: 02/26/2016

Enhancer-promoter interaction is believed to occur through chromatin looping. The goal of this project is to identify genetic variants that disrupt chromatin looping between enhancers and promoters using Hi-C experimental data.

Graham McVicker | Salk Institute for Biological Studies

Email Contact: gmcvicker [at] salk.edu

The McVicker laboratory aims to understand how chromatin state and organization are encoded by the human genome. Our approach to this problem is to exploit naturally occurring human genetic variation to identify sequence variants that disrupt chromatin function. We are currently focused on chromatin within immune cells and we are also interested in how variants that affect chromatin and gene regulation lead to disease risk. The problems that we work on often require the development of sophisticated computational and statistical methods that can extract subtle signals from noisy experimental data.

Project: Using chromatin to identify important cell types for Systemic Lupus Erythematosus (SLE)

When: Any Quarter
Last updated: 02/26/2016

SLE is a complex autoimmune disease that is poorly understood. The goal of this project is to use cell-type specific chromatin marks and GWAS data to determine which cell types are most relevant to this disease. This project will involve development and/or application of methods that can correctly take into account linkage disequilibrium and sub-genome-wide significant GWAS hits to identify cell-type specific enrichments of GWAS signals. SLE is far more common in women than in men, and it may be possible to use sex-specific chromatin accessibility or marks, depending on what samples are available.

Christian Metallo | Bioengineering

Email Contact: cmetallo [at] ucsd.edu

Metabolism touches virtually all aspects of biology, and defects in the regulation of biochemical reactions often contribute to disease pathogenesis. In the context of cancer, tumors must reprogram these networks to fuel their aggressive growth and survive. The complexity and interconnected nature of metabolic pathways necessitate the use of systems biology approaches to characterize their function. Our lab develops and applies quantitative methods to study metabolic regulation in cancer, stem cells, and the cardiovascular system. These data-driven metabolic flux analysis (MFA) approaches involve stable isotope tracers, mass spectrometry, and computational algorithms that analyze data as a system. We use these techniques to identify metabolic dependencies in cancer cells that are driven by tumor genetics or the surrounding microenvironment.

Project: Metabolic flux profiling in cancer

When: Any Quarter
Last updated: 08/12/2012

Systems biology rotation projects are available to probe various metabolic pathways in cancer cells, including carbohydrate, fatty acid, and amino acid metabolism as well as redox pathways. Students will have the opportunity to build network models and generate MFA data in engineered cancer cells.

Siavash Mirarab | Electrical and Computer Engineering

Email Contact: smirarabbaygi [at] ucsd.edu

Our lab specializes in reconstruction of evolutionary histories (phylogenies) from large scale datasets and applications of phylogenetic analyses to downstream analyses. Large-scale datasets include those with many genes and those with many species, and we focus on high accuracy and scalability at the same time. Many projects in this area are available, some of which are described below, but students can contact me to start on other projects as well.

Project: Multiple sequence alignment

When: Fall 2015, Winter 2016, Spring 2016, Summer 2016
Last updated: 10/08/2015

Developing methods for computing a consensus among large numbers of large multiple sequence alignments using the concept of an equivalence class.

Siavash Mirarab | Electrical and Computer Engineering

Email Contact: smirarabbaygi [at] ucsd.edu

Our lab specializes in reconstruction of evolutionary histories (phylogenies) from large scale datasets and applications of phylogenetic analyses to downstream analyses. Large-scale datasets include those with many genes and those with many species, and we focus on high accuracy and scalability at the same time. Many projects in this area are available, some of which are described below, but students can contact me to start on other projects as well.

Project: Reconstruction of species trees from gene trees

When: Fall 2015, Winter 2016, Spring 2016, Summer 2016
Last updated: 10/08/2015

Several projects are available, with different emphasis. Other projects in this general area can also be defined based on student interest.

  1. Improving ASTRAL (an algorithm for species tree reconstruction from gene trees) to handle more varied datasets, to improve scalability as the number of genes increases, and to give better theoretical analysis of the algorithm. An HPC implementation is also of interest.
  2. Testing ASTRAL for gene trees that include duplication and loss events in addition to incomplete lineage sorting.
  3. Re-analyzing a set of biological datasets using recently developed methods, and comparing their empirical performance.
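
As a rough illustration of the quantity that ASTRAL seeks to maximize (the number of four-taxon "quartet" topologies shared between a candidate species tree and the input gene trees), the sketch below counts shared quartets by brute force. The nested-tuple tree encoding and the function names are assumptions made for this example only. This is not ASTRAL's algorithm, which does not enumerate quartets one by one, and the sketch is only practical for a handful of taxa.

```python
from itertools import combinations

def clades(tree):
    """Return (leaf set of `tree`, leaf sets of all its internal nodes).

    A tree is a leaf name (str) or a tuple of subtrees (hypothetical encoding).
    """
    if isinstance(tree, str):
        return frozenset([tree]), []
    leaves = frozenset()
    found = []
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found.extend(child_clades)
    found.append(leaves)
    return leaves, found

def quartet_topologies(tree):
    """Map each resolved 4-taxon subset to its split, e.g. {{A,B},{C,D}}."""
    leaves, all_clades = clades(tree)
    result = {}
    for quartet in combinations(sorted(leaves), 4):
        q = frozenset(quartet)
        for clade in all_clades:
            inside = clade & q
            # A clade holding exactly two of the four taxa separates
            # them from the other two, which fixes the quartet topology.
            if len(inside) == 2:
                result[q] = frozenset([inside, q - inside])
                break
    return result

def quartet_score(species_tree, gene_trees):
    """Count gene-tree quartets whose topology matches the species tree."""
    species_q = quartet_topologies(species_tree)
    score = 0
    for gene_tree in gene_trees:
        for q, topo in quartet_topologies(gene_tree).items():
            if species_q.get(q) == topo:
                score += 1
    return score

species = (("A", "B"), ("C", "D"))
genes = [(("A", "B"), ("C", "D")), (("A", "C"), ("B", "D"))]
print(quartet_score(species, genes))
```

In this toy input only the first gene tree agrees with the candidate species tree on the single quartet, so the score is 1. Unresolved quartets (polytomies) contribute nothing under this encoding.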

Saket Navlakha | Salk Institute for Biological Studies

Email Contact: navlakha [at] salk.edu

We are interested in the interface of theoretical computer science and systems biology. By thinking computationally about the goals, constraints, and algorithmic strategies used by biological systems, we hope to advance both computer science (by developing new bio-inspired algorithms) and biology (by raising testable hypotheses and developing theory and models to predict system behavior).

Project: Algorithms in Nature

When: Any Quarter
Last updated: 06/22/2015

There are many parallels in the constraints and goals of biological and computational systems, which suggest that one can learn from the other. Such constraints include distributed information processing (without a centralized controller), bounded costs (time, energy, or resources available to solve a problem), and limited communication channels between nodes. Goals include enabling efficient input-output responses, robustness in the face of perturbations or attacks, and adaptability to different conditions or environments.

We explore the perspective that biological processes are inherently algorithms that nature has designed to solve computational problems. We study a variety of systems, including neural interactions in the brain, molecular interactions in the cell, and 3D plant growth, and their relation to problems in network design, machine learning, and engineering.

I would be happy to chat about details any time. Thanks.

Bernhard Palsson | Bioengineering

Email Contact: bpalsson [at] ucsd.edu

Project: Bioinformatics Rotation Projects Available

When: Any Quarter
Last updated: 09/17/2009

See http://gcrg.ucsd.edu/About_Us for information about Systems Biology research in the Palsson lab.

Note: This list is not exhaustive. Feel free to come with your own idea, and make sure to speak with people in the lab about new projects, which are always coming up. There is also a useful course series (BENG 211-213) that can serve as an introduction to the work we do in the lab.

  1. Modeling metabolic effects of light exposure in green microalgae
    Contact: Roger Chang, rlchang [at] ucsd.edu
    The goal of this project is to model the effects of light on reaction activity and biomass composition. This also requires proper representation of the spectral composition of light sources and elucidating the quantitative difference between incident light and metabolically available light.
  2. Human physiological/metabolic network reconstruction
    Contact: Roger Chang, rlchang [at] ucsd.edu
    This project is meant to assemble a knowledgebase of human metabolic interactions across organ systems and biofluids. In addition to providing an easily understandable network capturing intersystem metabolism, this result can be integrated with context-specific models to create composite multi-system models to study precise metabolic interactions among the organs and biofluids.
  3. Experimental gap filling iAF1260c (E. coli)
    Contact: Jeff Orth, jorth [at] ucsd.edu
    This project is meant to improve the current metabolic model of E. coli. In order to improve the topology and stoichiometry of the network, experimental work is necessary to validate computationally suggested additions and subtractions to the core model. Strains from the KEIO knockout collection will be grown and tested in different media to assess growth phenotypes. Iterative improvements will be made to the model based on the results.
  4. Y. pestis metabolic reconstruction
    Contact: Pep Charusanti, pcharusanti [at] ucsd.edu
    This project will culminate in the creation of a metabolic reconstruction for Y. pestis, an important human pathogen. Work in this area can leverage the recent reconstructions of closely related organisms and strains.
  5. Improving an automated genome assembly pipeline (using only small reads)
    Contact: Harish Nagarajan, nh [at] ucsd.edu, and Christian Barrett, cbarrett [at] ucsd.edu
  6. Brain mitochondria modeling
    Contact: Nathan Lewis, natelewis [at] gmail.com
  7. Microarray data analysis (E. coli)
    Contact: Nathan Lewis, natelewis [at] gmail.com
  8. Speed optimize an algorithm for binding site identification for ChIP-on-chip data
    Contact: Nathan Lewis, natelewis [at] gmail.com
  9. Got SARS? Modeling the impact of SARS-CoV infection on human epithelial cell metabolism.
    Contact: Daniel R. Hyduke, hyduke [at] ucsd.edu
    Nearly 5 years after the emergence of SARS-CoV, effective licensed vaccines and drugs do not exist to protect the public health. The risk of its re-emergence, or of its recombining with a common human "cold" coronavirus thus facilitating transmission, makes this virus a continuing public health threat. To identify effective strategies to counter SARS infection, we are interested in understanding how SARS reprograms epithelial cell metabolism to promote viral replication. The project involves the development of a tissue-specific model for the human Calu-3 lung epithelial cell line from a preexisting generic model of human metabolism. To model infection, SARS-CoV genes encoding enzymes and proteins necessary to produce the viral envelope will be integrated with the Calu-3 model. The integrated model will be used in conjunction with transcriptome data from the Systems Virology Center to model the shifts in metabolic pathway activity upon viral infection.

Pavel Pevzner | Computer Science and Engineering

Email Contact: ppevzner [at] ucsd.edu

Project: Error-correction in next generation sequencing data: applications to structural variations

When: Fall 2010
Last updated: 09/05/2010

The recent emergence of high-throughput sequencing technologies has revolutionized genomics by providing a new wealth of data for biologists to learn from. It has facilitated our ability to discover the genomic sequence of novel species through a process called sequence assembly. However, even for humans, whose genome has already been assembled, different individuals have slightly variant genomes, accounting for a myriad of important phenotypical differences. Discovering this variation is often the starting point for finding the underlying genetic causes of diseases, and has, among other things, improved our abilities for early detection of cancer. Despite these exciting applications, the growth of sequencing technology has outpaced the development of downstream algorithms. Many assembly and variation discovery projects have been done without properly addressing a key property of the data: it is prone to errors and is not always reliable. Algorithms for error-correction have been inadequate in many assembly projects and simply non-existent in most variation discovery projects. Accurate error-correction algorithms will have a wide effect on the quality of downstream algorithms, since they correct potential mistakes at the very early stages.

The proposed project is to design an effective error-correction algorithm (with an eye towards variation detection and assembly). The design of such an algorithm is a challenging problem, and though we have general ideas, most design decisions remain to be made. The project therefore requires a motivated student with a superb understanding of algorithms and a good handle on programming. Since the datasets for this problem are hundreds of gigabytes in size, an understanding of data structures is crucial. Some background in either machine learning or graph theory is a plus, though not required. No biological knowledge is required.
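
For readers new to the problem, the sketch below shows one widely used starting point, k-mer counting, in which k-mers that occur rarely across all reads are treated as likely to contain sequencing errors. It is offered only as an illustration of the problem setting and is not the algorithm this project will design. The function names, the value of k, and the count threshold are all assumptions chosen for the toy example, and a real implementation would need memory-efficient data structures for datasets of the size described above.

```python
from collections import Counter

def kmer_counts(reads, k):
    """Count every length-k substring across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

def weak_kmer_positions(read, counts, k, min_count):
    """Return start positions of k-mers seen fewer than min_count times.

    Runs of such 'weak' k-mers often surround a sequencing error,
    although rare true variants can produce the same signal.
    """
    return [
        i
        for i in range(len(read) - k + 1)
        if counts[read[i:i + k]] < min_count
    ]

# Toy data: five identical reads plus one read with a single substitution.
reads = ["ACGTACGT"] * 5 + ["ACGTTCGT"]
counts = kmer_counts(reads, k=4)
print(weak_kmer_positions("ACGTTCGT", counts, k=4, min_count=2))
print(weak_kmer_positions("ACGTACGT", counts, k=4, min_count=2))
```

In the toy data, the four weak k-mers in the altered read all overlap the substituted base, while the unaltered read has none. Distinguishing such errors from genuine variation is exactly the difficulty the project description points to.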

The successful completion of this project will ideally lead to a publication in a leading conference or journal. More importantly, it is a good stepping stone into understanding the field of assembly and variation detection and into other related projects in Pavel Pevzner’s lab.

Anjana Rao | Pharmacology

Email Contact: anrao [at] ucsd.edu

We are interested in all aspects of gene regulation, and have used model systems including cells of the immune system, embryonic stem cells, haematopoietic stem cells and cells of the mouse embryo.

Project: Changes in chromatin accessibility during differentiation

When: Any Quarter
Last updated: 07/23/2014

To study changes in chromatin during cell fate determination, we are using several models of differentiation in the haematopoietic system as well as transdifferentiation and reprogramming.  The combination of RNA-seq, ChIP and ATAC-seq (Assay for Transposase-Accessible Chromatin, [PMID: 24097267]) has revealed a striking interplay between chromatin regulators involved in histone and DNA modifications.  Additionally, we are using RNA-interference with shRNAs targeting transcription factors and chromatin-related genes to characterize changes in chromatin and gene expression.

Project: Identification of long-range interactions in chromatin mediated by transcription factors (ChIA-PET)

When: Any Quarter
Last updated: 07/23/2014

Regulatory elements that control gene expression are often positioned far away from the coding regions of genes. The regulatory regions can be “looped”, bringing them and their associated DNA binding proteins close to a gene’s promoter or other regulatory element. We recently identified a unique structural homodimer of the transcription factor FOXP3 that enables this dimer to bind two sites in DNA that may be separated by long distances [PMID: 21458306]. FOXP3 controls the function of regulatory T cells, and mutations in FOXP3 that abolish dimer formation are associated with an autoimmune syndrome called IPEX, owing to defects in T regulatory cell function [PMID: 22224781]. We are using a method known as ChIA-PET [PMID: 22926262] and several next generation sequencing strategies to assess how the FOXP3 homodimer mediates long-range interactions in DNA and controls T regulatory cell function.

Project: Interplay between histone and DNA modifications

When: Any Quarter
Last updated: 07/23/2014

Histone ubiquitylation is an important modification that is less studied than the more standard histone modifications (methylation, acetylation, phosphorylation). The two histones that are modified by ubiquitylation are H2A and H2B. H2A ubiquitylation and deubiquitylation are controlled by several enzymes, many of which are frequently mutated or otherwise dysregulated in different cancers. Because the DNA methylcytosine oxidase TET2 [PMID: 24220273] and ASXL1, a component of an H2A deubiquitinase complex [PMID: 20436459], are frequently co-mutated in haematopoietic cancers, we are studying the relation between H2A ubiquitylation and DNA methylation status in normal cells and during cancer development.

Project: Mapping DNA methylation and oxi-mC distribution

When: Any Quarter
Last updated: 07/23/2014

This project examines the contribution of TET proteins to DNA methylation patterns and gene expression during normal differentiation and in cancer, by mapping the distribution of DNA methylation and oxi-mC. Cytosine methylation is an important modification in DNA and controls gene expression by altering transcription factor binding and local chromatin structure. Our lab discovered the enzymatic activity of Ten-eleven translocation (TET) proteins [PMID: 19372391], dioxygenases that modify DNA methylation patterns by oxidizing 5-methylcytosine (5mC) to 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC), and 5-carboxylcytosine (5caC). We have developed tools to map the location of these modified bases in DNA by next-generation sequencing [PMID: 21552279, 24474761], and have generated mice conditionally deficient for each of the three TET proteins. We are using these methods and reagents to explore how TET proteins, 5mC and the oxidised methylcytosines (5hmC, 5fC and 5caC, together termed oxi-mC) influence cell differentiation and cancer.

Project: Mapping enhancers to interrogate the role of single nucleotide polymorphisms (SNPs) in immune diseases

When: Any Quarter
Last updated: 07/23/2014

We have recently completed a study in which we used ChIP-seq for the enhancer-associated modification H3K4me2 to identify enhancers associated with T cell differentiation in healthy subjects as well as patients with asthma [Seumois G et al., Nature Immunology, in press]. Similar studies will be ongoing over the next few years.

Project: Modelling of single-cell RNA biology

When: Any Quarter
Last updated: 07/23/2014

We are using single-cell RNA-seq to investigate the regulation of gene expression during normal and dysregulated differentiation of mouse embryos and embryonic stem cells. To identify key regulatory factors of gene regulation, it will be crucial to apply existing computational methods and develop novel computational methods for the analysis of these single-cell RNA-seq data, including methods for unsupervised clustering, differential expression, and time-series analyses.

Bing Ren | Cellular and Molecular Medicine

Email Contact: biren [at] ucsd.edu

The laboratory’s research is focused on understanding the fundamental mechanisms controlling gene expression in mammalian cells. In particular, the laboratory is investigating three related problems:

  1. What are the transcriptional regulatory sequences that control cell-specific gene expression programs in the mammalian genomes?
  2. How do these sequence elements interact with transcription factors and chromatin binding proteins to regulate gene expression during cellular differentiation?
  3. How do epigenetic mechanisms (DNA methylation and chromatin modifications) influence the gene regulatory process?

Project: Dynamic chromatin landscapes in differentiating human ES cells

When: Fall 2011, Winter 2012, Spring 2012, Summer 2012
Last updated: 08/16/2011

As part of the NIH Roadmap epigenome project, my lab has comprehensively mapped the genome-wide status of various chromatin modifications in human ES cells and in several partially differentiated cell types derived from them. The rotation project will focus on understanding how the dynamics of chromatin modifications relate to differentiation and cell fate determination.

Project: Study of the role and regulation of long-range genomic interactions in the human genome

When: Fall 2011, Winter 2012, Spring 2012, Summer 2012
Last updated: 08/16/2011

Many gene regulatory elements can regulate target genes located at a long distance, and this has been thought to occur through DNA looping. However, the exact role and mechanisms of long-range DNA interactions in human cells remain to be resolved. We have generated comprehensive maps of long-range DNA interactions for different cell types, and are finding interesting principles of genome organization in mammalian cells. Rotation projects will focus on understanding the relationships between long-range DNA interactions and regulation of gene expression, and on developing predictive models of gene expression.

Project: Systematic identification and characterization of transcriptional regulatory elements in the mouse genome

When: Fall 2011, Winter 2012, Spring 2012, Summer 2012
Last updated: 08/16/2011

The laboratory, as part of the mouse ENCODE project, is in the midst of producing the first comprehensive map of cis-regulatory elements in the mouse genome. Computational analyses are being conducted to identify the regulatory sequences controlling tissue-specific gene expression, determine the evolutionary conservation of these sequences, and predict potential transcription factors involved in lineage specification and animal development.

Doug Richman | Pathology

Email Contact: drichman [at] ucsd.edu

The Richman Laboratory is a translational HIV research laboratory that uses basic science techniques to answer clinically relevant questions. These laboratory techniques include a wide range and combination of wet lab and computational methods. Prospective students will have exposure to aspects of both wet laboratory and computational investigation. They will also have exposure to clinical aspects of HIV disease to place their research into clinical perspective. Currently available projects include:

Project: Characterize the composition of the latent HIV reservoir, including the DNA sequences comprising this reservoir

When: Any Quarter
Last updated: 08/02/2012

The latent HIV reservoir is the obstacle to cure. We are characterizing the cellular and viral composition of this reservoir using multiple flow cytometric, molecular and sequencing techniques.

Project: Determining the incidence of HIV dual infection following primary infection

When: Any Quarter
Last updated: 08/02/2012

HIV-1 dual infection occurs when an individual is infected with more than one variant of HIV. Using well-characterized cohorts from San Diego and multiple locations in Africa, we are determining the presence or absence of HIV-1 dual infection using next generation sequencing, and correlating this with other important pathogenetic parameters.

Project: Determining the level of neutralizing antibody required to protect an individual from superinfection (a type of dual infection)

When: Any Quarter
Last updated: 08/02/2012

Use next generation sequencing of HIV populations in longitudinally collected samples to determine viral genetic changes associated with the development of protective neutralizing antibody, an important objective towards designing an HIV vaccine.

Scott Rifkin | Biological Sciences

Email Contact: sarifkin [at] ucsd.edu

The Rifkin laboratory studies how environmental, genetic, and stochastic variation interact to generate phenotypic variation. We use yeasts and nematodes as model organisms and work primarily at the level of gene regulatory and signal transduction networks.

Project: Bioinformatics Rotation Projects Available

When: Winter and Spring quarters
Last updated: 08/16/2011

Project 1:

We have collected data on gene expression patterns underlying endodermal specification in the principal model strain of C. elegans. We use a technology that has single molecule spatial resolution so it is possible to count and localize transcripts in individual worm embryos. Possible rotation projects would be to measure gene expression in other strains of C. elegans and/or related species which would expose students to nematode culture techniques, developmental genetics and developmental systems biology, fluorescence microscopy, and image analysis.

Project 2:

We are currently measuring gene expression and signal transduction dynamics in individual yeast cells under a variety of conditions, over a diversity of genetic backgrounds, and in a few pathways. Rotation students could participate in this project and would learn yeast culture and molecular genetics, fluorescence microscopy, and image analysis.

Project 3:

There is also the opportunity to develop a more theoretical project simulating and analyzing the effects of different sources of variation on genetic networks, either using small toy models or established models of real networks.
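
As a rough idea of what a small toy model might look like, the sketch below simulates stochastic expression of a single gene with the Gillespie algorithm: mRNA is produced at a constant rate and degraded at a rate proportional to its copy number. This is only an illustrative assumption about the kind of model involved, not a model used by the lab; the function name, the rate parameters and their values are all hypothetical.

```python
import random


def simulate_mrna(k_make, k_decay, t_end, rng):
    """Gillespie simulation of a birth-death process for one gene.

    k_make:  constant production rate (molecules per unit time)
    k_decay: per-molecule degradation rate
    Returns the mRNA copy number at time t_end, starting from zero.
    """
    t, n = 0.0, 0
    while True:
        birth = k_make
        death = k_decay * n
        total = birth + death
        t += rng.expovariate(total)  # waiting time to the next reaction
        if t > t_end:
            return n
        if rng.random() * total < birth:
            n += 1  # transcription event
        else:
            n -= 1  # degradation event


# Simulate 500 genetically identical "cells"; all variation is stochastic.
rng = random.Random(0)
cells = [simulate_mrna(10.0, 1.0, 20.0, rng) for _ in range(500)]
mean = sum(cells) / len(cells)
variance = sum((c - mean) ** 2 for c in cells) / (len(cells) - 1)
print(mean, variance / mean)
```

For this simple process the copy number at steady state follows a Poisson distribution with mean `k_make / k_decay`, so the ratio of variance to mean should be close to 1. Genetic or environmental variation could be layered on top by drawing the rate parameters themselves from a distribution across cells or strains.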

Michael Rosenfeld | School of Medicine

Email Contact: mrosenfeld [at] ucsd.edu

Lab Location: CMM-West, Rm. 345

Lab Phone: 858-534-5858

Lab Composition and Activities: Five graduate students from several programs, and a talented group of enthusiastic (also helpful) postdoctoral fellows and a full time laboratory manager. We have one general laboratory meeting, one graduate student-only meeting, and one personal meeting each week. We also have joint lab meetings with two other labs weekly.

Research Interests: Our central laboratory focus this year is to continue to utilize global genomic approaches to uncover and investigate the “enhancer code” controlled by new, previously unappreciated pathways that integrate the genome-wide response to permit proper development and homeostasis, and that also function in disease and senescence. We have investigated these events in differentiated cells, neuronal development, stem cells, and cancer. Our biological focus is on molecular mechanisms of the “enhancer code” regulating learning and memory, aggressive prostate and breast cancer, and the underlying events of senescence/aging. Epigenomic events studied include non-histone methylation events and non-coding RNAs. We are investigating these events in development, breast and prostate cancers, and in inflammation-based disease, including degenerative CNS disease and diabetes. The emerging importance of non-coding RNAs and regulation of nuclear architecture is rapidly altering our concepts of homeostasis and disease. Our laboratory is “Seq-ing” (RIP-seq, ChIP-seq, RNA-seq, GRO-seq, CLIP-seq, ChIRP-seq, and a new “FISH-seq”) for open-ended discovery of long-distance genome interactions, to uncover new “rules” of regulated gene transcriptional programs and new roles for lncRNAs in the biology of normal cells, cancer, neuro-affective disorders, and aging cells. Coupling this with chemical library screens, we hope to introduce new types of therapies based on targeting specific gene enhancers, histone protein readers and writers, and lncRNAs for cancers and other diseases. Recent surprising findings have been novel roles of lncRNAs in prostate and breast cancer, a connection between DNA damage repair/transcription and replication, and unexpected roles of enhancer RNAs.

Current interests include:

  • The “enhancer code,” Epigenomics and transcriptional regulatory mechanisms.
  • Roles of ncRNAs in enhancer function, in signal-dependent genomic relocation, and in establishing subnuclear architecture.
  • Mechanisms of signal-induced tumor chromosomal translocation events and new chemical screens for inhibitors for breast and prostate cancer.
  • The “enhancer code” regulating learning and memory, including Reelin-regulated enhancers.
  • Linkage of DNA damage/repair and transcription.
  • Retinoic Acid regulation of Pol III-transcribed DNA repeats in maintenance of the stem cell state, in neuronal differentiation and in senescence.
  • Molecular mechanisms of prevalent disease-associated sequence variations (GWAS) in disease susceptibility loci.
  • “Epigenomics” in neuronal differentiation, cancer, diabetes and degenerative brain disease.
  • Answering the question of when and how enhancers arise and become functional (stem cells to mature cell types).

Project: Bioinformatics Rotation Projects

When: Any Quarter
Last updated: 08/12/2013

Potential projects include:

  • Projects employing genome-wide technologies, including ChIP-seq, GRO-seq, CLIP-seq, RNA-seq, and ChIRP-seq, to elucidate molecular mechanisms of regulated enhancer lncRNA actions in cancer and stem cells;
  • Roles and mechanisms of enhancer actions in prostate and breast cancers;
  • Enhancer-based model of neurodevelopment and CNS disorders;
  • New mechanisms of long non-coding RNAs dictating physiological gene regulation in cancer transcriptional programs;
  • Understanding subnuclear structures: Roles of relocation of transcription units between subnuclear architectural structures in regulated gene expression;
  • Chemical library screens to gene signature and translocation responses as an approach toward new cancer therapeutic reagents;
  • Roles of epigenomic regulators and expression of DNA repeats in stem cells, neuronal differentiation and in senescence.

Julian Schroeder | Biological Sciences

Email Contact: jischroeder [at] ucsd.edu

Project: Systems Biology and Engineering of Environmental and Drought Tolerance in Plants

When: Any Quarter
Last updated: 06/24/2015

Julian Schroeder's research is directed at discovering the signal transduction mechanisms and the underlying signaling networks that mediate resistance to environmental stresses in plants, in particular drought, salinity stress and heavy metal stress. These environmental (abiotic) stresses have substantial negative impacts and reduce global plant growth and biomass production. These environmental stresses are also relevant in reference to climate change and to maintaining available arable land to meet human needs. Research in Julian Schroeder's laboratory uses multidisciplinary approaches including genomics, bioinformatics, cell signaling, network modeling, proteomics and molecular biology towards uncovering the signal transduction network and receptors in plants that translate drought stress hormone reception, CO2 sensing and salinity stress to specific resistance responses. Some recent research advances are being used in the biotechnology industry with the goal of enhancing stress resistance of plants and crop yields.

A rotation project will be pursued to model and identify the drought stress and CO2-induced signaling networks based on “omic” scale data sets. Models will be directly tested by wet lab experimentation.

Julian Schroeder is Co-Director of the Center for Food and Fuel for the 21st Century.  See http://www-biology.ucsd.edu/labs/schroeder for more information on the Schroeder lab.

Selected publications

  • Nishimura et al., Science (2009).
  • H.H. Hu et al., Nature Cell Biol. (2010).
  • T.H. Kim et al. Current Biol (2011).
  • Xue et al., EMBO J. (2011).
  • F. Hauser et al. Current Biol (2011).
  • B. Brandt et al., PNAS (2012).
  • R. Waadt et al., eLife (2014).
  • A.M. Jones et al., Science (2014).
  • C.B. Engineer et al., Nature (2014).
  • B. Brandt, S. Munemasa et al. eLife (2015).
  • See also: http://labs.biology.ucsd.edu/schroeder/publications.html

Dorothy Sears | School of Medicine

Email Contact: dsears [at] ucsd.edu

The focus of my research is on the fields of insulin resistance and obesity in humans, rodents and cell culture models. A new aspect of this research is the study of how insulin resistance promotes breast cancer. We employ systems biology approaches for complex data analysis. Research efforts include the generation of large transcriptomic, metabolomic, proteomic and lipidomic data sets and the analysis of these data sets independently and as network overlays using bioinformatic pathway tools. Our goals are to identify and characterize individual genes and biochemical pathways that regulate insulin resistance and pharmacological insulin sensitization.

Project: Insulin resistance

When: Any Quarter
Last updated: 08/16/2012

Potential projects:

  1. Analysis of plasma lipidomic data from female human subjects with, without, or at risk for cancer (breast, endometrial, uterine). These lipidomic datasets would be integrated with clinical and diagnostic parameters from the same subjects. One aim is to identify lipid expression patterns that characterize cancer risk, type, and/or recurrence. Another aim is to identify novel targets for cancer preventative and/or interventional therapy.
  2. Analysis of plasma lipidomic data and urine metabolomics data from obese human subjects before and after diet interventions designed to improve insulin sensitivity. These datasets would be integrated with clinical and diagnostic parameters from the same subjects. One aim is to identify lipid or metabolite expression patterns that characterize insulin resistance and changes induced by diet intervention. Another aim is to identify novel targets for obesity interventional therapy.
  3. Promoter sequence analysis of genes that are co-regulated with insulin resistance and/or pharmacological therapy in humans, mice and rats. The aim is to identify common transcriptional regulators of these genes, based on the identification of common DNA sequence motifs present in their promoter sequences. Findings would lead to further analyses of how the identified transcriptional regulators are themselves modulated by insulin resistance and/or pharmacological therapy.
  4. Proteomic and mRNA expression data analysis of two subtypes of macrophages that are characteristic of those found in inflamed adipose tissue of obese rodents and humans. The aim is to identify gene expression patterns that can increase our understanding of how these two macrophage subtypes differ in their inflammatory and metabolic behaviors.

Jonathan Sebat | Cellular and Molecular Medicine

Email Contact: jsebat [at] ucsd.edu

Our laboratory is interested in how rare and de novo mutations in the human genome contribute to patterns of genetic variation and risk for disease in humans. To this end, we are developing novel approaches to gene discovery that are based on advanced technologies for the detection of rare variants, including studies of copy number variation (CNV) and deep whole genome sequencing (WGS). Our goal is to identify genes related to psychiatric disorders and determine how genetic variants impact the function of genes and corresponding cellular pathways.

Project: Determining the effect of autism mutations on development of the head and face

When: Any Quarter
Last updated: 06/09/2016

We have collected whole genome sequence data and 3D digital images of the head and face from a set of 300 autism families. This project will examine quantitative measurement of facial features in autism patients and sibling controls and determine the degree to which specific mutations affect craniofacial structure. We will apply unsupervised clustering of genetic and phenotype data to define diagnostic subgroups of patients.

Project: Determining the frequency of spontaneous reversion in the human genome

When: Any Quarter
Last updated: 06/09/2016

Structural variants (SVs) in the human genome are poorly ascertained in genome-wide association studies (GWAS). Tandem duplications in particular are not efficiently tagged by adjacent SNPs. The reasons for this are not known. We hypothesize that SVs, once formed, create local instability resulting in a high rate of spontaneous reversion. This project will directly determine the rates of spontaneous reversion in whole genomes of 300 trio families. In addition, we will examine the local patterns of genetic variation adjacent to SVs to infer the occurrence of reversion events.

Project: Identifying human essential genes by deletion mapping of a large population

When: Any Quarter
Last updated: 06/09/2016

Studies of genetic variation in large populations make it possible to determine the degree of natural selection acting on specific sequences. Our lab has mapped structural variation (SV, including deletions and duplications) in large samples (N>100,000). By generating a null model based on regional patterns of SV, we propose to identify sequences that deviate dramatically from expectations. Sequences that display extreme deviation are likely to be genes that are essential for life.

Susan Taylor | Chemistry and Biochemistry

Email Contact: staylor [at] ucsd.edu

cAMP-dependent protein kinase (PKA) is ubiquitous in every mammalian cell, with the PKA signaling network regulating processes as diverse as memory, differentiation, development, the cell cycle, and circadian rhythms. One of our goals, in addition to elucidating structures of the PKA subunits, is to map the PKA proteome as it relates to PKA signaling. The PKA interaction network consists not only of the PKA regulatory and catalytic subunits, the GPCRs, G-proteins, cyclases, and phosphodiesterases, and the PKA substrates, but also of the scaffold proteins (A Kinase Anchoring Proteins: AKAPs) that target PKA to specific sites in the cell. We are particularly interested in mapping PKA that is targeted to organelles such as the mitochondria. A second goal is to map the activity of PKA in live cells using FRET PKA activity reporters that are targeted to specific sites such as the plasma membrane, the mitochondria, or the nucleus.

Project: Rotation Project for Systems Biology

When: Fall 2010
Last updated: 10/13/2010

PKA is a broad-spectrum kinase that has many protein substrates. It consists of regulatory (R) and catalytic (C) subunits and assembles into an inactive tetramer (R2C2) in the absence of cAMP. Binding of cAMP to the dimeric R-subunits unleashes the catalytic activity. There are four functionally non-redundant R-subunits (RIα, RIβ, RIIα, and RIIβ) and three C-subunits (Cα, Cβ, and Cγ). The GPCRs are the most abundant gene family encoded by the human genome, and many of these couple to cyclases that generate the cAMP second messenger. In addition to PKA R- and C-subunits and PKA substrates, the PKA proteome includes scaffold proteins called A Kinase Anchoring Proteins (AKAPs) that target PKA to specific sites in the cell at the correct time. This dynamic spatio-temporal aspect is essential for correct PKA signaling in cells. One of our goals is to identify the proteins that are part of this PKA proteome and to establish how this proteome is altered in response to stress signals such as starvation and to the cell cycle. In addition, we hope to establish how the proteome varies as a consequence of disease or of genetic perturbation. To initially profile the PKA proteome we will be using two cell/tissue types, S49 mouse lymphoma cells and heart tissue. Eventually we will extend this analysis to mouse macrophages (RAW cells). In each case we have perturbed the PKA signaling pathway. In the S49 cells we have generated a mutant cell line that makes no active C-subunit. Although the protein is expressed in these cells, it is not active, is not soluble, and remains associated with particulate fractions. In RAW cells we have silenced the Cα and Cβ genes. Our initial goal is to compare each of the wild-type S49 cell lines with the cell lines where PKA function has been perturbed. To do this we will use mass spectrometry to identify the proteins that change as well as the phosphoproteome, and these changes will be compared to changes in gene expression.
The S49 project will be done in collaboration with Paul Insel (Pharmacology) and Nuno Bandeira (Pharmacy/Computer Science) while the macrophage study will be done in collaboration with Mel Simon (Pharmacology). Similar profiling is being done for cardiac myocytes, and this will be done in collaboration with Hemal Patel (Medicine) and Andrew McCulloch (Bioengineering).

Glenn Tesler | Mathematics

Email Contact: gptesler [at] ucsd.edu

Project: Bioinformatics Rotation Projects Available

When: Summer 2009
Last updated: 08/27/2011

There are three projects available for Summer 2009.

  1. The first project is in comparative genomics and concerns genome rearrangements. I work on algorithms for comparing rearranged genomes, and on applying them to real data sets. Expertise in algorithm development and discrete mathematics is preferred.
  2. The second project is joint with Pavel Pevzner and concerns genome assembly algorithms. Our current focus is on single cell assembly.
  3. The third project is joint with Vineet Bafna. We are developing a method to find pairs of genes that interact with disease status in GWAS microarray data. Finding simple diseases caused by one gene can be done efficiently by brute force in time O(mn) (m = # genes, n = # patients), but brute force for pairs of genes would require time O(m²n), which may be prohibitive. We developed a new mathematical technique for computing χ² of a contingency table that allows us to find the interacting pairs more easily. This project involves statistics, abstract algebra, and programming.
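
To make the cost argument in the third project concrete, here is a minimal sketch of the brute-force baseline only; it is not the new technique mentioned above. It assumes binary genotypes (0/1) and binary disease status, and the function names and toy data are hypothetical. In the toy data, disease status is the XOR of genes 0 and 2, so the single-gene scan finds no association while the pairwise scan does.

```python
from itertools import combinations

def chi_square(table):
    """Pearson chi-square statistic for a contingency table given as rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:  # skip empty rows or columns
                stat += (observed - expected) ** 2 / expected
    return stat

def single_table(gene, status):
    """2 x 2 table: rows are genotype 0/1, columns are control/case. Costs O(n)."""
    table = [[0, 0] for _ in range(2)]
    for g, s in zip(gene, status):
        table[g][s] += 1
    return table

def pair_table(gene_a, gene_b, status):
    """4 x 2 table: rows are the joint genotype (a, b), columns are control/case."""
    table = [[0, 0] for _ in range(4)]
    for a, b, s in zip(gene_a, gene_b, status):
        table[2 * a + b][s] += 1
    return table

def best_pair(genotypes, status):
    """Brute force over all m(m-1)/2 pairs, O(n) work per pair: O(m^2 n) overall."""
    best = None
    for i, j in combinations(range(len(genotypes)), 2):
        score = chi_square(pair_table(genotypes[i], genotypes[j], status))
        if best is None or score > best[0]:
            best = (score, i, j)
    return best

# Hypothetical toy data: 3 genes, 8 patients, status = gene 0 XOR gene 2.
genotypes = [
    [0, 0, 1, 1, 0, 0, 1, 1],
    [0, 1, 0, 1, 0, 1, 0, 1],
    [0, 0, 0, 0, 1, 1, 1, 1],
]
status = [0, 0, 1, 1, 1, 1, 0, 0]

single_scores = [chi_square(single_table(g, status)) for g in genotypes]
print(single_scores)                 # each gene alone shows no association
print(best_pair(genotypes, status))  # the pair (0, 2) stands out
```

With m in the hundreds of thousands, the pairwise loop above is exactly the part that becomes prohibitive.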

Wei Wang | Chemistry and Biochemistry

Email Contact: wei-wang [at] ucsd.edu

Project: Bioinformatics Rotation Projects Available

When: Any Quarter
Last updated: 04/05/2012

We are interested in studying the biological and physical principles underlying genetic networks and protein recognition. Rotation projects are available in the following areas. Specific projects will be tailored to fit a student's research interest and scientific background.

  1. Decipher epigenetic code: develop computational methods to identify common patterns in the histone modification and DNA methylation data associated with regulatory elements; predict regulatory elements, transcription factor binding sites or non-coding RNAs based on their chromatin signatures.
  2. Reconstruct genetic networks: reconstruct physical interactions from genomic and proteomic data; analyze the robustness and landscape of the networks.
  3. Decipher protein recognition code: employ structure-based computer modeling to characterize the energetic patterns of protein-protein and protein-ligand interactions; predict the specificity of protein recognition.
  4. Design resistance-evading drugs: computer-aided drug design to combat resistance; develop new methods for drug lead optimization.

More information can be found at http://wanglab.ucsd.edu.

Yingxiao (Peter) Wang | Bioengineering

Email Contact: yiw015 [at] eng.ucsd.edu

Our research focuses on molecular engineering for cellular imaging and reprogramming, and image-based bioinformatics, with applications in stem cell differentiation and cancer treatment.

Project: Image-based reconstruction of biochemical networks in live cells

When: Any Quarter
Last updated: 06/02/2016

Fluorescence resonance energy transfer (FRET)-based biosensors have been widely used in live-cell imaging to accurately visualize specific biochemical activities. We have developed the Fluocell image analysis software package to efficiently and quantitatively evaluate intracellular biochemical signals in real time, and to provide statistical inference on the biological implications of the imaging results. However, important questions remain about how to use these results to reconstruct the quantitative parameters of the underlying biochemical networks, which determine cellular functions and ultimately cell fates. In this rotation project, we will integrate optimization-based machine learning approaches with biochemical network models to seek answers to these questions, with applications in cancer treatment against drug resistance.
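
As a rough illustration of what reconstructing quantitative parameters from imaging results can mean, the sketch below fits two rate constants of a one-step activation model to a time course by least squares. Everything in it is an assumption made for illustration: the model dA/dt = k_act(1 - A) - k_deact·A, the derivative-free pattern search, and the synthetic data. It does not use Fluocell and is not the lab's network model.

```python
import math

def simulate(k_act, k_deact, times):
    """Active fraction A(t) for dA/dt = k_act*(1 - A) - k_deact*A with A(0) = 0."""
    rate = k_act + k_deact
    plateau = k_act / rate
    return [plateau * (1.0 - math.exp(-rate * t)) for t in times]

def sse(params, times, observed):
    """Sum of squared errors between the model and the observed time course."""
    predicted = simulate(params[0], params[1], times)
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))

def fit(times, observed, start=(1.0, 1.0), step=0.5, tol=1e-6):
    """Derivative-free pattern search over positive rate constants."""
    best = list(start)
    best_err = sse(best, times, observed)
    while step > tol:
        improved = False
        for i in range(2):
            for delta in (step, -step):
                trial = list(best)
                trial[i] += delta
                if trial[i] <= 0:  # rate constants must stay positive
                    continue
                err = sse(trial, times, observed)
                if err < best_err:
                    best, best_err, improved = trial, err, True
        if not improved:
            step /= 2.0  # refine the search once no neighbor is better
    return best, best_err

# Synthetic, noise-free "measurement" generated from known rate constants.
times = [0.5 * i for i in range(21)]
observed = simulate(0.8, 0.2, times)
params, error = fit(times, observed)
print(params, error)
```

Real biosensor traces are noisy and real networks have many more parameters, so identifiability and regularization become central questions that this toy example does not address.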

Roy Wollman | Chemistry and Biochemistry

Email Contact: rwollman [at] ucsd.edu

The Wollman lab studies spatio-temporal regulation of self-organization processes during the innate immune response to stress. At the intracellular scale we study how signaling networks control the organization of the actin cytoskeleton in chemotaxis, phagocytosis, and secretion. At the intercellular scale we study how cells use paracrine signaling to coordinate their functions with one another to produce a coherent and robust response to stress. In the lab we combine microscopy, computational image analysis, and mathematical modeling.

Project: Development of generalizable, benchmark driven machine learning tool for optimal image segmentation of cells in high-throughput microscopy datasets

When: Any Quarter
Last updated: 08/12/2012

Fully automated microscopes that can image autonomously for days and produce data at a rate of a few megabytes per second have become a ubiquitous tool. This creates an enormous challenge: developing appropriate algorithms and tools to analyze large image datasets. Analysis is often held back by the lack of tools to automatically segment images and identify cells in them. Several algorithmic approaches exist, but they all require a large number of parameters to be ‘tweaked’. This is often done in an ad hoc and non-generalizable manner that hampers the quality of analysis. Our lab has recently developed a combined cell labeling, imaging, and analysis technology that allows the rapid generation of training sets in which cells can be accurately segmented. The goal of this project is to develop new methods that use machine learning and optimization tools to identify the optimal parameters required for image analysis algorithms. Several long-term projects that build on this work are possible, including further enhancement of these tools to add cell tracking and online learning, as well as projects that combine experimental work with these tools to study paracrine signaling in the innate immune system.
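
The idea of replacing hand tweaking with benchmark-driven parameter selection can be sketched in a few lines. The example below is a deliberately small stand-in, assuming a single-parameter threshold segmenter, the Dice coefficient as the benchmark score, and an exhaustive search over candidate values; the function names and the 3 x 3 toy image are hypothetical and do not represent the lab's tools or data.

```python
def segment(image, threshold):
    """Binary mask: 1 where pixel intensity is at or above the threshold."""
    return [[1 if px >= threshold else 0 for px in row] for row in image]

def dice(pred, truth):
    """Dice coefficient between two binary masks (1.0 means perfect overlap)."""
    overlap = sum(p & t for pr, tr in zip(pred, truth) for p, t in zip(pr, tr))
    size = sum(map(sum, pred)) + sum(map(sum, truth))
    return 2.0 * overlap / size if size else 1.0

def best_threshold(training_set, candidates):
    """Pick the candidate with the highest mean Dice score over the training set."""
    def mean_dice(threshold):
        scores = [dice(segment(img, threshold), mask) for img, mask in training_set]
        return sum(scores) / len(scores)
    return max(candidates, key=mean_dice)

# Hypothetical training pair: raw intensities and a ground-truth cell mask.
image = [[10, 12, 11],
         [13, 80, 90],
         [9, 85, 14]]
mask = [[0, 0, 0],
        [0, 1, 1],
        [0, 1, 0]]

chosen = best_threshold([(image, mask)], range(0, 101, 10))
print(chosen)  # the first candidate that reproduces the mask exactly
```

Real segmentation pipelines have many interacting parameters, so exhaustive search quickly becomes impractical; that is where the machine learning and optimization methods named in the project description would come in.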

Roy Wollman | Chemistry and Biochemistry

Email Contact: rwollman [at] ucsd.edu

The Wollman lab studies spatio-temporal regulation of self-organization processes during the innate immune response to stress. At the intracellular scale we study how signaling networks control the organization of the actin cytoskeleton in chemotaxis, phagocytosis, and secretion. At the intercellular scale we study how cells use paracrine signaling to coordinate their functions with one another to produce a coherent and robust response to stress. In the lab we combine microscopy, computational image analysis, and mathematical modeling.

Project: Spatial modeling of tissue-scale signaling gradients during the clearance of apoptotic cells

When: Any Quarter
Last updated: 08/12/2012

As cells commit to apoptosis, they secrete soluble signaling molecules to generate a “find-me” signal that recruits macrophages, which then phagocytose the apoptotic cell. Proper clearance of apoptotic cells is critical for health, and deficiencies in this process have been associated with several autoimmune diseases. Although a few of the molecules and pathways that can act as “find-me” signals are known, the extent, kinetics, and temporal and spatial scales of these “find-me” signals are unknown. The goal of this project is to use the CompuCell3D framework in combination with partial differential equations to construct models of the generation of chemotactic “find-me” gradients by apoptotic cells. The computational models will be used to answer key questions about the ability of apoptotic cells to recruit macrophages in healthy and pathophysiological conditions.
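
For orientation, the kind of partial differential equation involved can be sketched without any framework. The example below is a plain finite-difference toy and does not use the CompuCell3D API. It assumes a one-dimensional tissue, a single apoptotic cell secreting at a constant rate in the middle, and a signal that diffuses and decays: ∂c/∂t = D ∂²c/∂x² − k·c + secretion. All parameter values are arbitrary placeholders.

```python
def simulate_gradient(n_cells=51, dx=10.0, diffusion=100.0, decay=0.01,
                      secretion=1.0, dt=0.1, steps=5000):
    """Explicit finite-difference solution of a 1D diffusion-decay equation.

    Stability of the explicit scheme requires diffusion * dt / dx**2 <= 0.5.
    """
    source = n_cells // 2          # position of the apoptotic cell
    conc = [0.0] * n_cells
    for _ in range(steps):
        new = conc[:]
        for i in range(n_cells):
            # Zero-flux boundaries: mirror the edge value outside the domain.
            left = conc[i - 1] if i > 0 else conc[i]
            right = conc[i + 1] if i < n_cells - 1 else conc[i]
            laplacian = (left - 2.0 * conc[i] + right) / dx ** 2
            new[i] = conc[i] + dt * (diffusion * laplacian - decay * conc[i])
        new[source] += dt * secretion  # constant secretion by the apoptotic cell
        conc = new
    return conc

profile = simulate_gradient()
center = len(profile) // 2
print(profile[center], profile[center + 10], profile[-1])
```

In this toy model the gradient decays over a length scale of sqrt(D/k), which sets how far away a macrophage could plausibly sense the signal; a tissue-scale model would add cell geometry, multiple sources, and responding cells.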

Gene Yeo | Cellular and Molecular Medicine

Email Contact: geneyeo [at] ucsd.edu

We have a wide scope of projects, ranging from developing novel algorithms for studying RNA processing in disease, development, and personalized medicine to analyzing single-cell RNA-seq data.

Project: Single-cell RNA-seq analysis

When: Any Quarter
Last updated: 06/02/2016

We have projects on developing new algorithms for single-cell RNA-seq analysis, with a focus on studying heterogeneity in complex mixtures of cells responding to environmental challenges.

Kun Zhang | Bioengineering

Email Contact: k4zhang [at] ucsd.edu

Project: Allele-specific gene regulations

When: Fall, Winter, Spring
Last updated: 08/16/2012

Each human cell contains two sets of chromosomes, one of paternal and one of maternal origin. Because of genetic polymorphisms, the same gene on the two chromosome sets might behave differently, which has profound implications for high-level phenotypes, including disease susceptibility. In this rotation project, the student will analyze RNA sequencing or methylation sequencing data to identify genes that exhibit parental preferences and investigate the functional implications of such phenomena. By the end of the rotation, the student will have learned a number of the latest methods for mapping and analyzing next-generation sequencing data, and how to study the transcriptome/methylome in the context of genetic polymorphisms.
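
A common first-pass check for allelic preference, shown here only as a hedged sketch, is an exact binomial test on read counts at a heterozygous site: under no preference, reads should split roughly 50/50 between the two alleles. The function names, the gene labels, and the counts below are hypothetical, and a real analysis would also need to handle mapping bias toward the reference allele, overdispersion, and multiple-testing correction.

```python
from math import comb

def binomial_two_sided_p(ref_reads, alt_reads, p=0.5):
    """Exact two-sided binomial p-value for allelic imbalance.

    Sums the probabilities of all outcomes that are no more likely than the
    observed split under the null hypothesis of allele fraction p.
    """
    n = ref_reads + alt_reads
    observed = comb(n, ref_reads) * p ** ref_reads * (1 - p) ** alt_reads
    total = 0.0
    for k in range(n + 1):
        prob = comb(n, k) * p ** k * (1 - p) ** (n - k)
        if prob <= observed * (1 + 1e-9):  # small tolerance for float ties
            total += prob
    return min(1.0, total)

def imbalanced_genes(counts, alpha=0.01):
    """Return genes whose allele counts deviate from the null at level alpha."""
    return sorted(gene for gene, (ref, alt) in counts.items()
                  if binomial_two_sided_p(ref, alt) < alpha)

# Hypothetical allele-specific read counts: gene -> (reference, alternate).
counts = {"geneA": (52, 48), "geneB": (90, 10), "geneC": (10, 0)}
print(imbalanced_genes(counts))
```

The same test applies to methylation data if the counts are taken as methylated reads assigned to each allele.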

Kun Zhang | Bioengineering

Email Contact: k4zhang [at] ucsd.edu

Project: Single-cell genome sequencing: de novo assembly and annotations

When: Any Quarter
Last updated: 08/16/2012

Single-cell genome sequencing allows us to dissect heterogeneous cell populations and to relate genome diversity to function. One major challenge is de novo genome assembly, because the sequencing data from single cells are typically highly biased in locus representation. In this rotation, the student will have an opportunity to understand the experimental procedures for single-cell genome sequencing, to learn the state-of-the-art methods for de novo genome assembly, and to develop innovative strategies for genome assembly and annotation.

Sheng Zhong | Bioengineering

Email Contact: szhong [at] ucsd.edu

We study causal relationships between gene regulation and cellular behaviors, by developing computational and experimental methods on network modeling, stem cell engineering, epigenomic and single-cell analyses.

Project: Continued development of Comparative Epigenome Browser (BISB Rotation)

When: Any Quarter
Last updated: 08/21/2013

The Comparative Epigenome Browser (CEpBrowser, http://www.cepbrowser.org) is an online data management, visualization, and analysis tool that allows the public to perform multi-species epigenomic analysis (Cell, 2012, 149: 1381-1391) (Bioinformatics, 2013, 29 (9): 1223-1225). In collaboration with the ENCODE project, this rotation project will extend CEpBrowser to incorporate new ENCODE and mouse ENCODE data, implement interactive data management features, and implement new data analysis features.

Sheng Zhong | Bioengineering

Email Contact: szhong [at] ucsd.edu

We study causal relationships between gene regulation and cellular behaviors, by developing computational and experimental methods on network modeling, stem cell engineering, epigenomic and single-cell analyses.

Project: Genome-wide mapping of nucleosomes in embryonic stem cells

When: Any Quarter
Last updated: 08/21/2013

The main goal is to learn, adapt, and refine a ground-breaking technology for mapping nucleosome positions. A recent technology, micrococcal nuclease digestion followed by DNA sequencing (MNase-seq), has had a tremendous impact by making it possible to map nucleosome positions in mammalian cells (Nature Structural & Molecular Biology, 2012, 19:1185-1192). By applying this technology to embryonic stem cells, the rotation student should master it and achieve single-nucleosome precision. The main milestones are:

  1. Achieve consistency in single-nucleosome digestion of stem cell chromatin using MNase.
  2. Generate an MNase-seq dataset by DNA sequencing of the MNase-treated stem cell sample.