Jack Chen, Professor
B.Sc., Fudan University, Shanghai
Ph.D., Chinese Academy of Sciences, Qingdao
Genome annotation and identification of functional genomic elements
We are interested in developing and applying innovative bioinformatics tools to identify various types of functional elements in genomes, including genes, gene families, and operons, using de novo prediction methods, homology-based methods, and evidence-based methods, including those based on RNA-seq. Although many gene-finding programs have been developed, accumulating evidence indicates that a large number of genes remain to be discovered. For homology-based searches and gene prediction, we have developed a program suite called genBlast, consisting of two computer programs, genBlastA (She et al., 2009) and genBlastG (She et al., 2011). It has also been shown that gene finding using computer algorithms alone is inadequate. We are therefore developing a set of combined computational and experimental approaches to identify novel genes in the Caenorhabditis genomes. Using this strategy, we have identified novel genes and revised existing gene models in C. elegans (Nesbitt et al., 2010). Applying genBlastG and RNA-seq, we have re-annotated the genome of C. briggsae (Uyar et al., 2012), a nematode closely related to the model organism C. elegans.
In the meantime, we are also developing programs to identify gene families. A gene family consists of a group of genes that share structural and functional features. For example, the C. elegans genome carries >1,000 chemosensory genes, which can be divided into many gene families, including the srab gene family that we identified (Chen et al., 2005). In a separate project, we identified a family of over 600 putative chemosensory genes in the sea urchin, another model organism. We further developed a novel strategy for systematically classifying gene families, called comparative gene family classification.
We have also applied bioinformatics methods to characterize the mutational landscape of intrahepatic cholangiocarcinoma (Zou et al., 2014).
Transcriptional regulation in health and disease
Transcriptional regulation controls the unique combinations of genes expressed in cells, which in turn determine cell identity and function. Approximately 5% of the protein-coding capacity of any genome encodes transcription factors (TFs), which represents an enormous regulatory capability at the transcription level alone. Each TF regulates up to hundreds of genes by binding to their promoters/enhancers, while each gene can be transcriptionally regulated by an array of TFs. Such many-to-many transcriptional relationships create a large number of transcriptional regulatory circuits (TRCs) and eventually many elaborate transcriptional regulatory networks (TRNs). Identifying and understanding transcription factor binding sites (TFBSs) holds the key to understanding TRCs and TRNs. By applying comparative genomics, microarray analysis, and SAGE (serial analysis of gene expression), we have identified a large number of target genes of DAF-19, a tissue-specific transcription factor, in C. elegans. Notably, many of these target genes are C. elegans orthologs of human Bardet-Biedl Syndrome (BBS) genes (Chen et al., 2006). We found that multiple instances of DAF-19 binding sites (i.e., X-box motifs) co-exist in the promoters of many target genes, providing a fine-tuning mechanism for controlling expression levels (Chu et al., 2011). This project is currently supported by NSERC (2006-2014) and MSFHR (2007-2013). Syed Aftab, an ISS student in my group, together with graduate students Lucie Semenec and Jeffrey Chu, has identified and characterized two new human transcription factors of the RFX family: RFX6 and RFX7 (Aftab et al., 2008). Recently, we have shown that the acquired transcriptional regulation of genes by RFX transcription factors might have played a key role in the origin of metazoans (Chu et al., 2010).
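The many-to-many relationship between TFs and their target genes described above can be illustrated with a small sketch. It is not taken from any of the group's software; the function name `invert_regulation` and the TF and gene labels are hypothetical placeholders, not real regulatory data. Starting from a map of each TF to its targets, it derives the complementary view of which TFs regulate each gene.

```python
from collections import defaultdict


def invert_regulation(tf_targets):
    """Turn a TF -> target genes map into a gene -> regulating TFs map."""
    gene_regulators = defaultdict(set)
    for tf, targets in tf_targets.items():
        for gene in targets:
            gene_regulators[gene].add(tf)
    return dict(gene_regulators)


# Hypothetical toy data: the labels are placeholders, not real TFs or genes.
tf_targets = {
    "TF_A": {"gene1", "gene2", "gene3"},
    "TF_B": {"gene2", "gene3"},
    "TF_C": {"gene3"},
}

gene_regulators = invert_regulation(tf_targets)

# List each gene with the TFs that regulate it, in a stable order.
for gene in sorted(gene_regulators):
    print(gene, sorted(gene_regulators[gene]))
```

In this toy example one TF has several targets and one gene has several regulators, which is the pattern that gives rise to regulatory circuits and networks at genome scale.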
OrthoCluster and OrthoClusterDB
genBlast: genBlastA and genBlastG