Jack Chen, Associate Professor
Department of Molecular Biology and Biochemistry
Office: SSB8111 |
|
Identification of genes, gene families, and operons
Although many gene finding programs have been developed, accumulating evidence indicates that a large number of genes remain to be discovered. For homology-based searches and gene prediction, we have developed a program suite called genBlast, consisting of two computer programs, genBlastA (She et al., 2009) and genBlastG (She et al., 2011). It has also been shown that gene finding using computer algorithms alone is inadequate. We are therefore developing a set of combined computational and experimental approaches to identify novel genes in the Caenorhabditis genomes. Using this strategy, we have identified novel genes and revised existing gene models in C. elegans (Nesbitt et al., 2010).
We are also developing programs to identify gene families. A gene family consists of a group of genes that share structural and functional features. For example, the C. elegans genome carries >1,000 chemosensory genes, which can be divided into many gene families, including the srab gene family we have identified (Chen et al., 2005). In a separate project, we identified a family of over 600 putative chemosensory genes in the sea urchin, a model organism. We further developed a novel strategy for systematically classifying gene families, called comparative gene family classification. Syed Aftab, an ISS student in my group, together with graduate students Lucie Semenec and Jeffrey Chu, has identified and characterized two new human transcription factors of the RFX family: RFX6 and RFX7 (Aftab et al., 2008). Recently, we have shown that the acquired transcriptional regulation of genes by RFX transcription factors may have played a key role in the origin of metazoans (Chu et al., 2010). |
|
C. elegans genome structural variations (SVs)
Genes are not randomly distributed in a genome. On the one hand, the arrangement of genes and functional elements has been shown to be critical in the regulation of gene expression. In C. elegans, for example, we have demonstrated that divergent and parallel neighboring gene pairs are positively correlated in gene expression, while convergent neighboring gene pairs either lack such correlation or show some negative correlation (Chen and Stein, 2006). On the other hand, a genome is a highly dynamic structure. Each genome contains a significant number of variations, including structural rearrangements via insertions, deletions, tandem repeats, and inversions, as well as single nucleotide polymorphisms (SNPs). Many genomic rearrangements have been associated with well-defined clinical syndromes.
We have recently developed a novel computer program, OrthoCluster, for identifying genome-wide synteny blocks as well as genome rearrangement events (Zeng et al., 2008), and OrthoClusterDB (Ng et al., 2009), a web server that allows users to run OrthoCluster online and view pre-computed synteny blocks. OrthoCluster can also be used to identify segmental duplications within a genome. Using OrthoCluster, we have identified thousands of segmental duplications in C. elegans, the largest of which generates two duplicons in tandem. Each duplicon is 108 kb in length and contains 26 putative protein-coding genes. Genotyping of about 100 C. elegans strains, many of which are N2 strains obtained from different research labs, revealed that the largest segmental duplication is polymorphic (Vergara et al., 2009). |
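To make the terms divergent, parallel, and convergent concrete, the following is a minimal sketch of how neighboring gene pairs could be labeled from strand information alone. It is an illustration, not the method used in Chen and Stein (2006): the function names, the `(name, start, strand)` tuple layout, and the example genes are all hypothetical, and the sketch assumes genes on a single chromosome with strands given as "+" or "-".

```python
from typing import List, Tuple

Gene = Tuple[str, int, str]  # (name, start coordinate, strand); hypothetical layout


def classify_pair(left_strand: str, right_strand: str) -> str:
    """Label a neighboring pair, where 'left' has the smaller coordinate."""
    for strand in (left_strand, right_strand):
        if strand not in ("+", "-"):
            raise ValueError(f"unknown strand: {strand!r}")
    if left_strand == right_strand:
        return "parallel"    # both transcribed in the same direction
    if left_strand == "-" and right_strand == "+":
        return "divergent"   # transcribed away from each other
    return "convergent"      # "+" then "-": transcribed toward each other


def neighboring_pairs(genes: List[Gene]) -> List[Tuple[str, str, str]]:
    """Sort genes by start coordinate and classify each adjacent pair."""
    ordered = sorted(genes, key=lambda g: g[1])
    return [
        (a[0], b[0], classify_pair(a[2], b[2]))
        for a, b in zip(ordered, ordered[1:])
    ]


# Made-up example genes on one chromosome.
example = [("g1", 100, "+"), ("g2", 500, "+"), ("g3", 900, "-"), ("g4", 1500, "+")]
for left, right, label in neighboring_pairs(example):
    print(left, right, label)
```

With labels of this kind in hand, expression correlation could then be compared across the three classes of pairs.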
|
Genome structure and function of malaria parasites and other pathogens
The relationships between hosts and pathogens are antagonistic interactions. The outcome of such interactions depends on the virulence of the pathogen and on the susceptibility and resistance of the host. We are interested in characterizing genes, in both hosts and pathogens, and their transcriptional regulation. This project is currently supported by the SFU Community Trust Endowment Fund (CTEF) (2007-2012). Applying genBlastG and OrthoCluster, programs that we have developed in collaboration with scientists at SFU, we have compared the genomes of malaria parasites that infect humans with those that infect rodents. We have identified genes that are unique to human malaria parasites, which may be useful drug targets (Frech and Chen, 2011). |
|
Transcriptional regulation in health and disease
Transcriptional regulation controls the unique combinations of genes expressed in cells, which in turn determine cell identity and function. Approximately 5% of the protein-coding capacity of any genome encodes transcription factors (TFs), which provides an enormous regulatory capability at the transcription level alone. Each TF regulates up to hundreds of genes by binding to their promoters or enhancers, while each gene can be transcriptionally regulated by an array of TFs. Such many-to-many transcriptional relationships create a large number of transcriptional regulatory circuits (TRCs) and eventually many elaborate transcriptional regulatory networks (TRNs). Identifying and understanding transcription factor binding sites (TFBSs) holds the key to understanding TRCs and TRNs.
By applying comparative genomics, microarray analysis, and SAGE (serial analysis of gene expression), we have identified a large number of target genes of DAF-19, a tissue-specific transcription factor, in C. elegans. Notably, many of these target genes are C. elegans orthologs of human Bardet-Biedl Syndrome (BBS) genes (Chen et al., 2006). We found that multiple instances of DAF-19 binding sites (i.e., X-box motifs) co-exist in the promoters of many target genes, providing a fine-tuning mechanism for controlled expression levels (Chu et al., 2011). This project is currently supported by NSERC (2006-2014) and MSFHR (2007-2013). |
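The many-to-many relationship between TFs and their target genes can be pictured as a two-way lookup. The sketch below is only an illustration under stated assumptions: the TF and gene names are placeholders rather than real regulatory data, and `invert_regulation` is a hypothetical helper, not part of any program described on this page.

```python
from collections import defaultdict
from typing import Dict, Set


def invert_regulation(tf_targets: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Turn a TF -> target genes map into a gene -> regulating TFs map."""
    gene_regulators: Dict[str, Set[str]] = defaultdict(set)
    for tf, targets in tf_targets.items():
        for gene in targets:
            gene_regulators[gene].add(tf)
    return dict(gene_regulators)


# Placeholder names only; these are not real TF-target relationships.
tf_targets = {
    "TF_A": {"gene_1", "gene_2", "gene_3"},
    "TF_B": {"gene_2", "gene_3"},
    "TF_C": {"gene_3"},
}

gene_regulators = invert_regulation(tf_targets)
for gene in sorted(gene_regulators):
    print(gene, sorted(gene_regulators[gene]))
```

Reading the map in one direction gives the targets of each TF; reading it in the other gives the array of TFs acting on each gene, which is the starting point for describing regulatory circuits.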
|
Gene interaction networks and synthetic rescue
Specific cellular functions are rarely carried out by single genes, but rather by groups of cellular components. Characterizing how genes within a cell interact with each other is an important and challenging task. We are interested in constructing gene interaction networks by identifying synthetic rescue gene pairs, as well as in identifying genetic interaction modules (Colak et al., 2010). |
|
|