We are the Birol Lab, a group invested in genome research. We operate at the interface of biology and computer science, along a spectrum that ranges from the development of fundamental computing tools and innovative bioinformatics technologies to genomics data analysis frameworks. These efforts serve the common goal of pushing the bounds of science and helping derive biological insights with far-reaching applications in genetics, genomics, research and the clinic. Below is an overview of our key activities.
Fundamental Bioinformatics: Algorithms and Data Structures
Bioinformatics is a big data science. With the continual improvements in high-throughput DNA and RNA sequencing, scalable algorithms and data structures are essential to underpin the bioinformatics tools needed to analyze the immense volumes of data produced. To support and facilitate the development of these tools, the Birol Lab develops advanced, efficient data structures, common code libraries and algorithms for storing, hashing and processing ’omics data.
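As a rough illustration of what hashing sequence data can involve, here is a minimal Python sketch that counts canonical k-mers, where a k-mer and its reverse complement are stored under the same key. It is not taken from the lab's libraries: the function names and the use of a plain in-memory counter are assumptions made for the example, and production tools typically rely on far more compact structures.

```python
from collections import Counter

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical_kmers(seq: str, k: int):
    """Yield the canonical form of every k-mer in seq.

    The canonical form is the lexicographically smaller of a k-mer and
    its reverse complement, so both strands map to the same key.
    K-mers containing characters other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            yield min(kmer, reverse_complement(kmer))


def count_kmers(seqs, k: int) -> Counter:
    """Count canonical k-mers across an iterable of sequences."""
    counts = Counter()
    for seq in seqs:
        counts.update(canonical_kmers(seq, k))
    return counts


if __name__ == "__main__":
    # Hypothetical toy reads, for illustration only.
    print(count_kmers(["ACGTAC", "GTACGT"], k=3))
```

Using canonical k-mers means a read is counted the same way whichever strand it was sequenced from.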
Genome and Transcriptome Assembly and Analysis
The many revolutionary improvements to sequencing technologies have made data generation accessible to labs big and small. Yet deciphering the precise nucleotide sequence that makes up the genome and transcriptome of a species – de novo assembly – remains an informatics challenge, especially when the genome is large and complex. From ABySS/ABySS2, the first short-read assembler to scale to human-sized genomes, to the more recent GoldRush, a genome assembler for long sequencing reads with linear time complexity, the Birol Lab has an established track record of developing scalable, high-quality assembly tools. We develop open-source solutions for all steps of the assembly workflow, including the initial assembly of reads, correction, scaffolding, polishing and gap-filling.
As sequencing technologies and assembly algorithms mature, the number and diversity of complete, chromosome-scale genome assemblies are growing rapidly. This wealth of data offers great opportunities to gain critical evolutionary insights within and between species. To fully realize the potential of these resources, scalable comparative genomics tools are essential. We are developing efficient and multi-faceted comparative genomics solutions for analyzing genomes across the tree of life.
Bacteria can rapidly evolve resistance to antibiotics, presenting a growing and serious threat. In a “post-antibiotic era”, bacterial diseases would once again be untreatable, and many standard treatments, including surgical operations, could become unusable. To boost the search for new treatment options, the collaborative PeptAid project focuses on short proteins called antimicrobial peptides (AMPs), which are produced naturally by various animal and plant species. These host defence proteins can protect against infection or reduce the harm caused by an existing infection.
The iTrackDNA project uses environmental DNA (eDNA) technology to monitor and trace species in the environment. By analyzing DNA samples from environmental sources such as water and soil, iTrackDNA enables efficient, non-invasive biodiversity monitoring and supports conservation efforts with detailed insights into wildlife movements and population dynamics. As an integral part of the project, the Birol Lab develops bioinformatics technologies that help characterize eDNA and identify unique and/or conserved regions within the genomes of individual species. This research informs the design of specific eDNA assays, strengthening the project’s ability to identify and track target species within diverse ecosystems.
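One simple way to think about finding regions unique to a target species is to keep only the k-mers that occur in the target genome and in none of the non-target genomes. The Python sketch below illustrates that idea under strong simplifying assumptions (exact matching, forward strand only, whole sequences held in memory); the function names are hypothetical and it does not describe the project's actual assay design pipeline.

```python
def kmer_set(seq: str, k: int) -> set:
    """Return the set of all k-mers (substrings of length k) in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def target_specific_kmers(target: str, non_targets, k: int) -> set:
    """Return k-mers found in the target but in none of the non-targets.

    Simplification: only exact, forward-strand matches are considered,
    so near-identical or reverse-complement matches are not excluded.
    """
    specific = kmer_set(target, k)
    for other in non_targets:
        specific -= kmer_set(other, k)
    return specific


if __name__ == "__main__":
    # Hypothetical toy sequences, for illustration only.
    target = "ACGTAC"
    others = ["ACGTTT"]
    print(sorted(target_specific_kmers(target, others, k=3)))
```

Runs of adjacent target-specific k-mers would then point to candidate regions worth examining for a species-specific assay.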
Substantial advancements in healthcare can be realized through genomics technologies that detect variations, sequence repeats and mutations in DNA and RNA in a manner that allows: i) effective preventative care, and/or ii) efficient diagnosis and treatment. One technology that enables this vision is high-throughput DNA and RNA sequencing, which in turn requires thorough and streamlined downstream data analysis and interpretation. To address this challenge, we are building and validating bioinformatics frameworks for clinical genomics. In collaborative projects, we are deploying these solutions for clinical research, including studies of COVID-19 and cancer, and for diagnosing rare genetic diseases.
Forest Genomics & Beyond
Conifers in general, and spruce trees in particular, are Canada’s most significant forest resource. Spruces produce high-quality wood and fibre that are widely used in industry, and as dominant species of Canada’s forests, they provide essential local and global ecosystem services. As part of the spruce-up project, and in collaboration with multiple organizations and initiatives, including CGEn, CanSeq150, the Canada BioGenome and Earth BioGenome projects, we are building genomic resources and technologies to help protect our forests and preserve their biodiversity for future generations.