Defining the essential gene set of Candida albicans to identify and characterize novel therapeutic targets


Systemic fungal infections claim over 1.5 million human lives each year1. Among the leading causal agents is Candida albicans, an opportunistic pathogen with associated mortality rates exceeding 40% in immunocompromised individuals2. With just three drug classes available for clinical use (the azoles, polyenes, and echinocandins), treatment options for these invasive infections are extremely limited. As immunocompromised patient populations continue to expand and resistance to existing antifungals threatens to restrict treatment options further, the need for novel antifungal strategies cannot be overstated.

The antifungals currently used to treat systemic infections all target cellular processes or gene products essential for fungal survival, which makes essential genes one of the most promising sources of potential drug targets. While genome-wide assessments of the factors essential for survival have been performed in the model yeast Saccharomyces cerevisiae, these analyses are poor predictors of essential genes in human fungal pathogens, including C. albicans3. Thus, there is a great need to define a comprehensive compendium of the genes that C. albicans requires for survival.

Excited by this potential, our team sought to fill this gap by adopting a multi-pronged, collaborative approach. To do so, we leveraged the largest C. albicans functional genomics resource currently available - the Gene Replacement and Conditional Expression (GRACE) library4. In this library, which originally covered ~40% of the C. albicans genome, one allele of a target gene is deleted, and the remaining allele is under the control of a tetracycline-repressible promoter from which transcription is repressed upon treatment with the tetracycline analog doxycycline (DOX)4. By harnessing this resource, we devised a powerful strategy that capitalized on our team’s diverse expertise in computational biology as well as functional and chemical genomics to systematically predict and analyze all C. albicans essential genes.

To generate a gold standard list of essential genes, we screened the GRACE library for strains that display a severe growth defect upon transcriptional repression of their target gene. To do this, all GRACE strains were pre-treated with a high DOX concentration (100 µg/mL), then transferred onto solid medium containing the same DOX concentration and left to grow for 48 hours. Each strain was then scored for its ability to grow, and the target genes of strains displaying little to no growth by 48 hours in the presence of DOX were designated essential. This stringent screening method yielded a robust set of 523 essential and 1,804 non-essential genes, which was used to train a machine learning model to predict essentiality for the rest of the C. albicans genome (Figure 1).

Figure 1. Pipeline for building, training, and experimentally testing a machine learning model to predict C. albicans essential genes.

Building the prediction model required careful selection of input features that contribute meaningful information about gene essentiality. We used a broad set of features that can be grouped into four categories: gene expression datasets, sequence features, datasets derived from S. cerevisiae, and data from a C. albicans essentiality study that employed a transposon mutant collection5 (Figure 1). The supervised machine learning model built on these features was trained on 80% of our GRACE set, and its performance was evaluated on the remaining 20%. This evaluation gave an average precision of 0.77 and an area under the receiver operating characteristic curve of 0.92. Importantly, not only did each feature contribute unique information to our model, but the use of numerous data sources also improved the accuracy of its essentiality predictions compared to a previous study5.

To experimentally assess the accuracy of our model, we selected 866 genes covering a range of essentiality prediction scores, constructed their corresponding GRACE strains, and tested the newly derived mutants for essentiality phenotypes (Figure 1). Of the 115 genes predicted to be essential by our model, 74 of the corresponding strains were experimentally validated as essential, with an additional 23 displaying a slow growth phenotype. Overall, this validation set confirmed that our model is highly accurate in predicting C. albicans essentiality, providing the field with the most comprehensive essentiality database to date. Our efforts also expanded the GRACE library’s genome coverage by almost 10%, making this invaluable functional genomics resource more comprehensive still.
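
For readers less familiar with the two metrics quoted above, the following is a minimal sketch of how average precision and the area under the receiver operating characteristic curve can be computed from a list of prediction scores and known labels. It is not our model or our evaluation code: the function names and the eight-gene toy example are purely illustrative assumptions, and only the counts in the last section (523, 1,804, 115, 74, and 23) come from the text above.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC: chance that a random essential gene outscores a random
    non-essential gene (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of the precision measured at the rank of each true positive.
    Assumes no tied scores; ties would make the ranking order-dependent."""
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Toy held-out set (hypothetical values, NOT data from the study):
# 1 = essential, 0 = non-essential, with made-up prediction scores.
toy_labels = [1, 1, 0, 1, 0, 0, 1, 0]
toy_scores = [0.95, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10]
toy_auc = roc_auc(toy_labels, toy_scores)
toy_ap = average_precision(toy_labels, toy_scores)

# Arithmetic on the counts reported in the text.
n_essential, n_nonessential = 523, 1804
prevalence = n_essential / (n_essential + n_nonessential)

predicted_essential = 115
validated_essential = 74
slow_growth = 23
strict_precision = validated_essential / predicted_essential
lenient_precision = (validated_essential + slow_growth) / predicted_essential

print(f"toy AUROC = {toy_auc:.2f}, toy average precision = {toy_ap:.2f}")
print(f"essential fraction of the labelled GRACE set = {prevalence:.1%}")
print(f"validated as essential = {strict_precision:.1%}")
print(f"essential or slow growth = {lenient_precision:.1%}")
```

The essential fraction is a useful reference point because a classifier that ranks genes at random is expected to reach an average precision close to the fraction of positives, whereas its AUROC would sit near 0.5. This comparison assumes the held-out 20% had a class balance similar to the full labelled set, which is not stated above.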

In addition to essentiality, another hallmark of an ideal drug target is lack of conservation in humans, as this reduces the likelihood of host toxicity6,7. Our computational model identified 149 essential C. albicans genes that lack human homologs, and we chose to investigate the function of two of these that were previously uncharacterized. The first was C1_01070C, which possesses a conserved domain implicating it as a member of the MIND (Mis12/Mtw1-Nnf1-Nsl1-Dsn1) kinetochore complex. We observed that C. albicans cells filament upon transcriptional repression of C1_01070C, as they do upon repression of other cell cycle regulatory genes. Further, fluorescence microscopy showed that the protein localizes to the kinetochore, and affinity purification of C1_01070C coupled to mass spectrometry confirmed that it physically interacts with other kinetochore subcomplex members. Thus, C1_01070C encodes an essential protein that likely regulates cell cycle progression at the kinetochore, and we renamed it kinetochore related protein 1 (KRP1).

To define the function of our next uncharacterized fungal-specific essential gene, C6_03200W, we leveraged its co-expression network8, as co-expression partners often have related functions. This network was dominated by mitochondrial-related genes, providing a helpful clue as to its function. Subsequent analyses confirmed that the protein localizes to mitochondrial nucleoids and is required for maintenance of mitochondrial integrity and mtDNA levels, and the gene was thus renamed essential mitochondrial function 1 (EMF1).

Finally, we characterized an additional essential gene, C2_04370W, as a likely member of the eukaryotic translation initiation factor 3 (eIF3) complex. While the eIF3 complex is conserved in eukaryotes, the number of conserved subunits, as well as the essentiality of subunits, varies substantially across species, motivating us to examine C2_04370W in more detail. Bioinformatic analyses coupled with cell biology assays showed that C2_04370W displays high sequence similarity to other eIF3 complex members and that its transcriptional repression inhibits translation. Overall, these characterizations provide important functional insights into essential processes in C. albicans and warrant further exploration of these genes as potential antifungal targets. Further, these examples highlight the power of combining computational, genomic, cell biology, and biochemical techniques to predict the function of previously unannotated genes in important fungal pathogens.

Finally, we leveraged our essentiality prediction database to explore antifungal strategies that target essential genes. By cross-referencing top compounds from a chemical screen with a publicly available chemogenomic database9, we identified NP-BTA as a small molecule that targets the essential glutaminyl-tRNA synthetase Gln4. This work highlights how the essentiality prediction database provided by this study, coupled with currently available chemogenomic datasets, can help rapidly identify small molecules predicted to target essential genes in fungal pathogens.

Read the full story here

Contributors: Emma Lash, Dr. Nicole Robbins and Dr. Leah Cowen | Department of Molecular Genetics | University of Toronto 


  1. Brown, G. D. et al. Hidden killers: Human fungal infections. Sci. Transl. Med. 4, 165rv13 (2012).
  2. Pfaller, M. A. & Diekema, D. J. Epidemiology of invasive candidiasis: a persistent public health problem. Clin. Microbiol. Rev. 20, 133–163 (2007).
  3. O’Meara, T. R. et al. Global analysis of fungal morphology exposes mechanisms of host cell escape. Nat. Commun. 6, 6741 (2015).
  4. Roemer, T. et al. Large-scale essential gene identification in Candida albicans and applications to antifungal drug discovery. Mol. Microbiol. 50, 167–181 (2003).
  5. Segal, E. S. et al. Gene essentiality analyzed by in vivo transposon mutagenesis and machine learning in a stable haploid isolate of Candida albicans. mBio 9, 1–21 (2018).
  6. Roemer, T. & Boone, C. Systems-level antimicrobial drug and drug synergy discovery. Nat. Chem. Biol. 9, 222–231 (2013).
  7. Robbins, N., Wright, G. D. & Cowen, L. E. Antifungal drugs: the current armamentarium and development of new agents. Microbiol. Spectr. 4, FUNK-0002-2016 (2016).
  8. O’Meara, T. R. & O’Meara, M. J. DeORFanizing Candida albicans genes using coexpression. mSphere 6, e01245-20 (2021).
  9. Lee, A. Y. et al. Mapping the cellular response to small molecules using chemogenomic fitness signatures. Science 344, 208–211 (2014).

Emma Lash

PhD Candidate, Department of Molecular Genetics, University of Toronto