Self-replicating or self-synthesizing mobile genetic elements are (mostly) integrative elements that encode their own DNA polymerase, which, as far as is currently known, always belongs to family B (PolB). These PolBs would be involved in the replication of the element (1), although only a few of them have been experimentally characterized. The group includes the eukaryotic virus-related polintons (aka Mavericks), casposons, which have been reported in archaea and some bacteria, and the recently described pipolins, which we reported in 2017 (2) after an inspiring collaboration between Margarita Salas' laboratory in the Centro de Biología Molecular Severo Ochoa and Mart Krupovic and Patrick Forterre from the Pasteur Institute. We detected pipolins mostly as integrative elements in diverse bacterial phyla, but also as plasmids in bacteria and mitochondria. The hallmark of pipolins is the presence of a new clade of PolBs, dubbed piPolBs for their primer-independent DNA polymerase activity. That is, rather than requiring a pre-existing primer to add new nucleotides, they can directly place the first nucleotide opposite its complementary template nucleotide to kick-start DNA synthesis.
Although our main interest was (and still is) the characterization of the biochemical properties and biotech applications of piPolBs, we considered that understanding pipolins, our new working model, would also be essential. We were struck that these elements had gone unnoticed for years, as they do not carry any antibiotic resistance or virulence genes, and yet they are widespread among very diverse bacterial groups, which makes them either very old or very successful in terms of host adaptation (or both). Also, the related plasmid elements in mitochondria, for which the only annotated gene is the piPolB, appeared enigmatic to all of us.
Interestingly, whereas reported evidence of the mobility of polintons and casposons is limited and based on metagenomic data, pipolins provided the opportunity to analyze the occurrence, diversity, and dynamics of self-replicating elements in well-characterized commensal and pathogenic bacteria, such as Escherichia coli, not only in genomic or metagenomic data, but also in circulating field isolates and pathogenic variants. Thus, as a first step, we decided to survey pipolin prevalence and diversity among E. coli, and I contacted Prof. Jorge Blanco, head of the E. coli Spanish Reference Laboratory at the Universidad de Santiago de Compostela, who was excited about the project and agreed to screen their collection of pathogenic E. coli strains from human and animal sources for new pipolins. We soon learned that pipolins do not seem to be very abundant. However, although some of them were found in closely related strains from the same clonal group, the pipolin-harboring strains span different phylogenetic groups, serotypes, and pathotypes. Furthermore, using partial piPolB sequences obtained from the PCRs we used to screen for pipolins, together with PCR-based E. coli multi-locus sequence typing (MLST), we could perform a preliminary cophylogeny study, which suggested that pipolins are active mobile elements.
After those promising preliminary results, Jorge readily suggested carrying out whole genome sequencing of the strains, which we did with his usual collaborator, Maria de Toro, head of the Genomics and Bioinformatics platform at the Centro de Investigación Biomédica de La Rioja (CIBIR), also in the north of Spain. Maria has extensive experience in the NGS analysis of bacterial genomes and mobile elements, and her incorporation gave a broader perspective to the study. When we started to analyze the new genomes of pipolin-carrying pathogenic E. coli strains, Saskia, a Ph.D. student in Jorge's lab, realized that the analysis of the genetic structure of pipolins entailed several difficulties. Although all pipolins contained the DNA polymerase gene, they were very different from the previously reported ones and were often split across two or three assembled contigs. We realized that the bioinformatic processing of the data would be time-consuming, so we recruited Liubov (Liuba) Chuprikova, a Bioinformatics MSc student who carried out her MSc thesis in my lab under the co-supervision of Maria. For her Master's thesis, Liuba developed a custom Python pipeline for the detection, scaffolding, and extraction of pipolins, followed by a homogeneous reannotation of their genes. This method was applied to the pipolins in our new genomes, as well as to those in genomes of pathogenic strains retrieved from the GenBank database, which allowed us to expand our study to virtually all the characterized E. coli diversity.
The genetic structure of all the analyzed pipolins shows great flexibility and variability within the same species, with the piPolB gene and the attachment sites being the only common features. Most pipolins contain one or more recombinases that would be involved in the excision and integration of the element, always at the same conserved tRNA gene. In addition, we could perform a detailed cophylogeny analysis between pipolins and pipolin-harboring strains, which indicated a lack of congruence between several elements and their host strains, consistent with recent horizontal transfer between hosts. One of the more striking findings for us was the very low co-occurrence of pipolins with other integrative mobile genetic elements (MGEs), such as pks islands, functional CRISPR/Cas systems, or integrons. That is also something we will keep an eye on in the future.
Overall, we achieved our goals and gained insights into pipolin diversity and mobilization among circulating E. coli, whose pipolins are now the paradigm for self-replicating MGEs. Further, this work opens many questions regarding the transfer mechanism of pipolins and their relevance in E. coli, but also about the differences between pipolins from E. coli and those from other commensal and pathogenic bacteria that can carry them, spanning other Enterobacteria (Enterobacter, Cronobacter), other Proteobacteria (Vibrio), or even other phyla, such as Firmicutes (Staphylococcus, Streptococcus, or Clostridium).
This project arose as a consequence of previous findings and took me beyond my previous background, which was exciting and enlightening. However, we also had to deal with difficult situations in the last part of the project. One of them, the worldwide COVID-19 pandemic, which led to laboratory lockdowns and working from home, is shared with every scientist worldwide. Like many other researchers, we managed to keep the science going during these hard times, with online lab meetings that helped all of us stay motivated and in contact with our collaborators. The other adversity was harder to tackle. In November 2019, we lost Margarita Salas (4), my former mentor and scientific role model. Finishing this and the other projects that we started in collaboration with her is the best tribute to her career as a pioneering scientist, but also a major responsibility, as I must do it to the highest standard of scientific quality, which she always instilled in us.
High diversity and variability of pipolins among a wide range of pathogenic Escherichia coli strains.
Saskia-Camille Flament-Simon, María de Toro, Liubov Chuprikova, Miguel Blanco, Juan Moreno-González, Margarita Salas, Jorge Blanco, Modesto Redrejo-Rodríguez.
Scientific Reports 2020: www.nature.com/articles/s41598-020-69356-6