Streptococcus pyogenes is an ‘old’ pathogen made up of many genetic lineages, yet not every lineage of S. pyogenes is the same. One lineage, emm1 (M1), has dominated the clinical landscape over the last half a century and is the poster child of S. pyogenes infections. The history of M1 is fascinating, with new variants arising over time and replacing previous clones (much like what is seen for many viral pathogens). One key evolutionary event in the emergence of modern M1 clones was the acquisition, sometime back in the 1980s, of speA, a gene encoding the superantigen SpeA, which plays a main role in this story. Superantigens such as SpeA are not good for us humans. They send the immune system into overdrive by stimulating human T-cells, leading to some of the classic fever and clinical symptoms observed in nasty ‘invasive’ S. pyogenes infections.
While we have known about SpeA for a while, researchers in the UK recently identified a souped-up new M1 lineage, called M1UK. It not only out-competed the previous M1 lineage by the mid-2010s but was also associated with a sharp increase in national cases of scarlet fever (a pharyngeal infection) and invasive S. pyogenes infections. M1UK differs from its most recent ancestor by 27 single nucleotide polymorphisms. The hallmark feature of this new clone was a five-fold upregulation of SpeA. Bad news, right? The big question since has been: how?
Using Illumina-based RNA-seq to compare M1 and M1UK isolates, it was discovered that only two genes were upregulated in M1UK: speA and paratox, a gene upstream of speA. On closer inspection, speA and paratox turned out to lie on opposing strands of DNA, so they are transcribed in opposite directions. The data were therefore reanalysed with strand taken into account. After reanalysis only one gene was left, the “star” of our show: speA!
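To make the strand issue concrete, here is a minimal sketch of why paratox could look upregulated before the reanalysis. It is not the pipeline used in the study: the coordinates, the toy reads and the helper names (`Interval`, `count_reads`) are all made up for illustration, and it assumes each read has already been oriented to the strand of the transcript it came from.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive
    strand: str  # "+" or "-"


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base."""
    return a.start < b.end and b.start < a.end


def count_reads(gene: Interval, reads: list[Interval], stranded: bool) -> int:
    """Count reads overlapping a gene, optionally requiring the same strand."""
    return sum(
        1
        for read in reads
        if overlaps(gene, read) and (not stranded or read.strand == gene.strand)
    )


# Hypothetical layout: paratox on the minus strand, speA downstream on the plus strand.
GENES = {
    "paratox": Interval(1000, 1200, "-"),
    "speA": Interval(1300, 2050, "+"),
}

# Toy reads: three plus-strand reads running across the paratox locus,
# one plus-strand read inside speA, and one genuine minus-strand paratox read.
READS = [
    Interval(950, 1100, "+"),
    Interval(1050, 1200, "+"),
    Interval(1100, 1250, "+"),
    Interval(1400, 1550, "+"),
    Interval(1020, 1170, "-"),
]

for name, gene in GENES.items():
    unstranded = count_reads(gene, READS, stranded=False)
    stranded = count_reads(gene, READS, stranded=True)
    print(f"{name}: unstranded={unstranded}, stranded={stranded}")
```

In this toy example the plus-strand reads crossing the paratox locus inflate its unstranded count, while the strand-aware count keeps only the read that actually matches the gene's orientation.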
Now, why were RNA-seq reads observed on the DNA strand opposite paratox?
We examined read coverage across the region containing speA, paratox and the gene upstream of paratox. As per usual, this raised more questions than it answered. First, what was the upstream gene? It is purported to be a tmRNA, which is important for liberating ribosomes stalled during translation of mRNA and for the degradation of truncated protein products. Second, why was increased read coverage observed across the region opposite the paratox gene, and why did it seem to drop across “the valley of death” before resuming at a similar level across speA? Third, why do the M1 and M1UK isolates display a similar drop in read coverage across the valley of death, but different levels of coverage from the region after the tmRNA through to speA? The valley of death was puzzling for a while, until we carefully examined Illumina whole-genome sequencing data. The DNA reads showed a similar drop in coverage across the valley of death. In contrast, aligned Nanopore reads did not show this drop, indicating that the valley of death is a method-specific artefact caused by low G/C content. This was supported by previous experience in the group and by the literature.
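One quick way to check whether a coverage dip coincides with a low-G/C stretch is to compute G/C content in windows along the reference and compare it with the coverage track. The sketch below is only an illustration under stated assumptions: the sequence is a toy string rather than the real locus, and the function names and window sizes are arbitrary choices.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq: str, window: int, step: int) -> list[tuple[int, float]]:
    """Return (window start, G/C fraction) for each full window along seq."""
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Toy sequence: G/C-rich flanks around an A/T-rich middle (not real data).
toy = "GGCCGCGC" + "ATATTAAT" + "GCGCCGGC"

for start, gc in windowed_gc(toy, window=8, step=8):
    print(f"window at {start}: GC = {gc:.2f}")
```

On real data, the same windows could be plotted next to per-base Illumina and Nanopore coverage to see whether only the short-read track dips where G/C content is low.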
So why did coverage across this region differ between M1UK and M1, and was the difference related to increased SpeA production?
Remember the 27 mutations found in M1UK? One of them lies just upstream of the annotated tmRNA gene. In fact, it was found to be part of the 5’ leader sequence of the tmRNA transcript. This 5’ end, which contains the mutation, is cleaved off during tmRNA maturation and is therefore not annotated as part of the gene. The result: a single mutation in the transcript of a gene upstream of speA, and RNA-seq reads that appear to cover the stretch between the tmRNA gene and speA. Could this be due to a “leaky” terminator that fails to stop every RNA polymerase transcribing the tmRNA gene, allowing some to continue transcription into speA?
Laboratory studies, in which the mutation in the 5’ end of the tmRNA was reverted in M1UK and introduced into M1, confirmed that this single mutation was the cause of increased SpeA production. Great! But how? Luckily, an answer was found next door, which proves the power of conversation by the water cooler. A bright postdoc in a neighbouring lab had been adapting native Nanopore mRNA sequencing for bacteria. This type of mRNA sequencing primarily relies on poly(A) tails, which are absent in bacteria. The main challenges for the technology are that bacterial RNA is notoriously unstable and that multiple processing steps are required to make it compatible with native RNA-seq, which can fragment the RNA further. However, it allowed large, and sometimes even full-length, transcripts to be captured.
This was a brilliant method with the potential to answer the question! All that was required was RNA-seq on an M1UK isolate to confirm that transcripts spanning the tmRNA to speA region existed. Easy, right? Not quite. No Gram-positive bacterium had previously been reported to have undergone native RNA-seq using Nanopore technology. Nor had the postdoc been hugely successful in previous attempts with another Gram-positive species, let alone with a species that was new to them. Nonetheless, the postdoc executed skilfully, producing good-quality reads that spanned the valley of death, as well as reads spanning large parts of the combined tmRNA and speA transcript. This elegantly confirmed what our other, more anecdotal, bioinformatic evidence had suggested: occasionally the terminator of the tmRNA fails to halt transcription, allowing some RNA polymerases to continue downstream and transcribe speA. This readthrough increased when the mutation was introduced into the 5’ end of the tmRNA transcript, as in M1UK, explaining the lineage’s signature increase in SpeA production.
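The readthrough idea can be put into numbers with long reads: of the reads that start upstream of the terminator, what fraction continue past it? The following is a rough sketch, not the analysis from the study. It assumes reads are given as (start, end) coordinates on the transcribed strand, and the terminator position, the `min_overhang` threshold and the toy reads are all hypothetical.

```python
def readthrough_fraction(
    reads: list[tuple[int, int]],
    terminator_end: int,
    min_overhang: int = 50,
) -> float:
    """Fraction of reads starting upstream of the terminator that extend
    at least `min_overhang` bases beyond it.

    Reads are (start, end) pairs, 0-based with end exclusive, oriented so
    that transcription runs from low to high coordinates (an assumption).
    """
    upstream = [(start, end) for start, end in reads if start < terminator_end]
    if not upstream:
        return 0.0
    through = sum(1 for _, end in upstream if end >= terminator_end + min_overhang)
    return through / len(upstream)


# Toy reads around a hypothetical terminator ending at position 500.
toy_reads = [
    (100, 480),  # stops before the terminator
    (120, 495),  # stops before the terminator
    (90, 510),   # passes the terminator, but by less than min_overhang
    (110, 900),  # clear readthrough into the downstream gene
    (600, 900),  # starts downstream, so it is not counted at all
]

print(readthrough_fraction(toy_reads, terminator_end=500))
```

Requiring a minimum overhang is a design choice to avoid counting reads that merely end a few bases late; comparing this fraction between isolates with and without the mutation would be one way to express a difference in terminator leakiness.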
So, the old schoolers may ask why we did not just use northern blotting. It is a fair question. We tried, with mixed success. Nonetheless, the technique is neat and has already enabled new discoveries, and hopefully it will enable many more in the future. There is still much to explore around operon structure, transcriptional start and end sites, and the epitranscriptome that long-read native RNA-seq can hopefully help us understand. So, an old dog that picked up a new trick taught us the value of a new technology and helped to uncover the mechanism underlying the increased SpeA expression of the S. pyogenes M1UK sub-lineage.