Citizen scientists enlisted to spot viruses

A new project from Diamond Light Source combines citizen science and artificial intelligence to help manage huge quantities of cryo-EM data.

It’s a big data cliché that one experiment can produce more information than one scientist can analyse in a lifetime. Not only are computing power and storage an issue, but even the simplest tasks can take an extremely long time on large datasets. As instrumentation in microscopy has improved and automation has increased, large datasets are becoming increasingly common, and they present a whole host of challenges to scientists looking to answer important research questions.


A good example of this is cryo-electron tomography, which has seen a surge in popularity in recent years and generates huge amounts of data. A new project at Diamond Light Source is taking a citizen science approach to cryo-EM images of rotavirus-infected cells. The incentive is clear for Diamond, which produces 500 terabytes of biological data every month, and so it recently launched “Science Scribbler – Virus Project”. The project is funded by the Wellcome Trust and has been developed in collaboration with Zooniverse, a platform for “people-powered” research.

The idea is simple. Citizen scientists are shown cryo-EM images of viruses and then asked to find them in a series of images from real experiments. Their input will help to train an AI algorithm that will be able to sort cryo-EM data in the future and streamline the data analysis pipeline.


Professor Dave Stuart FRS, MRC Professor of Structural Biology at the University of Oxford and Life Sciences Director at Diamond Light Source, explains:


The ultimate goal is to completely automate segmentation using advances in deep learning. Such methods require significant quantities of already segmented data to train the systems we use. To build segmented data for this development, Zooniverse will offer members of the public across the globe the chance to partake in segmenting datasets to help researchers. This project aims to address these issues by providing tools to help researchers label features of interest, and to gather the data that is produced by citizen scientists in a standardised way that can be used to automate the process in the future, thereby helping to shorten the analysis process from weeks to days or less.


Maybe ironically, this citizen science project is actually an artificial intelligence project. When you think about it, though, this makes sense. Project Coordinator Mark Basham from Diamond Light Source explains:


Artificial Intelligence has begun to have a massive impact on the world in the last few years, from beating humans in games such as Go, to the amazing advances in self-driving cars. These dramatic developments have been aided by the availability of vast quantities of data with which AI systems can be trained. Alongside these developments, 3D imaging of frozen cells, for example, has also developed rapidly, but as yet, very little training data is available. Researchers spend much of their time manually processing their data and this is an area where AI could be heavily used. However, for machine learning to be possible, we need human input to guide the process and this is where members of the public can make a huge difference to our work.

You can take part in the citizen science project by visiting the Science Scribbler – Virus Project page.

Ben Libberton

Science Communicator, Freelance

I'm a freelance science communicator, formerly a Postdoc in the biofilm field. I'm interested in how bacteria cause disease and look to technology to produce novel tools to study and ultimately prevent infection.