
GDS harnesses machine learning to control feedback spam

11/10/22

The Government Digital Service (GDS) has developed a machine learning spam classifier to improve control of feedback on GOV.UK.

It said it has taken the step to defend against security threats and to improve the quality of insights from the central government website to officials.

Felix Reilly, a data scientist at GDS, said in a blog post that GOV.UK received around 540,000 feedback responses in 2021, and that in early 2022 there was a surge in spam to a peak of 12% of the total. The spam ranged from fraudulent advertisements and inappropriate links to multiple lines of code.

This made it almost impossible to remove unusable responses by filtering them out manually, and raised the risk of users clicking on dangerous links.

In response, the GDS team identified a need for a model that would go beyond rules-based methods of detecting spam and use machine learning to classify feedback responses according to the probability that they are spam.

It used tools including the Machine Learning Canvas, the random forest algorithm and Data Version Control, and took an agile approach to building the model.
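
As a rough illustration of the approach described above, and not GDS's actual code, the sketch below trains a random forest on a handful of made-up feedback strings and returns a spam probability for each new response. It assumes scikit-learn is installed; the TF-IDF features, the sample data and the names build_classifier and spam_probabilities are hypothetical choices made for this example.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Made-up training data for illustration only: 1 = spam, 0 = genuine feedback.
TRAIN_TEXTS = [
    "Buy cheap watches now at http://example.com",
    "Win a free prize, click this link today",
    "<script>alert('x')</script>",
    "Earn money fast, visit our casino site",
    "I could not find the form to renew my passport",
    "The page about council tax was clear and helpful",
    "The link to the driving test booking is broken",
    "Please add guidance on childcare vouchers",
]
TRAIN_LABELS = [1, 1, 1, 1, 0, 0, 0, 0]


def build_classifier():
    """Fit a TF-IDF + random forest pipeline on the toy data (an assumed setup)."""
    model = make_pipeline(
        TfidfVectorizer(),
        RandomForestClassifier(n_estimators=50, random_state=0),
    )
    model.fit(TRAIN_TEXTS, TRAIN_LABELS)
    return model


def spam_probabilities(model, texts):
    """Return the estimated probability that each text is spam."""
    # predict_proba returns one column per class, ordered as in model.classes_.
    spam_column = list(model.classes_).index(1)
    return [float(row[spam_column]) for row in model.predict_proba(texts)]
```

Returning a probability rather than a hard yes or no label leaves the decision about where to draw the line between spam and genuine feedback to a separate step.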

Huge time savings

“The first iteration of our spam classifier is capable of delivering huge time savings to GOV.UK,” Reilly said. “We can run it on over a month’s worth of feedback data – around 40,000 responses – in less than five minutes, a fraction of the time it takes human reviewers.”

He said that the model will now be deployed to larger, more complex feedback datasets and that new features will be engineered. This is aimed at improving its accuracy by finding an optimum balance in the classification threshold.
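
To make the idea of balancing the classification threshold concrete, here is a minimal sketch in plain Python. It assumes the model outputs a spam probability per response and that a labelled sample is available. Using the F1 score to pick the threshold is an assumption of this example, not something the blog post specifies, and the function names are hypothetical.

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall when every score >= threshold is flagged as spam."""
    flagged = [score >= threshold for score in scores]
    true_pos = sum(1 for f, y in zip(flagged, labels) if f and y == 1)
    false_pos = sum(1 for f, y in zip(flagged, labels) if f and y == 0)
    false_neg = sum(1 for f, y in zip(flagged, labels) if not f and y == 1)
    # Treat an empty denominator as a perfect score to avoid dividing by zero.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 1.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 1.0
    return precision, recall


def best_threshold(scores, labels, candidates):
    """Pick the candidate threshold with the highest F1 score (an assumed criterion)."""
    def f1(threshold):
        precision, recall = precision_recall(scores, labels, threshold)
        total = precision + recall
        return 2 * precision * recall / total if total else 0.0

    return max(candidates, key=f1)
```

A lower threshold catches more spam but flags more genuine feedback; a higher one does the reverse. A team that cares more about one kind of error could swap F1 for a measure weighted towards precision or recall.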

“Open development remains integral to our approach, helping us refine collaboration across teams, test novel techniques, and speed up the processing of tens of thousands of feedback responses every month,” he added.

 
