The Ministry of Justice (MoJ) has urged other government bodies to make use of its Splink software for linking datasets.
MoJ data scientist Robin Linacre said in a blog post that the software, written in Python and available as open source on GitHub, supports record linking, which links multiple records that refer to the same entity but have no unique identifier.
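To illustrate the general idea of linking records without a unique identifier, the following is a minimal sketch of probabilistic record linking in plain Python. It does not use Splink's API and is not taken from the blog post. The sample records, the per-field `m` and `u` probabilities in `PARAMS`, the `PRIOR` value and the `match_probability` function are all assumptions made up for the example.

```python
import math
from itertools import combinations

# Hypothetical records that share no unique identifier.
records = [
    {"id": 1, "first_name": "john", "surname": "smith", "dob": "1980-02-01"},
    {"id": 2, "first_name": "jon", "surname": "smith", "dob": "1980-02-01"},
    {"id": 3, "first_name": "mary", "surname": "jones", "dob": "1975-07-12"},
]

# Assumed parameters for each field, as (m, u):
#   m = probability the field agrees when two records ARE the same entity
#   u = probability the field agrees when two records are NOT the same entity
PARAMS = {
    "first_name": (0.90, 0.010),
    "surname": (0.95, 0.005),
    "dob": (0.98, 0.001),
}

# Assumed prior probability that a randomly chosen pair is a true match.
PRIOR = 0.001


def match_probability(a, b):
    """Combine per-field evidence into a match probability for one pair."""
    # Start from the prior odds, expressed as a log2 match weight.
    log_odds = math.log2(PRIOR / (1 - PRIOR))
    for field, (m, u) in PARAMS.items():
        if a[field] == b[field]:
            # Agreement is evidence for a match.
            log_odds += math.log2(m / u)
        else:
            # Disagreement is evidence against a match.
            log_odds += math.log2((1 - m) / (1 - u))
    odds = 2 ** log_odds
    return odds / (1 + odds)


# Score every pair of records; the ids are only used as labels here.
scores = {
    (a["id"], b["id"]): match_probability(a, b)
    for a, b in combinations(records, 2)
}

for pair, probability in sorted(scores.items()):
    print(pair, round(probability, 4))
```

Under these assumed parameters, records 1 and 2 score highly because surname and date of birth agree, even though the first name is spelled differently, while neither is likely to match record 3. A production tool would typically estimate such parameters from the data and use fuzzy comparisons rather than exact equality.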
He said Splink is now in its third version and provides a number of benefits. These include the ability to link datasets of tens of millions of records or more, greater speed and accuracy than other free tools, compatibility with multiple databases and big data processing engines, and a range of interactive data visualisations.
Linacre added that the software has made it possible for the MoJ to share new linked datasets with accredited researchers as part of its Data First programme. The MoJ recommends that new users begin with the software's online tutorial.