The NHS AI Lab Skunkworks team has been working on a pilot project called ‘Data Lens’ – a fast-access data search that works in multiple languages – with the first-stage prototype now completed.
Data Lens was a successful candidate from the 2020 edition of the ‘Dragon’s Den-style’ project pitch to the AI Lab team at NHSX, an event which also took place this year. Its core aim is to bring together information from multiple databases.
The pitch outlined what is considered a ‘common data problem’ for analysts and researchers across the UK – ‘large volumes of data’ being held on ‘numerous incompatible databases in different organisations’.
According to the Skunkworks, the team – made up of colleagues at the NHSX Analytics Unit, Accelerated Capability Environment (ACE) and its competition-selected community member Naimuri – wanted to be able to ‘quickly source relevant information with one search engine’.
It’s hoped this will ultimately help reduce workload, make more high-quality information available and identify which types of data are being collected.
The project went through a 12-week development phase before the prototype was completed, with Natural Language Processing (NLP) and other AI technologies used to create a ‘universal search engine for health and social care data catalogues and metadata’.
Aiming to be collaborative, user-friendly and time-saving, it works by joining up data catalogues from sources such as NHS Digital, the Health Innovation Gateway, MDXCube, NHS Data Catalogue, PHE Fingertips and the Office for National Statistics.
In an additional layer of usability, the tool will also provide open-source code and documentation from its development. This will be freely available for other developers and organisations to use, learn from and build on.
As well as increasing data access, the tool will track user searches, click-throughs and datasets used – to enable it to become more useful to an individual over time and suggest ‘relevant results that go beyond the scope of search terms’.
For example, it was ‘trained’ to ‘better understand’ semantic similarities between search terms, such as ‘smoking’ and ‘cancer’, and it responds to user feedback – reacting to “thumbs up” and “thumbs down” responses for suggested results.
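As a rough illustration of how “thumbs up” and “thumbs down” responses could influence suggested results, the minimal Python sketch below nudges a result's score by its net votes. It is not the Data Lens implementation: the `FeedbackRanker` class, the dataset identifiers, the base scores and the `weight` value are all hypothetical and chosen only for the example.

```python
from collections import defaultdict


class FeedbackRanker:
    """Hypothetical re-ranker: adjusts search scores using thumbs up/down votes."""

    def __init__(self, weight=0.1):
        # How much one net vote shifts a result's score (assumed value).
        self.weight = weight
        # Net votes per dataset: +1 for thumbs up, -1 for thumbs down.
        self.votes = defaultdict(int)

    def record(self, dataset_id, thumbs_up):
        """Store a single piece of user feedback for a suggested result."""
        self.votes[dataset_id] += 1 if thumbs_up else -1

    def rerank(self, results):
        """Sort (dataset_id, base_score) pairs by score plus feedback adjustment."""
        return sorted(
            results,
            key=lambda r: r[1] + self.weight * self.votes.get(r[0], 0),
            reverse=True,
        )


ranker = FeedbackRanker()
ranker.record("smoking-prevalence", thumbs_up=True)
results = [("cancer-registrations", 0.50), ("smoking-prevalence", 0.45)]
reranked = ranker.rerank(results)
print(reranked)
```

With a single thumbs up, the lower-scored example dataset moves ahead of the other. A real system would likely also weigh click-throughs and dataset usage, which the article says the tool tracks.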
Searches and results are also available in all 71 languages supported by Amazon Web Services (AWS), while ‘jargon busters’ and fuzzy matching for typos further increase the ‘usability and inclusivity’ of the engine.
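To give a sense of what fuzzy matching for typos involves, the short Python sketch below corrects misspelt search terms against a list of known terms using the standard library's `difflib`. This is only an assumed approach for illustration; the article does not say which technique Data Lens uses, and the `VOCABULARY` list, the function names and the `cutoff` value are hypothetical.

```python
from difflib import get_close_matches

# Hypothetical vocabulary of known catalogue terms; not taken from Data Lens.
VOCABULARY = ["smoking", "cancer", "diabetes", "asthma", "obesity"]


def correct_term(term, vocabulary=VOCABULARY, cutoff=0.75):
    """Return the closest known term, or the input unchanged if none is close enough."""
    matches = get_close_matches(term.lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else term


def correct_query(query):
    """Apply typo correction to each whitespace-separated word in a query."""
    return " ".join(correct_term(word) for word in query.split())


print(correct_query("smokng cancr"))
```

Words with no sufficiently close match are passed through untouched, so unfamiliar but valid terms are not silently rewritten.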
Supported through the NHS digital service development pipelines, the prototype is in line with the NHS England ‘Joining up health and care data’ initiative, as well as wider goals to use AI to make the most of existing data sets and ‘ease the burden’ of data collection across the health and care system.
Paul Ross, Data Engineer at the NHSX Analytics Unit, said: “Working with the AI Lab Skunkworks on this project was Agile in the truest sense of the word. We pitched an idea, had funding approved and were up and running in a very short amount of time. I sincerely hope it can be taken forward into production to help its users get value from the wealth of data and information that is produced by the Health and Social Care sector.”
Any interested parties can find the Data Lens code through the dedicated NHSX GitHub, or read the full Data Lens case study via NHSX.