NHS AI Lab publishes blueprint for artificial intelligence validation

The NHS Artificial Intelligence Laboratory (NHS AI Lab) has published a new blueprint focusing on artificial intelligence validation.

Entitled ‘NCCID case study: Setting standards for testing Artificial Intelligence’, the blueprint follows an evaluation into the performance of AI models using data from the national COVID-19 Chest Imaging Database, and highlights a proof-of-concept validation process for testing the quality of AI in the health and care sector.

It discusses the need for a large volume of good quality data when developing AI radiology products – such as when using AI to identify COVID-19 from scans – in order to ensure the tools perform effectively and reliably enough for use.

The document highlights that creating a validation process for AI tools is essential in limiting negative outcomes for patients – and ensuring adopted technologies are both safe and ethical – through eliminating systematic and repeated errors in AI models, such as bias.

Explaining what the validation process looks like, the document states that the tests calculate how accurately the models detected positive and negative COVID-19 cases from medical images, as well as how the models performed with different sub-groups – such as age, race, ethnicity, and sex.

The document split the validation process into four broad steps, tailored depending on the inputs and outputs of each model, which included:

  • Creating a validation data set based on the intended use case of each algorithm, using NCCID data that had not been used to train the algorithm.
  • Using a cloud-based deployment environment to run the algorithm, rather than a locally hosted one, which provided a secure space that protects the developer’s intellectual property.
  • Running the model on the validation set and performing pre-defined statistical tests to assess the robustness and performance of the model against various demographics.
  • Reporting the results to the organisation that built the models in order to inform model improvements.
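The statistical tests described above – checking how accurately a model detects positive and negative cases, and whether performance holds across demographic sub-groups – can be sketched in a few lines of Python. This is an illustrative sketch only, not the NHS AI Lab's actual test suite; the function names and the toy data are assumptions for the example.

```python
from collections import defaultdict


def sensitivity_specificity(labels, preds):
    """Return (sensitivity, specificity) for binary ground-truth labels
    and model predictions, where 1 = COVID-19 positive, 0 = negative."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def per_subgroup_metrics(labels, preds, groups):
    """Compute the same metrics separately for each demographic
    sub-group (e.g. age band, sex, ethnicity) to surface any bias."""
    buckets = defaultdict(lambda: ([], []))
    for y, p, g in zip(labels, preds, groups):
        buckets[g][0].append(y)
        buckets[g][1].append(p)
    return {g: sensitivity_specificity(ys, ps)
            for g, (ys, ps) in buckets.items()}


# Toy validation run: 6 held-out scans across two sub-groups.
labels = [1, 1, 0, 0, 1, 0]
preds  = [1, 0, 0, 1, 1, 0]
groups = ["under-60", "under-60", "under-60", "60-plus", "60-plus", "60-plus"]

print(sensitivity_specificity(labels, preds))
print(per_subgroup_metrics(labels, preds, groups))
```

A large gap between sub-group scores (as in the toy output, where sensitivity differs between the two age bands) is the kind of systematic error the validation process is designed to flag before a model reaches clinical use.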

The document also highlighted some of the outcomes and learnings from the process:

  • Improving understanding of the potential for AI models to support clinicians in diagnosing COVID-19 from medical images.
  • Producing guidance about the statistical tests needed to assess model performance and robustness.
  • Creating a method of developing labelled data sets.
  • Experimenting with data curation that will guide future imaging platform development.
  • Supporting methods of quantifying and reducing bias in AI models for health and care.
  • Producing technical guidance for creating secure development environments that AI vendors can trust.

Dominic Cushnan, Head of AI Imaging, NHS AI Lab, said: “Our rigorous validation and testing procedures have implemented a novel process to test that AI models adopted are safe, robust and accurate in diagnosing COVID-19 – while protecting developers’ intellectual property.

“Unfair and biased models can lead to inconsistent levels of care, a serious problem in these critical circumstances. Outside of the NHS, our validation process has helped guide the use of AI in medical diagnosis and inform new approaches to the international governance of AI in healthcare.”

Read the full document here.