
HTN Now panel discuss AI approaches, safety, policy, regulation and evaluation

For a recent HTN Now webinar, we were joined by Neill Crump, digital strategy director at Dudley Group NHS Foundation Trust; Anil Mistry, AI safety lead and senior clinical scientist in AI at Guy’s and St Thomas’ NHS Foundation Trust; and Matea Deliu, GP clinical lead and clinical lead for primary care digital delivery, South East London ICB.

After some brief introductions, our panellists discussed AI in healthcare, sharing learnings from recent AI strategies and projects along with their thoughts on regulation, tackling data bias, safety and evaluation.

AI projects and strategies 

To begin, Matea spoke about some of the recent AI projects that have taken place across South East London ICB, starting with the use of Microsoft Copilot. “We are one of the first pilot sites for NHS England to really try and evaluate whether Copilot is useful for clinicians and non-clinicians. We already knew about Copilot through Microsoft Word and Excel, using it in the back office, but we were tasked with seeing if there was a use for it in primary care as well.”

Matea went on to explain what this meant for the ICB, adding, “we put a call out and allocated a certain number of licences per practice, and started to help to organise training sessions. We’re still currently in the process of gathering feedback from our practices. I’d say probably by next year we should have a little bit more information, but overall it has been quite positive.”

She then touched on the ICB’s plans around implementing ambient AI. “About 70 percent of our practices are now using ambient AI,” she said. “The challenge has been that some practices don’t really know how to evaluate this type of tool or which tools have gone through the right regulations.” Because of this, Matea shared that many practices “didn’t really know which tool was the best one to use and clinical safety was often in the back of everyone’s mind because of a lack of clinical safety officers”. As such, the ICB is looking at ways to support practices with clinical safety, with Matea adding, “we’re looking at ways of using other tools to help enable practices to generate their own DCB0160 as part of implementing ambient AI” (DCB0160 being the NHS standard for clinical risk management in the deployment and use of health IT systems).

One final AI project Matea wanted to share was the use of AI triage and intelligent navigation, part of an NHS England initiative where the ICB is piloting the NHS App as a platform to improve the patient experience. This includes allowing patients to input their symptoms into the NHS App, with the AI triage tool using all the relevant data it already has to “signpost or navigate the patient to the right place at the right time, with the right type of clinician, not just your GP but more under-utilised community services like Pharmacy First too.”

Next, Neill took us through some of the strategic approaches he’s been involved with at The Dudley Group, going into more detail around acute and community care. “Ambient AI is absolutely a big focus for us and we think that it’s got the ability to transform the clinical patient experience,” he said. “We’ve been working with one supplier called Heidi, who we chose because they had the correct assurance in terms of device type. We’ve started small, choosing to use their AI scribe in same-day emergency care as well as rheumatology as a test. And what we’ve found is that it’s helped with freeing up clinical time and removing the administrative burden of typing up the consultation.”

He also touched on natural language processing, explaining how the trust now has an informatics team using cloud-based analytics for a variety of tasks: “We’ve been looking at patient satisfaction, for example, as well as performing thematic analysis around staff surveys and IT incidents”. Neill then emphasised the importance of “using AI effectively for detection and response” as a form of cyber protection, in order to “improve the way we deliver care safely, how we use information and how it’s governed”.

Speaking about some of the AI projects and strategies at Guy’s and St Thomas’, Anil shared, “we’ve ended up evaluating well over 50 different products to solve clinical problems”, before sharing one example the trust has built themselves. To give some context, Anil explained how the trust needed an AI solution to help with the auto-contouring of tumours as part of their cancer treatment and imaging techniques. “A lot of organisations have tried to implement this but we weren’t able to evaluate anything good, so we ended up building one ourselves,” he said. “We’ve clinically implemented and translated research into the clinic, which is mainly focused around pelvis and prostate auto-contouring, showing all the organs at risk.”

Explaining the reasoning behind why the trust chose to develop their own tools within this area, Anil said, “what we found with other AI products that you buy is that they’re often trained on certain data sets that may not work on our own cohort. And it’s particularly a problem in radiotherapy, where each individual hospital, each individual department and each cancer centre has their own contouring culture. That means if you buy their products, it’s not always going to work.” He noted, “being able to train something on your own data and deploy it in your own hospital is really, really powerful. And being able to build it in-house has reduced the costs of us being able to deploy these applications.”
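
To make the “train on your own data” point concrete, here is a minimal sketch (an assumption-laden illustration, not a description of the trust’s actual pipeline) of fine-tuning a 3D segmentation model on locally contoured scans using the open-source MONAI library; the architecture, channel sizes and dummy data are all placeholder choices.

```python
# Illustrative sketch only, not the trust's actual system: fine-tuning a
# small 3D segmentation network on locally contoured scans, using the
# open-source MONAI library. Architecture, sizes and data are placeholders.
import torch
from monai.losses import DiceLoss
from monai.networks.nets import UNet

device = "cuda" if torch.cuda.is_available() else "cpu"

model = UNet(
    spatial_dims=3,
    in_channels=1,                  # e.g. a planning CT volume
    out_channels=3,                 # background plus two organs at risk
    channels=(16, 32, 64, 128),
    strides=(2, 2, 2),
).to(device)

loss_fn = DiceLoss(to_onehot_y=True, softmax=True)
optimiser = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(image: torch.Tensor, label: torch.Tensor) -> float:
    """One optimisation step on a locally curated (scan, contour) pair."""
    model.train()
    optimiser.zero_grad()
    loss = loss_fn(model(image.to(device)), label.to(device))
    loss.backward()
    optimiser.step()
    return loss.item()

# Dummy batch standing in for a locally contoured pelvic CT volume.
image = torch.randn(1, 1, 64, 64, 64)                    # (B, C, D, H, W)
label = torch.randint(0, 3, (1, 1, 64, 64, 64)).float()  # class indices
print(f"loss: {train_step(image, label):.4f}")
```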

Anil and the team at Guy’s and St Thomas’ have been evaluating other AI tools as well, looking at how they could help solve problems within radiology, particularly around “never events”. As an example of a never event, he explained, “you might have a patient on the waiting list for three to four weeks with a very obvious pulmonary embolism, but because it wasn’t down as suspect PE, the patient would not be seen”. With this in mind, the trust has been trying to implement pulmonary embolism detection in the backlog for non-indicated incidental pulmonary embolism, working with multiple third-party suppliers to see which solution works best.

Evaluating AI tools

Moving on to explore evaluation in more detail, Neill spoke about the benefit realisation study that is currently taking place at The Dudley Group. “We actually want to publish the study because we’re looking to do a wider procurement outside The Dudley Group. We want to be really clear on the benefits of using an ambient AI scribe.” Delving deeper into what the evaluation will cover, Neill added, “some of the benefits we’ve seen are around the clinician and patient experience, recognising how busy clinical life is at the moment, especially with all the targets they’re being asked to meet. And also recognising that we’re pretty much at an all-time low in terms of what people think about the NHS.”

A key benefit he wanted to highlight was the actual transcription that patients receive: “It’s in plain English, without all the medical language, so patients can actually understand what’s being written”. He also noted improvements around documentation, coding and governance: “If we get really high-quality data captured as part of the process, then the coding aspect becomes easier. For example, our coding manager has seen that there’s an opportunity to actually reduce the number of people that he has coding. And it’s not about reducing jobs, it’s about being more efficient and effective in how we complete those processes.” This high-quality coding then allows others, like those from Aston University who are on placement, to perform analytics based on the transcription, Neill explained.

“We’ve been looking at AI in different ways,” he said. “Looking at whether or not it offers a way to improve elective recovery and demonstrating the financial impact as well. We can’t just keep adding new tools without showing how we’re actually going to take cost out. That’s why we started small because we wanted to make sure that we could actually take clinicians and patients with us and then work out how we can scale that in a sustainable way.”

For Neill, the ultimate goal is to roll this out across every single speciality: “I think that ambient AI, based on what I’ve seen so far and the feedback that we’re getting from clinicians, has the ability to radically transform how we deliver healthcare. We’ve already got the integration into the EPR sorted, so we’re going to have to look at how we can actually enhance that integration as well.”

When evaluating any type of ambient AI tool, Matea noted the importance of “really looking at the key metrics that are going to deliver the most benefits and allow you to make an informed decision on implementation further down the line.” She touched upon operational efficiency, cost reduction and time savings as three key metrics to track. “One of the main things to consider for ambient AI is whether it’s actually going to improve a clinician’s workflow and cognitive load,” she said. “And some of the basic things that we’ve already seen with ambient AI in particular is that clinicians don’t have to type up their notes, which is a real time-saver.” However, Matea also highlighted the fact that not every tool is integrated with every system, which means clinicians do have to take the time to copy and paste to “make sure that the note is actually accurate enough” across the board.

“You don’t have to constantly think about remembering everything and making sure that all the information has been captured,” she said, highlighting the benefits around cognitive load. “The other metric that we look at includes clinician satisfaction, which evaluates usability, accuracy, acceptance and whether clinicians actually trust the AI tool. I think it’s really important for AI tools and LLMs to be transparent about the hallucination rates. For example, Tortus AI has a dashboard that they use to constantly monitor hallucination rates, which is really useful for being able to develop trust. Identifying hazards and looking at how we can mitigate them is an important part of any evaluation process.”
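
As a purely illustrative sketch of the idea behind such monitoring (not how Tortus AI or any other product actually works), a monitoring job could flag note sentences that share few words with the source transcript and report the unsupported fraction; the overlap heuristic and threshold below are crude stand-ins for the far stronger checks real products use.

```python
# Illustrative sketch only: a crude way a team might track an AI scribe's
# "hallucination rate", in the spirit of the dashboards discussed above.
# The word-overlap heuristic and 0.5 threshold are stand-ins.
import re

def sentences(text: str) -> list[str]:
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]

def is_supported(sentence: str, transcript: str, threshold: float = 0.5) -> bool:
    """Treat a note sentence as supported if enough of its words
    appear in the source transcript (a deliberately naive proxy)."""
    words = {w.lower() for w in re.findall(r"[a-zA-Z']+", sentence)}
    source = {w.lower() for w in re.findall(r"[a-zA-Z']+", transcript)}
    return bool(words) and len(words & source) / len(words) >= threshold

def hallucination_rate(note: str, transcript: str) -> float:
    """Fraction of note sentences not grounded in the transcript."""
    sents = sentences(note)
    if not sents:
        return 0.0
    unsupported = sum(not is_supported(s, transcript) for s in sents)
    return unsupported / len(sents)

transcript = "Patient reports a dry cough for two weeks. No fever. Denies chest pain."
note = "Two-week history of dry cough. No fever or chest pain. Commenced antibiotics."
print(f"hallucination rate: {hallucination_rate(note, transcript):.0%}")
# "Commenced antibiotics" has no grounding in the transcript, so it is flagged.
```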

Finally, she discussed scalability and adoption and “looking at how successful an AI tool is when deployed in different settings”. She outlined, “not every practice is the same. Being successful in one practice doesn’t necessarily translate the same way to another practice. So we need to assess whether we can adapt that tool from different settings and different use cases to understand long-term sustainability and scalability.”

From an integration perspective, Anil expressed that there are “so many different things you need to consider before you can even evaluate a product”. Because of this, he suggested trying to speed up the evaluation process, “so that you’re not dragging it out” and spending too much or using too many resources. He noted the importance of “having a solid framework that covers clinical risk, IT risk and cybersecurity risk, before you even engage with the tool,” as this can help you identify products and companies who might not have done their due diligence. “We’ve stopped evaluations based on the fact that the companies haven’t anonymised the data before it enters the cloud. We’ve stopped evaluations because of our clinician workload,” he shared.

“The most successful evaluations I’ve done are when I’ve had clinicians who really understand the pathway and the implications of it,” Anil went on to say, highlighting how essential it is to get clinicians on board. “Clinical safety is a big issue as well,” he said, echoing much of what Matea had already mentioned on this subject, before adding, “Providing clinical safety training for AI deployments can help you map your pathway really clearly and assess the risks at every point, whether that’s data transferring between departments or clinical groups”. As a final note on this topic, Anil summarised, “evaluations aren’t just about whether a tool is effective; they show how tools can be useful within the hospital as well.”
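
To illustrate what identifying and mitigating hazards can look like in practice, here is a minimal sketch of a hazard log entry of the kind a DCB0160-style clinical safety assessment typically captures; the fields, the 1-5 scoring scale and the example hazard are our own illustrative conventions, not a mandated schema.

```python
# Illustrative sketch only: the shape of a simple hazard log entry of the
# kind a clinical safety assessment (e.g. under DCB0160) might record.
# Fields and the 1-5 scoring are common conventions, not a mandated schema.
from dataclasses import dataclass

@dataclass
class Hazard:
    description: str
    cause: str
    effect: str
    severity: int        # 1 (minor) to 5 (catastrophic)
    likelihood: int      # 1 (very low) to 5 (very high)
    mitigation: str

    @property
    def risk_score(self) -> int:
        """Crude risk rating: severity x likelihood."""
        return self.severity * self.likelihood

log = [
    Hazard(
        description="AI scribe omits a stated drug allergy from the note",
        cause="Transcription or summarisation error",
        effect="Allergy missed in later prescribing decisions",
        severity=5,
        likelihood=2,
        mitigation="Clinician reviews and signs off every note before saving",
    ),
]

for h in log:
    print(f"risk {h.risk_score}: {h.description}")
```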

Finding the right balance between speed and risk

Picking up on Anil’s point about trying to speed things up, Matea looked at this in terms of implementation, stating, “primary care is one of those really interesting places where because there are so many practices working as individual entities, they can make any decision they want. Whereas within a trust, it’s a lot more regimented and standardised, so there are a lot more layers to go through.” She went on to highlight how this can lead to a difference in speed: “If a practice wants to implement something tomorrow, they absolutely can and it’s usually within their own budget.”

However, she noted that because there are so many AI tools available, it can be “very difficult to know where to start and where to look and what a safe implementation really looks like”. Because of this, Matea suggested that standardised frameworks are necessary to help simplify this and “reduce a lot of the duplication that we have”. As an example, she highlighted the AI framework that was designed for safe implementation and maintenance of AI within her ICB and the London area. “It’s a really great starting point,” she shared. “But there are still nationwide differences where our framework might not necessarily be applicable elsewhere. So, we need to start looking at how we can standardise it nationally.”

Matea then went on to consider interoperability as key for balancing speed with risk, referencing the US as an example of a country where “interoperability is effective by default”. With this in mind, she noted, “EPRs really need to allow easier integration and interoperability to be able to actually scale up and speed up implementation in primary care.”

Neill shared his thoughts around balancing speed with risk by stating that it’s important to “look at areas where we can have a high impact, but then measure that against the risk of actually going ahead with implementation. Then you do all the other stuff around clinical safety etc.” Like Matea, he suggested having some sort of process or framework in place, where you look at past use cases, agree on the success metrics and then split the process up into different stages. “What would day zero to eleven look like? Day eleven to forty? And then day forty onwards?” He suggested that this could help with assessing any bias and fine-tuning certain areas before you then decide to take it into the ICU or somewhere similar. “You wouldn’t go for full deployment, but maybe through some sort of assisted mode where you could then start testing.”
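
Purely as an illustration of the staged approach Neill describes, a rollout plan with pre-agreed success metrics could be encoded along these lines; the phase boundaries, modes and metrics are placeholders, not his actual framework.

```python
# Illustrative sketch only: encoding a phased AI rollout with success
# metrics agreed up front. Phases, modes and metrics are placeholders.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Phase:
    name: str
    day_start: int
    day_end: Optional[int]       # None = open-ended
    mode: str                    # "shadow", "assisted" or "full"
    success_metrics: list = field(default_factory=list)

rollout = [
    Phase("pilot", 0, 11, "shadow",
          ["transcription accuracy", "bias audit on pilot cohort"]),
    Phase("assisted", 11, 40, "assisted",
          ["clinician sign-off rate", "time saved per consultation"]),
    Phase("scale", 40, None, "full",
          ["sustained accuracy", "incident rate"]),
]

for p in rollout:
    end = p.day_end if p.day_end is not None else "onwards"
    print(f"day {p.day_start}-{end}: {p.mode} mode, track {p.success_metrics}")
```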

Adding to what Neill said about assessing success metrics, Anil used an example from Stanford Health in California, where a sepsis app was considered but has since been dropped. “They tried it for four years and have said that they don’t want to do it anymore because there’s no value to it,” he explained. “You’d think there would be, but if it only works as a classifier and not as a predictor, then it’s going to have a huge margin for error. And if you’re in an ICU scenario where there are so many other things to think about, having a light that comes on saying that a patient might have sepsis and there’s a 50 percent chance it’s wrong, isn’t valuable at all. And the healthcare system isn’t in place to be able to deal with that result.”
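
Anil’s “50 percent chance it’s wrong” follows from simple arithmetic: at low prevalence, even a classifier with strong headline sensitivity and specificity yields alerts that are right only about half the time. The figures in this worked example are hypothetical, not Stanford’s.

```python
# Hypothetical worked example: positive predictive value (PPV) of an alert.
# The numbers are illustrative, not taken from the Stanford evaluation.
sensitivity = 0.90   # P(alert | patient develops sepsis)
specificity = 0.95   # P(no alert | patient does not develop sepsis)
prevalence = 0.05    # fraction of monitored patients who develop sepsis

true_alerts = sensitivity * prevalence
false_alerts = (1 - specificity) * (1 - prevalence)

ppv = true_alerts / (true_alerts + false_alerts)
print(f"PPV: {ppv:.0%}")  # ~49%: each alert is roughly a coin flip
```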

The real value in AI, according to Anil, is its ability to predict good information in a timely fashion and in the right way, so if the reliability of the tool isn’t great, especially in a high-stress environment, “you’ll lose trust in it” from a clinician perspective. This is why the quality and speed of the data are essential for assessing risk, he explained.

Building on this point, Matea compared the implementation of AI in the reactive environment of a hospital to the proactive environment of primary care, “where we’ve got the element of time”. She explained, “we can try and look at things in a different way, but in a reactive system we’re so bound by guidelines. So, when X happens, you need to do Y. With these new tools, we would need to have a complete system rethink of what we do with patients.” Because of these guidelines, she noted, “In terms of faster implementation, there’s probably more potential in primary care because I can manage risk factors in the community to reduce risk etc. But in the hospital it will require an adoption piece and a change management piece, both at larger scale and I’m not sure if we’re fully ready to do that.”

Key takeaways around short-term changes and the impact of AI 

On one final note, our panellists discussed some of the key considerations around the short-term impact of AI. “I think this is still really new technology,” Anil began. “As AI matures, as this sector matures, the bad tools and the small startups will either be bought up or kicked out. The ones which have made it into clinic will have been proven and developed.” He emphasised the need to be sceptical and to “try and remove yourself from the hype a bit because there are so many products out there and they might not all have the right usability.” Anil recommended that hospitals first consider whether a new AI tool integrates well with their existing systems, and remember that “AI is a really challenging space to be in”.

Neill agreed with Anil about making sure integration works best for your system, while adding that it’s best to consider tools that are going to “actually benefit clinicians both from a time and experience perspective and also benefit patient engagement”.

Matea highlighted some key tips, starting with understanding the problem you’re trying to solve. “Look at your biggest pain points instead of just allowing a shiny tool to come in and say you’ve got a solution,” she suggested. “Then we have to look at co-designing the tool with the people who are going to use it and use technology as an enabler that’s going to work for clinicians. I think we’re so focused on cost-saving, but what about patient outcomes? What about clinician satisfaction?”

We’d like to thank our panellists for joining us and taking the time to share their insights.