A perspective article entitled ‘Guiding principles for the responsible development of artificial intelligence tools for healthcare’ has been published in Communications Medicine, a Nature Portfolio journal, with the intention of expanding the guiding principles for AI so that the technology can help in “fixing deeply engrained and too often overlooked challenges in healthcare.”
The researchers share eight non-exhaustive principles that they believe to be key when developing AI tools for healthcare. They propose that AI should be designed to alleviate disparities; report clinically meaningful outcomes; reduce over-diagnosis; consider biographical drivers of health; have high healthcare value; be easily tailored to local populations; promote learning; and facilitate shared decision-making.
Looking first at the principle that AI tools should aim to alleviate existing health disparities, the researchers note that AI tools usually require the collection of specialised data for inputs, cloud or local computing for hosting, high purchasing power to procure products from commercial companies, and technical expertise, all of which can be barriers to entry for hospital systems serving the most disadvantaged populations. Therefore, they say, AI tools are likely to realise benefits only in populations that already benefit the most from healthcare, and risk widening the health equity gap. To combat this, the researchers suggest two strategies. The first involves ensuring equal access and benefit by taking concrete steps to dismantle systemic biases; for example, developers may be required to prioritise the use of routinely collected or inexpensive data points as inputs, prioritise the use of single, explainable algorithms that can be run on a local computer, and advocate for the provision of discounted products, free cloud access and local training. The second involves reducing disparity by prioritising the development of AI tools for hospitals serving underrepresented groups, with the researchers acknowledging that “a combination of need and capacity to benefit is often needed to justify potential resource allocation.”
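To make the first strategy concrete, the sketch below shows what a single, explainable algorithm built from routinely collected inputs might look like. It is purely illustrative and not code from the paper: the feature names, data and outcome are hypothetical, and the point is only that such a model is small enough to train, run and inspect on an ordinary local computer.

```python
# Illustrative sketch only: a single, explainable model built from
# routinely collected inputs, small enough to train and run locally.
# Feature names, data and labels are hypothetical, not from the paper.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical routinely collected inputs: age, systolic BP, BMI, smoker flag
X = np.array([
    [54, 132, 27.1, 1],
    [61, 145, 31.4, 0],
    [47, 118, 24.3, 0],
    [72, 160, 29.8, 1],
    [39, 110, 22.5, 0],
    [66, 150, 33.0, 1],
])
y = np.array([1, 1, 0, 1, 0, 1])  # hypothetical outcome labels

model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X, y)

# The coefficients are directly inspectable, which supports explainability.
feature_names = ["age", "systolic_bp", "bmi", "smoker"]
coefs = model.named_steps["logisticregression"].coef_[0]
for name, coef in zip(feature_names, coefs):
    print(f"{name}: {coef:+.3f}")
```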
The next principle focuses on the fact that the outcomes of AI tools should be clinically meaningful. “If AI researchers do not define clinical benefit from the start, they risk creating a tool clinicians cannot evaluate or use,” the researchers state. They specify that clinicians must evaluate the accuracy, fairness, risks of over-diagnosis and over-treatment, healthcare value, and the explainability, interpretability and auditability of AI tools. They add that in some domains it may be difficult to define clinical benefit; “however, this does not preclude the need to identify an acceptable definition of benefit.”
The third principle is that AI tools should aim to reduce over-diagnosis and over-treatment. Here, the researchers highlight that the physical, emotional and financial costs of over-diagnosis and over-treatment must be considered, although “this is challenging because the definition of over-diagnosis is not always agreed upon”. The researchers provide an example regarding breast cancer; some AI tools designed to predict this disease do not differentiate between invasive cancer and non-invasive cancer. They suggest that developing AI tools capable of predicting subtype-specific breast cancer risk would be better, as such tools could be used to “appropriately tailor interventions according to predicted disease severity”.
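As a hypothetical illustration of what subtype-specific prediction means in practice (this is a sketch, not the authors' tool), a model can be asked to return a separate probability for each category of disease rather than a single yes/no cancer score, so that predicted severity can inform how aggressive an intervention should be. The features, labels and data below are invented.

```python
# Minimal sketch of subtype-specific risk prediction (hypothetical data):
# instead of one binary cancer score, the model returns a probability per
# class, so interventions could be tailored to predicted disease severity.
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[0.2, 1.1], [0.8, 2.4], [0.5, 1.9],
              [0.1, 0.7], [0.9, 3.1], [0.4, 1.5]])   # hypothetical features
y = np.array([0, 2, 1, 0, 2, 1])                     # 0 = no cancer, 1 = non-invasive, 2 = invasive

clf = LogisticRegression(max_iter=1000).fit(X, y)
print(clf.predict_proba([[0.6, 2.0]]))               # per-class risk for a new patient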
Moving on to the fourth principle, that AI tools should aspire to have high healthcare value and avoid diverting resources from higher-priority areas, the researchers state: “It is not enough to have a good working tool, it must make financial sense to the healthcare system and not increase costs for patients.” They note that a consultation at the outset with leadership stakeholders and health economists can establish whether and how AI tools should or could be a financial priority. In addition, “estimating the value of the tool benchmarked against the existing practice is imperative.”
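One common way such benchmarking is expressed in health economics is an incremental cost-effectiveness ratio, which compares the extra cost of the new pathway with the extra benefit it delivers over current practice. The figures below are entirely hypothetical and serve only to show the arithmetic; the paper does not prescribe a particular method or thresholds.

```python
# Hypothetical back-of-envelope value estimate, benchmarked against existing
# practice. All figures are invented for illustration; real analyses are
# carried out with health economists.
cost_ai, effect_ai = 1_200.0, 0.62    # cost per patient and QALYs with the AI-assisted pathway
cost_std, effect_std = 950.0, 0.58    # cost per patient and QALYs with current practice

# Incremental cost-effectiveness ratio: extra cost per extra QALY gained
icer = (cost_ai - cost_std) / (effect_ai - effect_std)
print(f"ICER: {icer:,.0f} per QALY gained")
```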
Next, the researchers say, AI tools should consider the biographical drivers of health. They note that AI tools will “miss the goal of delivering precision medicine interventions if the biographical drivers of health that contribute to the variation in outcomes seen between patients are not seriously considered” and add that machine learning “is likely to be a key tool that will help us uncover the complex relationships between biology and biography.” The researchers point out that AI developers can utilise low-resolution information such as zip/postcode and socioeconomic status scales until it is possible to collect higher-resolution biographical features, and emphasise that “deliberate thought and effort should be placed in determining how biographical determinants of health can be integrated into AI tools, with the goal of improving the resolution of these variables over time.”
The sixth principle is that AI tools should be designed to be easily tailored to the local population. Here, it is acknowledged that AI researchers often seek external datasets to test whether their tool generalises; however, these datasets are often sourced from similar settings, such as academic hospitals. They suggest using easily collected inputs and training features that are reliable across different populations, so that algorithms can be retrained for a specific setting. Another suggested strategy is to openly publish AI workflows, or to provide platforms upon which institutions can train and evaluate their own local models.
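A hedged sketch of the retraining idea follows: if the workflow relies only on inputs that any site can collect, the same pipeline can be refit on a hospital's own data rather than importing weights learned elsewhere. The function, data and features here are hypothetical and not taken from the paper.

```python
# Illustrative sketch: the same simple training workflow applied to a
# hypothetical local dataset, so an institution can refit the model on its
# own population instead of relying on weights learned at another site.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

def fit_local_model(X_local, y_local, seed=0):
    """Refit and evaluate the shared workflow on one institution's data."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X_local, y_local, test_size=0.3, random_state=seed, stratify=y_local)
    model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    return model, auc

# Hypothetical local data: 200 patients, 4 easy-to-collect features
rng = np.random.default_rng(0)
X_local = rng.normal(size=(200, 4))
y_local = (X_local[:, 0] + 0.5 * X_local[:, 1] + rng.normal(size=200) > 0).astype(int)

local_model, local_auc = fit_local_model(X_local, y_local)
print(f"Locally retrained model AUROC: {local_auc:.2f}")
```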
The penultimate principle states that AI tools should promote a learning healthcare system. The researchers emphasise that all interventions, AI or not, should be designed with the intention of regular evaluation, learning and improvement. “Further, as science evolves, there should be mechanisms to integrate new knowledge that could benefit the patient. Evaluation metrics, timeframes, and performance standards should be determined in the AI research phase in consultation with clinicians.”
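One way to read “regular evaluation” in practice, sketched here under assumed metrics and thresholds rather than anything specified by the authors, is to score each new batch of outcomes against performance standards agreed during the research phase and flag when the tool falls below them.

```python
# Sketch of periodic re-evaluation against pre-agreed standards. The metrics
# and thresholds are assumptions for illustration; the paper does not
# prescribe specific values.
from sklearn.metrics import roc_auc_score, brier_score_loss

def evaluate_batch(y_true, y_prob, auroc_floor=0.75, brier_ceiling=0.20):
    """Check a new batch of predictions against agreed performance standards."""
    auroc = roc_auc_score(y_true, y_prob)
    brier = brier_score_loss(y_true, y_prob)
    passed = auroc >= auroc_floor and brier <= brier_ceiling
    return {"auroc": round(auroc, 3), "brier": round(brier, 3), "meets_standard": passed}

# Hypothetical quarterly batch of observed outcomes and model risk scores
y_true = [0, 1, 0, 1, 1, 0, 0, 1, 0, 1]
y_prob = [0.1, 0.8, 0.3, 0.7, 0.9, 0.2, 0.4, 0.6, 0.2, 0.75]
print(evaluate_batch(y_true, y_prob))
```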
The final principle is that AI tools should facilitate shared decision-making. The researchers emphasise the need for AI tools to be explainable and interpretable, adding: “Opaque AI tools cannot be adequately evaluated and audited, undermine trust, and cannot facilitate shared, informed decision-making between patient and practitioner.” If an AI tool is designed to assist with a decision, they say, then the patient and practitioner also need to know how and why the recommendation was made, along with the advantages and limitations of the AI tool. The researchers note that there are different explainability tools that AI researchers can utilise to ensure that the tool they are developing is patient-centric.
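The paper does not name particular explainability tools; as one hypothetical example, permutation importance (a model-agnostic technique available in scikit-learn) can show which inputs a recommendation relies on most, the kind of information that could feed the patient–practitioner conversation described above. The model, features and data below are invented for illustration.

```python
# Sketch of one model-agnostic explainability technique (permutation
# importance). The paper does not endorse a particular tool; the data and
# feature names here are hypothetical.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(1)
feature_names = ["age", "blood_pressure", "cholesterol", "family_history"]
X = rng.normal(size=(300, 4))
y = (X[:, 0] + 0.8 * X[:, 3] + rng.normal(scale=0.5, size=300) > 0).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=1).fit(X, y)

# Shuffle each feature and measure how much performance drops: larger drops
# indicate inputs the recommendation depends on more heavily.
result = permutation_importance(model, X, y, n_repeats=10, random_state=1)
for name, imp in sorted(zip(feature_names, result.importances_mean),
                        key=lambda t: -t[1]):
    print(f"{name}: {imp:.3f}")
```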
Citation: Badal, K., Lee, C.M. & Esserman, L.J. Guiding principles for the responsible development of artificial intelligence tools for healthcare. Commun Med 3, 47 (2023). https://doi.org/10.1038/s43856-023-00279-9