
‘Better, broader, safer’ report published on use of health data for research

‘Better, broader, safer: using health data for research and analysis’, a report on how the safety and security of health data use can be improved, has been published.

Professor Ben Goldacre was commissioned in February 2021 by the government to undertake the review. It is aimed at policy makers in the NHS and government, research funders and those who use data for medical research, public health management and service planning.

The report starts with Sajid Javid, Secretary of State for Health and Social Care, noting that in some ways “healthcare is more suited to data and the innovation that follows than almost any other sector — with the depth and coverage of NHS data providing unique opportunities.” He states that the report shows a “need to be as thoughtful as we are innovative, guided by safe ethical frameworks for providing access to data.”

In his foreword, Professor Goldacre adds: “almost every interaction with the health service leaves a digital trace: the diagnoses, treatments, tests and outcomes for almost every citizen in the country. This raw information has phenomenal potential… But raw data is not powerful on its own.” He goes on to lay out the purpose of the review: to set out a practical vision for how that data can be managed effectively, and acted upon in a way that benefits patients and healthcare services.

The review is set out at three levels of detail: the Executive Summary, containing a short overview of high-level opportunities; the Brief Summary, containing a longer overview of opportunities, challenges and recommendations; and the Full Text, containing a detailed explanation of the work, complete with extensive findings and practical descriptions to help ensure informed discussion.

For the purpose of this article, we will highlight the main takeaways delivered in the Executive Summary.

The Executive Summary splits Goldacre’s recommendations into categories:

Platforms and security

Goldacre advises that concrete action must be taken to build trust on the issues of privacy and transparency, and that NHS data policies must acknowledge the shortcomings of current, outdated techniques for managing patient privacy. The paper states that a small number of secure analytics platforms (‘Trusted Research Environments’, or TREs) should be built and should become the norm for all analysis of NHS patient records. The enhanced privacy protections of the TREs can be used to create faster access rules and processes, with TREs used to drive modern, open and collaborative approaches to data science. The report highlights: “the current bulk flows of pseudonymised NHS GP data needs to be mapped and shut down, as soon as the TREs can meet user needs.”

Open working methods for NHS data

To produce high-quality, re-usable and shareable work, Goldacre recommends that a set of best practices and training (‘Reproducible Analytical Pipelines’, or RAP) be promoted and resourced. Code for data curation and analysis paid for by the state through academic funders and NHS procurement should be shared openly with all relevant users. He adds that software development needs to be recognised as a central feature of good work with data; competitive and high-status funding is required for software projects and developers working on health data. The paper notes that there is a gap between health research and software development that needs to be bridged; training academic researchers and NHS analysts in contemporary data science techniques can achieve this, along with onboarding training for those entering health services research.

Data curation and knowledge management

There is a need to recognise NHS data curation as a ‘complex and high-status technical challenge’ of its own. Goldacre writes that this challenge can be met with systematic curation work, devoted teams, and shared working practices, code, tools and documentation. The aforementioned TREs are an opportunity to impose standards on the storage and curation of commonly-used datasets, he states. Additionally, an open online library for NHS data curation, validity tests and technical documentation should be created for staff with the appropriate skillsets, so that new analysts, academics and innovators can find platforms accessible and well-curated.

NHS data analysts

The paper highlights that it would be beneficial to create an NHS Analyst Service with a head of profession, clear job descriptions, progression opportunities and realistic salaries where expensive specific skills are required. Modern, open working methods for NHS data should be embraced by committing to RAP as the core working practice, and this should be a main focus for training. Goldacre recommends that an Open College for NHS Analysts be created, which should devise and coordinate delivery of initial and continuing training, shared openly online with all and covering a range of skills. The creation and maintenance of a national open library of analyst code and methods would help in spreading knowledge and best practice. Goldacre adds that it would be useful to seek expert help to ensure all code and documentation is accessible to all.


Governance

A map of all approval processes should be created which all relevant organisations agree is accurate, and a single common application for all access permissions should be used to de-duplicate work. Goldacre suggests that a “frank public conversation” about the commercial use of NHS data for innovation purposes would be helpful, and that clear rules need to be developed around the use of NHS patient records. He adds that the “problem of trusts and GPs acting as separate data controllers needs to be altered” so that one national organisation or an “approved pool” takes on this responsibility.

Approaches and strategy

Goldacre writes that very senior strategic leadership roles need to be created, using people with technical skills to manage complex technical problems. It should be accepted that new ways of working are overdue, but old methods cannot be replaced overnight: “we must build skills, and prove the value of modern approaches to data in parallel to maintaining old services and teams.” Identifying a range of “data pioneers” from each sector would be of use so that they can adopt modern working practices and develop shared re-usable methods, code, documentation and tools. Finally, Goldacre recommends building TRE capacity by taking a hands-on approach to the work common to all TREs: avoid the commissioning of multiple closed data projects from which little can be learned, and instead focus on experimentation through which all can learn.

Goldacre brings the summary to a close by describing how previously, shortcomings in the system have been driven by “chasing small, isolated, short-term projects; at the expense of building a coherent system that can deliver faster, better, safer outputs for all users of data.” He states that investing in platforms and curation will allow the NHS to “rapidly capitalise” on skills and data, saving time, increasing productivity, and ultimately saving lives.

“73 years of complete NHS patient records contain all the noise from millions of lifetimes,” Goldacre concludes. “Perfect, subtle signals can be coaxed from this data, and those signals go far beyond mere academic curiosity. They represent deeply buried treasure, that can help prevent suffering and death, around the planet, on a biblical scale. It is our collective duty to make this work.”
