ODUG benefits case for open data release of an authoritative GP dataset
Post: 22 July 2014
This week the Open Data User Group has published a benefits case arguing for open data release of an “authoritative” GP dataset.
ODUG calls on the Department of Health to organise an open dataset of all GP and dental practices, to include practice details, opening times, location, contact details, patient acceptance criteria, and a list of individual practitioners.
Geographic coverage is not mentioned, but as the call is to DoH I’m assuming ODUG is focused only (or at least mainly) on the data for England.
This is the first new benefits case from ODUG since last summer (list of previous benefits cases), so it’s worth taking a look at both the case itself and the related blog post by Giuseppe Sollazzo. My comments are below.
The current best sources for core bulk data on GP practices in England (codes, addresses, contacts, etc.) are:
Those datasets are all reusable under the Open Government Licence, i.e. they are open data.
Several side points before I get into the substance of the ODUG case:
1. NHS Choices staff are employed by HSCIC, so the first two datasets are effectively the responsibility of the same public authority. However, there are substantial differences between the datasets, because each reflects the underlying purpose for which it is maintained.
2. ODUG criticises the NHS Choices dataset as follows:
“the branding of the NHS Choices dataset as a ‘Freedom Of Information’ dataset is troubling from an Open Data perspective, mainly for [its] ‘on demand’ nature: a FOI data release, being a reactive response to a request, does not establish an ongoing process; while data release under an Open licence often comes proactively from the publishing entity, which in doing so creates a sustainable data update procedure”.
I think this is rather over the top. NHS Choices hasn’t “branded” the data as a FOI dataset. It has merely made it available, along with a number of other useful data files, in the FOI section of its site. It would be nice if the NHS Choices site also had a dedicated open data landing page. However it’s perfectly sensible to draw users’ attention to existing datasets that they may want to know about before submitting a FOI request. NHS Choices says the data files are updated daily, so they are clearly not being published as a “reactive response” to FOI requests.
3. ODUG maintains that the GP practices data on the HSCIC site is not open data, and points to a page about “responsibilities in using the ODS data”. However, HSCIC has recorded that dataset (EGPCUR) on Data.gov.uk as reusable under the OGL. (The ODS “responsibilities” page seems to be written for NHS users. A literal reading only permits use of the data in connection with NHS-related activities, which is obviously not the actual licensing position.)
It’s also worth noting that elsewhere on its site HSCIC publishes an open dataset of practice codes, names and addresses as part of its monthly release of GP prescribing data.
Why are the HSCIC/NHS Choices datasets not “authoritative”?
There’s nothing wrong with arguing that existing datasets could be made more useful by improving the quality, or updating them more frequently, or appending data from other sources.
But we can have those arguments about most of the nation’s information infrastructure. A dataset doesn’t need to be ideal to be authoritative in practice.
The HSCIC and NHS Choices datasets are produced by the relevant official body, they are in wide use, and there are currently no better equivalents. The datasets are therefore, on the face of it, authoritative.
ODUG proposes that DoH establishes “an ongoing process to build, update and maintain on data.gov.uk an authoritative dataset of medical practices and operating practitioners, drawing on the datasets made available by HSCIC and NHS Choices”.
I’m not sure how ODUG expects DoH to build an authoritative dataset by drawing on datasets it has dismissed as non-authoritative. ODUG’s call is to DoH, but in practice DoH would surely delegate any such new process to HSCIC. So what is ODUG proposing HSCIC should do differently?
Maintaining the new dataset on Data.gov.uk is also unlikely to add credibility, given the current state of the Data.gov.uk catalogue and the site's other functionality. HSCIC already has its own platforms and they seem serviceable for the publication of data. What in the ODUG proposal requires the involvement of Data.gov.uk?
Release of open data or creation of a new data product?
The typical model of open data activism is to argue for the release of existing data assets (usually those held by public authorities) for reuse under an open licence. ODUG was originally set up to frame those arguments based on views from UK data users (within terms of reference from the Cabinet Office).
I’ve never been entirely on board with the idea of submitting “benefits cases” for release of open data, because it seems to conflict with the principle of “open by default”. In my view the onus should be reversed; public authorities should be required to demonstrate why we should not be able to reuse data that they hold. Benefits cases should only be necessary when there are significant costs involved in extracting and publishing the data.
However that model of open data release assumes we are talking about data that the public authority already holds and maintains in order to deliver its public task.
In this instance ODUG seems to be arguing for creation of a new data product, combining the existing HSCIC/NHS Choices datasets with data from other sources such as GMC’s Medical Register and patient acceptance criteria for each GP practice.
That last source in particular would probably involve quite a bit of ongoing administration and processing, as patient acceptance criteria are not held centrally or in a standard format.
Arguing for release of existing data is one thing. Arguing for the creation of new data products and new processes is something more.
I have no doubt there is room for improvement in the existing open data that HSCIC publishes on GP and dental practices. However public datasets are mainly produced to support a public task. I will be surprised if DoH takes up these ODUG recommendations without a more detailed demonstration of why the existing data and processes are inadequate to meet the requirements of the agencies and public bodies it supports.
For purposes of reuse beyond the needs of the health system itself, I think we are already quite well served by the existing open data on GP and dental practices. The ODUG benefits case is somewhat perfunctory; in the absence of more detailed analysis I am unconvinced by its attempts to talk down the value of the existing open datasets.
In my view the most interesting element of the ODUG benefits case is the idea that the Government should require the General Medical Council to release data from the Medical Register on individual practitioners. This register is an existing, useful source of public data that is not currently available for reuse under an open licence. I think a focus on that element, properly explicated, would make a more practical and worthwhile proposal.