a blog by Owen Boswarva

Post: 14 December 2012

The UK’s Data Strategy Board has released minutes of a meeting held on 28 November. These minutes provide a first look at the list of datasets identified by the Open Data User Group as good potential candidates for open data release.

The Open Data User Group (ODUG) is a sub-committee of the DSB made up of volunteers from a range of sectors and backgrounds, all with an interest in UK open data. Over the past several months the ODUG has been collecting data requests from the wider community, in order to identify public data assets likely to benefit the UK if released for wider re-use under an open data licence.

According to the ODUG’s terms of reference it is focusing mainly (but not exclusively) on datasets held by the Public Data Group, a collection of publicly owned trading funds that includes Ordnance Survey, the Land Registry and the Met Office. The DSB has set aside a fund of up to £3.5m to “buy back” data from these and other public bodies.

The ODUG has submitted its initial recommendations in an (unpublished) paper to the Data Strategy Board. The DSB minutes are of course only a second-hand account of those recommendations. However it appears that the ODUG members have homed in on the following datasets as priorities:

A National Address Dataset, based on open data release of both the Royal Mail’s Postcode Address File (PAF) and Ordnance Survey’s AddressBase products. The ODUG had previously announced this as a priority. This recommendation has widespread popular support but there is strong institutional resistance. The DSB has given the proposal its backing in principle, although it is neutral on the ODUG’s additional recommendations about future ownership of the data.

Land Registry Price Paid information. Updates to this dataset from February 2012 onward are already open data, but the ODUG is developing a rationale for release of the historic data as well. The historic data goes back to 1995, so it makes up the bulk of the dataset. This is one of the datasets for which I have submitted a data request to the ODUG.

Met Office hourly weather observations. Observation data for the past 24 hours is already available as open data via the Met Office’s DataPoint service. However the historic data has wide potential for re-use, particularly in applications for the utility and insurance sectors. There’s also a specific request for this data from Mike Godber of C3 Resources.

UK VAT Register information held by HMRC. I suspect this proposal is for release of a subset of non-confidential information from the Register, rather than all data. As noted in this data request from Paul Malyon (who is himself a member of the ODUG), the Register contains a wealth of useful but non-confidential information on unincorporated companies, which do not appear in Companies House records.

Public Rights of Way (PRoW) data held by local authorities and/or Ordnance Survey. The ODUG has asked the DSB to “seek transparency” on costs of either Ordnance Survey releasing a national PRoW dataset or local authorities releasing it themselves. There have been numerous requests for this data, and I’ve written about its importance in an earlier post. All or most local authorities have used OS sources to produce their PRoW maps, so there is an issue with OS derived data. OS has previously undertaken to clear away barriers, but to date this has been slow going and only a handful of local authorities have released their data.

River Network. The ODUG has recommended asking Ordnance Survey and the Environment Agency to “create a generalised River Network map as part of medium and small scale location infrastructure” under an open licence. This is another idea that I’ve supported myself with a data request. Mapping of the river network is basic information infrastructure; it seems odd that we have good free public data on roads and railways but not on rivers. OS and EA have already invested in a Detailed River Network product for use by professional hydrologists, and I think they should be able to take some of the basic data (centrelines, nodes, name attributes) from that to produce a generalised map for everyday public use.

The DSB minutes indicate the above proposals are at various stages of development in terms of building rationales for open data release. We will have to wait and see how supportive the DSB members are of the ODUG recommendations. Do they have the appetite to challenge entrenched interests within the PDG and the Shareholder Executive, particularly in respect of address data?

The above list is of course only a first group of datasets. If the ODUG process does produce results, there will no doubt be many other candidates in the request pipeline.

I remain somewhat skeptical of the ODUG process as a well-designed approach to unlocking UK public sector data, mainly because it hinges on a narrow “business case” model that doesn’t adapt well to open data economics. I suspect the ODUG is also being hamstrung by lack of access to detailed information about the finances and commercial activities of the PDG trading funds.

Really the shoe should be on the other foot. Government should be pressing its departments and agencies to justify why they haven’t released their key data assets for re-use as open data, and to demonstrate why they think maintaining an artificial scarcity through charging is really a more beneficial approach.

However I’ve been impressed by the willingness of individual ODUG members to identify and understand the issues behind data requests. The initial short list of datasets that the ODUG has produced (at least as gleaned from the DSB minutes) is actually pretty good. With the help of the open data community the ODUG has come up with a list of datasets that have real potential to provide economic and social benefits if released under an open licence.