New open data releases announced by the Public Data Group
Posted: 17 July 2014
The Public Data Group is a collection of four data-rich government organisations (Companies House, Land Registry, Met Office and Ordnance Survey) that report to the UK’s Department for Business, Innovation & Skills (BIS).
The PDG organisations are trading funds, encouraged by Government to generate commercial revenue from the data assets that they control. However, all four organisations make available at least some data under an open licence.
On Tuesday the Public Data Group issued a statement containing some commitments to future releases of open data. The statement is a bit short on detail, so this post is an attempt to add some context to the planned releases.
The headline announcement in the PDG statement is the decision that Companies House will “make all of its digital data available free of charge”, from the second quarter of 2015 (April - June).
This follows previous initiatives to open up Companies House data:
- June 2012: Free Company Data Product released (bulk data)
- October 2012: Company Appointment Reports made free of charge (non-bulk)
- November 2013: Free Accounts Data Product released (bulk data)
The current Companies House Price List is online. My interpretation of Tuesday’s announcement is that Companies House will, at minimum, remove the £1 charge for access to individual Company Records via WebCHeck. “Electronic images” will also be free, which I think means PDF copies of the records.
The important unanswered question is whether this release will also include any new bulk downloads of data. According to the Open Definition, a dataset is only properly open data if it is available in bulk.
Bulk release is necessary for any kind of serious analysis of companies data. If the Government is serious about leveraging the free availability of Companies House data to “boost the UK economy”, then bulk release is essential.
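To illustrate why bulk access matters, here is a minimal Python sketch of a question that takes one pass over a bulk file but would need a separate lookup per company through a record-by-record service. The sample rows and column names (`company_number`, `status`, `incorporation_year`) are made up for illustration and are not the actual schema of any Companies House product.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for a bulk company data file.
# The rows and column names are illustrative assumptions only.
SAMPLE_BULK_CSV = """company_number,company_name,status,incorporation_year
00000001,EXAMPLE WIDGETS LTD,Active,2009
00000002,SAMPLE BAKERY LTD,Dissolved,2011
00000003,DEMO CONSULTING LTD,Active,2011
00000004,PLACEHOLDER FARMS LTD,Active,2012
"""


def count_active_by_year(csv_file):
    """Count active companies per incorporation year in a single pass."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        if row["status"] == "Active":
            counts[int(row["incorporation_year"])] += 1
    return dict(counts)


if __name__ == "__main__":
    print(count_active_by_year(io.StringIO(SAMPLE_BULK_CSV)))
```

With a real bulk download the same function could be pointed at an open file handle instead of the in-memory sample, once the column names are adjusted to match the published layout.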
According to the PDG statement, Land Registry “will release their Price Paid Data for commercially owned properties for free by March 2015.”
There are no further details provided. It will be interesting to see what data is contained in that release. Land Registry currently makes available its Price Paid Data for residential transactions back to 1995, as open data. However it does not publish any statistics on commercial transactions. My past understanding was that Land Registry did not maintain a separate dataset for commercial sales (of either land or properties).
There are several open questions: How complete or extensive is Land Registry’s data on prices paid for commercially owned properties? Does Land Registry intend to release data on historical as well as new transactions (bearing in mind that the initial release of residential Price Paid Data was only new transactions)? And is there likely to be resistance from commercial property owners to the open publication of sale prices?
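The question of historical coverage matters because most useful analysis of price data compares one period with another. As a rough sketch of what that looks like with transaction-level data, the Python below computes a median price per year. The rows are invented, and the column layout (transaction id, price, date, postcode, property type, with no header row) is my assumption for illustration; the published file layout should be checked before reusing this.

```python
import csv
import io
from collections import defaultdict
from statistics import median

# Invented sample rows. Assumed layout (no header row):
# transaction id, price, date, postcode, property type.
SAMPLE_PRICE_PAID_CSV = """\
tx-001,250000,2013-03-14 00:00,AB1 2CD,D
tx-002,180000,2013-07-02 00:00,EF3 4GH,T
tx-003,320000,2014-01-20 00:00,IJ5 6KL,S
tx-004,150000,2014-02-11 00:00,MN7 8OP,F
tx-005,410000,2014-05-30 00:00,QR9 0ST,D
"""


def median_price_by_year(csv_file):
    """Group transaction prices by calendar year and return each year's median."""
    prices = defaultdict(list)
    for row in csv.reader(csv_file):
        price = int(row[1])
        year = int(row[2][:4])  # the date is assumed to start with YYYY
        prices[year].append(price)
    return {year: median(values) for year, values in sorted(prices.items())}


if __name__ == "__main__":
    for year, value in median_price_by_year(io.StringIO(SAMPLE_PRICE_PAID_CSV)).items():
        print(year, value)
```

If a commercial release contains only new transactions, a comparison like this would have nothing to compare against for several years.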
The PDG statement also says that in 2014/15 Land Registry intends to “make the whole Index Map polygon layer covering England and Wales available at a cost recovery price.”
This is not an open data release, of course. The news may be welcomed by Land Registry licensees, but it remains to be seen how much of a saving they will realise. “Cost recovery price” should not be confused with “marginal price”. Index Map polygons are based on Ordnance Survey spatial data, so (as we have seen with the INSPIRE Index Polygons) it will be OS pricing that determines the actual cost of reuse.
There is not much open data on the horizon from Met Office. However the PDG statement says Met Office is creating something called the “National Archive for the Nations Memory of the Weather”, and that “a selection of this will be available as Open Data.”
This could be significant or not, depending on what Met Office decides to release. Data required to maintain the nation’s “memory of the weather” could range from detailed weather observations, to historical documents, to anecdotal information about notable weather events.
Ordnance Survey - enhancements to OS OpenData
Ordnance Survey’s open data programme is more extensive than those of the other trading funds, as it was launched under the previous Government. Tuesday’s PDG statement notes several future developments.
The existing OS Street View product (raster base-mapping) will be enhanced with new features such as car parks, major paths, major cycle routes and hill.
OS will release an “enhanced Gazetteer”. This is presumably the “Gazetteer of Great Britain” that OS demonstrated at GeoBusiness 2014 in May. I have seen some sample data for this product (via the OS Insight developer programme); it looks like a useful addition to the OS OpenData suite.
Ordnance Survey - Public Rights of Way
OS will be “working with” Defra to “provide consultancy, technology and to enhance public access (through a portal) to Rights of Way data”. I guess this is good news, though lack of a portal is not the main blockage to release of Public Rights of Way data.
I’ve written about PRoW data before. The main problem is that there has been no organised effort from central government to encourage local councils to release the data. OS is part of that problem (and therefore has to be part of the solution) because most councils use OS data to maintain their Rights of Way maps and need OS’s permission to release those maps as open data. However DCLG and the Local Government Association should be pushing this harder as well. The ideal would be publication of an open national PRoW dataset, collated from the many local sources.
There is already a pretty good portal for the 80 or so local PRoW datasets available as open data: Barry Cornelius’s Rowmaps site.
Ordnance Survey - Derived River Network
Of the various open data commitments mentioned in the PDG statement, this is the one I am personally most excited about: Ordnance Survey plans to release a new “Derived River Network” open data product.
My understanding is that this dataset will be derived from the new Water Layer in MasterMap. (The other relevant dataset in this space, the Detailed River Network that OS developed with the Environment Agency, is being deprecated.)
Since then more spatial data about our rivers has become available, most recently the Cycle 2 draft of the EA’s Water Framework Directive (WFD) River Waterbodies dataset (see last month’s post). However, I am gratified to see that there is now sufficient support for open release of a good general-purpose vector map of the river network.
The Derived River Network dataset will complement the Environment Agency’s recent open data release of live flood warning and river level feeds, as well as its plans to release the NaFRA national flood risk dataset.
Photo credit: Open_Data_stickers.jpg by Jonathan Gray (CC0 1.0). It’s an iconic image and I was too lazy to find something more imaginative to illustrate this post.