mapgubbins

a blog by Owen Boswarva

May 23

It may be some time before we have the Government’s response to Stephan Shakespeare’s review of public sector information. But let us suppose the Cabinet Office seizes upon Shakespeare’s advice with “puppy-like enthusiasm”. What is the main barrier to implementation?

» Read the full article on the Guardian website


May 16

Some of us with an interest in open data were quite excited when, in late November of last year, the Telegraph not only published the first detailed web-based map of the Green Belt in England, but also made the full underlying dataset (a shapefile obtained from the Department for Communities and Local Government) available for download.

According to the Telegraph article:

This map is the first time it has been possible for members of the public to easily see which areas are green belt land, and which are not.

The Department for Communities and Local Government released the data for the 2011 green belt to the Telegraph, and it is being made available here to view, explore, share and download.

Previously the data has only been available at a cost of tens of thousands of pounds from a third party, despite the location of green belt land being identified by councils using taxpayer money.

Expert users may also download a copy of the green belt map (29MB ZIP file) for use in geographic information systems (GIS).

Green Belt boundaries are a classic example of national core reference data that should be available for public re-use under an open data licence. I first wrote about this dataset back in January 2012, and Alasdair Rae made a strong case for open data release in an article on the Guardian website in August 2012.

The apparent release of the Green Belt dataset by DCLG therefore looked like progress (even if using the Telegraph to distribute it seemed a bit unusual). No less than Ed Parsons, Google’s chief geospatial technologist, applauded the data release in a blog post. Was DCLG “the new opengov data poster child”?

Alas, no. It was all a cock-up …

I have now obtained, under the Freedom of Information Act, the relevant e-mail correspondence between DCLG, the Telegraph and Ordnance Survey. The Green Belt shapefile has not been released as open data, or even to share, by DCLG. Indeed DCLG claim never to have given the Telegraph permission to make the data available for distribution at all.

You can download my FOI request and DCLG’s full response, including the e-mails. However the key points are as follows:

  • DCLG provided the Green Belt shapefile to the Telegraph in January 2012 in response to a request. The Telegraph indicated they wanted to convert the data “for display on interactive visualisations and/or as a research base for stories about the Green Belt”.
  • DCLG confirmed that the Telegraph had an Ordnance Survey media licence. This is relevant because the shapefile contains Ordnance Survey derived data. However according to DCLG the Telegraph contact “did not say that he wanted to make files available for download”.
  • Distribution of the data to third parties is well beyond the scope of the OS media licence (and in any case DCLG believes that licence expired in July 2012).
  • In December 2012, shortly after the Telegraph article was published and the shapefile was made available for download from the Telegraph website, Ordnance Survey told DCLG: “my initial thoughts are that the download is without any licence at all and published on a Google API which if derived from OS data would appear to be beyond the terms of the PSMA”. (PSMA is the agreement that allows local authorities to use Ordnance Survey data.)
  • DCLG and Ordnance Survey discussed how to approach the Telegraph over this licensing issue, but apparently did not do so.
  • DCLG and Ordnance Survey picked up the correspondence again in February as a result of my query. Ordnance Survey said they were taking a “lenient approach” and would be “happy with the data being made available as equivalent to open data” under the PSMA exemption process.
  • However DCLG have not submitted any exemption request to Ordnance Survey. There is some suggestion that this is because the shapefile contains local authority IP. However that may just be an excuse. As I’ve pointed out to DCLG, local authorities usually cite Ordnance Survey as the main barrier to open re-use of their boundary data.

DCLG’s conclusion is that “the shapefile published by the Telegraph is in effect unlicensed as it was published without our permission. Notwithstanding that, any Green Belt boundaries provided by local authorities (as in the published shapefile) may only be used for non-commercial uses.”

So what does all this mean?

The first and most obvious point is that DCLG have confirmed the Green Belt shapefile is not currently open data. That’s unfortunate, but Ordnance Survey’s position is encouraging and we can continue to press DCLG to resolve any remaining issues with local authorities.

The complication is that the Green Belt data is now already in circulation, unaccompanied by any clear licensing information. The shapefile remains available for download on the Telegraph website at the time of writing, and is also now available from the London Datastore and ShareGeo Open repositories. It has also been added as a layer to RPTI’s Map for England.

Without any licensing information (other than some obscure metadata), and given that DCLG did not authorise distribution of the shapefile, there is an argument that technically nobody downloading the data has any proper basis on which to re-use it — even for non-commercial purposes. DCLG may have told me that non-commercial use is okay, but they have no mechanism to communicate that more widely to downloaders.

On the other hand, downloaders may already be re-using the Green Belt shapefile in good faith, including for commercial purposes. Since DCLG have made little or no attempt to discourage distribution of the shapefile, they may find it difficult to challenge that re-use.

_____________________________________

Update (May 20):

DCLG have written to clarify that they are “looking to publish Green Belt boundary information ourselves in future and so will consult with local authorities and Ordnance Survey to agree the relevant permissions”.


May 7

I’ve started a spreadsheet in Google Docs as an attempt to track on an ongoing basis all of the officially recognised and active boards, panels and advisory groups that meet to discuss and implement UK transparency and open data policy.

The spreadsheet includes a timeline from the 2010 General Election to present, with links to any published minutes.

A few of the groups (e.g. the Local Public Data Panel and the Location Council) were set up under the previous Government. However the majority follow from initiatives announced in PM Cameron’s letters of May 2010 and July 2011 and in last year’s Open Data White Paper.

In principle, UK policy on transparency and open data is led by the Cabinet Office and propagated through government via Sector Boards. Sector Boards take as a reference point the Public Data Principles drawn up by the Public Sector Transparency Board.

Data.gov.uk has a guide that explains the relationships between most of the various groups.

According to the latest tally (in a self-assessment report released by the Cabinet Office last month) the UK so far has “11 active Sector Boards in Transport, Social Mobility, Health and Social Care, Tax, Welfare, Research, Local Public Data Panel, Location Council and Criminal Justice.” 

Additionally FCO, BIS and DfE have “established internal panels that replicate the Sector Board terms of reference”. Defra has also recently established an internal Sector Board and appointed an external data user.

It’s a bit difficult to identify all of those 11 “active Sector Boards” but I think this is most of the list:

The other two may (or may not) be the International Development Sector Transparency Board announced by DfID in December and DCLG’s Public Data and Transparency Programme Board.

Many of the Sector Boards have not released any minutes. The situation at DCLG is particularly murky. DCLG’s website suggests the Public Data and Transparency Programme Board might have been disbanded at the end of 2012. The Local Public Data Panel has not published any minutes since May 2012 and its terms of reference indicate it was to be wound down; however based on subsequent passing references I suspect it has only adopted a lower profile.

The lack of output from the Department for Education is also noticeable, given that it is one of the “key delivery departments” flagged for special attention in the PM’s letter of July 2011.

In general the availability of minutes from Sector Boards seems to mirror actual progress in releasing open data, with Transport and Health running ahead of other departments.


May 3

Several months ago the Cabinet Office ran a public consultation on a draft Code of Practice (Datasets) to help public authorities implement some upcoming changes to the Freedom of Information Act.

Those changes will require public authorities to provide datasets requested under FOI in a re-usable format and with a licence for re-use, where reasonably practicable. The draft Code itself and the explanatory material are on the Data.gov.uk website.

In early March the Cabinet Office published a summary of responses to the consultation, but not the responses themselves. I submitted a Freedom of Information request to the Cabinet Office for copies of all responses to the consultation, and received them yesterday.

This zip file contains the contents of e-mail responses from the following organisations:

  • Department for Communities and Local Government (DCLG)
  • Department for Environment, Food and Rural Affairs (Defra)
  • East Cambridgeshire District Council
  • Essex County Council
  • Food and Environment Research Agency (Fera)
  • Ipsos Mori
  • Land Registry
  • London Borough of Bexley
  • Merton Council
  • Ministry of Defence (MOD)
  • Archives and Records Association
  • Chartered Institute of Library & Information Professionals (CILIP)
  • Health and Social Care Information Centre (HSCIC)
  • Transport for London (TfL)
  • University of Oxford
  • Universities UK (UUK)
  • Wakefield Council
  • West Yorkshire Police

Also enclosed is a copy of 50 comments submitted by 17 users via Data.gov.uk. (User names are omitted.)

The following four responses were already on the web:

I have previously blogged my own comments on the Code.

The Cabinet Office’s summary has picked out the main themes, but the full responses provide insight into the concerns of individual organisations. The local authority responses in particular cast doubt on whether FOI practitioners will be able to confidently handle the additional issues around formatting and re-use of datasets.

The FOI dataset provisions were originally expected to take effect in April, but have been delayed. We have not yet seen publication of either a revised Code of Practice or the Fee Regulations that will accompany the new FOI provisions.

The Fee Regulations will be crucial in determining whether these provisions are used to promote an open data or a chargeable approach to re-use of datasets released by public authorities under FOI.


May 1

Yesterday Land Registry announced that additional Price Paid Data will be released later this year under the Open Government Licence. Price Paid Data is a dataset containing records of the sale price for every residential property sold at full market value in England and Wales from 1995 onwards.

Price Paid Data for registrations between 1 January 2009 and 31 January 2012 will be released on 28 June 2013, and the remaining data back to 1 January 1995 will be released by November 2013.

Monthly updates from 1 February 2012 onwards are already available as open data. However yesterday’s announcement means that from November the full dataset, some 17 million records, will be available for open re-use free of charge.

I’m not going to clap too hard about this announcement, in part because of the amount of time and effort it has taken to prise the full dataset out of Land Registry’s hands, and in part on principle; it is after all publicly funded data that Land Registry maintain under statutory authority (so it belongs to us dammit).

However Land Registry’s announcement is fundamentally good news. I’ve been critical of Land Registry’s intransigence on open data, and still have my doubts (why the delay until November?), but they are certainly making progress. In addition to yesterday’s announcement, the dataset inventory that Land Registry published last week is an exemplar of good practice that I would be happy to see all public authorities embrace.

Have a biscuit, Land Registry.

Open data release of Price Paid Data was originally announced in the 2011 Autumn Statement, then walked back. I’ve written about this in previous posts. Land Registry did begin to release monthly updates to the dataset as open data. However they managed to protect their commercial interests by treating most of the dataset as a separate “historical” product, along with some waffling about evaluating the potential impact of a full open data release.

In October of last year I nominated Price Paid Data as a candidate for open data release via the new Open Data User Group data request process. The ODUG picked this up with additional material and input from others, and submitted a benefit case to the Data Strategy Board.

It’s unclear at the moment whether Land Registry will receive any compensation from the DSB’s “buy-back” fund for releasing Price Paid Data.* However I am in no doubt that the ODUG’s benefit case was crucial to keeping the spotlight on this dataset and influencing Land Registry’s decision to release it as open data.

The full Price Paid Data release is significant for several reasons:

  • Price Paid Data is a reference dataset, which is to say it’s likely to be most useful when combined with data from other sources. Reference data is what we mean when we talk about information as infrastructure. It has value not just in itself but because it underpins wider analysis and enhances the utility of other datasets. Ministers and senior civil servants like to talk about how many thousands of public datasets have been listed on Data.gov.uk, but frankly much of that data is chaff; reference data is the wheat. It’s data that matters.
  • Land Registry are moving Price Paid Data from a commercial licensing model to an open licensing model. This is relatively unusual. Under the current Government, open data policy has focused mainly on releasing categories of data that have no established revenue streams attached to them. Consequently the economic promises made for open data have been slow to materialise in the UK. If government departments really want to drive economic benefits from re-use of public sector information, they will eventually have to grasp the nettle and wind down their commercial licensing operations. Charging for public data sucks the dynamism out of information markets, by creating barriers to entry for SMEs and restricting the flexibility with which data can be used — particularly in apps and on the web.
  • Price Paid Data is the first open data release identifiably linked to the DSB/ODUG process for unlocking datasets at the request of users. ODUG members have been very active in raising the profile of arguments for open data, but progress has been rather slow on raw release. Now that the first dataset has emerged from the request pipeline, the ODUG can point to this as a signal achievement.
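
To make the "combined with data from other sources" point concrete, here is a minimal Python sketch of joining sale prices to a second dataset. Everything in it is an assumption made for illustration: the records, the district codes, the income figures and the field names are invented, and none of it reflects Land Registry's actual file layout.

```python
from collections import defaultdict
from statistics import median

# Hypothetical sale records, invented for illustration (not real Price Paid Data).
price_paid = [
    {"postcode_district": "AB1", "price": 150000},
    {"postcode_district": "AB1", "price": 170000},
    {"postcode_district": "AB1", "price": 250000},
    {"postcode_district": "CD2", "price": 300000},
    {"postcode_district": "CD2", "price": 320000},
]

# A second, equally hypothetical dataset from another source:
# median household income by the same district code.
median_income = {"AB1": 25000, "CD2": 40000}


def price_to_income_ratio(sales, incomes):
    """Join sales to incomes by district; return median price / income."""
    prices_by_district = defaultdict(list)
    for sale in sales:
        prices_by_district[sale["postcode_district"]].append(sale["price"])
    # Only districts present in both datasets produce a ratio.
    return {
        district: median(prices) / incomes[district]
        for district, prices in prices_by_district.items()
        if district in incomes
    }


ratios = price_to_income_ratio(price_paid, median_income)
for district, ratio in sorted(ratios.items()):
    print(district, round(ratio, 2))
```

Neither dataset says much about affordability on its own; the ratio only exists once the two are joined on a shared key, which is the sense in which the sale prices act as reference data.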

Right. Now let’s talk about unlocking Land Registry’s cadastral data

* Update: I’ve been told open data release of Price Paid Data has been agreed without use of ODUG/DSB funds.

