Open Government, Data Reuse, and Semantic Publishing

February 24th, 2009 by Brad Taylor

Government holds a lot of information, much of it potentially very useful to a variety of non-government individuals and groups. Making this information more freely available for third-party reuse has the potential to create public value, improve efficiency, and increase government transparency. By separating information from presentation, semantic publishing provides a technical avenue by which the policy goal of open government data can be advanced.

Open Government

Making government-held information more accessible to the public has long been a policy goal in New Zealand. The Official Information Act 1982 was intended:

To increase progressively the availability of official information to the people of New Zealand in order—

(i) To enable their more effective participation in the making and administration of laws and policies; and

(ii) To promote the accountability of Ministers of the Crown and officials,—
and thereby to enhance respect for the law and to promote the good government of New Zealand.

The Act includes a presumption of availability, requiring that ‘information shall be made available unless there is good reason for withholding it.’ Before 1982, the Official Secrets Act 1951 made it an offence for a public servant to communicate government information to any person unless specifically authorised to do so. The Official Information Act allows any New Zealand citizen to request that government information be made available to them.

The 1980 report which led to the Act stated:

Nowadays it is generally accepted that the Government has a responsibility to keep the people informed of its activities and make clear the reasons for its decisions. The release and dissemination of information is recognised to be an inherent and essential part of its functions.

In practice though not yet in law, the onus of proof is shifting from those who want information disclosed to those who want it withheld. The assumption on which both the Government and interested groups are now tending to work is that official information should be made available to the public, unless there are good reasons to withhold it in the interests of the community at large.

It seems that public expectations have shifted further. While requiring public servants to respond to information requests was a reasonable way of promoting openness in 1982, technology and reasonable standards of ‘availability’ have evolved. It is difficult to view information which must be actively requested and can be withheld or delayed as ‘available’ when it could be cheaply and easily published on the web.

The policy framework for New Zealand Government-held information, approved by Cabinet in 1997, presciently advises agencies to make public information ‘increasingly available on an electronic basis.’ In recent years, much government information has been provided online and this has substantially improved accessibility. The New Zealand E-government Interoperability Framework (e-GIF) introduced a focus on the form in which information is published, advocating open standards.

Internationally, there has been an increasing focus on open government data in recent years. Barack Obama has signalled a strong commitment to transparency and open government in the US, and the UK Government has been working on various projects to unlock the power of information, such as the Public Sector Information Unlocking Service and the Show us a Better Way contest.

The nonprofit sector has been pushing for more open government information and also innovating to make use of existing government information. Organisations such as theyworkforyou.co.nz in New Zealand, and the Sunlight Foundation, govtrack.us and mysociety.org abroad have gone to great effort in making government information more accessible and have led calls for government to provide information more openly. These organisations tend to argue that open government data increases transparency and enables citizen participation, and see the internet as a new public square which is essential for the functioning of democracy. An Open Government Working Group formalised the preferences of many NGOs in December 2007 with 8 Principles of open data:

1. Complete
All public data are made available. Public data are data that are not subject to valid privacy, security or privilege limitations.

2. Primary
Data are collected at the source, with the finest possible level of granularity, not in aggregate or modified forms.

3. Timely
Data are made available as quickly as necessary to preserve the value of the data.

4. Accessible
Data are available to the widest range of users for the widest range of purposes.

5. Machine processable
Data are reasonably structured to allow automated processing.

6. Non-discriminatory
Data are available to anyone, with no requirement of registration.

7. Non-proprietary
Data are available in a format over which no entity has exclusive control.

8. License-free
Data are not subject to any copyright, patent, trademark or trade secret regulation. Reasonable privacy, security and privilege restrictions may be allowed.

There has been a growing recognition that the goal of open government information rests not simply on citizens having the de jure right to access information, but also the de facto ability. The form in which information is made available is crucial.

Semantic Publishing

While government information has become increasingly available on the web, it often remains very difficult for users to find and manipulate. Users are required to seek out information through a search engine or navigate an agency website. There is no way for users to know whether information on a website has changed, other than by periodically checking the site manually. Agency websites often exist as self-contained silos: users wishing to find specific information need to guess which department holds the information and navigate that site to find it. All-of-Government efforts such as the portal newzealand.govt.nz have improved this situation, but there are significant gains yet to be realised.

Today’s web is largely designed for human consumption, with information too unstructured and context-dependent to be understood by machines. Consequently, any interesting manipulation of information must also be done by humans. By laying out information in a way so obvious that even a computer can understand it, we make it possible for computers to do a lot of the grunt-work now done by humans. This economises on human effort and also opens up new possibilities for the combination of data from different sources - mashups. This sort of semantic publishing - making information available in a format which explicitly flags the meaning of each piece of information - has the potential to improve the accessibility and reusability of Government information. This provides obvious benefits to the public, and also allows more efficient inter- and intra-agency use of information.
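To make the contrast concrete, here is a minimal sketch of what a program can do once meaning is flagged explicitly. It assumes a hypothetical machine-readable list of public holiday dates; the JSON layout, field names, and the `next_holiday` helper are illustrative assumptions, not an existing government feed.

```python
import json
from datetime import date

# Hypothetical machine-readable publication of public holiday dates.
# The structure and field names are assumptions made for this example.
HOLIDAYS_JSON = """
[
  {"name": "Waitangi Day", "date": "2009-02-06"},
  {"name": "Good Friday", "date": "2009-04-10"},
  {"name": "Easter Monday", "date": "2009-04-13"},
  {"name": "ANZAC Day", "date": "2009-04-25"}
]
"""


def next_holiday(holidays_json, today):
    """Return (name, date) of the first holiday on or after `today`, or None."""
    records = json.loads(holidays_json)
    upcoming = [
        (date.fromisoformat(record["date"]), record["name"])
        for record in records
        if date.fromisoformat(record["date"]) >= today
    ]
    if not upcoming:
        return None
    when, name = min(upcoming)  # earliest date wins
    return name, when


if __name__ == "__main__":
    print(next_holiday(HOLIDAYS_JSON, date(2009, 2, 24)))
```

Because each value is labelled as a name or a date, the same data could just as easily feed a calendar application or a payroll system, with no human reading a web page in between.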

The current approach to making Government information more accessible is focused on making websites more user-friendly: the agency thinks about how people use its site and attempts to organise information appropriately. This approach is costly and there is a better alternative. Instead of attempting to be all things to all people, Government can open its information for reuse by consumers and third parties. Allowing information to be used in various ways frees Government of the responsibility of anticipating diverse and ever-changing consumer needs. Rather than providing a finished product to the public in a user-friendly form, open semantic publishing provides the raw material in a developer- and mashup-friendly form.

It then becomes possible for Government to focus on providing high-quality, authoritative information, while allowing that information to be presented in a variety of ways tailored to diverse end-user needs. Instead of taking responsibility for the entire supply-chain of information from creation to delivery, Government could create an information infrastructure on top of which third parties would build end-user solutions.

Semantic publishing provides the means through which government information can be easily reused. The vision of the semantic web is one of a web of data replacing the current web of documents. Instead of relying on human-parsable documents to convey information, semantic publishing seeks to expose the underlying structure of data in a machine-parsable form. Many sites today are generated dynamically from databases, with each page rendered as HTML in response to a user request. In these cases, structure clearly exists, but is lost by the time the information reaches the public. Simply exposing existing database structures to public view would significantly advance the cause of the semantic web. While revealing the underlying structure of one source of information may be of only modest value in itself, linked data from different sources which can be cross-referenced and mashed up has the potential to fundamentally change the web by giving it some of the functionality of a database.
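The point about structure that exists at the source but disappears on publication can be sketched as follows. This is an illustration only: the `vacancy` table, its columns, and the agency names are hypothetical stand-ins for whatever an agency's back end actually holds.

```python
import json
import sqlite3
from html import escape


def load_vacancies():
    """Build a tiny in-memory database standing in for an agency's back end."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # rows behave like mappings keyed by column
    conn.execute(
        "CREATE TABLE vacancy ("
        "id INTEGER PRIMARY KEY, title TEXT, agency TEXT, closes TEXT)"
    )
    conn.executemany(
        "INSERT INTO vacancy (title, agency, closes) VALUES (?, ?, ?)",
        [
            ("Policy Analyst", "Example Agency A", "2009-03-15"),
            ("Web Developer", "Example Agency B", "2009-03-31"),
        ],
    )
    query = "SELECT id, title, agency, closes FROM vacancy ORDER BY id"
    return conn.execute(query).fetchall()


def as_html(rows):
    """Human-oriented view: the fields are flattened into sentences."""
    items = "".join(
        f"<li>{escape(r['title'])} at {escape(r['agency'])}, "
        f"closes {escape(r['closes'])}</li>"
        for r in rows
    )
    return f"<ul>{items}</ul>"


def as_json(rows):
    """Machine-oriented view: every value keeps its column name."""
    return json.dumps([dict(r) for r in rows], indent=2)


if __name__ == "__main__":
    rows = load_vacancies()
    print(as_html(rows))
    print(as_json(rows))
```

Both views come from the same query. Publishing the second alongside the first costs little, and it saves a third party from having to guess which part of each sentence is the job title and which is the closing date.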

Types of Government information

The justification for open government data depends on the purpose of publishing the data. Information is used for different purposes, and a crude but useful distinction can be made between:

  • Information for Government - information which helps Government achieve its organisational goals. This is information produced by Government qua organisation. Job vacancies are a good example of this type of information: the main reason for making government job information available openly is that recruitment needs will be better fulfilled.
  • Information by Government - information which the government has produced in the course of its operations that could have practical value for the public. This is information produced by Government qua service provider. The data collected by StatsNZ, the research conducted by various agencies, and things like public holiday dates are examples of this type of information.
  • Information about Government - information which helps citizens keep tabs on government activities and participate in the political process. This is information produced by Government qua accountable representative of the public. Publishing this sort of information is a requirement of open, transparent government. Third-party sites like govtrack.us and theyworkforyou.co.nz use this type of information.

While semantic publishing can facilitate open government and improve performance in each of these areas, the justifications for openness differ in each case. The value of making information for Government more open derives largely from the support it lends to an agency’s functional capacity: it helps the agency operate better in its other pursuits which are intended to create public value.

The value of making information by Government openly available is in the downstream possibilities it creates. The information Government collects is potentially useful in a variety of applications and making it more accessible creates new opportunities for individuals and groups to better achieve their disparate goals. Examples range from researchers wishing to find and use quantitative data sources, to web developers wishing to integrate government health ratings into a restaurant review site, to a community portal wishing to aggregate all news relevant to some geographic area. Semantic publishing would enable end-users and third parties to more easily find the information they are looking for and, once they have found it, to automatically process it.
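The restaurant review example can be sketched in a few lines, assuming both datasets refer to each premises by a shared identifier. Everything here is hypothetical: the `premises_id` key, the grade values, and the `mash_up` helper are invented for illustration and do not describe any real rating scheme.

```python
# Hypothetical government dataset: health grades keyed by a premises ID.
HEALTH_GRADES = {
    "P-001": "A",
    "P-002": "C",
}

# Hypothetical third-party dataset: reviews that reference the same IDs.
REVIEWS = [
    {"premises_id": "P-001", "name": "Example Cafe", "stars": 4},
    {"premises_id": "P-002", "name": "Example Diner", "stars": 5},
    {"premises_id": "P-003", "name": "Example Bistro", "stars": 3},
]


def mash_up(reviews, grades):
    """Attach a health grade to each review, leaving the inputs unmodified."""
    return [
        {**review, "health_grade": grades.get(review["premises_id"], "unknown")}
        for review in reviews
    ]


if __name__ == "__main__":
    for entry in mash_up(REVIEWS, HEALTH_GRADES):
        print(entry["name"], entry["stars"], entry["health_grade"])
```

The join is trivial only because both sources agree on an identifier. Without one, the developer is left matching free-text names and addresses, which is exactly the guesswork that flagging meaning at the source avoids.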

Accessible information about Government is essential to the functioning of democracy. If citizens cannot keep track of their representatives, they cannot make informed political decisions. Opening up information about MP voting records and agency financial and operational performance, so that third parties can re-present it to the public in innovative ways, allows us to crowdsource Government transparency.

User-led Innovation

The tools needed to create innovative applications of information goods are increasingly available at lower cost. The decreasing cost of computing power and bandwidth combined with a variety of cheap or free software effectively democratises innovation.

Government clearly has a comparative advantage in providing semantically rich, authoritative versions of its data, since it is much easier to explicitly flag meaning at source than guess at it later (and structure often already exists at source but is destroyed during publication or hidden from public view). Many individuals and groups, though, have a comparative advantage in creating applications of that data for specific audiences: one size seldom fits all. By opening up data, we enable decentralised small-scale experimentation not possible in an organisation as large as Government. The success of projects such as Wikipedia and Open Source Software demonstrates the power of decentralised, collaborative production.

The fact that third parties currently go to great lengths to make unstructured government information available suggests that this information is valuable. Making it more reusable would enable more third-party and user innovation and provide greater social value.

Conclusion

The goal of making government information more available to the public is a long-standing one in New Zealand public policy. As the world changes, so must the means of pursuing this goal. Semantic publishing is a promising way of increasing government accessibility, transparency, and value for money. Government must focus on its core advantage - providing authoritative and accessible versions of the information it collects - and allow third parties to create bespoke solutions for diverse and changing consumer needs.



4 Responses to “Open Government, Data Reuse, and Semantic Publishing”

  1. Me Elsewhere « Brad Taylor’s Blog says:

    [...] up at the research e-Labs site on some of the stuff I looked at in my summer gig for the enemy, open government data and semantic publishing. The thrust: Today’s web is largely designed for human consumption, with information too [...]

  2. Allan says:

    Great Article

    “Government could create an information infrastructure on top of which third parties would build end-user solutions.”

    I recently listened to a talk which attributed google’s success to providing an open platform rather than a closed portal (yahoo). This (combined with an interview at statistics) got me thinking on the large amount of data that the government could make available for mash up purposes and I am really glad to see that this is being considered in NZ.

    One of the other things that google does and could be useful for the government is “beta”. The government’s web presence can seem like a bit of a slow moving leviathan with all the hoops and procedures I am sure people have to go through to get something to happen. A “beta” program could liven things up a bit by removing the large portion of accountability that the current public facing sites have, and allow for a side image of a govt that is willing to have a go at new things - when they are occurring (web tech moves fast). The end product doesn’t have to be perfect and polished, and accessible to everyone. It doesn’t even have to be a success, but the mistakes made, and lessons learned would be invaluable.

  3. Don says:

    “7. Non-proprietary
    Data are available in a format over which no entity has exclusive control.”

    http://www.parliament.nz/en-NZ/AboutParl/SeeHear/PTV/

    Available formats: Apple Quicktime and Windows Media Format, both proprietary formats

    Could it be that parliamentary broadcasts are not considered ‘data’?

  4. Brad Taylor says:

    Don: Definitely data, but not open data by the standards of the Open Government Working Group. Of course, openness comes in degrees. On any reasonable definition, making parliamentary broadcasts available in a proprietary format is certainly more open than not providing them at all. I’m fairly happy with baby steps.
