The Panton Principles for Open Data in Science
This morning, the Open Knowledge Foundation published a declaration of principles intended to inform and encourage the publication of Open Data in the sciences. The OKF defines open data as “data that can be used, reused and redistributed without restriction other than (perhaps) the requirement to attribution or share-alike.” The Panton Principles for Open Data in Science are not a manifesto on Open Access or Open Science (though such manifestos exist, such as Willinsky’s The Access Principle: The Case for Open Access to Research and Scholarship or Hope’s excellent Biobazaar: The Open Source Revolution and Biotechnology). Rather, they are simply four fundamental guiding principles for making data available “without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself.” Here are the Panton Principles in their entirety, as published by the Working Group on Open Data in Science.
-
Where data or collections of data are published it is critical that they be published with a clear and explicit statement of the wishes and expectations of the publishers with respect to re-use and re-purposing of individual data elements, the whole data collection, and subsets of the collection. This statement should be precise, irrevocable, and based on an appropriate and recognized legal statement in the form of a waiver or license.
When publishing data make an explicit and robust statement of your wishes.
-
Many widely recognized licenses are not intended for, and are not appropriate for, data or collections of data. A variety of waivers and licenses that are designed for and appropriate for the treatment of data are described here. Creative Commons licenses (apart from CCZero), GFDL, GPL, BSD, etc are NOT appropriate for data and their use is STRONGLY discouraged.
Use a recognized waiver or license that is appropriate for data.
-
The use of licenses which limit commercial re-use or limit the production of derivative works by excluding use for particular purposes or by specific persons or organizations is STRONGLY discouraged. These licenses make it impossible to effectively integrate and re-purpose datasets and prevent commercial activities that could be used to support data preservation.
If you want your data to be effectively used and added to by others it should be open as defined by the Open Knowledge/Data Definition – in particular non-commercial and other restrictive clauses should not be used.
-
Furthermore, in science it is STRONGLY recommended that data, especially where publicly funded, be explicitly placed in the public domain via the use of the Public Domain Dedication and Licence or Creative Commons Zero Waiver. This is in keeping with the public funding of much scientific research and the general ethos of sharing and re-use within the scientific community.
Explicit dedication of data underlying published science into the public domain via PDDL or CCZero is strongly recommended and ensures compliance with both the Science Commons Protocol for Implementing Open Access Data and the Open Knowledge/Data Definition.
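As a rough illustration of how the four principles might be applied when reviewing a dataset's license, here is a minimal Python sketch. The function name, the identifier spellings (such as `CC-BY-NC`), and the verdict strings are all assumptions made for this example; they are not part of the Panton Principles or of any OKF tool, and the groupings simply mirror the licenses named in the text above.

```python
# Illustrative only: identifiers and groupings are assumptions drawn from
# the wording of the principles, not an official registry of licenses.
RECOMMENDED = {"PDDL", "CC0", "CCZERO"}   # public domain dedications (principle 4)
NOT_FOR_DATA = {"GFDL", "GPL", "BSD"}     # named in principle 2 as unsuited to data


def assess_license(license_id: str) -> str:
    """Return a rough verdict for a license identifier under the principles."""
    lid = license_id.strip().upper()
    if not lid:
        # Principle 1: data should carry an explicit statement of terms.
        return "missing: no explicit license or waiver stated (principle 1)"
    if lid in RECOMMENDED:
        return "recommended: public domain dedication (principle 4)"
    parts = lid.split("-")
    if "NC" in parts or "ND" in parts:
        # Principle 3: non-commercial and no-derivatives clauses are discouraged.
        return "discouraged: restrictive clause (principle 3)"
    if lid in NOT_FOR_DATA or lid.startswith("CC-"):
        # Principle 2: licenses not designed for data are discouraged.
        return "discouraged: not designed for data (principle 2)"
    return "unknown: check the license against the Open Knowledge/Data Definition"


if __name__ == "__main__":
    for candidate in ["CC0", "CC-BY-NC", "GPL", "", "ODbL"]:
        print(repr(candidate), "->", assess_license(candidate))
```

Anything the sketch does not recognize is reported as unknown rather than open, since only the licenses named in the principles are classified here.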
The Panton Principles website features a quickly growing list of endorsers who have signed a petition for adoption, along with a form for adding your own endorsement.
Beyond this declaration of principles, the working group is trying to make putting them into practice as simple as possible. One tool it recently launched is the “Is It Open Data?” service, which acts as a broker for data use enquiries. The mechanism is simple: a web form with some boilerplate text seeking clarification of the “openness” of a given dataset. Once your enquiry has been submitted, the service helps to identify the body that can authoritatively answer permissions and usage questions. The responses are publicly available on the “Is It Open Data?” website. Since the service is new, there are few existing enquiries or responses, but hopefully people will take up the idea and the network effect will kick in, making this a central point of information sharing about open data. As awareness of the Panton Principles grows and, hopefully, they are adopted, data owners may proactively indicate the openness of their data. The Open Knowledge Foundation has provided some handy icons that can be added to your website to make that status obvious.


