Outline
Improve the trust in government by enabling assessment and/or providing evidence related to the quality of the published information/data.
Management Summary
Whether a dataset is of sufficient quality to be used for a specific task depends entirely on the task in question. There is no single objective measure of quality. However, work carried out in the European Commission's Open Data Support project suggests nine aspects to consider:
- Accuracy: Does the data correctly represent the real-world entity or event?
- Consistency: Is the data free of contradictions?
- Availability: Can the data be accessed now and over time?
- Completeness: Does the data include all data items representing the entity or event?
- Conformance: Does the data follow accepted standards?
- Credibility: Is the data based on trustworthy sources?
- Processability: Is the data machine-readable?
- Relevance: Does the dataset contain an appropriate amount of data for the task?
- Timeliness: Does the data represent the current situation and is it published soon enough?
Data users are likely to judge whether the available information and data are sufficient for their needs against these criteria.
Challenge
Understand what is required in terms of data quality and define a set of basic and measurable metrics to determine data quality in an objective way.
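As an illustration of what a basic, measurable metric might look like, the following Python sketch computes a completeness score and the age of a dataset in days. The function names, field names and sample records are hypothetical and chosen only for this example; a real publisher would define its own metrics and thresholds.

```python
from datetime import date


def completeness(records, required_fields):
    """Return the share of required cells that are filled in (0.0 to 1.0)."""
    total = len(records) * len(required_fields)
    if total == 0:
        return 0.0
    filled = sum(
        1
        for record in records
        for field in required_fields
        # Treat missing keys, None and empty strings as "not filled in".
        if record.get(field) not in (None, "")
    )
    return filled / total


def age_in_days(last_modified, today):
    """Return the number of days since the dataset was last modified."""
    return (today - last_modified).days


# Hypothetical sample data: the second record lacks a postcode.
records = [
    {"id": "1", "name": "Town Hall", "postcode": "1000"},
    {"id": "2", "name": "Library", "postcode": ""},
]
score = completeness(records, ["id", "name", "postcode"])
age = age_in_days(date(2024, 1, 1), date(2024, 1, 31))
```

Scores of this kind only become meaningful once the publisher states which fields are required and how old data may be before it counts as out of date for a given use.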
Solution
Implement a dataset publication pipeline that publishes quality assessment and provenance information alongside the data. The W3C Data Quality Vocabulary provides the means to make such information available in a machine-readable and interoperable manner, covering:
- annotations that describe the data's quality;
- computed quality metrics;
- certificates that describe the dataset production pipeline;
- provenance information using the W3C PROV standards.
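A computed quality metric from the list above could be expressed with the Data Quality Vocabulary roughly as follows. This is a minimal sketch that builds a JSON-LD description of one measurement in Python; the helper name and the example.org IRIs are hypothetical, and annotations, certificates and provenance are left out for brevity.

```python
import json


def quality_measurement(dataset_iri, metric_iri, value):
    """Build a JSON-LD description of a single DQV quality measurement."""
    return {
        "@context": {
            "dqv": "http://www.w3.org/ns/dqv#",
            "xsd": "http://www.w3.org/2001/XMLSchema#",
        },
        "@type": "dqv:QualityMeasurement",
        # The dataset (or distribution) the measurement was computed on.
        "dqv:computedOn": {"@id": dataset_iri},
        # The metric being measured, e.g. a completeness metric.
        "dqv:isMeasurementOf": {"@id": metric_iri},
        "dqv:value": {"@value": str(value), "@type": "xsd:double"},
    }


# Hypothetical IRIs, used only for illustration.
measurement = quality_measurement(
    "http://example.org/dataset/public-buildings",
    "http://example.org/metrics/completeness",
    0.83,
)
serialized = json.dumps(measurement, indent=2)
```

Publishing such a description next to the dataset lets re-users and catalogues read the quality score without rerunning the assessment themselves.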
Best Practice Identification
Why is this a Best Practice? What’s the impact of the Best Practice?
As a result of this practice, re-users will have greater trust in the published data and will not need to carry out, or pay others to carry out, a quality assessment.
Links to the Revised PSI Directive
Why is there a need for this Best Practice?
To support re-use, in particular the creation of innovative commercial services.
What do you need for this Best Practice?
To assess the publishing process, consider the steps described by ODI Certificates (or similar).
To enable potential re-users to assess the quality of the dataset itself, provide provenance information and annotations using the W3C Data Quality Vocabulary. Feedback from existing users is always of interest to potential users.
Applicability by other Member States
The approach is applicable to any Member State.