The Open Data Census, an almost one-year-old project of the Open Knowledge Foundation, has grown in size and scope and now includes datasets from 25 cities as well as 48 countries, plus rankings.
The census was promoted around the recent Open Data Day 2013 activities and expanded by volunteer contributors who use the data submission form.
The results include charts rating countries and cities on a seven-point scale.
The 10 categories for countries are meant to represent “datasets” and are only briefly defined elsewhere. They are:
- Election Results (national)
- Company Register
- National Map (Low resolution: 1:250,000 or better)
- Government Budget (high level – spending by sector)
- Government Budget (detailed – transactional level data)
- Legislation (laws and statutes)
- National Statistical Office Data (economic and demographic information)
- National Postcode/ZIP database
- Public Transport Timetables
- Environmental Data on major sources of pollutants (e.g. location, emissions)
Each dataset is rated on seven questions, answered “yes,” “no” or “unsure.” The seven questions are:
- Does the data exist?
- Is it in digital form?
- Is it machine readable? (E.g. spreadsheet not PDF)
- Available in bulk? (Can you get the whole dataset easily)
- Is it publicly available, free of charge?
- Is it openly licensed? (as per the http://OpenDefinition.org/)
- Is it up to date?
City data is rated on the same seven points.
There are 15 categories of information:
- Transport timetables
- Annual Budget
- Expenditure (detailed)
- Election results
- Air quality
- Public transport stops
- School locations
- Crime statistics
- Health statistics
- Water Quality
- Procurement contracts
- Restaurant hygiene
- Road traffic accidents
- Building permits
- Government services fees
Definitions General
A bit more information defining the dataset categories is available. See the spreadsheet on countries here. “Public Transport Timetables” is defined as “Timetables of major government operated (or commissioned) services such as bus, train, tram etc.” “Legislation” means “All laws and statutes available online.”
The data is provided by submitters who identify themselves and supply links to the data, the scores on the seven points, comments and other information. Country information is here and city information is here; click on “Resources.”
A cursory sampling indicates some inappropriate or incomplete entries. For example, the entry for U.S. public transportation timetables lists government data on transportation, but not timetables. The link for economic data leads to a site showing foreign economic data. Other U.S. entries link ambiguously to the home page for all government data. Links are lacking for some of the entries, such as those for Nigeria and France.
The evolving website includes a notice inviting corrections, related blog posts discuss planned improvements, and there is an ongoing discussion group online.
Among countries, the United Kingdom and the United States rank at the top, both with 64 out of 70 points.
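The 70-point maximum is consistent with one point for each “yes” across the seven questions and 10 country datasets, although the census material quoted here does not spell out the formula. A minimal Python sketch under that assumption follows; the key names and the sample answers are hypothetical and are not census data.

```python
from typing import Dict, List

# Hypothetical short keys for the seven rating questions listed above.
QUESTIONS = [
    "exists",
    "digital",
    "machine_readable",
    "bulk",
    "public_free",
    "open_license",
    "up_to_date",
]


def dataset_score(answers: Dict[str, str]) -> int:
    """Count the 'yes' answers for one dataset.

    Assumption: 'no', 'unsure' and missing answers all score zero.
    """
    return sum(1 for q in QUESTIONS if answers.get(q, "unsure") == "yes")


def country_score(datasets: List[Dict[str, str]]) -> int:
    """Sum the per-dataset scores across all of a country's datasets."""
    return sum(dataset_score(answers) for answers in datasets)


# Made-up example: eight datasets with all "yes" answers and two
# datasets that fall short on licensing and timeliness.
all_yes = {q: "yes" for q in QUESTIONS}
partial = dict(all_yes, open_license="no", up_to_date="unsure")
example = [all_yes] * 8 + [partial] * 2

max_points = len(example) * len(QUESTIONS)
print(country_score(example), "out of", max_points)
```

Treating “unsure” the same as “no” is a design choice in this sketch; a stricter or looser rule would change the totals.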
The entries for 48 countries cover 284 datasets.
The city data, which has been collected since February 2013, is sparse. The 25 cities listed include some big ones and some tiny ones. There are 118 datasets listed. After the top eight cities, the rest have entries for four or fewer of the 15 categories.
The census website also has a directory of 295 data catalogs.
A Feb. 20 blog post by Rufus Pollock, founder and co-director of the Open Knowledge Foundation, explains the origins and design of the census and recapitulates its goals, stressing, “Doing open government data well depends on releasing key datasets in the right way.” The proliferation of sites has made it “increasingly hard to track what is happening,” the post says.
The post also says, “Progress in open government data is not (just) about the number of datasets being released. The quality of the datasets being released matters at least as much – and often more – than the quantity of these datasets.”