datasets / publicbodies
A database of public bodies such as government departments, ministries etc.
Home Page: http://publicbodies.org
License: MIT License
@okfngr just noticed that in PR #43 a lot of public bodies were missing a key field (now called id).
Would it be possible to generate and add an id field to all records? The id field is required for the frontend to work.
We also seem to be missing jurisdiction codes (which are required) for
gr/dpa
gr/adae
gr/asep
gr/esr
gr/synigoros
gr/minedu
gr/neagenia
gr/gsae
gr/culture
gr/gss
gr/gsrt
gr/minedu
gr/minedu
gr/minedu
gr/gak
gr/gak
gr/iky
Add the list of Italian public administrations, using the data available in the CSV maintained by the Italian Public Administration Index.
This is an idea that I've been thinking about for a while. I discussed it with @rgrp a couple of weeks ago and wanted to share it with the list to see what everyone thinks.
The short version: could public bodies be used to generate usable organisation identifiers?
The IATI Standard is an XML-based format for sharing detailed information about aid projects. Fundamentally, the model shows resource flows from one organisation to another, with various classifications in between and many financial transactions as part of each project. For example:
activity (DFID -> World Health Organisation)
- transaction (GBP 500 disbursed on 2013-05-01)
- transaction (GBP 500 disbursed on 2013-07-05)
For the private sector and NGOs, the methodology for uniquely identifying organisations is:
Jurisdiction
-National registration body
-Number
e.g. for Oxfam GB, registered at the Charity Commission, with reg number 202918:
GB-CHC-202918
For governments, the following methodology is used:
Jurisdiction
-OECD/DAC Agency code
e.g. for the UK's Department for International Development:
GB-1
For multilaterals, we use the following methodology:
OECD/DAC Channel code
e.g. for the World Bank's International Development Association (IDA):
44002
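The three identifier patterns above can be sketched as one small function. This is only an illustration: the function name and the field names (kind, registry, agencyCode, channelCode) are made up for this example and are not part of the IATI Standard.

```javascript
// Build an organisation identifier following the three patterns described above.
// All names here are illustrative assumptions, not IATI terminology.
function buildOrgIdentifier(org) {
  switch (org.kind) {
    case 'registered': // private sector and NGOs: jurisdiction-registry-number
      return [org.jurisdiction, org.registry, org.number].join('-');
    case 'government': // governments: jurisdiction-OECD/DAC agency code
      return [org.jurisdiction, org.agencyCode].join('-');
    case 'multilateral': // multilaterals: OECD/DAC channel code on its own
      return String(org.channelCode);
    default:
      throw new Error('Unknown organisation kind: ' + org.kind);
  }
}

console.log(buildOrgIdentifier({ kind: 'registered', jurisdiction: 'GB', registry: 'CHC', number: '202918' })); // GB-CHC-202918
console.log(buildOrgIdentifier({ kind: 'government', jurisdiction: 'GB', agencyCode: 1 }));                      // GB-1
console.log(buildOrgIdentifier({ kind: 'multilateral', channelCode: 44002 }));                                   // 44002
```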
Miscellaneous
Many organisations publishing IATI data will therefore struggle to provide unique organisation identifiers for many of the public sector / international organisations that they are working with.
BW-1 or BW-21, the Botswana Ministry of Finance just needs a code).
Fuzzy reconciliation / text matching of organisations, with an API that assigns an existing identifier where available, and creates a new one where it's not available
MINISTRY OF FINANCE
BW (for Botswana)
en
2013-07-05
the API responds with one of the following (possibly using HTTP status codes?):
a) Organisation found => use code BW-1
b) Organisation not found => created code BW-21
it also stores the data about the last recorded transaction, so that other people know that that organisation may have existed on that date.
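A rough in-memory sketch of the lookup-or-create behaviour described above, to make the idea concrete. Everything here is an assumption for illustration: the function names, the sequential numbering, and the suggested HTTP status codes. Matching on a normalised name stands in for real fuzzy matching, which would need something considerably smarter.

```javascript
// In-memory stand-in for the proposed reconciliation service (hypothetical API).
function createRegistry() {
  const records = new Map();  // "BW|ministry of finance" -> record
  const counters = new Map(); // highest number issued so far per jurisdiction

  // Placeholder for fuzzy matching: ignore case and repeated whitespace only.
  function normalise(name) {
    return name.trim().toLowerCase().replace(/\s+/g, ' ');
  }

  function reconcile({ name, jurisdiction, language, date }) {
    const key = jurisdiction + '|' + normalise(name);
    let record = records.get(key);
    let status = 'found'; // could map to HTTP 200
    if (!record) {
      const next = (counters.get(jurisdiction) || 0) + 1;
      counters.set(jurisdiction, next);
      record = { code: jurisdiction + '-' + next, name, language, lastSeen: date };
      records.set(key, record);
      status = 'created'; // could map to HTTP 201
    } else if (date > record.lastSeen) {
      record.lastSeen = date; // ISO 8601 dates compare correctly as strings
    }
    return { status, code: record.code, lastSeen: record.lastSeen };
  }

  return { reconcile };
}

const registry = createRegistry();
const first = registry.reconcile({ name: 'MINISTRY OF FINANCE', jurisdiction: 'BW', language: 'en', date: '2013-05-01' });
const second = registry.reconcile({ name: 'Ministry of  Finance', jurisdiction: 'BW', language: 'en', date: '2013-07-05' });
console.log(first.status, first.code);   // created BW-1
console.log(second.status, second.code); // found BW-1
```

The lastSeen field is the "last recorded transaction" date: it only moves forward, so a later sighting of the same organisation extends the period it is known to have existed.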
Another source could be Charts of Accounts, existing lists (like those that exist on PB already), budget documents, and structured spending data, e.g. from OpenSpending.
This will probably lead to some duplicates being created. There could be some manual reconciliation for this. Organisations could have a primary identifier and several secondary identifiers that were used by duplicate organisations.
Organisations can be created / deleted / merged in the real world. This should probably lead to:
a) created - a new identifier gets created;
b) merged - a new identifier gets created for the new organisation; and (manually) the old organisations are linked / related to the new organisation;
c) deleted - the identifier continues to exist, because old (and possibly future) data will still refer to it. However, it should be (manually) marked as no longer existing, pointing to a successor organisation if one exists (with some flag to explain whether it's a wholly .
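One way to picture the created / merged / deleted rules above is a store where identifiers are never removed, only re-labelled and linked to a successor. This is a sketch under assumptions: the status values, function names and the XX-n identifiers are invented for the example.

```javascript
// Hypothetical identifier store: records are kept forever so old data still resolves.
const bodies = new Map();

function createBody(id, name) {
  bodies.set(id, { id, name, status: 'active', successor: null });
}

// Merge: the new organisation gets its own identifier; the old ones point at it.
function mergeBodies(oldIds, newId, newName) {
  createBody(newId, newName);
  for (const oldId of oldIds) {
    const old = bodies.get(oldId);
    old.status = 'merged';
    old.successor = newId;
  }
}

// Delete: mark as no longer existing, optionally pointing at a successor.
function dissolveBody(id, successorId = null) {
  const body = bodies.get(id);
  body.status = 'dissolved';
  body.successor = successorId;
}

// Follow successor links to the organisation that currently stands in for `id`.
function resolveCurrent(id) {
  let body = bodies.get(id);
  const seen = new Set(); // guards against accidental successor loops
  while (body && body.successor && !seen.has(body.id)) {
    seen.add(body.id);
    body = bodies.get(body.successor);
  }
  return body;
}

createBody('XX-1', 'Ministry of Trade');
createBody('XX-2', 'Ministry of Industry');
mergeBodies(['XX-1', 'XX-2'], 'XX-3', 'Ministry of Trade and Industry');
console.log(resolveCurrent('XX-1').id); // XX-3
console.log(bodies.get('XX-1').status); // merged
```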
PB-
?OECD-DAC codelists:
Use info from http://datahub.io/dataset/uk-public-bodies
Has the licence for the whatdotheyknow list of public bodies been established? We asked a few months ago and they didn't have one, although no doubt with a good nudge they would be happy to.
This should probably go on the wiki once finished.
To discuss
Some duplicates at the moment and also many fewer bodies than there probably are!
https://github.com/okfn/publicbodies/blob/master/data/us.csv
Is the license for the source code of this project (not the data, as that is a separate issue) specified somewhere? I couldn't find it. Please include a (preferably) open source license or, if there is already one, make it more evident (e.g. mention it in the README and/or include a COPYING.txt file).
Note: it may be necessary to:
I'm on it!
To make the data set more useful, I think we should add a field to the schema for related bodies/agencies. Perhaps the field could be populated with values of the key field.
Thoughts?
It looks like in Firefox the two main DIVs, the one with the jurisdictions and the sidebar on the right, overlap a bit. In both Chrome and Safari they are positioned correctly.
Options
Wrote a quick scraper for the directory of Swiss federal entities, see https://scraperwiki.com/scrapers/public_bodies_of_the_swiss_federation/
#51 made a change uppercasing jurisdiction codes, but links on the front page are lowercase.
For countries for which we have a good tree structure, being able to browse that tree in the UI would be very helpful.
Requirements:
I have a scraper for Quebec's public bodies (my boss authored it, and wants to contribute). It's written in Ruby and can be seen here: https://gist.github.com/jpmckinney/5022490
How do we go about integrating this?
Lost them in node upgrade ...
The ever growing list of German public bodies on FragDenStaat.de can be accessed via the FragDenStaat.de API:
https://fragdenstaat.de/api/v1/publicbody/?format=json
It's a bit verbose. If CSV is a better fit, I can also provide a dump.
It looks like npm install is enough to install the site now. npm run-script make fails since there’s no longer a site directory.
e.g. nodejs + nunjucks + deploy on heroku
Note we would still just load the raw CSV when the app loads - the Heroku 512 MB limit should be fine given the amount of data we have so far ...
The search leads to links like publicbodies.org/gb, which is an empty page, but the front page leads to publicbodies.org/GB. These need to be harmonised.
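One possible way to harmonise the two forms is to pick a single canonical case and redirect everything else to it. The sketch below assumes upper case is canonical (since that is the form that currently works) and checks against a set of known jurisdiction codes so that unrelated two-letter paths are left alone. The function names and the knownCodes set are hypothetical.

```javascript
// Return the canonical form of a jurisdiction URL path, e.g. "/gb" -> "/GB".
// Upper case as the canonical form is an assumption; lower case would work equally well.
function canonicalJurisdictionPath(path, knownCodes) {
  const match = /^\/([A-Za-z]{2})(\/.*)?$/.exec(path);
  if (!match) return path;                 // not shaped like a jurisdiction URL
  const code = match[1].toUpperCase();
  if (!knownCodes.has(code)) return path;  // two letters, but not a jurisdiction we hold
  return '/' + code + (match[2] || '');
}

// Returns the path to redirect to, or null when the path is already canonical.
function redirectTarget(path, knownCodes) {
  const canonical = canonicalJurisdictionPath(path, knownCodes);
  return canonical === path ? null : canonical;
}

const knownCodes = new Set(['GB', 'GR', 'US']);
console.log(redirectTarget('/gb', knownCodes)); // /GB
console.log(redirectTarget('/GB', knownCodes)); // null
```

In a web app this would typically sit in front of the routes and issue a permanent redirect whenever redirectTarget returns a non-null value, so that search results and front-page links end up on the same page.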
The Public Bodies tagline is "A URL for every part of government", yet distinctly non-government entities pop up on the UK list, e.g. ASDA.
It would be a less catchy tagline, but perhaps "A URL for every FoI-able public sector organisation" would be more accurate and less confusing?
Several options:
Public bodies change frequently and it would be good to agree how to deal with this. I think having a sense of permanence for URLs is useful, so I suggest:
Suggest:
US data is missing key field in many cases - cf #39
Load country code info from e.g. http://data.okfn.org/data/country-codes and use them ...
All the CSV downloads on the homepage link to "undefined.csv": http://publicbodies.org/
See https://github.com/okfn/opendatacensus/tree/master/tests for our preferred approach (using mocha, superagent etc)
Which of the following is in scope for this project's data:
a) only organizations (as in org:Organization); or
b) organizations and their respective hierarchy of organizational units (as in org:OrganizationalUnit)?
Let's get rid of random generated uuid parts for keys and use slug instead.
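A minimal sketch of what slug-based keys could look like. The jurisdiction/slug shape (as in gr/minedu) and the numeric suffix for clashes are assumptions for illustration, and the helper names are invented.

```javascript
// Turn a name into a URL-friendly slug.
function slugify(text) {
  return text
    .normalize('NFKD')                // split accented letters into base letter + mark
    .replace(/[\u0300-\u036f]/g, '')  // drop the combining marks
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')      // any run of other characters becomes one hyphen
    .replace(/^-+|-+$/g, '');         // trim leading and trailing hyphens
}

// Build a key such as "gb/some-body", adding -2, -3, ... if the key is already taken.
function makeKey(jurisdiction, name, taken) {
  const base = jurisdiction.toLowerCase() + '/' + slugify(name);
  let key = base;
  for (let n = 2; taken.has(key); n++) key = base + '-' + n;
  taken.add(key);
  return key;
}

const taken = new Set();
console.log(makeKey('GB', 'Department for International Development', taken)); // gb/department-for-international-development
console.log(makeKey('GB', 'Department for International Development', taken)); // gb/department-for-international-development-2
```

Note that names written entirely in a non-Latin script (Greek, for instance) would produce an empty slug with this approach, so those records would need a transliteration step or a hand-picked abbreviation.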
Also:
We should create "scripts/import/XX" directories as needed in the repository to hold scripts that update the data, where available from public sources. That way it would be much easier to keep the data up-to-date.
Data from Shen: http://ubercheckout.com/cn.csv
@rossjones suggested: "Would it make sense for publicbodies.org to follow the popolo spec at http://popoloproject.com/data.html" (that link is now broken)
Correct link is: http://popoloproject.com/specs/organization.html
Seems a great idea!
Current fields and suggested changes (e.g. to be in line with popolo as much as possible). Note the list of changes is in progress and incomplete.
Add:
Pros / Cons
Would be nice to link out from a given public body to all requests related to it on relevant FoI sites
/cc @wombleton NZ could be a test case for this ...
http://www.ujp.gov.si/dokumenti/dokument.asp?id=127 -- first excel links :)
Let's use nunjucks
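var nunjucks = require('nunjucks');
// No loader is passed, so nunjucks falls back to its default (the ./views directory under Node)
var env = new nunjucks.Environment();
var tmpl = env.getTemplate('test.html');
console.log(tmpl.render({ username: "james" }));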
Now that #29 is done and we are in line with Popolo in the CSV this should be pretty easy
No descriptions at the moment. This could be perfect for http://crowdcrafting.org/ or we could just put in a google spreadsheet and ask people to jump in.
Just grab the CSV from https://github.com/okfn/publicbodies/blob/master/data/raw/eu.csv and start updating the description field ...
Suggest we set up some kind of autopush to nomenklatura ...
There should be a field in the CSV schema to store the local identifier code for a public body, if available. This would make it possible to later create a global identifier (as per #41) and also to keep track of bodies that change names.
The web application should support internationalization.
I also suggest we create a project to localize it in Transifex.
That should help users of other languages to browse for public bodies in their native language.