
un_treaties's People

Contributors

dhesse, gahjelle, patechoc

un_treaties's Issues

Trying out D3.js for visual data exploration

Visual Data Exploration

Once the data is available as a JSON file and its quality has improved after cleansing, we will try various visualizations to help understand the dataset (starting with D3.js, but not limited to it; DC.js with crossfilter.js is another option).

Definition of Done

  • Trying out various D3 representations
  • Document these attempts somewhere (as Jupyter notebooks, or as standalone HTML pages, ...)

Next steps

  • Moving from visual data exploration to a visual prototype

Get creative around data visualization and data dissemination

Ideas/sketches to visualize interesting information

There are ways to present such formal and boring data in beautiful and fun ways.

For this, we have some creative ideas to make the information accessible and fun.

Definition of Done

  • Collect ideas, sketches, and video attempts here
  • Refine the ideas that look promising

Next steps

  • Reproduce these attempts for as many treaties/countries as possible.

Create a CSV-file

  • Remove countries from header
  • Normalize header names
  • Remove the footnote numbers after country names
  • Move all country names to the "participant" column (example: in one place they end up under the header: States which have made declarations under Article 36, paragraph 2 of the Statute of the International Court of Justice or whose declarations made under Article 36, paragraph 2, of the Statute of the Permanent Court of International Justice are deemed to be acceptances of the compulsory jurisdiction of the International Court of Justice. (See paragraph 5 of Article 36 of the Statute of the International Court of Justice.) (State names which appear in backets are States having made declarations recognizing as compulsory the jurisdiction of the International Court of Justice for specified periods of time and which have been terminated or have since expired. For an explanation thereof, see endnotes at the end of this chapter.)10ParticipantAustraliaAustriaBarbadosBelgiumBolivia8Botswana[Brazil8]BulgariaCambodiaCameroonCanada[Colombia5,11]Costa RicaCôte d'IvoireCyprusDemocratic Republic of the Congo12DenmarkDjiboutiDominicaDominican Republic11Egypt[El Salvador8]Equatorial GuineaEstoniaFinlandFrance4GambiaGeorgiaGermanyGreece[Guatemala8]GuineaGuinea-BissauHaiti11HondurasHungaryIndiaIrelandIsrael3ItalyJapanKenyaLesothoLiberiaLiechtensteinLithuaniaLuxembourg11MadagascarMalawiMaltaMarshall IslandsMauritiusMexico[Nauru8]NetherlandsNew ZealandNicaragua11NigeriaNorwayPakistanPanama11ParaguayPeruPhilippinesPolandPortugal13RomaniaSenegal[Serbia2,6]SlovakiaSomalia[South Africa7]SpainSudanSurinameSwazilandSwedenSwitzerland[Thailand8]Timor-LesteTogo[Turkey8]UgandaUnited Kingdom of Great Britain and Northern Ireland[United States of America9]Uruguay11)
  • Remove the [ ] that are sometimes used around country names
  • Split the "date" variable into date and location
  • "date of action" in one column, "type of action" in one column
  • Split rows that contain registration data to avoid losing this data
  • Convert dates to a uniform format xx.xx.xxxx
  • Include coding of historical countries (pycountry supports this)
  • Try to parse countries that are missed by pycountry
  • Check why some pages are thrown out with "Found no participant table". Some of these have a "Participant 1" table, for instance
  • Accession (a) Succession (d) combined tables
  • Row for "no action" for all missing countries
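Two of the cleaning steps above (removing brackets and footnote numbers from country names, and making dates uniform) can be sketched as follows. This is only a minimal sketch: the function names are hypothetical, and the input date format `26 Jun 1945` is an assumption about what the crawled pages contain, not something confirmed in this issue.

```python
import re
from datetime import datetime

# Trailing footnote markers such as "8" or "5,11" (digits, commas, spaces).
_FOOTNOTES = re.compile(r"[\d,\s]+$")


def clean_participant(raw):
    """Return (name, was_bracketed) for a raw participant cell.

    Hypothetical helper: strips surrounding [ ] and trailing footnote
    numbers, and reports whether the name was bracketed so that the
    information is not lost.
    """
    name = raw.strip()
    bracketed = name.startswith("[") and name.endswith("]")
    if bracketed:
        name = name[1:-1]
    name = _FOOTNOTES.sub("", name).strip()
    return name, bracketed


def normalize_date(raw, input_format="%d %b %Y"):
    """Convert a date string to dd.mm.yyyy.

    The default input_format is an assumption; adjust it to the data.
    Raises ValueError when the string does not match the format.
    """
    return datetime.strptime(raw.strip(), input_format).strftime("%d.%m.%Y")


cleaned = [clean_participant(c) for c in ["Bolivia8", "[Colombia5,11]", "Côte d'Ivoire"]]
```

Keeping the "was bracketed" flag as a separate value means the brackets can be dropped from the name without discarding what they signified.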

Backend data exploration and functionalities

In parallel with the visualization, it is interesting to get answers to simple questions that focus either on a country or on a treaty. For example:

  • Which treaties did Norway sign?
  • More importantly, which ones did Norway not sign? (Country-focused type of question)
  • Which treaties that most countries signed did Norway not sign? This looks for some kind of outlier position.
  • Which countries didn't sign the Convention on the Elimination of All Forms of Discrimination against Women? (Treaty-focused type of question)
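The questions above could be answered with a few set operations. The sketch below assumes the cleaned data is a list of records with `treaty` and `participant` keys, and it treats any recorded action as "signed"; both are assumptions, as are all function names. The sample records are made up purely for illustration.

```python
from collections import defaultdict


def participants_by_treaty(records):
    """Build {treaty: set of participants} from a list of record dicts."""
    index = defaultdict(set)
    for rec in records:
        index[rec["treaty"]].add(rec["participant"])
    return dict(index)


def treaties_with(index, country):
    """Treaties in which the country appears as a participant."""
    return {t for t, members in index.items() if country in members}


def treaties_without(index, country):
    """Treaties in which the country does not appear."""
    return set(index) - treaties_with(index, country)


def countries_without(index, treaty, all_countries):
    """Countries from all_countries that do not appear in the treaty."""
    return set(all_countries) - index.get(treaty, set())


def outlier_treaties(index, country, all_countries, min_share=0.5):
    """Treaties the country is missing from although more than
    min_share of all_countries take part."""
    total = len(all_countries)
    return {
        t for t in treaties_without(index, country)
        if len(index[t]) / total > min_share
    }


# Made-up sample data, not real treaty information.
records = [
    {"treaty": "Treaty A", "participant": "Norway"},
    {"treaty": "Treaty A", "participant": "Sweden"},
    {"treaty": "Treaty B", "participant": "Sweden"},
    {"treaty": "Treaty B", "participant": "Denmark"},
    {"treaty": "Treaty C", "participant": "Denmark"},
]
all_countries = {"Norway", "Sweden", "Denmark"}
index = participants_by_treaty(records)
```

A treaty with no participants at all never shows up in `index`, so a real implementation would probably want a separate list of all treaties as well.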

Definition of Done

  • Build functions to answer such questions for any country, or for any treaty
  • Come up with more questions. (Can make new issues for each, or checkboxes here)

Next steps

  • Moving from visual data exploration to a visual prototype

Clean the json/csv and report when/why parsing fails

Data Cleansing after extraction

Once extracted, many treaties show missing or poorly parsed data. This work consists of finding some of these cases and trying to fix them.

Definition of Done

  • Identification of missing fields and explanations written in this issue as comments.
  • Improved crawling when possible.

Next steps

  • Data exploration & Visualization

Export data as JSON

Export the crawled data as JSON for easy processing by the front end.

Definition of Done

  • code implementation
  • check/complete documentation for running the extraction
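A minimal sketch of the export step, assuming the crawled data is already a list of plain dictionaries (the record layout and the function names are assumptions). `ensure_ascii=False` keeps names such as "Côte d'Ivoire" readable in the file.

```python
import json
from pathlib import Path


def export_json(records, path):
    """Write records (a list of dicts) to path as UTF-8 JSON."""
    with Path(path).open("w", encoding="utf-8") as fh:
        json.dump(records, fh, ensure_ascii=False, indent=2)


def load_json(path):
    """Read the records back, e.g. to check the export."""
    with Path(path).open(encoding="utf-8") as fh:
        return json.load(fh)
```

A front end built on D3.js could then load the resulting file directly.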

Next steps

  • ?

Split current crawler in two parts

The current get_data.py both crawls the site and tries to wrangle the data into a meaningful CSV.

We should split it into two parts: one script for simple crawling (downloading the data), and another script that parses the data and creates the structure/meaning.
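The split could look roughly like the sketch below: a crawl stage that only saves raw HTML to disk, and a parse stage that only reads those files. All names here (`crawl`, `parse`, `fetch_url`, the `raw_dir` layout) are hypothetical and not taken from get_data.py.

```python
from pathlib import Path
from urllib.request import urlopen


def fetch_url(url):
    """Download one page and return its text (assumes UTF-8 pages)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def crawl(urls, raw_dir, fetch=fetch_url):
    """Stage 1: save each page as <name>.html, skipping existing files.

    urls is {name: url}. No parsing happens here.
    """
    raw_dir = Path(raw_dir)
    raw_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for name, url in urls.items():
        target = raw_dir / f"{name}.html"
        if not target.exists():
            target.write_text(fetch(url), encoding="utf-8")
        saved.append(target)
    return saved


def parse(raw_dir, parse_html):
    """Stage 2: run parse_html on every saved page, without network access."""
    return {
        path.stem: parse_html(path.read_text(encoding="utf-8"))
        for path in sorted(Path(raw_dir).glob("*.html"))
    }
```

With this layout the parsing code can be rerun and improved many times without downloading the pages again.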
