john-sandall / maven
Maven provides easy access to open datasets in both raw and model-ready formats.
License: Apache License 2.0
codespell --ignore-words-list="humber,ons" --quiet-level=2
./maven/datasets/coronavirus/README.md:7: authorative ==> authoritative
./maven/datasets/coronavirus/README.md:7: authorative ==> authoritative
./maven/datasets/general_election/README.md:7: authorative ==> authoritative
./maven/datasets/general_election/README.md:7: authorative ==> authoritative
Should be similar to #19
Do as the thing says. Cover both get.py and existing pipelines.
Focus for 0.1 is to get it to a releasable state where I'm not entirely ashamed of how basic it all is. Task priority is on the Core Projects board.
Core
Pipeline design
New pipelines
Holding pen for things I'd like to look at integrating but aren't quite ready to go into a release yet.
Consider adding CLI functionality
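One possible shape for such a CLI, sketched with the standard library's argparse. The program name, the `get` subcommand, the `--data-directory` flag, and the dataset identifier are all hypothetical choices for illustration, not an existing interface.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a hypothetical command-line parser with a single `get` subcommand."""
    parser = argparse.ArgumentParser(
        prog="maven",
        description="Fetch open datasets in raw and model-ready formats.",
    )
    subcommands = parser.add_subparsers(dest="command", required=True)

    # `get <dataset>` would download and process one named dataset.
    get = subcommands.add_parser("get", help="download and process a dataset")
    get.add_argument("dataset", help="dataset identifier (hypothetical format)")
    get.add_argument(
        "--data-directory",
        default="./data/",
        help="where to store the downloaded files",
    )
    return parser


# Parse an example command line instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["get", "general-election/UK/2015/results"])
print(args.command, args.dataset, args.data_directory)
```

Keeping the parser in its own function makes it easy to test without touching the network or the filesystem.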
The currently processed 2015 results data isn't even that nice. Integrate further cleaning and ID mapping from the SixFifty model.
This currently uses the Electoral Commission's results dataset. Earlier this year, they stopped hosting results, which are now officially published by the House of Commons Library at parliament.uk. We need to bring this code up to date so it uses the new data (which is in a different format!) for the 2015 results: https://researchbriefings.parliament.uk/ResearchBriefing/Summary/CBP-7186
The goal is the same: we want to turn this into a dataset in the same format the current code produces, with columns something like:
Press Association ID Number, Constituency ID, Constituency Name, Constituency Type, County, Region ID, Region, Country, Election Year, Electorate, Valid Votes, con, lab, ld, ukip, grn, snp, pc, other, con_pc, lab_pc, ld_pc, ukip_pc, grn_pc, snp_pc, pc_pc, other_pc, geo, winner
The new dataset also switches to using just ons_id (which is the same as Constituency ID in the previous cleaned results) rather than Press Association ID Number. This is fine; you can drop that column.
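A minimal sketch of the target transformation, assuming pandas and assuming the new file has already been loaded with one vote-count column per party. The raw column names other than ons_id (constituency_name, valid_votes), the example rows, and the choice to store the _pc columns as fractions rather than percentages are assumptions for illustration, not taken from the real file.

```python
import pandas as pd

PARTIES = ["con", "lab", "ld", "ukip", "grn", "snp", "pc", "other"]


def to_model_ready(raw: pd.DataFrame) -> pd.DataFrame:
    """Rename ID columns, add vote shares, and label the winning party."""
    df = raw.rename(
        columns={
            "ons_id": "Constituency ID",
            "constituency_name": "Constituency Name",  # assumed raw name
            "valid_votes": "Valid Votes",  # assumed raw name
        }
    )
    # The Press Association ID is no longer needed; drop it if it is present.
    df = df.drop(columns=["Press Association ID Number"], errors="ignore")
    # Vote share per party, as a fraction of valid votes (assumed convention).
    for party in PARTIES:
        df[f"{party}_pc"] = df[party] / df["Valid Votes"]
    # The winner is the party column holding the most votes in each row.
    df["winner"] = df[PARTIES].idxmax(axis=1)
    return df


# Made-up rows purely to exercise the function; these are not real results.
raw = pd.DataFrame(
    {
        "ons_id": ["X00000001", "X00000002"],
        "constituency_name": ["Example North", "Example South"],
        "valid_votes": [1000, 2000],
        "con": [400, 500],
        "lab": [300, 900],
        "ld": [100, 200],
        "ukip": [50, 100],
        "grn": [50, 100],
        "snp": [0, 0],
        "pc": [0, 0],
        "other": [100, 200],
    }
)
clean = to_model_ready(raw)
print(clean[["Constituency ID", "con_pc", "lab_pc", "winner"]])
```

The remaining columns in the list above (region, country, electorate and so on) would need their own mapping once the real column names are known.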
gov.uk
DailyIndicators.xlsx
dataDailyConfirmedCases.xlsx
dataNHSR_Cases_table.csv
dataCountyUAs_cases_table.csv
dataIndividual/historical aggregators
We changed some things (e.g. removing PANO in favour of ONS codes). The UK2015Model is also not idiomatic pandas in many places, so this is a good opportunity to clean it up.
Add docs
Similar to UK2015Model, I have a notebook I can share which implements this.
Design inspiration: gensim's downloader functions
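gensim's downloader module exposes a `load(name)` function alongside an `info()` function for discovering what is available. A rough sketch of how a similar name-based interface might look here, using a small in-memory registry; every name below (`register`, `info`, `load`, the `example/toy` dataset) is hypothetical and nothing is downloaded.

```python
from typing import Callable, Dict, List

# Hypothetical registry mapping dataset names to loader functions.
_DATASETS: Dict[str, Callable[[], List[dict]]] = {}


def register(name: str):
    """Decorator that records a loader function under a dataset name."""
    def decorator(func: Callable[[], List[dict]]):
        _DATASETS[name] = func
        return func
    return decorator


def info() -> List[str]:
    """List the names of all registered datasets."""
    return sorted(_DATASETS)


def load(name: str) -> List[dict]:
    """Look up a dataset by name and return whatever its loader produces."""
    try:
        loader = _DATASETS[name]
    except KeyError:
        raise KeyError(f"Unknown dataset {name!r}; available: {info()}") from None
    return loader()


@register("example/toy")
def _toy() -> List[dict]:
    # Stand-in for a real pipeline that would fetch and clean remote files.
    return [{"id": 1, "value": "a"}, {"id": 2, "value": "b"}]


rows = load("example/toy")
print(info(), rows)
```

A registry like this keeps each pipeline self-contained while giving users a single entry point.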
Also add the additional steps recommended at https://python-packaging.readthedocs.io/
Focus for 0.2 will be to bring in some more Python best practices. Feeling quite inspired after catching up with Cheuk; there are a lot of things going on in picknmix that are really nice.