codeforsanjose / open-disclosure

A web app to track campaign finances for the General Election (November 3, 2020) in San Jose & South Bay California

License: GNU General Public License v3.0

JavaScript 56.07% Python 24.22% Dockerfile 0.81% SCSS 18.90%
campaign campaign-finance react python elections candidates ballot san-jose

open-disclosure's Introduction

Open Disclosure

A web app to track campaign finances for the California Primary Election (March 3, 2020) and the General Election (November 3, 2020). The goal of Open Disclosure is to help voters understand which individuals and Political Action Committees (PACs) are donating money to candidates and measures. Are the donors from the same jurisdiction (city/county/state) as the candidate's intended office, or from outside it? Are the donors individuals or PACs? What is the donation history of each PAC?

View the currently deployed version of our web app: https://open-disclosure.codeforsanjose.org/

We are inspired by Open Oakland's Open Disclosure: https://www.opendisclosure.io

Initially this project will cover City of San Jose elections and later will broaden to cover elections more widely.

Resources

California Election Information:

San Jose voters will vote on November 3, 2020 for 5 Councilmembers in Districts 2, 4, 6, 8 and 10. More information can be found here. This project aims to cover the finances for these candidates in Version 2, from January 2020.

The Presidential Primary Election is on March 3, 2020 in the state of California. There will be elections for:

  • President of the United States
  • United States Representative in Congress
  • California State Senator and Member of the State Assembly

After the primary, the general election will be on November 3, 2020. More information here

Frontend Development setup

  1. Install Docker Desktop.

  2. Open and start Docker.

  3. Clone the project to your local machine.

$ git clone https://github.com/codeforsanjose/open-disclosure.git

  4. Go into the project folder.

$ cd open-disclosure/

  5. Build the Docker image for the UI.

$ docker compose build ui

  6. Run the Docker image to start local development.

$ docker compose up ui

  7. Open the web page at http://localhost:8000.

How to Launch the Scraper

MacOS:

Install Python3.8 for MacOS

% cd data_pipeline/scraper
% virtualenv env
% source env/bin/activate

(env) % python3 -m pip install -r requirements.txt

(env) % python ./scraper.py

Windows:

Install Python3.8 for Windows

% cd data_pipeline/scraper
% virtualenv --system-site-packages -p python3 ./venv
% .\venv\Scripts\activate

(venv) % python3 -m pip install -r requirements.txt
(venv) % python3 scraper.py

The examples above use virtualenv to create an isolated working environment, so the scraper's dependencies do not interfere with other Python applications on your machine.

How to Launch Scraper Post-Processor

Install Python3.8 for MacOS

% cd data_pipeline/data_processing
% virtualenv env
% source env/bin/activate

(env) % python3 -m pip install -r requirements.txt

(env) % python3 aggregatedcsvtoredis.py

Deploy to Prod

First, gain access to the CFSJ AWS account. You will also want to configure the CLI at this point. You can contact Darren P. or Ryan W. for help with this.

Once you have the desired code changes, use the Dockerfile to build a new image:

docker build --platform=linux/amd64 .

Then, follow This Guide to push the image to ECS.

Finally, stop any currently active tasks associated with the service (found here). This will cause new tasks to be automatically started using the newly deployed docker image.

You can access container logs if you run into any issues here.

How to Contribute

Find an issue and assign yourself

Inspired by Open Oakland's Open Disclosure
Made with <3 by Code for San José

open-disclosure's People

Contributors

abhishekmangla, alessandro-pianetta, cmatthey, darpham, ejanzer, erikavasnormandy, gardenfiend138, geleazar1000111, giftofgrub, jeffersonken19, jhung007, jmstudiosjoe, krammy19, logan-dang, nomad37, rkiddy, rwalek668, smellslikecake, stevenwuzz, sunnymui, tdw78, ychoy

open-disclosure's Issues

Migrate data deduplication into python library

We want to be able to run this grouped together with the data scraping/preprocessing. We also want to be able to run this separately from the other steps in the workflow.

This will be complete when:
data_processing2.py is converted to a Python class
An executable is created to run data_processing2 on its own
The execution of data_processing2.py is linked into the larger data pipeline through a classmethod call.
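
One possible shape for these three criteria, as a minimal sketch. The class name `Deduplicator`, the `run` classmethod, the exact-duplicate rule, and the CSV command-line interface are all assumptions made for illustration; they are not the contents of data_processing2.py.

```python
import csv
import sys


class Deduplicator:
    """Hypothetical wrapper class; not the actual data_processing2.py code."""

    def __init__(self, rows):
        self.rows = list(rows)

    def deduplicate(self):
        """Return rows with exact duplicates removed, keeping first occurrences."""
        seen = set()
        unique = []
        for row in self.rows:
            # A row is identified by all of its (column, value) pairs.
            key = tuple(sorted(row.items()))
            if key not in seen:
                seen.add(key)
                unique.append(row)
        return unique

    @classmethod
    def run(cls, rows):
        """Single entry point that the larger pipeline could call."""
        return cls(rows).deduplicate()


def main(argv):
    """Standalone use (assumed interface): python dedupe.py input.csv output.csv"""
    in_path, out_path = argv[1], argv[2]
    with open(in_path, newline="") as f:
        rows = list(csv.DictReader(f))
    unique = Deduplicator.run(rows)
    with open(out_path, "w", newline="") as f:
        if unique:
            writer = csv.DictWriter(f, fieldnames=list(unique[0].keys()))
            writer.writeheader()
            writer.writerows(unique)


# Only run the command-line path when both file arguments are supplied.
if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv)
```

With this layout the pipeline would call `Deduplicator.run(rows)` after scraping, while the same file can still be run by itself against a CSV.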

Fall back to mock data if the api server isn't available

When I added the apiserver plugin I made it so that you have to have the Flask server running at localhost:5000 or else the entire gatsby site will fail to build, which is pretty inconvenient for people working on the frontend, to say the least. It seems like this is because the gatsby plugin I'm using doesn't fail gracefully if it hits an error when fetching the data - I may have to change plugins or write my own to handle that case better.

Get frontend ready to deploy

Issues we want to fix before deploying the frontend:

  • Switch all candidate UI over to using TotalRCPT instead of TotalFunding (Funding includes LOAN, but this is inconsistent with aggregate data shown elsewhere) #172
  • Show a minimum "sliver" in the bar chart if the % is too small #170
  • Update text for Committees section on Candidates page (committees are not owned by candidate) #170
  • Remove See All Contributions link for now #170
  • Add SJ to geo breakdown #170
  • Scroll issue with nav bar #168
  • Remove 'get notified of updates' in noData.js
  • Fix About Us page

install and run of scraper gives an error

I have been setting up tests for things I do using aws boxes and auto-load scripts. It looks as though this is needed here. I will get on it.

cheers - ray

Just FYI:

 % git clone https://github.com/codeforsanjose/open-disclosure.git
 Cloning into 'open-disclosure'...
 remote: Enumerating objects: 506, done.
 remote: Counting objects: 100% (506/506), done.
 remote: Compressing objects: 100% (325/325), done.
 remote: Total 937 (delta 258), reused 385 (delta 167), pack-reused 431
 Receiving objects: 100% (937/937), 11.66 MiB | 3.30 MiB/s, done.
 Resolving deltas: 100% (441/441), done.
 %
 % cd open-disclosure
 %
 % cd data_pipeline/scraper
 ray@rrk scraper % virtualenv env
 created virtual environment CPython3.7.3.final.0-64 in 428ms
   creator CPython3macOsFramework(dest=/Users/ray/Projects/open-disclosure/data_pipeline/scraper/env, clear=False, global=False)
   seeder FromAppData(download=False, pip=latest, setuptools=latest, wheel=latest, via=copy, app_data_dir=/Users/ray/Library/Application Support/virtualenv/seed-app-data/v1.0.1)
   activators BashActivator,CShellActivator,FishActivator,PowerShellActivator,PythonActivator,XonshActivator
 %
 % source env/bin/activate
 (env) %
 (env) % python3 -m pip install chromedriver_binary webdriver-manager selenium
 Collecting chromedriver_binary
   Downloading chromedriver-binary-85.0.4183.38.0.tar.gz (3.6 kB)
 Collecting webdriver-manager
   Downloading webdriver_manager-3.2.1-py2.py3-none-any.whl (16 kB)
 Collecting selenium
   Using cached selenium-3.141.0-py2.py3-none-any.whl (904 kB)
 Collecting crayons
   Downloading crayons-0.3.1-py2.py3-none-any.whl (4.6 kB)
 Collecting configparser
   Using cached configparser-5.0.0-py3-none-any.whl (22 kB)
 Collecting requests
   Using cached requests-2.24.0-py2.py3-none-any.whl (61 kB)
 Collecting urllib3
   Using cached urllib3-1.25.10-py2.py3-none-any.whl (127 kB)
 Collecting colorama
   Using cached colorama-0.4.3-py2.py3-none-any.whl (15 kB)
 Collecting certifi>=2017.4.17
   Using cached certifi-2020.6.20-py2.py3-none-any.whl (156 kB)
 Collecting chardet<4,>=3.0.2
   Using cached chardet-3.0.4-py2.py3-none-any.whl (133 kB)
 Collecting idna<3,>=2.5
   Using cached idna-2.10-py2.py3-none-any.whl (58 kB)
 Building wheels for collected packages: chromedriver-binary
   Building wheel for chromedriver-binary (setup.py) ... done
   Created wheel for chromedriver-binary: filename=chromedriver_binary-85.0.4183.38.0-py3-none-any.whl size=7702049 sha256=1b33b906747b5da1f6ff8af7eb734757b93eb4361672e6face9620248b857d19
   Stored in directory: /Users/ray/Library/Caches/pip/wheels/ed/9c/7c/81565815d07eb410cf5e63f896b5cbbcb012c690f1952cf00e
 Successfully built chromedriver-binary
 Installing collected packages: chromedriver-binary, colorama, crayons, configparser, certifi, urllib3, chardet, idna, requests, webdriver-manager, selenium
 Successfully installed certifi-2020.6.20 chardet-3.0.4 chromedriver-binary-85.0.4183.38.0 colorama-0.4.3 configparser-5.0.0 crayons-0.3.1 idna-2.10 requests-2.24.0 selenium-3.141.0 urllib3-1.25.10 webdriver-manager-3.2.1
 WARNING: You are using pip version 20.1.1; however, version 20.2 is available.
 You should consider upgrading via the '/Users/ray/Projects/open-disclosure/data_pipeline/scraper/env/bin/python3 -m pip install --upgrade pip' command.
 % 
 (env) % 
 (env) % python ./scraper.py
 
   File "./scraper.py", line 258
     self.driver, search_page_num, self.BALLOT_TYPE
        ^
 SyntaxError: invalid syntax

Fix console warnings

To reduce noise and ensure new warnings/errors aren't introduced, we need to clean up the existing console warnings.

Add Candidate Name field into Excel Sheets

Today, candidate names are not associated with their respective committees in the Excel spreadsheet returned by the city website. The ask here is to scrape this data and add it as an additional column in the Excel sheet, with values that match the candidate name found in the other columns. This will allow us to join on candidate name across different types of entries.
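
The column addition could look roughly like the sketch below, which works on rows already loaded as dictionaries. The field names `Filer_NamL` and `Candidate_Name` and the committee-to-candidate mapping are hypothetical; the real spreadsheet columns and the scraped mapping may differ.

```python
def add_candidate_column(rows, committee_to_candidate,
                         committee_field="Filer_NamL",
                         candidate_field="Candidate_Name"):
    """Return copies of rows with a candidate-name column added.

    committee_to_candidate is assumed to be a dict scraped separately,
    mapping a committee name to the candidate it supports. Committees
    without a known candidate get an empty string.
    """
    enriched_rows = []
    for row in rows:
        enriched = dict(row)  # copy so the input rows are left untouched
        committee = row.get(committee_field, "")
        enriched[candidate_field] = committee_to_candidate.get(committee, "")
        enriched_rows.append(enriched)
    return enriched_rows
```

Leaving unmatched committees blank, rather than raising an error, keeps the join usable while the mapping is still incomplete.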

Ballot Measures placeholders need to be updated with real data

scraper error, just reporting, not investigated yet.

Here is what I did to get here.
I will investigate. Just wanted to get this in.
thanx - ray

 % virtualenv env
 % source env/bin/activate
 (env) %
 (env) % python3 -m pip install chromedriver_binary
 (env) % python3 -m pip install webdriver-manager
 (env) % python3 -m pip install selenium
 (env) % python ./scraper.py

 data/Ballot_Measure

 [WDM] - Current google-chrome version 83.0.4103
 [WDM] - Trying to download new driver from http://chromedriver.storage.googleapis.com/83.0.4103.39/chromedriver_mac64.zip
 [WDM] - Unpack archive /Users/ray/.wdm/drivers/chromedriver/83.0.4103.39/mac64/chromedriver.zip

 Traceback (most recent call last):
   File "./scraper.py", line 222, in <module>
     s.scrape()
   File "./scraper.py", line 207, in scrape
     self.website.closeErrorDialog(self.driver)
   File "./scraper.py", line 90, in closeErrorDialog
     driver.find_element_by_id(self.ERROR_DIALOG_BUTTON_ID).click()
   File "/Users/ray/Projects/open-disclosure/data_pipeline/scraper/env/lib/python3.7/site-packages/selenium/webdriver/remote/webelement.py", line 80, in click
     self._execute(Command.CLICK_ELEMENT)
   File "/Users/ray/Projects/open-disclosure/data_pipeline/scraper/env/lib/python3.7/site-packages/selenium/webdriver/remote/webelement.py", line 633, in _execute
     return self._parent.execute(command, params)
   File "/Users/ray/Projects/open-disclosure/data_pipeline/scraper/env/lib/python3.7/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
     self.error_handler.check_response(response)
   File "/Users/ray/Projects/open-disclosure/data_pipeline/scraper/env/lib/python3.7/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
     raise exception_class(message, screen, stacktrace)
 selenium.common.exceptions.ElementNotInteractableException: Message: element not interactable
   (Session info: chrome=83.0.4103.61)

Define the Schema which will connect frontend to backend

We need to send the data from the backend SQL database to the frontend in a defined schema. This will contain all of the data that the UI uses, minus the Tableau visualizations.

The output of this is a document detailing what the API looks like, preferably reviewed by at least one frontend person.

This is blocking the actual creation of the API in code.
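
To make the discussion concrete, here is a minimal sketch of how such a schema could be written down and serialized. Every class and field name (`Candidate`, `Election`, `total_contributions`, and so on) is a placeholder assumption, not the agreed schema; the reviewed document would define the real one.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Candidate:
    # Hypothetical fields; the real schema is still to be defined.
    name: str
    office: str
    total_contributions: float


@dataclass
class Election:
    # Hypothetical fields; the real schema is still to be defined.
    title: str
    date: str  # ISO 8601 date string, e.g. "2020-11-03"
    total_contributions: float
    candidates: List[Candidate] = field(default_factory=list)


def to_json(election):
    """Serialize an Election, including nested candidates, to a JSON string."""
    return json.dumps(asdict(election), indent=2)
```

Writing the schema as typed classes gives frontend reviewers one place to check field names and types before any endpoint exists.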

Create prototype visualizations of top X contributions

Using the liccardo_2018.csv dataset saved here: https://drive.google.com/drive/folders/1RwaLU3564B60yMjCZKl2PCRK5-gKysDn create one or more visualizations that show the top 5, 10 or other relevant number of contributions to Sam Liccardo's campaign. Note that only the transactions with Form_Type = A are contributions. See the 460 Codebook (https://docs.google.com/document/d/1N_NCdYoBODUJwagAGNPKiQUGdHje11jjgZoQuxZY8I0/edit) for further details.
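
Before charting, the data needs to be filtered and ranked. A minimal sketch of that step follows; `Form_Type` comes from the description above, while the amount column name `Tran_Amt1` is an assumption that should be checked against the 460 Codebook.

```python
def top_contributions(rows, n=5, amount_field="Tran_Amt1"):
    """Return the n largest contributions from a list of row dictionaries.

    Only rows with Form_Type == "A" count as contributions.
    amount_field is an assumed column name holding the dollar amount.
    """
    contributions = [r for r in rows if r.get("Form_Type") == "A"]
    ranked = sorted(contributions,
                    key=lambda r: float(r[amount_field]),
                    reverse=True)
    return ranked[:n]
```

The rows could come from `csv.DictReader` over liccardo_2018.csv, and the returned list can be passed straight to whichever charting tool is used.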

You may want to use bars, bubble charts, color coding etc - the possibilities are endless!

Footer Update

Some of this footer language was copied from Open Oakland's website. We should update the language so it more closely reflects our sources. This project, to my understanding, isn't made in partnership with the City Clerk's office, so I recommend changing the language.

  • Update this line:

    BEFORE:
    Brought to you by Open San José and San José's Public Ethics Commission

    AFTER:
    Made by Code for San José

  • Update this line:

    BEFORE:
    Campaign finance data provided by the City of San José Public Ethics Commission Public Portal for Campaign Finance and Lobbyist Disclosure. Candidate and ballot measure information gathered from information provided to the Santa Clara County Registrar of Voters by the City of San José.

    AFTER:
    Campaign finance data provided by the City of San José's Public Access Portal (CampaignDocs eRetrieval). Candidate and ballot measure information gathered from information provided to the Santa Clara County Registrar of Voters.

  • Add Code for San José logo

Create prototype visualizations comparing total contributions across candidates

Using the district9_2018.csv dataset found here: https://drive.google.com/drive/folders/1RwaLU3564B60yMjCZKl2PCRK5-gKysDn create one or more visualizations which compare the total contributions received by the different candidates standing for council member in district 9. Note that only transactions with Form_Type = A are contributions. See the 460 codebook for more details: https://docs.google.com/document/d/1N_NCdYoBODUJwagAGNPKiQUGdHje11jjgZoQuxZY8I0/edit

Create prototype visualizations comparing contributions across candidates over time

Using the district9.csv dataset found here: https://drive.google.com/drive/folders/1RwaLU3564B60yMjCZKl2PCRK5-gKysDn create one or more visualizations which compare how the contributions received by the different candidates standing for council member in district 9 changed over time. Note that only transactions with Form_Type = A are contributions. See the 460 codebook for more details: https://docs.google.com/document/d/1N_NCdYoBODUJwagAGNPKiQUGdHje11jjgZoQuxZY8I0/edit
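
A time-series chart needs the contributions bucketed by period first. The sketch below totals contributions per candidate per month; the column names `Candidate`, `Tran_Date`, and `Tran_Amt1` and the `MM/DD/YYYY` date format are assumptions about district9.csv, not confirmed details.

```python
from collections import defaultdict
from datetime import datetime


def monthly_totals(rows, candidate_field="Candidate", date_field="Tran_Date",
                   amount_field="Tran_Amt1", date_format="%m/%d/%Y"):
    """Sum contributions (Form_Type == "A") per candidate per calendar month.

    Returns {candidate: {"YYYY-MM": total}} with months in ascending order.
    All field names and the date format are assumed, not confirmed.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for row in rows:
        if row.get("Form_Type") != "A":
            continue  # skip expenditures and other non-contribution rows
        month = datetime.strptime(row[date_field], date_format).strftime("%Y-%m")
        totals[row[candidate_field]][month] += float(row[amount_field])
    return {cand: dict(sorted(months.items())) for cand, months in totals.items()}
```

Each candidate's dictionary maps directly onto one line in a line chart, with the month keys as the x-axis.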

Create Flask API for Data Pipeline

New sub-folder 'data_api' created under 'data_pipeline'.

This API will:

  • Scrape the data on either a certain cadence or small sub-check (new forms to download)
  • Complete the data processing/aggregation then load the data into a MySQL database

It will additionally serve requests for certain election data (this will require new tables to be uploaded into MySQL):

  • General election info (total $ amount, etc.)
  • Candidate Info (Candidate name, total contributions, etc.)

Candidate page UI issues/improvements

  • A - SideNav container padding causes H3s to be misaligned (the candidate's name should be aligned with the section headers beneath)

  • B - The 'See all contributions' text doesn't align properly with the arrow, doesn't match arrow color, and also doesn't do anything when you click on it (where should it go?) - Design is fixed, but currently no page to link it to

  • C - We're still missing profile photos. Should we ask for a default profile photo to use in case we can't get them all?

  • D - Voter's Edge link doesn't go anywhere, that needs to be added to candidate JSON file

  • E - Candidate description/bio (what should go there?) needs to be added to candidate JSON file

  • F - Fix positioning/wrap of bio links (row or col?)

  • G - Total contributions currently only includes RCPT. Do we want to include LOAN? If so, we need to update the aggregations on candidate and election for consistency (API changes needed).

  • H - We have a lot of expenditure data, so I'm currently only showing top 4 categories. Should we make this expandable?

  • I - Need more padding between sections

  • J - Text is wrapping too much in the chart labels, can we adjust the width/margin for the label to make it.. not do that?

  • K - Breakdown by region is currently missing the "In San José" category - need to get that from the API (when we do, we need to decide - should "In California" also include SJ, or no?)

  • L - Both of the funding breakdowns are currently only including RCPT. Should they also include LOAN?

  • M - What should the office title be - 'District 4', 'District 4 Representative', 'Council Member District 4'? We currently have all 3 in different places, I think.
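
Items G and L both turn on whether LOAN is counted alongside RCPT. A small sketch of computing both totals side by side may help the decision; the names `TotalRCPT` and `TotalFunding` follow the terms used in this project's issues, while the `Rec_Type` and `Amount` field names are assumptions.

```python
def funding_totals(transactions, type_field="Rec_Type", amount_field="Amount"):
    """Compute totals with and without loans from transaction dictionaries.

    TotalRCPT counts only RCPT rows; TotalFunding is assumed here to be
    RCPT plus LOAN. Other record types are ignored. Field names are
    hypothetical.
    """
    total_rcpt = sum(float(t[amount_field]) for t in transactions
                     if t[type_field] == "RCPT")
    total_loan = sum(float(t[amount_field]) for t in transactions
                     if t[type_field] == "LOAN")
    return {"TotalRCPT": total_rcpt, "TotalFunding": total_rcpt + total_loan}
```

Whichever total is chosen, using the same one for the candidate, election, and breakdown aggregations avoids the inconsistency described in G.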

Create prototype visualizations of contributions by location

Using the liccardo_2018.csv dataset saved here: https://drive.google.com/drive/folders/1RwaLU3564B60yMjCZKl2PCRK5-gKysDn create one or more visualizations that show how the contributions received by Sam Liccardo's campaign are distributed by location. Note that only the transactions with Form_Type = A are contributions. See the 460 Codebook (https://docs.google.com/document/d/1N_NCdYoBODUJwagAGNPKiQUGdHje11jjgZoQuxZY8I0/edit) for further details.
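
A location chart needs each contribution assigned to a region first. The sketch below uses three mutually exclusive buckets, which is an assumption; the column names `Tran_City`, `Tran_State`, and `Tran_Amt1` are also assumed and should be checked against the 460 Codebook.

```python
from collections import defaultdict


def classify_region(city, state):
    """Assign a contribution to one of three mutually exclusive regions."""
    city = (city or "").strip().lower()
    state = (state or "").strip().upper()
    if state == "CA" and city in {"san jose", "san josé"}:
        return "In San José"
    if state == "CA":
        return "In California"
    return "Out of state"


def region_totals(rows, city_field="Tran_City", state_field="Tran_State",
                  amount_field="Tran_Amt1"):
    """Sum contribution amounts (Form_Type == "A") per region.

    All three field names are assumed column names.
    """
    totals = defaultdict(float)
    for row in rows:
        if row.get("Form_Type") != "A":
            continue
        region = classify_region(row.get(city_field), row.get(state_field))
        totals[region] += float(row[amount_field])
    return dict(totals)
```

If the California figure should also include San José contributions, add the two buckets together at display time rather than changing the classification.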

Create prototype visualizations of the top X expenditures

Using the liccardo_2018.csv dataset saved here: https://drive.google.com/drive/folders/1RwaLU3564B60yMjCZKl2PCRK5-gKysDn create one or more visualizations that show the top 5, 10 or other relevant number of expenditures by Sam Liccardo's campaign. Note that only the transactions with Form_Type = E are expenditures. See the 460 Codebook (https://docs.google.com/document/d/1N_NCdYoBODUJwagAGNPKiQUGdHje11jjgZoQuxZY8I0/edit) for further details.

You may want to look at expenditures individually, or grouped into categories, or grouped by recipients, or other possibilities.

San José live election section needs updates

  • San José live election snapshot - update the source so it's the latest data, not just July 19, 2020.
  • Update the placeholder "XXX" so it reflects the total donations from the city of San José
  • Consider changing the language from "donations from the city of San José" to "donations from San José residents". Otherwise, it sounds like the government of San José is donating money to candidates.
  • "4 candidates running"--> should be changed to "4 candidates running in 2 city council races". Otherwise it's misleading. There are many school board, and other open elections in San José.

Create prototype visualizations comparing contributions across candidates by number of donors

Using the district9.csv dataset found here: https://drive.google.com/drive/folders/1RwaLU3564B60yMjCZKl2PCRK5-gKysDn create one or more visualizations which compare the number of people / companies contributing to the different candidates who stood for council member in district 9. Note that only transactions with Form_Type = A are contributions. See the 460 codebook for more details: https://docs.google.com/document/d/1N_NCdYoBODUJwagAGNPKiQUGdHje11jjgZoQuxZY8I0/edit
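
Counting donors differs from summing amounts because one donor may give several times. A minimal sketch of the counting step follows, treating a donor as a unique (last name, first name) pair; that identity rule and the column names `Candidate`, `Tran_NamL`, and `Tran_NamF` are assumptions.

```python
from collections import defaultdict


def donor_counts(rows, candidate_field="Candidate",
                 donor_fields=("Tran_NamL", "Tran_NamF")):
    """Count distinct donors per candidate among contributions (Form_Type == "A").

    A donor is identified by the normalized values of donor_fields, so
    repeat donations from the same name are counted once. Field names
    are assumed, not confirmed.
    """
    donors = defaultdict(set)
    for row in rows:
        if row.get("Form_Type") != "A":
            continue
        identity = tuple((row.get(f) or "").strip().lower() for f in donor_fields)
        donors[row[candidate_field]].add(identity)
    return {candidate: len(names) for candidate, names in donors.items()}
```

Name matching this simple will miss spelling variants of the same donor, so the counts are best read as an upper bound.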

Create prototype visualizations comparing contributions across candidates by location

Using the district9.csv dataset found here: https://drive.google.com/drive/folders/1RwaLU3564B60yMjCZKl2PCRK5-gKysDn create one or more visualizations which compare the location of the contributions received by the different candidates standing for council member in district 9. Note that only transactions with Form_Type = A are contributions. See the 460 codebook for more details: https://docs.google.com/document/d/1N_NCdYoBODUJwagAGNPKiQUGdHje11jjgZoQuxZY8I0/edit

Confirm user stories for MVP (Version 1)

These are draft user stories for MVP (version 1) to complete by Dec 31, 2019.

  1. A user can see a list of election categories:
       • President of the United States
       • United States Representative in Congress
       • California State Senator and Member of the State Assembly
  2. The user can select one category.
  3. The user can select one candidate.
  4. The user will see a list of donors, ordered from highest donation to lowest, with information about each donor such as their job and donation amount.
  5. The user can see a list of where the donors are from: San Jose, Santa Clara County, California, Other States.
Helen @SmellsLikeCake, I'd love to get your feedback on these draft user stories for MVP.

Once we get these confirmed, we can create separate GitHub issues for each of the stories.

Add mid breakpoints for pages

The site can look a little wonky when it's transitioning between full size and mobile layouts; we should add in another break point at around. The break points might differ between pages due to differences in content; ex index.js's breakpoint will be around 1300px, before the hero image touches the CTA buttons.

Create prototype visualizations of expenditures over time

Using the liccardo_2018.csv dataset saved here: https://drive.google.com/drive/folders/1RwaLU3564B60yMjCZKl2PCRK5-gKysDn create one or more visualizations that show the expenditure of Sam Liccardo's campaign over time. Note that only the transactions with Form_Type = E are expenditures. See the 460 Codebook (https://docs.google.com/document/d/1N_NCdYoBODUJwagAGNPKiQUGdHje11jjgZoQuxZY8I0/edit) for further details.

You may want to experiment with grouping / color coding different types of expenditures.
