codeforsanjose / open-disclosure

A web app to track campaign finances for the General Election (November 3, 2020) in San Jose & South Bay California

License: GNU General Public License v3.0

JavaScript 56.07% Python 24.22% Dockerfile 0.81% SCSS 18.90%

campaign campaign-finance react python elections candidates ballot san-jose

open-disclosure's Introduction

Open Disclosure

A web app to track campaign finances for the California Primary Election (March 3, 2020) and the General Election (November 3, 2020). The goal of Open Disclosure is to help voters understand who is donating money to candidates and measures, including which Political Action Committees (PACs) are involved. Are the donors from the same jurisdiction (city/county/state) as the candidate's intended office, or from outside it? Are the donors individuals or PACs? What is the donor history of the PAC(s)?

View the currently deployed version of our web app: https://open-disclosure.codeforsanjose.org/

We are inspired by Open Oakland's Open Disclosure: https://www.opendisclosure.io

Initially this project will cover City of San Jose elections and later will broaden to cover elections more widely.

Resources

California Election Information:

San Jose voters will vote on November 3, 2020 for 5 Councilmembers in Districts 2, 4, 6, 8 and 10. More information can be found here. This project aims to cover the finances for these candidates in Version 2, from January 2020.

The Presidential Primary Election is on March 3, 2020 in the state of California. There will be elections for:

President of the United States
United States Representative in Congress
California State Senator and Member of the State Assembly

After the primary, the general election will be on November 3, 2020. More information here

Frontend Development setup

Install Docker Desktop
Open and start Docker.
Clone the project to your local machine.

$ git clone https://github.com/codeforsanjose/open-disclosure.git

Go into the project folder.

$ cd open-disclosure/

Build Docker images.

$ docker compose build ui

Run Docker images to start local development

$ docker compose up ui

Open the web page at http://localhost:8000.

How to Launch the Scraper

MacOS:

Install Python3.8 for MacOS

% cd data_pipeline/scraper
% virtualenv env
% source env/bin/activate

(env) % python3 -m pip install -r requirements.txt

(env) % python ./scraper.py

Windows:

Install Python3.8 for Windows

% cd data_pipeline/scraper
% virtualenv --system-site-packages -p python3 ./venv
% .\venv\Scripts\activate

(venv) % python3 -m pip install -r requirements.txt
(venv) % python3 scraper.py

The examples above use virtualenv to create a clean, isolated working environment, so the scraper's dependencies do not interfere with other Python applications on your machine.

How to Launch Scraper Post-Processor

Install Python3.8 for MacOS

% cd data_pipeline/data_processing
% virtualenv env
% source env/bin/activate

(env) % python3 -m pip install -r requirements.txt

(env) % python3 aggregatedcsvtoredis.py

Deploy to Prod

First, gain access to the CFSJ AWS account. You will also want to configure the CLI at this point. You can contact Darren P. or Ryan W. for help with this.

Once you have the desired code changes, use the Dockerfile to build a new image:

docker build --platform=linux/amd64 .

Then, follow This Guide to push the image to ECS.

Finally, stop any currently active tasks associated with the service (found here). This will cause new tasks to be automatically started using the newly deployed docker image.

You can access container logs if you run into any issues here.

How to Contribute

Find an issue and assign yourself

Communicate with the team on Slack (channel: #open-disclosure).
Join our Slack
Attend a Code for San Jose civic hack night meetup: https://www.meetup.com/code-for-san-jose

Inspired by Open Oakland's Open Disclosure
Made with <3 by Code for San José

open-disclosure's Issues

Migrate data deduplication into python library

We want to be able to run this grouped together with the data scraping/preprocessing. We also want to be able to run this separately from the other steps in the workflow.

This will be complete when:
data_processing2.py is converted to a python class
An executable is created to run just the executable portion of data_processing2
The execution of data_processing2.py is linked into the larger data pipeline through a classmethod call.
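One possible shape for the three criteria above, as a minimal sketch only. The class name `Deduplicator`, its `run` classmethod, and the CSV-in, rows-out interface are assumptions for illustration and are not taken from data_processing2.py.

```python
import csv
import io
import sys


class Deduplicator:
    """Hypothetical wrapper for the deduplication step (all names are assumptions)."""

    def __init__(self, rows):
        self.rows = list(rows)

    def dedupe(self):
        """Drop exact duplicate rows while keeping first-seen order."""
        seen = set()
        unique = []
        for row in self.rows:
            key = tuple(sorted(row.items()))
            if key not in seen:
                seen.add(key)
                unique.append(row)
        return unique

    @classmethod
    def run(cls, csv_text):
        """Single entry point that a larger pipeline could call."""
        reader = csv.DictReader(io.StringIO(csv_text))
        return cls(reader).dedupe()


if __name__ == "__main__" and len(sys.argv) > 1:
    # Standalone use: python dedupe.py path/to/file.csv (file name is hypothetical).
    with open(sys.argv[1], newline="") as handle:
        print(len(Deduplicator.run(handle.read())), "unique rows")
```

With this layout the pipeline would call `Deduplicator.run(...)`, while the guarded block at the bottom lets the same file be run on its own.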

Add carousel component on mobile view of Homepage @ Candidate Info section

Update README and homepage with info about UI from OpenDisclosure.io

Since the UI is based on Open Oakland's Open Disclosure project https://www.opendisclosure.io/, we should reference it in our README and in the footer of the homepage (or on the "Open Source" page?) and give credit and thanks to them.

@SmellsLikeCake

Create layout for FAQ page

Create layout for candidate/ measure pages

See Figma for designs: https://www.figma.com/file/GmfndRmQChBeoklOkmi0FC/Open-Disclosure?node-id=1041%3A0

Mobile first -- side nav should be a dropdown at small window sizes and expand to a left-side navigation at larger ones.

This should just cover the layout itself for now, not actually rendering any of the candidate/ measure info

Fall back to mock data if the api server isn't available

When I added the apiserver plugin I made it so that you have to have the Flask server running at localhost:5000 or else the entire gatsby site will fail to build, which is pretty inconvenient for people working on the frontend, to say the least. It seems like this is because the gatsby plugin I'm using doesn't fail gracefully if it hits an error when fetching the data - I may have to change plugins or write my own to handle that case better.
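The fallback pattern described here can be sketched independently of Gatsby. The example below is Python rather than the plugin's JavaScript, and the names `MOCK_DATA`, `fetch_from_api` and `load_data`, as well as the root URL path, are hypothetical; only the localhost:5000 address comes from the issue.

```python
import json
import urllib.request

# Hypothetical mock payload; the real mock data shape is not given in the issue.
MOCK_DATA = {"candidates": [], "source": "mock"}


def fetch_from_api(url="http://localhost:5000/", timeout=2):
    """Fetch JSON from the API server; raises OSError (e.g. URLError) on failure."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def load_data(fetch=fetch_from_api, fallback=MOCK_DATA):
    """Return live data when the fetch succeeds, otherwise the mock data."""
    try:
        return fetch()
    except (OSError, ValueError):
        # OSError covers connection failures; ValueError covers invalid JSON.
        return fallback
```

Passing the fetch function in as a parameter keeps the fallback logic testable without a running server.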

Create prototype visualizations by size / type of contribution

Visualizations showing candidate contributions either by size of contributions from a given individual or entity, or possibly types of contributors.

Get frontend ready to deploy

Issues we want to fix before deploying the frontend:

Switch all candidate UI over to using TotalRCPT instead of TotalFunding (TotalFunding includes LOAN, which is inconsistent with the aggregate data shown elsewhere) #172
Show a minimum "sliver" in the bar chart if the % is too small #170
Update text for Committees section on Candidates page (committees are not owned by candidate) #170
Remove See All Contributions link for now #170
Add SJ to geo breakdown #170
Scroll issue with nav bar #168
Remove 'get notified of updates' in noData.js
Fix About Us page
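For the minimum "sliver" item in the list above, one way to express the rule is sketched below. The 1% floor and the function name `bar_width_percent` are assumptions; the actual chart component is not shown in this issue.

```python
MIN_BAR_PERCENT = 1.0  # assumed minimum visible width, in percent


def bar_width_percent(value, total, minimum=MIN_BAR_PERCENT):
    """Return the bar width in percent, never narrower than `minimum` for non-zero values."""
    if total <= 0 or value <= 0:
        return 0.0
    return max(100.0 * value / total, minimum)
```

Zero values still render as an empty bar, so the floor only applies when there is something to show.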

maint: Update README with instructions on how to run this locally

Add more steps so it is clear how to run this repo locally

Improve Performance and Best Practices

Performance and best-practice issues need to be fixed, according to the Chrome Lighthouse audit of Open Disclosure.
LighthouseReport-10-28-20.pdf

install and run of scraper gives an error

I have been setting up tests for things I do using aws boxes and auto-load scripts. It looks as though this is needed here. I will get on it.

cheers - ray

Just FYI:

 % git clone https://github.com/codeforsanjose/open-disclosure.git
 Cloning into 'open-disclosure'...
 remote: Enumerating objects: 506, done.
 remote: Counting objects: 100% (506/506), done.
 remote: Compressing objects: 100% (325/325), done.
 remote: Total 937 (delta 258), reused 385 (delta 167), pack-reused 431
 Receiving objects: 100% (937/937), 11.66 MiB | 3.30 MiB/s, done.
 Resolving deltas: 100% (441/441), done.
 %
 % cd open-disclosure
 %
 % cd data_pipeline/scraper
 ray@rrk scraper % virtualenv env
 created virtual environment CPython3.7.3.final.0-64 in 428ms
   creator CPython3macOsFramework(dest=/Users/ray/Projects/open-disclosure/data_pipeline/scraper/env, clear=False, global=False)
   seeder FromAppData(download=False, pip=latest, setuptools=latest, wheel=latest, via=copy, app_data_dir=/Users/ray/Library/Application Support/virtualenv/seed-app-data/v1.0.1)
   activators BashActivator,CShellActivator,FishActivator,PowerShellActivator,PythonActivator,XonshActivator
 %
 % source env/bin/activate
 (env) %
 (env) % python3 -m pip install chromedriver_binary webdriver-manager selenium
 Collecting chromedriver_binary
   Downloading chromedriver-binary-85.0.4183.38.0.tar.gz (3.6 kB)
 Collecting webdriver-manager
   Downloading webdriver_manager-3.2.1-py2.py3-none-any.whl (16 kB)
 Collecting selenium
   Using cached selenium-3.141.0-py2.py3-none-any.whl (904 kB)
 Collecting crayons
   Downloading crayons-0.3.1-py2.py3-none-any.whl (4.6 kB)
 Collecting configparser
   Using cached configparser-5.0.0-py3-none-any.whl (22 kB)
 Collecting requests
   Using cached requests-2.24.0-py2.py3-none-any.whl (61 kB)
 Collecting urllib3
   Using cached urllib3-1.25.10-py2.py3-none-any.whl (127 kB)
 Collecting colorama
   Using cached colorama-0.4.3-py2.py3-none-any.whl (15 kB)
 Collecting certifi>=2017.4.17
   Using cached certifi-2020.6.20-py2.py3-none-any.whl (156 kB)
 Collecting chardet<4,>=3.0.2
   Using cached chardet-3.0.4-py2.py3-none-any.whl (133 kB)
 Collecting idna<3,>=2.5
   Using cached idna-2.10-py2.py3-none-any.whl (58 kB)
 Building wheels for collected packages: chromedriver-binary
   Building wheel for chromedriver-binary (setup.py) ... done
   Created wheel for chromedriver-binary: filename=chromedriver_binary-85.0.4183.38.0-py3-none-any.whl size=7702049 sha256=1b33b906747b5da1f6ff8af7eb734757b93eb4361672e6face9620248b857d19
   Stored in directory: /Users/ray/Library/Caches/pip/wheels/ed/9c/7c/81565815d07eb410cf5e63f896b5cbbcb012c690f1952cf00e
 Successfully built chromedriver-binary
 Installing collected packages: chromedriver-binary, colorama, crayons, configparser, certifi, urllib3, chardet, idna, requests, webdriver-manager, selenium
 Successfully installed certifi-2020.6.20 chardet-3.0.4 chromedriver-binary-85.0.4183.38.0 colorama-0.4.3 configparser-5.0.0 crayons-0.3.1 idna-2.10 requests-2.24.0 selenium-3.141.0 urllib3-1.25.10 webdriver-manager-3.2.1
 WARNING: You are using pip version 20.1.1; however, version 20.2 is available.
 You should consider upgrading via the '/Users/ray/Projects/open-disclosure/data_pipeline/scraper/env/bin/python3 -m pip install --upgrade pip' command.
 % 
 (env) % 
 (env) % python ./scraper.py
 
   File "./scraper.py", line 258
     self.driver, search_page_num, self.BALLOT_TYPE
        ^
 SyntaxError: invalid syntax

Create Find Your Ballot page

Create Find Your Ballot page w/ embedded tool from Voter's Edge and hook up to link in bottom nav.

Create Get Registered/Check Registration landing pages

Mocks: https://www.figma.com/file/GmfndRmQChBeoklOkmi0FC/Open-Disclosure?node-id=1185%3A708
Vote.org iframes: https://www.vote.org/technology/

Confirm UI is navigable via the keyboard

Create measure page

Hook up graphs in candidate page to API data

Fix console warnings

In order to reduce noise and ensure new warnings/ errors aren't introduced, we need to clean up the existing console warnings.

Add Candidate Name field into Excel Sheets

Today, candidate names are not associated with their respective committees in the Excel spreadsheet returned by the city website. The ask here is to scrape this data and add it to the Excel sheet as an additional column whose values match the candidate name found in the other columns. This will allow us to join on candidate name across different types of entries.
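A rough sketch of the enrichment step, assuming the rows have already been read into dictionaries. The column names `Committee` and `Candidate`, the mapping contents, and the function name are hypothetical placeholders, not the real spreadsheet headers.

```python
# Hypothetical mapping scraped from the city website: committee name -> candidate name.
COMMITTEE_TO_CANDIDATE = {
    "Example Committee for Council 2020": "Jane Doe",
}


def add_candidate_name(rows, mapping, committee_key="Committee", candidate_key="Candidate"):
    """Return copies of the rows with a candidate-name column added (empty if unknown)."""
    enriched = []
    for row in rows:
        new_row = dict(row)
        new_row[candidate_key] = mapping.get(row.get(committee_key, ""), "")
        enriched.append(new_row)
    return enriched
```

Leaving unknown committees with an empty candidate name makes unmatched rows easy to find later.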

Ballot Measures placeholders need to be updated with real data

X, Y and Z seem to be placeholders for ballot measures:

https://open-disclosure.codeforsanjose.com/11/3/2020/referendums/ballot-measure-x
https://open-disclosure.codeforsanjose.com/11/3/2020/referendums/ballot-measure-y
https://open-disclosure.codeforsanjose.com/11/3/2020/referendums/ballot-measure-z

If we do have time before election to get all the updated data on San José measures, please add the data for:

City of San José Measures

School Districts in San José - Measures

Special District (Santa Clara County) Measures

Lastly:

Make sure all placeholder text is removed.

scraper error, just reporting, not investigated yet.

Here is what I did to get here.
I will investigate. Just wanted to get this in.
thanx - ray

`% virtualenv env

% source env/bin/activate
(env) %

(env) % python3 -m pip install chromedriver_binary

(env) % python3 -m pip install webdriver-manager

(env) % python3 -m pip install selenium

(env) % python ./scraper.py

data/Ballot_Measure

[WDM] - Current google-chrome version 83.0.4103
[WDM] - Trying to download new driver from http://chromedriver.storage.googleapis.com/83.0.4103.39/chromedriver_mac64.zip
[WDM] - Unpack archive /Users/ray/.wdm/drivers/chromedriver/83.0.4103.39/mac64/chromedriver.zip

Traceback (most recent call last):
  File "./scraper.py", line 222, in <module>
    s.scrape()
  File "./scraper.py", line 207, in scrape
    self.website.closeErrorDialog(self.driver)
  File "./scraper.py", line 90, in closeErrorDialog
    driver.find_element_by_id(self.ERROR_DIALOG_BUTTON_ID).click()
  File "/Users/ray/Projects/open-disclosure/data_pipeline/scraper/env/lib/python3.7/site-packages/selenium/webdriver/remote/webelement.py", line 80, in click
    self._execute(Command.CLICK_ELEMENT)
  File "/Users/ray/Projects/open-disclosure/data_pipeline/scraper/env/lib/python3.7/site-packages/selenium/webdriver/remote/webelement.py", line 633, in _execute
    return self._parent.execute(command, params)
  File "/Users/ray/Projects/open-disclosure/data_pipeline/scraper/env/lib/python3.7/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "/Users/ray/Projects/open-disclosure/data_pipeline/scraper/env/lib/python3.7/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.ElementNotInteractableException: Message: element not interactable
  (Session info: chrome=83.0.4103.61)
`

Define the Schema which will connect frontend to backend

We need to send the data from the backend SQL database to the frontend in some sort of formatted schema. This will contain all of the data that the UI uses, minus the Tableau visualizations.

The output of this is a document detailing what the API looks like, preferably reviewed by at least 1 frontend person.

This is blocking the actual creation of the API in code.
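To make the discussion concrete, a schema draft could start from something like the sketch below. Every type and field name here (`Election`, `Candidate`, `total_amount`, and so on) is a placeholder assumption to be replaced by whatever the reviewed document settles on.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Candidate:
    # Hypothetical fields; the agreed schema may differ.
    name: str
    total_contributions: float


@dataclass
class Election:
    # Hypothetical fields; the agreed schema may differ.
    date: str
    total_amount: float
    candidates: List[Candidate] = field(default_factory=list)


def to_json(election):
    """Serialize the nested dataclasses into the JSON the frontend would receive."""
    return json.dumps(asdict(election), sort_keys=True)
```

Writing the schema as typed records gives frontend reviewers one place to comment on names and types before any API code exists.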

Create prototype visualizations of top X contributions

Using the liccardo_2018.csv dataset saved here: https://drive.google.com/drive/folders/1RwaLU3564B60yMjCZKl2PCRK5-gKysDn create one or more visualizations that show the top 5, 10 or other relevant number of contributions to Sam Liccardo's campaign. Note that only the transactions with Form_Type = A are contributions. See the 460 Codebook (https://docs.google.com/document/d/1N_NCdYoBODUJwagAGNPKiQUGdHje11jjgZoQuxZY8I0/edit) for further details.

You may want to use bars, bubble charts, color coding etc - the possibilities are endless!
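Before charting, the data needs to be filtered and ranked. A minimal sketch of that step follows; the sample rows are made up, and the `Contributor` and `Amount` column names are assumptions (only `Form_Type` and its value A come from the description above, so check the 460 Codebook for the real headers).

```python
import csv
import io

# Tiny made-up sample; only the Form_Type column name comes from the issue text.
SAMPLE_CSV = """Form_Type,Contributor,Amount
A,Donor One,500
E,Vendor X,900
A,Donor Two,1200
A,Donor Three,250
"""


def top_contributions(csv_text, n=5):
    """Return the n largest contributions (Form_Type == "A") as (name, amount) pairs."""
    rows = csv.DictReader(io.StringIO(csv_text))
    contributions = [
        (row["Contributor"], float(row["Amount"]))
        for row in rows
        if row["Form_Type"] == "A"
    ]
    return sorted(contributions, key=lambda item: item[1], reverse=True)[:n]
```

The resulting list can be passed straight to whichever plotting tool is chosen for the bars or bubbles.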

Create layout for Register to Vote page

Mocks: https://www.figma.com/file/GmfndRmQChBeoklOkmi0FC/Open-Disclosure?node-id=1185%3A708
Using iframe from vote.org: https://www.vote.org/technology/

Replace contribution values with data obtained by calling the API

Will start by filling in with values from an API call to a random number generator -- will replace with call to backend API once it is completed.

Components for Candidate/Measure pages

Building out some of the components we'll need for the candidate and measure pages. Mocks: https://www.figma.com/file/GmfndRmQChBeoklOkmi0FC/Open-Disclosure?node-id=1433%3A1563

Complete the missing columns in the 460 codebook

The 460 Codebook describes the meaning of each column in the 460 Excel documents containing the campaign finance information. It is saved here and needs completing: https://docs.google.com/document/d/1N_NCdYoBODUJwagAGNPKiQUGdHje11jjgZoQuxZY8I0/edit

Footer Update

Some of this footer language was copied from Open Oakland's website. We should update the language so it more closely reflects our sources. This project, to my understanding, isn't made in partnership with the City Clerk's office, so I recommend changing the language.

Update this line:

BEFORE:
Brought to you by Open San José and San José's Public Ethics Commission

AFTER:
Made by Code for San José

Update this line:
BEFORE: Campaign finance data provided by the City of San José Public Ethics Commission Public Portal for Campaign Finance and Lobbyist Disclosure. Candidate and ballot measure information gathered from information provided to the Santa Clara County Registrar of Voters by the City of San José.

AFTER:
Campaign finance data provided by the City of San José's Public Access Portal (CampaignDocs eRetrieval). Candidate and ballot measure information gathered from information provided to the Santa Clara County Registrar of Voters.

Add Code for San José logo

Complete Candidate Page

Create prototype visualizations comparing total contributions across candidates

Using the district9_2018.csv dataset found here: https://drive.google.com/drive/folders/1RwaLU3564B60yMjCZKl2PCRK5-gKysDn create one or more visualizations which compare the total contributions received by the different candidates standing for council member in district 9. Note that only transactions with Form_Type = A are contributions. See the 460 codebook for more details: https://docs.google.com/document/d/1N_NCdYoBODUJwagAGNPKiQUGdHje11jjgZoQuxZY8I0/edit

Create layout for About Us page

Create chart component for candidate/measure pages

Mocks: https://www.figma.com/file/GmfndRmQChBeoklOkmi0FC/Open-Disclosure?node-id=1041%3A0

The chart will need to support contributions (blue) and expenditures (orange), and will need to be able to show a dollar value or a %

Migrate UI code out of top level directory and into its own folder.

Folder could be named UI or frontend or something similar.

Create prototype visualizations comparing contributions across candidates over time

Using the district9.csv dataset found here: https://drive.google.com/drive/folders/1RwaLU3564B60yMjCZKl2PCRK5-gKysDn create one or more visualizations which compare how the contributions received by the different candidates standing for council member in district 9 changed over time. Note that only transactions with Form_Type = A are contributions. See the 460 codebook for more details: https://docs.google.com/document/d/1N_NCdYoBODUJwagAGNPKiQUGdHje11jjgZoQuxZY8I0/edit
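A time-based comparison needs contributions grouped by candidate and period first. The sketch below groups by calendar month; the row layout, the `Candidate`, `Date` and `Amount` keys, and the monthly granularity are assumptions, and the sample rows are made up.

```python
from collections import defaultdict
from datetime import date

# Made-up rows; the real column names should be taken from the 460 Codebook.
ROWS = [
    {"Form_Type": "A", "Candidate": "Candidate 1", "Date": date(2018, 1, 15), "Amount": 100.0},
    {"Form_Type": "A", "Candidate": "Candidate 1", "Date": date(2018, 1, 20), "Amount": 50.0},
    {"Form_Type": "A", "Candidate": "Candidate 2", "Date": date(2018, 2, 1), "Amount": 75.0},
    {"Form_Type": "E", "Candidate": "Candidate 1", "Date": date(2018, 1, 25), "Amount": 999.0},
]


def monthly_totals(rows):
    """Sum contributions (Form_Type == "A") per (candidate, "YYYY-MM") pair."""
    totals = defaultdict(float)
    for row in rows:
        if row["Form_Type"] != "A":
            continue
        key = (row["Candidate"], row["Date"].strftime("%Y-%m"))
        totals[key] += row["Amount"]
    return dict(totals)
```

Each candidate then becomes one line or bar series, with the month keys along the time axis.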

Switch to include LOAN in TotalFunding

Once the API is ready
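The difference between the two totals can be sketched as follows. The transaction record shape is hypothetical, and treating TotalFunding as exactly RCPT plus LOAN is an assumption; the issues only state that it includes LOAN.

```python
def funding_totals(transactions):
    """Compute TotalRCPT (receipts only) and TotalFunding (assumed: receipts plus loans)."""
    total_rcpt = sum(t["amount"] for t in transactions if t["type"] == "RCPT")
    total_loan = sum(t["amount"] for t in transactions if t["type"] == "LOAN")
    return {"TotalRCPT": total_rcpt, "TotalFunding": total_rcpt + total_loan}
```

Whichever figure the UI shows, using the same one on every page avoids the inconsistency noted in the deployment checklist.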

Implement 'Get notified about updates'

From mocks: https://www.figma.com/file/GmfndRmQChBeoklOkmi0FC/Open-Disclosure?node-id=1041%3A0

We need to either hook this up to something or change the text

Create Flask API for Data Pipeline

New sub-folder 'data_api' created under 'data_pipeline'

This API will:

Scrape the data on either a certain cadence or small sub-check (new forms to download)
Complete the data processing/aggregation then load the data into a MySQL database

Additionally serves requests for certain election data: (this will require new tables to be uploaded into MySQL)

General election info (total $ amount, etc.)
Candidate Info (Candidate name, total contributions, etc.)

Candidate page UI issues/improvements

~~L - Both of the funding breakdowns are currently only including RCPT. Should they also include LOAN?~~

M - What should the office title be - 'District 4', 'District 4 Representative', 'Council Member District 4'? We currently have all 3 in different places, I think.

Create prototype visualizations of contributions by location

Using the liccardo_2018.csv dataset saved here: https://drive.google.com/drive/folders/1RwaLU3564B60yMjCZKl2PCRK5-gKysDn create one or more visualizations that show how the contributions received by Sam Liccardo's campaign are distributed by location. Note that only the transactions with Form_Type = A are contributions. See the 460 Codebook (https://docs.google.com/document/d/1N_NCdYoBODUJwagAGNPKiQUGdHje11jjgZoQuxZY8I0/edit) for further details.

Create prototype visualizations of the top X expenditures

Using the liccardo_2018.csv dataset saved here: https://drive.google.com/drive/folders/1RwaLU3564B60yMjCZKl2PCRK5-gKysDn create one or more visualizations that show the top 5, 10 or other relevant number of expenditures by Sam Liccardo's campaign. Note that only the transactions with Form_Type = E are expenditures. See the 460 Codebook (https://docs.google.com/document/d/1N_NCdYoBODUJwagAGNPKiQUGdHje11jjgZoQuxZY8I0/edit) for further details.

You may want to look at expenditures individually, or grouped into categories, or grouped by recipients, or other possibilities.

San José live election section needs updates

San José live election snapshot - update the source so it's the latest data, not just July 19, 2020.
Update the placeholder "XXX" so it reflects the total donations from the city of San José
Consider changing the language from "donations from the city of San José" to "donations from San José residents". Otherwise, it sounds like the government of San José is donating money to candidates.
"4 candidates running" should be changed to "4 candidates running in 2 city council races". Otherwise it's misleading: there are many school board and other open elections in San José.

Data pipelining

Establish a data pipeline for taking data off the website, cleaning it and combining it into a single structure. There is a scraper in the GitHub repo already, and a deduplicating script in the Google Drive: https://drive.google.com/drive/folders/1RwaLU3564B60yMjCZKl2PCRK5-gKysDn

Create prototype visualizations comparing contributions across candidates by number of donors

Using the district9.csv dataset found here: https://drive.google.com/drive/folders/1RwaLU3564B60yMjCZKl2PCRK5-gKysDn create one or more visualizations which compare the number of people / companies contributing to the different candidates who stood for council member in district 9. Note that only transactions with Form_Type = A are contributions. See the 460 codebook for more details: https://docs.google.com/document/d/1N_NCdYoBODUJwagAGNPKiQUGdHje11jjgZoQuxZY8I0/edit
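Counting donors rather than dollars means collapsing repeat contributions from the same person or company. A minimal sketch, assuming hypothetical `Candidate` and `Contributor` keys and a simple case-insensitive name match (real data would likely need fuzzier matching):

```python
from collections import defaultdict


def donor_counts(rows):
    """Count distinct contributors per candidate among Form_Type == "A" rows."""
    donors = defaultdict(set)
    for row in rows:
        if row["Form_Type"] == "A":
            # Normalize names so "Donor One" and "donor one " count once.
            donors[row["Candidate"]].add(row["Contributor"].strip().lower())
    return {candidate: len(names) for candidate, names in donors.items()}
```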

Create prototype visualizations comparing contributions across candidates by location

Using the district9.csv dataset found here: https://drive.google.com/drive/folders/1RwaLU3564B60yMjCZKl2PCRK5-gKysDn create one or more visualizations which compare the location of the contributions received by the different candidates standing for council member in district 9. Note that only transactions with Form_Type = A are contributions. See the 460 codebook for more details: https://docs.google.com/document/d/1N_NCdYoBODUJwagAGNPKiQUGdHje11jjgZoQuxZY8I0/edit

Fix graphical animation bug on mobile version of Layout/Menu component

When changing screen size, the animation that should only happen when menu is closed gets triggered

Update GraphQL for new schemas from database

GraphQL queries will need to be updated to match the new schema of the DB data.

Once @geleazar1000111 merges in her updates for the Election schema, we will be able to go ahead with this fix; @geleazar1000111 please comment here when you get the changes in and merged.

Improve performance, accessibility & best practices (Lighthouse Report)

Performance, accessibility and best-practice issues need to be fixed, according to the Chrome Lighthouse audit of Open Disclosure.
Lighthouse-Report-OpenDisclosure-10162020.pdf

Create prototype visualizations of contributions over time

Using the liccardo_2018.csv dataset saved here: https://drive.google.com/drive/folders/1RwaLU3564B60yMjCZKl2PCRK5-gKysDn create one or more visualizations that show the contributions received by Sam Liccardo's campaign over time. Note that only the transactions with Form_Type = A are contributions. See the 460 Codebook (https://docs.google.com/document/d/1N_NCdYoBODUJwagAGNPKiQUGdHje11jjgZoQuxZY8I0/edit) for further details.

Confirm user stories for MVP (Version 1)

These are draft user stories for MVP (version 1) to complete by Dec 31, 2019.

A user can see a list of election categories:
President of the United States
United States Representative in Congress
California State Senator and Member of the State Assembly
The user can select one category
The user can select one candidate
The user will see a list of donors, from highest donation to lowest donation. Include information about the donor like their job and donation amount.
The user can see a list of where the donors are from: San Jose, Santa Clara County, California, Other States
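The last two stories amount to a sort and a bucketing rule, sketched below. The donor record shape, the function names, and the short city lookup are assumptions; the lookup is deliberately incomplete and a full list of Santa Clara County cities would be needed in practice.

```python
# Hypothetical, incomplete lookup; a full list of county cities would be needed.
SANTA_CLARA_COUNTY_CITIES = {"san jose", "santa clara", "sunnyvale", "palo alto"}


def donor_region(city, state):
    """Map a donor's city/state to one of the four buckets named in the user story."""
    city = city.strip().lower()
    state = state.strip().upper()
    if state != "CA":
        return "Other States"
    if city == "san jose":
        return "San Jose"
    if city in SANTA_CLARA_COUNTY_CITIES:
        return "Santa Clara County"
    return "California"


def donors_by_amount(donors):
    """Order donors from highest to lowest donation."""
    return sorted(donors, key=lambda donor: donor["amount"], reverse=True)
```

The buckets are checked from most to least specific, so a San Jose donor is not also counted under the county or the state.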

Helen @SmellsLikeCake, I'd love to get your feedback on these draft user stories for MVP.

Once we get these confirmed, we can create separate GitHub issues for each of the stories.

Add mid breakpoints for pages

The site can look a little wonky when it's transitioning between full-size and mobile layouts; we should add another breakpoint in between. The breakpoints might differ between pages due to differences in content; e.g., index.js's breakpoint will be around 1300px, before the hero image touches the CTA buttons.

Create prototype visualizations of expenditures over time

Using the liccardo_2018.csv dataset saved here: https://drive.google.com/drive/folders/1RwaLU3564B60yMjCZKl2PCRK5-gKysDn create one or more visualizations that show the expenditure of Sam Liccardo's campaign over time. Note that only the transactions with Form_Type = E are expenditures. See the 460 Codebook (https://docs.google.com/document/d/1N_NCdYoBODUJwagAGNPKiQUGdHje11jjgZoQuxZY8I0/edit) for further details.

You may want to experiment with grouping / color coding different types of expenditures.