
elex-loader's Introduction

AP ELECTION LOADER

Relies on Elex, a command-line tool for getting results from the AP Election API 2.0. Demonstrates a method of putting those results into a Postgres database using the COPY command and the loader's CSV output.
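As a rough sketch of the approach described above, the function below pipes Elex's CSV output straight into a Postgres COPY. The function name load_results is hypothetical. The table name results and the database naming pattern elex_<racedate> are assumptions taken from the error messages quoted in the issues further down this page. The real scripts may differ.

```shell
#!/usr/bin/env bash
# Sketch only: assumes `elex` and `psql` are on the PATH and that a
# `results` table matching Elex's CSV columns already exists.

load_results() {
    racedate="$1"           # e.g. 2016-02-23
    db="elex_${racedate}"   # assumed database naming pattern

    # Elex writes CSV with a header row to stdout; COPY reads it from stdin.
    elex results "$racedate" \
        | psql "$db" -c "COPY results FROM stdin DELIMITER ',' CSV HEADER;"
}
```

Because COPY ingests the whole stream in one statement, this avoids issuing one INSERT per row, which is presumably why the loader is built around it.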

Assumptions

The following things are assumed to be true in this documentation.

  • You are running OS X.
  • You are using Python 2.7. (Probably the version that came with OS X.)
  • You have pip, virtualenv and virtualenvwrapper installed and working.

See "Chapter 2: Install Virtualenv" of NPR's development environment blog post for details.

Having trouble on OS X El Capitan? See: Can't install virtualenvwrapper on OSX 10.11 El Capitan.

Getting started

mkvirtualenv elex-loader
git clone git@github.com:newsdev/elex-loader.git && cd elex-loader
pip install -r requirements.txt
./scripts/$ENV/bootstrap.sh

The bootstrap.sh script will create databases and the user necessary for local development. Note: This does not exist for non-development environments. Please use commands in elex-dotfiles instead.

Environments

The New York Times defines a handful of different environments; principally, dev, stg and prd.

  • dev: Hits test URLs by default. Assumes a local Postgres database where the local user is a superuser.
  • stg: Hits test URLs by default. Requires a Postgres user / host / password to be defined in the environment. We use a .pgpass file and export the rest in /etc/environment. Check out elex-dotfiles for more.
  • prd: Hits live URLs by default. Requires a Postgres user / host / password to be defined in the environment.

Use cases

Load initial data

./scripts/$ENV/reload.sh

The AP will make "live zeros" available on the morning of an election day. You can run reload.sh to get an entirely new set of data, including races, reporting units, candidates and zeroed-out results.

Load results on election night

./scripts/$ENV/daemon.sh

The daemon will run 100,000 times (seriously) unless it is stopped. We control ours with a custom Supervisord instance and a modified /etc/supervisord.conf. This configuration file is available in elex-dotfiles along with other secrets.

Set a wait interval

You might want to control how long the daemon waits between cycles. The default is hardcoded: 15 seconds in production, 30 seconds elsewhere. To override it, create the file /tmp/elex_loader_timeout.sh and export an ELEX_LOADER_TIMEOUT variable like this:

export ELEX_LOADER_TIMEOUT=60

On every loop, the daemon checks for this file and sources it if it exists, which means you can change the wait time while the daemon is running. For example, we do this in our admin.
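The loop below is a minimal sketch of that behavior, not the actual daemon.sh. The function names, the TIMEOUT_FILE variable, and the choice to fall back to the default when the file is removed are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: a polling loop that re-reads its wait interval on every cycle.
TIMEOUT_FILE="${TIMEOUT_FILE:-/tmp/elex_loader_timeout.sh}"
DEFAULT_TIMEOUT=30    # the non-production default described above, in seconds
MAX_CYCLES=100000

current_timeout() {
    # Start from the default, then let the override file replace it if present.
    ELEX_LOADER_TIMEOUT="$DEFAULT_TIMEOUT"
    if [ -f "$TIMEOUT_FILE" ]; then
        . "$TIMEOUT_FILE"
    fi
    echo "$ELEX_LOADER_TIMEOUT"
}

run_daemon() {
    # $1 is a hypothetical command that performs one results update.
    cycle=0
    while [ "$cycle" -lt "$MAX_CYCLES" ]; do
        "$1"
        sleep "$(current_timeout)"
        cycle=$((cycle + 1))
    done
}
```

Sourcing the file inside a command substitution keeps the override from leaking into the loop's own environment, so deleting the file takes effect on the next cycle.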

Load results once

./scripts/$ENV/update.sh

Sometimes you just need to load a single update, e.g., to grab final results after we've turned off the loader. This command will get new results without baking or reloading any other parts of the database.

Load delegate data

./scripts/$ENV/delegates.sh

Often, the AP will update delegate information after our daemon has stopped running. To update just delegates for a given racedate, run this command.

elex-loader's People

Contributors

albertsun, eads, giratikanon, jeremyjbowers, rshorey, wmandrews

elex-loader's Issues

Can't drop DB while admin is running.

Drop elex_2016-02-23 if it exists
dropdb: database removal failed: ERROR:  database "elex_2016-02-23" is being accessed by other users
DETAIL:  There is 1 other session using the database.
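One possible workaround, offered as a sketch and not as what the project adopted, is to disconnect other sessions before calling dropdb. pg_terminate_backend and pg_stat_activity are standard Postgres (the pid column exists from 9.2 onward), but the function names here are hypothetical, and terminating sessions requires sufficient privileges. Note that this would also disconnect the admin.

```shell
#!/usr/bin/env bash
# Sketch only: assumes `psql` and `dropdb` are on the PATH and that the
# connecting role is allowed to terminate other backends.

terminate_sessions_sql() {
    # Emit SQL that disconnects every other session on database $1.
    # The name is interpolated directly, so only pass trusted values.
    printf "SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = '%s' AND pid <> pg_backend_pid();\n" "$1"
}

drop_database() {
    db="$1"
    # Connect to the maintenance database, not the one being dropped.
    terminate_sessions_sql "$db" | psql postgres
    dropdb --if-exists "$db"
}
```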

Include total delegates in results

It would be great to include the total delegates and total superdelegates as candidate fields that can be added to the elex_results table. This will make it easier to make more advanced results views that include total delegates without making an extra AJAX request to the delegates API endpoint.

Solve table drop conundrum

Clients will read the DB and periodically see the table is empty / view doesn't exist.
Unhandled rejection SequelizeDatabaseError: relation "elex_results" does not exist
This happens because we are dropping the tables and bulk updating the records.

Write delegates to a separate database

We pull delegates separately from other races, and want a permanent delegate API that is independent of race date. Writing delegates to a separate database, instead of by race date, could work.

Not seeing district results in database after refactor

Even while watching the loader run, and seeing what appear to be two Elex calls for normal and district results, I don't see district results in the database.

I think we may need to run the districts script after the results script in the daemon, since I think results drops the results table:

if [ "$delgates_interval" -eq 0 ]; then
    delegates
fi
if [ "$districts_interval" -eq 0 ]; then
    districts
fi
results

Separate table for NYT winners

Currently, NYT winner calls are part of the overrides table, which makes sense functionally. We may want to move them to a separate table, though, after New Hampshire.

Other overrides, like candidate names and poll closing times, will migrate between staging and production, but NYT calls never should. Also, if they're in their own table, we don't have to worry about losing them if we have to re-import or modify an override table in an emergency.

Need to re-create view in elex_results during delegates.sh

Including delegates in elex_results means that we need to re-create the view when we update delegate totals:

elex-loader $ scripts/stg/delegates.sh 2016-02-20
STARTED: 22:16:46
------------------------------
NOTICE:  drop cascades to view elex_results

Otherwise, after delegates, elex_results does not exist.

Throw elex errors

Currently, elex-loader fails with a cryptic message if environment variables are missing or elex throws another error:

ERROR:  missing data for column "unique_id"
CONTEXT:  COPY results, line 2: ""

Would be nice to pass through the elex error message.
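A minimal sketch of one way to do that: check the required variables up front, then run Elex on its own so that a failure stops the script before COPY ever sees an empty stream. The function names are hypothetical. AP_API_KEY is used in the usage note as the variable Elex is assumed to read, so adjust the list to whatever your environment actually requires.

```shell
#!/usr/bin/env bash
# Sketch only: fail early and loudly instead of handing COPY an empty file.

require_env() {
    # Report every named variable that is unset or empty; return 1 if any are.
    missing=0
    for name in "$@"; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "ERROR: environment variable $name is not set" >&2
            missing=1
        fi
    done
    return "$missing"
}

fetch_results() {
    # Write Elex's CSV to file $2. Elex's own stderr is left untouched,
    # so its error message reaches the log before ours does.
    racedate="$1"
    out="$2"
    if ! elex results "$racedate" > "$out"; then
        echo "ERROR: elex exited with an error; skipping COPY" >&2
        return 1
    fi
}
```

A caller might chain these as `require_env AP_API_KEY && fetch_results 2016-02-23 /tmp/results.csv` and only run the COPY step if both succeed.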

Error loading results w/ postgres COPY

Mar 23 01:01:08 int-elex-prd-east elex-loader-stderr.log:  ERROR:  unquoted newline found in data
Mar 23 01:01:08 int-elex-prd-east elex-loader-stderr.log:  HINT:  Use quoted CSV field to represent newline.
Mar 23 01:01:08 int-elex-prd-east elex-loader-stderr.log:  CONTEXT:  COPY results, line 165

Fix staging scripts and/or RDS users

A few bugs with scripts/stg/init.sh on the new staging server:

  • The elexadmin user doesn't exist on staging (the superuser is scotusadmin)
  • The elex user doesn't have permission to createdb:
createdb: database creation failed: ERROR:  permission denied to create database

reload.sh breaks elex-admin

I think reload.sh can break the elex-admin, because it reloads the database but doesn't populate the nyt_races for candidates.

The admin doesn't display candidates, and can't mark candidates as NYT important, so no results appear on the frontend.

Fields we can probably remove

Here are the fields we never used or no longer have use for.

nyt_race_preview
nyt_delegate_allocation
nyt_race_name
nyt_race_result_description
nyt_called
nyt_race_important
nyt_candidate_description
nyt_delegates
nyt_races*
race_raceid*
race_postal*

*Was never used by the election-2016 app, but there could be other uses.

Daemon sometimes can't find elex

Is it random?

Mar 12 09:36:44 int-elex-prd-east elex-loader-stderr.log:  /home/ubuntu/elex-loader/scripts/prd/_results.sh: line 7: elex: command not found
Mar 12 09:36:44 int-elex-prd-east elex-loader-stderr.log:  /home/ubuntu/elex-loader/scripts/prd/_districts.sh: line 2: elex: command not found
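A "command not found" from a supervised process often means the process's PATH does not include the virtualenv's bin directory. That is a guess, not a confirmed diagnosis of this issue. The sketch below resolves the Elex executable once at startup and prints the PATH when it cannot, which would at least make the failure easier to debug. ELEX_BIN and find_elex are hypothetical names.

```shell
#!/usr/bin/env bash
# Sketch only: locate elex explicitly so a bad PATH fails with a useful message.

find_elex() {
    # Prefer an explicit ELEX_BIN (hypothetical variable), else search PATH.
    if [ -n "${ELEX_BIN:-}" ] && [ -x "$ELEX_BIN" ]; then
        echo "$ELEX_BIN"
    elif command -v elex >/dev/null 2>&1; then
        command -v elex
    else
        echo "ERROR: elex not found; PATH=$PATH" >&2
        return 1
    fi
}
```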

Override AP winners

For some races, we would like to override the AP's race call and call it ourselves. The solution must fit the following constraints.

  • Set at the race level.
  • Cascades to the candidate reporting unit level for the API.
  • Does not require a DB lookup.
  • Does not prohibit Postgres COPY command, e.g., is not an UPDATE query.
  • Even if overridden, still stores the AP's winner flag.

Refactor init/reload scripts

  • Init is now practically reload, so remove reload?
  • Do not drop the DB because we drop tables.
  • Minimize number of API calls.

Speed up results loading

  • Move districts out of main loop; once every 2 minutes.
  • Move delegates out of main loop; once every minute.
  • Decrease timeout accordingly.
  • Verify we won't exceed 10 calls per minute.
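To sanity-check the call budget, the sketch below counts calls for a schedule in which results run every cycle and the other scripts run every Nth cycle. It assumes each script makes exactly one API call and ignores the time the calls themselves take. Both are simplifications, and the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: count API calls for an "every N cycles" schedule.

should_run() {
    # True when cycle number $1 is a multiple of interval $2.
    [ $(( $1 % $2 )) -eq 0 ]
}

calls_in_window() {
    # $1 = number of cycles, $2 = delegates interval, $3 = districts interval.
    cycles="$1"
    calls=0
    i=0
    while [ "$i" -lt "$cycles" ]; do
        calls=$((calls + 1))    # results run on every cycle
        if should_run "$i" "$2"; then calls=$((calls + 1)); fi
        if should_run "$i" "$3"; then calls=$((calls + 1)); fi
        i=$((i + 1))
    done
    echo "$calls"
}
```

With a 15-second wait there are roughly 4 cycles per minute, so delegates every 4 cycles and districts every 8 cycles match the intervals above. `calls_in_window 4 4 8` prints 6 for the busiest minute and `calls_in_window 8 4 8` prints 11 for two minutes, both under 10 calls per minute.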
