unitedstates / billmap


Utilities and applications for the FlatGov project by Demand Progress

License: Other

Python 26.47% HTML 23.95% Shell 1.29% JavaScript 45.75% CSS 2.54%

billmap's Introduction

BillMap: A Demand Progress Project

Table of Contents

Web Application Quickstart (development)
Deployment
- Deployment instructions
- System components
Related Bills and Bill Similarity
- Related bills
- Bill similarity — text similarity
Relevant Committee Documents

A live demo of the application is here.

The application version is shown in the top right of the page; it is set by the project’s latest git tag or, if that is not available, by the version string set in _version.py.
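A minimal sketch of this fallback, assuming the tag is read with `git describe` and that the `fallback` argument stands in for the string kept in _version.py. The function name `get_app_version` is hypothetical and not taken from the repository.

```python
import subprocess


def get_app_version(repo_dir=".", fallback="0.0.0"):
    """Return the latest git tag, or `fallback` when no tag can be read.

    `fallback` stands in for the version string in _version.py
    (an assumption; the application's real lookup may differ).
    """
    try:
        result = subprocess.run(
            ["git", "describe", "--tags", "--abbrev=0"],
            cwd=repo_dir,
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # git is missing, repo_dir is unusable, or the repository has no tags.
        return fallback
    return result.stdout.strip() or fallback
```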

This documentation is also available there, at https://BillMap.linkedlegislation.com/static/docs/README.html. When the documentation has been updated in the git repository, it can be converted to HTML and copied to the application directory with the script scripts/docs_generator.sh (requires asciidoctor).

This repository contains:

A web application showing information for a given bill (Django/Python)
Utilities to scrape and process bill data (Python)

Both components are described below.

A separate repository (github.com/aih/bills) contains tools in Go to process bill data. This repository now uses some of those tools instead of the Python ones.

Web Application Quickstart (development)

The BillMap web application is built using the Django/Python web application framework. The application is contained in the server_py directory of this repository. It makes use of data that is processed using the scrapers and scripts described in the DATA_BACKGROUND.

Below are instructions to set up a local development environment. For production deployment instructions, see DEPLOYMENT.

Clone this repository

$ git clone https://github.com/aih/BillMap.git
$ cd BillMap

Install Python dependencies

Create a new Python virtual environment. You can use venv, virtualenv or, preferably, pyenv virtualenv (which requires installing pyenv first).

If you don’t have pyenv, try installing with homebrew

$ brew update
$ brew install pyenv

If you don’t have pyenv-virtualenv, try installing with homebrew

$ brew install pyenv-virtualenv

Note: you may have to manually update ~/.bashrc for virtual env commands to work

Create the environment (with pyenv virtualenv):

$ pyenv install 3.7
$ pyenv virtualenv 3.7 BillMap

Note: you may have to specify the patch version e.g. 3.7.9

Activate the environment

$ pyenv activate BillMap

Then load the requirements.txt into the virtual environment:

$ cd /path/to/server_py
$ pip install -r requirements.txt

Installing pypy as virtualenv

The application has been tested and works with pypy on ubuntu:

Install pypy as a pyenv virtualenv, for example

pyenv install pypy3.7-7.3.4
pyenv virtualenv pypy3.7-7.3.4 pypy37flat
pyenv activate pypy37flat

Upgrade pip, if appropriate

/home/ubuntu/.pyenv/versions/pypy3.7-7.3.4/envs/pypy37flat/bin/pypy3 -m pip install --upgrade pip

It may be necessary to install C libraries to build lxml

sudo apt-get install libxml2-dev libxslt-dev python-dev

Install requirements

cd /path/to/server_py
pip install -r requirements.txt

Create .env file

Copy server_py/flatgov/.env-sample to server_py/flatgov/.env, and change the SECRET_KEY defined in that file.

Also, obtain an API key from ProPublica and add it as PROPUBLICA_CONGRESS_API_KEY. This is used to get Press Statement data from the ProPublica API.

Database set-up

Use Django manage.py commands to download the data and populate the database (see DATABASE).

Data structure: the `Bill` model

The core of the application is a bill. This is described in BILL_MODEL, and the model itself is set up in Django in server_py/flatgov/bills/models.py. We model bills at the level of the billnumber, e.g. 116hr1500 is a bill in the 116th Congress, in the House of Representatives, bill number 1500. This bill may have many versions, which may differ significantly from each other (e.g the Introduced version may have just a few sections, while the Reported in House version has an entirely new thousand section bill substituted in its place). Where there are differences, we attempt to process the latest version of any bill (e.g to calculate bill similarity).
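As an illustration of the billnumber convention described above, here is a small sketch that splits an identifier such as 116hr1500 into its parts. The names `BillId` and `parse_bill_id` are hypothetical, and the optional trailing letters are assumed to be a version suffix (as in 116hr1500ih); the application's own parsing may differ.

```python
import re
from typing import NamedTuple, Optional

# congress digits, bill type letters, bill number digits, optional version letters
BILL_ID_PATTERN = re.compile(
    r"^(?P<congress>\d+)(?P<type>[a-z]+)(?P<number>\d+)(?P<version>[a-z]+)?$"
)


class BillId(NamedTuple):
    congress: int
    bill_type: str
    number: int
    version: Optional[str]


def parse_bill_id(bill_id: str) -> BillId:
    """Split e.g. '116hr1500' or '116hr1500ih' into its components."""
    match = BILL_ID_PATTERN.match(bill_id.strip().lower())
    if match is None:
        raise ValueError(f"Not a recognizable bill identifier: {bill_id!r}")
    return BillId(
        congress=int(match["congress"]),
        bill_type=match["type"],
        number=int(match["number"]),
        version=match["version"],
    )
```

For example, `parse_bill_id("116hr1500")` yields congress 116, type `hr`, number 1500 and no version.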

Data: download bills with the unitedstates/congress scraper

To download and process data from earlier congresses, see details in DATA_BACKGROUND. There are ~50 GB of data in total for Congresses 110-117, including processed JSON files, and `DATA_BACKGROUND` describes options for downloading and processing this data. For a 'quick start', you can use data from only the most recent Congress:

Download data from the most recent Congress

cd /path/to/uscongress
./run govinfo --bulkdata=BILLSTATUS --congress=117
./run bills

Note: You may need to separately clone the `unitedstates/congress` repository, run the command from there, and link the `data` directory to a directory `congress/data` in this repository.

Celery task to update bill downloads and data

Updates to the data are done through the Celery taskrunner (see https://docs.celeryproject.org/en/stable/getting-started/introduction.html). Details of the tasks in BillMap are in CELERY.

To run the Celery worker

$ pyenv activate BillMap
$ cd ~/.../server_py/flatgov
$ celery worker -Q bill -A flatgov.celery:app -n flatgov.%%h --loglevel=info

Set up the Celery schedule

celery beat -S redbeat.RedBeatScheduler -A flatgov.celery:app --loglevel=info

Run the Django application

Run the application from server_py/flatgov (within the Python virtual environment you created above):

$ cd server_py/flatgov
$ python manage.py runserver

This will serve the application on localhost:8000. Pages for individual bills follow the form: http://localhost:8000/bills/116hr1500

Bill-to-bill data pages are at: /bills/compare/115s211/115hr604/

Deployment

Deployment instructions

Deployment instructions are in DEPLOYMENT. The application is served on a Linux server (currently Ubuntu 18.04.5 LTS on AWS).

System components

The components of the system are:

Linux server on AWS (Ubuntu 18.04.5 LTS)
Nginx web server
Postgresql server (see DATABASE)
Elasticsearch server for search and bill similarity processing (see ES_SIMILARITY)
Python/Django application (this repository)
uwsgi Python server running the Django application, proxied by Nginx above
Bill metadata and xml, downloaded using scrapers from unitedstates/congress

Scrapers: other data scraped from public sources, including:

- Statements of Administration Policy
- Press statements
- Congressional Budget Office reports
- Congressional Research Service reports
- Calendar information from various congressional sources

These are described in more detail in SCRAPING.

Related Bills and Bill Similarity

Related bills

Bills that are related to each other are identified in three ways:

Metadata (in billstatus XML) from the Congressional Research Service identifies bills as identical or related (e.g through a Committee process). We show these in the Related Bills table of the application.
Same or similar titles. Two bills are considered related if they have exactly the same title, or differ only in the year (e.g. 'The Very Important Information Act of 2022' and 'The Very Important Information Act of 2023').
Calculation of text similarity between bills. We calculate similarity between bills using the bill_similarity module (see below).
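The title rule in the list above (identical titles, or titles that differ only in the year) can be sketched as follows. This is an illustration under the assumption that the year appears as a trailing "of YYYY"; the helper name `differ_only_by_year` is hypothetical and is not the repository's implementation.

```python
import re

# Matches a trailing " of 2022"-style year at the end of a title (assumed form).
YEAR_SUFFIX = re.compile(r"\s+of\s+\d{4}\s*$", re.IGNORECASE)


def differ_only_by_year(title_a: str, title_b: str) -> bool:
    """True when both titles end in a year and are identical apart from it."""
    a, b = title_a.strip(), title_b.strip()
    return (
        a != b
        and YEAR_SUFFIX.search(a) is not None
        and YEAR_SUFFIX.search(b) is not None
        and YEAR_SUFFIX.sub("", a) == YEAR_SUFFIX.sub("", b)
    )
```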

Bill similarity — text similarity

Overview

Bill-to-bill comparison is impractical

Calculating the text similarity between two bills can be relatively straightforward: we can find the percentage of overlapping text between the two bills, or use an existing text similarity algorithm (e.g. Levenshtein distance).

However, for a database of the size of this one, calculating the similarity of all bills is impractical, particularly if we want to update the data. The calculation requires approximately n² comparisons, where n is the number of bills. For the ~80k bills in our corpus, this would be 6.4 billion comparisons.
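The arithmetic behind that estimate, as a quick check. Counting each unordered pair only once roughly halves the figure, which leaves it in the billions either way.

```python
n_bills = 80_000  # approximate corpus size quoted above

# Every bill compared against every bill (the n² figure in the text).
ordered_comparisons = n_bills ** 2

# Each unordered pair of distinct bills counted once: n(n-1)/2.
unordered_pairs = n_bills * (n_bills - 1) // 2

print(f"{ordered_comparisons:,}")  # 6,400,000,000
print(f"{unordered_pairs:,}")      # 3,199,960,000
```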

Search-based comparison

To improve performance, we use search. In particular, we search each section of the latest version of a bill against an index of all bills, and combine the results of all of the section-wise searches to get a total score. We then have to filter results to remove duplicates (due to the multiple versions of each bill).

This approach is imperfect, since many individual sections may share language with unrelated bills (e.g. an Effective Date provision). Smaller bills may not have enough text to reliably find the most relevant 'similar' bills. On the other hand, large bills may match many similar bills on a subset of sections.
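A minimal sketch of the combining and de-duplication step, working on in-memory search results rather than a live index. It assumes that each section search returns `(bill_version_id, score)` pairs, that versions are collapsed by keeping the best-scoring version per section, and that section scores are simply summed; these choices and both function names are illustrative assumptions, not the application's actual scoring.

```python
import re
from collections import defaultdict


def bill_number(bill_version_id):
    """Strip a trailing version suffix: '116hr5150ih' -> '116hr5150'."""
    return re.sub(r"(?<=\d)[a-z]+$", "", bill_version_id)


def combine_section_scores(section_hits):
    """Combine per-section search hits into one ranked list of bills.

    section_hits holds one list per searched section, each containing
    (bill_version_id, score) pairs returned by that section's search.
    """
    totals = defaultdict(float)
    for hits in section_hits:
        # Keep only the best-scoring version of each bill for this section.
        best = {}
        for version_id, score in hits:
            number = bill_number(version_id)
            best[number] = max(best.get(number, 0.0), score)
        for number, score in best.items():
            totals[number] += score
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Summing favours large bills that match on many sections, which is one of the trade-offs noted above; a normalized score would be an alternative.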

This application sets up the basic mechanisms for similarity measurements (described further in ES_SIMILARITY), which are open to many refinements (e.g. with the similarity metric that is used in the comparison).

Finding Similar Bills

As shown below, the application has three main views to explore bill similarity:

A list of similar bills, in order of similarity.
A section-by-section analysis of which other bills have similar sections.
A bill-to-bill comparison showing matching sections between two bills.

Note that small sections with common language will not show as matches using our methodology. We will only show sections that use distinct language, where that language is shared between sections of the two bills.

Figure 1: Similar Bills

Figure 2: Section-by-section List

Figure 3: Bill-to-bill Similarity

Figure 4: Text-to-bill Similarity

Relevant Committee Documents

To load Relevant Committee Documents data use the following instructions:

After installing the requirements under the scrapers directory, run the crec_scrape_urls.py file in that directory.
Go to the crec_scrapy folder and run the “scrapy crawl crec” command. It takes about an hour to scrape all the data into the crec_scrapy/data/crec_data.json file.
Copy the scraped data from crec_scrapy/data/crec_data.json to the Django base directory, deleting or replacing any old data there first.
Run the Django command “./manage.py load_crec” to populate the database.


billmap's Issues

Collect cboCostEstimates from data.json for each bill

The data is in cboCostEstimates in the data.json. We want to get the data for a bill and for related bills. We should add that to the bill table.

Error when missing sponsor state or district

Request Method: GET
Request URL: http://localhost:8000/bills/116s3231
Django Version: 3.1
Exception Type: TypeError
Exception Value: can only concatenate str (not "NoneType") to str
Exception Location: /Users/arihershowitz/Documents/workspace/FlatGovDir/FlatGov/server_py/flatgov/bills/views.py, line 70, in makeSponsorBracket
...
/Users/arihershowitz/Documents/workspace/FlatGovDir/FlatGov/server_py/flatgov/bills/views.py, line 135, in bill_view
        context['bill']['sponsor_fullname'] = deep_get(bill_meta, 'sponsor', 'title') + '. ' + sponsor_name + ' '  + makeSponsorBracket(bill_meta.get('sponsor'))  …
/Users/arihershowitz/Documents/workspace/FlatGovDir/FlatGov/server_py/flatgov/bills/views.py, line 70, in makeSponsorBracket
    return '[' + party + '-' +  sponsor.get('state') + sponsor.get('district') + ']' …
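One possible None-safe variant of the failing line, as a sketch. The traceback shows that `state` or `district` can be `None` (a Senate bill such as 116s3231 has no district). The `party` key and the function name are assumptions, since the original function body is not shown in full.

```python
def make_sponsor_bracket(sponsor):
    """Build a '[party-stateDistrict]' label, tolerating missing fields."""
    if not sponsor:
        return ""
    party = sponsor.get("party") or ""  # assumed key; not shown in the traceback
    state = sponsor.get("state") or ""
    district = sponsor.get("district")
    district_text = "" if district is None else str(district)
    return "[" + party + "-" + state + district_text + "]"
```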

Show metadata for when congress is in session on monthly calendar

https://www.senate.gov/legislative/2020_schedule.htm

https://www.senate.gov/legislative/resources/pdf/2020_calendar.pdf

For calendar, create a key for colors, symbols, and other icons

The key should be placed underneath the calendar on the home page.
The user should be able to toggle between showing and hiding the key.

Display sponsors of related bills

Mark titles with an indicator of whether they are for the whole bill or just a portion

This would eventually be shown in the related bills table in the UI. The data comes from data.json in the titles.is_for_portion: false or titles.is_for_portion: true field. We may need to capture this information along with the bill number when making the titlesIndex.json.

E.g.

"Requiring congressional approval prior to engaging in hostilities within the sovereign country of Iran.": ["116hjres58(w)"]
Here (w) marks is_for_portion=false (the title applies to the whole bill), and (p) marks is_for_portion=true (the title applies to a portion).
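A small sketch of how the marker from the example above could be attached while building the titles index. The function name and the in-memory index shape (title mapped to a list of marked bill numbers) are assumptions based on that example only.

```python
def add_title_to_index(titles_index, title, billnumber, is_for_portion):
    """Record billnumber under title, suffixed with (p) or (w)."""
    marker = "(p)" if is_for_portion else "(w)"
    entry = billnumber + marker
    bills = titles_index.setdefault(title, [])
    if entry not in bills:  # avoid duplicate entries on re-runs
        bills.append(entry)
    return titles_index
```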

Add num to header in similar bills list, where available

Add scraper for CRS reports

Start with everycrsreport.com (built by Josh). Can download data using instructions on About page.

May need to limit bill citations to summary.

CRS reports may be updated over time, so date of CRS report does not determine related bills.
Not many reports reference bills, so key will be filtering. CRS report will typically cite the bill name.
May be able to use categorization to narrow search criteria or confirm relationship with bills.

NOTE: only 1 in 10 or even 1 in 100 of the reports will be relevant to us. These are only reports that include a mention of the bill title or number or other related information.

Can start by indexing CRS reports and searching by bill name & bill number.
Better data is before March 2020. Recent data is pdf => OCR, so may not be as usable.

Create home page

This feature is to implement the home page based on the UX wireframe:
https://preview.uxpin.com/74759f3657712ac7dd6218004869a3c263dec460#/pages/132039538/simulate/no-panels?mode=i

Create Home page template
Configure header to route to home page
Create search bar
- Implement typeahead
- Implement search functionality to route users to /bills/{billNumber}
Discuss "whats happening this week" feature with UX to create requirements

Re-style tabular table to match UX

Currently the tabular implementation of the tables doesn't match the UX. We'll need to implement the current CSS in bootstrap tables to reskin the tabular tables.

Tabular:

Current bootstrap iteration (matching UX):

The weekly calendar should begin on Monday instead of Sunday

Add titles that differ by year in a `titles_year` field

This is working in the relatedBills branch:

Add shared sponsors to relatedBills

In data.json, the sponsor item is an object of the form:

cosponsors is an array of the form:

In relatedBills, if the two bills share the sponsor (both bioguide id and name), add the cosponsor item. If they share any sponsors, add a sponsors array with the shared sponsors (as they are found in the related bill).

Apparent problem indexing 116hr5150

As discussed in #32, the similar bills query does not return sections from 116hr5150ih. It appears this is true, even when the text from section 602 of 116hr5150ih is searched. I extended the results for the search to 100 (from the default of 10) and the bill is still not among the results. This suggests that the bill is not being properly indexed.

Check bills in 116hr5150 for related bills

Section 602 incorporates Mar A Lago Act => (s769, hr1736)

https://flatgov.linkedlegislation.com/bills/116s769

Create model for related bills

This model or models will store the information produced by relatedBills.py, which is currently stored in relatedBills.json

relatedBills.py

Running into some hiccups thought I would discuss here:

I am creating a file called relatedBills.py to do the following:

Load the titlesIndex file

import gzip
import json

# PATH_TO_TITLES_INDEX is assumed to be defined elsewhere in the module.
def loadTitlesIndex(titleIndexPath=PATH_TO_TITLES_INDEX, zip=True):
    titlesIndex = {}
    if zip:
        # Read the gzip-compressed index in text mode.
        try:
            with gzip.open(titleIndexPath + '.gz', 'rt', encoding='utf-8') as zipfile:
                titlesIndex = json.load(zipfile)
        except FileNotFoundError:
            raise Exception('No file at ' + titleIndexPath + '.gz')
    else:
        # Read the uncompressed index.
        try:
            with open(titleIndexPath, 'r') as f:
                titlesIndex = json.load(f)
        except FileNotFoundError:
            raise Exception('No file at ' + titleIndexPath)
    return titlesIndex

Loop over the data in the titlesIndex file and grab the list of billnumbers for each title
Loop over each list of bill numbers and return a dictionary containing a billnumber key, and list containing all the billsnumbers that share the same title

def getSameTitles():
    titlesIndex = loadTitlesIndex()
    newIndex = {}
    for key, value in titlesIndex.items():
        bills = value
        for bill in bills:
            if newIndex.get(bill):
                newIndex[bill].append(bill)
            else:
                newIndex[bill] = bill

So far I'm getting a string attribute error, leading me to believe that the value I get from newIndex[bill] is not what I'm assuming it is. Still working on it, but would appreciate any input.
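A likely cause, judging from the snippet: `newIndex[bill] = bill` stores a string, so the next time the same bill is seen, `newIndex[bill].append(...)` is called on a `str`, which has no `append` method. The loop also appends the bill to its own entry rather than the other bills with that title, and the function returns nothing. One possible corrected sketch follows; it takes the index as a parameter so it can be run without the file, and the name `get_same_titles` is illustrative.

```python
def get_same_titles(titles_index):
    """Map each bill number to the other bills that share any of its titles."""
    same_titles = {}
    for bills in titles_index.values():
        for bill in bills:
            # Always store a list, so later appends are valid.
            others = same_titles.setdefault(bill, [])
            for other in bills:
                if other != bill and other not in others:
                    others.append(other)
    return same_titles
```

With an index such as `{"Title A": ["116s130", "115hr123"]}`, each bill ends up listing the other.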

Not all related bills are being added to the database

From a review of a few bills, it seems that bills from previous congresses are not being added, and some other related bills are missing.

https://flatgov.linkedlegislation.com/bills/116hr1605

But in the db:

Save related bills information (from scripts/relatedBills.py) to the database

This is currently stored in a flat file, relatedBills.json.

In addition to updating the scripts, document the change (e.g. in README.adoc and deployment documentation)

Skip adding text when there is no summary (adding cosponsors)

Running python manage.py related_bills, I get this error:

Adding sponsor info
116hjres58 - bill has updated.
Cosponsors are added to bill - 116hjres58.
Traceback (most recent call last):
  File "manage.py", line 22, in <module>
    main()
  File "manage.py", line 18, in main
    execute_from_command_line(sys.argv)
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/core/management/__init__.py", line 401, in execute_from_command_line
    utility.execute()
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/core/management/__init__.py", line 395, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/core/management/base.py", line 330, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/core/management/base.py", line 371, in execute
    output = self.handle(*args, **options)
  File "/Users/arihershowitz/Documents/workspace/FlatGovDir/FlatGov/server_py/flatgov/common/management/commands/related_bills.py", line 14, in handle
    makeAndSaveRelatedBills()
  File "/Users/arihershowitz/Documents/workspace/FlatGovDir/FlatGov/server_py/flatgov/common/relatedBills.py", line 171, in makeAndSaveRelatedBills
    addSponsors()
  File "/Users/arihershowitz/Documents/workspace/FlatGovDir/FlatGov/server_py/flatgov/common/relatedBills.py", line 109, in addSponsors
    summary = billData.get('summary', {}).get('text')
AttributeError: 'NoneType' object has no attribute 'get'
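The default passed to `dict.get` is only used when the key is missing, so a bill whose data contains `"summary": None` still returns `None`, and the chained `.get('text')` then fails. A None-safe sketch is below; the helper name is hypothetical, and it also tolerates `billData` itself being `None`, which the traceback does not rule out.

```python
def get_summary_text(bill_data):
    """Return the summary text, or None when there is no summary."""
    # `or {}` covers both a missing key and an explicit None value.
    summary = (bill_data or {}).get("summary") or {}
    return summary.get("text")
```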

Build scraper for Statements of Administration Policy, connect to bill

https://obamawhitehouse.archives.gov/omb/legislative_sap_default

https://www.presidency.ucsb.edu/documents/app-categories/written-statements/presidential/statements-administration-policy?items_per_page=10&page=127

George W. Bush SAPs: https://georgewbush-whitehouse.archives.gov/omb/legislative/sap/index.html

Create a relatedBills.json for same_titles

This is the topic of PR #5

Create an output relatedBills.json that collects bills that have the same title.

The first version of this will be of the form:

{
  "116s130": {
    "same_titles": ["115hr123", "116hr201", ...]
  },
  ...
}

The array in same_titles lists every bill that shares at least one title with the key bill. The relationship is symmetrical: if 115hr123 is in the same_titles array for 116s130, then 116s130 should be in the array for 115hr123.
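The symmetry requirement can be checked mechanically. The sketch below assumes the JSON has been loaded into a dict of the form shown above; the function name `asymmetric_pairs` is hypothetical.

```python
def asymmetric_pairs(related_bills):
    """Return (bill, other) pairs where other lacks the reverse same_titles entry."""
    missing = []
    for bill, info in related_bills.items():
        for other in info.get("same_titles", []):
            reverse = related_bills.get(other, {}).get("same_titles", [])
            if bill not in reverse:
                missing.append((bill, other))
    return missing
```

An empty result means every same_titles entry has its counterpart.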

I will open another issue for a 'next' step to provide a richer JSON of related bills.

On the Bills page, change the order of tabs in the "Understand the Context" section

Create a 'bill' model

Model the 'bill' in the Django models for flatgov. This should be compatible with the govtrack models (Bill, BillType, BillStatus, BillSummary), and be able to connect to a model that stores information about related bills.

Create enriched relatedBills JSON

This is an extension of #6 and should be combined into PR #5

Generate a JSON that contains rich information about bill similarity. For each bill, there will be a list of objects for related bills. The JSON will have billnumbers as the keys, and the value will be an array of objects corresponding to related bills. In each object will be information about what the two bills share. So, for example:

"116s130": [
  {
    "billCongressTypeNumber": "116hr201",
    "cosponsors": ["bioguide_id1", "bioguide_id2"],
    "titles": ["Shared Title 1", "Shared Title 2", ...],
    "similar_title": ["Similar (nonidentical) Title 1", "Similar (nonidentical) Title 2", ...]
  },
  ...
]

Use model data to display in the `bills` view, for the related bills and sponsors tables

Related to #44, #45, #46. The Django UI currently opens and reads relatedBills.json and displays that data (held in memory) in the bills view. This change will update views.py to use the database models instead.

Link bill number in tables

Allow a user to open the link to view that bill's page.

Add section number to section header in similar bills display table, where available

Interesting example: 115hr5164

A number of bills matched on title (identical & only year changed), many of which were not identified in GPO data as related:

Sources for 'in Focus' information

House Leader's webpage, has a list of upcoming items by week. (e.g. www.majorityleader.gov/content/weekly-leader-friday-july-24-2020)
Rules Committee website

Add two-way relationships for bills identified as 'related'

Some bills have been identified (e.g. by CRS) as related only in one direction. For example, CRS identifies that 115hr2023 is related to 115s1520 (see http://flatgov.linkedlegislation.com/bills/115hr2023) but not the other way around (see http://flatgov.linkedlegislation.com/bills/115s1520).

We may want to add a category for CRS* when this occurs, to add the relationship in 115s1520. Note that this relationship is identified in the similar bill search when searching section 202 of 115s1520.
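A sketch of the proposed back-fill, assuming related bills are held as a dict mapping each bill number to a dict of `{related_billnumber: reason}`. That shape, the function name and the use of "CRS*" as the reverse label follow the suggestion above but are otherwise assumptions.

```python
def add_reverse_relations(related, reverse_reason="CRS*"):
    """Return a copy of `related` where every relation also exists in reverse.

    Existing reverse entries keep their original reason; only missing
    ones are added, labelled with `reverse_reason`.
    """
    result = {bill: dict(others) for bill, others in related.items()}
    for bill, others in related.items():
        for other in others:
            result.setdefault(other, {}).setdefault(bill, reverse_reason)
    return result
```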

The list of related bills in the top left panel does not always match the table of related bills

Here, 116hr19 is in the list, but not the table:

Create a json file for each bill that lists metadata including related bills

Duplicate bill being added in related bills

e4706da

Measure bill similarity using the ES sections index

We want to preprocess each bill to find 'similar' bills. The list of bills that are similar will be added to our existing 'related bills' JSON, with a new category, maybe 'es-similarity'.

The bill similarity algorithm should work something like this:

1. Create an ES section index (done).
2. For a given bill + version (e.g. data/116/dtd/116hr1500ih), break it down into individual sections and headers. We already have code for this in the ES indexing scripts, using lxml. To break up a bill into sections, we can reuse the code or method here:

https://github.com/aih/FlatGov/blob/master/flatgovtools/elastic_load.py#L70

 sections = billTree.xpath('//section')
 headers = billTree.xpath('//header')

3. For each section, serialize to text ('section_text': etree.tostring(section, method="text", encoding="unicode")) and query the index from step 1 to find similar bills. The query can use the moreLikeThis search from elastic_load.py.

The moreLikeThis search returns a list of (a) similar sections, (b) the bills that those sections are from, and (c) the similarity score for the searched text. For each section in a bill we will save this information. We will need to be able to vary the number of similar bills returned (the current ES default is 10) and the threshold for similar bills (the default should be a score of ~20).

4. Combine the information from step 3 to get a list of all similar bills, ranked by score. The score will be a normalized combination of the scores of all of the sections of the current bill from step 2 (e.g. data/116/dtd/116hr1500ih).
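For the per-section query, a request body along the following lines could be sent to Elasticsearch. This is a sketch only: it builds the body as a plain dict, uses the standard `more_like_this` query, and assumes the indexed field is called `section_text`; the actual moreLikeThis helper in elastic_load.py may be structured differently (for example with nested section documents). The `size` and `min_score` defaults mirror the values mentioned above.

```python
def more_like_this_body(section_text, size=10, min_score=20.0):
    """Build a search body that finds sections similar to section_text."""
    return {
        "size": size,            # number of hits to return
        "min_score": min_score,  # drop hits below this similarity score
        "query": {
            "more_like_this": {
                "fields": ["section_text"],  # assumed index field name
                "like": section_text,
                "min_term_freq": 1,
                "min_doc_freq": 1,
            }
        },
    }
```

Keeping `size` and `min_score` as parameters makes it easy to vary both the number of results and the similarity threshold per search.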

We will do this in stages.

Stage 1 will just get the top X similar bills and their normalized similarity scores.

This list will be saved to the related bills JSON with the es-similarity category.

Stage 2 will allow users to ask more detailed questions. For example:

a) navigate the bill and show similar sections of other bills for each section of the bill
b) are there bills that are fully contained in other bills (i.e. all or most of the sections of bill A are very similar to sections in bill B).

The goal of Stage 1 is that a user can just search a bill, like they do now, typing 116hr1500, and they get a table of related bills and, in addition to the current categories (title, CRS), there are bills that are listed because of their text similarity.

Tables should allow resizing of columns

The new styling of the tables has a search feature (good). But a few things have changed that we want back: columns should be resizeable, especially the Title and other columns that may be cut off.

On calendar, hide times before 8:00 a.m. and after 6:00 p.m.

Don't hide these times if an event occurs during them.

Check 'incorporated via' bills for bills that should be recognized as related

https://www.govtrack.us/congress/bills/browse#current_status[]=1&enacted_ex=on

Link to Committee transcript data for bill (no need to scrape)

Hearing transcript and report from proceedings

 https://www.congress.gov/event/115th-congress/senate-event/LC52421/text?loclr=bloglaw 

 https://www.govinfo.gov/app/collection/chrg 

In relatedBills, add items for relationships identified in related_bills of data.json

E.g.

Develop API calls for Press statements

Derek Willis may be able to point to the sources

Scrape data for CBO report and relate to bill

The data is already available in cboCostEstimates in the data.json. We want to get the data for a bill and for related bills. We should add that to our table.

May scrape from cbo.gov: https://www.cbo.gov/cost-estimates . The CBO data is also on congress.gov. Would be great to relate to a logical routing pattern, e.g.

116/hr1

R Street may have a process to do this.

Integrate logic from this repo as a PR for govtrack

Provide additional functions for get_related bills.
See https://github.com/govtrack/govtrack.us-web/blob/914a928a9585765c72715c705f5bfc43548f6c6c/bill/models.py#L1451

Error running python manage.py bill_data

Entering directory: /Users/arihershowitz/Documents/workspace/FlatGovDir/FlatGov/server_py/flatgov/congress/data/116/bills/sconres/sconres47
Traceback (most recent call last):
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/models/query.py", line 573, in get_or_create
    return self.get(**kwargs), False
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/models/query.py", line 431, in get
    self.model._meta.object_name
bills.models.DoesNotExist: Bill matching query does not exist.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
psycopg2.errors.StringDataRightTruncation: value too long for type character varying(500)


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "manage.py", line 22, in <module>
    main()
  File "manage.py", line 18, in main
    execute_from_command_line(sys.argv)
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/core/management/__init__.py", line 401, in execute_from_command_line
    utility.execute()
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/core/management/__init__.py", line 395, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/core/management/base.py", line 330, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/core/management/base.py", line 371, in execute
    output = self.handle(*args, **options)
  File "/Users/arihershowitz/Documents/workspace/FlatGovDir/FlatGov/server_py/flatgov/common/management/commands/bill_data.py", line 26, in handle
    updateBillsMeta()
  File "/Users/arihershowitz/Documents/workspace/FlatGovDir/FlatGov/server_py/flatgov/common/billdata.py", line 255, in updateBillsMeta
    walkBillDirs(processFile=addToBillsMeta)
  File "/Users/arihershowitz/Documents/workspace/FlatGovDir/FlatGov/server_py/flatgov/common/billdata.py", line 67, in walkBillDirs
    processFile(dirName=dirName, fileName=fname)
  File "/Users/arihershowitz/Documents/workspace/FlatGovDir/FlatGov/server_py/flatgov/common/billdata.py", line 248, in addToBillsMeta
    defaults=bill_data
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/models/query.py", line 576, in get_or_create
    return self._create_object_from_params(kwargs, params)
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/models/query.py", line 610, in _create_object_from_params
    obj = self.create(**params)
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/models/query.py", line 447, in create
    obj.save(force_insert=True, using=self.db)
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/models/base.py", line 751, in save
    force_update=force_update, update_fields=update_fields)
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/models/base.py", line 789, in save_base
    force_update, using, update_fields,
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/models/base.py", line 892, in _save_table
    results = self._do_insert(cls._base_manager, using, fields, returning_fields, raw)
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/models/base.py", line 932, in _do_insert
    using=using, raw=raw,
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/models/query.py", line 1249, in _insert
    return query.get_compiler(using=using).execute_sql(returning_fields)
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/models/sql/compiler.py", line 1395, in execute_sql
    cursor.execute(sql, params)
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/backends/utils.py", line 98, in execute
    return super().execute(sql, params)
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/backends/utils.py", line 66, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/backends/utils.py", line 75, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/utils.py", line 90, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
django.db.utils.DataError: value too long for type character varying(500)
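
The `DataError` above indicates that a string longer than 500 characters was written to a `varchar(500)` column. One way to locate the offending value before saving is a small length check like the sketch below. The `find_overlong` helper and the field names in the sample record are hypothetical; only the 500-character limit comes from the error message.

```python
from typing import Dict, List, Tuple

MAX_LENGTH = 500  # limit taken from "character varying(500)" in the error


def find_overlong(record: Dict[str, object],
                  max_length: int = MAX_LENGTH) -> List[Tuple[str, int]]:
    """Return (field, length) pairs for string values longer than max_length."""
    return [
        (name, len(value))
        for name, value in record.items()
        if isinstance(value, str) and len(value) > max_length
    ]


# Hypothetical record; the real field names are not shown in the traceback.
record = {"title": "x" * 620, "bill_number": "116hr1500"}
print(find_overlong(record))  # [('title', 620)]
```

Depending on the field, the fix is either to raise the column's `max_length` in a migration or to truncate the value before saving.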

Integrate relatedBills info into Django app

Done, using the Tabulator JS table library.

Missing some related bills files

Request Method: | GET
-- | --
Request URL: | https://flatgov.linkedlegislation.com/bills/116s4230
Django Version: | 3.1
Exception Type: | FileNotFoundError
Exception Value: | [Errno 2] No such file or directory: '/opt/flatgov/FlatGov/server_py/flatgov/congress/data/relatedbills/116s4230.json'
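
One possible way to avoid the error page when a related-bills file is missing is to treat the absent file as "no related bills" rather than letting `FileNotFoundError` propagate. The sketch below is a minimal illustration, not the project's actual loader: the `load_related_bills` name and its arguments are hypothetical, and it assumes each file holds a single JSON object named after the bill id, as the path in the error suggests.

```python
import json
from pathlib import Path
from typing import Optional


def load_related_bills(data_dir: Path, bill_id: str) -> Optional[dict]:
    """Return the parsed related-bills JSON for bill_id, or None if no file exists."""
    path = data_dir / f"{bill_id}.json"
    try:
        with path.open(encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        # No related-bills file was generated for this bill.
        return None
```

A view could then render an empty related-bills table when the function returns `None`, for example with `related = load_related_bills(data_dir, "116s4230") or {}`.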

Document the `bills` table and how it relates to govtrack tables

Error running python manage.py migrate

This worked with a previous version of the feature/bill-sponsor branch, but now at c63877d, I get:

File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
psycopg2.errors.ObjectInUse: cannot ALTER TABLE "bills_bill" because it has pending trigger events


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "manage.py", line 22, in <module>
    main()
  File "manage.py", line 18, in main
    execute_from_command_line(sys.argv)
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/core/management/__init__.py", line 401, in execute_from_command_line
    utility.execute()
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/core/management/__init__.py", line 395, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/core/management/base.py", line 330, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/core/management/base.py", line 371, in execute
    output = self.handle(*args, **options)
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/core/management/base.py", line 85, in wrapped
    res = handle_func(*args, **kwargs)
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/core/management/commands/migrate.py", line 245, in handle
    fake_initial=fake_initial,
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/migrations/executor.py", line 117, in migrate
    state = self._migrate_all_forwards(state, plan, full_plan, fake=fake, fake_initial=fake_initial)
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/migrations/executor.py", line 147, in _migrate_all_forwards
    state = self.apply_migration(state, migration, fake=fake, fake_initial=fake_initial)
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/migrations/executor.py", line 227, in apply_migration
    state = migration.apply(state, schema_editor)
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/migrations/migration.py", line 124, in apply
    operation.database_forwards(self.app_label, schema_editor, old_state, project_state)
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/migrations/operations/fields.py", line 236, in database_forwards
    schema_editor.alter_field(from_model, from_field, to_field)
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/backends/base/schema.py", line 572, in alter_field
    old_db_params, new_db_params, strict)
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/backends/postgresql/schema.py", line 168, in _alter_field
    new_db_params, strict,
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/backends/base/schema.py", line 726, in _alter_field
    params,
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/backends/base/schema.py", line 142, in execute
    cursor.execute(sql, params)
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/backends/utils.py", line 98, in execute
    return super().execute(sql, params)
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/backends/utils.py", line 66, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/backends/utils.py", line 75, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/utils.py", line 90, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/Users/arihershowitz/.pyenv/versions/flatgov/lib/python3.7/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
django.db.utils.OperationalError: cannot ALTER TABLE "bills_bill" because it has pending trigger events

Handle cosponsor names like 'Sanford D. Bishop, Jr.', which currently become 'Jr. Sanford D. Bishop'
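
A minimal sketch of one way to handle this, assuming the current behavior comes from flipping the name around its comma (as for "Last, First" names). The `display_name` helper and the suffix list are hypothetical and not taken from the repository.

```python
# Generational suffixes to keep at the end of the name (illustrative list).
SUFFIXES = {"jr.", "jr", "sr.", "sr", "ii", "iii", "iv"}


def display_name(name: str) -> str:
    """Convert 'Last, First' to 'First Last' while keeping a trailing suffix in place."""
    parts = [p.strip() for p in name.split(",") if p.strip()]
    suffix = ""
    # Detach a trailing suffix so it is not mistaken for a first name.
    if len(parts) > 1 and parts[-1].lower() in SUFFIXES:
        suffix = parts.pop()
    if len(parts) == 2:
        # "Last, First" becomes "First Last".
        base = f"{parts[1]} {parts[0]}"
    else:
        # Already in display order, or an unexpected shape: leave as-is.
        base = ", ".join(parts)
    return f"{base}, {suffix}" if suffix else base


print(display_name("Sanford D. Bishop, Jr."))   # Sanford D. Bishop, Jr.
print(display_name("Bishop, Sanford D., Jr."))  # Sanford D. Bishop, Jr.
```

Separating the suffix before reordering keeps the two concerns apart: the comma that introduces "Jr." is never treated as the "Last, First" separator.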

Readme Changes

Did you still want to include the info on the scraper? Or perhaps a caveat about it actually taking a long time to run?

Also worth noting the `$ python3 <filename>` command for Macs. I'm a Python novice, so the difference between the `python` and `python3` commands wasn't immediately apparent.

I think it would help to describe in more detail where the congress data should live in the repo before the scripts are run. Some documentation of the expected output when the scripts succeed and when they fail would also be useful.