unitedstates / congress-legislators

Members of the United States Congress, 1789-Present, in YAML/JSON/CSV, as well as committees, presidents, and vice presidents.

License: Creative Commons Zero v1.0 Universal

Python 98.55% Shell 1.45%

congress-legislators's People

Contributors

arrighik, asebold, atbaker, bchartoff, bycoffe, christineletts, crdunwel, csnardi, dannguyen, drinks, dwillis, essandess, gphemsley, h4ck3rm1k3, hugovk, joelcollinsdc, joshdata, konklone, laurenpully, lavaturtle, mattpaz, mrumsky, msimonborg, nickoneill, patrickvankessel, plantfansam, putorti, tcarobruce, thinkcontext, timball

congress-legislators's Issues

Track by seat instead of lawmaker?

I'm putting together a network diagram of voting patterns in which it's helpful to treat John Kerry and Mo Cowan as one person, since they occupy the same senate seat and belong to the same party. Is there a programmatic way to create a chain of lawmakers occupying a single seat for a single session? Happy to do it myself if so.
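One way this could be approached, as a rough sketch only: group terms by a seat key and order the occupants by start date. The record layout below (a `terms` list with `type`, `state`, `class`, `start`, `end`) is an assumption about the data files, the `seat_key` and `seat_chains` helpers are hypothetical, and the dates are illustrative placeholders rather than verified values.

```python
from collections import defaultdict

# Hypothetical sample records; field names and dates are assumptions for illustration.
legislators = [
    {"name": "John Kerry", "terms": [
        {"type": "sen", "state": "MA", "class": 2, "start": "2009-01-06", "end": "2013-02-01"}]},
    {"name": "Mo Cowan", "terms": [
        {"type": "sen", "state": "MA", "class": 2, "start": "2013-02-01", "end": "2013-07-16"}]},
    {"name": "Elizabeth Warren", "terms": [
        {"type": "sen", "state": "MA", "class": 1, "start": "2013-01-03", "end": "2019-01-03"}]},
]


def seat_key(term):
    """Identify a seat: state plus Senate class, or state plus House district."""
    if term["type"] == "sen":
        return (term["state"], "sen", term["class"])
    return (term["state"], "rep", term.get("district"))


def seat_chains(people, from_date, to_date):
    """Map each seat to the names of its occupants during a date window, in order."""
    chains = defaultdict(list)
    for person in people:
        for term in person["terms"]:
            # ISO 8601 date strings compare correctly as plain strings.
            if term["start"] <= to_date and term["end"] >= from_date:
                chains[seat_key(term)].append((term["start"], person["name"]))
    return {seat: [name for _, name in sorted(entries)]
            for seat, entries in chains.items()}


chains = seat_chains(legislators, "2013-01-03", "2015-01-03")
print(chains[("MA", "sen", 2)])
```

Whether two occupants of the same seat should be merged (same party or not) is a separate decision that would sit on top of the chain.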

Provide alternative formats for bulk download

Hey! This is the only complete data source I've been able to find regarding legislators and all their information (NYTimes API comes close but doesn't provide contact info). I typically work in the .NET environment so we have a ton of great JSON deserializers.

It's a bit of a handful and a lot of extra hassle to transform these YAML documents into a strongly-typed model... I would love if you provided multiple formats for these documents, like JSON or XML; something that's a bit easier to deserialize.

I plan to use GH's API to periodically check when these are updated to grab the latest and download them to an offline cache. I do my own parsing and converting using YamlDotNet but I still need to create intermediate objects to transform the YAML into a nicer object graph.

Consistently quote IDs

I think I saw somewhere that thomas IDs were quoted because they had leading zeroes, but not all thomas IDs are quoted; in the same file, some are and some aren't.

Also, it would be helpful if all numerical IDs were quoted, so that they were read as strings instead of integers. (As it is, I kind of have to play whack-a-mole with IDs to deal with string/integer type mismatches.)
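Until the files are consistent, a consumer-side workaround might look like the sketch below. `normalize_ids` is a hypothetical helper, and the sample values are made up. Note that it can only unify types: digits that the YAML parser has already dropped or reinterpreted from an unquoted value cannot be recovered this way.

```python
def normalize_ids(ids):
    """Return a copy of an id mapping with every value (or list item) as a string."""
    normalized = {}
    for key, value in ids.items():
        if isinstance(value, list):
            normalized[key] = [str(item) for item in value]
        else:
            normalized[key] = str(value)
    return normalized


# Made-up sample: one quoted id, one unquoted numeric id, one list of ids.
sample = {"thomas": "00136", "govtrack": 400050, "fec": ["H2OH13033"]}
print(normalize_ids(sample))
```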

Include Congress number in current committees

As I understand it, the "current" committee information is in fact still from the 112th Congress, but there is no easy way for me to confirm that that is the case. It would be helpful if the file somehow had some indication as to which Congress it applied to. Maybe even a comment that just indicated it (and was updated when the file was updated).

RSS URLs?

I've been collecting RSS urls for both member and committee sites. Any interest in me adding them to legislators-current and committees-current?

Script to scan for leads on official social media accounts

I'd like to take the lead on this, as I have a Ruby script I've used for a few years that I can port to Python and to the new dataset. It does two things:

  • For members for whom we don't have a record, it scans their official homepages' source for mentions of a URL with a twitter.com, youtube.com, or facebook.com domain, and produces a spreadsheet of "leads" to look into. This is the core.
  • Hits the Twitter API for existing Twitter handles to make sure they're still there and haven't been renamed. This is not a useful approach to port though, because Twitter API requires OAuth for everything now and is harder to use. Plus it's not a great measure since other people can take over abandoned usernames and make them look legit. What's probably more useful is a longer task that looks to see if the member's website still lists that account.

So I may or may not port the second thing, but I would probably add a blacklist of Twitter accounts that members often link to when they first get set up (the House GOP caucus account, for example).
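To make the first item concrete, here is a minimal sketch of the lead-scanning step, not the actual Ruby script or its port. The regular expression, the `find_social_leads` name, and the ignore-list entry are all assumptions for illustration; fetching the homepage is left out.

```python
import re

# Matches links on the three domains named above; stops at quotes, brackets, or whitespace.
SOCIAL_URL = re.compile(
    r"https?://(?:www\.)?(twitter|youtube|facebook)\.com/[^\s\"'<>]+",
    re.IGNORECASE,
)

# Hypothetical blacklist of shared accounts that members link to by default.
IGNORED_URLS = {"https://twitter.com/examplecaucus"}


def find_social_leads(html, ignored=IGNORED_URLS):
    """Return sorted (service, url) pairs found in a page's HTML source."""
    leads = set()
    for match in SOCIAL_URL.finditer(html):
        url = match.group(0).rstrip("/")
        if url.lower() not in ignored:
            leads.add((match.group(1).lower(), url))
    return sorted(leads)


page = (
    '<a href="https://twitter.com/RepExample">Twitter</a> '
    '<a href="http://www.facebook.com/RepExample/">Facebook</a> '
    '<a href="https://twitter.com/examplecaucus">Caucus</a> '
    '<a href="https://example.com/about">About</a>'
)
for service, url in find_social_leads(page):
    print(service, url)
```

The output is only a list of leads; each one would still need a human look before going into the social media file.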

FEC ids aren't consistent for legislators

Hey @konklone, fec_ids should probably not go in the id section because they can change. (Candidate IDs start with a letter for the chamber: House IDs start with an 'H' and Senate IDs start with an 'S'.)

To complicate matters, a single lawmaker can have multiple IDs at once; for the '12 cycle, Michele Bachmann has a presidential ID P20002978 and a House ID H6MN06074.

Another, more common, example: Martin Heinrich, in NM, just won a senate seat, with id S2NM00088 but he also has a house id for his old job H8NM01224; these are both valid FEC ids for the '12 cycle.

If one were being strict about this, fec_ids would have a many-to-many relationship with term (b/c the same id can be reused from one cycle to the next, I believe).
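If the IDs were stored as a list rather than a single value, the prefix could be used to sort them out. A small sketch, assuming the first letter is a reliable office marker ('H' and 'S' as described above, 'P' inferred from the presidential example); `fec_office` and `group_fec_ids` are hypothetical names.

```python
# Prefix-to-office mapping; 'P' is inferred from the presidential example above.
FEC_OFFICE_BY_PREFIX = {"H": "house", "S": "senate", "P": "president"}


def fec_office(fec_id):
    """Classify a candidate id by its first letter, or 'unknown' if unrecognized."""
    return FEC_OFFICE_BY_PREFIX.get(fec_id[:1].upper(), "unknown")


def group_fec_ids(fec_ids):
    """Group a lawmaker's ids by the office each one was issued for."""
    grouped = {}
    for fec_id in fec_ids:
        grouped.setdefault(fec_office(fec_id), []).append(fec_id)
    return grouped


print(group_fec_ids(["P20002978", "H6MN06074"]))
print(group_fec_ids(["S2NM00088", "H8NM01224"]))
```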

Determine how to handle mid-term changes to term-specific information

As mentioned in #10 and #15, certain changes can occur within a traditional term of office that affect data we would normally associate with a given term, most prominently a member's party affiliation and caucus preference.

My proposal of abusing/redefining "term" to have a separate one for each change met with controversy, so other ideas are necessary.

Photo gathering script

Even if we don't store the photos in version control, the script to gather them can live here.

Email addresses for legislators?

Is there any data source that provides direct email addresses for legislators? Would it be possible, if that's available, to include that in the bio section? If not, that's fine. It would be real nice to have direct email contacts.

LIS IDs for committees

Given that the information is now readily available, could we discuss the prospect of referring to committees via their LIS IDs instead of the current hodgepodge of Thomas IDs and partial LIS IDs?

At least on the Senate side, LIS IDs seem to be the de-facto standard for referring to committees, and this repository's way of providing committees (and especially subcommittees) with unique IDs seems unnecessarily convoluted.

Right now, standard committees look like

- name: Senate Committee on Appropriations
  url: http://appropriations.senate.gov/
  thomas_id: SSAP
  senate_committee_id: SSAP
  subcommittees:
  - name: Commerce, Justice, Science, and Related Agencies
    thomas_id: '16'
  - name: Energy and Water Development
    thomas_id: '22'
-----
- type: house
  name: House Committee on Agriculture
  url: http://agriculture.house.gov/
  thomas_id: HSAG
  house_committee_id: AG
  subcommittees:
  - name: Conservation, Energy, and Forestry
    thomas_id: '15'
    address: 1301 LHOB; Washington, DC 20515
    phone: (202) 225-2171

In our list (as far as I can tell) the thomas_id always matches the senate_committee_id, which (I think) is supposed to be the LIS ID's prefix. (However, LIS always refers to full committees as XXXX00.) On the House side, house_committee_id always seems to match the last two letters of the thomas_id, which, again, is the prefix of the LIS ID.

Similarly, for subcommittees, the thomas_id always seems to be equal to the LIS ID's numeric suffix. (In the above examples, Energy & Water's LIS ID is SSAP22, while Conservation, Energy, and Forestry is HSAG15).

I'm not sure if there are any inconsistencies in here (thus justifying the duplicative specification of IDs), but it sure would seem easiest to just specify each committee and subcommittee by its LIS ID.
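If the pattern described above really does hold everywhere (that is the open question), the LIS ID could be derived rather than stored. A minimal sketch under that assumption; `lis_id` is a hypothetical helper, not something in the repository.

```python
def lis_id(committee_thomas_id, subcommittee_thomas_id=None):
    """Build an LIS-style id: committee prefix plus a two-digit suffix.

    Full committees get the suffix '00'; subcommittees use their own
    thomas_id, zero-padded to two digits.
    """
    if subcommittee_thomas_id is None:
        suffix = "00"
    else:
        suffix = str(subcommittee_thomas_id).zfill(2)
    return committee_thomas_id + suffix


print(lis_id("SSAP"))        # full committee
print(lis_id("SSAP", "22"))  # Energy and Water Development
print(lis_id("HSAG", "15"))  # Conservation, Energy, and Forestry
```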

Thoughts?

Historical Committee assignments

Great work here, such an excellent source. I was curious about the possibility of keeping historical committee assignments for legislators from their previous terms. As I understand only current committee assignments are housed here. Anyone have thoughts on this?

Wikipedia page names

How do you guys feel about adding a new ID type called wikipedia which would have the Wikipedia page name for the person?

I've written a scraper. Wikipedians are really good about applying CongBio and CongLinks templates to Members of Congress which pretty reliably identifies the bioguide ID.

They've also filled in VoteSmart and other IDs that we're missing. (I will import these anyway since importing these doesn't depend on adding a Wikipedia ID field.)

Combine social with current-legislators?

Is there a reason to keep the social info separate from current legislators? I'd like to expose it and it's easy when it's just part of the entire data set.

If not, I can download it separately as well.

Vice presidents

Since vice presidents are president of the Senate (and thus can be the tie-breaker in votes), and since many legislators have become vice president (and a few vice presidents have become president), it would probably be good to list them somewhere.

A good testcase for whether everything is going smoothly is Andrew Johnson: He was a Representative from Tennessee, a Senator from Tennessee, a vice president (a Democrat elected on a coalition ticket with the Republican Abraham Lincoln, under the banner of the National Union Party), a president (having succeeded after the death of Lincoln), and then later a Senator from Tennessee again.

Committee assignments?

Anyone know if the House is working on posting them and if the Senate XML is up to date?

Incorrect Name Info for Rep. Debbie Wasserman Schultz

Hopefully this is the correct venue to report this bug, but I noticed that the following information in legislators-current.yaml is incorrect:

name:
  first: Debbie
  middle: Wasserman
  last: Wasserman Schultz
  official_full: Debbie Wasserman Schultz

As you can see, her last name should be "Schultz" and should not include her middle name as "Wasserman Schultz".

Could someone please correct this for us, or is this deemed too low a priority?

Split chamber from role type

It was proposed in #18 that delegates (and resident commissioners) be distinguished from representatives because their terms and/or voting privileges are different. It was also noted that it is possible that the Senate may also gain delegates in the future, and we should be prepared to handle them.

So we're looking at the following:

chamber: H
type: rep

chamber: H
type: del

chamber: H
type: (res)com

chamber: S
type: sen

chamber: S
type: del

This remains somewhat orthogonal to the issue of titles, but it gets us part of the way there. (I'd also considered using "house" instead of "chamber", but that might be confusing.)
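For what it's worth, the five combinations above are easy to validate in code. A sketch, assuming the resident commissioner value ends up as "com" (the exact spelling is still undecided above); `VALID_ROLES` and `is_valid_role` are hypothetical names.

```python
# The (chamber, type) pairs listed above. "com" stands in for "(res)com",
# whose final spelling is an assumption here.
VALID_ROLES = {
    ("H", "rep"),
    ("H", "del"),
    ("H", "com"),
    ("S", "sen"),
    ("S", "del"),
}


def is_valid_role(chamber, role_type):
    """Check that a chamber/type pair is one of the proposed combinations."""
    return (chamber, role_type) in VALID_ROLES


print(is_valid_role("H", "del"))
print(is_valid_role("S", "rep"))
```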

Decide how to handle the Speaker of the House

There was interesting discussion in #26 about how to handle a Speaker of the House and whether that person would have voting rights even if they were not otherwise a member of the House.

It was suggested that we add a boolean "speaker" field to a term to indicate a speaker, but a Speaker term may not always coincide with a Representative term, so I wonder if we should handle it differently.

Label delegates and resident commissioners separately

Delegates and Resident Commissioners are non-voting members of the House of Representatives, which distinguishes them from Representatives. In addition, a Resident Commissioner is distinguished from a Delegate in that the former is elected to a four-year term versus the two-year term of the latter.

If the Senate ever winds up adding delegates, there would then be a further distinction between House Delegates and Senate Delegates.

I am of the opinion that the combination of the "chamber" and "type" fields, as detailed in #26, is enough to distinguish all of these. I don't think that it is necessary to transition away from the "type" field in favor of another field, like "title".

Include govtrack id in committee memberships

Committee memberships have bioguide and thomas IDs associated with them, as does social media. But social media also has govtrack IDs. It would be helpful to me if committee memberships also had govtrack IDs, so that I didn't have to load in the whole legislators file just to get the proper IDs.
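For reference, the join I'm doing today looks roughly like the sketch below. The record shapes (`id.bioguide`, `id.govtrack`, and a `bioguide` key on each membership entry) are my reading of the files, and the ID values are made up.

```python
def govtrack_by_bioguide(legislators):
    """Index govtrack ids by bioguide id, skipping people without one."""
    return {
        person["id"]["bioguide"]: person["id"]["govtrack"]
        for person in legislators
        if "govtrack" in person["id"]
    }


def add_govtrack(members, index):
    """Return copies of membership entries with a govtrack id where known."""
    enriched = []
    for member in members:
        entry = dict(member)
        if member["bioguide"] in index:
            entry["govtrack"] = index[member["bioguide"]]
        enriched.append(entry)
    return enriched


# Made-up ids for illustration only.
people = [
    {"id": {"bioguide": "X000001", "govtrack": 999001}},
    {"id": {"bioguide": "X000002"}},
]
members = [{"bioguide": "X000001", "rank": 1}, {"bioguide": "X000002", "rank": 2}]
print(add_govtrack(members, govtrack_by_bioguide(people)))
```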

Missing Twitter accounts for some senators

I notice that certain senators do not have their Twitter accounts listed in the social media file:

  • @SenatorIsakson – Johnny Isakson
  • @SenGillibrand – Kirsten Gillibrand

I'm sure there are a bunch of others, and on social media outlets, but these are a few I came across directly.

Determine how to associate Congress number with terms

There are actually two separate issues here which may or may not be solved with a single solution:

  • How do we best associate a term with a single Congress (Representatives)?
  • How do we best associate a term with multiple Congresses (Senators, Presidents, Resident Commissioners)?

I mentioned a few possible solutions in #15, some of which were controversial.
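As a point of comparison, the Congress numbers could also be computed from the term dates instead of stored. The sketch below assumes the modern convention that a Congress convenes on January 3 of each odd-numbered year, so it would not be right for earlier Congresses that began on other dates, and it treats the term's end date as exclusive. Both function names are hypothetical.

```python
from datetime import date, timedelta


def congress_on(day):
    """Congress number in session on a date (modern January 3 convention assumed)."""
    year = day.year
    if day < date(year, 1, 3):
        year -= 1  # the first days of January belong to the previous year's Congress
    return (year - 1789) // 2 + 1


def congresses_for_term(start, end):
    """List every Congress a term overlaps; `end` is treated as exclusive."""
    last_day = end - timedelta(days=1)
    return list(range(congress_on(start), congress_on(last_day) + 1))


print(congresses_for_term(date(2013, 1, 3), date(2015, 1, 3)))  # two-year term
print(congresses_for_term(date(2013, 1, 3), date(2019, 1, 3)))  # six-year term
```

A two-year term maps to one Congress and a six-year term to three, which covers both bullets above without changing the file format.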

Specify the reason for a term ending

It has been suggested in numerous places (including #7 and #10) that the dates for terms be tied to something. In most cases, that will be the term of the Congress, but other cases could include death, resignation, party change, etc.

I suggest, along with the previous suggestion of the addition of the Congress number, that the reason for a term ending be specified:

  • C - Normal Congress change
  • R - Resignation
  • D - Death in office
  • P - Party change

Then, every time some piece of data about a term changed (Congress number, party, end date, etc.), a new term would be listed and the old term would get a reason code as to why it changed.

(We could arguably also have a corresponding starting reason, for issues like special elections or appointments.)
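To illustrate what the term-splitting would look like in practice, here is a sketch. The `end_reason` field name, the `split_term` helper, and the sample term are all hypothetical; only the four letter codes come from the list above.

```python
from enum import Enum


class TermEndReason(Enum):
    """The proposed one-letter reason codes."""
    CONGRESS_CHANGE = "C"
    RESIGNATION = "R"
    DEATH = "D"
    PARTY_CHANGE = "P"


def split_term(term, change_date, reason, **changes):
    """Close a term at change_date with a reason code and open a successor.

    Returns (old_term, new_term); the input term is not modified.
    """
    old_term = dict(term, end=change_date, end_reason=reason.value)
    new_term = dict(term, start=change_date, **changes)
    return old_term, new_term


# Made-up term for illustration.
term = {"type": "sen", "state": "XX", "party": "Democrat",
        "start": "2013-01-03", "end": "2019-01-03"}
old, new = split_term(term, "2014-06-01", TermEndReason.PARTY_CHANGE,
                      party="Independent")
print(old)
print(new)
```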

This proposal likely isn't particularly controversial for Representative terms, but it could entail controversial changes for Senator terms.

For handling Senators, we have a few options:

  • Do nothing.
  • Add the change code but not the Congress.
  • Split each Senator term into three, adding the change code and the Congress.
  • Add the change code and add a space-separated list of 3 Congresses.

All of which would be independent of how the Representatives would be handled.

And the executives would be another story (although, by being listed in a separate file, they have the opportunity to be governed by different rules).

Remove votesmart IDs

Barring objections, I'm gonna drop the Votesmart IDs from our system. There's no longer a great way to keep them up to date, since Votesmart's API is now a for-pay product. They're just gonna decay and become useless as it stands.

Who's Using This Database

Let's make a list of projects using this database in production.

GovTrack is now pulling from master!

Legislators are sorted incorrectly

According to the README: "Legislators are listed in order of the start date of their earliest term."

This may be true for historical legislators (I didn't fully check), but it is certainly not true for current legislators. That file lists Sherrod Brown first, even though his first term was in the 1990s, whereas John Dingell, whose first term started in 1955, is listed somewhere in the middle.

The preferable change would be to re-sort the current file, but I suppose changing the README would also fix this issue.
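The re-sort itself should be small. A sketch, assuming each record has a `terms` list with ISO-formatted `start` dates; the sample dates are placeholders consistent with the years mentioned above, not verified values.

```python
def earliest_start(legislator):
    """Start date of a legislator's earliest term (ISO strings sort correctly)."""
    return min(term["start"] for term in legislator["terms"])


def sort_by_first_term(legislators):
    """Order legislators as the README describes: by earliest term start."""
    return sorted(legislators, key=earliest_start)


# Placeholder dates for illustration.
current = [
    {"name": "Sherrod Brown", "terms": [{"start": "2007-01-04"}, {"start": "1993-01-05"}]},
    {"name": "John Dingell", "terms": [{"start": "1955-12-13"}]},
]
print([person["name"] for person in sort_by_first_term(current)])
```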

Caucus membership?

Anyone considered adding caucus membership? Would be useful to have which party independents caucus with as well as CBC and CHC membership, etc. Happy to contribute -- any leads appreciated. (I'm not 100 percent clear on the legal status of a caucus or lack thereof)

Remove '113th_Congress' branch

AFAICT, the work done on the '113th_Congress' branch was long ago merged into master. It'd probably be a good idea to delete the branch, if it is indeed obsolete, in order to prevent confusion.

Popolo-compliant version

I've written a draft data standard for people, organizations and the relations between the two. I'm currently coordinating with mySociety regarding its use in PopIt and receiving feedback from the OpenStates team, looking towards adopting a common schema.

Chatting with @konklone, it may be interesting for congress-legislators to adopt the standard - it could be for example via a script that transforms the YAML to a Popolo-compliant version. Popolo currently defines serializations for JSON and RDF, but it would be possible to define a YAML serialization if that's the preferred format.

Anyway, I'm leaving this issue open ended, looking forward to comments/feedback.

Schatz's end date

Right now we're listing Schatz's term end date as Jan 2017 (previously Inouye's term end date).

Schatz serves until the next general election in 2014, as reported by the media. Do you guys think that means he'll leave office some time in 2014, before the end of the 113th Congress? So it would make sense to change the end date to Dec 31, 2014 as at least a little more accurate than Jan 2017?

C-SPAN ids

Something of a niche thing, but C-SPAN assigns unique IDs to members of congress (and others) and uses them to build the RSS feeds it has of recent appearances. For example: http://www.c-spanvideo.org/feeds/recentApp.php?id=1683 (Harry Reid). We have many of these for recent congresses and are in the process of updating for 113th. Any interest in putting them in the files?

Older URLs for current members

Two questions: should we display older urls for current members, particularly for those people who served in the House and then the Senate, like Sherrod Brown? His house website no longer exists. I can see the case for maintaining older urls for current members in the same chamber, but not as much for current senators who previously served in the House. This isn't a huge thing, obviously.

Parties are messy

Much of the party information is flawed or straight-up inaccurate. In addition, it doesn't seem like there has been any attempt to standardize the names—a few names differ only in capitalization (e.g. "Pro-administration" vs. "Pro-Administration").

Here's the full list of parties used:
[
'AL',
'Adams',
'Adams Democrat',
'American',
'American Labor',
'Anti Jackson',
'Anti Jacksonian',
'Anti Mason',
'Anti Masonic',
'Anti-Administration',
'Anti-Jacksonian',
'Anti-Lecompton Democrat',
'Anti-administration',
'Coalitionist',
'Conservative',
'Conservative Republican',
'Constitutional Unionist',
'Crawford Republican',
'Democrat',
'Democrat Farmer Labor',
'Democrat-Liberal',
'Democrat-turned-Republican',
'Democrat/Independent',
'Democrat/Republican',
'Democratic',
'Democratic - Republican',
'Democratic Republican',
'Democratic and Union Labor',
'Democratic-Republican',
'Farmer-Labor',
'Federalist',
'Free Silver',
'Free Soil',
'Ind. Democrat',
'Ind. Republican',
'Ind. Republican-Democrat',
'Ind. Whig',
'Independent',
'Independent Democrat',
'Independent/Republican',
'Jackson',
'Jackson Republican',
'Jacksonian',
'Jacksonian Republican',
'Law and Order',
'Liberal',
'Liberal Republican',
'Liberty',
'National Greenbacker',
'New Progressive',
'Nonpartisan',
'Nullifier',
'Popular Democrat',
'Populist',
'Pro-Administration',
'Pro-administration',
'Progressive',
'Progressive Republican',
'Prohibitionist',
'Readjuster',
'Readjuster Democrat',
'Republican',
'Republican-Conservative',
'Silver',
'Silver Republican',
'Socialist',
'States Rights',
'Unconditional Unionist',
'Union',
'Union Democrat',
'Union Labor',
'Unionist',
'Unknown',
'Whig',
'no party'
]

(Note: Legislators are said to be in the "Democrat" party, while executives are in the "Democratic" party; the latter is the appropriate one.)

I would recommend consolidating some of these, and perhaps having a separate file that maps names to abbreviations, which should be distinct.
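As a starting point for the consolidation, names that differ only in capitalization, hyphenation, or spacing can be found mechanically. A sketch, with `party_key` and `near_duplicates` as hypothetical helpers; it will not catch pairs like "Democrat" vs. "Democratic" or "Anti Jackson" vs. "Anti Jacksonian", which need a hand-written mapping.

```python
import re
from collections import defaultdict


def party_key(name):
    """Collapse case and runs of punctuation or whitespace into a comparison key."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()


def near_duplicates(names):
    """Return groups of names that share the same normalized key."""
    groups = defaultdict(list)
    for name in names:
        groups[party_key(name)].append(name)
    return [sorted(group) for group in groups.values() if len(group) > 1]


# A few entries taken from the list above.
sample = [
    "Pro-Administration",
    "Pro-administration",
    "Democratic - Republican",
    "Democratic Republican",
    "Democratic-Republican",
    "Whig",
]
for group in near_duplicates(sample):
    print(group)
```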

In addition, I think that changing parties mid-term should be shown with two terms, but I suppose that's debatable. (Another option is to only list the party at the time of election.) As it stands, when a candidate changes parties mid-term, they get a party like "Democrat/Independent" or similar.

This page has abbreviations for some of the more prominent parties (perhaps just the ones in the Senate), but I don't think it covers all the parties used here:
http://www.senate.gov/artandhistory/history/common/generic/Key_Party_Abbreviations.htm
