
chompy's Introduction

DoSomething.org (Legacy Website) 🔥

This was the Drupal 7 app which powered DoSomething.org and our international affiliates from 2013-2018. It was gradually replaced by Phoenix "Next", which first launched in 2017. This application was officially retired in January 2019.

License

©2019 DoSomething.org. DoSomething is free software, and may be redistributed under the terms specified in the LICENSE file. The name and logo for DoSomething.org are trademarks of Do Something, Inc and may not be used without permission.

chompy's People

Contributors

aaronschachter, chloealee, deguzmankevin, dependabot[bot], dfurnes, mendelb, sbsmith86


chompy's Issues

unit tests

Each import should have a set of unit tests tied to it so we can ensure that import jobs are working correctly.

TurboVote Imports: Missing data in quasar

Problem

There is an issue where signups that are created by chompy are not being sent to quasar. This is causing the registered to vote count to drop by about 40%.

There is a pivotal tracker issue with more detail from the data team here: https://dosomething.slack.com/archives/CA0N5FXND/p1529441737000557

And a slack thread about possible solutions here: https://dosomething.slack.com/archives/CA0N5FXND/p1529441737000557

Cause

When chompy processes records, it simply sends a POST /api/v3/posts request to Rogue to create the voter-reg post. Rogue has logic on its end where, before it creates a post, it checks whether there is a signup that the post should be associated with. If that signup exists, Rogue uses it. If it doesn't, Rogue creates a new signup and ties the post to that signup.

When we create the signup on the fly, we do not move the signup through the business logic of sending to customer.io or quasar, which is why the data team is noticing they have posts, but not signups.

Solution

There are two parts to fixing this.

  1. We need to run the SendToQuasar command in rogue to send over anything that was created since chompy launched so that data gets all the missing signups.

  2. We need to update the logic on the rogue side to send signups created on the fly to quasar when they are made so that we do not need to backfill going forward.
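As a rough illustration of part 2, here is a minimal in-memory sketch of "find or create" where a signup created on the fly goes through the same downstream step as any other new signup. The `SignupStore` class, its field names, and the `sent_to_quasar` list are hypothetical stand-ins, not Rogue's actual code.

```python
from dataclasses import dataclass, field


@dataclass
class SignupStore:
    """Hypothetical in-memory stand-in for the signup table."""
    signups: dict = field(default_factory=dict)
    sent_to_quasar: list = field(default_factory=list)

    def find_or_create(self, northstar_id, campaign_id):
        key = (northstar_id, campaign_id)
        if key in self.signups:
            # An existing signup is reused and not sent again.
            return self.signups[key]
        signup = {
            "id": len(self.signups) + 1,
            "northstar_id": northstar_id,
            "campaign_id": campaign_id,
        }
        self.signups[key] = signup
        # The fix: a signup created on the fly is also passed downstream.
        self.sent_to_quasar.append(signup["id"])
        return signup

    def create_post(self, northstar_id, campaign_id, post_type):
        signup = self.find_or_create(northstar_id, campaign_id)
        return {"signup_id": signup["id"], "type": post_type}
```

With this shape, two posts for the same user and campaign share one signup, and that signup is sent downstream exactly once.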

CSVs made in Excel error out

When you create a csv in Excel, save it, and then try to upload the file in chompy we get a bunch of errors that basically say it is not a valid CSV.

@katiecrane mentioned she has seen this before, and that you have to clean up some of the extra sauce that Excel puts in a CSV file.

From katie:

To clean up those weird excel csvs, this has worked for me: perl -p -i -e "s/\r/\n/g" file.csv
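The same cleanup could be done in Python. This is a sketch rather than a drop-in replacement for the Perl one-liner: it assumes the problem is carriage-return line endings, and unlike the Perl command it collapses a `\r\n` pair into a single `\n` instead of two. The function names are made up for illustration.

```python
def normalize_newlines(data: bytes) -> bytes:
    # Handle \r\n first so a pair becomes one \n, then any bare \r.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def clean_csv(path: str) -> None:
    # Rewrite the file in place with Unix line endings.
    with open(path, "rb") as handle:
        data = handle.read()
    with open(path, "wb") as handle:
        handle.write(normalize_newlines(data))
```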

Importing Historic Share Data

Background

We have been providing a way for users to take action on our campaigns by sharing content. When we launched these new actions, we tracked the data of who did what in our analytics platform.

These have proven to be successful action types, so we want to export the data we have collected into our campaign activity service. That way they can be tracked as actions throughout our systems, and we can serve this activity via the API for other apps to use.

Problem

The data that we receive from the analytics tracking we have been doing is not in the same schema as post data. We want to figure out how to store these share "posts" in our campaign activity service.

Solution

Record Transformation Rules

Analytic Data Model

{
    "event.name": "completed_share_facebook",
    "page.path": "/us/campaigns/give-spit-about-cancer/action",
    "type" : "facebook",
    "action" : "quiz-share",
    "user.northstarId" : "59ba9xxxxxxxxe564f",
    "keen.created_at" : "timestamp"
}

The data model above needs to be transformed into a post in our campaign activity service.

Note: When importing these into our campaign activity service from a CSV we will ignore any records w/o account IDs.

Campaign Activity Post

{
    "id" : 3,
    "signup_id" : 1234,
    "campaign_id" : 34567,
    "northstar_id" : "59ba9xxxxxxxxe564f",
    "type" : "share-facebook",
    "action" : "quiz-share",
    "quantity" : null,
    "url" : null,
    "text" : null,
    "status" : "accepted",
    "source" : "importer-client",
    "source_detail" : { "original-source" : "phoenix-next"},
    "created_at" : keen.created_at,
    "updated_at" : "timestamp",
    "deleted_at" : null
}
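To make the mapping between the two models concrete, here is a minimal sketch of the transformation. It assumes the `share-` type prefix and the automatic `accepted` status (both still open questions), and it takes `campaign_id` and `signup_id` as arguments because the analytics record does not carry them. The function name is hypothetical.

```python
def transform_share_event(event, campaign_id, signup_id=None):
    """Map one analytics record to a post payload, or None if it should be skipped."""
    northstar_id = event.get("user.northstarId")
    if not northstar_id:
        # Records without an account ID are ignored.
        return None
    return {
        "signup_id": signup_id,
        "campaign_id": campaign_id,
        "northstar_id": northstar_id,
        "type": "share-" + event["type"],  # assumed prefix convention
        "action": event["action"],
        "quantity": None,
        "url": None,
        "text": None,
        "status": "accepted",  # assumed default
        "source": "importer-client",
        "source_detail": {"original-source": "phoenix-next"},
        "created_at": event["keen.created_at"],
    }
```

Fields such as `id`, `updated_at`, and `deleted_at` are left out here on the assumption that the campaign activity service sets them itself.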

Questions (5/31/18)

  • For post status, I think we should automatically accept these. But if there are other event.name values that could be parsed, we could do something more sophisticated, such as setting the post status to accepted only if the event.name includes "completed".

  • I think we are going to need a campaign_id column to associate signups/posts to. Is that possible?

  • I think type should be prefixed with share- and then the platform it was shared on. Like share-facebook, share-twitter. Thoughts?

Add a default status

For the TurboVote import, add a default status we can fall back to if the status we are provided doesn't translate to one of the 5 we determined.
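One way to express the fallback is a lookup with a default. This is only a sketch: the translation table is passed in because the actual TurboVote values are not listed here, and using `uncertain` as the default is an assumption, not a decision recorded in this issue.

```python
DEFAULT_STATUS = "uncertain"  # assumed fallback, not confirmed by the team


def translate_status(raw_status, translations, default=DEFAULT_STATUS):
    """Look up a provided status, falling back to a default when it is unknown."""
    key = (raw_status or "").strip().lower()
    return translations.get(key, default)
```

Any status that is missing, empty, or not in the table then resolves to the default instead of raising an error.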

RTV Imports

Context

For our voter registration efforts, we've been using two platforms -- TurboVote (TV) and Rock the Vote (RTV). We've already documented and done the work to import the TV file and now we need to do the same with the RTV file.

Here's what the referral code looks like in the RTV file (NS ID hidden) -- a major difference from the TV referral code is that it sometimes also includes iframe?r=:

[Screenshot: referral code column in the RTV file]

Similar to the TV file, the referral code:

  • Might be populated
  • Might have NS ID
  • Might have a campaign ID, but no Run, vice versa, or have both
  • Might be a duplicate if a user returns again

We want to import this CSV into Rogue as signups and posts so that these users can be served experiences based on their registration status, they can be messaged, and we can accurately report this information in Looker (and do deeper data analysis when needed). 


Problem

RTV records and Rogue posts do not have the same schema! We want to figure out how to store RTV records in Rogue so that other DoSomething Apps can get this information via the Rogue API and be able to serve this data to internal teams and serve customized user experiences based on a user’s registration status.

Solution

Use Chompy, our importer app!

[Screenshot]

Record cleaning 💅

There are some records that might show up in the RTV record that we want to ignore. The rules around that data cleaning are outlined below:

  • If an email includes thing.org in the address, ignore it.
  • If an email includes @dosome in the address, ignore it.
  • If a last name includes Baloney, ignore it.
  • If an email includes test, ignore it.
  • If an email includes rockthevote.com, ignore it.
  • If an email includes +, ignore it.
  • If an email includes Ashley or Luke's personal email address (not posting here for privacy), ignore it.
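The rules above could be sketched as a single predicate. The column names `email` and `last_name`, the case-insensitive matching, and the `personal_emails` parameter (standing in for the addresses not posted here) are all assumptions.

```python
IGNORED_EMAIL_FRAGMENTS = ("thing.org", "@dosome", "test", "rockthevote.com", "+")


def should_ignore(record, personal_emails=()):
    """Return True when a row matches any of the cleaning rules."""
    email = (record.get("email") or "").lower()
    last_name = (record.get("last_name") or "").lower()
    if any(fragment in email for fragment in IGNORED_EMAIL_FRAGMENTS):
        return True
    if "baloney" in last_name:
        return True
    # Addresses kept out of the issue for privacy are supplied by the caller.
    return any(address.lower() in email for address in personal_emails)
```

Because the rules are substring checks, an address such as `contest@example.com` is also dropped by the `test` rule, which may or may not be intended.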

Status Translation Rules

Ultimately, there are 4 post statuses we want to capture for voter-reg posts for Rock the Vote (Note RTV doesn't have a confirmed status like TurboVote did):

  • register-form - User completed the registration form
  • register-OVR - User completed the registration form on their state's Online Voter Registration platform
  • ineligible - User is ineligible to register for whatever reason
  • uncertain - We cannot be certain about this user's registration status

This is the logic for how to determine what a post status should be.

  • If status is complete and finish with state is no --> register-form
  • If status is complete and finish with state is yes --> register-OVR
  • If status is any of the steps --> uncertain
  • If status is rejected --> ineligible
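The translation rules can be sketched as a small function. The exact strings in the RTV file are assumed here (`complete`, `rejected`, `yes`, `no`, compared case-insensitively), and treating a complete row with a missing "finish with state" value as `uncertain` is also an assumption.

```python
def translate_rtv_status(status, finish_with_state):
    """Translate the RTV status columns into a Rogue post status."""
    status = (status or "").strip().lower()
    finish = (finish_with_state or "").strip().lower()
    if status == "complete":
        if finish == "yes":
            return "register-OVR"
        if finish == "no":
            return "register-form"
        return "uncertain"  # assumed: complete, but the state column is unusable
    if status == "rejected":
        return "ineligible"
    # Any of the intermediate steps.
    return "uncertain"
```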

Because we're pulling some of the columns into the post details, data will then be able to know if they, for example, pre-registered or why their registration was ineligible.

Status Hierarchy

When an RTV CSV has multiple records per user, we use the following hierarchy to determine which status should be reported on the Rogue post. If, when importing, there is an existing status for the campaign and user from any previous import (TV or RTV), follow the hierarchy.

  1. register-form
  2. register-OVR
  3. confirmed
  4. ineligible
  5. uncertain

For example: If a user has a confirmed status already from a TV import, and the RTV file suggests that it should be uncertain, do not update.
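A minimal sketch of applying the hierarchy, assuming statuses are compared by their position in the list above and that a missing existing status always yields the incoming one. The names are illustrative only.

```python
STATUS_HIERARCHY = ["register-form", "register-OVR", "confirmed", "ineligible", "uncertain"]
RANK = {status: position for position, status in enumerate(STATUS_HIERARCHY)}


def resolve_status(existing, incoming):
    """Keep whichever status ranks higher (earlier) in the hierarchy."""
    if existing is None:
        return incoming
    return existing if RANK[existing] <= RANK[incoming] else incoming
```

A status outside the five listed raises a `KeyError` in this sketch, so it would need to be translated before it gets here.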

We’ve established this hierarchy because each time a user interacts with the RTV form, a new row is created in the CSV. There are edge cases where a user who is chased to finish their registration interacts with the same row (thus the "steps"). The hierarchy is the simplest approach to dealing with varying statuses, but we anticipate some edge cases that we will need to deal with as they come up.

Here’s one example:

  • User A completes the RTV form --> register-form status
  • User A, for whatever reason, starts the RTV form again and drops off --> uncertain status

In this case, we would want to count the form completion (register-form). It’s important to note that the hierarchy is for internal reporting and doesn’t prevent the user from interacting with the RTV form if they want to do so.

Dealing with Non-Member Registrants

If the referral column doesn't have a NS ID, we should do what we do with the TV import.

  1. Try to match to a member on number
  2. Try to match to a member on email

If those don't work, then create a NS account for them with the relevant information (First name, last name, contact information) like we do with TurboVote. For the sms_status we should populate it for the time being with what's in the partner SMS opt-in column.
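The matching order above can be sketched as follows. The `members` list stands in for a Northstar lookup, and the field names (`mobile`, `email`, `sms_opt_in`, and so on) are hypothetical. The only behaviour taken from the text is the order: number first, then email, then create.

```python
def resolve_member(record, members):
    """Return (member, created) for a row that has no NS ID in its referral."""
    for key in ("mobile", "email"):  # number first, then email
        value = record.get(key)
        if not value:
            continue
        for member in members:
            if member.get(key) == value:
                return member, False
    # No match on either field: create an account from the row.
    member = {
        "id": f"new-{len(members) + 1}",
        "first_name": record.get("first_name"),
        "last_name": record.get("last_name"),
        "mobile": record.get("mobile"),
        "email": record.get("email"),
        "sms_status": record.get("sms_opt_in"),  # partner SMS opt-in column, for now
    }
    members.append(member)
    return member, True
```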

Note: Referral links will come in this format: https://vote.dosomething.org/member-drive?userId=NSID&r=user:NSID,campaign:####,campaignRunID=####,referral=true
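A referral in this format could be parsed with the standard library, as in the sketch below. It assumes the interesting values live in the `r` query parameter as comma-separated pairs, and that a pair may use either `:` or `=` as its separator, as in the sample link. The function name is hypothetical.

```python
import re
from urllib.parse import parse_qs, urlsplit


def parse_referral(referral):
    """Extract key/value pairs from the r= parameter of a referral code."""
    params = parse_qs(urlsplit(referral).query)
    fields = {}
    for part in params.get("r", [""])[0].split(","):
        # A pair looks like "user:NSID" or "campaignRunID=1234".
        pieces = re.split(r"[:=]", part, maxsplit=1)
        if len(pieces) == 2 and pieces[0]:
            fields[pieces[0]] = pieces[1]
    return fields
```

An empty or unparseable referral yields an empty dictionary, which is the case where the importer would fall back to matching on number or email.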

How to count these as impact

Based on the above statuses, some should be counted as a reportback (RB) and some should not. This determination was made by the executive team and allows us to report internally on progress towards the organization's reportback goal. Here's what counts as a reportback from the RTV export:

  • register-form
  • register-OVR

Note: register-form and register-OVR are the only statuses that count as registrations.

Rogue will NOT store this information, but will return a derived value in the JSON response when the voter registration post is created or read that holds this information. The logic to determine this is as follows:

// 'confirmed' can only come from a TurboVote import; RTV has no confirmed status.
if (in_array($rogueStatus, ['confirmed', 'register-form', 'register-OVR'], true)) {
    $reportbackStatus = 'T';
} else {
    $reportbackStatus = 'F';
}

Other Important Information

  • The details on the post will have the row of information available for data to use.
  • The submission_created_at date is when the importer ran. Details about when the registration was created/updated are in the source_details.
  • All of these signups will have a source of importer-client (this is how messaging is suppressed in C.io)
  • All of these posts have a type = voter-reg
  • The month that the registration came in is what informs the action column (e.g., february-2018-rockthevote)
  • For now, if a campaign is not available in the referral, we will attribute every voter-reg post to the Grab the Mic campaign to maintain a 1:1 relationship. This is what we have done with TurboVote.
  • Quasar will be pushed these posts from Rogue but if Data needs to do a deeper analysis, they will use the raw RTV CSV to do that work.
  • If a user shares their UTM'ed URL with other people, there could be duplicate referral codes but associated with different registrants:
    See a screenshot of what this data looks like (note: the user depicted in this spreadsheet is fake.)
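The action naming convention from the list above could be built like this. The sketch assumes the month comes from the registration date in the row, and spells out month names explicitly so the result does not depend on the system locale. The function name and the `partner` default are illustrative.

```python
from datetime import date

MONTHS = (
    "january", "february", "march", "april", "may", "june",
    "july", "august", "september", "october", "november", "december",
)


def action_for(registration_date, partner="rockthevote"):
    """Build an action name such as february-2018-rockthevote."""
    month = MONTHS[registration_date.month - 1]
    return f"{month}-{registration_date.year}-{partner}"
```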

Next Steps

  1. Create this logic for RTV
  2. Start importing!
  3. Move post voter reg status onto NS profile
  4. Coordinate w/ Marketing for email (weeklydo only) and sms suppression (no status filled)

Open Questions

  1. How can we distinguish between TV and RTV import? Do we need to?
  2. RTV has more birthdays in it, can we use that on the NS profile when we create accounts? (TV didn't provide this)
  3. We've been having some Chompy validation issues w/ the TV import -- is this a good time to tackle those things? If so, is a running list the best way of communicating some of the things we've seen in the referral column? There's one specific referral that's different and it's just ads -- these are from Google ads, and this was the only way they could set up tracking that Google likes. Would love to talk through the best way to deal with these...if it's something unique!

CSV Column Validation

When a user uploads a csv file to be processed, they also select the "type" of csv they are uploading. If the user uploads a TurboVote CSV, but triggers a Rock The Vote import, then errors will occur.

Let's add an extra level of defense that validates the CSV (i.e. making sure it has the right headers) depending on the type of import that is selected.
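A header check along these lines might look like the sketch below. The import type keys and the required column names are placeholders, since the real TurboVote and Rock the Vote headers are not listed in this issue.

```python
import csv
import io

# Hypothetical required columns per import type.
REQUIRED_HEADERS = {
    "turbovote": {"referral-code", "voter-registration-status"},
    "rock-the-vote": {"Tracking Source", "Status", "Finish with State"},
}


def missing_headers(csv_text, import_type, required=REQUIRED_HEADERS):
    """Return the required columns that are absent from the CSV's first row."""
    reader = csv.reader(io.StringIO(csv_text))
    headers = {header.strip() for header in next(reader, [])}
    return sorted(required[import_type] - headers)
```

An empty result means the file looks like the selected type; a non-empty one can be shown to the user before any rows are processed.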
