propublica / facebook-political-ads

Monitoring Facebook Political Ads

License: MIT License

HTML 34.95% JavaScript 33.84% CSS 1.47% Rust 15.50% Lua 0.93% Python 3.38% PLpgSQL 1.10% Shell 0.14% SCSS 8.69%
rust javascript redux preact hyper diesel facebook extension

facebook-political-ads's Introduction

Facebook Political Ad Collector

This repository is no longer maintained and exists for archival purposes only.

For a more recent project on this topic, see Ad Observer.

See archival README information.

This is the source code behind our project to collect political ads on Facebook. You can browse the American ads we've collected at ProPublica, and the Australian ads over on the Guardian's website.

We're asking our readers to use this extension when they are browsing Facebook. While they are on Facebook, a background script runs to collect the ads they see. The extension shows those ads to users and asks them to decide whether or not a particular ad is political. On the server side, we use those ratings to train a naive Bayes classifier that then automatically rates the other ads we've collected. The extension also asks the server for the most recent ads that the classifier thinks are political so that users can see political ads they haven't seen. We're careful to protect our users' privacy by not sending identifying information to our backend server.
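To make the rate-then-classify loop concrete, here is a minimal multinomial naive Bayes sketch. It is not the project's classifier: the tokenizer, the label names and the example ratings are assumptions made purely for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.class_counts = Counter()            # rated ads per label
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.vocab = set()

    def train(self, rated_ads):
        """rated_ads: iterable of (ad_text, label) pairs from user ratings."""
        for text, label in rated_ads:
            self.class_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocab.add(word)

    def log_scores(self, text):
        """Return the unnormalized log posterior for every known label."""
        total = sum(self.class_counts.values())
        scores = {}
        for label, count in self.class_counts.items():
            score = math.log(count / total)  # log prior
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in tokenize(text):
                if word in self.vocab:  # skip words never seen in training
                    score += math.log((self.word_counts[label][word] + 1) / denom)
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Hypothetical user ratings; real training data comes from the extension.
ratings = [
    ("Vote for our candidate on election day", "political"),
    ("Tell congress to pass the bill", "political"),
    ("Huge sale on running shoes this weekend", "not_political"),
    ("Order pizza online and save", "not_political"),
]
model = NaiveBayes()
model.train(ratings)
print(model.predict("Vote to pass the bill"))  # prints "political"
```

Unrated ads would be passed through `predict` in the same way, so a handful of reader ratings can label a much larger collection.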

We're open sourcing this project because we'd love your help. Collecting these ads is challenging, and the more eyes on the problem the better.

Run it on your own

You can find instructions on how to set up your own full-fledged version of the Facebook Political Ad Collector in INSTALLATION.md

There is an explanation of all the moving parts in ARCHITECTURE.md

Stories

Where We Need Your Help

In general, the project needs more tests. We've written a couple of tests for parsing the Facebook timeline in the extension directory, and a few for the tricky bits in the server, but any help here would be great!

Also, the Rust backend needs a bit of love and care, and there is a bit of a mess in backend/server/src/server.rs that could use cleaning up.

Types of ads the collector doesn't collect

  • mobile ads
  • pre-roll, midstream video ads
  • video from ads in the stream
  • Instagram-only ads (Note that many ads are set to run on Facebook and Instagram with the same creative)

TODOs to Consider:

  • consider triggering the ad parsing routine only on scroll, to mitigate the clicking-off problems.
  • consider retaining utm params (i.e. a whitelisted set of link parameters that are added by advertisers, not by FB, and are therefore not personally identifiable, e.g. utm_content), since those sometimes include useful metadata about the ad.
  • consider turning off the panelist_ads table, etc.
  • consider seeding the partisanship model in new languages with political tweets.
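The utm whitelist idea from the list above could look roughly like the following sketch. The exact set of allowed parameters is an assumption (the TODO only names utm_content), and `keep_whitelisted_params` is a hypothetical helper, not a function from this repository.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical whitelist: the TODO only mentions utm_content explicitly.
ALLOWED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "utm_content", "utm_term"}


def keep_whitelisted_params(url, allowed=ALLOWED_PARAMS):
    """Drop every query parameter that is not in the whitelist."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key in allowed
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(keep_whitelisted_params(
    "https://example.org/donate?utm_content=video_a&fbclid=abc123&utm_source=fb"
))
# prints https://example.org/donate?utm_content=video_a&utm_source=fb
```

Filtering by an allow-list rather than a block-list means any new tracking parameter is dropped by default.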

facebook-political-ads's People

Contributors

alecglassford, allyjlevine, bsmithgall, dreisman, drjerry, fdebijl, jeremybmerrill, ohadcn, pdehaan, peterk, rileynwong, samatt, thejefflarson, tomcardoso, tpreusse, varner

facebook-political-ads's Issues

impressions incorrectly set to 0 sometimes

The impression count is sometimes an undercount, perhaps when the vote is submitted before the ad itself. There are some ads with zero impressions but >=1 vote (none with zero impressions and zero votes).

Sidebar ads not detected when FB UI language is Swedish

Looking at the code, it seems that sidebar ad collection was implemented, but I do not see sidebar ads in the classifier when using Facebook in Swedish. When I switch to English, sidebar ads are detected and collected properly.

Don't put Facebook first in the title

Consider running this issue past your lawyers:

Based on past behavior of social media platforms, I imagine that Facebook's legal department would object less to "Political Ad Collector for Facebook" than to "Facebook Political Ad Collector". This would help make it clear from line one that Facebook does not explicitly endorse this extension and help preserve a nominative use defense. Look into why Fluff Busting Purity, for example, is no longer called "Facebook Purity".

Implement pagination on the dashboard.

Right now there's no way to page through the ads. We have a way to do that by appending page=<page> to the /ads route, but it needs to be hooked up to the UI.
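As a rough illustration of what the UI would need, the sketch below builds previous/next links from the page=<page> parameter. The helper names and the choice of 1 as the first page are assumptions; the issue does not say whether pages start at 0 or 1, and the total page count is not modeled.

```python
from urllib.parse import urlencode

FIRST_PAGE = 1  # assumption: the issue does not say whether pages start at 0 or 1


def ads_url(page, base="/ads"):
    """Build the /ads URL for one page via the page=<page> query parameter."""
    return f"{base}?{urlencode({'page': page})}"


def pager_links(current_page):
    """Return the previous/next links a dashboard pager could render."""
    prev_page = current_page - 1 if current_page > FIRST_PAGE else None
    return {
        "prev": ads_url(prev_page) if prev_page is not None else None,
        "next": ads_url(current_page + 1),
    }


print(pager_links(3))
# prints {'prev': '/ads?page=2', 'next': '/ads?page=4'}
```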

Facebook language settings influencing detection?

Using the plugin, it seems that detection depends on the language setting the user has in Facebook. Trying it out in Turkish, Pashto and Swedish detects no ads. Given that a substantial number of users in Germany may have a language setting other than English or German, there is a risk of ads going undetected.

Detangle parser.js

The functions in parser.js should not be tied together in the way that they are. parser() should tie the various promises together so we can build a test harness like the one proposed over here.

Jest complaining about missing key

When I run the jest tests I get a warning about a missing key in groupedattrs:

  console.error node_modules/react/node_modules/fbjs/lib/warning.js:33
  Warning: Each child in an array or iterator should have a unique "key" prop.
  Check the top-level render call using <tbody>. See https://fb.me/react-warning-keys for more information.
      in tr

@jeremybmerrill wanna take a look?

You have been added to awesome-humane-tech

This is just a FYI issue to notify that you were added to the curated awesome-humane-tech in the 'Awareness' category, and - if you like that - are now entitled to wear our badge:

Awesome Humane Tech

By adding this to the README:

[![Awesome Humane Tech](https://raw.githubusercontent.com/humanetech-community/awesome-humane-tech/main/humane-tech-badge.svg?sanitize=true)](https://github.com/humanetech-community/awesome-humane-tech)

https://github.com/humanetech-community/awesome-humane-tech

Facebook posts' context menus spontaneously open (and steal my arrow-key-focus)

STR:

  1. Be logged in to facebook.
  2. Install Facebook political ad collector
  3. Visit https://www.facebook.com/
  4. Wait ~15 seconds.

ACTUAL RESULTS:
After waiting 10-15 seconds, the facebook "context menu" spontaneously opens, from the top right of the topmost post in your feed.

(Annoyingly, this happens even if you were scrolling with the arrow keys: the context menu "steals focus" and snaps you back up to the first post, and the menu becomes the thing that the arrow keys are scrolling. So all of a sudden, you're snapped to a different scroll position and your arrow keys are scrolling through the context menu instead of through your feed.)

I'm hitting this 100% of the time in Firefox release (63) and also Chrome ("dev channel" version 72), on Ubuntu 18.10 linux on a 64-bit Lenovo laptop.

I'm using extension version 1.8.1 in Firefox, and 1.8.0 in Chrome (latest available from their respective addon sites). I was testing in a fresh (created today) Firefox profile and a basically-fresh Chrome profile -- no other extensions.

Note: I tried a hard-refresh based on a suggestion in another issue, but it doesn't help - the issue still happens.

MutationObserver steals focus from Facebook messenger?

Needs investigation. With extension running and an active FB messenger chat window open, the main page will steal focus from the chat window.

Minor nit, but if it is actually caused by the extension it is a fairly annoying usability issue. Turning off the extension seems to have fixed the issue for me...

To reproduce:
Type message into FB messenger chat popup at bottom of FB page. After ~10 seconds of typing, focus will no longer be in the chat window and you have to click to place the cursor in the chat window to keep typing.

Notes for model

Naively sampling from the whole collection of posts will distort the classifier, depending on how well-represented 'political' is.

For now: Check for imbalance in samples, and artificially bolster to equalize class sizes (though that should be validated).
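One simple way to "artificially bolster" the smaller class is random oversampling with replacement, sketched below. This is an illustration under assumptions (samples as `(text, label)` pairs, a fixed seed), not the project's sampling code, and as the note says the effect on the classifier should still be validated.

```python
import random
from collections import Counter, defaultdict


def oversample(samples, seed=0):
    """Resample smaller classes with replacement until every class
    is as large as the biggest one.

    samples: list of (text, label) pairs.
    """
    if not samples:
        return []
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in samples:
        by_label[label].append((text, label))
    target = max(len(items) for items in by_label.values())
    balanced = []
    for items in by_label.values():
        balanced.extend(items)
        # Draw the missing examples at random from the same class.
        balanced.extend(rng.choices(items, k=target - len(items)))
    rng.shuffle(balanced)
    return balanced


# Hypothetical imbalanced sample: 1 political ad, 4 non-political ads.
data = [("vote now", "political")] + [(f"sale {i}", "not_political") for i in range(4)]
print(Counter(label for _, label in oversample(data)))
```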

reverse image search via perceptual hashing

it would be really neat to be able to search for ads that use an image (e.g. a news photo or a meme) OR group ads that use the same image.

The way to do this would likely be to -- at some point -- record each image's perceptual hash, then at search time, compare the given image to all images in the DB, returning those that are similar. dhash and phash are both perceptual hashing algorithms, but there may be other good choices.

For instance, hopefully, we could determine which ads use the Distracted Boyfriend meme like this one.
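A minimal sketch of the dhash variant of this idea follows. It assumes each image has already been converted to grayscale and resized to 8 rows by 9 columns (a step normally done with an imaging library and omitted here); the `max_distance` threshold and the helper names are assumptions as well.

```python
def dhash(gray, hash_size=8):
    """Difference hash of a grayscale image given as rows of pixel values.

    Assumes the image was already resized to hash_size rows
    by hash_size + 1 columns.
    """
    if len(gray) != hash_size or any(len(row) != hash_size + 1 for row in gray):
        raise ValueError("expected a hash_size x (hash_size + 1) grid")
    bits = 0
    for row in gray:
        # One bit per adjacent pixel pair: 1 if brightness drops left to right.
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def similar_images(query_hash, stored_hashes, max_distance=10):
    """Return ids of stored images within max_distance bits of the query."""
    return [
        image_id
        for image_id, stored in stored_hashes.items()
        if hamming(query_hash, stored) <= max_distance
    ]


# Synthetic 8x9 "images": a gradient, a brighter copy, and a mirrored copy.
original = [[9 - col for col in range(9)] for _ in range(8)]
brighter = [[value + 10 for value in row] for row in original]
mirrored = [list(reversed(row)) for row in original]

stored = {"brighter": dhash(brighter), "mirrored": dhash(mirrored)}
print(similar_images(dhash(original), stored))  # prints ['brighter']
```

Because the hash only records whether brightness rises or falls between neighbors, a uniformly brightened copy hashes identically, while a mirrored image differs in every bit. Comparing against every stored hash is linear in the number of images, which matches the "compare to all images in the DB" approach described in the issue.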

Ads I'm Seeing empty despite seeing ads

The issue title kinda says it all. I've installed the plugin and disabled my ad blocker for Facebook. I have Facebook open in a container tab to isolate it, but when I check "Ads I'm Seeing" in the plugin, there is nothing in the list despite my having seen 4 or more ads on Facebook already. Any ideas?

Scrolling not working in extension popup

When on facebook.com and with the extension popup open, scrolling doesn't work in the popup. If I go to a different tab and open up the popup then I can scroll in it.

US seed data?

Given the timeliness of fb political advertising, any interest in having the seed of an American classifier ready to go soon?

(sorry I've been MIA, the LSAT's next week!)

Classification Option: «I don't know»

A user here in Switzerland reported back that it would be great to have an «I don't know» option when classifying ads.

This especially makes sense in a Swiss context because you can easily end up seeing ads in another national language that you might not understand.

This could be implemented as a local-only feature that just dismisses the ad from the classifying queue. On the other hand, this could be counterproductive if many users use it, so one might want to monitor that (by reporting to the server).

What do you think? Does it make sense in other contexts too? Spanish ads in the US? I'd be happy to work on the client side.

state page combining targeting and candidates

In the admin, there should be a page that shows a user all the ads that are EITHER targeted to that state (as shown in the targets column, with either Region: Georgia or State: Georgia, for instance) OR the ads whose candidates (join defined here) are in that state.

So the Georgia page would show ads targeted to Georgia residents AND ads from, say, Casey Cagle and Stacey Abrams.
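The EITHER/OR selection could be sketched as below. The data shapes are assumptions: targets are modeled as a list of `{"target", "segment"}` dictionaries, and the candidate join (whose real definition is linked from the issue) is approximated by matching the advertiser name. The ad records are made up.

```python
def ads_for_state(ads, candidates, state):
    """Ads targeted to `state` OR run by a candidate from `state`."""
    in_state_candidates = {c["name"] for c in candidates if c["state"] == state}
    selected = []
    for ad in ads:
        targeted = any(
            t["target"] in ("Region", "State") and t["segment"] == state
            for t in ad.get("targets", [])
        )
        # Hypothetical stand-in for the real candidate join.
        by_candidate = ad.get("advertiser") in in_state_candidates
        if targeted or by_candidate:
            selected.append(ad)
    return selected


candidates = [
    {"name": "Casey Cagle", "state": "Georgia"},
    {"name": "Stacey Abrams", "state": "Georgia"},
]
ads = [
    {"id": 1, "advertiser": "Stacey Abrams",
     "targets": [{"target": "Age", "segment": "18 and older"}]},
    {"id": 2, "advertiser": "Some PAC",
     "targets": [{"target": "State", "segment": "Georgia"}]},
    {"id": 3, "advertiser": "Some PAC",
     "targets": [{"target": "Region", "segment": "Ohio"}]},
]
print([ad["id"] for ad in ads_for_state(ads, candidates, "Georgia")])  # prints [1, 2]
```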

new searches should reset pagination

Right now, if you page over to page 3 and then do a search, you'll remain on page 3 of your search results. Expected outcome is that you'd be back on page 1 of your new results.
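The expected behavior amounts to one rule: changing the search term always resets the page. A minimal state sketch follows, with hypothetical names (`AdsQuery`, `set_search`) and the assumption that the first page is numbered 1, as the wording above suggests.

```python
from dataclasses import dataclass, replace

FIRST_PAGE = 1  # assumption taken from the issue's wording ("page 1")


@dataclass(frozen=True)
class AdsQuery:
    search: str = ""
    page: int = FIRST_PAGE


def set_page(query, page):
    """Move to another page of the current search."""
    return replace(query, page=page)


def set_search(query, search):
    """A new search term always starts again from the first page."""
    return replace(query, search=search, page=FIRST_PAGE)


query = set_page(AdsQuery(search="health care"), 3)
query = set_search(query, "taxes")
print(query)  # prints AdsQuery(search='taxes', page=1)
```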

new column, parse out "Paid for by" string from ads

US ads have a new-ish string in them with a "Paid for by _______" disclosure. We should add a new column (using a migration in fbpac-api-public) and then begin parsing the string out into the column.

We'll have to do this once for all the already-recorded ads, then also either (a) on the fly when ads are received, in the Rust app here or (b) in the classifier Python script that's run hourly on a cron. I have no particular preference.
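Whichever place ends up doing the parsing, the extraction itself could be as small as the sketch below. The pattern is an assumption about how the disclosure appears in the stored ad markup (the name is taken to end at the next tag or line break) and would need checking against real ads.

```python
import re

# Assumption: the disclosure reads "Paid for by <name>" and the name ends
# at the next HTML tag or line break.
PAID_FOR_BY = re.compile(r"Paid for by\s+([^<\n]+)", re.IGNORECASE)


def extract_paid_for_by(ad_text):
    """Return the disclosed payer, or None if the ad has no disclosure."""
    match = PAID_FOR_BY.search(ad_text)
    if match is None:
        return None
    return match.group(1).strip()


print(extract_paid_for_by("<p>Vote!</p><span>Paid for by Friends of Example</span>"))
# prints Friends of Example
```

The same function could serve both the one-off backfill over already-recorded ads and the ongoing parse of new ads.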

save_ads() expects json to be a sequence of ad objects, but currently extension sends single ad object

(gonna leave these as issues for now, not sure if this is your preferred way of tracking design issues/progress)

As per title, save_ads tries to make a Vec of a single object. This fails.

Design choice:
(A) Extension returns a JSON sequence (even if it has just a single object)
(B) Server only expects one ad at a time.

I think A is preferable. If in the future the extension bundles several posts together for processing, the server won't need to be changed.
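A third, tolerant variation is for the receiving side to accept both shapes and normalize them to a list, which keeps older extension versions working while option A rolls out. The real server is written in Rust; the Python sketch below only illustrates the logic, and `parse_ads_payload` is a hypothetical name.

```python
import json


def parse_ads_payload(body):
    """Accept either a single ad object or a sequence of ad objects
    and always return a list of ad objects."""
    data = json.loads(body)
    if isinstance(data, dict):
        return [data]  # single ad: wrap it
    if isinstance(data, list) and all(isinstance(item, dict) for item in data):
        return data
    raise ValueError("expected an ad object or a list of ad objects")


print(len(parse_ads_payload('{"id": "1"}')))                 # prints 1
print(len(parse_ads_payload('[{"id": "1"}, {"id": "2"}]')))  # prints 2
```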

Collecting ads in a new country

I would like to make a campaign for collecting political ads in Israel (I have been testing the Firefox and Chrome extensions myself for the past few weeks). How would you go about doing so?
Is there a way to gain access to the raw data collected by the original extensions? Will it be better to fork, build and run on a local environment?

Much obliged

How is the add-on detecting country location?

I’m in Canada, but it detects my language/location as English/USA. Not sure if it isn’t detecting Canada or if my laptop is defaulting to some en-US language and that’s what propublica is using for detection.

Classifier data tracking in git leads to large git history

The initial clone is 2.4G, due to the tracking of updated models in the same repo as the source code.

Would it be possible to host the models in another repository or S3, and fetch the most recent ones when appropriate (during the build of a release, or at runtime)? I'm not sure yet which is more appropriate for the project.

classify ads by whether they're persuasive, mobilization, listbuilding or fundraising

political ads can have many different purposes, including

  • listbuilding: finding potential supporters and getting their contact info, so you can claim them as supporters and also so you can ask them for money
  • fundraising: asking people -- probably people who you already know are your supporters or else people you think are reasonably likely to support you -- for money
  • mobilization: asking people -- probably people who you already know are your supporters or else people you think are reasonably likely to support you -- to do stuff, like vote early or volunteer
  • persuasion: communicating to people -- who are not your supporters but who probably aren't your opponent's supporters either -- about specifically-chosen issues/messages to persuade them to vote for you (or at least to not vote for your opponent)

(I realize this is a somewhat simplified ontology. Ideas on how to come up with -- and operationalize -- a different ontology are totally welcome.)

It'd be amazing to come up with a machine learning model that could come up with a decent guess as to which category a given political ad falls into. You might be able to figure this out just from the text of the ad. (In a perfect world, we could also extract interesting features from the ad images/video, but that's out of scope.)

I can talk endlessly about this idea. Let me know if you're interested. Reply here or email me at jeremy dot merrill at propublica dot org.

get political ads filtered by language from ProPublica?

Hello everyone,
Is there any API to get ads from ProPublica servers filtered by language? I can filter by region (including regions outside the US), but it always shows me ads in English targeted to English speakers in the region.

This code is supposed to give me results filtered by region:

curl "https://projects.propublica.org/fbpac-api/ads/persona?location_bucket=Israel" 

but the results look like they are intended for non-Israelis interested in Israel, not for Israelis.

I couldn't find how to filter by language or even get results in the local language. Adding an Accept-Language header or a locale, lang or language parameter does not help; I still get English results.

curl "https://projects.propublica.org/fbpac-api/ads/persona?location_bucket=Israel&lang=he-IL" -H "Accept-Language: he"

I've also tried the other API endpoint, which gave me empty arrays:

curl "https://projects.propublica.org/facebook-ads/ads" -H "Accept-Language: he-IL"

Does ProPublica keep this data? Can I get it? If not, can I ask you to keep it? (It's going to be interesting before the Israeli elections in April.)

Want to bring it to Brazil!

Hey Folks!

We at LabHacker in Brazil want to bring this tool to the national elections.
Should we deploy another instance and another plugin? Or is there some way to access and process the data within ProPublica's implementation afterwards? :)

Extension Performance

Readers have complained that they sometimes get a Firefox warning that the plugin is blocking the webpage, with a "stop script" option. I've personally noticed that opening the popup in Chrome takes a few seconds after using it for a while.

Things to look into:

  • large local data set leading to slow initialisation
  • loading too many languages
  • too aggressive facebook.com DOM access

It would be nice to solve this with a TDD approach and to keep track of performance over time.

A report about the issue in German
