fredcallaway / heroku-experiment

Starter kit for running a psiturk experiment on heroku with jspsych.

Home Page: http://salty-meadow-30207.herokuapp.com/

License: MIT License

Makefile 0.10% Python 24.96% HTML 26.30% CSS 4.17% JavaScript 43.96% Procfile 0.03% Shell 0.50%

heroku-experiment's Introduction

Heroku experiment template

A starter pack for running online experiments on Heroku using Psiturk or Prolific.

Setup

Dependencies

Make sure you have all of these installed before continuing:

Installation

Create a new repository using this repository as a template (on github there is a green "Use this template" button at the top of the page). Clone the new repository to your machine and cd into the directory from a terminal.

Create a virtual environment and install the requirements with the following commands. We install pandas separately because we only need it locally (for data preprocessing).

python3 -m venv env
source env/bin/activate   
pip install -r requirements.txt
pip install pandas

You can then see the experiment by running make dev and opening the printed link (cmd-click). By default the experiment will be served at http://localhost:22362. You can change the port number in config.txt (e.g., to allow previewing multiple experiments at once).

Update university- and app-specific information

  • Update the email in experiment.js
  • Put your IRB-approved consent form in templates/consent.html.
  • If using prolific: update the [Prolific] section of config.txt
  • If using mturk: do a search for "bodacious" to find places where you should change info, mostly in ad.html and config.txt

Deploy to Heroku

Make sure you're logged into the correct Heroku account using the Heroku CLI (use heroku auth to see useful commands).

Create a new app and add a Postgres database. Note: these commands must be run from the project directory (the one containing this README.md). You should probably change the name of your app to something less silly.

heroku create dizzydangdoozle --buildpack heroku/python
heroku git:remote -a dizzydangdoozle
heroku addons:create heroku-postgresql

You can confirm that the Heroku site has been created with the heroku domains command, which will print the domain of your shiny new website!

Make some changes and commit them using git. You can then deploy all committed changes with

git push heroku master

This makes heroku build your app, which can take a minute or so. Then your website will be updated.

Developing your experiment

  • The structure of the experiment is defined in static/js/experiment.js
  • You will find some tutorial-like information there and in static/js/instructions.js
  • If you have multiple conditions, use the CONDITION variable. The number of conditions is set in config.txt. You can manually specify the condition while debugging by adding &condition=1 to the URL.
  • Add any additional experiment files and dependencies to templates/exp.html.
  • Run make static to preview your experiment.
  • Edit, refresh, edit, refresh, edit, refresh....

By default, data will not be saved when running locally. If you want to save data while debugging, follow these steps:

  • Run make dev instead of make static
  • Visit http://localhost:22362/test. The port (22362) is configured in config.txt.
  • The fields necessary to store your data will be automatically added to the URL. Take note of or change the workerid (something like debug58523) if you wish. Note that if you use the same id twice, it will overwrite the previous data.
  • The data will be saved to the local participants.db sqlite database.

Downloading data

  • Run bin/fetch_data.py [codeversion]. codeversion is set to the current version (set in config.txt) if you don't specify it.
  • Pass the --local flag if you want to "download" from the local participants.db database.
  • You will find the data in data/raw/[codeversion]/events/. There is one file per participant. It is a json list with one object for every time you called logEvent.
  • Note: it's up to you how you want to handle data representation. Frameworks like jsPsych often batch up all the data for a trial into one object. You can do that if you want; just call logEvent at the end of each trial passing a big object with all the data. I prefer to just put a logEvent any time anything happens and then I worry about formatting it later.
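
As a rough sketch of what reading one of these per-participant files might look like in Python: the field names (`event`, `time`, `correct`), the event names, and the file name are made up for illustration, since the shape of each object is whatever you passed to logEvent. The helper names are hypothetical and not part of this repository.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path


def load_events(path):
    """Read one participant file: a JSON list with one object per logEvent call."""
    with open(path) as f:
        return json.load(f)


def count_event_types(events, key="event"):
    """Tally events by an assumed type field; objects lacking it are grouped under None."""
    return Counter(e.get(key) for e in events)


# Demo on a throwaway file that mimics a file in data/raw/[codeversion]/events/.
# The contents below are invented for illustration only.
demo_events = [
    {"event": "trial.start", "time": 0},
    {"event": "trial.response", "time": 1200, "correct": True},
    {"event": "trial.start", "time": 1500},
]
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "participant.json"
    demo_path.write_text(json.dumps(demo_events))
    events = load_events(demo_path)
    counts = count_event_types(events)

print(counts["trial.start"])  # 2
```

Since pandas is already installed locally, a list of flat event objects like this can also be passed straight to `pandas.DataFrame(events)` for further formatting.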

Posting your study

First, update codeversion in config.txt. This is how the database knows to keep different versions of your study separate. What you do next depends on the recruitment service.

Prolific

For your first pass, you should create the study with Prolific's web interface.

  1. Set the URL to https://<YOUR_APP_DOMAIN>.herokuapp.com/consent?mode=live&workerId={{%PROLIFIC_PID%}}&hitId=prolific&assignmentId={{%SESSION_ID%}}. Make sure to replace <YOUR_APP_DOMAIN> in the link with the current domain, which you can see with the heroku domains command.
  2. Make sure "I'll use URL parameters" is checked.
  3. Select "I'll redirect them using a URL". Copy the code and set it as PROLIFIC_CODE in experiment.js, e.g. const PROLIFIC_CODE = "6A5FDC7A".
  4. As always, do a dry run with Prolific's "preview" mechanism before actually posting the study. I also recommend running only a couple of people on your first go in case there are unforeseen issues.
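
If you want to avoid typos when filling in the study URL from step 1, a small helper like the following can assemble it. This is only a sketch: `prolific_study_url` is a hypothetical name, not something provided by this repository, and the double-brace tokens are Prolific's placeholders, which must be left literal.

```python
def prolific_study_url(app_domain):
    """Build the study URL from step 1 for a given Heroku app domain.

    Only the first string is an f-string, so the {{%...%}} tokens in the
    other pieces are kept exactly as written.
    """
    return (
        f"https://{app_domain}.herokuapp.com/consent"
        "?mode=live"
        "&workerId={{%PROLIFIC_PID%}}"
        "&hitId=prolific"
        "&assignmentId={{%SESSION_ID%}}"
    )


# Example with the app domain used elsewhere in this README.
url = prolific_study_url("dizzydangdoozle-4cd6ae16d401")
print(url)
```

Paste the printed URL into Prolific's study form; Prolific substitutes the participant and session IDs when a participant opens the link.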

We also provide an alpha-release CLI for Prolific, using the Prolific API. Run bin/prolific.py to see the available commands. The most useful ones are

  • approve_and_bonus does what you think it does using the bonus.csv file produced by bin/fetch_data.py
  • post_duplicate posts a copy of your last study (as if you had used Prolific's "duplicate study" feature) with an updated name. You can update the pay and number of places in config.txt. It won't actually post the study without you confirming (after printing a link to preview it on Prolific).

You'll need to install two additional dependencies for this script: pip install markdown fire

MTurk

I haven't used MTurk in a while, so I'm not sure this actually works, but...

Start the psiturk shell with the command psiturk. Run hit create 30 1.50 0.5 to create 30 hits, each of which pays $1.50 and has a 30 minute time limit. You'll get a warning about your server not running. You are using an external server process, so you can press y to bypass the error message.

Downloading data

To download data for a given version run

bin/fetch_data.py <VERSION>

If you don't provide a version, it will use the current one in config.txt.

The raw psiturk data is put in data/raw. This data has identifiers and should not be shared. Make sure not to accidentally put it on github (data is in .gitignore so this shouldn't be a problem). The mapping from the anonymized "wid" to "workerid" is saved in data/raw//identifiers.csv.

Minimally processed (and de-identified) data is written as JSON files in data/processed.

Note: data will not be saved when testing locally. If you want to save data while debugging, you will need to run the experiment on heroku and pass the relevant URL parameters, for example:

https://dizzydangdoozle-4cd6ae16d401.herokuapp.com/exp?mode=live&workerId=debug123&hitId=prolific&assignmentId=debug123

If you don't want to overwrite the previously saved debug data, you have to change the workerId or assignmentId.

Additionally, by default bin/fetch_data.py will not download data with "debug" in the workerId or assignmentId. You can pass the --debug flag to disable this behavior and download all data.
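
The filtering rule described above can be illustrated as follows. This is not the actual code in bin/fetch_data.py, just a minimal sketch of the stated behavior; the function names, the row format (dicts with `workerid` and `assignmentid` keys), and the case-sensitive match are all assumptions.

```python
def is_debug_row(row):
    """True if "debug" appears in either identifier (case-sensitive by assumption)."""
    return "debug" in row["workerid"] or "debug" in row["assignmentid"]


def select_rows(rows, include_debug=False):
    """Drop debug rows unless include_debug is set (mirroring the --debug flag)."""
    if include_debug:
        return list(rows)
    return [r for r in rows if not is_debug_row(r)]


# Invented rows for illustration.
rows = [
    {"workerid": "debug123", "assignmentid": "debug123"},
    {"workerid": "5f2a9c", "assignmentid": "61b0e4"},
]
kept = select_rows(rows)
everything = select_rows(rows, include_debug=True)
print(len(kept), len(everything))  # 1 2
```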

Posting static versions

It is often useful to have a permanent link to different versions of the experiment. This is easy to do if you have your own personal website that you can rsync to. First set the relevant parameters in bin/post_static. Then you can run e.g. bin/post_static v1.

FAQ

Can I check how many participants there are without downloading the full dataset?

Yes. Use e.g. heroku pg:psql -c "select count(*) from participants where codeversion = 'v1'". You can also open an interactive SQL terminal with just heroku pg:psql. Another useful query is select workerid,codeversion,cond,beginhit,endhit from participants order by beginhit desc;
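
For data saved locally, the same count can be sketched with Python's built-in sqlite3 module. The example below runs against an in-memory stand-in table rather than the real participants.db: the table contains only the columns named in the queries above, with assumed types and invented rows, and `count_participants` is a hypothetical helper.

```python
import sqlite3

# Stand-in for the participants table; the real table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table participants "
    "(workerid text, codeversion text, cond integer, beginhit text, endhit text)"
)
conn.executemany(
    "insert into participants values (?, ?, ?, ?, ?)",
    [
        ("w1", "v1", 0, "2024-01-01 10:00", "2024-01-01 10:20"),
        ("w2", "v1", 1, "2024-01-01 11:00", None),
        ("w3", "v2", 0, "2024-01-02 09:00", "2024-01-02 09:15"),
    ],
)


def count_participants(conn, codeversion):
    """Count rows for one code version using a parameterized query."""
    (n,) = conn.execute(
        "select count(*) from participants where codeversion = ?",
        (codeversion,),
    ).fetchone()
    return n


n_v1 = count_participants(conn, "v1")
print(n_v1)  # 2
```

To try this on real local data, replace `":memory:"` with the path to participants.db and skip the create and insert statements.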

Contributors

  • Fred Callaway
  • Carlos Correa

heroku-experiment's Issues

Password authentication failure in fetch_data.py

(From Jonathan)

I’m getting a bizarre error when I go to fetch the data from my experiment (running bin/fetch_data.py version; this has been working completely fine until now).

psycopg2.OperationalError: connection to server at "ec2-44-215-1-253.compute-1.amazonaws.com" (44.215.1.253), port 5432 failed: FATAL:  password authentication failed for user "pecjzmgvmcffme"
connection to server at "ec2-44-215-1-253.compute-1.amazonaws.com" (44.215.1.253), port 5432 failed: FATAL:  no pg_hba.conf entry for host "84.33.153.8", user "pecjzmgvmcffme", database "d1gh6l0jffv33e", no encryption

Building with JsPsych

Can we build an experiment using JsPsych that functions using Psiturk and Heroku with a similar structure?

Remove psiturk.js

Psiturk docs state that the psiturk.js static file is actually generated, not read from the filesystem:

Q: Where is the /static/js/psiturk.js file? It doesn't appear in any of the experiments I have downloaded!
A: psiturk.js doesn’t actually “exists” as a file in the static folder of any project. Instead, the psiturk server/command line tool automatically generates this file. The best way to view it is by “view source” in your browser while debugging your experiment. While somewhat unintuitive, this ensures that changes to psiturk.js are linked to new versions of the overall psiturk command line tool (since they are tightly interdependent).

https://github.com/NYUCCL/psiTurk/blob/c847801e9c02a9aa16a70407e91de5d0d2ac21f3/doc/faq.rst#where-is-the-staticjspsiturkjs-file--it-doesnt-appear-in-any-of-the-experiments-i-have-downloaded

Set an appropriate variable for `threads` in config.

Our configuration currently has threads=1. Given my experiences running the server on Heroku I think we should probably set this to auto. Below is a plot of Dyno Load (see definitions for metrics in plot here. load is defined as "The load value indicates a runnable task (a process or thread) that is either currently running on a CPU or is waiting for a CPU to run on, but otherwise has all the resources it needs to run. The load value does not include tasks that are waiting on IO.")

[Figure: screenshot of the Dyno Load plot, taken 2021-05-06]

On the left, you can see traffic as a result of a small pilot (4 participants) on a Hobby core. On the right, you can see traffic as a result of a larger pilot (160 participants) on 5:Standard-2x, and later 9:Standard-1x cores. Since the 1M load max never exceeds 50% (and in many cases doesn't exceed 33%), I think it's worth increasing the number of threads somewhat substantially to make better use of compute resources. Since threads=auto sets workers to 2 * # CPUS + 1 (code here), this makes it a natural choice for the single and dual core case (which would result in 3 and 5 threads respectively).
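
The arithmetic behind that suggestion, written out as a sketch. `auto_workers` is a hypothetical name used here for illustration; it only restates the 2 * CPUs + 1 rule quoted above and is not psiTurk's actual code.

```python
def auto_workers(n_cpus):
    """Worker count under the 2 * CPUs + 1 rule described for threads=auto."""
    if n_cpus < 1:
        raise ValueError("n_cpus must be at least 1")
    return 2 * n_cpus + 1


for cpus in (1, 2):
    print(cpus, auto_workers(cpus))
# 1 3
# 2 5
```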

Use built-in Heroku support over our custom support.

The complicated startup solution that I wrote last year should actually be superseded by psiTurk's builtin support for Heroku. I think the only change we'd need to add to the README is heroku config:set ON_CLOUD=1 (though b/c we use an older psiTurk version, it may require ON_HEROKU), though I'd want to read the docs and read the script more closely to ensure that's the case. We wouldn't want to directly run those things since they create new files (like Procfile and requirements).
