vida-nyu / reproserver

A web application reproducing ReproZip packages in the cloud.

Home Page: https://server.reprozip.org/

License: BSD 3-Clause "New" or "Revised" License

Python 74.74% Shell 0.65% CSS 0.35% HTML 19.43% Dockerfile 0.68% Starlark 0.94% Smarty 3.22%
reprozip reprounzip docker kubernetes reproducibility reproducible-research linux science nyu hacktoberfest

reproserver's Introduction

ReproServer

An online service to run ReproZip bundles in the cloud. There is no need to install reprounzip on your machine: point ReproServer to an RPZ file or upload one, and interact with the reproduced environment from your browser.

Additionally, ReproServer can capture remote web assets referenced by a web application that has been packaged as an RPZ file. That way, you will always be able to get a consistent reproduction of the RPZ bundle, even if those remote assets disappear.

How to run this with Tilt

Make sure you have checked out the submodule with git submodule init && git submodule update

You will need Tilt, kubectl, and a cluster with a local registry (that you can set up with ctlptl).

For example, create a local cluster with:

minikube start --kubernetes-version=1.22.2 --driver=docker --nodes=1 --container-runtime=docker --ports=8000:30808

Install the ingress controller using:

kubectl apply -f k8s/nginx-ingress.k8s-1.22.yml

Start the application for development using:

tilt up

You can then open http://localhost:8000/ in your browser. Tilt will automatically rebuild images and update Kubernetes as you make changes.

How to run this with docker-compose

You will need Docker and docker-compose.

  • Make sure you have checked out the submodule with git submodule init && git submodule update
  • Copy env.dist to .env (you probably don't need to change the settings)
  • Start services by running docker-compose up -d --build
    • Alternatively, use the development mode (insecure, but displays debug info and autoreloads): docker-compose -f docker-compose.dev.yml up -d --build
  • Open localhost:8000 in your browser

How to stop it: docker-compose down -v

reproserver's People

Contributors

ikreymer, pavanposani, remram44, vickyrampin

reproserver's Issues

Output page: text description

2: Minor Usability Problem
Adding text descriptions like "You can see the log and output files from the research, which you can view or download to your machine. You can also share the link for other people to reproduce your research." would be helpful.

Upload page: Order

2: Minor Usability Problem
I would change the order of the upload page so that users learn how to use it first:

  1. How to use this site
  2. Examples
  3. Select a package to upload

Output page: "Go back" button instead of clicking the file name

2: Minor Usability Problem
Currently, the file name in the heading “Package bechdel.rpz, run s67hd” is clickable, but it is unclear what clicking it will do.
Instead of making it clickable, adding a separate button that says something like "Go back and modify>" would make it clearer that users can go back and change the input.

Usage limits to prevent abuse

Originally opened 2017-05-19 14:44 EDT by @remram44

  • Set CPU/RAM usage limits
    • This can be done via Kubernetes, by setting a limit on the DinD container in the runner pod
    • Or via the DinD daemon, by having the runner set a limit
  • Set runtime limit
    • Have the runner spawn a thread that kills the container after a while
  • Set disk space limit
    • Docker doesn't seem to support this?
  • Careful of GZIP bombs
    • The builder could check the unpacked size
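
The runtime limit and the unpacked-size check above could be sketched as follows. This is only a minimal illustration, not ReproServer's code: it assumes the data arrives as a gzip-compressed stream, and the names check_unpacked_size, start_watchdog and kill_container are hypothetical.

```python
import gzip
import io
import threading


class UnpackedSizeExceeded(Exception):
    """Raised when decompressed data grows past the allowed limit."""


def check_unpacked_size(fileobj, max_bytes, chunk_size=64 * 1024):
    """Decompress a gzip stream chunk by chunk, without keeping the data.

    Returns the decompressed size if it stays within max_bytes, and raises
    UnpackedSizeExceeded as soon as the limit is passed, so a "bomb" is
    never fully expanded.
    """
    total = 0
    with gzip.GzipFile(fileobj=fileobj) as gz:
        while True:
            chunk = gz.read(chunk_size)
            if not chunk:
                return total
            total += len(chunk)
            if total > max_bytes:
                raise UnpackedSizeExceeded(
                    f"unpacked size exceeds {max_bytes} bytes"
                )


def start_watchdog(kill_container, timeout_seconds):
    """Call kill_container() after timeout_seconds unless cancelled.

    kill_container is a hypothetical callback supplied by the runner; call
    .cancel() on the returned timer when the run finishes in time.
    """
    timer = threading.Timer(timeout_seconds, kill_container)
    timer.daemon = True  # do not keep the process alive just for the timer
    timer.start()
    return timer
```

The runner would start the watchdog just before launching the container and cancel it on normal completion; the builder would call check_unpacked_size before unpacking anything to disk.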

Limit data movement during build

Originally opened 2019-09-11 16:59 EDT by @remram44

The server is made slower by the need to move data around.

  • RPZ is received from upload / downloaded from repo
  • uploaded to S3
  • downloaded from S3 to builder container
  • uploaded to Docker-in-Docker in builder pod
  • then image is uploaded to registry

Cache repository APIs

There's no need to query the repositories every single time you visit a /reproduce/<repo>/<path> URL. A very simple caching mechanism would probably go a long way here.
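
A very simple caching mechanism of the kind suggested here might look like the sketch below. It is an assumption-laden illustration: the TTLCache class, the fetch callback and the choice of a time-based expiry are all hypothetical, and nothing here reflects how ReproServer actually queries repositories.

```python
import time


class TTLCache:
    """Keep fetched values for ttl_seconds, then fetch them again."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the cache can be tested
        self._entries = {}   # key -> (expiry_time, value)

    def get_or_fetch(self, key, fetch):
        """Return the cached value for key, calling fetch(key) if needed."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = fetch(key)
        self._entries[key] = (now + self._ttl, value)
        return value
```

A handler for /reproduce/<repo>/<path> could use (repo, path) as the key and pass the function that queries the repository as fetch. Entries are never evicted in this sketch, so a real deployment would also want a size bound.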

Top nav: unclear what page user is on

3: Major Usability Problem
There is no indication at the top navigation bar to let the users know what page they're on. A small dot under the menu items would make it clear.

Web Capture page: Add section

  • Get rid of RPZ file and WACZ file input section and add a section under "Web Capture" instead
    We have the following information about your files:
    ReproZip bundle: filename.rpz (size in italic)
    WACZ file:
    Web Server:
    Current web archive bundle:
    Enter the hostname and port number if needed:

Run the experiment page: adding input examples

2: Minor Usability Problem
It might be confusing for new users because it's unclear what to input. Examples, information overlay, or documentation would be helpful. (In the redesign, I added an information icon that opens a help overlay when clicked and users can read what they should input and what it means.)

Bring your own compute: GCP

Originally opened 2019-11-14 22:29 EST by @remram44

See also #36

GCP supports OAuth2, so it's possible to have a "log in with Google" button and then have ReproServer start a VM on your account.

I have the code to start VMs. Need to make that VM run the bundle now...

Also need to review GCP's terms of service.

Update to Bootstrap 5

The templates were made on the previous version of Bootstrap and should be updated to Bootstrap 5.

Types of accepted files

3: Major Usability Problem
Listing the supported file types would be helpful for new users, on both the upload page and the web capture page.

Run the experiment page: text description

2: Minor Usability Problem
Adding text like "We have finished loading your .rpz file. Now, reproduce the work by clicking the “Run” button." would make it clearer.

Web capture page: text description

3: Major Usability Problem
Suggestion for text description on web capture page:

Select your own .rpz file, or try using one of the examples from the ReproZip-Examples repository, and click the "Upload" button!
From there, you can set parameters and (optionally) upload your own inputs, and then reproduce the work by clicking the "Run" button. You can then see the log and output files from the research, which you can view or download to your machine. You can also share the link for other people to reproduce your research.

Pluggable execution

Originally opened 2017-08-17 14:29 EDT by @remram44

All that's needed for execution is a Docker daemon that can pull from our registry. Currently there is a Docker-in-Docker daemon running in the runner pod, but really that could be something else. It could also be possible to choose one of multiple possible backends depending on RPZ metadata and user selection (both on the 'reproduce' page and via user accounts).

Example execution backends:

  • Another Docker server (give reproserver host + certificate, from docker-machine)
  • AWS (give reproserver IAM credentials and it'll start an EC2 instance to run your experiments on)

It could also be possible to run this without Docker. ReproUnzip supports Vagrant, and Vagrant supports AWS.
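
One way to picture pluggable backends is a small interface plus a selection step, as in the sketch below. Everything in it is hypothetical: the ExecutionBackend interface, the "preferred_backend" metadata key and the RecordingBackend stand-in are assumptions made for illustration, not part of ReproServer or of the RPZ format.

```python
from abc import ABC, abstractmethod


class ExecutionBackend(ABC):
    """Anything that can pull an image from the registry and run it."""

    @abstractmethod
    def run(self, image, command):
        """Run command in image and return an identifier for the run."""


class RecordingBackend(ExecutionBackend):
    """Stand-in backend that only records what it was asked to run."""

    def __init__(self, name):
        self.name = name
        self.runs = []

    def run(self, image, command):
        self.runs.append((image, command))
        return f"{self.name}-{len(self.runs)}"


def choose_backend(backends, rpz_metadata, user_choice=None, default="dind"):
    """Pick a backend by name: user selection first, then RPZ metadata.

    backends maps names to ExecutionBackend instances. The metadata key
    "preferred_backend" is a hypothetical example.
    """
    for name in (user_choice, rpz_metadata.get("preferred_backend"), default):
        if name in backends:
            return backends[name]
    raise LookupError("no usable execution backend")
```

A Docker-in-Docker backend, a remote Docker server and an EC2-based backend would each implement run; the web application would only ever call choose_backend and run.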

Share configured environments

Originally opened 2019-11-01 21:07 EDT by @remram44

It might be useful to share a link to an environment configured differently (e.g. different parameters and input files).

The input files are stored on S3, so they stay accessible.

Alternatively we could make it possible to get the configuration from the results page.

Support for more data repositories - with a shared library?

Originally opened 2019-12-20 07:11 EST by @nuest

In the ReproServer-preprint you mention you want to support more data repositories. 💯 !

Looking at reproserver's code to download data from Zenodo and the one that repo2docker uses in its "contentprovider", I see a lot of similarities!

suppdata is a "DOI to data" package for R, which so far focused on supplemental data for papers' DOIs, but I'd like to extend it towards data repositories.

A background discussion is also here: ropensci-archive/doidata#1

What do you think about a generic "data download from DOI" package in Python that both ReproServer and repo2docker could use?

While uploading: no visibility of system status

3: Major Usability Problem
After clicking the upload button, the browser starts loading but there is no indication of the status on the website. Adding a loading icon or an overlay would be helpful.

Accept non-repository URLs and turn them into an upload

Currently:

  • Upload a file -> get a unique ID from ReproServer backed by the DB (example: /reproduce/r2ptb)
    • This URL doesn't persist if ReproServer data is lost
  • Provide a URL to a supported repository -> use a unique ID backed by the repository (uses the ID on the repository) (example: input https://osf.io/5ke97 -> /reproduce/osf.io/5ke97)
    • This URL
  • Provide a URL to anything else -> get an error "unrecognized URL"

I think if an unrecognized URL is provided, there should be a message explaining that it is not recognized as a persistent URL, but that the file can still be used if it's a direct link to an RPZ. From there the user would be able to click a button to have ReproServer download the RPZ and create a unique ID for it, just as if it had been uploaded.
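
The proposed behaviour amounts to a three-way decision on the submitted URL, sketched below. This is only an illustration under assumptions: the function classify_url and the result labels are hypothetical, and the set of supported repositories lists only osf.io because that is the one example given above.

```python
from urllib.parse import urlparse

# Assumption: only the repository named in the example above is listed.
SUPPORTED_REPOSITORIES = {"osf.io"}


def classify_url(url):
    """Decide how a submitted URL should be handled.

    Returns a (kind, target) pair:
      ("repository", "/reproduce/<host>/<id>") for a supported repository,
      ("offer_download", None) for any other HTTP(S) URL, where the user
          would be asked whether ReproServer should download the file,
      ("invalid", None) for anything that is not an HTTP(S) URL.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return ("invalid", None)
    host = parsed.hostname.lower()
    path = parsed.path.strip("/")
    if host in SUPPORTED_REPOSITORIES and path:
        return ("repository", f"/reproduce/{host}/{path}")
    return ("offer_download", None)
```

With this split, the "unrecognized URL" error is reserved for input that is not a URL at all, and every other link leads to the download-and-assign-ID path.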

JSON API

Originally opened 2017-11-09 21:29 EST by @remram44

Current views:

  • /upload is POST-ing a file, redirects
  • /reproduce when build is not done: has HTML and JSON response
  • /reproduce when build is done: only HTML, JSON should have parameter list
  • /run is POST, gets form data, redirects. Should accept JSON
  • /results: only HTML, should return JSON results

We should separate the JSON endpoints under /api/ or similar. Also, do we need a public API?

Web capture: design changes

  • For hostname and port number, shorten the input boxes so they only extend to the middle of the screen.
  • Put "Hostname:" and "Port number:" on top of the input boxes instead of left
  • Next to "Hostname", put "(optional) Enter the hostname the server is expecting" in italic on top of the input box. Text color should be #757575
  • Under the hostname input box, add "Example: localhost" in italic and #757575
  • Next to "Port number", put "Enter the port number that the web server listens on" in italic and #757575
  • Under the port number input box, add "Example: 3000" in italic and #757575

How to scale ReproServer?

Originally opened 2019-11-14 22:36 EST by @remram44

Not a problem for now, but worth thinking about.

Running multiple web pods wouldn't work because specific ones would know about specific builds/runs.

Changing VM states to not trigger multiple builds might be good but still cause issues. Plus container restart would have all reproservers watching all running pods.

I think the best option is probably to keep one reproserver per cluster, and just run multiple clusters. This is also an opportunity for federation, e.g. there could be "nyu reproserver" and "umass reproserver" and links would point to one specific server and everything would work. It could also be used to have institutional instances providing better quotas to their own employees.

Adding another home page

Suggestion
It is unclear especially to nontechnical users what the whole process is. Making another home page that explains the process, and having an option to go to either web capture or upload would be helpful.

Add web capture description

Could you add the following paragraph instead of lorem ipsum on this page?
[https://staging.server.reprozip.org/web]

After generating a ReproZip bundle (.rpz) file that contains the server-side elements of your app, you can archive the client-side assets using Web Capture.
Upload your .rpz file below and start crawling.

It will look something like this

[Screenshot]
