
core-service's Introduction

Cognoma core-service

This repository, part of Project Cognoma (https://github.com/cognoma), holds the open-source code for a runnable Django REST API, one component of the overall Cognoma system.

Getting started

Make sure to fork this repository on GitHub first.

Prerequisites

  • Docker - tested with 1.12.1
  • Docker Compose - tested with 1.8.0

Starting up the service

docker-compose up

Sometimes the Postgres image takes a while to load on first run and the Django server starts up first. If this happens, just press Ctrl+C and rerun docker-compose up.

The code in this repository is also mounted as a volume in the core-service container. This means you can edit code on your host machine, using your favorite editor, and the Django server will automatically restart to reflect the code changes.

The server should start up at http://localhost:8080/; see the API docs.

Swagger UI

Accessing the root API endpoint (e.g. http://localhost:8080/) will bring up the Swagger UI for viewing the API.

Note: Swagger will only display endpoints that you are authorized to view. To authenticate, go to the top right corner and click Authorize. Where it says api_key, type Bearer <your_random_slug_here> and press enter to authenticate for the rest of the session.
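Outside of Swagger, the same token can be supplied on the command line. For example, assuming the token is passed in the Authorization header (as the Bearer prefix suggests), a request might look like:

curl -H "Authorization: Bearer <your_random_slug_here>" http://localhost:8080/classifiers/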

Running tests locally

Make sure the service is up first using docker-compose up then run:

docker-compose exec core python manage.py test

Loading cancer static data

To load data, again with the service up, run:

docker-compose exec core bash
python manage.py acquiredata
python manage.py loaddata

To verify, run curl http://localhost:8000/diseases/ to get a list of all diseases.

Or, run curl http://localhost:8000/samples?limit=10 to view data for 10 samples.

Deployment

Prerequisites

This project is deployed within the Greene Lab AWS account. To be able to deploy this project you will need to:

  1. Be invited to the account.
  2. Receive an AWS access key and secret key.

Logging Into ECR

This project leverages the AWS EC2 Container Service (ECS). ECS provides a private container registry called the EC2 Container Registry (ECR). To push Docker images to this registry you will first need to get a login with:

aws ecr get-login --region us-east-1

and then run the output of that command. It will look something like:

docker login -u AWS -p <A_GIANT_HASH> -e none https://589864003899.dkr.ecr.us-east-1.amazonaws.com

Building, Tagging, and Pushing the Container

This project uses two containers: one for Nginx and one for the core-service. You will probably be deploying only the core-service unless you have modified config/prod/nginx.conf.

Core Service Container

Run these commands:

docker build --tag cognoma-core-service .
docker tag cognoma-core-service:latest 589864003899.dkr.ecr.us-east-1.amazonaws.com/cognoma-core-service:latest
docker push 589864003899.dkr.ecr.us-east-1.amazonaws.com/cognoma-core-service:latest

Nginx Container

Run these commands:

docker build --tag cognoma-nginx --file config/prod/Dockerfile_nginx .
docker tag cognoma-nginx:latest 589864003899.dkr.ecr.us-east-1.amazonaws.com/cognoma-nginx:latest
docker push 589864003899.dkr.ecr.us-east-1.amazonaws.com/cognoma-nginx:latest

Restarting the ECS Task

Navigate to Cognoma's ECS Tasks Page and select the tasks corresponding to the container you are deploying. The tasks will have a Task Definition like cognoma-core-service:X or cognoma-nginx:X, which can be used to determine which tasks are the correct ones. Once you have selected the correct tasks, click the Stop button. The tasks will be stopped and ECS will restart them with the new version of the container you have pushed. At that point you're done.

Updating the Data

Take a look at api/management/commands/acquiredata.py. It contains two commit hashes: COMMIT_HASH for most of the data and GENES_COMMIT_HASH just for the gene data. Update these hashes to the hash of the data you want to update to, then redeploy the core-service. Once the core-service has been redeployed, SSH onto the EC2 instance and run docker ps to get a list of containers running on that instance. Find the name of the core-service container and run docker exec -it <that_name> /bin/bash. Within that shell run the following commands:

python3 manage.py acquiredata
python3 manage.py loaddata

If these complete successfully then the data has been downloaded and loaded into cognoma's database.

core-service's People

Contributors

abeedvisram, aelkner, amrox, awm33, cgreene, dcgoss, dhimmel, kurtwheeler, ramenhog, stephenshank


core-service's Issues

Endpoint/Models for "samples/examples"

Researchers will select which samples they want to include in the analysis. From a machine learning point of view, we mean which examples are relevant to the researcher. These samples will have various metadata. The GDC Data Portal [ https://gdc-portal.nci.nih.gov/search/s ] has a very nice interface for these metadata. Essentially the facets on the left for "cases" are the same ones that we would expect to be relevant here.

Switching to conda to manage the Python environment

Currently (bee0519), django-python specifies using virtualenv to manage the environment. I'm wondering if it makes sense to switch to conda.

I've been using conda for managing my Python environment for over a year and it's pretty awesome. It does a great job quickly installing advanced scientific libraries.

If people think this makes sense, I'm happy to submit a pull request. We should also choose which version of python we would like to use. I think Python 3.5.1 is the natural choice, but conda makes switching your python version easy.

Finally, does anyone have advice on whether we should use a project wide environment or whether each component repository should specify its own environment?

Add filtering and field selection to /samples

In response to the needs presented in cognoma/cancer-data#29 we need to add the ability for the frontend to get a list of samples related to a given mutated gene.

Example:
/samples?mutations__gene=1234&mutations__status=true&fields=sample_id

Looks like the SearchFilter class supports filtering related models. The dynamic fields library can be used to filter fields. Note: Even though the examples use ModelSerializer, it should work with the regular Serializer used in this repo.
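A rough sketch of what this filtering could look like at the view level; the import paths, serializer, and query parameter handling here are illustrative assumptions, not the repo's actual code:

# Hypothetical sketch of related-model filtering for /samples.
from rest_framework import generics
from api.models import Sample                 # model exists in the api app; import path assumed
from api.serializers import SampleSerializer  # assumed module path

class SampleList(generics.ListAPIView):
    serializer_class = SampleSerializer

    def get_queryset(self):
        queryset = Sample.objects.all()
        gene = self.request.query_params.get('mutations__gene')
        status = self.request.query_params.get('mutations__status')
        if gene is not None:
            # filter through the related mutations model
            queryset = queryset.filter(mutations__gene_id=gene)
        if status is not None:
            queryset = queryset.filter(mutations__status=(status == 'true'))
        return queryset.distinct()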

Update analysis finished email

Update text to read:

Your Cognoma classifier is complete. The results are available as a Jupyter notebook. Please see classifier_120.ipynb.

This classifier predicts mutations in the following genes: XXX, XXX, XXX. Cancers of the following types were included: XX, XX, XX. In total, XX of XX cancers were mutated for at least one of the query genes.

See cognoma/frontend#132 for more details

Consolidate core-service and task-service

Following discussion on cognoma/task-service#17 and with @cgreene, I believe it makes sense to move forward with consolidating core-service and task-service. I am currently wrapping up a PR which does just that, by means of transferring the relevant TaskDef and Task columns from task-service directly onto the Classifier in core-service.

Gene Search

The Cognoma team would like to provide some search functionality to assist the user during gene selection.

Options include:

  • Simple LIKE operation on the field(s) to be searched
  • Postgres full text search
  • Elasticsearch

The Greene Lab has already used Elasticsearch with the django-genes model that we're including in this project. They used Haystack, a Django search library with Elasticsearch support.

There is also a Haystack library for Django REST Framework.

If the search is simple, or doesn't need full-text capabilities, then LIKE is the simplest solution.

Postgres full text search would mean one less moving part. Here is a good blog post on setting it up with Django REST Framework.

Elasticsearch has already been used with this model and provides advanced search capabilities, but would mean running another server. AWS does have a hosted Elasticsearch solution, and setup can be pretty simple. I've also used Elasticsearch for a lot of things, including searching medical records.
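To make the trade-off concrete, here is a minimal sketch of the first two options as Django queries. The field names (symbol, description) and the Gene import path are assumptions about the django-genes model, and query stands for the user's search string:

from genes.models import Gene  # django-genes model; import path assumed

# Option 1: simple LIKE/ILIKE-style matching
Gene.objects.filter(symbol__icontains=query)

# Option 2: Postgres full text search (django.contrib.postgres, available in Django 1.10+)
from django.contrib.postgres.search import SearchVector, SearchQuery
Gene.objects.annotate(
    search=SearchVector('symbol', 'description'),
).filter(search=SearchQuery(query))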

I think we should start by getting some examples.

@cgreene @dhimmel @gwaygenomics Can you provide some example queries? Literally what a user would be typing, along with how you think it would connect to the gene model, like "this person is typing in part of a standard name" or something similar. CC'ing @BobMiller @bdolly

Jobs fail due to memory usage

When fitting models on all disease types, it's common for the job to exceed its memory allotment and fail.

We can increase the instance size as a first step. If that becomes cost prohibitive, we can consider changes to our dask-searchcv configuration. Currently, we use the default cache_cv=True:

Whether to extract each train/test subset at most once in each worker process, or every time that subset is needed. Caching the splits can speedup computation at the cost of increased memory usage per worker process. If True, worst case memory usage is (n_splits + 1) * (X.nbytes + y.nbytes) per worker. If False, worst case memory usage is (n_threads_per_worker + 1) * (X.nbytes + y.nbytes) per worker.

This really speeds things up, so setting cache_cv=False would not be ideal.
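If memory does become the binding constraint, the change described above would look roughly like this. The estimator and parameter grid are placeholders, not the ml-worker's actual configuration:

from sklearn.linear_model import SGDClassifier
from dask_searchcv import GridSearchCV

search = GridSearchCV(
    SGDClassifier(),                 # placeholder estimator, not the actual pipeline
    {'alpha': [1e-4, 1e-3, 1e-2]},   # placeholder parameter grid
    cache_cv=False,                  # default is True; False trades speed for lower per-worker memory
)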

Algorithms Model/Endpoint

Cognoma will need to be able to return a list of the supported algorithms, as well as the characteristics of those algorithms relevant for the algorithm selector.
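A minimal sketch of what such a model and read-only endpoint could look like. The names and the example characteristic field are assumptions, not an agreed design:

from django.db import models
from rest_framework import serializers, viewsets

class Algorithm(models.Model):
    name = models.CharField(max_length=255)
    description = models.TextField(blank=True)
    supports_regularization = models.BooleanField(default=False)  # example characteristic for the selector

class AlgorithmSerializer(serializers.ModelSerializer):
    class Meta:
        model = Algorithm
        fields = ('id', 'name', 'description', 'supports_regularization')

class AlgorithmViewSet(viewsets.ReadOnlyModelViewSet):
    queryset = Algorithm.objects.all()
    serializer_class = AlgorithmSerializer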

docker-compose up error on Windows

I'm using docker for windows on Windows 10. The "hello world" on docker for windows works fine but when I try to run "docker-compose up" from this repo it doesn't seem to work. I posted the error message below. This is likely user error but I figured I'd post this issue in case anyone has seen this before or has any advice. Thanks in advance.

Starting coreservice_core_db_1 ...
Starting coreservice_core_db_1 ... done
Starting coreservice_core_1 ...
Starting coreservice_core_1 ... done
Starting coreservice_nginx_1 ...
Starting coreservice_nginx_1 ... done
Attaching to coreservice_core_db_1, coreservice_core_1, coreservice_nginx_1
core_db_1 | LOG: database system was shut down at 2017-09-01 22:13:40 UTC
core_db_1 | LOG: MultiXact member wraparound protections are now enabled
core_1 | /bin/bash: -: invalid option
core_db_1 | LOG: database system is ready to accept connections
core_1 | Usage: /bin/bash [GNU long option] [option] ...
core_db_1 | LOG: autovacuum launcher started
core_1 | /bin/bash [GNU long option] [option] script-file ...
core_1 | GNU long options:
core_1 | --debug
core_1 | --debugger
core_1 | --dump-po-strings
core_1 | --dump-strings
core_1 | --help
core_1 | --init-file
core_1 | --login
core_1 | --noediting
core_1 | --noprofile
core_1 | --norc
core_1 | --posix
core_1 | --rcfile
core_1 | --restricted
core_1 | --verbose
core_1 | --version
core_1 | Shell options:
core_1 | -ilrsD or -c command or -O shopt_option (invocation only)
core_1 | -abefhkmnptuvxBCHP or -o option
coreservice_core_1 exited with code 2

Automated Testing

We should set up a continuous integration framework to automatically run tests. I am at a conference and study section this week but can look into this next weekend, or if someone else wants to snag this task - feel free to assign to yourself!

Need to use an application server like uwsgi

Hi team,

One thing I noticed is that the backend is using the built-in development server that comes with Django.
It's not suitable for a production environment; you can get better performance by using an application server like uWSGI.
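For reference, a typical uWSGI invocation for a Django project looks something like the following. The WSGI module path here is a guess, not this repo's actual settings:

uwsgi --http :8080 --module cognoma.wsgi --master --processes 4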

Another thing: nginx might not be needed if you use uWSGI directly, since the backend doesn't serve static files like images and CSS. It looks like this app is used only as an API endpoint.

If you need to modify any headers in the returned response, you can do that in a middleware and have all responses include whatever headers you want.

--
Ahmed

Index user.random_slugs

The user.random_slugs field is searched every time a request authenticates a user. Because that operation is run so often, this field should be indexed. The field is a postgres array type, an array of strings. It looks like the best way to index this is using gin.

This currently uses the __contains operation built into Django. Will that operation take advantage of an index? If not, we need to somehow use one that will, even if it means passing raw SQL. Most querysets (all?) expose a query property that can be used to see what SQL was run; passing this to EXPLAIN in Postgres will tell you if the index is being used.
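A sketch of what the GIN index could look like using Django's Postgres extensions. GinIndex requires Django 1.11+; on 1.10 the index would have to be created with a raw SQL migration instead. The field definition is trimmed to the relevant column and its options are assumptions:

from django.contrib.postgres.fields import ArrayField
from django.contrib.postgres.indexes import GinIndex
from django.db import models

class User(models.Model):
    random_slugs = ArrayField(models.CharField(max_length=255), default=list)

    class Meta:
        indexes = [GinIndex(fields=['random_slugs'])]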

Authentication?

Do we need auth? If so should we use sessions, tokens, JWT, or a combo? The type will depend on what platforms cognoma can be accessed from (browser, app, etc).

Development environment requirements

So far...

python 3.5
django 1.10 (needs verification)
postgres 9.5+ (needs verification)

We will finalize and make a md or yml doc for env req/dep

python manage.py loaddata is killed mysteriously while loading mutations

While attempting to solve #56 (comment) @dhimmel and I ran python manage.py loaddata on one of the EC2 instances within the running Docker container. The script got Killed mysteriously. Here is the output of said script:

root@core-service:/code# python manage.py loaddata                  
Loading mutations table...
Processing 1000 rows so far
Processing 2000 rows so far
Processing 3000 rows so far
Processing 4000 rows so far
Processing 5000 rows so far
Processing 6000 rows so far
Processing 7000 rows so far
Bulk loading mutation data...
Killed
root@core-service:/code# $?
bash: 137: command not found

We researched what exit code 137 means: it corresponds to a SIGKILL (128 + signal 9). We cannot determine what would be sending that signal. @dhimmel thinks it may be caused by running out of memory; however, we monitored memory usage during execution and it never exceeded 23%. We tried executing this command multiple times, and in some runs it died before getting as far as it did in the output above.

This is the relevant code block where the command is getting murdered.

We used the API to inspect the number of diseases, samples, and genes and those tables all seem to have been populated successfully.

@awm33 @stephenshank any ideas?

Determine branching pattern.

We should develop a common branching pattern with the front end so that someone working on one repo will be familiar with the patterns on the other.

/genes/ sluggish response

Accessing the /genes/ endpoint takes a long time because the server has to process 100 gene objects, each with all of its related mutations. This can be resolved by reducing pagination page sizes.
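One way to do that would be a smaller default page size on this endpoint, for example via a dedicated pagination class. The class name and limits below are illustrative, not a settled choice:

from rest_framework.pagination import LimitOffsetPagination

class GenePagination(LimitOffsetPagination):
    default_limit = 10   # down from 100
    max_limit = 100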

Checking In

Hey all,

I've been off the grid for a few days now, I promise I will catch up. Due to some trips planned months in advance, I will be unable to attend the Tuesday meeting for the next three weeks. I will be sure to check for the updates and do my best to contribute remotely.

Cheers!
Derek

Rename the repo?

We started referring to this as the "core" or "core api". There was also the "brain" but I think that would be confusing in a ML project :).

Some ideas:

  • cognoma/core
  • cognoma/core-backend
  • cognoma/core-api
  • cognoma/core-service
  • cognoma/api

Maybe others can post some suggestions?

Improve error handling from ml-worker

When ml-worker fails to process a classifier, it hits the fail endpoint and that's that. No information is stored about what type of error it was or what caused it. Furthermore, when a user is emailed about the error, the email just says that it failed, with no reason why.

There should be two fields added to the classifier that store errors provided by ml-worker when classifier processing fails:

  • fail_reason: string of title of error, example: memory_error or processing_error
  • fail_message: string providing explanation of fail_reason, maybe a traceback

This information should be included in failure emails as well.
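A sketch of the two proposed fields on the Classifier model. The field names come from this issue; the types and max lengths are assumptions:

from django.db import models

class Classifier(models.Model):
    # ... existing fields ...
    fail_reason = models.CharField(max_length=255, null=True, blank=True)
    fail_message = models.TextField(null=True, blank=True)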

Gene Selection API needs/issues

Here are the three methods the UI will use for the user to create the desired gene list. Each has its own data/API issues.

Gene Selection - direct selection

The user needs to select genes from a list of 20,000 genes. The user will enter the first few letters of the gene identifier, and a listbox will be populated containing only genes matching those letters.
Here are three possible options for doing this:

  1. UI uses the API to request the full gene list (20,000); the server sends it to the UI. The UI is responsible for filtering the list to display. This approach may tax UI resources and be too slow for a smooth interface.
  2. UI uses the API to request only genes starting with a string of letters; the server returns a filtered list to the UI. This happens multiple times as the user narrows the list, so quick response from the server is critical for smooth performance (see the sketch after this list).
  3. UI uses the API to request the full ordered gene list; the server returns not only the full list but also an index of it. The index contains the starting location of every two-letter combination. The UI then uses the index to quickly filter the full gene list and populate the listbox.
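A minimal sketch of the server side of option 2, as it might appear inside a view. The query parameter name, field name, and Gene import path are assumptions:

from genes.models import Gene  # django-genes model; import path assumed

prefix = request.query_params.get('starts_with', '')
genes = Gene.objects.filter(symbol__istartswith=prefix).order_by('symbol')[:50]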

Gene Selection - selection by path

In selection by path, a listbox is populated with known pathways; the user selects a path, and a second listbox is populated containing only those genes associated with the selected path. The user then selects some or all of the desired genes from this second listbox.

To initialize this screen, an API will be needed to request the path list. The server should return the list of known paths.

Once a user selects a path, an API is needed to request only those genes associated with this path. The server should return a gene list.

Gene Selection - custom query

This is the least defined of the three gene selection methods. In this method, the UI will provide the user with a query-building GUI in which the user will construct a custom gene selection query. An API will submit the custom query/criteria to the server. The server will return a gene list.

Swagger Docs

If we document the API using swagger, generating docs and API clients would be easy.

Internal API Auth

This issue is needed by task-service as well. Basically, we need internal processes, such as workers or the services, to be able to talk to each other and make POST/PUT calls. The current UI only lets users do this, and POST/PUTs to the task-service will not be allowed from the public.

The most basic thing would be to just hardcode something in the environment. But I think digitally signed tokens / JSON Web Tokens using a public/private key pair would be best.

The signed tokens would be created using a python script and access to a private key. The running services would have access to a public key to verify tokens and each would have a token.
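A minimal sketch of that flow using PyJWT with an RSA key pair. PyJWT is one option rather than a settled choice, and the claim names and key variables are placeholders:

import jwt

# Offline script, with access to the private key, issues one token per service:
token = jwt.encode({'service': 'ml-worker'}, private_key_pem, algorithm='RS256')

# Running services hold only the public key and verify incoming tokens:
claims = jwt.decode(token, public_key_pem, algorithms=['RS256'])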

Failed to hit internal service for: post /classifiers/143/upload/

On February 16, I submitted a classifier which failed. The title of the email was "Cognoma Classifier 143 Processing Failure" and the body was:

An error has occurred and your classifier could not be processed.
Error: Failed to hit internal service for: post /classifiers/143/upload/
Support is available at https://github.com/cognoma.

Loading cancer static tables

Ideally some sort of script that can load the data on the local machine and in production.

  • genes - comes from django-genes
  • organisms - comes from django-organisms
  • diseases
  • samples
  • mutations

Status/'y' selector

This is a very brief description that we'll need to flesh out more, particularly with help from @gwaygenomics:

Cognoma will provide the opportunity to construct a supervised machine learning model that predicts a feature of interest. For example, mutations of genes in a specific pathway.

We anticipate that this endpoint will allow the user to specify a set of genes and samples. The endpoint would return the number of samples that contain alterations within that set.
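As a sketch of what that count could look like with the Django ORM; the model and field names are assumptions based on the existing /samples filters, and the selected id lists are placeholders:

from api.models import Sample  # import path assumed

mutated_count = (
    Sample.objects
    .filter(
        id__in=selected_sample_ids,
        mutations__gene_id__in=selected_gene_ids,
        mutations__status=True,
    )
    .distinct()
    .count()
)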

Task creation and expansion

When a classifier is created, we need to queue a task in the task server. When classifier objects are retrieved or listed using the REST API, the task should be available for expansion as a child.

Looking at using this RemoteField library, or potentially rolling my own, since this doesn't look compatible with djangorestframework-expander, which works well. Might be a good opportunity for a mini open source lib of my own. It would really help for using Django REST Framework for microservices.

@dbolly @BobMiller Is the frontend going to keep the classifier in memory before it's ready to go, or persist it before then? If it needs to be persisted before then, we need to track the frontend state with a new field: basically, whether it's ready to be queued. You may also want to track more states for the wizard-like UI, so the user can pick up where they left off.

CORS settings need to be adjusted for the local dev environment

@dcgoss the front-end in a local dev environment runs from http://localhost:3000, so requests to the core-service running locally via Docker at http://localhost:8080 are not from the same origin, and I think that is causing a CORS issue with the POST /classifier request.

Angular sends a pre-flight OPTIONS request that is throwing the error below, preventing the POST request from going through.
[screenshots of the browser console error omitted]
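One likely fix is django-cors-headers, assuming we add it to the project. A settings sketch for the local dev origin, using that package's conventional setting names from this era, would look roughly like:

# In settings.py (setting names per django-cors-headers):
INSTALLED_APPS += ['corsheaders']
MIDDLEWARE = ['corsheaders.middleware.CorsMiddleware'] + MIDDLEWARE  # placed before middleware that can generate responses

CORS_ORIGIN_WHITELIST = (
    'localhost:3000',  # local front-end dev server
)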

[HackNight] 08/09/16 Meeting minutes

Note: This is an attempt to formalize hacknight outcomes and the todo list
Note 2: I assume this project is the (maybe temporary) placeholder for backend development

HackNight reference: http://www.meetup.com/DataPhilly/events/233070705/

The following points were mentioned. Some related issues already exist, while others don't.
All attendees are warmly encouraged to discuss this summary.

  • New architecture proposal (please update the picture)
  • Need for specification: Front-end API (interface between the JavaScript UI and the "backend") Link to issue.
  • Need for specification: Asynchronous Task Queue (ATQ) (interface between the django backend and the ML module(s))
  • Need for specification: Machine Learning API (interface between the ATQ and the Machine Learning module(s))
  • Need for specification: Containers deployment (to start with, a basic microservice breakdown through Docker). We need to draw the rough lines of which software module goes where (i.e. which is an independently deployable unit, with its own, and specified, APIs).
  • Need for diagram: Database layout. The purpose of this diagram is only to map data to modules. At this time we can cope with imprecision about the scale of data.
  • Need for directive: How to contribute to django-cognoma. Link to issue

Next meeting: in 2 weeks [please confirm / update with exact date, time and location]

Notebook upload fails when user has no email

If there is no email associated with the user on a classifier, the notebook upload requests will fail. This is problematic for ml-workers.
The correct behavior should just be to fail silently.

GET from web browser location bar 500

When a user makes a request using the browser location bar or a link (e.g. to view in the browser, not via JS), the request 500s.

It looks like the renderer needs to be explicitly set up.

From @dhimmel

Maybe it could also be smart enough to add newlines when queried by a browser.

If there's a plugin or some easy way to do that, go with it, but I don't know if we want to write a custom Django REST Framework renderer to do it. You would need to detect a browser based on the User-Agent, or on text/html having a higher priority than application/json.

Django REST Framework does support the indent parameter in the Accept header.
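If we go the explicit-setup route, a sketch of the relevant settings would be the following. This enables DRF's browsable renderer alongside JSON; it is one option, not necessarily the fix we end up choosing:

REST_FRAMEWORK = {
    'DEFAULT_RENDERER_CLASSES': (
        'rest_framework.renderers.JSONRenderer',
        'rest_framework.renderers.BrowsableAPIRenderer',
    ),
}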

api.cognoma.org not returning accurate GET responses

@dhimmel @dcgoss @cgreene @awm33
Hey guys, when the front-end makes any GET request to api.cognoma.org, it returns the same response without any data being populated.

EXAMPLE:
GET https://api.cognoma.org/samples?disease=ACC
RESPONSE 200 OK {count: 0, next: null, previous: null, results: []}
It should return the disease data with an array of mutations, and the count should be the number of samples per disease, as per the API docs; this data is used in calculations in the front-end.

This same null response appears for other GET requests, such as
GET https://api.cognoma.org/samples?limit=1&disease=ACC&mutations__gene=770&mutations__gene=767&mutations__gene=4871, which is used to calculate the positives and negatives for mutations in each disease type.

Fail to run migrations from scratch using Django management, but all tests run

I've been running into a few issues this evening trying to run the migrations for this project. When running locally (using Postgres on my laptop) or via Docker, I run into the following issue:

Starting coreservice_core_db_1
Recreating coreservice_core_1
Recreating coreservice_nginx_1
Attaching to coreservice_core_db_1, coreservice_core_1, coreservice_nginx_1
core_1     | + python manage.py migrate -v3 --no-input
core_db_1  | LOG:  database system was shut down at 2017-01-18 01:56:53 UTC
core_db_1  | LOG:  MultiXact member wraparound protections are now enabled
core_db_1  | LOG:  database system is ready to accept connections
core_db_1  | LOG:  autovacuum launcher started
core_1     | Operations to perform:
core_1     |   Apply all migrations: api, contenttypes
core_1     | Running pre-migrate handlers for application contenttypes
core_1     | Running pre-migrate handlers for application rest_framework
core_1     | Running pre-migrate handlers for application api
core_1     | Running pre-migrate handlers for application organisms
core_1     | Running pre-migrate handlers for application genes
core_1     | Running migrations:
core_1     |   Rendering model states... DONE (0.004s)
core_db_1  | ERROR:  relation "genes_gene" does not exist
core_db_1  | STATEMENT:  ALTER TABLE "mutations" ADD CONSTRAINT "mutations_gene_id_6289b708_fk_genes_gene_id" FOREIGN KEY ("gene_id") REFERENCES "genes_gene" ("id") DEFERRABLE INITIALLY DEFERRED
core_1     |   Applying api.0001_initial...Traceback (most recent call last):
core_1     |   File "/usr/local/lib/python3.5/site-packages/django/db/backends/utils.py", line 64, in execute
core_1     |     return self.cursor.execute(sql, params)
core_1     | psycopg2.ProgrammingError: relation "genes_gene" does not exist
core_1     | 
core_1     | 
... <truncated>

When running the tests locally the migrations all run successfully, and all the tables are created:

$ ./manage.py test -v3
Creating test database for alias 'default' ('test_cognoma_core_service')...
Operations to perform:
  Synchronize unmigrated apps: rest_framework, organisms, staticfiles, genes, postgres
  Apply all migrations: contenttypes, api
Running pre-migrate handlers for application contenttypes
Running pre-migrate handlers for application rest_framework
Running pre-migrate handlers for application api
Running pre-migrate handlers for application organisms
Running pre-migrate handlers for application genes
Synchronizing apps without migrations:
  Creating tables...
    Creating table organisms_organism
    Creating table genes_gene
    Creating table genes_crossrefdb
    Creating table genes_crossref
    Running deferred SQL...
Running migrations:
  Rendering model states... DONE (0.011s)
  Applying api.0001_initial... OK (0.356s)
  Applying api.0002_alter_sample_fields... OK (0.048s)
  Applying api.0003_genes_mutations... OK (0.094s)
  Applying contenttypes.0001_initial... OK (0.031s)
  Applying contenttypes.0002_remove_content_type_name... OK (0.033s)
Running post-migrate handlers for application contenttypes
Adding content type 'contenttypes | contenttype'
Running post-migrate handlers for application rest_framework
Running post-migrate handlers for application api
Adding content type 'api | mutation'
Adding content type 'api | user'
Adding content type 'api | disease'
Adding content type 'api | sample'
Adding content type 'api | gene'
Adding content type 'api | classifier'
Running post-migrate handlers for application organisms
Adding content type 'organisms | organism'
Running post-migrate handlers for application genes
Adding content type 'genes | gene'
Adding content type 'genes | crossref'
Adding content type 'genes | crossrefdb'
test_cannot_update_other_user_classifier (api.test.test_classifiers.ClassifierTests) ... ok
... <truncated>

To get around this, and so that I could start playing around with the project this evening, I copied the schema from the test database and set that as my local DB. It's obviously not a proper way of doing things, but it meant that I could hack on the project a bit.

Has anybody else encountered this? I'm sure I'm missing something really obvious with regards to the different settings between running the unit tests and the management commands, but I can't seem to figure out what the issue is. Any help much appreciated.

Front end screen data needs

Here is a list of front-end screens and their associated data needs.

Login Screen

  1. User ID
  2. Password
Note: The user needs to be able to log in anonymously.

Sample Chooser/ Status Chooser Screen

Initial data input

  1. List of tissues
    1a) Tissue description (maybe?)
    1b) Tissue graphic (maybe?)
  2. List of genes
    2a) Gene description (maybe?)

Algorithm Chooser Screen

Initial data input

  1. List of algorithms
    1a) Algorithm description (maybe?)
  2. Logic info about algorithms (TBD)

Job Submit Screen

Will send you all chosen data for the job:

  1. Selected tissues
  2. Selected genes
  3. Selected algorithm
  4. User ID

Expects to receive back job status info for the submitted job(s).

Question

While the user is going from screen to screen selecting data, should we hold all data selections in the browser session and send them to you when the job request is submitted,

or

send you the results of each screen and use the data you have to populate the Job Submit screen?

Upload endpoint to store completed notebook on classifier object

When a new classifier is created, core-service creates a new task, which is then queued for processing by an ml-worker. When the ml-worker finishes processing, it needs to upload the completed notebook back to core-service, directly onto the classifier.

This uploaded notebook should be stored by core-service as a file, eventually in S3.

Steps:

  • adding a notebook_file FileField to Classifier
  • updating the Classifier serializer to handle the new notebook_file field
  • new permissions to ensure only an internal service like ml-worker can upload a notebook
  • tests for the permissions and file uploading
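A sketch of the first two steps; the storage backend (local vs. S3) is left out, and the field options and serializer shape are assumptions:

from django.db import models
from rest_framework import serializers

class Classifier(models.Model):
    # ... existing fields ...
    notebook_file = models.FileField(upload_to='notebooks/', null=True, blank=True)

class ClassifierSerializer(serializers.Serializer):
    # ... existing fields ...
    notebook_file = serializers.FileField(required=False)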
