Coder Social home page Coder Social logo

jonathanscialpi / decentralized_corpus_manager Goto Github PK

View Code? Open in Web Editor NEW
5.0 2.0 3.0 6.64 MB

A CorDapp designed to facilitate the crowd sourcing of data used to build machine learning models.

Kotlin 87.05% JavaScript 7.01% HTML 1.23% Python 4.40% CSS 0.31%

decentralized_corpus_manager's Introduction

Corda

Decentralized Corpus Manager

A CorDapp designed to facilitate the crowd sourcing of data used to build machine learning models.

In my medium article I discuss:

  • What common issues data scientists face when trying to build these models
  • How I solved these issues using Corda
  • Why Corda is the right choice for decentralized machine learning applications

Starting the Nodes

From the /decentralized_corpus_manager/CorDapp/ directory:

  • Create your nodes by running ./gradlew deployNodes.
  • Start your nodes by running the build\nodes\runnodes command.

Starting the Spring Webserver

  • Build the Spring jar file by running ./gradlew clients::bootJar command fromt the /CorDapp directory.
  • Start the server with: java -jar .\clients-0.1.jar --server.port:8080 --config.rpc.host=localhost --config.rpc.port=10006 --config.rpc.username=user1 --config.rpc.password=test from the /CorDapp/clients/build/libs directory

Starting the Flask Server

  • If you don't have Flask (and using a Windows machine), watch this video.
  • Copy the Classification API code to your flask application.
  • Run the activate command from the MyProject/Scripts/ directory to start up your virtual environment.
  • Start the server with flask run

Using Anaconda to start the flask application

  • If you have Anaconda Prompt already installed, you can use anaconda prompt to start the flask application. When you open Anaconda Prompt, you should be in the home directory, which is C:\Users\John
  • Step 1: Create a Virtual environment
    • If you don't have virtualenv installed, run the command pip install virtualenv
    • Run the command virtualenv venv. Note that venv is the name of our virtual environment. Feel free to replace venv with a more suitable name.
    • To activate the virtual environment, navigate to the Scripts folder by typing cd venv\Scripts. After typing that command, type activate.
    • If you activate your virtual environment correctly, you should see something like: (venv) (base) C:\Users\John\venv\Scripts
  • Step 2: Running the Flask Application
    • We need to install flask. Now that you are in the virtual environment through Anaconda, run the command pip install flask
    • To see what version of flask you have, run the command flask --version
    • In the virtual environment, we need to create a directory called demoapp. Run the command mkdir demoapp while in (venv) (base) C:\Users\John\venv\Scripts
    • Open up a code editor. I used Sublime Text. Drag the folder you created in the previous step into the Sublime text window.
    • In Sublime Text, you see on the left hand side an icon called "Folders" with demoapp already there. Right click the folder and select New File.
    • Go to where you have the decentralized_corpus_manager folder in your computer. Click on the folder called flask and open the file called app.py.
    • Copy and paste that code in the new file you created in Sublime Text. Save that file as app.py.
    • Go to Anaconda and run the command flask run. If you execute this command correctly, you will see Running on 127.0.0.1:5000/. That is the localhost on your browser.
    • If you execute flask run and Anaconda wants you to install additional libraries, run pip install <libraryName> for each library you are told. For example, to install scikit learn and numpy, run pip install sklearn and pip install numpy respectively. Feel free to google the pip install command to make sure you install the library correctly.
    • Go to your browser and type localhost:5000 and you should see Hello from Flask>app.py. If you examine the app.py file carefully, you will see app.route() in the documentation. That tells you what to enter into the browser to execute certain functions.

Starting the Node Server

  • Run npm start from the decentralized-corpus-management\react_app directory

Spring Endpoints

  1. issueCorpus: An endpoint to issue a corpus using JSON.
  • @Param: corpus is a LinkedHashMap<String, String> where the Key is the data row and the value is the classification label
  • @Param: algorithmUsed is a String describing the type of algo used to produce the model
  • @Param: classificationURL is a String which represents the Flask endpoint where the classification report can be created from.
  • @Param: participants is the list (Strings) of CordaX500 names for each party included on the TX.
  1. updateCorpus: An endpoint that allows a user to propose a new corpus for the model with the intent to improve it.
  • @Param: proposedCorpus is a LinkedHashMap<String, String> where the Key is the data row and the value is the label.
  • @Param: corpusLinearId is the LinearPointer used to query for the corpus state.
  1. updateClassificationURL: An endpoint for strictly modifying the URL that is used to build the classification report.
  • @Param: newURL is a String representing the new endpoint used to produce the classification report.
  • @Param: corpusLinearId is the LinearPointer used to query for the corpus state.
  1. transferOwnership: Only the owner of a corpus can "close" it or update its classification URL. This endpoint allows an owner to re-assign the ownership of a corpus.
  • @Param: newOwner is the party representing the new owner of the corpus.
  • @Param: corpusLinearId is the LinearPointer used to query for the corpus state.
  1. closeCorpus: An endpoint for the corpus owner to prevent any further changes to a corpus by pointing to the exit state.
  • @Param: corpusLinearId is the LinearPointer used to query for the corpus state.
  1. issueCorpusWithCSV: This endpoint gives the user has the option of using a CSV file delimited by "|" to use as a corpus for corpus state creation.
  • @Param: csvFile is a multipart file that is a "|" delimited utterance (data|label).
  • @Param: algorithmUsed is a String describing the type of algo used to produce the model
  • @Param: classificationURL is a String which represents the Flask endpoint where the classification report can be created from.
  • @Param: participants is the list (Strings) of CordaX500 names for each party included on the TX.
  1. corpusLookup: Retrieve the most recent version of a corpus state.
  • @Param: corpusLinearId is the LinearPointer used to query for the corpus state.

Postman Collection

Import the sample Postman requests

If you don't have Postman installed, go to this link. Follow the download instructions and open Postman.

Once you open Postman, click Import on the top left hand corner and navigate to the folder decentralized_corpus_manager/postman/postman-collection.json. If you do it correctly, you should see 9 requests with two GET requests in green while the rest yellow.

You should be ready for the demo video. We hope you enjoy learning about machine learning with Corda!

Demo Video

IssueCorpus, UpdateCorpus, IssueCorpusWithCSV

decentralized_corpus_manager's People

Contributors

clintonio avatar jonathanscialpi avatar florian-f avatar adagys avatar mikehearn avatar chrisr3 avatar dhrudevalia avatar kasiastreich avatar shamsasari avatar karkir0003 avatar caismanai avatar nimmaj avatar exfalso avatar rick-r3 avatar r3domfox avatar jxilt avatar dependabot[bot] avatar m4ksio avatar davidleeuk avatar agoldvarg avatar andr3ej avatar anthonykeenan avatar cxyzhang0 avatar zkiss avatar cburlinchon avatar peterli-r3 avatar szymonsztuka avatar tb-pq avatar

Stargazers

 avatar Zoe Braiterman avatar Amol Pednekar avatar Miguel Vaz avatar Karthik (Kirito) avatar

Watchers

James Cloos avatar  avatar

decentralized_corpus_manager's Issues

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.