edenciso / qsignals-automl-sagemaker

Predicting daily trading signals with Swarmode datasets + AWS SageMaker Auto-ML

Home Page: https://swarmode.ai

License: GNU General Public License v3.0

Jupyter Notebook 100.00%
machine-learning quantitative-finance trading-algorithms stock-prediction

qsignals-automl-sagemaker's Introduction

QSignals ML daily predictions with AWS SageMaker Auto-ML

Swarmode's QSignals are daily predicted trading signals for US equities. In this experiment we use AWS SageMaker Studio and its Autopilot (Auto-ML) feature to build and run a machine learning model that predicts the number of good daily trading signals in an out-of-sample test dataset.

The experiment has two parts:

  1. Run the SageMaker Autopilot (Auto-ML) job to train the model on the QSignals historical dataset available in AWS Data Exchange, and deploy the model to a SageMaker prediction endpoint.

The ML model is trained on historical QSignals data from 1/2/20 to 6/19/20. A sample of the training dataset is included in the Data Exploration notebook. The target attribute, the column for which SageMaker makes predictions, is 'goodsignal'. Based on the dataset features, we expect Auto-ML to determine the best ML pipeline and algorithms for predicting how many daily QSignals are good to trade. Because the column values are binary, 'y' (good signal) and 'n' (not a good signal), we expect SageMaker to treat this as a Binary Classification problem.
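The job setup described above can be sketched with the `create_auto_ml_job` API from boto3. This is a minimal sketch, not the code used in the experiment (which was run from SageMaker Studio): the job name, S3 locations, and IAM role are hypothetical placeholders, and only the target column name 'goodsignal' comes from the text. `ProblemType` is left out on purpose so that Autopilot infers the problem type from the target column.

```python
def build_automl_job_request(job_name, train_s3_uri, output_s3_uri, role_arn,
                             target_column="goodsignal"):
    """Build keyword arguments for the SageMaker create_auto_ml_job call.

    All argument values are supplied by the caller; the names used in the
    usage example below are hypothetical placeholders.
    """
    return {
        "AutoMLJobName": job_name,
        "InputDataConfig": [
            {
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": train_s3_uri,
                    }
                },
                # Column that Autopilot learns to predict ('y' or 'n').
                "TargetAttributeName": target_column,
            }
        ],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "RoleArn": role_arn,
        # ProblemType is omitted so Autopilot infers it from the target column.
    }


def start_automl_job(request):
    """Submit the request to SageMaker (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so building the request needs no AWS SDK

    return boto3.client("sagemaker").create_auto_ml_job(**request)


request = build_automl_job_request(
    job_name="qsignals-automl-demo",                  # hypothetical
    train_s3_uri="s3://example-bucket/train/",        # hypothetical
    output_s3_uri="s3://example-bucket/output/",      # hypothetical
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",  # hypothetical
)
```

Calling `start_automl_job(request)` would submit the job; it is kept separate so the request can be inspected before anything is sent to AWS.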

After the Auto-ML experiment ran, SageMaker created two Jupyter notebooks:

  • SageMakerAutopilotCandidateDefinitionNotebook: a list of ML pipeline candidates, algorithms, hyperparameter tuning, model selection and deployment.
  • SageMakerAutopilotDataExplorationNotebook: dataset analysis, the random training and validation data split, column analysis, and descriptive statistics.

As expected, after completing the Auto-ML job, SageMaker defined the problem as Binary Classification, with the goal of maximizing the Accuracy quality metric of the trained model. Accuracy is the percentage of predictions for which the model chose the correct class ('y' or 'n') in the 'goodsignal' target column. After analyzing the training dataset, the Auto-ML job builds the pipeline with 10 ML models and two algorithms: XGBoost and Linear Learner. The job takes approximately 30 minutes to complete all three stages: analyzing data, feature engineering, and model tuning. We can then choose the best tuning job and deploy the model to the SageMaker endpoint.

  2. Run the model on out-of-sample data to predict the number of good QSignals.

The test dataset included predicted trading signals for one day. The Python code to upload the dataset, create the SageMaker client and call the SageMaker prediction endpoint is contained in the QStrades_test1 Jupyter notebook.
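Calling a deployed endpoint usually looks something like the sketch below. It is not the code from the QStrades_test1 notebook: the helper names `rows_to_csv` and `predict` are hypothetical, and it assumes the endpoint accepts `text/csv` input without a header or target column and returns one predicted label per line. The client is passed in as a parameter, so the only AWS call involved is `invoke_endpoint` on a boto3 `sagemaker-runtime` client.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize feature rows (no header, no target column) to CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


def predict(runtime_client, endpoint_name, rows):
    """Send rows to a SageMaker endpoint and return one label per row.

    Assumes the endpoint replies with newline-separated labels such as
    'y' or 'n'; adjust the parsing if the endpoint is configured differently.
    """
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=rows_to_csv(rows).encode("utf-8"),
    )
    payload = response["Body"].read().decode("utf-8")
    return [line.strip() for line in payload.splitlines() if line.strip()]
```

With AWS credentials configured, the client would be created with `boto3.client("sagemaker-runtime")` and passed as `runtime_client`, together with the name of the endpoint created in part 1.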

Additionally, two YouTube videos demo the end-to-end experiment.

Demo Part 1: https://youtu.be/Ht1fHJL0qDw

Demo Part 2: https://youtu.be/Ln7JNw_vH4Q

Results

These are the observed out-of-sample test results after calling the model endpoint with a dataset of 2,192 QSignals.

Confusion Matrix

1820 | 0

0 | 372 <- true positives, or number of good Qsignals predicted for the day

Classification Accuracy = 1.0
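The accuracy figure follows directly from the confusion matrix. The short sketch below recomputes it, reading 1820 as true negatives and 372 as true positives with both off-diagonal cells at zero. The function name `classification_metrics` is hypothetical; precision and recall are included only for illustration and are not reported in the original results.

```python
def classification_metrics(tn, fp, fn, tp):
    """Derive basic metrics from the four cells of a binary confusion matrix."""
    total = tn + fp + fn + tp
    return {
        "total": total,
        # Share of all predictions, 'y' and 'n', that were correct.
        "accuracy": (tp + tn) / total,
        # Guard against division by zero when a class is never predicted/present.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


metrics = classification_metrics(tn=1820, fp=0, fn=0, tp=372)
print(metrics["total"])     # 2192
print(metrics["accuracy"])  # 1.0
```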
