
Workshop to demonstrate how to apply NN-based algorithms to stock market data and forecast price movements.

License: MIT No Attribution


amazon-sagemaker-stock-prediction-archived's Introduction

Stock Prediction using Neural Network on Amazon SageMaker

License Summary

This sample code is made available under a modified MIT license. See the LICENSE file.

Introduction

This is a sample workshop that demonstrates how to use a neural network-based algorithm for time series prediction. The workshop uses stock market data maintained by Deutsche Börse and made available through the Registry of Open Data on AWS. This dataset contains stock movement data from over 100 stocks traded on the Frankfurt Stock Exchange and is updated by the minute. Data is available starting from July 2016.

Time series data can be analyzed using a variety of techniques, including a simple multilayer perceptron (MLP), a stacked recurrent neural network (RNN), and forecasting methods such as Autoregressive Integrated Moving Average (ARIMA) or Exponential Smoothing (ETS). As a first attempt, we'll use a simple Recurrent Neural Network (RNN) model to predict the price of a single stock.

Action Plan

Amazon SageMaker is the Machine Learning platform on AWS that provides infrastructure to run hosted Jupyter Notebooks. Amazon SageMaker is integrated with other storage and analytics services on AWS to make the essential data management tasks for a successful Machine Learning project secure, scalable and streamlined.


In this workshop, we'll first use Amazon SageMaker-hosted notebooks to fetch the data from the Deutsche Börse dataset, clean it, and aggregate it in Amazon S3 buckets.

In addition to hosted notebooks, Amazon SageMaker also provides managed training and hosting for machine learning models, using a variety of languages and libraries. Once we've prepared the data and stored it in Amazon S3, we'll use this functionality to containerize the machine learning training and prediction code, publish it on an Amazon Elastic Container Registry (Amazon ECR) repository, and host our custom model behind an Amazon SageMaker endpoint to generate predictions.

Amazon SageMaker also provides several built-in algorithms for image classification, regression, clustering of structured data, time series processing, and natural language processing. In the latter part of this workshop, we'll use DeepAR, which is a supervised learning algorithm for forecasting one-dimensional time series using RNN.

Disclaimer

This workshop is not an exercise in statistical methods, nor does it attempt to build a viable stock prediction model that you can use to make money. However, it does showcase the machine learning techniques that you can use on AWS.

1. Getting started

Since you will execute most of the workshop steps on a Jupyter Notebook hosted on Amazon SageMaker, start by creating a notebook instance on Amazon SageMaker from the AWS Console.

Refer to the AWS Region Table to check the availability of Amazon SageMaker, and choose to create the following infrastructure in any of the regions where it is available.

As of re:Invent-2018, Amazon SageMaker is available in the following regions:

  • us-east-1 (Northern Virginia)
  • us-east-2 (Ohio)
  • us-west-1 (Northern California)
  • us-west-2 (Oregon)
  • ca-central-1 (Canada)
  • eu-west-1 (Ireland)
  • eu-west-2 (London)
  • eu-central-1 (Frankfurt)
  • ap-northeast-1 (Tokyo)
  • ap-northeast-2 (Seoul)
  • ap-southeast-1 (Singapore)
  • ap-southeast-2 (Sydney)
  • ap-south-1 (Mumbai)
  • us-gov-west-1 (AWS GovCloud)

1.1. Lifecycle configuration

Lifecycle configurations are small scripts that you can use to automate certain tasks when a notebook instance is created and/or started. For this workshop, create a startup script to download pre-built notebooks from this GitHub repository onto your notebook instance.

Configure this script to run on Create notebook.

#!/bin/bash
set -e
# Clone the workshop repository and move its contents into the SageMaker
# home directory so the notebooks appear in the Jupyter file browser.
git clone https://github.com/aws-samples/amazon-sagemaker-stock-prediction.git
mkdir SageMaker/fsv309-workshop
mv amazon-sagemaker-stock-prediction/container SageMaker/fsv309-workshop/container/
mv amazon-sagemaker-stock-prediction/notebooks SageMaker/fsv309-workshop/notebooks/
mv amazon-sagemaker-stock-prediction/images SageMaker/fsv309-workshop/images/
rm -rf amazon-sagemaker-stock-prediction
# Make the copied files writable by the notebook user.
sudo chmod -R ugo+w SageMaker/fsv309-workshop/
Step-by-step instructions (expand for details)

  1. In the AWS Management Console choose Services then select Amazon SageMaker under Machine Learning.

  2. Choose Lifecycle configurations under the section Notebook on the left panel.

  3. Choose Create configuration to open the create dialog.

  4. Type the name fsv309-lifecycle-config in the Name field.

  5. In the tab Create notebook, type or copy-paste the Create Notebook script from above.

  6. Finish configuration by clicking Create configuration.

Note: If you don't create a lifecycle configuration or attach the configuration to your notebook instance, you can always run the above commands directly in a Terminal window from within your instance's Jupyter console.
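Alternatively, the same lifecycle configuration can be created programmatically. The following is a minimal sketch (not part of the workshop notebooks) using boto3, assuming the Create notebook script above has been saved locally as on-create.sh.

import base64

import boto3

sm = boto3.client("sagemaker")

# Lifecycle scripts are passed to the API as base64-encoded strings.
with open("on-create.sh", "rb") as f:
    script_b64 = base64.b64encode(f.read()).decode("utf-8")

sm.create_notebook_instance_lifecycle_config(
    NotebookInstanceLifecycleConfigName="fsv309-lifecycle-config",
    OnCreate=[{"Content": script_b64}],  # runs once, when the instance is created
)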

1.2. Notebook instance

  1. Use the lifecycle configuration to create a notebook instance in the region of your choice.

  2. Choose a small instance class, such as ml.t2.medium. Since you won't use this notebook instance to execute training and prediction code, this will be sufficient.

  3. If you do not already have an AWS Identity and Access Management (IAM) role with all the permissions necessary for Amazon SageMaker to operate, create a new role on the fly.

  4. The IAM role you choose to use with the notebook needs to be authorized to create an Amazon ECR repository and upload a container image to the repository. Therefore, add the following permissions to the IAM role that you'll be using for your notebook instance (a scripted alternative is sketched right after this list):

    • ecr:CreateRepository
    • ecr:InitiateLayerUpload
    • ecr:UploadLayerPart
    • ecr:CompleteLayerUpload
    • ecr:PutImage
  5. Optionally, you can choose to place your instance within a VPC and encrypt all data used within the notebook. For the purpose of this workshop, you can proceed without doing either.
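If you prefer to attach these permissions as an inline policy programmatically rather than through the console steps below, a minimal sketch with boto3 follows; the role name is a placeholder, so substitute the execution role attached to your notebook instance.

import json

import boto3

iam = boto3.client("iam")

# Inline policy granting only the ECR actions listed above.
ecr_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ecr:CreateRepository",
                "ecr:InitiateLayerUpload",
                "ecr:UploadLayerPart",
                "ecr:CompleteLayerUpload",
                "ecr:PutImage",
            ],
            "Resource": "*",
        }
    ],
}

iam.put_role_policy(
    RoleName="AmazonSageMaker-ExecutionRole-example",  # placeholder: your notebook's execution role
    PolicyName="ECRUpload",
    PolicyDocument=json.dumps(ecr_policy),
)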

Step-by-step instructions (expand for details)

  1. In the AWS Management Console choose Services then select Amazon SageMaker under Machine Learning.

  2. Choose Notebook instances under the section Notebook on the left panel.

  3. Choose Create notebook instance to open the create dialog.

  4. Type the name fsv309-notebook in the Name field.

  5. From Notebook instance type dropdown, choose ml.t2.medium.

  6. From IAM role dropdown, choose Create a new role.

  7. In the dialog that pops up, select the radio button for Any S3 bucket.

  8. Choose Create Role to return to the notebook creation dialog. Notice that Amazon SageMaker creates a new execution role with the current timestamp appended at the end of its name, and that this role remains selected under IAM role dropdown.

  9. From the Lifecycle configuration dropdown, choose the configuration named fsv309-lifecycle-config that you created in section 1.1.

  10. Immediately below the IAM Role field, you should see a success message in a green message box, with the name of your newly created IAM role displayed as a hyperlink. Click on the hyperlink to open the role in IAM console in a new browser tab.

  11. On the IAM console page that opens in a new browser tab displaying the role summary, choose Add inline policy.

  12. On the Create policy page, click on Choose a service.

  13. In the search box, type "EC2" to filter the list of displayed services, then choose EC2 Container Registry from the narrowed-down list.

  14. Under the Actions section, expand the Write access level.

  15. Select the actions CreateRepository, InitiateLayerUpload, UploadLayerPart, CompleteLayerUpload, and PutImage.

  16. Under the Resources section, click on the text displaying You chose actions that require the policy resource type.

  17. Choose the All resources radio button under the Resources section.

  18. Choose Review policy at the bottom right-hand corner of the screen.

  19. On the review screen, ensure there are no errors or warnings displayed.

  20. Type a name for the policy in the Name field. Choose a meaningful name, such as ECRUpload.

  21. Choose Create policy at the bottom right-hand corner of the screen.

  22. Close the browser tab to return to the previous tab for the Amazon SageMaker console.

  23. Leave the VPC selection and Encryption Keys empty for the purpose of this workshop, and choose Create notebook instance to finish creation.

  24. You'll be returned to the list of notebooks, with the status of the current notebook shown as Pending. Wait until the status changes to InService before proceeding to the next section.

  25. When the status of your notebook shows InService, click the Open Jupyter link under the Actions column to open Jupyter on your instance, and proceed to the following sections of this workshop.

2. Data preparation

The Deutsche Börse Public Data Set consists of trade data aggregated at one-minute intervals. While such high-fidelity data could provide excellent insight and prove to be a valuable tool in quantitative financial analysis, for the scope of this workshop it is more convenient to work with data aggregated at larger intervals, such as hourly and daily.

Moreover, the source dataset is organized into hierarchical S3 bucket prefixes according to date and time, and the data contains some missing days and hours, either due to non-trading windows or due to errors in data collection. In the dbg-data-preparation notebook, you'll download raw data from the source for an interval of your choosing, resample the data at hourly and daily intervals, and upload it to your own S3 bucket.
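At its core, the resampling is a pandas resample over the per-minute records. The sketch below illustrates the idea for a single stock; the input file name and the combined date-time column (CalcDateTime) are illustrative, and the exact code in dbg-data-preparation may differ.

import pandas as pd

# Per-minute trade records for one stock; CalcDateTime stands in for a
# combined date-time column built from the dataset's Date and Time fields.
minute_df = pd.read_csv("minute_trades.csv", parse_dates=["CalcDateTime"])
minute_df = minute_df.set_index("CalcDateTime").sort_index()

aggregation = {
    "StartPrice": "first",   # opening price of the interval
    "MinPrice": "min",       # lowest price in the interval
    "MaxPrice": "max",       # highest price in the interval
    "EndPrice": "last",      # closing price of the interval
    "TradedVolume": "sum",   # total volume traded in the interval
}
hourly = minute_df.resample("H").agg(aggregation)
daily = minute_df.resample("D").agg(aggregation)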

Within this notebook, you'll also find code that you can use to grab the cleaned data directly from an S3 bucket maintained for this workshop. This alternative will save you time because you do not have to execute code to obtain data from the source and cleanse it yourself. In order to use this second option, execute the cells in the notebook from section 2.5 onward.

Whichever way you choose, proceed to obtain the data by executing code in dbg-data-preparation from your Amazon SageMaker notebook instance, and come to the next section of this readme when finished.

3. Data analysis

After preparing the data, we did some preliminary analysis and observed that:

  • The minimum and maximum prices during an interval are possible indicators of the closing price, in that during an upward trend of prices the closing price is closer to the maximum price, whereas during a downward trend it is closer to the minimum price.
  • The minimum and maximum prices during an interval are possible indicators of the opening price, in that during an upward trend of prices the opening price is closer to the minimum price, whereas during a downward trend it is closer to the maximum price.
  • The opening price during an interval is a possible indicator of the closing price, in that during an upward trend of prices the closing price is above the opening price, whereas during a downward trend it is below the opening price.

These insights are useful because, when predicting the closing price of a stock, they suggest that we could use these other metrics as features that have influence on the target metric. We'll use this insight when we build the deep neural network models in the next two sections.

As one would imagine, an individual stock's movement doesn't exist in a vacuum. Oftentimes, companies in related industries, or in similar businesses, follow similar patterns. If we could find similar companies' stocks, we could use those other stocks as exogenous time series while predicting a particular stock as the main time series.

Empirically, we can assume that companies in similar industries, such as the automobile or telecommunication industries, would have some bearing on each other's price movements. To confirm this intuition, you can execute the code in the dbg-stock-clustering notebook to cluster similar stocks using the HDBSCAN algorithm.
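For a flavor of the approach, here is a minimal sketch of clustering stocks by the similarity of their daily returns with HDBSCAN; the input file is hypothetical, standing in for a table of daily closing prices with one column per stock symbol, and the actual notebook may use a different similarity function and preprocessing.

import hdbscan
import numpy as np
import pandas as pd

# Hypothetical input: daily closing prices, one column per stock symbol.
closing = pd.read_csv("daily_closing_prices.csv", index_col=0, parse_dates=True)

returns = closing.pct_change().dropna()        # daily returns per stock
corr = returns.corr()                          # pairwise correlation of returns
distance = np.sqrt(0.5 * (1.0 - corr.values))  # turn correlation into a distance

clusterer = hdbscan.HDBSCAN(min_cluster_size=3, metric="precomputed")
labels = clusterer.fit_predict(distance.astype(np.float64))

for symbol, label in zip(corr.columns, labels):
    print(symbol, "-> cluster", label)         # -1 means the stock was left unclustered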

Although the clustering result may vary depending on the time period and the similarity function you choose while running the algorithm, the stocks in this dataset should be clustered somewhat similarly to what is shown in the diagram below.

Clustered stock view (expand for diagram)

Some prominent clusters are highlighted manually in this image, based on the clustering algorithm output.



To see for yourself, you can execute the code in dbg-stock-clustering from your Amazon SageMaker notebook instance and come to the next section of this readme when finished.

4. Custom Recurrent Neural Network (RNN)

Forecasting the evolution of events over time is essential in many applications, such as financial analysis, climatology, logistics, and supply chain management. Although predicting the future is hard, and requires availability of reliable and effective indicators, the infrastructure tools and algorithmic techniques are readily available on AWS.

The following two modules in this workshop will give you an understanding of how Recurrent Neural Network (RNN)-based deep learning algorithms can be applied to sequential data, such as stock market data. You'll also know where to start if you decide to use an AWS-provided algorithm for this purpose.

At a high level, you'll follow the plan as described in the session plan diagram:

Session plan diagram

As a first step, you'll use a custom RNN-based algorithm, following the dbg-custom-rnn notebook. Since the data preparation steps have already been completed in previous modules, you'll simply submit your model to Amazon SageMaker for training. Once trained, you'll deploy the model to generate predictions, forecast future stock values, and visualize the results within the notebook to see how the deployed model performs.
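Compressed to its essentials, the notebook's interaction with the SageMaker Python SDK looks roughly like the sketch below. The image URI, bucket, and hyperparameter values are placeholders echoing the notebook's defaults; parameter names follow SDK v2 (the SDK v1 used when this workshop was written names them image_name, train_instance_count, and train_instance_type instead).

import sagemaker
from sagemaker.estimator import Estimator

session = sagemaker.Session()
role = sagemaker.get_execution_role()

rnn = Estimator(
    image_uri="<account>.dkr.ecr.<region>.amazonaws.com/dbg-custom-rnn:latest",  # image pushed by build_and_push.sh
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://<your-bucket>/output",   # placeholder bucket
    sagemaker_session=session,
)

# A few of the hyperparameters used by the training job in this repository.
rnn.set_hyperparameters(target_stock="BMW", target_column="EndPrice",
                        interval="D", horizon=5, lag=10)

rnn.fit({"training": "s3://<your-bucket>/resampled_stockdata.csv"})

predictor = rnn.deploy(initial_instance_count=1, instance_type="ml.m4.xlarge")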

Although it is possible to execute training using the compute available in your own notebook instance, containerizing your code and submitting it to Amazon SageMaker for training has a number of advantages. Managed training and hosting services on Amazon SageMaker not only give you the flexibility of choosing appropriately sized compute resources, they also ensure you only pay for what you actually use. Moreover, this approach makes it easier for data engineers to establish model pipelines by allowing them to automate these tasks in a repeatable fashion.

You can refer to the Amazon SageMaker build framework as a reference implementation of a CI/CD framework for automated build and deployment of machine learning models.

For now, you can proceed to train and deploy the custom RNN model following the code in the dbg-custom-rnn notebook. Once finished, you can come back to the following section of this readme to explore another approach, using one of Amazon SageMaker's native algorithms, as provided by AWS, free of charge.

5. SageMaker DeepAR

The previous module demonstrated that, even without covariate data from a meaningful external source, an RNN-based model can predict stock price movements better than random guessing. There are, however, better algorithms that might improve upon the forecasting results obtained by our crude RNN-based model, as you'll see in the dbg-deepar notebook.

Classical forecasting methods, such as ARIMA (Autoregressive Integrated Moving Average), attempt to predict a future value by regressing the target time series on its own lags. This technique is further improved by ARIMAX, which includes covariates and regresses on the lags of both the series itself and other related time series. In both cases, another part of the regression is done on lagged random fluctuations around the moving average, thereby accounting for the stochastic part (the moving average, MA).
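For comparison, here is a minimal classical ARIMAX sketch with statsmodels (not part of the workshop notebooks), fitting one model to a single stock's closing prices with a related stock's prices as the exogenous regressor; closing is assumed to be a DataFrame of daily closing prices with one column per symbol, as in the clustering sketch of section 3.

from statsmodels.tsa.statespace.sarimax import SARIMAX

endog = closing["BMW"]                  # target: BMW daily closing prices
exog = closing[["DAI"]]                 # covariate: Daimler daily closing prices

model = SARIMAX(endog, exog=exog, order=(1, 1, 1))   # a simple ARIMAX(1,1,1)
result = model.fit(disp=False)

# Forecasting needs future values of the exogenous series; here we naively
# repeat the last observed Daimler price over a 5-step horizon.
future_exog = exog.tail(1).values.repeat(5, axis=0)
forecast = result.forecast(steps=5, exog=future_exog)
print(forecast)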

One major drawback of both of these classical approaches is that they fit a single model to each individual time series. In reality, however, such as in the case of the stock market data we are exploring in this workshop, we encounter many similar time series across a set of cross-sectional units. It is beneficial in such cases to train a single model jointly over all these time series.

Amazon SageMaker DeepAR follows this approach and can train a model with hundreds of time series. Once trained, such a model can be used to forecast any of the time series' values into the future. Compared to our custom RNN approach, you would not need to train a different model to predict the movement of each stock.

A recent feature addition in DeepAR is the inclusion of dynamic features, which work in a way similar to how we used covariates in our custom RNN-based model. Using dynamic features as supporting time series that help explain the variability of the main time series, you can improve prediction accuracy. The values of the dynamic feature series, however, have to be known for the forecast horizon. Although you will be using metrics from the same dataset as dynamic features in this workshop, it is not realistic to know those values in advance throughout the forecast horizon.
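To make the input format concrete, DeepAR consumes training data as JSON Lines, one JSON object per time series; the field names below ("start", "target", "dynamic_feat") are DeepAR's, while the values are illustrative.

import json

record = {
    "start": "2018-01-02 00:00:00",
    # The main series to forecast, e.g. BMW daily closing prices.
    "target": [84.3, 84.9, 85.1, 84.7],
    # Supporting series of the same length, e.g. Daimler closing prices.
    # At prediction time, dynamic features must also cover the forecast horizon.
    "dynamic_feat": [[70.1, 70.4, 70.0, 69.8]],
}

with open("train.json", "w") as f:
    f.write(json.dumps(record) + "\n")   # one JSON object per line (JSON Lines)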

To adapt the techniques you learn in this workshop to a real-world use case, you might use data such as forward-looking bond prices, federal interest rates, companies' revenue or sales guidance, option pricing, etc. DeepAR's support for dynamic features would then allow you to incorporate such additional knowledge about the future into your model, thereby allowing you to forecast future prices better.

You can now proceed to explore the final approach of predicting stock price movements using DeepAR, following the code in the dbg-deepar notebook.

6. Cleanup

One advantage of using AWS for your machine learning pipeline is that you do not have to procure expensive infrastructure; you can spin up resources just in time and spin them down when not in use.

In sections 4 and 5 of this workshop, you trained your models using hosted training jobs, during which Amazon SageMaker used instances of your chosen type only while the training job was running. After a job finishes, the entry you see on the Amazon SageMaker console under Training jobs is merely a record of the jobs that ran, and doesn't consume any resources.

However, the endpoints that you created by deploying your models in both cases are long-running resources that you could use to serve your customers in a production scenario. In this workshop, you cleaned up the endpoints to avoid any cost overrun by executing the last cell in the respective notebooks. In case you haven't done so, you can always delete the endpoints from the Amazon SageMaker console by selecting the endpoints displayed under the Endpoints section and choosing the Delete action.
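If any endpoints are still running, a minimal cleanup sketch with boto3 looks like the following; the endpoint names here are hypothetical, so check the Endpoints section of the console (or the notebooks' output) for the actual names.

import boto3

sm = boto3.client("sagemaker")

# Placeholder names; replace with the endpoints listed in your console. When a
# model is deployed through the SDK, the endpoint configuration usually shares
# the endpoint's name, but verify before deleting.
for name in ["dbg-custom-rnn-endpoint", "dbg-deepar-endpoint"]:
    sm.delete_endpoint(EndpointName=name)
    sm.delete_endpoint_config(EndpointConfigName=name)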

Lastly, the notebook instance itself can and should be either stopped or terminated. If you choose to retain your work, you can simply visit the Notebook instances section in your Amazon SageMaker console and stop the instance. You can always turn it back on later, and it will retain all of your work.

amazon-sagemaker-stock-prediction-archived's People

Contributors

dbinoy, jedsundwall, jpeddicord


amazon-sagemaker-stock-prediction-archived's Issues

Issue with model deployment

Hello,

I am new to SageMaker so I have been using this to learn.

I have followed the tutorial in every step and everything went fine until the RNN model deployment cell:

%%time

# Create an endpoint on a web server
predictor = rnn.deploy(1, 'ml.m4.xlarge', serializer=csv_serializer)

When I run this, I get:

UnexpectedStatusException: Error hosting endpoint dbg-custom-rnn-H-BMW-2019-12-16-14-36-15-248: Failed. Reason: The primary container for production variant AllTraffic did not pass the ping health check. Please check CloudWatch logs for this endpoint..

Looking at CloudWatch:

19:58:36 Starting the inference server with 4 workers.
19:58:36 [2019-12-17 19:58:32 +0000] [13] [INFO] Starting gunicorn 20.0.4
19:58:36 [2019-12-17 19:58:32 +0000] [13] [INFO] Listening at: unix:/tmp/gunicorn.sock (13)
19:58:36 [2019-12-17 19:58:32 +0000] [13] [INFO] Using worker: gevent
19:58:36 [2019-12-17 19:58:32 +0000] [17] [INFO] Booting worker with pid: 17
19:58:36 [2019-12-17 19:58:32 +0000] [18] [INFO] Booting worker with pid: 18
19:58:36 [2019-12-17 19:58:33 +0000] [19] [INFO] Booting worker with pid: 19
19:58:36 [2019-12-17 19:58:33 +0000] [20] [INFO] Booting worker with pid: 20
19:58:36 Using TensorFlow backend.
19:58:36 Using TensorFlow backend.
19:58:36 Using TensorFlow backend.
19:58:36 Using TensorFlow backend.
19:59:36 [2019-12-17 19:59:35,632] ERROR in app: Exception on /ping [GET]
19:59:36 Traceback (most recent call last): File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 2446, in wsgi_app response = self.full_dispatch_request() File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1951, in full_dispatch_request rv = self.handle_user_exception(e) File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1820, in handle_user_exception
19:59:36 AttributeError: 'gevent._local.local' object has no attribute 'value'
19:59:36 10.32.0.2 - - [17/Dec/2019:19:59:35 +0000] "GET /ping HTTP/1.1" 500 290 "-" "AHC/2.0"
19:59:40 [2019-12-17 19:59:40,402] ERROR in app: Exception on /ping [GET]
19:59:40 Traceback (most recent call last): File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 2446, in wsgi_app response = self.full_dispatch_request() File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1951, in full_dispatch_request rv = self.handle_user_exception(e) File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1820, in handle_user_exception
19:59:40 AttributeError: 'gevent._local.local' object has no attribute 'value'
19:59:40 10.32.0.2 - - [17/Dec/2019:19:59:40 +0000] "GET /ping HTTP/1.1" 500 290 "-" "AHC/2.0"
19:59:45 [2019-12-17 19:59:45,398] ERROR in app: Exception on /ping [GET]
...
this goes on and on.

If I expand the 19:59:36 item, it shows this:

Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 2446, in wsgi_app
response = self.full_dispatch_request()
File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1951, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1820, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/usr/local/lib/python3.5/dist-packages/flask/_compat.py", line 39, in reraise
raise value
File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1949, in full_dispatch_request
rv = self.dispatch_request()
File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1935, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/opt/program/predictor.py", line 158, in ping
health = ScoringService.get_model() is not None
File "/opt/program/predictor.py", line 40, in get_model
cls.model = load_model(model_artifact)
File "/usr/local/lib/python3.5/dist-packages/keras/engine/saving.py", line 492, in load_wrapper
return load_function(*args, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/keras/engine/saving.py", line 584, in load_model
model = _deserialize_model(h5dict, custom_objects, compile)
File "/usr/local/lib/python3.5/dist-packages/keras/engine/saving.py", line 274, in _deserialize_model
model = model_from_config(model_config, custom_objects=custom_objects)
File "/usr/local/lib/python3.5/dist-packages/keras/engine/saving.py", line 627, in model_from_config
return deserialize(config, custom_objects=custom_objects)
File "/usr/local/lib/python3.5/dist-packages/keras/layers/init.py", line 168, in deserialize
printable_module_name='layer')
File "/usr/local/lib/python3.5/dist-packages/keras/utils/generic_utils.py", line 147, in deserialize_keras_object
list(custom_objects.items())))
File "/usr/local/lib/python3.5/dist-packages/keras/engine/network.py", line 1056, in from_config
process_layer(layer_data)
File "/usr/local/lib/python3.5/dist-packages/keras/engine/network.py", line 1042, in process_layer
custom_objects=custom_objects)
File "/usr/local/lib/python3.5/dist-packages/keras/layers/init.py", line 168, in deserialize
printable_module_name='layer')
File "/usr/local/lib/python3.5/dist-packages/keras/utils/generic_utils.py", line 149, in deserialize_keras_object
return cls.from_config(config['config'])
File "/usr/local/lib/python3.5/dist-packages/keras/engine/base_layer.py", line 1179, in from_config
return cls(**config)
File "/usr/local/lib/python3.5/dist-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/keras/engine/input_layer.py", line 87, in init
name=self.name)
File "/usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py", line 73, in symbolic_fn_wrapper
if _SYMBOLIC_SCOPE.value:
File "src/gevent/local.py", line 408, in gevent._local.local.getattribute

Please help. I already tried deleting the model and endpoint and restarting the notebook but no luck.

SyntaxError in core.py while running Model training cell in dbg-custom-rnn.ipynb

I changed the time range in dbg-data-preparation.ipynb to 30/04/2020 - 31/07/2020, and the ticker in dbg-stock-clustering.ipynb from BMW to VOW3.
In dbg-custom-rnn.ipynb, I changed the path in the ECR Repository cell as follows.


# Define model artifact name and image
account = session.boto_session.client('sts').get_caller_identity()['Account']
region = session.boto_session.region_name
image = '{}.dkr.ecr.{}.amazonaws.com/{}:latest'.format(account, region, artifactname)
os.chdir("/home/ec2-user/SageMaker/container")
!sh build_and_push.sh $artifactname

I ran the Model training code cell and hit the SyntaxError below.
Could anyone explain this error for me?


Parameter image_name will be renamed to image_uri in SageMaker Python SDK v2.
's3_input' class will be renamed to 'TrainingInput' in SageMaker Python SDK v2.
2020-08-12 15:58:15 Starting - Starting the training job...
2020-08-12 15:58:17 Starting - Launching requested ML instances......
2020-08-12 15:59:38 Starting - Preparing the instances for training...
2020-08-12 16:00:16 Downloading - Downloading input data...
2020-08-12 16:00:22 Training - Downloading the training image...
2020-08-12 16:01:13 Uploading - Uploading generated training model.2020-08-12 16:01:08.786584: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2020-08-12 16:01:08.786650: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Starting the training.
Hyperparameters file : {"target_stock": "BMW", "lag": "10", "interval": "D", "batch_size": "4096", "covariate_columns": "StartPrice, MinPrice, MaxPrice", "percent_train": "85.0", "covariate_stocks": "CON, DAI, PAH3, VOW3", "dropout_ratio": "0.1", "num_epochs": "1000", "target_column": "EndPrice", "horizon": "5", "num_units": "256"}
Hyperparameters initialized
Loading data from : /opt/ml/input/data/training/resampled_stockdata.csv
Loading data from : /opt/ml/input/data/training/resampled_stockdata.csv
Training data loaded
100 Stock symbols found.
Records for 65 trading days found.
0-CON  1-DAI  2-PAH3  3-VOW3
Exception during training: invalid syntax (core.py, line 314)
Traceback (most recent call last):
File "/opt/program/train", line 255, in train
traindata.to_csv(os.path.join(model_path, trainfile))
File "/usr/local/lib/python3.5/dist-packages/pandas/core/generic.py", line 3020, in to_csv
formatter.save()
File "/usr/local/lib/python3.5/dist-packages/pandas/io/formats/csvs.py", line 157, in save
compression=self.compression)
File "/usr/local/lib/python3.5/dist-packages/pandas/io/common.py", line 344, in _get_handle
from s3fs import S3File
File "/usr/local/lib/python3.5/dist-packages/s3fs/__init__.py", line 1, in <module>
from .core import S3FileSystem, S3File
File "/usr/local/lib/python3.5/dist-packages/s3fs/core.py", line 8, in <module>
from fsspec import AbstractFileSystem
File "/usr/local/lib/python3.5/dist-packages/fsspec/__init__.py", line 10, in <module>
from .mapping import FSMap, get_mapper
File "/usr/local/lib/python3.5/dist-packages/fsspec/mapping.py", line 2, in <module>
from .core import url_to_fs
File "/usr/local/lib/python3.5/dist-packages/fsspec/core.py", line 314
out[0] = (f"{out[0][1]}://", out[0][1], out[0][2])
^
SyntaxError: invalid syntax
2020-08-12 16:01:19 Failed - Training job failed
UnexpectedStatusException Traceback (most recent call last)
<timed exec> in <module>

~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/estimator.py in fit(self, inputs, wait, logs, job_name, experiment_config)
497 self.jobs.append(self.latest_training_job)
498 if wait:
--> 499 self.latest_training_job.wait(logs=logs)
500
501 def _compilation_job_name(self):

~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/estimator.py in wait(self, logs)
1114 # If logs are requested, call logs_for_jobs.
1115 if logs != "None":
-> 1116 self.sagemaker_session.logs_for_job(self.job_name, wait=True, log_type=logs)
1117 else:
1118 self.sagemaker_session.wait_for_job(self.job_name)

~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/session.py in logs_for_job(self, job_name, wait, poll, log_type)
3075
3076 if wait:
-> 3077 self._check_job_status(job_name, description, "TrainingJobStatus")
3078 if dot:
3079 print()

~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/session.py in _check_job_status(self, job, desc, status_key_name)
2669 ),
2670 allowed_statuses=,
-> 2671 actual_status=status,
2672 )
2673

UnexpectedStatusException: Error for Training job dbg-custom-rnn-D-BMW-2020-08-12-15-58-15-812: Failed. Reason: AlgorithmError: Exception during training: invalid syntax (core.py, line 314)
Traceback (most recent call last):
File "/opt/program/train", line 255, in train
traindata.to_csv(os.path.join(model_path, trainfile))
File "/usr/local/lib/python3.5/dist-packages/pandas/core/generic.py", line 3020, in to_csv
formatter.save()
File "/usr/local/lib/python3.5/dist-packages/pandas/io/formats/csvs.py", line 157, in save
compression=self.compression)
File "/usr/local/lib/python3.5/dist-packages/pandas/io/common.py", line 344, in _get_handle
from s3fs import S3File
File "/usr/local/lib/python3.5/dist-packages/s3fs/__init__.py", line 1, in <module>
from .core import S3FileSystem, S3File
File "/usr/local/lib/python3.5/dist-packages/s3fs/core.py", line 8, in <module>
from fsspec import AbstractFileSystem
File "/usr/local/lib/python3.5/dist-packages/fsspec/__init__.py", line 10, in <module>
from .mapping import FSMap, get_mapper
File "/usr/local/lib/python3.5/dist-packages/fsspec/map
