notadamking / stock-trading-environment

A custom OpenAI gym environment for simulating stock trades on historical price data.

License: MIT License

Python 100.00%

stock-trading-environment's Issues

Versions of installed libraries and packages

Hi, is it possible to provide the versions of installed libraries and frameworks?
It seems the versions used in this codebase are deprecated.
Thanks.
@notadamking

how to understand the reward calculation

How should this be understood? What is its exact purpose?
Thanks a lot.

delay_modifier = (self.current_step / MAX_STEPS)
reward = self.balance * delay_modifier
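
One way to read these two lines: the reward is the current balance scaled by the fraction of the episode that has elapsed, so the same balance counts for more at later steps. Below is a minimal sketch of that arithmetic. The value of `MAX_STEPS` is not shown in this issue, so the 20000 used here is only an assumption, and `delayed_reward` is a hypothetical helper name.

```python
MAX_STEPS = 20000  # assumed value for illustration; not given in the issue


def delayed_reward(balance, current_step, max_steps=MAX_STEPS):
    """Scale the balance by the fraction of the episode that has elapsed."""
    delay_modifier = current_step / max_steps
    return balance * delay_modifier


# The same balance earns a larger reward later in the episode.
early = delayed_reward(balance=10000, current_step=100)
late = delayed_reward(balance=10000, current_step=10000)
print(early, late)
```

Under this reading, the modifier discourages the agent from being rewarded for an unchanged balance early on and weights balances held late in the episode more heavily.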

Problem after changing Database

Traceback (most recent call last):
  File "main.py", line 20, in <module>
    model.learn(total_timesteps=100000)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/stable_baselines/ppo2/ppo2.py", line 292, in learn
    obs, returns, masks, actions, values, neglogpacs, states, ep_infos, true_reward = runner.run()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/stable_baselines/ppo2/ppo2.py", line 432, in run
    self.obs[:], rewards, self.dones, infos = self.env.step(clipped_actions)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/stable_baselines/common/vec_env/base_vec_env.py", line 130, in step
    return self.step_wait()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/stable_baselines/common/vec_env/dummy_vec_env.py", line 36, in step_wait
    self.envs[env_idx].step(self.actions[env_idx])
  File "/Users/XXXX/Downloads/Stock-Trading-Environment-master/env/StockTradingEnv.py", line 112, in step
    obs = self._next_observation()
  File "/Users/XXXX/Downloads/Stock-Trading-Environment-master/env/StockTradingEnv.py", line 58, in _next_observation
    ]], axis=0)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/numpy/lib/function_base.py", line 4694, in append
    return concatenate((arr, values), axis=axis)
ValueError: all the input array dimensions except for the concatenation axis must match exactly

I can't find a solution to this. If I switch to any other dataset, I get this error. Is there a fix?
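
NumPy raises this error when the arrays being joined along axis 0 do not have the same width. A plausible cause, which the issue does not confirm, is that the price window comes out shorter than expected (for example because the new dataset has fewer rows than the step index assumes) while the row appended to it still has a fixed number of values. The sketch below reproduces the error with made-up shapes; `frame` and `account_row` are illustrative names only.

```python
import numpy as np

# Five feature rows with a full window of six values each.
frame = np.zeros((5, 6))
account_row = np.ones((1, 6))  # hypothetical extra row with six values

# Widths match, so the append succeeds and gives shape (6, 6).
obs = np.append(frame, account_row, axis=0)

# If the data runs out, the window holds fewer values, e.g. four.
short_frame = np.zeros((5, 4))
try:
    np.append(short_frame, account_row, axis=0)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(obs.shape)
print(error_message)
```

Printing the shape of each array just before the failing `np.append` call is a quick way to see which side has the unexpected width.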

ModuleNotFoundError: No module named 'tensorflow.contrib'

Running main from notadamking/Stock-Trading-Environment on Google Colab reports this error.

I expected env.StockTradingEnv to run smoothly.
Thanks!

how to prevent impossible actions

Hi, I was wondering what happens when shares_held = 0 and action_type = 2, which means selling shares when there are currently none. It seems that no measures were taken to prevent impossible actions.
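
In the sell branch quoted in another issue on this page, the number of shares sold is `self.shares_held * amount`, so with zero shares held a sell works out to zero shares and leaves the balance unchanged. Below is a minimal sketch of that arithmetic. The standalone `sell` function is hypothetical, and the clamp on `amount` is an added guard that is not in the quoted code.

```python
def sell(shares_held, balance, amount, current_price):
    """Sell the fraction `amount` of the shares held."""
    amount = min(max(amount, 0.0), 1.0)  # added guard: keep the fraction in [0, 1]
    shares_sold = shares_held * amount
    balance += shares_sold * current_price
    shares_held -= shares_sold
    return shares_held, balance


# Selling with no shares is a no-op: nothing is sold, the balance is unchanged.
print(sell(shares_held=0, balance=1000.0, amount=0.5, current_price=20.0))

# Selling half of ten shares at 20 adds 100 to the balance.
print(sell(shares_held=10, balance=100.0, amount=0.5, current_price=20.0))
```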

action_type claimed to be discrete but is a BOX

If you are claiming a Discrete action space, I would expect something like

    if action_type == 1:    # BUY
        ...
    elif action_type == 2:  # SELL
        ...
    else:                   # HOLD
        ...

Why is action_type a float? I'm new to gym environments, so maybe this is more of a question.

Stock-Trading-Environment/env/StockTradingEnv.py

Line 70 in e72167b

if action_type < 1:
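
The thresholds in the step code quoted on this page (`action_type < 1`, then `action_type < 2`) suggest that the continuous value is bucketed into three actions rather than compared for equality. Below is a minimal sketch of that mapping. The `decode_action` helper is hypothetical, and the assumption that the value lies in the range 0 to 3 is not stated in this issue.

```python
def decode_action(action_type):
    """Bucket a continuous action value into buy / sell / hold."""
    if action_type < 1:
        return "buy"
    elif action_type < 2:
        return "sell"
    return "hold"


# Any float in a bucket maps to the same action.
for value in (0.3, 1.7, 2.5):
    print(value, decode_action(value))
```

With a Box space the policy outputs floats, so range checks like these work where `==` comparisons against 1 or 2 would almost never match.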

Requesting an article explaining the math of StockTradingEnv

First of all, thank you for writing those articles. I was searching for a stock simulation system similar to gym. Your article gives a good explanation of how to program a gym environment, but many of us don't know finance, so could you please add an article explaining the math behind it?

    if action_type < 1:
        # Buy amount % of balance in shares
        total_possible = self.balance / current_price
        shares_bought = total_possible * amount
        prev_cost = self.cost_basis * self.shares_held
        additional_cost = shares_bought * current_price
        self.balance -= additional_cost
        self.cost_basis = (prev_cost + additional_cost) / (
            self.shares_held + shares_bought)
        self.shares_held += shares_bought
    elif action_type < 2:
        # Sell amount % of shares held
        shares_sold = self.shares_held * amount
        self.balance += shares_sold * current_price
        self.shares_held -= shares_sold
        self.total_shares_sold += shares_sold
        self.total_sales_value += shares_sold * current_price

Looking at this code, I really don't know what is happening inside those conditional statements. It seems that it only buys and sells and never holds. I want to build a simulator that trades x number of shares (not a percentage of the balance) for an amount of, say, 100 dollars. Without understanding the math, I can't build a new stocksimenv.
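
The buy branch can be read as follows: spend a fraction of the balance, then update the cost basis to the average price paid per share across the old and new shares. When neither condition matches (`action_type` of 2 or more), nothing changes, which amounts to holding. Below is a minimal sketch with worked numbers. The standalone functions, the zero-share guard, and the fixed-share variant `buy_fixed` are all hypothetical additions, not code from the repository.

```python
def buy_fraction(balance, shares_held, cost_basis, amount, price):
    """Spend the fraction `amount` (0..1) of the balance on shares."""
    total_possible = balance / price      # shares the whole balance could buy
    shares_bought = total_possible * amount
    prev_cost = cost_basis * shares_held  # what the existing position cost
    additional_cost = shares_bought * price
    new_shares = shares_held + shares_bought
    if new_shares > 0:                    # added guard against dividing by zero
        cost_basis = (prev_cost + additional_cost) / new_shares
    return balance - additional_cost, new_shares, cost_basis


def buy_fixed(balance, shares_held, cost_basis, n_shares, price):
    """Hypothetical variant: buy a whole number of shares, capped by the balance."""
    n = min(n_shares, int(balance // price))
    cost = n * price
    new_shares = shares_held + n
    if new_shares > 0:
        cost_basis = (cost_basis * shares_held + cost) / new_shares
    return balance - cost, new_shares, cost_basis


# Spend half of 1000 at a price of 10: 50 shares, cost basis 10.
first = buy_fraction(1000.0, 0.0, 0.0, amount=0.5, price=10.0)
# Spend half of the remaining 500 at a price of 20: the basis moves toward 20.
second = buy_fraction(*first, amount=0.5, price=20.0)
# Buy exactly 3 shares at 20 with a balance of 100.
fixed = buy_fixed(100.0, 0, 0.0, n_shares=3, price=20.0)
print(first, second, fixed)
```

The cost basis is a weighted average, so buying more shares at a higher price pulls it up in proportion to how many shares were added.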

env error, possible solution??

ModuleNotFoundError Traceback (most recent call last)
in ()
7 from stable_baselines import PPO2
8
----> 9 from env.StockTradingEnv import StockTradingEnv
10
11 import pandas as pd

ModuleNotFoundError: No module named 'gym.Env'

NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.

To view examples of installing some common dependencies, click the
"Open Examples" button below.

_next_observation method might be looking into the future

Hello,
I'm still getting acquainted with OpenAI Gym, so I'm not entirely sure about this issue, but it's possible that at each step the _next_observation method is sending the information for the next (future) 5 candles instead of the last five candles.

    def _next_observation(self):
        # Get the stock data points for the last 5 days and scale to between 0-1
        frame = np.array([
            self.df.loc[self.current_step: self.current_step +
                        5, 'Open'].values / MAX_SHARE_PRICE,
            self.df.loc[self.current_step: self.current_step +
                        5, 'High'].values / MAX_SHARE_PRICE,
            self.df.loc[self.current_step: self.current_step +
                        5, 'Low'].values / MAX_SHARE_PRICE,
            self.df.loc[self.current_step: self.current_step +
                        5, 'Close'].values / MAX_SHARE_PRICE,
            self.df.loc[self.current_step: self.current_step +
                        5, 'Volume'].values / MAX_NUM_SHARES,
        ])

These are the first 6 days of the dataframe (AAPL.csv):

Index	Date	Open	High	Low	Close	Volume
0	1998-01-02	13.63	16.25	13.50	16.25	6411700.0
1	1998-01-05	16.50	16.56	15.19	15.88	5820300.0
2	1998-01-06	15.94	20.00	14.75	18.94	16182800.0
3	1998-01-07	18.81	19.00	17.31	17.50	9300200.0
4	1998-01-08	17.44	18.62	16.94	18.19	6910900.0
5	1998-01-09	18.12	19.37	17.50	18.19	7915600.0

Setting the self.current_step to 0, and printing the frame array:

[[0.002726, 0.0033, 0.003188, 0.003762, 0.003488, 0.003624], 
[0.00325, 0.003312, 0.004, 0.0038, 0.003724, 0.003874],
[0.0027, 0.003038, 0.00295, 0.003462, 0.003388, 0.0035],
[0.00325, 0.003176, 0.003788, 0.0035, 0.003638, 0.003638],
[0.00298568, 0.00271029, 0.0075357, 0.00433074, 0.00321814, 0.00368599]]

If we remove the normalization:

[13.63, 16.5,  15.94, 18.81, 17.44, 18.12] #Open
[16.25, 16.56, 20. , 19. , 18.62, 19.37] #High
[13.5,  15.19, 14.75, 17.31, 16.94, 17.5] #Low
[16.25, 15.88, 18.94, 17.5,  18.19, 18.19] #Close
[ 6411700. , 5820300. , 16182800. , 9300200. , 6910900. , 7915600. ] #Volume

If you check the dataframe, you will see that these are the prices and volumes for the current step and the next 5 (future) steps. The same logic applies to all following steps, not just step 0.
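
The effect can be reproduced without the environment. Label-based `.loc` slices in pandas include both endpoints, so `current_step: current_step + 5` selects six rows starting at the current one. The sketch below mirrors that with a plain list holding the 'Open' values from the AAPL.csv rows listed in this issue; the variable names are illustrative.

```python
# 'Open' values for index 0..5, copied from the rows listed in this issue.
opens = [13.63, 16.50, 15.94, 18.81, 17.44, 18.12]

current_step = 0

# Mirrors df.loc[current_step: current_step + 5]: both endpoints included.
forward_window = opens[current_step: current_step + 5 + 1]

# Everything after the first element is dated later than the current step.
future_rows = forward_window[1:]

print(forward_window)
print(len(future_rows), "values come from later dates")
```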

ValueError when trying to add more price data

Love the article and love the walkthrough.

I'm having an issue though. I'm trying to add a bigger array with more information. I'm following a similar format to the AAPL.csv file, but I've made my own dataset with columns Time (instead of Date), Bid, BidSize, Last, LastSize, Ask, AskSize, and Volume. So I've amended the code to read:

    # Prices contains the values for the last five prices
    self.observation_space = spaces.Box(
        low=0, high=1, shape=(8, 6), dtype=np.float16)

    def _next_observation(self):
        # Get the stock data points for the last 5 days and scale to between 0-1
        frame = np.array([
            self.df.loc[self.current_step: self.current_step +
                        LOOKBACK_PERIOD, 'Bid'].values / MAX_SHARE_PRICE,
            self.df.loc[self.current_step: self.current_step +
                        LOOKBACK_PERIOD, 'Last'].values / MAX_SHARE_PRICE,
            self.df.loc[self.current_step: self.current_step +
                        LOOKBACK_PERIOD, 'Ask'].values / MAX_SHARE_PRICE,
            self.df.loc[self.current_step: self.current_step +
                        LOOKBACK_PERIOD, 'BS'].values / MAX_BIDASK_SIZES,
            self.df.loc[self.current_step: self.current_step +
                        LOOKBACK_PERIOD, 'LS'].values / MAX_LAST_SIZE,
            self.df.loc[self.current_step: self.current_step +
                        LOOKBACK_PERIOD, 'AS'].values / MAX_BIDASK_SIZES,
            self.df.loc[self.current_step: self.current_step +
                        LOOKBACK_PERIOD, 'Volume'].values / MAX_NUM_SHARES,
        ])

...I've assigned the MAX variables in accordance with my new dataset. LOOKBACK_PERIOD is currently 5. However, AAPL.csv uses daily OHLC prices, whereas my dataset is almost second-by-second (near tick) data from a single day, so there is a lot more data to look at and I need a longer lookback period than 5 timesteps. When I change that number to something like 75, I receive this error:

Traceback (most recent call last):
  File "main.py", line 20, in <module>
    model.learn(total_timesteps=20000)
  File "c:\users...\desktop\python scripts\all tools\stable-baselines-master\stable_baselines\ppo2\ppo2.py", line 277, in learn
    runner = Runner(env=self.env, model=self, n_steps=self.n_steps, gamma=self.gamma, lam=self.lam)
  File "c:\users...\desktop\python scripts\all tools\stable-baselines-master\stable_baselines\ppo2\ppo2.py", line 399, in __init__
    super().__init__(env=env, model=model, n_steps=n_steps)
  File "c:\users...\desktop\python scripts\all tools\stable-baselines-master\stable_baselines\common\runners.py", line 19, in __init__
    self.obs[:] = env.reset()
  File "c:\users...\desktop\python scripts\all tools\stable-baselines-master\stable_baselines\common\vec_env\dummy_vec_env.py", line 45, in reset
    obs = self.envs[env_idx].reset()
  File "C:\Users...\Desktop\Python Scripts\Stock Market Prediction\Stock Trading Environment OpenAI Gym 2019! - Matt's Attempt at Level 1\env\StockTradingEnv.py", line 158, in reset
    return self._next_observation()
  File "C:\Users...\Desktop\Python Scripts\Stock Market Prediction\Stock Trading Environment OpenAI Gym 2019! - Matt's Attempt at Level 1\env\StockTradingEnv.py", line 85, in _next_observation
    ]], axis=0)
  File "C:\Users...\AppData\Local\Programs\Python\Python36\lib\site-packages\numpy\lib\function_base.py", line 4528, in append
    return concatenate((arr, values), axis=axis)
ValueError: all the input array dimensions except for the concatenation axis must match exactly

Would you be able to shed some light on why I might be receiving this message? When I leave the lookback period at 5, it does run; however, it doesn't place any trades, so I think it needs a longer lookback period to get a sense of what the prices are doing. Is it something to do with the axis?
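
A slice of `self.current_step: self.current_step + LOOKBACK_PERIOD` with `.loc` includes both endpoints, so each row of the frame has LOOKBACK_PERIOD + 1 values: 6 for a lookback of 5, and 76 for a lookback of 75. The traceback points at an `np.append(..., axis=0)` call whose code is not shown in this issue. A likely but unconfirmed cause is that the row being appended still has six values, and the declared observation shape of (8, 6) would also no longer match. The sketch below shows one way to keep the widths consistent; the account values and the zero-padding approach are assumptions made for illustration.

```python
import numpy as np

LOOKBACK_PERIOD = 75
N_FEATURES = 7  # Bid, Last, Ask, BS, LS, AS, Volume in the issue's code
window = LOOKBACK_PERIOD + 1  # the inclusive slice yields one extra column

# Stand-in for the price/size frame built in _next_observation.
frame = np.zeros((N_FEATURES, window))

# Hypothetical account row with six values, zero-padded to the window width.
account_values = np.array([0.5, 0.5, 0.0, 0.0, 0.0, 0.0])
account_row = np.zeros((1, window))
account_row[0, :account_values.size] = account_values

obs = np.append(frame, account_row, axis=0)

# The Box observation space would need this same shape.
observation_shape = (N_FEATURES + 1, window)
print(obs.shape, observation_shape)
```

Deriving both the padded row and the observation shape from `LOOKBACK_PERIOD` means the lookback can be changed in one place.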

Thanks a lot! Looking forward to the follow up articles for this!

Migration to TF2 from TF1

Should I migrate and send a PR?

Using future data

I think you are using future data as an observation, and the reinforcement learning agent is using that data to make a profit; that's why it fits the stock market line so closely.
I think that rather than using
self.df.loc[self.current_step: self.current_step + 5, 'Open'].values / MAX_SHARE_PRICE,
you should use
self.df.loc[self.current_step - 5: self.current_step, 'Open'].values / MAX_SHARE_PRICE,

Correct me if I am wrong.
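
One detail to watch with the backward-looking slice is the start of the data: when current_step is less than 5 there are not yet six past rows, so the window would come out short and no longer match the observation shape. Below is a minimal sketch of a backward window that pads missing history with zeros. The `past_window` helper and the zero-padding choice are assumptions for illustration, not part of the repository.

```python
import numpy as np


def past_window(values, current_step, lookback=5):
    """Return the values at current_step and the `lookback` steps before it.

    Missing history at the start of the series is left-padded with zeros so
    the result always has lookback + 1 entries.
    """
    values = np.asarray(values, dtype=float)
    start = max(current_step - lookback, 0)
    window = values[start: current_step + 1]
    pad = (lookback + 1) - window.size
    return np.concatenate([np.zeros(pad), window])


prices = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
print(past_window(prices, current_step=6))  # full history available
print(past_window(prices, current_step=1))  # padded at the start
```

An alternative is to start each episode at a step no smaller than the lookback, which avoids padding altogether.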