hmomin / finenvs

9 stars · 1 watcher · 1 fork · 80.02 MB

Fast Parallel Simulation of Financial Time Series Environments for Reinforcement Learning

License: GNU General Public License v3.0

Languages: Python 99.86%, Batchfile 0.09%, CMake 0.06%
Topics: reinforcement-learning, finance, fintech

finenvs's People

Contributors

hmomin, mugiwarakaizoku

Stargazers

9 stargazers

Watchers

1 watcher

Forkers

mugiwarakaizoku

finenvs's Issues

SAC Issues

@mugiwarakaizoku

I'm having trouble getting SAC to learn Cartpole effectively. Below is sample output from one of the better trials; in most trials, it can't even break above a total reward of 10.

Also, there is a memory leak somewhere that triggers after about 2.2 million samples for me. Based on the error message, it looks like it results from not detaching the output of mean = self.forward(states) in the actor file, but I'll let you see to it.
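
For reference, a minimal sketch of what that detach fix might look like, assuming the actions returned during rollout are only used to drive the environment and fill the replay buffer. `collect_actions` is a hypothetical helper written for illustration; `get_actions_and_log_probs` is the actor method that appears in the traceback below.

```python
import torch


def collect_actions(actor, states: torch.Tensor) -> torch.Tensor:
    # Hypothetical helper illustrating the suspected fix: run the actor's
    # forward pass under torch.no_grad() (or call .detach() on its output) so
    # the graph built by `mean = self.forward(states)` is freed after each
    # environment step instead of accumulating across the whole run.
    with torch.no_grad():
        actions = actor.get_actions_and_log_probs(states)[0]
    return actions
```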

Importing module 'gym_37' (/home/momin/Documents/isaacgym/python/isaacgym/_bindings/linux-x86_64/gym_37.so)
Setting GYM_USD_PLUG_INFO_PATH to /home/momin/Documents/isaacgym/python/isaacgym/_bindings/linux-x86_64/usd/plugInfo.json
PyTorch version 1.10.2+cu113
Device count 1
/home/momin/Documents/isaacgym/python/isaacgym/_bindings/src/gymtorch
Using /home/momin/.cache/torch_extensions/py37_cu113 as PyTorch extensions root...
Emitting ninja build file /home/momin/.cache/torch_extensions/py37_cu113/gymtorch/build.ninja...
Building extension module gymtorch...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
ninja: no work to do.
Loading extension module gymtorch...
/home/momin/anaconda3/envs/rlgpu/lib/python3.7/site-packages/gym/spaces/box.py:112: UserWarning: WARN: Box bound precision lowered by casting to float32
  logger.warn(f"Box bound precision lowered by casting to {self.dtype}")
Not connected to PVD
+++ Using GPU PhysX
Physics Engine: PhysX
Physics Device: cuda:0
GPU Pipeline: enabled
num samples: 48640 - evaluation return: 42.664555 - mean training return: 19.240843 - std dev training return: 21.298613
num samples: 75264 - evaluation return: 17.117111 - mean training return: 37.337551 - std dev training return: 24.090464
num samples: 96256 - evaluation return: 9.022424 - mean training return: 34.875938 - std dev training return: 26.308626
num samples: 116224 - evaluation return: 7.290071 - mean training return: 33.133209 - std dev training return: 25.416498
num samples: 136192 - evaluation return: 8.945022 - mean training return: 32.170544 - std dev training return: 25.197166
num samples: 157184 - evaluation return: 11.210396 - mean training return: 32.678501 - std dev training return: 22.590132
num samples: 178176 - evaluation return: 8.651594 - mean training return: 33.387737 - std dev training return: 24.583792
num samples: 203264 - evaluation return: 19.072805 - mean training return: 33.129204 - std dev training return: 22.507452
num samples: 224256 - evaluation return: 10.693352 - mean training return: 33.824829 - std dev training return: 22.631231
num samples: 242688 - evaluation return: 7.536970 - mean training return: 35.717468 - std dev training return: 26.653385
num samples: 263168 - evaluation return: 9.159291 - mean training return: 39.677971 - std dev training return: 26.461964
num samples: 284672 - evaluation return: 13.753101 - mean training return: 38.719288 - std dev training return: 25.779493
num samples: 306688 - evaluation return: 11.256489 - mean training return: 41.800442 - std dev training return: 27.398203
num samples: 326656 - evaluation return: 7.761618 - mean training return: 41.407909 - std dev training return: 30.619041
num samples: 349184 - evaluation return: 12.650271 - mean training return: 39.605946 - std dev training return: 26.463860
num samples: 368640 - evaluation return: 7.406431 - mean training return: 43.363243 - std dev training return: 31.207508
num samples: 391168 - evaluation return: 13.261372 - mean training return: 40.798859 - std dev training return: 28.053375
num samples: 411136 - evaluation return: 8.921464 - mean training return: 47.900318 - std dev training return: 34.384724
num samples: 431616 - evaluation return: 10.965508 - mean training return: 41.790977 - std dev training return: 30.708284
num samples: 459264 - evaluation return: 24.945358 - mean training return: 42.953075 - std dev training return: 33.013519
num samples: 480256 - evaluation return: 10.809840 - mean training return: 41.681019 - std dev training return: 29.536516
num samples: 500736 - evaluation return: 10.699786 - mean training return: 42.309608 - std dev training return: 30.666838
num samples: 519680 - evaluation return: 7.754902 - mean training return: 38.211960 - std dev training return: 29.142509
num samples: 538112 - evaluation return: 7.239990 - mean training return: 40.609222 - std dev training return: 30.046165
num samples: 557568 - evaluation return: 7.792455 - mean training return: 41.511486 - std dev training return: 29.129290
num samples: 579584 - evaluation return: 15.246098 - mean training return: 43.569519 - std dev training return: 30.588730
num samples: 598528 - evaluation return: 6.837287 - mean training return: 44.639370 - std dev training return: 32.618675
num samples: 616960 - evaluation return: 8.379328 - mean training return: 44.910011 - std dev training return: 30.007103
num samples: 635392 - evaluation return: 7.636024 - mean training return: 41.894653 - std dev training return: 29.709085
num samples: 654848 - evaluation return: 10.105590 - mean training return: 42.790398 - std dev training return: 29.942787
num samples: 674304 - evaluation return: 9.454279 - mean training return: 42.145344 - std dev training return: 30.102503
num samples: 696832 - evaluation return: 15.113770 - mean training return: 42.603848 - std dev training return: 26.551289
num samples: 743936 - evaluation return: 69.809341 - mean training return: 44.716377 - std dev training return: 31.713352
num samples: 765952 - evaluation return: 14.800228 - mean training return: 48.687096 - std dev training return: 35.265812
num samples: 784384 - evaluation return: 8.898764 - mean training return: 44.878021 - std dev training return: 29.701893
num samples: 810496 - evaluation return: 22.929743 - mean training return: 42.003948 - std dev training return: 29.101030
num samples: 830464 - evaluation return: 8.730614 - mean training return: 46.895416 - std dev training return: 30.673267
num samples: 850944 - evaluation return: 10.750460 - mean training return: 44.366295 - std dev training return: 32.735119
num samples: 869376 - evaluation return: 7.646038 - mean training return: 42.031437 - std dev training return: 30.974838
num samples: 888320 - evaluation return: 8.660542 - mean training return: 45.897411 - std dev training return: 35.273087
num samples: 910336 - evaluation return: 14.657757 - mean training return: 42.573399 - std dev training return: 29.213062
num samples: 951808 - evaluation return: 59.844833 - mean training return: 44.369228 - std dev training return: 32.552788
num samples: 972288 - evaluation return: 10.970460 - mean training return: 42.581337 - std dev training return: 26.832909
num samples: 990208 - evaluation return: 8.688063 - mean training return: 42.989204 - std dev training return: 27.803591
num samples: 1009664 - evaluation return: 10.115323 - mean training return: 44.869339 - std dev training return: 32.852955
num samples: 1028608 - evaluation return: 7.315423 - mean training return: 41.035736 - std dev training return: 32.797501
num samples: 1051648 - evaluation return: 17.410482 - mean training return: 43.608242 - std dev training return: 32.394970
num samples: 1070080 - evaluation return: 8.257707 - mean training return: 44.351231 - std dev training return: 29.345806
num samples: 1089024 - evaluation return: 7.072944 - mean training return: 44.150719 - std dev training return: 31.034515
num samples: 1107968 - evaluation return: 7.315763 - mean training return: 45.740803 - std dev training return: 29.843706
num samples: 1126912 - evaluation return: 8.030341 - mean training return: 48.802032 - std dev training return: 32.735664
num samples: 1147904 - evaluation return: 12.481560 - mean training return: 46.902039 - std dev training return: 30.762377
num samples: 1165824 - evaluation return: 7.350004 - mean training return: 49.774536 - std dev training return: 34.108013
num samples: 1184768 - evaluation return: 8.855827 - mean training return: 48.475475 - std dev training return: 33.205433
num samples: 1203200 - evaluation return: 6.800958 - mean training return: 43.822147 - std dev training return: 27.918304
num samples: 1249280 - evaluation return: 60.188492 - mean training return: 48.652798 - std dev training return: 32.888950
num samples: 1267200 - evaluation return: 7.280651 - mean training return: 43.635883 - std dev training return: 29.472729
num samples: 1314816 - evaluation return: 68.751907 - mean training return: 45.681065 - std dev training return: 32.199825
num samples: 1334784 - evaluation return: 10.479751 - mean training return: 46.177887 - std dev training return: 33.436707
num samples: 1363456 - evaluation return: 27.123913 - mean training return: 45.143280 - std dev training return: 32.398781
num samples: 1382912 - evaluation return: 10.328647 - mean training return: 43.507858 - std dev training return: 35.936104
num samples: 1401856 - evaluation return: 7.638084 - mean training return: 44.668758 - std dev training return: 31.289669
num samples: 1432576 - evaluation return: 32.943344 - mean training return: 44.900688 - std dev training return: 31.360880
num samples: 1452544 - evaluation return: 9.221864 - mean training return: 41.564133 - std dev training return: 27.927759
num samples: 1473536 - evaluation return: 11.704432 - mean training return: 48.011837 - std dev training return: 34.653778
num samples: 1496064 - evaluation return: 15.954937 - mean training return: 50.346596 - std dev training return: 35.377712
num samples: 1514496 - evaluation return: 8.035228 - mean training return: 47.771240 - std dev training return: 33.395077
num samples: 1533952 - evaluation return: 8.281386 - mean training return: 41.216488 - std dev training return: 30.946314
num samples: 1553408 - evaluation return: 10.508433 - mean training return: 44.966591 - std dev training return: 29.735842
num samples: 1575936 - evaluation return: 16.217566 - mean training return: 44.983177 - std dev training return: 33.251244
num samples: 1594368 - evaluation return: 8.081646 - mean training return: 44.372837 - std dev training return: 34.626404
num samples: 1615872 - evaluation return: 12.964675 - mean training return: 45.627056 - std dev training return: 30.419598
num samples: 1636864 - evaluation return: 11.474453 - mean training return: 44.596386 - std dev training return: 30.335295
num samples: 1656320 - evaluation return: 9.743287 - mean training return: 48.475723 - std dev training return: 34.589176
num samples: 1676288 - evaluation return: 9.889929 - mean training return: 45.983326 - std dev training return: 32.190174
num samples: 1706496 - evaluation return: 31.201733 - mean training return: 44.044250 - std dev training return: 29.999483
num samples: 1751552 - evaluation return: 67.660858 - mean training return: 47.377201 - std dev training return: 33.140793
num samples: 1771520 - evaluation return: 11.536253 - mean training return: 48.449409 - std dev training return: 33.765598
num samples: 1788928 - evaluation return: 7.400703 - mean training return: 46.131039 - std dev training return: 34.952114
num samples: 1810944 - evaluation return: 14.474745 - mean training return: 42.301899 - std dev training return: 35.764179
num samples: 1831936 - evaluation return: 10.688743 - mean training return: 46.439331 - std dev training return: 31.543478
num samples: 1860096 - evaluation return: 26.456173 - mean training return: 45.223267 - std dev training return: 31.939152
num samples: 1881600 - evaluation return: 10.944726 - mean training return: 44.479214 - std dev training return: 27.474779
num samples: 1905152 - evaluation return: 18.194380 - mean training return: 50.965813 - std dev training return: 35.656303
num samples: 1925120 - evaluation return: 9.182425 - mean training return: 46.331676 - std dev training return: 33.462132
num samples: 1943040 - evaluation return: 7.370675 - mean training return: 49.256039 - std dev training return: 32.469273
num samples: 1963520 - evaluation return: 10.486354 - mean training return: 46.099960 - std dev training return: 32.586018
num samples: 1980416 - evaluation return: 6.084354 - mean training return: 45.396633 - std dev training return: 33.130478
num samples: 2000384 - evaluation return: 11.129582 - mean training return: 45.089146 - std dev training return: 33.360134
num samples: 2021376 - evaluation return: 11.500680 - mean training return: 46.762657 - std dev training return: 32.203106
num samples: 2042880 - evaluation return: 12.560404 - mean training return: 49.993290 - std dev training return: 33.039204
num samples: 2061824 - evaluation return: 9.376321 - mean training return: 45.685165 - std dev training return: 33.842228
num samples: 2081792 - evaluation return: 11.399903 - mean training return: 44.692337 - std dev training return: 33.136116
num samples: 2101760 - evaluation return: 10.711651 - mean training return: 44.341721 - std dev training return: 31.529741
num samples: 2128896 - evaluation return: 24.776752 - mean training return: 47.567459 - std dev training return: 30.484776
num samples: 2200064 - evaluation return: 87.675751 - mean training return: 46.126362 - std dev training return: 31.622690
Traceback (most recent call last):
  File "examples/SAC_MLP_Isaac_Gym.py", line 30, in <module>
    train_SAC_MLP_on_environiment("Cartpole")
  File "examples/SAC_MLP_Isaac_Gym.py", line 20, in train_SAC_MLP_on_environiment
    actions = agent.step(states)
  File "/home/momin/Documents/GitHub/FinEnvs/finenvs/agents/SAC/SAC_agent.py", line 117, in step
    actions = self.actor.get_actions_and_log_probs(states)[0]
  File "/home/momin/Documents/GitHub/FinEnvs/finenvs/agents/SAC/actor.py", line 58, in get_actions_and_log_probs
    distribution = self.get_distribution(states)
  File "/home/momin/Documents/GitHub/FinEnvs/finenvs/agents/SAC/actor.py", line 51, in get_distribution
    mean = self.forward(states)
  File "/home/momin/Documents/GitHub/FinEnvs/finenvs/agents/networks/multilayer_perceptron.py", line 26, in forward
    return self.network(inputs)
  File "/home/momin/anaconda3/envs/rlgpu/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/momin/anaconda3/envs/rlgpu/lib/python3.7/site-packages/torch/nn/modules/container.py", line 141, in forward
    input = module(input)
  File "/home/momin/anaconda3/envs/rlgpu/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/momin/anaconda3/envs/rlgpu/lib/python3.7/site-packages/torch/nn/modules/activation.py", line 499, in forward
    return F.elu(input, self.alpha, self.inplace)
  File "/home/momin/anaconda3/envs/rlgpu/lib/python3.7/site-packages/torch/nn/functional.py", line 1391, in elu
    result = torch._C._nn.elu(input, alpha)
RuntimeError: CUDA out of memory. Tried to allocate 2.00 MiB (GPU 0; 10.75 GiB total capacity; 8.52 GiB already allocated; 31.38 MiB free; 8.60 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

real	0m36.206s
user	0m37.513s
sys	0m3.575s
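
As a stopgap, the OOM message suggests setting max_split_size_mb via PYTORCH_CUDA_ALLOC_CONF to reduce allocator fragmentation, though if the real cause is the retained autograd graph this would only delay the crash. A sketch, assuming it sits at the very top of the training script before torch is imported (the 128 MB split size is just an illustrative choice, not a tuned value):

```python
import os

# Must be set before torch allocates any CUDA memory, so place it at the top
# of the training script (e.g. examples/SAC_MLP_Isaac_Gym.py).
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch  # noqa: E402  (imported after the env var so the setting takes effect)
```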
