babcockt18 / rl-dynamicconvexrisk

This project forked from acoache/rl-dynamicconvexrisk

Python code to perform risk-sensitive Reinforcement Learning with dynamic convex risk measures

Python 52.37% Jupyter Notebook 47.63%

rl-dynamicconvexrisk's Introduction

Reinforcement Learning with Dynamic Convex Risk Measures

This GitHub repository gathers the Python code to run the actor-critic algorithm and replicate the experiments in the paper Reinforcement Learning with Dynamic Convex Risk Measures by Anthony Coache and Sebastian Jaimungal. There is one folder for each set of experiments: the statistical arbitrage, cliff walking and hedging with friction examples. There is also a Python notebook that shows how to use our code and replicate some of the experiments.

For further details on the algorithm and theoretical aspects of the problem, please refer to our paper.

Thank you for your interest in my research work. If you have any additional enquiries, please reach out to me at [email protected].

Authors

Anthony Coache & Sebastian Jaimungal


All folders have the same structure, with the following files:

  • hyperparams.py
  • main.py
  • main_plot.py
  • actor_critic.py
  • envs.py
  • models.py
  • risk_measure.py
  • utils.py

hyperparams.py

This file contains functions to initialize and print all hyperparameters, both for the environment and the actor-critic algorithm.
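As a rough illustration of this pattern, the sketch below groups a few settings in a dataclass and provides one function to initialize them and one to format them for printing. The names `Hyperparams`, `init_params` and `format_params`, the field names and the default values are all hypothetical placeholders, not the repository's actual code.

```python
from dataclasses import dataclass, asdict


@dataclass
class Hyperparams:
    """Hypothetical container; field names and defaults are illustrative only."""
    n_epochs: int = 300        # number of training epochs
    n_episodes: int = 1000     # outer episodes simulated per epoch
    n_transitions: int = 100   # inner transitions simulated per visited state
    lr_value: float = 1e-3     # learning rate of the value-function network
    lr_policy: float = 1e-3    # learning rate of the policy network
    hidden_size: int = 16      # hidden nodes per layer


def init_params(**overrides) -> Hyperparams:
    """Return the default hyperparameters with any user overrides applied."""
    return Hyperparams(**overrides)


def format_params(params: Hyperparams) -> str:
    """Build a one-setting-per-line summary suitable for printing or logging."""
    return "\n".join(f"{name}: {value}" for name, value in asdict(params).items())


params = init_params(n_epochs=50)
print(format_params(params))
```

Keeping every setting in one object makes it easier to reuse identical parameters in the training and testing scripts.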

main.py

This file contains the program that runs the training phase. The first part imports libraries and initializes all parameters for the environment, the neural networks and the risk measure. Notable parameters that the user must specify in hyperparams.py include the number of epochs, the learning rates, the size of the neural networks and the number of episodes/transitions. The next section is the training phase, whose skeleton is given in the paper; it relies mostly on functions from actor_critic.py. Finally, the models for the policy and value function are saved in a folder, along with diagnostic plots.

main_plot.py

This file contains the program that runs the testing phase. The first part imports libraries and initializes all parameters; these must be identical to the ones used in main.py. The next section evaluates the policy found by the algorithm by running several simulations with the best behavior found by the actor-critic algorithm. Finally, it outputs graphics to assess the performance of the procedure, such as the preferred action in every possible state and the estimated distribution of the cost when following the best policy.

actor_critic.py

The whole algorithm is wrapped in a single class named ActorCriticPG, whose input arguments specify the problem the agent faces. The user supplies an environment, a (convex) risk measure, and two neural network structures that play the roles of the value function and the agent's policy. Each instance has functions to select actions from the policy, either at random or using the best behavior found so far, and to return the set of invalid actions. Another function simulates (outer) episodes and (inner) transitions using the simulation-upon-simulation approach discussed in the paper. The value function update is wrapped in a function that takes as inputs the mini-batch size, the number of epochs and characteristics of the value function network, such as the learning rate and the number of hidden nodes. Similarly, another function implements the policy update and takes as inputs the mini-batch size and the number of epochs.
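The nested simulation idea can be sketched on a toy problem as follows. This is a simplified illustration, not the code of ActorCriticPG: the dynamics in `step`, the fixed `policy` and the function `simulate_nested` are all hypothetical, and the sketch uses only the standard library rather than PyTorch. At every state visited by an outer episode, several inner transitions are drawn from the same state and action, which gives a sample of one-step costs from which a one-step risk could be estimated.

```python
import random


def step(state, action, rng):
    """Hypothetical one-step dynamics: returns (next_state, cost)."""
    next_state = state + action + rng.gauss(0.0, 1.0)
    cost = next_state ** 2
    return next_state, cost


def policy(state):
    """Hypothetical deterministic policy that pushes the state toward zero."""
    return -0.5 * state


def simulate_nested(n_episodes, n_periods, n_inner, seed=0):
    """Simulate outer episodes and, at every visited state, inner transitions.

    Returns a list of episodes. Each episode is a list of
    (state, action, inner_costs) tuples, where inner_costs holds n_inner
    one-step costs drawn from the same (state, action) pair.
    """
    rng = random.Random(seed)
    episodes = []
    for _ in range(n_episodes):
        state = rng.gauss(0.0, 1.0)
        trajectory = []
        for _ in range(n_periods):
            action = policy(state)
            # Inner simulation: resample the transition from the same state.
            inner_costs = [step(state, action, rng)[1] for _ in range(n_inner)]
            trajectory.append((state, action, inner_costs))
            # Outer simulation: a single transition moves the episode forward.
            state, _ = step(state, action, rng)
        episodes.append(trajectory)
    return episodes


episodes = simulate_nested(n_episodes=3, n_periods=4, n_inner=5)
```

The cost of this scheme grows as the product of the three counts, which is why the numbers of episodes and transitions are exposed as hyperparameters.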

envs.py

This file contains the environment class for the RL problem, as well as functions to interact with it. It provides both PyTorch and NumPy versions of the simulation engine.
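For readers unfamiliar with this layout, a minimal environment class might look like the sketch below. It is an assumption-laden toy, not one of the repository's environments: the class name `ToyEnvironment`, its `reset` and `step` methods, the dynamics and the cost values are all hypothetical, and it uses the standard library instead of PyTorch or NumPy to stay dependency-free.

```python
import random


class ToyEnvironment:
    """Hypothetical environment: a noisy walk on 0..n_states-1 with costly boundaries."""

    def __init__(self, n_states=5, n_periods=10, seed=0):
        self.n_states = n_states
        self.n_periods = n_periods
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        """Start a new episode in the middle of the state space."""
        self.position = self.n_states // 2
        self.time = 0
        return self.position, self.time

    def step(self, action):
        """Apply an action in {-1, 0, +1}, add noise, and return (state, cost, done)."""
        noise = self.rng.choice([-1, 0, 1])
        moved = self.position + action + noise
        # Clip the new position so that it stays inside the state space.
        self.position = min(max(moved, 0), self.n_states - 1)
        self.time += 1
        # Cost of 1 per step, plus a penalty for touching either boundary.
        on_boundary = self.position in (0, self.n_states - 1)
        cost = 1.0 + (10.0 if on_boundary else 0.0)
        done = self.time >= self.n_periods
        return (self.position, self.time), cost, done


env = ToyEnvironment()
state = env.reset()
done = False
total_cost = 0.0
while not done:
    state, cost, done = env.step(action=0)
    total_cost += cost
```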

models.py

This file groups the model classes used to build ANN structures with the PyTorch library.

risk_measure.py

This file has the class that creates an instance of a risk measure, with functions to compute the risk and calculate its gradient. Risk measures currently implemented are the expectation, the conditional value-at-risk (CVaR), the mean-semideviation, a penalized version of the CVaR, and a linear combination of the mean and CVaR. More specifically, we have

[equations: definitions of the risk measures listed above; rendered as images in the original README]
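To make some of these measures concrete, the sketch below estimates them from a sample of costs using standard textbook forms: the sample mean, the Rockafellar-Uryasev representation of CVaR, the upper mean-semideviation, and a convex combination of mean and CVaR. It assumes that larger outcomes are worse. The function names, the parameter names (`alpha`, `kappa`, `r`) and the placement of the weight in the combination are assumptions that may differ from the conventions in risk_measure.py; the penalized CVaR and the gradient computations are not sketched.

```python
def expectation(costs):
    """Sample mean of the costs."""
    return sum(costs) / len(costs)


def cvar(costs, alpha):
    """Empirical CVaR at level alpha, with 0 <= alpha < 1.

    Uses the Rockafellar-Uryasev form min_q { q + E[(Y - q)_+] / (1 - alpha) }.
    The objective is convex and piecewise linear in q, so for an empirical
    distribution the minimum is attained at one of the sample points.
    """
    n = len(costs)

    def objective(q):
        excess = sum(max(y - q, 0.0) for y in costs) / n
        return q + excess / (1.0 - alpha)

    return min(objective(q) for q in costs)


def mean_semideviation(costs, kappa, r=2):
    """E[Y] + kappa * (E[(Y - E[Y])_+ ** r]) ** (1 / r), with 0 <= kappa <= 1."""
    mean = expectation(costs)
    upper = sum(max(y - mean, 0.0) ** r for y in costs) / len(costs)
    return mean + kappa * upper ** (1.0 / r)


def mean_cvar(costs, alpha, kappa):
    """Convex combination kappa * E[Y] + (1 - kappa) * CVaR_alpha(Y)."""
    return kappa * expectation(costs) + (1.0 - kappa) * cvar(costs, alpha)


sample = [1.0, 2.0, 3.0, 4.0]
risk_mean = expectation(sample)
risk_cvar = cvar(sample, alpha=0.5)
risk_msd = mean_semideviation(sample, kappa=1.0)
risk_mix = mean_cvar(sample, alpha=0.5, kappa=0.5)
```

On this four-point sample, the CVaR at level 0.5 equals the average of the two largest costs, and setting alpha to 0 recovers the sample mean.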

utils.py

This file contains some useful functions and variables, such as a function to create new directories and colors for the visualizations.


rl-dynamicconvexrisk's People

Contributors

acoache, sebjai
