neural-combinatorial-optimization-rl-tensorflow's Introduction

Neural Combinatorial Optimization with RL

TensorFlow implementation and extension of Neural Combinatorial Optimization with Reinforcement Learning (Bello et al., 2016) for the Traveling Salesman Problem (TSP) and the TSP with Time Windows (TSP-TW).

(figure: model)

The neural network consists of an RNN or self-attentive encoder-decoder, with an attention module connecting the decoder to the encoder (via a "pointer"). The model is trained by policy gradient (REINFORCE, Williams 1992).
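As a rough illustration of the "pointer", the sketch below computes one decoding step of additive attention over the encoder outputs, in the form described by Bello et al. (2016). It is a NumPy sketch, not the repository's TensorFlow code: the function name `pointer_probs`, the weight names `W_ref`, `W_q`, `v`, and the random inputs are all assumptions made for the example.

```python
import numpy as np

def pointer_probs(enc, query, W_ref, W_q, v, visited):
    """Return a distribution over cities for one decoding step.

    enc:     (n, d) encoder outputs, one row per city
    query:   (d,)   decoder state
    visited: (n,)   boolean mask, True for cities already in the tour
    """
    # Additive attention score for every city: v^T tanh(W_ref e_i + W_q q)
    scores = np.tanh(enc @ W_ref + query @ W_q) @ v
    # Visited cities get -inf so the softmax gives them zero probability
    scores = np.where(visited, -np.inf, scores)
    scores = scores - scores[~visited].max()  # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum()

# Hypothetical toy inputs: 5 cities, hidden size 8
rng = np.random.default_rng(0)
n, d = 5, 8
enc = rng.normal(size=(n, d))
query = rng.normal(size=d)
W_ref, W_q = rng.normal(size=(d, d)), rng.normal(size=(d, d))
v = rng.normal(size=d)
visited = np.array([True, False, False, True, False])
probs = pointer_probs(enc, query, W_ref, W_q, v, visited)
```

The decoder would sample the next city from `probs`. Masking visited cities is what keeps every sampled sequence a valid permutation.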

Requirements

Architecture

(in progress)

Usage

TSP

  • To train a (2D TSP20) model from scratch (data is generated on the fly):
> python main.py --max_length=20 --inference_mode=False --restore_model=False --save_to=20/model --log_dir=summary/20/repo

NB: Make sure ./save/20/model exists before training (create the folder if it does not).
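If you prefer to create the folder from Python, a minimal helper could look like the sketch below. The function name `ensure_save_dir` is hypothetical, and the `save` root is taken from the path in the note above. The example runs inside a temporary directory so it leaves nothing behind.

```python
import os
import tempfile

def ensure_save_dir(save_to, root="save"):
    """Create <root>/<save_to> if it is missing and return the path."""
    path = os.path.join(root, save_to)
    os.makedirs(path, exist_ok=True)  # no error if the folder already exists
    return path

# Demonstrated under a temporary root; from the repository root you
# would simply call ensure_save_dir("20/model").
with tempfile.TemporaryDirectory() as tmp:
    created = ensure_save_dir("20/model", root=os.path.join(tmp, "save"))
    exists = os.path.isdir(created)
```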

  • To visualize training on tensorboard:
> tensorboard --logdir=summary/20/repo
  • To test a trained model:
> python main.py --max_length=20 --inference_mode=True --restore_model=True --restore_from=20/model

TSP-TW

  • To pretrain a (2D TSPTW20) model with infinite travel speed from scratch:
> python main.py --inference_mode=False --pretrain=True --restore_model=False --speed=1000. --beta=3 --save_to=speed1000/n20w100 --log_dir=summary/speed1000/n20w100
  • To fine tune a (2D TSPTW20) model with finite travel speed:
> python main.py --inference_mode=False --pretrain=False --kNN=5 --restore_model=True --restore_from=speed1000/n20w100 --speed=10.0 --beta=3 --save_to=speed10/s10_k5_n20w100 --log_dir=summary/speed10/s10_k5_n20w100

NB: Make sure the folders passed to --save_to exist before training.

  • To visualize training on tensorboard:
> tensorboard --logdir=summary/speed1000/n20w100
> tensorboard --logdir=summary/speed10/s10_k5_n20w100
  • To test a trained model with finite travel speed on Dumas instances (in the benchmark folder):
> python main.py --inference_mode=True --restore_model=True --restore_from=speed10/s10_k5_n20w100 --speed=10.0
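To see why travel speed matters for TSP-TW, the sketch below counts how many cities a tour reaches after their time window has closed. It rests on assumptions the README does not state: travel time is Euclidean distance divided by speed, arriving early means waiting until the window opens, and the helper `count_late_arrivals` and the toy data are hypothetical. The repository's own feasibility check may differ.

```python
import numpy as np

def count_late_arrivals(coords, windows, tour, speed):
    """Count cities reached after their window closes.

    coords:  (n, 2) city coordinates
    windows: (n, 2) [open, close] times per city
    tour:    visiting order (city indices), starting at the depot
    speed:   distance travelled per unit of time
    """
    time, late = 0.0, 0
    for prev, nxt in zip(tour[:-1], tour[1:]):
        time += np.linalg.norm(coords[nxt] - coords[prev]) / speed
        time = max(time, windows[nxt, 0])  # wait if the window is not open yet
        late += time > windows[nxt, 1]     # arrived after the window closed
    return int(late)

# Hypothetical toy instance: three cities on a line, tight windows
coords = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
windows = np.array([[0.0, 100.0], [0.0, 1.0], [0.0, 1.5]])
tour = [0, 1, 2]
late_slow = count_late_arrivals(coords, windows, tour, speed=1.0)
late_fast = count_late_arrivals(coords, windows, tour, speed=10.0)
```

Under these assumptions, a very large speed makes travel time negligible and leaves only the ordering of the windows to satisfy. That is one way to read the "infinite travel speed" pretraining step.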

Results

TSP

Sampling 128 permutations with the Self-Attentive Encoder + Pointer Decoder:

  • Comparison to Google OR-Tools on 1000 TSP20 instances: (predicted tour length) = 0.9983 * (target tour length)

(figure: Self_Net_TSP20)
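The README does not say how the 0.9983 coefficient was obtained. One plausible reading is the slope of a least-squares fit through the origin between predicted and target tour lengths. The sketch below shows that computation together with a closed-tour length function. Both helper names and the toy data are assumptions made for illustration.

```python
import numpy as np

def tour_length(coords, tour):
    """Length of the closed tour visiting coords in the given order."""
    ordered = coords[np.asarray(tour)]
    # Roll by one to pair each city with its successor, closing the loop
    steps = ordered - np.roll(ordered, -1, axis=0)
    return float(np.linalg.norm(steps, axis=1).sum())

def length_ratio(pred_lengths, target_lengths):
    """Slope of a least-squares fit through the origin: pred ~ slope * target."""
    pred = np.asarray(pred_lengths, dtype=float)
    target = np.asarray(target_lengths, dtype=float)
    return float(pred @ target / (target @ target))

# Hypothetical toy data: the four corners of the unit square
square = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
perimeter = tour_length(square, [0, 1, 2, 3])  # follows the edges
crossing = tour_length(square, [0, 2, 1, 3])   # uses both diagonals
slope = length_ratio([2.0, 4.0], [1.0, 2.0])
```

A slope below 1 would mean the sampled tours are, on average, slightly shorter than the reference tours.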

TSP-TW

Sampling 256 permutations with the RNN Encoder + Pointer Decoder, followed by 2-opt post-processing on the best tour:

  • Dumas instance n20w100.001 (figure: tsptw1)
  • Dumas instance n20w100.003 (figure: tsptw2)
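For readers unfamiliar with 2-opt, here is a minimal sketch of the generic local search: it reverses tour segments as long as doing so shortens the closed tour. It is not the repository's post-processing code. In particular it ignores time windows, which a TSP-TW variant would need to check before accepting a move. The function names and toy data are hypothetical.

```python
import numpy as np

def tour_length(coords, tour):
    """Length of the closed tour visiting coords in the given order."""
    ordered = coords[np.asarray(tour)]
    steps = ordered - np.roll(ordered, -1, axis=0)
    return float(np.linalg.norm(steps, axis=1).sum())

def two_opt(coords, tour):
    """Reverse segments while that strictly shortens the tour."""
    best = list(tour)
    best_len = tour_length(coords, best)
    improved = True
    while improved:
        improved = False
        # Keep position 0 fixed (the depot) and try every segment [i, j]
        for i in range(1, len(best) - 1):
            for j in range(i + 1, len(best)):
                candidate = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                cand_len = tour_length(coords, candidate)
                if cand_len < best_len - 1e-12:
                    best, best_len, improved = candidate, cand_len, True
    return best, best_len

# Hypothetical toy data: a self-crossing tour on the unit square
square = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
start = [0, 2, 1, 3]
improved_tour, improved_len = two_opt(square, start)
```

In the pipeline described above, the starting tour would be the best of the sampled permutations rather than a hand-written one.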

Authors

Michel Deudon / @mdeudon

Pierre Cournut / @pcournut

References

Bello, I., Pham, H., Le, Q. V., Norouzi, M., & Bengio, S. (2016). Neural combinatorial optimization with reinforcement learning. arXiv preprint arXiv:1611.09940.
