deepholdem's Introduction

DeepHoldem

This is an implementation of DeepStack for No Limit Texas Hold'em, extended from DeepStack-Leduc.

Setup

Running any of the DeepHoldem code requires Lua and Torch. Please install Torch with Lua 5.2 instead of LuaJIT. Torch is only officially supported on *NIX-based systems (i.e. Linux and Mac OS X).

Connecting DeepHoldem to a server or running DeepHoldem on a server will require the luasocket package. This can be installed with luarocks (which is installed as part of the standard torch distribution) using the command luarocks install luasocket. Visualising the trees produced by DeepHoldem requires the graphviz package, which can be installed with luarocks install graphviz. Running the code on the GPU requires cutorch which can be installed with luarocks install cutorch.
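
As a quick sanity check that the optional packages are visible to your Torch install, you can try loading them from th; only the ones you actually need have to succeed (the module names below are the standard ones for luasocket and cutorch):

-- luasocket exposes the 'socket' module; cutorch is only needed for GPU runs.
require 'socket'
require 'cutorch'
print('optional dependencies loaded')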

The HandRanks file was too big for GitHub, so you will need to unzip it: cd Source/Game/Evaluation && unzip HandRanks.zip

scatterAdd

When you try to run DeepHoldem, you will eventually run into a problem where scatterAdd is not defined. Torch7 actually includes a C++ implementation of scatterAdd but, for whatever reason, doesn't include a Lua wrapper for it.

I've included TensorMath.lua files in the torch folder of this repository that include the wrapper functions for both CPU and GPU. Copy them to their corresponding torch installation folders.

Now, from your torch installation directory, run:

./clean.sh
TORCH_LUA_VERSION=LUA52 ./install.sh

and you should be good to go.
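
After rebuilding, a quick sanity check (run with th) can confirm the wrapper is exposed; this is a throwaway snippet, not part of the repository:

require 'torch'

-- If scatterAdd is wired up, this accumulates ones into dst and prints it;
-- otherwise it errors with "attempt to call method 'scatterAdd' (a nil value)".
local dst = torch.zeros(2, 3)
local idx = torch.LongTensor{{1, 2, 2}, {3, 1, 1}}
local src = torch.ones(2, 3)
dst:scatterAdd(2, idx, src)
print(dst)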

Performance

This implementation was tested against Slumbot 2017, the only publicly playable bot as of June 2018. The action abstraction used was half pot, pot, and all-in for the first action, then pot and all-in from the second action onwards. It achieved a baseline winrate of 42bb/100 after 2616 hands (equivalent to ~5232 duplicate hands). Notably, it achieved this while playing inside Slumbot's action abstraction space.

A comparison of preflop ranges was also done against DeepStack's hand history, showing similar results.

[Preflop range charts comparing DeepStack and DeepHoldem: open fold, open pot, 3bet pot after a pot open]

Average thinking times on NVIDIA Tesla P100:

Street    Thinking time (s)
Preflop   2.69
Flop      12.42
Turn      7.57
River     3.33

Training details:

Network              # samples   Validation Huber loss
River network        1,000,000   0.0415
Turn network         1,000,000   0.045
Flop network         1,000,000   0.013
Preflop aux network  1,000,000   0.0017
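
For reference, the "Validation Huber loss" above is a Huber loss on the networks' outputs. The repository uses its own masked criterion (see the M.criterion:forward(outputs, targets, mask) call quoted in the issues below), but a rough, unmasked stand-in can be computed with stock nn; the tensor shapes here are made up for illustration:

require 'nn'

-- nn.SmoothL1Criterion is the Huber loss with delta = 1, averaged over elements.
local crit = nn.SmoothL1Criterion()
local predicted = torch.randn(10, 1008)  -- hypothetical batch of predicted values
local target = torch.randn(10, 1008)
print(crit:forward(predicted, target))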

Creating your own models

Other than the preflop auxiliary network, the counterfactual value networks are not included in this release; you will need to generate them yourself. The model generation pipeline differs a bit from the Leduc Hold'em implementation in that the generated data is saved to disk as raw solutions rather than bucketed solutions, which makes it easier to experiment with different bucketing methods.

Here's a step by step guide to creating models:

  1. cd Source && th DataGeneration/main_data_generation.lua 4
  2. Wait for enough data to be generated.
  3. Modify the last line of Training/raw_converter.lua to specify the folder containing the raw training data from step 1 (source folder) and the folder where you want the bucketed training data to go (dest folder); see the sketch after this list.
  4. th Training/raw_converter.lua 4
  5. th Training/main_train.lua 4
  6. Models will be generated under Data/Models/NoLimit. Pick the model you like best and place it inside Data/Models/NoLimit/river along with its .info file. Rename them to final_cpu.info and final_cpu.model. Please refer to the DeepStack-Leduc tutorial if you want to convert them to GPU models.
  7. Repeat steps 1-6 for turn and flop by replacing 4 with 3 or 2 and placing the models under the turn and flop folders.
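
As a hedged illustration of step 3, the edited last line of Training/raw_converter.lua might look like the call below. The destination path here is hypothetical; the call shape matches the one quoted in the issues further down this page:

-- street number from the command line, "" keeps the default TrainSamples
-- source folder, and the third argument is the bucketed-data destination.
convert(tonumber(arg[1]), "", "../Data/BucketSamples/NoLimit/")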

If you want to speed up data generation with a GPU, make sure to modify Settings/arguments.lua so that params.gpu = true

Playing against DeepHoldem

Player/manual_player.lua is supplied so you can play against DeepHoldem for preflop situations. If you want DeepHoldem to work for flop, turn and river, you will need to create your own models.

  1. cd ACPCServer && make
  2. ./dealer testMatch holdem.nolimit.2p.reverse_blinds.game 1000 0 Alice Bob
  3. 2 ports will be output, note them down as port1 and port2
  4. Open a second terminal and cd Source && th Player/manual_player.lua <port1>
  5. Open a third terminal and cd Source && th Player/deepstack.lua <port2>. It will take about 20 minutes to load all the flop buckets, but this is actually not necessary until you've created your own flop model. You can skip the flop bucket computation by commenting out line 44 of Source/Nn/next_round_value_pre.lua.
  6. Once the deepstack player is done loading, you can play against it using the manual_player terminal. f = fold, c = check/call, 450 = raise my total pot commitment to 450 chips.

Differences from the original paper

  • A river model was used instead of solving directly from the turn
  • Different neural net architecture
    • Batch normalization layers were added between hidden layers because they were found to improve Huber loss (the layer stack is sketched after this list)
    • Only 3 hidden layers were used. Additional layers didn't improve huber loss, in agreement with the paper.
  • Preflop solving was done with the auxiliary network only, whereas the paper used 20 iterations of the flop network
    • Because of this, the cfvs for a given flop must be calculated after seeing it by solving the preflop again with the current flop in mind
  • During re-solving, the opponent ranges were not warm started
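
A minimal torch nn sketch of the hidden-layer stack described above, assuming the 500-unit width and the 1009-input/1008-output sizes that appear in the architecture dump quoted in the issues further down this page (an illustration only, not the repository's actual net builder):

require 'nn'

-- Three Linear -> BatchNormalization -> PReLU hidden blocks followed by an
-- output layer, as described in the list above.
local function build_value_net(input_size, output_size, hidden_size)
  local net = nn.Sequential()
  net:add(nn.Linear(input_size, hidden_size))
  net:add(nn.BatchNormalization(hidden_size))
  net:add(nn.PReLU())
  for _ = 1, 2 do
    net:add(nn.Linear(hidden_size, hidden_size))
    net:add(nn.BatchNormalization(hidden_size))
    net:add(nn.PReLU())
  end
  net:add(nn.Linear(hidden_size, output_size))
  return net
end

local net = build_value_net(1009, 1008, 500)
print(net)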

Future work

  • Warm start opponent ranges for re-solving
  • Cache flop buckets so initializing next_round_value_pre doesn't take 20 minutes
  • Speed up flop solving (use flop network during preflop solving?)
  • Support LuaJIT
  • C++ implementation?

deepholdem's People

Contributors

happypepper

deepholdem's Issues

Assertion "indexValue >=0" failed in next_round_value_pre.lua:114

Hi,

When I tried to play against DeepHoldem, I got an assertion failure. It seems I have the same problem as this issue. Here is the error: cuda runtime error (59) : device-side assert triggered at .../cutorch/lib/generic/THCTensorScatterGather.cu:268

stack traceback:

./Nn/next_round_value_pre.lua:114: in function '_card_range_to_bucket_range_on_board'
./Nn/next_round_value_pre.lua:247: in function 'get_value_aux'
./Lookahead/lookahead.lua:254: in function '_compute_terminal_equities_next_street_box'
./Lookahead/lookahead.lua:323: in function '_compute_terminal_equities'
./Lookahead/lookahead.lua:93: in function '_compute'
./Lookahead/lookahead.lua:66: in function 'resolve'
./Lookahead/resolving.lua:182: in function 'get_chance_action_cfv'
./Player/continual_resolving.lua:112: in function '_update_invariant'
./Player/continual_resolving.lua:81: in function '_resolve_node'
./Player/continual_resolving.lua:142: in function 'compute_action'
./Player/deepstack.lua:41: in main chunk

Here is the code at the error point (next_round_value_pre.lua:114):

function NextRoundValuePre:_card_range_to_bucket_range_on_board(board_idx, card_range, bucket_range)
  local other_bucket_range = bucket_range:view(-1,self.bucket_count + 1):zero()

  local indexes = self.board_indexes_scatter:view(1,self.board_count, game_settings.hand_count)[{{},board_idx,{}}]
    :expand(bucket_range:size(1), game_settings.hand_count)
  other_bucket_range:scatterAdd(
    2,
    indexes,
    card_range
      :view(-1,game_settings.hand_count)
      :expand(card_range:size(1), game_settings.hand_count))
end

I think the problem is the "indexes" tensor, because the assertion (in THCTensorScatterGather.cu) is:

[screenshot of the assertion check in THCTensorScatterGather.cu]

I still cannot fix it; I've been stuck here for a long time.

Any ideas would be great. Thanks very much!

scatterAdd not working

Somehow I do not get the scatterAdd solution to work:

I got torch from:
git clone https://github.com/torch/distro.git ~/torch --recursive

then in the torch folder changed TensorMath.lua as described...

and installed via:
TORCH_LUA_VERSION=LUA52 ./install.sh

(but installed cutorch via luarocks -> is that wrong?)

after all I get:

./Game/card_tools.lua:30: attempt to call method 'scatterAdd' (a nil value)

Not playing best strategy?

I guess it's meant to not be predictable to the opponent?

For example, it pushed me all-in with 0.12 probability while check and raise were each around 0.44.
That was with two low pairs on the river, and only really bad players would call a 10x-pot all-in with losing hands (I guess).

It doesn't make a lot of sense to me to make moves like this. Am I missing something here, or is this intended? My models are trained to around 0.05 validation loss (1M hands), except the preflop network, which is the provided one.

Segmentation fault (core dumped)

Hi,

when trying to do a preflop match with deepstack I am getting a "Segmentation fault (core dumped)" error. It doesn't matter if I skip the bucket computation or not. Running on Ubuntu 18.04

Any hints on how to resolve this?

Bug in node.strategy

Source\Tree\strategy_filling.lua
function StrategyFilling:_fill_chance(node)

code:
--remove 2 because each player holds one card
node.strategy[i][mask] = 1.0 / (game_settings.card_count - 2)

Is this correct? Each player holds 2 cards in Hold'em, so I think the correct value is - 4 (2 × 2).

main_train does still nothing

Essentially this problem again: #38
But since a solution was never provided, I am reopening it.

I am using Google Colab's free GPU to generate data and train models. I got the data generated and converted, but can't get a model trained.
Whenever I run the training code, the script terminates at line 60 of Training/train.lua, at the line:
M.network(inputs)
This call never returns and doesn't give any error or warning either.
I've tried different batch sizes, on both CPU and GPU.

Unnecessary init checks?

First, thank you for your work! It is really impressive.
One question: in bucketer.lua there are some init checks, like:

function M:_init()
  if self._ihr_pair_to_bucket == nil then

and

if self._turn_means == nil then

and so on.
Then, right away, there is a call:

M:_init()

So is there any point in checking e.g. self._ihr_pair_to_bucket == nil if this is an init function anyway and we will only call it once, at initialization time?

The full relevant code for convenience:

function M:_init()
  if self._ihr_pair_to_bucket == nil then
    local f = assert(io.open("./Nn/Bucketing/riverihr.dat", "rb"))
    local data = f:read("*all")

    self._river_ihr = {}
    for i = 1, string.len(data), 7 do
      local key = 0
      for j = i,i+4 do
        key = key + data:byte(j) * (2 ^ ((4 - j + i) * 8))
      end
      local win = data:byte(i+5)
      local tie = data:byte(i+6)
      self._river_ihr[key] = win*200 + tie
    end
    f:close()

    local f = assert(io.open("./Nn/Bucketing/rcats.dat", "r"))
    self.river_buckets = f:read("*number")
    self._ihr_pair_to_bucket = {}
    for i = 1, self.river_buckets do
      local win = f:read("*number")
      local tie = f:read("*number")
      self._ihr_pair_to_bucket[win * 1000 + tie] = i
    end
    f:close()
  end

  if self._turn_means == nil then
    self._turn_means = {}
    local f = assert(io.open("./Nn/Bucketing/turn_means.dat"))
    local num_means = f:read("*number")
    for i = 1,num_means do
      local dist = {}
      for j = 0,50 do
        dist[j] = f:read("*number")
      end
      self._turn_means[i] = dist
    end
    f:close()
  end

  if self._turn_cats == nil then
    self._turn_cats = {}
    local f = assert(io.open("./Nn/Bucketing/turn_dist_cats.dat", "rb"))
    local data = f:read("*all")

    for i = 1, string.len(data), 6 do
      local key = 0
      for j = i,i+3 do
        key = key + data:byte(j) * (2 ^ ((j - i) * 8))
      end
      local cat = data:byte(i+4) + data:byte(i+5) * (2 ^ 8)
      self._turn_cats[key] = cat

      assert(cat <= 1000 and cat >= 1, "cat = " .. cat)
    end
    f:close()
  end
  if self._flop_cats == nil then
    self._flop_cats = {}
    local f = assert(io.open("./Nn/Bucketing/flop_dist_cats.dat", "rb"))
    local data = f:read("*all")

    for i = 1, string.len(data), 6 do
      local key = 0
      for j = i,i+3 do
        key = key + data:byte(j) * (2 ^ ((j - i) * 8))
      end
      local cat = data:byte(i+4) + data:byte(i+5) * (2 ^ 8)
      self._flop_cats[key] = cat

      assert(cat <= 1000 and cat >= 1, "cat = " .. cat)
    end
    f:close()
  end
end

M:_init()

bucket_conversion invalid arguments while running raw_converter.lua

I've been looking for the right docker image and setting up stuff for days...
I am stuck at th Training/raw_converter.lua 4. Here's the error output.

root@a8cc5c2b7949:~/DeepHoldem/Source# th Training/raw_converter.lua 4 
15000 good files        
/root/torch/install/bin/lua: ./Nn/bucket_conversion.lua:61: invalid arguments: CudaTensor FloatTensor CudaTensor 
expected arguments: *CudaTensor* CudaTensor~2D CudaTensor~2D
stack traceback:
        [C]: in function 'mm'
        ./Nn/bucket_conversion.lua:61: in function 'card_range_to_bucket_range'
        Training/raw_converter.lua:129: in function 'convert'
        Training/raw_converter.lua:154: in main chunk
        [C]: in function 'dofile'
        /root/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
        [C]: in ?

torch: 7
cutorch: scm-1 (deepstack-Leduc uses 1.0-0 but for me 1.0-0 gets stuck at data generation)
cuda: 8.061
cudnn: 5
os: ubuntu 14.04

Different stack and blind sizes

Hello. To start off, I think this is a great project, well developed and commented, so thank you.
I am writing a paper about poker myself, and the code here is giving me a lot of ideas.

This and the official DeepStack are both trained on one stack size (20000) and one blind size. The DeepStack team also used their bot to play with different stack sizes and blind sizes, like in regular poker matches, without retraining their model, but they have not given clear instructions on how exactly they did this.
So I was wondering if you had any ideas on how to implement this in DeepHoldem without retraining it for different stack and blind sizes.
My bad solution was to just bet proportionally to the original model, so a raise of 300 with a stack size of 20000 becomes a raise of 150 with a stack size of only 10000 (sketched below). This is not a very clean solution, and I wonder how the official DeepStack did it.
Thanks for any suggestions.
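
A minimal Lua sketch of the proportional-scaling workaround described above (a hypothetical helper, not part of the repository):

-- Scale a bet chosen by the 20000-chip model down to the actual stack size.
local function scale_bet(model_bet, model_stack, actual_stack)
  return math.floor(model_bet * actual_stack / model_stack + 0.5)
end

print(scale_bet(300, 20000, 10000))  --> 150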

Sizes error during data generation for turn model

Trying to generate data for model 3, I get the following error.

Here is the traceback

f@desktop ~/D/D/Source> th DataGeneration/main_data_generation.lua 3
Generating data ...
terminal time: 0.6830141544342
5hTh9s8h 1 10
NN information:
epoch 200
gpu true
valid_loss 0.0042649123140391
NN architecture:
nn.Sequential {
  [input -> (1) -> (2) -> (3) -> output]
  (1): nn.ConcatTable {
    input
      |`-> (1): nn.Sequential {
      |      [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> output]
      |      (1): nn.Linear(1009 -> 500)
      |      (2): nn.BatchNormalization (2D) (500)
      |      (3): nn.PReLU
      |      (4): nn.Linear(500 -> 500)
      |      (5): nn.BatchNormalization (2D) (500)
      |      (6): nn.PReLU
      |      (7): nn.Linear(500 -> 500)
      |      (8): nn.BatchNormalization (2D) (500)
      |      (9): nn.PReLU
      |      (10): nn.Linear(500 -> 1008)
      |    }
       `-> (2): nn.Sequential {
             [input -> (1) -> output]
             (1): nn.Narrow
           }
       ... -> output
  }
  (2): nn.ConcatTable {
    input
      |`-> (1): nn.Sequential {
      |      [input -> (1) -> output]
      |      (1): nn.SelectTable(1)
      |    }
       `-> (2): nn.Sequential {
             [input -> (1) -> (2) -> (3) -> output]
             (1): nn.DotProduct
             (2): nn.Replicate
             (3): nn.MulConstant
           }
       ... -> output
  }
  (3): nn.CAddTable
}
nextround init_bucket time: 0.52907490730286
build time: 1.0953109264374
/home/francesco/torch/install/bin/lua: ./Lookahead/lookahead.lua:189: bad argument #1 to 'copy' (sizes do not match at /home/francesco/torch/extra/cutorch/lib/THC/THCTensorCopy.cu:31)
stack traceback:
[C]: in function 'copy'
./Lookahead/lookahead.lua:189: in function '_compute_terminal_equities_terminal_equity'
./Lookahead/lookahead.lua:326: in function '_compute_terminal_equities'
./Lookahead/lookahead.lua:93: in function '_compute'
./Lookahead/lookahead.lua:47: in function 'resolve_first_node'
./Lookahead/resolving.lua:59: in function 'resolve_first_node'
./DataGeneration/data_generation.lua:132: in function 'generate_data_file'
./DataGeneration/data_generation.lua:24: in function 'generate_data'
DataGeneration/main_data_generation.lua:14: in main chunk
[C]: in function 'dofile'
...esco/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: in ?

Be the Dealer

Is there an option to actually "be the dealer"? That is, give DeepHoldem the state of the board and have it respond with its next move?

I read a thread where you described how you let DeepHoldem play against Slumbot (you mentioned a JS source); where can I find it?

I would love to use it as a board/hand evaluation tool, but my C experience is really bad and I'm trying to use it from a C# app.

Sorry for opening an issue, but I have no idea how else to reach you.

expected arguments: *CudaTensor* CudaTensor~2D CudaTensor~2D

I'm trying to convert the raw data stored in Data/TrainSamples to a new folder I called BucketData/4.
This is the last line:

convert(tonumber(arg[1]), "", "/home/painkillerrr/Documents/AAA/Poker/deepStacl/V2/DeepHoldem-master/Data/BucketData/4/")

I leave the "" so it picks up the default TrainSamples folder.

The error is at line 129. When I run th Training/raw_converter.lua 4, I get:

17079 good files
/home/painkillerrr/torch/install/bin/lua: ./Nn/bucket_conversion.lua:61: invalid arguments: CudaTensor FloatTensor CudaTensor
expected arguments: *CudaTensor* CudaTensor~2D CudaTensor~2D
stack traceback:
[C]: in method 'mm'
./Nn/bucket_conversion.lua:61: in method 'card_range_to_bucket_range'
Training/raw_converter.lua:129: in function 'convert'
Training/raw_converter.lua:154: in main chunk
[C]: in function 'dofile'
...errr/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: in ?

Bug in terminal_equity.set_call_matrix

For streets 2 and 3, next_round_boards (local next_round_boards = card_tools:get_last_round_boards(board)) is a 2D tensor containing all the possible boards given a set of board cards.

When you later call handle_blocking_cards in get_inner_call_matrix, you pass this 2D tensor as an argument.

Inside handle_blocking_cards you use this 2D tensor to get the possible hand indexes, which I think is wrong, as card_tools:get_possible_hand_indexes expects only one set of board cards.

Generation data error

I tried to generate data as described in the readme and ran into this problem:

Generating data ...	
terminal time: 1.0491530895233	
Qd7h6d6cAd 1 365	
build time: 0.014127969741821	
 1: 0.0024588108062744	
 2: 0.54947257041931	
 3: 0.57994318008423	
 4: 0.0099010467529297	
 5: 1.7352206707001	
 6: 0.88286304473877	
 7: 0.76583361625671	
 8: 0.014760494232178	
resolve time: 4.5872991085052	
/home/celepa/torch/install/bin/lua: cannot open <../Data/TrainSamples/NoLimit/1532797978-Qd7h6d6cAd-1.inputs> in mode  w at /home/celepa/torch/pkg/torch/lib/TH/THDiskFile.c:673
stack traceback:
	[C]: in ?
	[C]: in function 'DiskFile'
	/home/celepa/torch/install/share/lua/5.2/torch/File.lua:385: in function 'save'
	./DataGeneration/data_generation.lua:151: in function 'generate_data_file'
	./DataGeneration/data_generation.lua:24: in function 'generate_data'
	DataGeneration/main_data_generation.lua:14: in main chunk
	[C]: in function 'dofile'
	...lepa/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
	[C]: in ?

Any idea what might cause the problem?

Has anyone created a Dockerfile?

It seems like a lot of people have problems getting this to work, so I wonder if anyone has created a Dockerfile so that anyone, on any machine, could get it working easily. I'm trying to create one myself but am having trouble.

Pot Sizes

What's the calculation behind the min_pot | max_pot NL settings?

min_pot = {100,200,400,2000,6000}
max_pot = {100,400,2000,6000,18000}

I'm changing the stack size and the number of players (4000 stack, 3 players).

With two players, no bets, just blinds and antes, there is already 250 in the pot preflop, so why a min/max of 100? I guess if you can bet 100 on the flop, turn, or river, that might explain the 100 minimum (why not 0 in that case?). Maybe the idea is that you call the 100, so that's the 'max' as well?

--- the size of the game's ante, in chips
params.ante = 100
params.sb = 50
params.bb = 100
--- the size of each player's stack, in chips
params.stack = 20000

And on the river, the max (using the ACPC concept of maxspent) would be 40,000 (or 39,500 otherwise). Even if we are bucketing everything above 18,000 into the 18,000 bucket, I wonder how you arrived at those particular numbers?

I ask because I will have 3 players (thus larger pots) but smaller stacks, and I wonder whether you computed these relative to the blinds, to the stack sizes, or just picked them out of a hat.

Samples count

Hi! How many samples did you use to train the network? What was the Huber loss? Can you share the training data so I can play with it and compare with mine? (I have implemented DeepStack in C++ and am now generating samples.)

Setup environment

Hello. I want to ask which environment you used to set up the project. I am trying on Ubuntu 18.04 but it is resisting me. Also, I want to ask whether the algorithm behaves like the traditional solvers that exist on the web. Is there any point in comparing its strategies with a strategy from another solver?

LUA Error?

I can run deepstack against the human interface out of the box.

However, after enabling the GPU flag I get the error below:

th Player/deepstack.lua
/home/xxx/torch/install/bin/lua: cannot open <../Data/Models/NoLimit/flop/final_gpu.info> in mode r at /home/xxx/torch/pkg/torch/lib/TH/THDiskFile.c:673
stack traceback:
[C]: in ?
[C]: in function 'DiskFile'
/home/xxx/torch/install/share/lua/5.2/torch/File.lua:405: in function 'load'
./Nn/value_nn.lua:44: in function '__init'
/home/xxx/torch/install/share/lua/5.2/torch/init.lua:91: in function </home/xxx/torch/install/share/lua/5.2/torch/init.lua:87>
[C]: in function 'ValueNn'
./Lookahead/lookahead_builder.lua:37: in function '_construct_transition_boxes'
./Lookahead/lookahead_builder.lua:484: in function 'build_from_tree'
./Lookahead/lookahead.lua:28: in function 'build_lookahead'
./Lookahead/resolving.lua:54: in function 'resolve_first_node'
./Player/continual_resolving.lua:45: in function 'resolve_first_node'
./Player/continual_resolving.lua:21: in function '__init'
/home/xxx/torch/install/share/lua/5.2/torch/init.lua:91: in function </home/xxx/torch/install/share/lua/5.2/torch/init.lua:87>
[C]: in function 'ContinualResolving'
Player/deepstack.lua:22: in main chunk
[C]: in function 'dofile'
...egro/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: in ?

Any ideas?

Info

@happypepper

Hello, in the description you wrote that you made it play against Slumbot; how did you do it?

Thanks

preflop sizings

Where in the code can this be altered to include preflop bet sizes other than call/pot/all-in?

Warm-up opponent ranges

I see you have warm-starting opponent ranges in your plans. Do you understand the approach from the paper? They have an aggressive and a conservative approach. As far as I understand:

  1. The conservative approach is to linearly combine the estimated opponent range with the uniform range and then feed the result to the gadget (a small sketch of this mixing step follows after the list).
  2. In the aggressive approach they input the uniform range to the gadget, but linearly combine the output of the gadget with the estimated opponent range.

Am I correct?
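
A minimal Lua sketch of the mixing step in point 1 (a hypothetical helper; the weight w is a free parameter, and nothing like this exists in the repository yet):

-- Linearly combine an estimated opponent range with the uniform range
-- before feeding it to the re-solving gadget.
local function mix_ranges(estimated_range, w)
  local n = estimated_range:nElement()
  local uniform = estimated_range.new(n):fill(1 / n)
  return estimated_range * w + uniform * (1 - w)
end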

multiple GPUs

Is it possible to run on multiple GPUs in parallel ?

Will it reduce the thinking time?

Thanks

Problem with data generation

I am a bit stuck how to do the data generation:

When starting: th DataGeneration/main_data_generation.lua 4

I get the following error:

./Nn/value_nn.lua:71: attempt to index field 'mlp' (a nil value)

Did anyone solve this problem before?

What data generation / model training process (with GPU) works for you?

Thank you!

"Not our Turn" with protocol demo data

Hey, I'm still trying to let DeepHoldem play against my own bot/algorithm.

DeepHoldem plays fine preflop, but when I try to let it play its move on the flop, turn or river, I only get "Not our Turn", no matter what I try.

When I send for example this state:
MATCHSTATE:0:30:cc/r250c/r500c/:9s8h|/8c8d5c/6s/2d

as shown in protocol.pdf, it returns "Not our turn", even though I'm pretty sure it's DeepHoldem's turn...

Any ideas would be great.

Time per 1M events?

What computing time should I expect to generate 1M events on a 1080? If I am reading my terminal correctly, it is taking me up to 1s per event, which means it will take many days to generate a dataset similar to the one outlined in this repo.

My terminal reads :
build time: 0.0013620853424072
1: 0.0014543533325195
2: 0.15544033050537
3: 0.18826580047607
4: 0.0041546821594238
5: 0.23357081413269
6: 0.18744778633118
7: 0.22407674789429
8: 0.0065135955810547
resolve time: 1.0202720165253
terminal time: 0.3047468662262

Invalid index in scatterAdd at .../HTensorMath.c:495

the log is:
$ th Player/deepstack.lua 20001

/home/wy/torch/install/bin/lua: ./Nn/next_round_value_pre.lua:114: Invalid index in scatterAdd at /home/wy/torch/pkg/torch/lib/TH/generic/T
HTensorMath.c:495stack traceback:
[C]: in method 'scatterAdd'
./Nn/next_round_value_pre.lua:114: in method '_card_range_to_bucket_range_on_board'
./Nn/next_round_value_pre.lua:247: in method 'get_value_aux'
./Lookahead/lookahead.lua:254: in method '_compute_terminal_equities_next_street_box'
./Lookahead/lookahead.lua:323: in method '_compute_terminal_equities'
./Lookahead/lookahead.lua:93: in method '_compute'
./Lookahead/lookahead.lua:66: in method 'resolve'
./Lookahead/resolving.lua:182: in method 'get_chance_action_cfv'
./Player/continual_resolving.lua:112: in method '_update_invariant'
./Player/continual_resolving.lua:81: in method '_resolve_node'
./Player/continual_resolving.lua:142: in method 'compute_action'
Player/deepstack.lua:41: in main chunk
[C]: in function 'dofile'
...e/wy/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: in ?

there are three places that use scatterAdd:

wy@ubuntu:~/poker/DeepHoldem-master/Source$ grep -rni scatterAdd -C 4 .
./Game/card_tools.lua-26-
./Game/card_tools.lua-27-function M:get_possible_hands_mask(hands)
./Game/card_tools.lua-28- local used_cards = arguments.Tensor(hands:size(1), game_settings.card_count):fill(0)
./Game/card_tools.lua-29-
./Game/card_tools.lua:30: used_cards:scatterAdd(2,hands,arguments.Tensor(hands:size(1), 7):fill(1))
./Game/card_tools.lua-31- local ret = torch.le(torch.max(used_cards, 2), 1):long()
./Game/card_tools.lua-32- if arguments.gpu then
./Game/card_tools.lua-33- ret = ret:cudaLong()
./Game/card_tools.lua-34- end

./Nn/next_round_value_pre.lua-93- local other_bucket_range = bucket_range:view(-1,self.board_count,self.bucket_count + 1):zero()
./Nn/next_round_value_pre.lua-94-
./Nn/next_round_value_pre.lua-95- local indexes = self.board_indexes_scatter:view(1,self.board_count, game_settings.hand_count)
./Nn/next_round_value_pre.lua-96- :expand(bucket_range:size(1), self.board_count, game_settings.hand_count)
./Nn/next_round_value_pre.lua:97: other_bucket_range:scatterAdd(
./Nn/next_round_value_pre.lua-98- 3,
./Nn/next_round_value_pre.lua-99- indexes,
./Nn/next_round_value_pre.lua-100- card_range
./Nn/next_round_value_pre.lua-101- :view(-1,1,game_settings.hand_count)

./Nn/next_round_value_pre.lua-110- local other_bucket_range = bucket_range:view(-1,self.bucket_count + 1):zero()
./Nn/next_round_value_pre.lua-111-
./Nn/next_round_value_pre.lua-112- local indexes = self.board_indexes_scatter:view(1,self.board_count, game_settings.hand_count)[{{},board_idx,{}}]
./Nn/next_round_value_pre.lua-113- :expand(bucket_range:size(1), game_settings.hand_count)
./Nn/next_round_value_pre.lua:114: other_bucket_range:scatterAdd(
./Nn/next_round_value_pre.lua-115- 2,
./Nn/next_round_value_pre.lua-116- indexes,
./Nn/next_round_value_pre.lua-117- card_range
./Nn/next_round_value_pre.lua-118- :view(-1,game_settings.hand_count)

and I read torch7's code, torch7/lib/TH/generic/THTensorMath.c:

void THTensor_(scatterAdd)(THTensor *tensor, int dim, THLongTensor *index, THTensor *src)
{
  long elems_per_row, i, idx;

  THArgCheck(dim < THTensor_(nDimension)(tensor), 2, "Index dimension is out of bounds");
  THArgCheck(THLongTensor_nDimension(index) == THTensor_(nDimension)(tensor), 3,
             "Index tensor must have same dimensions as output tensor");
  THArgCheck(THTensor_(nDimension)(src) == THTensor_(nDimension)(tensor), 4,
             "Input tensor must have same dimensions as output tensor");

  elems_per_row = THLongTensor_size(index, dim);

  TH_TENSOR_DIM_APPLY3(real, tensor, real, src, long, index, dim,
    for (i = 0; i < elems_per_row; ++i)
    {
      idx = *(index_data + i*index_stride);
      if (idx < TH_INDEX_BASE || idx >= tensor_size + TH_INDEX_BASE)
      {
        THFree(TH_TENSOR_DIM_APPLY_counter);
        THError("Invalid index in scatterAdd");
      }
      tensor_data[(idx - TH_INDEX_BASE) * tensor_stride] += *(src_data + i*src_stride);
    })
}

but I don't know how to fix this.
I am using Ubuntu 16.04 and have tried both LUA52 and LUA53.

PS: the code formatting in these issues is really bad :(

Multi agents or change the blind structure

Hi, thanks for this amazing open project. I am wondering whether we can change the game settings and train multiple agents.
Another question: can I change the blind structure and add an ante?
Thank you very much.

main_train does nothing

Hello,

After generating the data and converting it, I tried to train the model, but it seems like nothing happens:

$ th Training/main_train.lua 4
Loading Net Builder
166858 all good files
Erreur de segmentation (core dumped)

Apart from the segmentation fault, I don't get any error or result.
The script seems to stop in train.lua at line 61:

local loss = M.criterion:forward(outputs, targets, mask)

Does anyone know what could be causing this?

Thanks in advance.

handle zero reach strategy

[screenshot of the two lines in question]

Hello!
Can you explain these two lines? They seem to have no effect at all to me.
Maybe it's a fancy implementation that uses a special feature of torch?

@happypepper

validation loss doesn't decrease

hi, happypepper,

I have generated 1.2 million samples for the river street and tried to train the network, but after 100 epochs the validation loss settles at 0.052-0.053 and never decreases any more. I have tried lowering the learning rate further after 100 epochs but nothing changed (the original code divides the lr by 10 after 50 epochs).

So, could you tell me how you trained the river network to reach a validation loss of 0.0415 with 1 million samples? Are there any tricks?

thanks a lot~

can i have your model files?

First, sorry for my bad English.
I have tried to generate data to train my model, but the data generation code runs very slowly on my computer. Could you give me your model files or your training data?

my email : [email protected]

Thank you very much !!

Not enough memory

[screenshot of the error]
I tried to run the agent, but encountered a problem as the screen shot shows.

By the way, which IDE did you use to debug? I can't find a convenient Lua IDE.
I can't run DataGeneration either, even if I reduce params.train_data_count and params.gen_batch_size to 10 and 10 respectively; that also gives a memory error.

I used a 32-CPU server with 64 GB of memory to run this code.

And the problem seems to be caused by the lookahead: if I do nothing but require lookahead, I still get this error.

Thanks!

thoughts on libratus, modicum?

First, excellent job; this is very inspirational.

Have you read about the latest Modicum bot from CMU?

Also, Libratus beat DeepStack by a large margin; is there any interest in implementing Libratus/Modicum?

A question about training the neural networks

I tried to implement DeepStack in Python and generated 4M training samples for the turn network. I'm using exactly the same network structure as the author.

But I found that after hundreds of epochs of training, the Huber loss is about 0.2 on the training samples, which is far larger than the author's (0.016). Do you have any suggestions for training the network?

Thank you!

arguments.lua - Change SB / BB

Hey,

finally I got the base working and data generation is running, but I have a problem: when I try to change the big blind / small blind in arguments.lua I get an error:

matthias@maG-Linux:~/deepholdem/Source$ th DataGeneration/main_data_generation.lua 4
Generating data ...
/home/matthias/torch/install/bin/lua: ./Game/bet_sizing.lua:32: assertion failed!
stack traceback:
	[C]: in function 'assert'
	./Game/bet_sizing.lua:32: in function 'get_possible_bets'
	./Tree/tree_builder.lua:168: in function <./Tree/tree_builder.lua:87>
	(...tail calls...)
	./Tree/tree_builder.lua:216: in function '_build_tree_dfs'
	./Tree/tree_builder.lua:224: in function '_build_tree_dfs'
	./Tree/tree_builder.lua:224: in function '_build_tree_dfs'
	./Tree/tree_builder.lua:277: in function 'build_tree'
	./Lookahead/resolving.lua:28: in function '_create_lookahead_tree'
	./Lookahead/resolving.lua:44: in function 'resolve_first_node'
	./DataGeneration/data_generation.lua:151: in function 'generate_data_file'
	./DataGeneration/data_generation.lua:24: in function 'generate_data'
	DataGeneration/main_data_generation.lua:14: in main chunk
	[C]: in function 'dofile'
	...hias/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
	[C]: in ?

I tried to change it to this:
params.ante = 10
params.sb = 10
params.bb = 20
params.stack = 500

Btw, is there a way to change the big blind / small blind dynamically? Because the big blind gets raised over time in most games.

bad argument #2 to '?'

So I generated about 96k files for the river network. I'm trying to train the models on my CPU. However, when I run th raw_converter.lua 4 I get the following error:

48398 good files
100
/home/colonel/torch/install/bin/lua: bad argument #2 to '?' (end index out of bound at /home/colonel/torch/pkg/torch/generic/Tensor.c:992)
stack traceback:
	[C]: in ?
	[C]: in function '__index'
	raw_converter.lua:129: in function 'convert'
	raw_converter.lua:154: in main chunk
	[C]: in function 'dofile'
	...onel/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
	[C]: in ?

I've searched around and the most common suggestion is that the dataset may be too small. I don't believe that is the case here, since I know a lot of people have been able to train on datasets of this size. Any help is much appreciated.

What is to be expected from this project?

After searching for a while, this seems to be the most promising publicly available machine-learning-based no-limit hold'em bot that uses a full 52-card deck and is fast enough for real-time decisions.

Now, I believe most neophytes like me who stumble on this repo ask themselves the same shameful question: "can I make money out of this?". Other than being prohibited by the terms and conditions, is it within the realm of possibility to set the bot up to play against real opponents online in real poker software, or is that a ridiculous assumption?

I'm actually surprised this isn't more mainstream. If the technology exists, what is preventing everybody from using it and completely rendering online poker obsolete?

At a quick glance, the project's "dealer" script generates the board and the bot plays on it; there's no easy way to feed in a board and player information from an external source. But I guess it wouldn't be too hard to create a script that does that using the existing functions.

There's also the fact that this project is not straightforward to use at all. I consider myself pretty knowledgeable in programming and Linux, and I just spent the last 5 hours trying to make this run on WSL2 without success; just configuring CUDA for it is a mess, not to mention that this project's reliance on cutorch 1.0 makes it dependent on CUDA 9.2 at most, if I understand correctly, which is already quite dated. Then there's the whole model-training process, which isn't straightforward either. So before putting more effort into it, I decided to just go ahead and ask here. I'm also going to ask whether anyone knows of other projects that are still actively updated, or whether I am on the right track with this one.
