diditforlulz273 / pokerrl-omaha

Omaha Poker functionality plus some features for the PokerRL reinforcement learning card framework

License: MIT License

Python 100.00%

cfr counterfactual-regret-minimization deep-learning monte-carlo-tree-search omaha-poker poker-bot pytorch reinforcement-learning reinforcement-learning-algorithms

pokerrl-omaha's People

reginepeltier patricej jessyica123 ortscheit baconmanbat svanaubel boeunpark trevorjansen xiniuniu mzung michael-z northwolf521 der-ofenmeister jejellyroll-fr bassemfg

pokerrl-omaha's Issues

thanks

You can delete me. I am just saying thanks for your contributions

Memory usage grows rapidly in SD-CFR

As the strategy buffer grows, memory usage grows rapidly, because every strategy keeps its own copy of the lookup table inside its network. In my case, the process runs out of memory at iteration 300+...
Every lookup table in a strategy is converted to float32 and occupies more than 800 MB of memory, which makes model training and serving impossible.
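To illustrate the difference described above, here is a minimal sketch that compares per-strategy float32 copies of a lookup table with one shared, compact table that every stored strategy references. It uses NumPy only; `StrategyNet`, `unique_lut_bytes`, the int8 dtype, and the table shape are hypothetical stand-ins and are not taken from PokerRL-Omaha.

```python
import numpy as np


class StrategyNet:
    """Hypothetical stand-in for one stored strategy network."""

    def __init__(self, lut, weights):
        self.lut = lut          # stored by reference; no copy is made here
        self.weights = weights  # per-strategy parameters (small in this sketch)


def unique_lut_bytes(strategies):
    """Sum the bytes of distinct lookup-table objects held by the strategies."""
    seen = {}
    for s in strategies:
        seen[id(s.lut)] = s.lut.nbytes
    return sum(seen.values())


N_STRATEGIES = 300
# Assumption: the table holds small integers, so int8 is enough to store it.
shared_lut = np.zeros((1000, 4), dtype=np.int8)
shared_lut.setflags(write=False)  # read-only, so sharing it is safe


def new_weights():
    return np.zeros(8, dtype=np.float32)


# Pattern from the issue: each strategy holds its own float32 copy.
copied = [StrategyNet(shared_lut.astype(np.float32), new_weights())
          for _ in range(N_STRATEGIES)]

# Alternative: every strategy references the same compact table.
shared = [StrategyNet(shared_lut, new_weights())
          for _ in range(N_STRATEGIES)]

copied_bytes = unique_lut_bytes(copied)
shared_bytes = unique_lut_bytes(shared)
print(copied_bytes, shared_bytes)
```

With a shared table, the lookup-table cost stays constant as the buffer grows, while the copied variant scales linearly with the number of stored strategies. Whether the table can be shared this way in the actual networks depends on how they are serialized and moved between devices.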

Adding stack size as an input parameter

I was wondering: does this fork allow stack size to be added as an input parameter to the network? If so, how long do you think it would take to converge to something close enough to the Nash equilibrium, compared with the normal unadjusted stack-size version of this fork?

Also, how would I go about adding stack size as an input parameter to this fork?

Evaluation issues

Hi Vsevolod!

I tried to launch PLO_training_start.py with LBR enabled and it failed (without any eval_methods the iterations run fine, but I can't evaluate the results). I tried both PLO and DiscretizedNLHoldem, with the DEBUGGING option turned on and off.
When DEBUGGING=True and nn_type is "feedforward" or "dense_residual", I get an AssertionError:

/PokerRL-Omaha-master/DeepCFR/IterationStrategy.py", line 144, in get_a_probs_for_each_hand_in_list
assert len(pub_obs.shape) == 2, "all hands have the same public obs"
AssertionError: all hands have the same public obs

And if DEBUGGING=False, I get this error on iteration 1:

PokerRL-Omaha-master/PokerRL/rl/neural/MainPokerModuleFLAT2.py", line 109, in forward
pf_mask = torch.where(pub_obses[:, 14] == 1)
TypeError: list indices must be integers or slices, not tuple
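The TypeError above is what Python raises when a plain list is indexed with a tuple such as `[:, 14]`, which suggests that `pub_obses` arrives as a list of per-hand observations rather than as a 2-D array or tensor. The following sketch reproduces the message with NumPy and shows one possible workaround: stacking the list before slicing. The helper `stack_pub_obses` and the observation length of 20 are hypothetical; only the column index 14 comes from the traceback.

```python
import numpy as np


def stack_pub_obses(pub_obses):
    """Return a 2-D array from a list of 1-D observations or from an array."""
    if isinstance(pub_obses, (list, tuple)):
        return np.stack([np.asarray(o) for o in pub_obses])
    return np.asarray(pub_obses)


# Three made-up observations of length 20; the second one has column 14 set.
obs_list = [np.zeros(20, dtype=np.float32) for _ in range(3)]
obs_list[1][14] = 1.0

# Indexing the list directly fails in the same way as in the traceback.
try:
    obs_list[:, 14]
    message = None
except TypeError as err:
    message = str(err)

# After stacking, column slicing works and the mask can be computed.
batch = stack_pub_obses(obs_list)
pf_rows = np.where(batch[:, 14] == 1)[0]
print(message, pf_rows)
```

In the PyTorch module the equivalent step would be to build a tensor from the stacked array before the `torch.where` call. Where exactly the list is produced in the evaluation path is not shown in the report, so this is only a guess at the cause.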

If nn_type="recurrent", I get an error on iteration 0:

PokerRL-Omaha-master/PokerRL/rl/neural/MainPokerModuleRNN.py", line 157, in forward
pub_obses = torch.from_numpy(pub_obses[0]).to(self.device).view(seq_len, bs, self.pub_obs_size)
TypeError: expected np.ndarray (got Tensor)

My requirements.txt:

gym==0.10.9 (tried 0.12.5 too)
numpy==1.21.2
psutil==5.8.0
pycrayon==0.5
pytz==2021.3
ray==0.6.1 (didn't use Distributed)
scipy==1.7.3
torch==1.4.0 (tried PyTorch versions up to 1.10 with CUDA 10.2)

CPU-GPU scheme

Where is the code for the CPU-GPU training scheme? For me, it is not using the GPU.
