wuthefwasthat / hanabi.rs



State-of-the-art Hanabi bots + simulation framework in Rust

Rust 100.00%

hanabi.rs's People


gitter-badger tjhance felixbauckholt stanxii fshih1 fangbq timotree3 phimuemue apachescribe standardgalactic

hanabi.rs's Issues

general hat-based strategy discussion

@felixbauckholt

Moving discussion from discord to here, as it's a bit easier to follow this way

Your original comment (modifying format a bit):

I spent a bunch of time getting increasingly confused about my "general hat-based strategy" idea, and now I think it might not be very useful: All the interesting nontrivial tweaks to a hat-based strategy I could think of involve conveying information that can't immediately be made common knowledge, and thus don't fit into a "standard hat-based template" well. The tweaks I was mostly thinking of were

1. @florrat2's approach (where a player will only be able to decode a hint once all players between the hinter and them have played)

2. Whenever there's a choice between different hints that have the same "hat-value" (say two color hints that don't involve the "hint-index" card), using that choice in a "hat-based" way to establish common knowledge of something between all players but the hinted

3. If the amount of information in a hint is a composite number, "splitting up" the hint into a part that is made common knowledge now, and a part that will be made common knowledge later

If you have other ideas that fit the "general hat-based strategy" template better, let me know!
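For readers following along, the basic mechanism that these tweaks build on can be sketched as below. This is a minimal illustration of the sum-modulo-n trick commonly used in hat-guessing schemes, not code from hanabi.rs. The function names, the `modulus` parameter, and the idea that each player's hand has already been mapped to a single "hat value" are all assumptions made for the example.

```rust
// Sketch of the "hat guessing" sum trick. All names here are illustrative
// and are not taken from the hanabi.rs code base.

/// The hinter sees every other player's private value and announces
/// the sum of those values modulo `modulus`.
pub fn encode_hint(values_seen_by_hinter: &[u32], modulus: u32) -> u32 {
    values_seen_by_hinter
        .iter()
        .map(|v| v % modulus)
        .sum::<u32>()
        % modulus
}

/// A receiver sees every value that went into the sum except their own,
/// so subtracting the visible values from the announced sum recovers it.
pub fn decode_hint(announced: u32, values_seen_by_receiver: &[u32], modulus: u32) -> u32 {
    let seen = values_seen_by_receiver
        .iter()
        .map(|v| v % modulus)
        .sum::<u32>()
        % modulus;
    (announced % modulus + modulus - seen) % modulus
}
```

Because every non-hinting player can run `decode_hint` right away, the announced value becomes common knowledge immediately. The tweaks listed above are the ones where that property no longer holds.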

I haven't thought much along these lines, but initial thoughts:

1. Could you explain florrat2's approach? It sounds very tricky, and I don't think I understand how it would work.
2. Very cool idea! Could be a big boost, especially in 4- and 5-player games.
3. This sounds similar to 1, and I'm also not sure how it would work. What were you thinking exactly?
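As a rough illustration of the arithmetic behind item 3 only, a value drawn from a range of composite size can be written as a pair of smaller values. This is one possible reading of "splitting up", not a description of what was actually proposed. The names `split_value`, `join_value`, `now_radix`, and `later_radix` are hypothetical.

```rust
// Mixed-radix split of a hint value. Hypothetical helper names; this only
// shows the arithmetic, not how either part would be communicated.

/// Splits `value` (in 0..now_radix * later_radix) into a part in
/// 0..now_radix and a part in 0..later_radix.
pub fn split_value(value: u32, now_radix: u32, later_radix: u32) -> (u32, u32) {
    assert!(value < now_radix * later_radix, "value out of range");
    (value / later_radix, value % later_radix)
}

/// Recombines the two parts into the original value.
pub fn join_value(now_part: u32, later_part: u32, later_radix: u32) -> u32 {
    now_part * later_radix + later_part
}
```

For a hint carrying 6 possible values, for example, `split_value(v, 2, 3)` yields one binary part and one ternary part. How the second part would later become common knowledge is exactly the open question raised above.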

BTW, what I imagine is not so much a "general hat-based strategy" as a "general hat-based information strategy", which the code might already be mostly set up for.

The information strategy seems nondeterministic and I don't know why

I only now noticed that when running the strategy with -n 10000 -s 0 -t 4 -p 5 -g info, the score histogram varies very slightly between different runs. This behavior goes as far back as commit 7370859, and does not change when I run with one thread instead of four.

Do you have any guesses what could be causing this? Floating-point stuff would explain different behavior across versions that "should" be equivalent, but it doesn't seem to explain why behavior would be different across runs of the same code. I heard that HashMaps randomize something for security reasons, but I'm not sure if that would plausibly affect the strategy, and I also don't know how to turn off that feature and make all HashMaps deterministic.

Am I likely to miss other causes of nondeterminism? Are there any tricks to make win-rates reproducible exactly?
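On the HashMap question: Rust's standard `HashMap` uses `RandomState` by default, which is randomly seeded, so iteration order can differ from run to run. Whether that is what affects the strategy here is not established. If it is, two standard-library options are sketched below: a `HashMap` with a fixed-key hasher, or a `BTreeMap`, which always iterates in key order. The alias `DeterministicHashMap` and the helper function names are made up for this example.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{BTreeMap, HashMap};
use std::hash::BuildHasherDefault;

/// A HashMap whose hasher is built with fixed keys rather than random ones.
/// (Hypothetical alias name; the types it wraps are all from std.)
pub type DeterministicHashMap<K, V> = HashMap<K, V, BuildHasherDefault<DefaultHasher>>;

/// Inserts the entries into a fixed-key HashMap and returns the keys in
/// iteration order. With no random seed, the same insertions give the same order.
pub fn keys_in_iteration_order(entries: &[(u32, u32)]) -> Vec<u32> {
    let mut map: DeterministicHashMap<u32, u32> = DeterministicHashMap::default();
    for &(k, v) in entries {
        map.insert(k, v);
    }
    map.keys().copied().collect()
}

/// Same idea with a BTreeMap, which always iterates in ascending key order.
pub fn keys_in_sorted_order(entries: &[(u32, u32)]) -> Vec<u32> {
    let map: BTreeMap<u32, u32> = entries.iter().copied().collect();
    map.keys().copied().collect()
}
```

The hash algorithm behind `DefaultHasher` is not guaranteed to stay the same across Rust releases, so the first option gives reproducibility for a given toolchain only. `BTreeMap` avoids hashing altogether, at the cost of requiring `Ord` keys.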

Simulator allows discarding at 8 clues

The Hanabi rules in my copy (PDF here) and the ones implemented on hanab.live agree that discarding at 8 clues is not allowed. The simulator does not enforce this restriction, and the strategies seem to be making use of this: both the cheating and information strategies discarded at 8 clues in the first game I tested with each. This should be simple enough to fix, but it may reduce the effectiveness of the strategies.
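A minimal sketch of the kind of legality check this would need is shown below. The `Move` enum, the `MAX_HINTS` constant, and `is_legal` are hypothetical names chosen for the example and do not reflect the simulator's actual types.

```rust
// Hypothetical types for illustration only; not the simulator's real API.

/// Maximum number of clue tokens in the standard rules.
pub const MAX_HINTS: u32 = 8;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Move {
    Hint,
    Play,
    Discard,
}

/// Returns whether `mv` is allowed given the number of clue tokens available.
pub fn is_legal(mv: Move, hints_remaining: u32) -> bool {
    match mv {
        // A hint spends a clue token, so at least one must be available.
        Move::Hint => hints_remaining > 0,
        // Playing a card is always allowed.
        Move::Play => true,
        // Discarding regains a clue token, so it is disallowed at the maximum.
        Move::Discard => hints_remaining < MAX_HINTS,
    }
}
```

With a check like this in place, a strategy that holds all 8 clues would have to hint or play instead of discarding.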
