Comments (8)
Could you tell me which TensorFlow version you are using? This looks like a backward-incompatibility issue. I was using tf 1.4.0 when developing this project. If you find a solution, a pull request is welcome. Thanks.
Edit: This is confirmed to be a bug caused by tf 1.7.0. The program still works if you comment out the global variables in var_to_save in Network.py. @arisliang
from alphagozero-python-tensorflow.
I use tf 1.7.0. If we comment out var_to_save, do we also need to comment out the self.saver attribute, since it depends on var_to_save for initialization? And once that is commented out, the program will fail to load the model, since there is no saver anymore.
from alphagozero-python-tensorflow.
By the way, the same error happens in 1.8.0 too. Thanks to your updated comments in the code, it is now clearer to me how to apply this fix.
from alphagozero-python-tensorflow.
I finally figured out what was wrong. The variables we create ourselves are, by default, "global variables" (in contrast, "local variables" are typically temporaries created inside TensorFlow APIs). Among the global variables, those whose "trainable" flag is not False are called "trainable variables".
If you take a look at the _batch_norm() I wrote, I created the offset (beta) and scale (gamma), which are trainable. But TensorFlow 1.4 did not include them in the trainable variables. I noticed that and added them to var_to_save. The TensorFlow team fixed this bug in later versions (starting from 1.5, actually). Hope this explains everything.
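To illustrate the collections described above, here is a minimal sketch. The scope name bn_demo and the variable names are made up for this example and are not taken from Network.py; the compat import is only there so the snippet also runs on TensorFlow 2.x. On a TensorFlow version without the reported problem, beta and gamma should show up among the trainable variables, while a variable created with trainable=False should appear only among the global variables.

```python
try:
    # Newer TensorFlow releases: use the 1.x-style graph API via compat.v1.
    import tensorflow.compat.v1 as tf
    tf.disable_v2_behavior()
except (ImportError, AttributeError):
    # Older TensorFlow 1.x releases without the compat module.
    import tensorflow as tf

graph = tf.Graph()
with graph.as_default():
    # Hypothetical stand-ins for the batch-norm parameters discussed above.
    with tf.variable_scope("bn_demo"):
        beta = tf.get_variable(
            "beta", shape=[4], initializer=tf.zeros_initializer())
        gamma = tf.get_variable(
            "gamma", shape=[4], initializer=tf.ones_initializer())
        # Explicitly non-trainable, e.g. a running statistic.
        moving_mean = tf.get_variable(
            "moving_mean", shape=[4],
            initializer=tf.zeros_initializer(), trainable=False)

    # Every variable above is a global variable; only those created with
    # trainable=True (the default) are also trainable variables.
    trainable_names = sorted(v.name for v in tf.trainable_variables())
    global_names = sorted(v.name for v in tf.global_variables())

print("trainable:", trainable_names)
print("global:   ", global_names)
```

Printing both lists on the TensorFlow version you actually run is a quick way to check whether beta and gamma need to be added to var_to_save by hand.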
from alphagozero-python-tensorflow.
Since there is an issue loading the model, you may want to try installing tf 1.4 GPU (with Python 3.6 on Linux):
pip install https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.4.0-cp36-cp36m-linux_x86_64.whl
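After installing, it may be worth confirming which version Python actually picks up. The helper below is a hypothetical convenience for this thread, not part of the project; it only inspects a version string, which you would normally take from tf.__version__.

```python
def is_tf_1_4(version_string):
    """Return True if the version string denotes a TensorFlow 1.4.x release."""
    parts = version_string.split(".")
    return len(parts) >= 2 and parts[0] == "1" and parts[1] == "4"


# Typical use (requires TensorFlow to be installed):
#   import tensorflow as tf
#   print(tf.__version__, is_tf_1_4(tf.__version__))
print(is_tf_1_4("1.4.0"))
print(is_tf_1_4("1.7.0"))
```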
from alphagozero-python-tensorflow.
Wow, I didn't know tf 1.4 had this bug. Do you know which issue they created for this fix? I tried to google it but couldn't find anything. Would you consider upgrading the code to a more recent tf, since newer versions include this bug fix plus, I would imagine, other improvements and bug fixes? I couldn't install tf 1.4 despite trying, because I have CUDA 9.0 installed and tf 1.4 seems to require CUDA 8.0.
from alphagozero-python-tensorflow.
@yhyu13 This wouldn't be a general fix, as the installed CUDA version for people running TF 1.7/1.8 would be CUDA 9.0 or higher. A downgrade to 1.4 would require downgrading the entire CUDA setup.
I would be interested in a fix that works on the latest TF version.
from alphagozero-python-tensorflow.
@awilliamson The solution that worked is to comment out the list of variables and just leave var_to_save = tf.trainable_variables(). But the issue was that the already-trained model malfunctions under tf 1.5 and higher. The code would work, but the model needs to be retrained.
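For reference, a minimal sketch of a saver built from tf.trainable_variables() is shown below. It is not the code from Network.py: the variable names, the tiny graph and the temporary checkpoint path are assumptions made for illustration, and the compat import is only there so the snippet also runs on TensorFlow 2.x. Note that variables created with trainable=False are not in this list, so they are not written to the checkpoint.

```python
import os
import tempfile

try:
    # Newer TensorFlow releases: use the 1.x-style graph API via compat.v1.
    import tensorflow.compat.v1 as tf
    tf.disable_v2_behavior()
except (ImportError, AttributeError):
    # Older TensorFlow 1.x releases without the compat module.
    import tensorflow as tf

graph = tf.Graph()
with graph.as_default():
    # Hypothetical variables standing in for the network's parameters.
    weights = tf.get_variable(
        "weights", shape=[2], initializer=tf.ones_initializer())
    step = tf.get_variable(
        "step", shape=[], initializer=tf.zeros_initializer(), trainable=False)

    # Save only the trainable variables, as suggested in the comment above.
    var_to_save = tf.trainable_variables()
    saver = tf.train.Saver(var_list=var_to_save)
    init = tf.global_variables_initializer()
    bump = tf.assign(weights, weights * 3.0)

ckpt_prefix = os.path.join(tempfile.mkdtemp(), "model.ckpt")

with tf.Session(graph=graph) as sess:
    sess.run(init)
    sess.run(bump)                 # weights become [3.0, 3.0]
    saver.save(sess, ckpt_prefix)

with tf.Session(graph=graph) as sess:
    sess.run(init)                 # initializes variables not in the checkpoint
    saver.restore(sess, ckpt_prefix)
    restored = sess.run(weights).tolist()

# Names stored in the checkpoint; the non-trainable "step" is absent.
saved_names = sorted(name for name, _ in tf.train.list_variables(ckpt_prefix))
print(restored, saved_names)
```

Listing the names stored in an existing checkpoint this way can also help when comparing a model trained under one TensorFlow version with the variables a newer version expects.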
from alphagozero-python-tensorflow.