
tensorflow-demo's Issues

cifar10 distribute error <tf.Tensor 'report_uninitialized_variables/boolean_mask/Gather:0' shape=(?,) dtype=string>

I am running cifar10_sync_dist_train.py from the cifar10 directory, simulating a distributed setup on a single machine:

python cifar10_sync_dist_train.py --ps_hosts localhost:2220 --job_name ps --worker_hosts localhost:2221,localhost:2222 --task_id 0

python cifar10_sync_dist_train.py --ps_hosts localhost:2220 --job_name worker --worker_hosts localhost:2221,localhost:2222 --task_id 0

python cifar10_sync_dist_train.py --ps_hosts localhost:2220 --job_name worker --worker_hosts localhost:2221,localhost:2222 --task_id 1
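The three commands above describe one parameter server and two workers, all on localhost. As a rough sketch of how such flags map to a cluster description, the snippet below turns the host strings into a dict of job names to address lists, which is the shape that `tf.train.ClusterSpec` accepts in TensorFlow 1.x. The helper name `build_cluster` is hypothetical, and how cifar10_sync_dist_train.py actually parses its flags is an assumption, since the script is not shown here.

```python
def build_cluster(ps_hosts, worker_hosts):
    """Turn comma-separated host strings into a job-name -> address-list dict.

    Hypothetical helper for illustration; the real script may parse its
    flags differently.
    """
    return {
        "ps": [h for h in ps_hosts.split(",") if h],
        "worker": [h for h in worker_hosts.split(",") if h],
    }


# Values taken from the commands above.
cluster = build_cluster("localhost:2220", "localhost:2221,localhost:2222")

# Each process is identified by (job_name, task_id), which indexes into the dict.
processes = [("ps", 0), ("worker", 0), ("worker", 1)]
addresses = {(job, idx): cluster[job][idx] for job, idx in processes}

for (job, idx), addr in sorted(addresses.items()):
    print(job, idx, addr)
```

Every process has to be started with the same host lists, and only `--job_name` and `--task_id` differ between them.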

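Here is the problem: on TensorFlow 0.12 the script runs without errors, but on TensorFlow 1.2 and TensorFlow 1.3 the error below appears. (The code was only modified so that it runs on the newer versions: some API calls were updated, and no new functions were added.)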

ERROR

ERROR:tensorflow:==================================
Object was never used (type <class 'tensorflow.python.framework.ops.Tensor'>):
<tf.Tensor 'report_uninitialized_variables/boolean_mask/Gather:0' shape=(?,) dtype=string>
If you want to mark it as used call its "mark_used()" method.
It was originally created here:
['File "cifar10_sync_dist_train.py", line 179, in <module>\n    tf.app.run()', 'File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run\n    _sys.exit(main(_sys.argv[:1] + flags_passthrough))', 'File "cifar10_sync_dist_train.py", line 176, in main\n    train()', 'File "cifar10_sync_dist_train.py", line 113, in train\n    apply_gradients_op = opt.apply_gradients(grads, global_step=global_step)', 'File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/sync_replicas_optimizer.py", line 257, in apply_gradients\n    variables.global_variables())', 'File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/util/tf_should_use.py", line 170, in wrapped\n    return _add_should_use_warning(fn(*args, **kwargs))', 'File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/util/tf_should_use.py", line 139, in _add_should_use_warning\n    wrapped = TFShouldUseWarningWrapper(x)', 'File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/util/tf_should_use.py", line 96, in __init__\n    stack = [s.strip() for s in traceback.format_stack()]']

failed to connect to 'ipv4:10.160.113.47:2222': socket error: connection refused

When I run the distributed mnist_cnn.py by following the command "./start_tf.sh 8 3 mnist_cnn.py", I get errors such as "failed to connect to 'ipv4:10.160.113.47:2222': socket error: connection refused".
Also, I am wondering how the command "./start_tf.sh 8 3 mnist_cnn.py" starts the remote server processes without using ssh or some other protocol.
Thanks.
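A "connection refused" error usually means that nothing is listening on the target address at the moment of the connection attempt. One way to check this independently of TensorFlow is a plain TCP probe. The sketch below uses only the Python standard library; the function name `port_is_open` is hypothetical, and it only tests reachability, not whether the process behind the port is a TensorFlow server.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError on refusal, timeout, or DNS failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For the address in the error message, the call would be `port_is_open("10.160.113.47", 2222)`. If it returns False, the server process for that task has probably not been started on that machine, or a firewall is blocking the port.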

A question about distributed TensorFlow

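Hello, I looked at the examples in the distribute folder, and they run on my machine.
After reading the distributed example, I have two questions:
Do I have to split the data myself?
And how does synchronization work in synchronous mode? Do I have to synchronize manually?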
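Whether the examples in the distribute folder split the data automatically is not stated in this thread. If the split does have to be done by hand, one common approach is to give each worker every num_workers-th example, offset by its task id. The sketch below illustrates that idea in plain Python; the function name `shard_for_worker` and the toy data are assumptions for illustration only.

```python
def shard_for_worker(examples, num_workers, task_id):
    """Return the slice of examples assigned to one worker.

    Worker task_id gets items task_id, task_id + num_workers, and so on,
    so the shards are disjoint and together cover the whole list.
    """
    if not 0 <= task_id < num_workers:
        raise ValueError("task_id must be in [0, num_workers)")
    return examples[task_id::num_workers]


# Toy data standing in for a list of training examples or file names.
examples = list(range(10))
shard0 = shard_for_worker(examples, num_workers=2, task_id=0)
shard1 = shard_for_worker(examples, num_workers=2, task_id=1)
print(shard0)
print(shard1)
```

With two workers, as in the cifar10 commands, the two shards have no example in common, so each worker trains on a different half of the data.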
