
Comments (4)

Mingbo-Lee avatar Mingbo-Lee commented on June 19, 2024

This is probably not a WSL networking problem.

from yacl.

Jamie-Cui avatar Jamie-Cui commented on June 19, 2024

Could you first check whether the ports in the config are already in use? Thx~
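For anyone who wants to script that check, here is a minimal sketch using only the Python standard library. The helper name `port_is_free` is made up for illustration and is not part of SecretFlow or yacl. It assumes that a failed `bind` means something else already holds the port.

```python
import socket


def port_is_free(host: str, port: int) -> bool:
    """Return True if (host, port) can be bound, i.e. nothing else holds it.

    Hypothetical helper for illustration only.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError:
            # Typically EADDRINUSE: another socket already owns the port.
            return False
        return True
```

Running this before creating the SPU device tells you whether a port is taken by some unrelated process. Running it afterwards should return False for every configured port, because the SPU runtimes themselves are listening there.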


integrationex01 avatar integrationex01 commented on June 19, 2024

Could you first check whether the ports in the config are already in use? Thx~

————————————————————————————————————
This is the config used at runtime:
Code: aby3_config = sf.utils.testing.cluster_def(parties=["alice", "bob", "carol"])
aby3-config: node: [{'party': 'alice', 'address': '127.0.0.1:55661'}, {'party': 'bob', 'address': '127.0.0.1:41041'}, {'party': 'carol', 'address': '127.0.0.1:42505'}]
Here is what I see on the corresponding ports:
(sf) :$ sudo netstat tnulp | grep 55661
tcp 0 0 localhost:55661 localhost:44110 ESTABLISHED
tcp 0 0 localhost:44112 localhost:55661 ESTABLISHED
tcp 0 0 localhost:55661 localhost:44112 ESTABLISHED
tcp 0 0 localhost:44110 localhost:55661 ESTABLISHED
(sf) :$ sudo netstat tnulp | grep 41041
tcp 0 0 localhost:40610 localhost:41041 TIME_WAIT
tcp 0 0 localhost:41041 localhost:40616 ESTABLISHED
tcp 0 0 localhost:40614 localhost:41041 ESTABLISHED
tcp 0 0 localhost:41041 localhost:40614 ESTABLISHED
tcp 0 0 localhost:40616 localhost:41041 ESTABLISHED
(sf) :$ sudo netstat tnulp | grep 42505
tcp 0 0 localhost:42505 localhost:52662 ESTABLISHED
tcp 0 0 localhost:52662 localhost:42505 ESTABLISHED
tcp 0 0 localhost:52668 localhost:42505 ESTABLISHED
tcp 0 0 localhost:42505 localhost:52668 ESTABLISHED
(sf) :$ sudo lsof -i:55661
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
ray::SPUR 8602 30u IPv4 37020 0t0 TCP localhost:55661 (LISTEN)
ray::SPUR 8602 54u IPv4 38036 0t0 TCP localhost:55661->localhost:44110 (ESTABLISHED)
ray::SPUR 8602 57u IPv4 38037 0t0 TCP localhost:55661->localhost:44112 (ESTABLISHED)
ray::SPUR 8604 51u IPv4 36202 0t0 TCP localhost:44110->localhost:55661 (ESTABLISHED)
ray::SPUR 8613 51u IPv4 35103 0t0 TCP localhost:44112->localhost:55661 (ESTABLISHED)
(sf) :$ sudo lsof -i:42505
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
ray::SPUR 8602 65u IPv4 37056 0t0 TCP localhost:52668->localhost:42505 (ESTABLISHED)
ray::SPUR 8604 61u IPv4 39155 0t0 TCP localhost:52662->localhost:42505 (ESTABLISHED)
ray::SPUR 8613 30u IPv4 35099 0t0 TCP localhost:42505 (LISTEN)
ray::SPUR 8613 61u IPv4 27402 0t0 TCP localhost:42505->localhost:52662 (ESTABLISHED)
ray::SPUR 8613 65u IPv4 40171 0t0 TCP localhost:42505->localhost:52668 (ESTABLISHED)
(sf) :$ sudo lsof -i:41041
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
ray::SPUR 8602 64u IPv4 37055 0t0 TCP localhost:40616->localhost:41041 (ESTABLISHED)
ray::SPUR 8604 30u IPv4 36198 0t0 TCP localhost:41041 (LISTEN)
ray::SPUR 8604 64u IPv4 38039 0t0 TCP localhost:41041->localhost:40614 (ESTABLISHED)
ray::SPUR 8604 65u IPv4 38041 0t0 TCP localhost:41041->localhost:40616 (ESTABLISHED)
ray::SPUR 8613 64u IPv4 40163 0t0 TCP localhost:40614->localhost:41041 (ESTABLISHED)
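
The `lsof` output above shows one `LISTEN` socket per configured port, each owned by a `ray::SPUR` process. A rough way to confirm the same thing from Python is to try a TCP connection to every address in the node list. This is only a sketch: `has_listener` and `listener_report` are hypothetical helper names, and the node list is copied by hand from the config printed above.

```python
import socket

# Node list copied from the aby3-config output above.
nodes = [
    {"party": "alice", "address": "127.0.0.1:55661"},
    {"party": "bob", "address": "127.0.0.1:41041"},
    {"party": "carol", "address": "127.0.0.1:42505"},
]


def has_listener(address: str, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to 'host:port' succeeds."""
    host, port = address.rsplit(":", 1)
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return s.connect_ex((host, int(port))) == 0


def listener_report(node_list):
    """Map each party name to whether its address accepts connections."""
    return {n["party"]: has_listener(n["address"]) for n in node_list}
```

Calling `listener_report(nodes)` while the SPU runtimes are up should report True for all three parties. The probe opens and immediately closes a plain TCP connection, so it should be harmless to the running servers.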

I just realized I filed this issue under the wrong project (it is yacl, not secretflow). Sorry about that.


warriorpaw avatar warriorpaw commented on June 19, 2024

Environment: WSL: Ubuntu 20.04
Version: SecretFlow: 1.1.0b0
Code executed: spu_device = sf.SPU(aby3_config) (docs: Tutorial: SPU Basics)
Log output:
(SPURuntime pid=19246) 2023-09-20 14:00:38.165 [info] [default_brpc_retry_policy.cc:DoRetry:52] socket error, sleep=1000000us and retry
(SPURuntime pid=19245) 2023-09-20 14:00:38.165 [info] [default_brpc_retry_policy.cc:DoRetry:52] socket error, sleep=1000000us and retry
(SPURuntime pid=19246) 2023-09-20 14:00:39.165 [info] [default_brpc_retry_policy.cc:LogHttpDetail:29] cntl ErrorCode '112', http status code '200', response header '', error msg '[E111]Fail to connect Socket{id=1 addr=127.0.0.1:56231} (0x0x4a73900): Connection refused [R1][E112]Not connected to 127.0.0.1:56231 yet, server_id=1'
(SPURuntime pid=19246) 2023-09-20 14:00:39.165 [info] [default_brpc_retry_policy.cc:DoRetry:75] aggressive retry, sleep=1000000us and retry
(SPURuntime pid=19245) 2023-09-20 14:00:39.165 [info] [default_brpc_retry_policy.cc:LogHttpDetail:29] cntl ErrorCode '112', http status code '200', response header '', error msg '[E111]Fail to connect Socket{id=1 addr=127.0.0.1:56231} (0x0x4cdfd00): Connection refused [R1][E112]Not connected to 127.0.0.1:56231 yet, server_id=1'
(SPURuntime pid=19245) 2023-09-20 14:00:39.165 [info] [default_brpc_retry_policy.cc:DoRetry:75] aggressive retry, sleep=1000000us and retry
(SPURuntime pid=19246) 2023-09-20 14:00:40.166 [info] [default_brpc_retry_policy.cc:LogHttpDetail:29] cntl ErrorCode '112', http status code '200', response header '', error msg '[E111]Fail to connect Socket{id=1 addr=127.0.0.1:56231} (0x0x4a73900): Connection refused [R1][E112]Not connected to 127.0.0.1:56231 yet, server_id=1 [R2][E112]Not connected to 127.0.0.1:56231 yet, server_id=1'
(SPURuntime pid=19246) 2023-09-20 14:00:40.166 [info] [default_brpc_retry_policy.cc:DoRetry:75] aggressive retry, sleep=1000000us and retry
(SPURuntime pid=19245) 2023-09-20 14:00:40.166 [info] [default_brpc_retry_policy.cc:LogHttpDetail:29] cntl ErrorCode '112', http status code '200', response header '', error msg '[E111]Fail to connect Socket{id=1 addr=127.0.0.1:56231} (0x0x4cdfd00): Connection refused [R1][E112]Not connected to 127.0.0.1:56231 yet, server_id=1 [R2][E112]Not connected to 127.0.0.1:56231 yet, server_id=1'
(SPURuntime pid=19245) 2023-09-20 14:00:40.166 [info] [default_brpc_retry_policy.cc:DoRetry:75] aggressive retry, sleep=1000000us and retry
(SPURuntime pid=19248) 2023-09-20 14:00:40.214 [info] [default_brpc_retry_policy.cc:DoRetry:69] not retry for reached rcp timeout, ErrorCode '1008', error msg '[E1008]Reached timeout=2000ms @127.0.0.1:34335'

I'm not sure what exactly is causing this, whether it is a WSL networking problem or something else. Any help would be appreciated, thanks.

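These log lines are at info level, not error level. The compute nodes bring up their ports at slightly different times, so messages like these can appear during the first few seconds after startup, but they do not indicate a failure. Just carry on with the tutorial.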
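
To illustrate why the "Connection refused" messages are harmless during startup, here is a small sketch of the same idea: keep retrying a connection until the peer's port is up or a deadline passes. This is not the brpc retry policy from the logs, just a plain-socket analogue. The function name `wait_for_listener` and its default timings are assumptions chosen for the example.

```python
import socket
import time


def wait_for_listener(host: str, port: int,
                      deadline_s: float = 10.0,
                      interval_s: float = 0.5) -> bool:
    """Retry a TCP connect until it succeeds or deadline_s has elapsed.

    Returns True once the peer accepts a connection, False on timeout.
    """
    end = time.monotonic() + deadline_s
    while True:
        try:
            # A refused connection raises OSError (ConnectionRefusedError).
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            if time.monotonic() >= end:
                return False
            # Peer not listening yet: sleep and try again.
            time.sleep(interval_s)
```

A refused attempt here corresponds to one "socket error ... and retry" line in the log. Once every party's port is listening, the retries stop and nothing further needs to be done.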

