
Comments (17)

vyzo commented on August 16, 2024

Can you try either disabling mplex or testing with libp2p/go-mplex#99?

from edgevpn.

vyzo commented on August 16, 2024

Can you also check whether mplex is involved?
You probably don't need it at all; can you try limiting the muxer to just yamux?


vyzo commented on August 16, 2024

You can also get logs with GOLOG_LOG_LEVEL=debug; there should be some hints there.


vyzo commented on August 16, 2024

Yeah, the default inbound conn limit is very conservative.


vyzo commented on August 16, 2024

Keep us in the loop, v18 is an important release and we want to iron out all issues.

Are you using bitswap by any chance?

Another pointer: I suspect there might be a bug in yamux that makes it incapable of responding correctly to a refusal to increase the window, but that's still only a theory at this point.


mudler commented on August 16, 2024

> keep us in the loop, v18 is an important release and we want to iron out all issues.

Sure, will do 👍, thanks!

> Are you using bitswap by any chance?

Nope, things here are much simpler: we just send one block over to the nodes (we don't implement any real PoW, we only use it as a sync mechanism) and there is no block syncing (yet?). So it is more tied to the libp2p core modules and a simple pub/sub mechanism, both of which are just extensions of the libp2p samples.

> Another pointer is that I suspect there might be some bug in yamux that makes it incapable of responding correctly to a refusal to increase the window, but that's still only a theory at this point.

I'll keep my eyes open there, thanks for the hint!


mudler commented on August 16, 2024

> can you also check whether mplex is involved? You probably don't need it at all, can you try limiting the muxer to just yamux?

I'll give it a shot and try to collect as much info as possible, thanks for the pointers! The fact that nodes can't re-establish a connection afterwards should help trace it. I'll capture logs from the libp2p component at debug log level and try to grab them at that exact moment to get a clearer picture of what's going on.


mudler commented on August 16, 2024

> can you try either disabling mplex or with libp2p/go-mplex#99 ?

Going to try that, thanks! I can only test later in the day as I'm AFK now, though. I'll let you know as soon as I'm at it and keep you in the loop.


mudler commented on August 16, 2024

I'm sorry, I didn't have time to get back to this over the weekend. I still have to set up my test environment to reproduce the issue, as doing that manually is a time-consuming process (I observed this while setting up Kubernetes clusters on top of it, and that's the most straightforward way for me to reproduce it). I'll look at it during the week and keep you posted.


mudler commented on August 16, 2024

I'm following the discussions on the PRs. I'll cut a specific version with libp2p/go-libp2p#1350 later and check it out.


mudler commented on August 16, 2024

I'm trying to set up a small automated test running on GHA to be able to narrow it down. It seems the problem is still there (https://github.com/mudler/edgevpn/runs/5432147596?check_suite_focus=true). In that run I'm trying to send a 2GB file between two nodes.

I will enhance it to also collect pprof and libp2p debug logs, to get a better view of it. This could also have been something flaky: the test setup right now is really simplistic (at the moment it is just bash, so it is a bit hard to debug; I will move it to Go soon so I can turn it into a more interesting scenario).
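For the log-collection part, a Go version of such a harness might look roughly like the sketch below. It is only an illustration under stated assumptions: it uses the standard library alone, it assumes the node binary emits its libp2p logs on stderr, and `withDebugLogging` and `runWithDebugLogs` are hypothetical helper names that do not come from edgevpn or libp2p. The only detail taken from this thread is the `GOLOG_LOG_LEVEL=debug` environment variable.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// withDebugLogging returns a copy of env with GOLOG_LOG_LEVEL forced to
// "debug". Any existing GOLOG_LOG_LEVEL entry is dropped first so that the
// override is unambiguous. (Hypothetical helper, for illustration only.)
func withDebugLogging(env []string) []string {
	out := make([]string, 0, len(env)+1)
	for _, kv := range env {
		if strings.HasPrefix(kv, "GOLOG_LOG_LEVEL=") {
			continue
		}
		out = append(out, kv)
	}
	return append(out, "GOLOG_LOG_LEVEL=debug")
}

// runWithDebugLogs runs bin with debug logging enabled and writes the
// child's stderr to logPath. Assumption: the binary logs to stderr.
func runWithDebugLogs(logPath, bin string, args ...string) error {
	logFile, err := os.Create(logPath)
	if err != nil {
		return fmt.Errorf("create log file: %w", err)
	}
	defer logFile.Close()

	cmd := exec.Command(bin, args...)
	cmd.Env = withDebugLogging(os.Environ())
	cmd.Stdout = os.Stdout
	cmd.Stderr = logFile
	return cmd.Run()
}

func main() {
	if len(os.Args) < 3 {
		fmt.Println("usage: debugrun <log-file> <binary> [args...]")
		return
	}
	if err := runWithDebugLogs(os.Args[1], os.Args[2], os.Args[3:]...); err != nil {
		fmt.Fprintln(os.Stderr, "run failed:", err)
		os.Exit(1)
	}
}
```

Invoked as `debugrun node-a.log <binary> [args...]`, this leaves one debug log file per node that can be uploaded as a CI artifact when a transfer stalls.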


mudler commented on August 16, 2024

I just did a test in my homelab with multiple VMs and everything seems good here! I'll cut a release and test it a bit more in a bigger scenario. So far the connections on the nodes seem to be stable again, with no drops at all! I'll keep you posted if I notice something strange.


mudler commented on August 16, 2024

I've cut v0.11.0 with libp2p 0.18.0-rc6, thanks! I'll keep you in the loop if I spot something.


vyzo commented on August 16, 2024

great, thank you!


mudler commented on August 16, 2024

Alright, it seems that while testing it on a bigger scale I'm seeing the same issues. Intermittently, nodes drop off and do not connect back again. I've also cut a release of c3os with it, where the issue can be observed too: https://github.com/c3os-io/c3os/releases/tag/v1.21.4-36


mudler commented on August 16, 2024

There seems to be a slight difference: it seems to happen when I start to send over big chunks of data. It survives pings and other light traffic just fine.


mudler commented on August 16, 2024

OK, disabling the rcmgr makes everything work as usual, so it probably has to do with the default limits. I was using the rcmgr defaults in my first attempts, so maybe those were indeed too conservative.

I'm going to disable rcmgr by default and play around with it until I get some good defaults by running some benchmarks. I may also reuse the same maxConns approach as in lotus to see if it suits my case as well (it might not fit very well on Pis, but we shall see :) ).

