Comments (16)
Oh, I see what might be going on. See my comments in the snippet.
setup_node_id(#{host := _Host, port := _Port, client_id := NodeID} = State) ->
    _CurrName = node(),
    %% >>>>
    %% By this point Partisan has already started and adopted a dynamic name.
    %% If a discovery agent was configured in sys.config, it might have
    %% already joined the cluster with that dynamic name.
    %% <<<<
    %% >>>>
    %% Stop will not do a leave, so you need to call
    %% partisan_peer_service:leave() before stopping the app.
    %% <<<<
    partisan:stop(),
    net_kernel:stop(),
    {NodeName, HostName} =
        case net_kernel:start(NodeID, #{name_domain => shortnames}) of
            {ok, _Pid} ->
                partisan:start(),
                {ok, LocalHost} = inet:gethostname(),
                {node(), LocalHost};
            {error, Reason} ->
                io:format("error starting net_kernel: ~p~n", [Reason]),
                {undefined, undefined}
        end,
    State#{node => atom_to_binary(NodeName), hostname => list_to_binary(HostName)}.
Overall, is there any reason why you are starting disterl? I wouldn't do it like this :-)
I tend to disable it completely and set the node name via vm.args, but you can also set it via sys.config using {partisan, [{name, NODENAME}, ...]}.
You can also use vm.args.src in rebar3 to take the value from an environment variable.
## Name of the node (used by Partisan)
-name ${ERLANG_NODENAME}
## Cookie for disterl (used for remote_console only)
-setcookie mycookie
## Explicit connections only (Deprecated in latest versions)
-connect_all false
-auto_connect never
## Disable disterl
-start_epmd false
-hidden
You can also just tell relx to load partisan but not start it, so that you can do your custom name configuration and then do something like the following:
%% Assuming partisan has been loaded but not started
Nodename = ...do your magic here
application:set_env(partisan, name, Nodename),
_ = application:ensure_all_started(partisan)
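For the "load but not start" option mentioned above, a rebar.config fragment might look like the following. This is only a sketch: the release name myapp and its version are hypothetical placeholders, and it relies on relx's {App, load} application type, which loads the application without starting it at boot. It also assumes that myapp does not list partisan under applications in its .app.src, since otherwise starting myapp at boot would fail because partisan is not running yet.

```erlang
%% rebar.config (fragment); myapp and "0.1.0" are hypothetical placeholders
{relx, [
    {release, {myapp, "0.1.0"}, [
        %% Load partisan's modules and app env, but do not start it at boot.
        {partisan, load},
        myapp
    ]}
]}.
```

With this in place, myapp would be responsible for setting the name and calling application:ensure_all_started(partisan) itself, as in the snippet above.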
from partisan.
Hi @mcesaro, I will look into that tomorrow morning and let you know when it is fixed. Thanks a lot!
from partisan.
Which version/tag are you using?
from partisan.
It's the one included in tag "leapsight-1.5.1" in erleans, which in my settings is "v5.0.0-rc.11".
Now that you mention it, one of the other issues I should address is a generic handling of the partisan versions among applications, including erleans and applications using partisan directly.
from partisan.
Hi @mcesaro, just to clarify: when you say you would like the server to keep on running, which one are you referring to? The server that is crashing is partisan_peer_service_client, the TCP client connection that Partisan is trying to create with 'pebble@max-a5'. During the handshake the peer identifies itself with a different name, hence the error. This must be the case when the node has a membership view where 'pebble@max-a5' listens on IP X, but IP X is now associated with a node with a different name. In this case the partisan_peer_service should crash and the Peer Service Manager (in this case partisan_pluggable_peer_service_manager) should continuously retry that connection until its view is updated (replacing 'pebble@max-a5' with a new spec).
Although the partisan_peer_service_client server crashes and is linked to the Peer Service Manager, the latter should still be running, as it traps exits and handles them in handle_info.
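The trap-exit pattern described here can be illustrated with a small, self-contained gen_server. This is a generic sketch, not Partisan's code: the module name link_manager_sketch and the functions connect/1 and failures/0 are hypothetical, and the "connection" is just a linked process that exits immediately with the given reason.

```erlang
-module(link_manager_sketch).
-behaviour(gen_server).

-export([start_link/0, connect/1, failures/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% Spawn a linked "connection" process that exits with Reason right away.
connect(Reason) ->
    gen_server:call(?MODULE, {connect, Reason}).

%% Return the {Pid, Reason} pairs of all linked processes that have exited.
failures() ->
    gen_server:call(?MODULE, failures).

init([]) ->
    %% Without this flag, an abnormal exit of a linked process would
    %% also terminate this server.
    process_flag(trap_exit, true),
    {ok, #{failures => []}}.

handle_call({connect, Reason}, _From, State) ->
    Pid = spawn_link(fun() -> exit(Reason) end),
    {reply, Pid, State};
handle_call(failures, _From, #{failures := Fs} = State) ->
    {reply, lists:reverse(Fs), State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

%% With trap_exit set, exit signals arrive as ordinary messages.
handle_info({'EXIT', Pid, Reason}, #{failures := Fs} = State) ->
    %% A real manager would schedule a reconnect attempt here.
    {noreply, State#{failures := [{Pid, Reason} | Fs]}};
handle_info(_Info, State) ->
    {noreply, State}.
```

After link_manager_sketch:connect({unexpected_peer, a, b}) the server stays alive and, once the exit signal has been delivered, records the failure instead of crashing with it.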
from partisan.
Hi,
I see the logic, although it's not clear to me how the peer service view is updated (should it be automatic or triggered somehow?).
I was concerned because the peer service manager itself crashed, bringing down the rest of the application.
My test environment is made of a bunch of lxc containers with 2 or 3 different bridged LANs, and the local DNS (managed by lxd) might be part of the problem.
from partisan.
Hi,
Can you provide the log entries where the Peer Service crashes, along with the preceding logs for context?
If the Peer Service crashes due to the above, there must be something else going on. In any case, that would be a bug.
"it's not clear to me how the peer service view is updated"
The view is updated by each node gossiping its state, a CRDT object, which is merged on every node at every round.
This is done by the Peer Service Manager, in your case partisan_pluggable_peer_service_manager (process), delegating the details to partisan_full_membership_strategy (module). The former periodically calls the latter's periodic function. The strategy module uses partisan_membership_set, a module wrapping access to a CRDT, to maintain its view of the cluster. The periodic function returns messages to the Peer Service Manager for it to send to all the other peers. On reception, the remote Peer Service Manager merges the received CRDT with the one maintained by its local partisan_full_membership_strategy module.
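To make the merge step concrete, here is a toy state-based CRDT for a membership view. It is a deliberately simple two-phase set and is not how partisan_membership_set is implemented; the module name membership_crdt_sketch and all its functions are hypothetical. It only illustrates why merging is safe to repeat in any order, and why a node that never issues a leave stays in everyone's view.

```erlang
-module(membership_crdt_sketch).
-export([new/0, add/2, remove/2, merge/2, members/1]).

%% State is {Added, Removed}, both ordered sets of node names.
new() ->
    {ordsets:new(), ordsets:new()}.

%% A node joins: record it in the Added set.
add(Node, {Added, Removed}) ->
    {ordsets:add_element(Node, Added), Removed}.

%% A node leaves: record it in the Removed set (it can never rejoin
%% under the same name in this toy model).
remove(Node, {Added, Removed}) ->
    {Added, ordsets:add_element(Node, Removed)}.

%% Merge is a pairwise union, so it is commutative, associative and
%% idempotent: gossip rounds can arrive in any order, any number of times.
merge({A1, R1}, {A2, R2}) ->
    {ordsets:union(A1, A2), ordsets:union(R1, R2)}.

%% The current view: everything added and not removed.
members({Added, Removed}) ->
    ordsets:subtract(Added, Removed).
```

In this model, a node that crashed without calling remove/2 remains in members/1 on every peer after merging, which mirrors the stale-member situation discussed in this thread.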
from partisan.
UPDATE: the problem affects all the members of a cluster.
Hi,
I can't replicate the crash, so it probably was due to a different reason in my own code.
However, I'm experiencing the following behavior that renders the system unusable: when one of the cluster members restarts and tries to join again, a kind of loop condition is generated.
See the following logs from 2 nodes, pebble and cwork.
On peer pebble:
=ERROR REPORT==== 25-Oct-2023::11:28:49.915373 ===
description: Unexpected peer, aborting
expected: '[email protected]'
got: 'pebble@max-a5'
=ERROR REPORT==== 25-Oct-2023::11:28:49.915515 ===
** Generic server <0.3063.0> terminating
** Last message in was {tcp,#Port<0.1137>,
<<131,104,2,119,5,104,101,108,108,111,119,13,112,
101,98,98,108,101,64,109,97,120,45,97,53>>}
** When Server state == {state,
{partisan_peer_socket,#Port<0.1137>,gen_tcp,inet,
false},
#{port => 10202,ip => {192,168,1,120}},
partisan_membership,
#{monotonic => false,parallelism => 1,
compression => true},
[compressed],
<0.1066.0>,
#{name =>
'[email protected]',
listen_addrs =>
[#{port => 10202,ip => {192,168,1,120}}],
channels =>
#{undefined =>
#{monotonic => false,parallelism => 1,
compression => false},
data =>
#{monotonic => false,parallelism => 1,
compression => false},
partisan_membership =>
#{monotonic => false,parallelism => 1,
compression => true}}}}
** Reason for termination ==
** {unexpected_peer,'pebble@max-a5',
'[email protected]'}
=CRASH REPORT==== 25-Oct-2023::11:28:49.915713 ===
crasher:
initial call: partisan_peer_service_client:init/1
pid: <0.3063.0>
registered_name: []
exception exit: {unexpected_peer,'pebble@max-a5',
'[email protected]'}
in function gen_server:handle_common_reply/8 (gen_server.erl, line 1208)
ancestors: [partisan_pluggable_peer_service_manager,
partisan_peer_service_sup,partisan_sup,<0.1059.0>]
message_queue_len: 0
messages: []
links: [<0.1066.0>]
dictionary: [{{partisan_peer_service_client,peer},
#{name => '[email protected]',
listen_addrs => [#{port => 10202,ip => {192,168,1,120}}],
channels =>
#{undefined =>
#{monotonic => false,parallelism => 1,
compression => false},
data =>
#{monotonic => false,parallelism => 1,
compression => false},
partisan_membership =>
#{monotonic => false,parallelism => 1,
compression => true}}}},
{{partisan_peer_service_client,channel_opts},
#{monotonic => false,parallelism => 1,compression => true}},
{{partisan_peer_service_client,from},<0.1066.0>},
{{partisan_peer_service_client,listen_addr},
#{port => 10202,ip => {192,168,1,120}}},
{{partisan_peer_service_client,channel},partisan_membership},
{{partisan_peer_service_client,egress_delay},0}]
trap_exit: false
status: running
heap_size: 610
stack_size: 28
reductions: 13853
neighbours:
=ERROR REPORT==== 25-Oct-2023::11:28:50.917261 ===
description: Unexpected peer, aborting
expected: '[email protected]'
got: 'pebble@max-a5'
=ERROR REPORT==== 25-Oct-2023::11:28:50.917498 ===
** Generic server <0.3080.0> terminating
** Last message in was {tcp,#Port<0.1145>,
<<131,104,2,119,5,104,101,108,108,111,119,13,112,
101,98,98,108,101,64,109,97,120,45,97,53>>}
** When Server state == {state,
{partisan_peer_socket,#Port<0.1145>,gen_tcp,inet,
false},
#{port => 10202,ip => {192,168,1,120}},
undefined,
#{monotonic => false,parallelism => 1,
compression => false},
[],<0.1066.0>,
#{name =>
'[email protected]',
listen_addrs =>
[#{port => 10202,ip => {192,168,1,120}}],
channels =>
#{undefined =>
#{monotonic => false,parallelism => 1,
compression => false},
data =>
#{monotonic => false,parallelism => 1,
compression => false},
partisan_membership =>
#{monotonic => false,parallelism => 1,
compression => true}}}}
** Reason for termination ==
** {unexpected_peer,'pebble@max-a5',
'[email protected]'}
=ERROR REPORT==== 25-Oct-2023::11:28:50.917912 ===
description: Unexpected peer, aborting
expected: '[email protected]'
got: 'pebble@max-a5'
=CRASH REPORT==== 25-Oct-2023::11:28:50.917764 ===
crasher:
initial call: partisan_peer_service_client:init/1
pid: <0.3080.0>
registered_name: []
exception exit: {unexpected_peer,'pebble@max-a5',
'[email protected]'}
in function gen_server:handle_common_reply/8 (gen_server.erl, line 1208)
ancestors: [partisan_pluggable_peer_service_manager,
partisan_peer_service_sup,partisan_sup,<0.1059.0>]
message_queue_len: 0
messages: []
links: [<0.1066.0>]
dictionary: [{{partisan_peer_service_client,peer},
#{name => '[email protected]',
listen_addrs => [#{port => 10202,ip => {192,168,1,120}}],
channels =>
#{undefined =>
#{monotonic => false,parallelism => 1,
compression => false},
data =>
#{monotonic => false,parallelism => 1,
compression => false},
partisan_membership =>
#{monotonic => false,parallelism => 1,
compression => true}}}},
{{partisan_peer_service_client,channel_opts},
#{monotonic => false,parallelism => 1,
compression => false}},
{{partisan_peer_service_client,from},<0.1066.0>},
{{partisan_peer_service_client,listen_addr},
#{port => 10202,ip => {192,168,1,120}}},
{{partisan_peer_service_client,channel},undefined},
{{partisan_peer_service_client,egress_delay},0}]
trap_exit: false
status: running
heap_size: 610
stack_size: 28
reductions: 13644
neighbours:
=ERROR REPORT==== 25-Oct-2023::11:28:50.918026 ===
** Generic server <0.3084.0> terminating
** Last message in was {tcp,#Port<0.1148>,
<<131,104,2,119,5,104,101,108,108,111,119,13,112,
101,98,98,108,101,64,109,97,120,45,97,53>>}
** When Server state == {state,
{partisan_peer_socket,#Port<0.1148>,gen_tcp,inet,
false},
#{port => 10202,ip => {192,168,1,120}},
data,
#{monotonic => false,parallelism => 1,
compression => false},
[],<0.1066.0>,
#{name =>
'[email protected]',
listen_addrs =>
[#{port => 10202,ip => {192,168,1,120}}],
channels =>
#{undefined =>
#{monotonic => false,parallelism => 1,
compression => false},
data =>
#{monotonic => false,parallelism => 1,
compression => false},
partisan_membership =>
#{monotonic => false,parallelism => 1,
compression => true}}}}
** Reason for termination ==
** {unexpected_peer,'pebble@max-a5',
'[email protected]'}
=ERROR REPORT==== 25-Oct-2023::11:28:50.918281 ===
description: Unexpected peer, aborting
expected: '[email protected]'
got: 'pebble@max-a5'
=CRASH REPORT==== 25-Oct-2023::11:28:50.918334 ===
crasher:
initial call: partisan_peer_service_client:init/1
pid: <0.3084.0>
registered_name: []
exception exit: {unexpected_peer,'pebble@max-a5',
'[email protected]'}
in function gen_server:handle_common_reply/8 (gen_server.erl, line 1208)
ancestors: [partisan_pluggable_peer_service_manager,
partisan_peer_service_sup,partisan_sup,<0.1059.0>]
message_queue_len: 0
messages: []
links: [<0.1066.0>]
dictionary: [{{partisan_peer_service_client,peer},
#{name => '[email protected]',
listen_addrs => [#{port => 10202,ip => {192,168,1,120}}],
channels =>
#{undefined =>
#{monotonic => false,parallelism => 1,
compression => false},
data =>
#{monotonic => false,parallelism => 1,
compression => false},
partisan_membership =>
#{monotonic => false,parallelism => 1,
compression => true}}}},
{{partisan_peer_service_client,channel_opts},
#{monotonic => false,parallelism => 1,
compression => false}},
{{partisan_peer_service_client,from},<0.1066.0>},
{{partisan_peer_service_client,listen_addr},
#{port => 10202,ip => {192,168,1,120}}},
{{partisan_peer_service_client,channel},data},
{{partisan_peer_service_client,egress_delay},0}]
trap_exit: false
status: running
heap_size: 610
stack_size: 28
reductions: 13471
neighbours:
=ERROR REPORT==== 25-Oct-2023::11:28:50.918440 ===
** Generic server <0.3087.0> terminating
** Last message in was {tcp,#Port<0.1150>,
<<131,104,2,119,5,104,101,108,108,111,119,13,112,
101,98,98,108,101,64,109,97,120,45,97,53>>}
** When Server state == {state,
{partisan_peer_socket,#Port<0.1150>,gen_tcp,inet,
false},
#{port => 10202,ip => {192,168,1,120}},
partisan_membership,
#{monotonic => false,parallelism => 1,
compression => true},
[compressed],
<0.1066.0>,
#{name =>
'[email protected]',
listen_addrs =>
[#{port => 10202,ip => {192,168,1,120}}],
channels =>
#{undefined =>
#{monotonic => false,parallelism => 1,
compression => false},
data =>
#{monotonic => false,parallelism => 1,
compression => false},
partisan_membership =>
#{monotonic => false,parallelism => 1,
compression => true}}}}
** Reason for termination ==
** {unexpected_peer,'pebble@max-a5',
'[email protected]'}
=CRASH REPORT==== 25-Oct-2023::11:28:50.918656 ===
crasher:
initial call: partisan_peer_service_client:init/1
pid: <0.3087.0>
registered_name: []
exception exit: {unexpected_peer,'pebble@max-a5',
'[email protected]'}
in function gen_server:handle_common_reply/8 (gen_server.erl, line 1208)
ancestors: [partisan_pluggable_peer_service_manager,
partisan_peer_service_sup,partisan_sup,<0.1059.0>]
message_queue_len: 0
messages: []
links: [<0.1066.0>]
dictionary: [{{partisan_peer_service_client,peer},
#{name => '[email protected]',
listen_addrs => [#{port => 10202,ip => {192,168,1,120}}],
channels =>
#{undefined =>
#{monotonic => false,parallelism => 1,
compression => false},
data =>
#{monotonic => false,parallelism => 1,
compression => false},
partisan_membership =>
#{monotonic => false,parallelism => 1,
compression => true}}}},
{{partisan_peer_service_client,channel_opts},
#{monotonic => false,parallelism => 1,compression => true}},
{{partisan_peer_service_client,from},<0.1066.0>},
{{partisan_peer_service_client,listen_addr},
#{port => 10202,ip => {192,168,1,120}}},
{{partisan_peer_service_client,channel},partisan_membership},
{{partisan_peer_service_client,egress_delay},0}]
trap_exit: false
status: running
heap_size: 610
stack_size: 28
reductions: 13852
neighbours:
On peer cwork:
=ERROR REPORT==== 25-Oct-2023::09:28:04.263766 ===
description: Unexpected peer, aborting
expected: '[email protected]'
got: 'pebble@max-a5'
=ERROR REPORT==== 25-Oct-2023::09:28:04.264086 ===
description: Unexpected peer, aborting
expected: '[email protected]'
got: 'pebble@max-a5'
=ERROR REPORT==== 25-Oct-2023::09:28:04.264151 ===
description: Unexpected peer, aborting
expected: '[email protected]'
got: 'pebble@max-a5'
=ERROR REPORT==== 25-Oct-2023::09:28:04.264024 ===
** Generic server <0.1739.0> terminating
** Last message in was {tcp,#Port<0.467>,
<<131,104,2,119,5,104,101,108,108,111,119,13,112,
101,98,98,108,101,64,109,97,120,45,97,53>>}
** When Server state == {state,
{partisan_peer_socket,#Port<0.467>,gen_tcp,inet,
false},
#{port => 10202,ip => {192,168,1,120}},
partisan_membership,
#{monotonic => false,parallelism => 1,
compression => true},
[compressed],
<0.855.0>,
#{name =>
'[email protected]',
listen_addrs =>
[#{port => 10202,ip => {192,168,1,120}}],
channels =>
#{undefined =>
#{monotonic => false,parallelism => 1,
compression => false},
data =>
#{monotonic => false,parallelism => 1,
compression => false},
partisan_membership =>
#{monotonic => false,parallelism => 1,
compression => true}}}}
** Reason for termination ==
** {unexpected_peer,'pebble@max-a5',
'[email protected]'}
=ERROR REPORT==== 25-Oct-2023::09:28:04.264304 ===
** Generic server <0.1738.0> terminating
** Last message in was {tcp,#Port<0.466>,
<<131,104,2,119,5,104,101,108,108,111,119,13,112,
101,98,98,108,101,64,109,97,120,45,97,53>>}
** When Server state == {state,
{partisan_peer_socket,#Port<0.466>,gen_tcp,inet,
false},
#{port => 10202,ip => {192,168,1,120}},
data,
#{monotonic => false,parallelism => 1,
compression => false},
[],<0.855.0>,
#{name =>
'[email protected]',
listen_addrs =>
[#{port => 10202,ip => {192,168,1,120}}],
channels =>
#{undefined =>
#{monotonic => false,parallelism => 1,
compression => false},
data =>
#{monotonic => false,parallelism => 1,
compression => false},
partisan_membership =>
#{monotonic => false,parallelism => 1,
compression => true}}}}
** Reason for termination ==
** {unexpected_peer,'pebble@max-a5',
'[email protected]'}
=ERROR REPORT==== 25-Oct-2023::09:28:04.264353 ===
** Generic server <0.1737.0> terminating
** Last message in was {tcp,#Port<0.465>,
<<131,104,2,119,5,104,101,108,108,111,119,13,112,
101,98,98,108,101,64,109,97,120,45,97,53>>}
** When Server state == {state,
{partisan_peer_socket,#Port<0.465>,gen_tcp,inet,
false},
#{port => 10202,ip => {192,168,1,120}},
undefined,
#{monotonic => false,parallelism => 1,
compression => false},
[],<0.855.0>,
#{name =>
'[email protected]',
listen_addrs =>
[#{port => 10202,ip => {192,168,1,120}}],
channels =>
#{undefined =>
#{monotonic => false,parallelism => 1,
compression => false},
data =>
#{monotonic => false,parallelism => 1,
compression => false},
partisan_membership =>
#{monotonic => false,parallelism => 1,
compression => true}}}}
** Reason for termination ==
** {unexpected_peer,'pebble@max-a5',
'[email protected]'}
=CRASH REPORT==== 25-Oct-2023::09:28:04.264295 ===
crasher:
initial call: partisan_peer_service_client:init/1
pid: <0.1739.0>
registered_name: []
exception exit: {unexpected_peer,'pebble@max-a5',
'[email protected]'}
in function gen_server:handle_common_reply/8 (gen_server.erl, line 1208)
ancestors: [partisan_pluggable_peer_service_manager,
partisan_peer_service_sup,partisan_sup,<0.845.0>]
message_queue_len: 0
messages: []
links: [<0.855.0>]
dictionary: [{{partisan_peer_service_client,listen_addr},
#{port => 10202,ip => {192,168,1,120}}},
{{partisan_peer_service_client,egress_delay},0},
{{partisan_peer_service_client,channel},partisan_membership},
{{partisan_peer_service_client,from},<0.855.0>},
{{partisan_peer_service_client,channel_opts},
#{monotonic => false,parallelism => 1,compression => true}},
{{partisan_peer_service_client,peer},
#{name => '[email protected]',
listen_addrs => [#{port => 10202,ip => {192,168,1,120}}],
channels =>
#{undefined =>
#{monotonic => false,parallelism => 1,
compression => false},
data =>
#{monotonic => false,parallelism => 1,
compression => false},
partisan_membership =>
#{monotonic => false,parallelism => 1,
compression => true}}}}]
trap_exit: false
status: running
heap_size: 610
stack_size: 28
reductions: 13864
neighbours:
=CRASH REPORT==== 25-Oct-2023::09:28:04.264475 ===
crasher:
initial call: partisan_peer_service_client:init/1
pid: <0.1738.0>
registered_name: []
exception exit: {unexpected_peer,'pebble@max-a5',
'[email protected]'}
in function gen_server:handle_common_reply/8 (gen_server.erl, line 1208)
ancestors: [partisan_pluggable_peer_service_manager,
partisan_peer_service_sup,partisan_sup,<0.845.0>]
message_queue_len: 0
messages: []
links: [<0.855.0>]
dictionary: [{{partisan_peer_service_client,listen_addr},
#{port => 10202,ip => {192,168,1,120}}},
{{partisan_peer_service_client,egress_delay},0},
{{partisan_peer_service_client,channel},data},
{{partisan_peer_service_client,from},<0.855.0>},
{{partisan_peer_service_client,channel_opts},
#{monotonic => false,parallelism => 1,
compression => false}},
{{partisan_peer_service_client,peer},
#{name => '[email protected]',
listen_addrs => [#{port => 10202,ip => {192,168,1,120}}],
channels =>
#{undefined =>
#{monotonic => false,parallelism => 1,
compression => false},
data =>
#{monotonic => false,parallelism => 1,
compression => false},
partisan_membership =>
#{monotonic => false,parallelism => 1,
compression => true}}}}]
trap_exit: false
status: running
heap_size: 610
stack_size: 28
reductions: 13632
neighbours:
=CRASH REPORT==== 25-Oct-2023::09:28:04.264567 ===
crasher:
initial call: partisan_peer_service_client:init/1
pid: <0.1737.0>
registered_name: []
exception exit: {unexpected_peer,'pebble@max-a5',
'[email protected]'}
in function gen_server:handle_common_reply/8 (gen_server.erl, line 1208)
ancestors: [partisan_pluggable_peer_service_manager,
partisan_peer_service_sup,partisan_sup,<0.845.0>]
message_queue_len: 0
messages: []
links: [<0.855.0>]
dictionary: [{{partisan_peer_service_client,listen_addr},
#{port => 10202,ip => {192,168,1,120}}},
{{partisan_peer_service_client,egress_delay},0},
{{partisan_peer_service_client,channel},undefined},
{{partisan_peer_service_client,from},<0.855.0>},
{{partisan_peer_service_client,channel_opts},
#{monotonic => false,parallelism => 1,
compression => false}},
{{partisan_peer_service_client,peer},
#{name => '[email protected]',
listen_addrs => [#{port => 10202,ip => {192,168,1,120}}],
channels =>
#{undefined =>
#{monotonic => false,parallelism => 1,
compression => false},
data =>
#{monotonic => false,parallelism => 1,
compression => false},
partisan_membership =>
#{monotonic => false,parallelism => 1,
compression => true}}}}]
trap_exit: false
status: running
heap_size: 610
stack_size: 28
reductions: 13508
neighbours:
=ERROR REPORT==== 25-Oct-2023::09:28:04.436731 ===
description: Unexpected peer, aborting
expected: '[email protected]'
got: 'pebble@max-a5'
=ERROR REPORT==== 25-Oct-2023::09:28:04.437083 ===
description: Unexpected peer, aborting
expected: '[email protected]'
got: 'pebble@max-a5'
=ERROR REPORT==== 25-Oct-2023::09:28:04.437245 ===
description: Unexpected peer, aborting
expected: '[email protected]'
got: 'pebble@max-a5'
=ERROR REPORT==== 25-Oct-2023::09:28:04.437387 ===
** Generic server <0.1752.0> terminating
** Last message in was {tcp,#Port<0.470>,
<<131,104,2,119,5,104,101,108,108,111,119,13,112,
101,98,98,108,101,64,109,97,120,45,97,53>>}
** When Server state == {state,
{partisan_peer_socket,#Port<0.470>,gen_tcp,inet,
false},
#{port => 10202,ip => {192,168,1,120}},
partisan_membership,
#{monotonic => false,parallelism => 1,
compression => true},
[compressed],
<0.855.0>,
#{name =>
'[email protected]',
listen_addrs =>
[#{port => 10202,ip => {192,168,1,120}}],
channels =>
#{undefined =>
#{monotonic => false,parallelism => 1,
compression => false},
data =>
#{monotonic => false,parallelism => 1,
compression => false},
partisan_membership =>
#{monotonic => false,parallelism => 1,
compression => true}}}}
** Reason for termination ==
** {unexpected_peer,'pebble@max-a5',
'[email protected]'}
=ERROR REPORT==== 25-Oct-2023::09:28:04.437584 ===
** Generic server <0.1751.0> terminating
** Last message in was {tcp,#Port<0.469>,
<<131,104,2,119,5,104,101,108,108,111,119,13,112,
101,98,98,108,101,64,109,97,120,45,97,53>>}
** When Server state == {state,
{partisan_peer_socket,#Port<0.469>,gen_tcp,inet,
false},
#{port => 10202,ip => {192,168,1,120}},
data,
#{monotonic => false,parallelism => 1,
compression => false},
[],<0.855.0>,
#{name =>
'[email protected]',
listen_addrs =>
[#{port => 10202,ip => {192,168,1,120}}],
channels =>
#{undefined =>
#{monotonic => false,parallelism => 1,
compression => false},
data =>
#{monotonic => false,parallelism => 1,
compression => false},
partisan_membership =>
#{monotonic => false,parallelism => 1,
compression => true}}}}
** Reason for termination ==
** {unexpected_peer,'pebble@max-a5',
'[email protected]'}
=ERROR REPORT==== 25-Oct-2023::09:28:04.436970 ===
** Generic server <0.1750.0> terminating
** Last message in was {tcp,#Port<0.468>,
<<131,104,2,119,5,104,101,108,108,111,119,13,112,
101,98,98,108,101,64,109,97,120,45,97,53>>}
** When Server state == {state,
{partisan_peer_socket,#Port<0.468>,gen_tcp,inet,
false},
#{port => 10202,ip => {192,168,1,120}},
undefined,
#{monotonic => false,parallelism => 1,
compression => false},
[],<0.855.0>,
#{name =>
'[email protected]',
listen_addrs =>
[#{port => 10202,ip => {192,168,1,120}}],
channels =>
#{undefined =>
#{monotonic => false,parallelism => 1,
compression => false},
data =>
#{monotonic => false,parallelism => 1,
compression => false},
partisan_membership =>
#{monotonic => false,parallelism => 1,
compression => true}}}}
** Reason for termination ==
** {unexpected_peer,'pebble@max-a5',
'[email protected]'}
=CRASH REPORT==== 25-Oct-2023::09:28:04.437814 ===
crasher:
initial call: partisan_peer_service_client:init/1
pid: <0.1752.0>
registered_name: []
exception exit: {unexpected_peer,'pebble@max-a5',
'[email protected]'}
in function gen_server:handle_common_reply/8 (gen_server.erl, line 1208)
ancestors: [partisan_pluggable_peer_service_manager,
partisan_peer_service_sup,partisan_sup,<0.845.0>]
message_queue_len: 0
messages: []
links: [<0.855.0>]
dictionary: [{{partisan_peer_service_client,listen_addr},
#{port => 10202,ip => {192,168,1,120}}},
{{partisan_peer_service_client,egress_delay},0},
{{partisan_peer_service_client,channel},partisan_membership},
{{partisan_peer_service_client,from},<0.855.0>},
{{partisan_peer_service_client,channel_opts},
#{monotonic => false,parallelism => 1,compression => true}},
{{partisan_peer_service_client,peer},
#{name => '[email protected]',
listen_addrs => [#{port => 10202,ip => {192,168,1,120}}],
channels =>
#{undefined =>
#{monotonic => false,parallelism => 1,
compression => false},
data =>
#{monotonic => false,parallelism => 1,
compression => false},
partisan_membership =>
#{monotonic => false,parallelism => 1,
compression => true}}}}]
trap_exit: false
status: running
heap_size: 610
stack_size: 28
reductions: 13855
neighbours:
It looks like it is related to the handling of unexpected peers, that triggers this loop condition.
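The binary in the "Last message in" entries of the logs above is in the Erlang external term format, so it can be decoded to see what the remote peer actually sent. The module below is a small sketch for doing that; handshake_decode_sketch, payload/0 and decode/1 are hypothetical names, and the bytes are copied verbatim from the log.

```erlang
-module(handshake_decode_sketch).
-export([payload/0, decode/1]).

%% Bytes copied verbatim from the "Last message in" log entries.
payload() ->
    <<131,104,2,119,5,104,101,108,108,111,119,13,112,
      101,98,98,108,101,64,109,97,120,45,97,53>>.

%% The payload comes from our own logs here; for untrusted input,
%% binary_to_term(Bin, [safe]) avoids creating new atoms.
decode(Bin) ->
    binary_to_term(Bin).
```

Calling handshake_decode_sketch:decode(handshake_decode_sketch:payload()) returns {hello,'pebble@max-a5'}, which matches the "got:" field of the error reports.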
from partisan.
Hi @mcesaro, yes, the Peer Service will keep on trying to open a connection to the nodes in the membership list forever. Maybe in this case we need to do something different. But before doing that I would like to understand what might be going on.
Maybe I have introduced something weird in the latest update to the listen_addrs and IP resolution.
The unexpected_peer error seems to be triggered by your node trying to connect to peer [email protected], which is listening on #{port => 10202,ip => {192,168,1,120}} according to the membership set. The fact that the name is {UUID}@127.0.0.1 means that the node couldn't resolve its Erlang node name on init and defaulted to a dynamically generated one [1].
When the node at #{port => 10202,ip => {192,168,1,120}} is contacted, it identifies itself as pebble@max-a5, hence the error. This could be a bug, or the result of a crash (without a leave) followed by another node starting on the same IP:Port as the one that crashed.
Let's check if it could be the latter. Is it possible that you are experiencing the following?
- You start two nodes: this one and the peer. The peer is started at #{port => 10202,ip => {192,168,1,120}}.
- For some reason the peer could not resolve its Erlang node name and adopted a dynamic one, i.e. {UUID}@127.0.0.1. It joins the cluster with the first one, and is then stopped (maybe you stopped it because the name was wrong?) or crashes without issuing a partisan_peer_service:leave(). So it remains a member in the cluster membership set on this node.
- A new peer starts using the same IP:Port, i.e. #{port => 10202,ip => {192,168,1,120}}, but now correctly taking its node name pebble@max-a5. pebble@max-a5 joins the cluster, and now Partisan has 3 members in the cluster: this node, pebble@max-a5 and [email protected].
- As a result, this node and pebble@max-a5 will try to connect to [email protected] at #{port => 10202,ip => {192,168,1,120}}, but the peer running there is called pebble@max-a5!
If I am correct, you should get the three node names as a result of calling partisan_peer_service:members() on this node.
If this is the case, you should be able to resolve the issue by calling
partisan_peer_service:leave('[email protected]')
[1] Here I see a bug, since the generated name should have been [email protected].
from partisan.
Hi @aramallo,
since I had some doubts about resolving the Erlang node name, in my code I usually do something like this:
setup_node_id(#{host := _Host, port := _Port, client_id := NodeID} = State) ->
    _CurrName = node(),
    partisan:stop(),
    net_kernel:stop(),
    {NodeName, HostName} =
        case net_kernel:start(NodeID, #{name_domain => shortnames}) of
            {ok, _Pid} ->
                partisan:start(),
                {ok, LocalHost} = inet:gethostname(),
                {node(), LocalHost};
            {error, Reason} ->
                io:format("error starting net_kernel: ~p~n", [Reason]),
                {undefined, undefined}
        end,
    State#{node => atom_to_binary(NodeName), hostname => list_to_binary(HostName)}.
thus trying to make sure that the short name of the node is the one assigned.
This is why I can't see if there is an obvious reason for the need of dynamic names.
from partisan.
I see.
Actually I do not want to use disterl (that's the reason why I like partisan!), but I thought that the only way to assign a reliable short name to the Erlang node was to start net_kernel in a controlled way.
Since I need to name the nodes dynamically, I guess I might address this by deferring partisan startup after assigning the node name.
from partisan.
Obviously, setting the name param will not affect disterl; that is, Partisan will not set the node name in net_kernel. If you want both net_kernel and partisan to have the same node name, i.e. node() == partisan:node(), then the best option is to set the name in your vm.args or vm.args.src file.
from partisan.
I guess I will just rely on partisan doing the right thing, i.e. take away any reference to disterl. However, I will need to do that independently of the vm.args settings.
from partisan.
@mcesaro I guess the ideal would be for each node to have a persistent name, so that the same node in a fleet takes the same name after each restart. I'm not sure how you deploy your system; if you are using k8s, you would do that using StatefulSets and exporting the pod name as an environment variable to use in vm.args.
from partisan.
@aramallo apparently a combination of methods works.
I changed the setup of a cluster node like this:
setup_node_id(#{host := _Host, port := _Port, client_id := NodeID} = State) ->
    partisan:stop(),
    {ok, HostName} = inet:gethostname(),
    NodeName =
        [io_lib:format("~p@~s", [NodeID, HostName])]
        / lists:flatten
        / list_to_atom,
    io:format("nodename: ~p~n", [NodeName]),
    application:set_env(partisan, name, NodeName),
    _ = application:ensure_all_started(partisan),
    State#{node => atom_to_binary(NodeName), hostname => list_to_binary(HostName)}.
Seems to work great with my container setup and also on bare metal. No DNS issues (hopefully),
nodename: 'ska@max-a5'
=NOTICE REPORT==== 25-Oct-2023::22:00:17.637182 ===
name: 'ska@max-a5'
description: Partisan node name configured
disterl_enabled: false
=NOTICE REPORT==== 25-Oct-2023::22:00:17.641555 ===
addr: {0,0,0,0,0,0,0,1}
family: inet6
host: max-a5
description: Resolved IP address for host
setup: partisan peer service join ok
setup: partisan try rpc call
setup: partisan rpc call success
Dynamic names seem to work with partisan!
P.S. I use the great epipe parse transform to emulate the Elixir |> operator.
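For readers who do not use the epipe parse transform, the name-building step can be written with plain nested calls. This is a sketch under the assumption that NodeID is an atom that needs no quoting (as with ska in the output above); node_name_sketch and node_name/2 are hypothetical names.

```erlang
-module(node_name_sketch).
-export([node_name/2]).

%% Build the atom 'NodeID@HostName' from an atom id and a hostname string.
%% Note: ~p prints terms as Erlang source, so an id that needs quoting
%% (or a binary) would carry its quotes or <<>> into the resulting name.
node_name(NodeID, HostName) ->
    list_to_atom(lists:flatten(io_lib:format("~p@~s", [NodeID, HostName]))).
```

For example, node_name_sketch:node_name(ska, "max-a5") returns 'ska@max-a5'.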
from partisan.
@mcesaro Very nice!
BTW I was not aware of epipe, thanks for the tip. I will start working on the other issues and close this one.
from partisan.