
Comments (3)

temeo avatar temeo commented on August 22, 2024

Got this again in nightly builds.

Log from node1 (b70d4ba3):

2014-06-04 04:26:17 30379 [Note] WSREP: New cluster view: global state: b70e0409-eb8d-11e3-aeac-53684f5182d6:1293933, view# 2: Primary, number of nodes: 2, my index: 0, protocol version 2
2014-06-04 04:26:17 30379 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2014-06-04 04:26:17 30379 [Note] WSREP: REPL Protocols: 5 (3, 1)
2014-06-04 04:26:17 30379 [Note] WSREP: Service thread queue flushed.
2014-06-04 04:26:17 30379 [Note] WSREP: Assign initial position for certification: 1293933, protocol version: 3
2014-06-04 04:26:17 30379 [Note] WSREP: Service thread queue flushed.
2014-06-04 04:26:17 30379 [Note] WSREP: Member 1.0 (vagrant-ubuntu-precise-64) synced with group.
2014-06-04 04:26:18 30379 [Warning] WSREP: discarding established (time wait) ba14c929 (tcp://10.0.2.15:10031) 
2014-06-04 04:26:19 30379 [Note] WSREP: (b70d4ba3, 'tcp://0.0.0.0:10011') turning message relay requesting off
2014-06-04 04:26:19 30379 [Note] WSREP:  cleaning up ba14c929 (tcp://10.0.2.15:10031)
2014-06-04 04:26:20 30379 [Note] WSREP: (b70d4ba3, 'tcp://0.0.0.0:10011') turning message relay requesting on, nonlive peers: 
2014-06-04 04:26:21 30379 [Note] WSREP: evs::proto(b70d4ba3, OPERATIONAL, view_id(REG,b70d4ba3,126)):  detected new message source ba14c929
2014-06-04 04:26:21 30379 [Note] WSREP: evs::proto(b70d4ba3, OPERATIONAL, view_id(REG,b70d4ba3,126)):  shift to GATHER due to foreign message from ba14c929
2014-06-04 04:26:21 30379 [Note] WSREP: evs::proto(b70d4ba3, OPERATIONAL, view_id(REG,b70d4ba3,126)):  state change: OPERATIONAL -> GATHER
2014-06-04 04:26:23 30379 [Note] WSREP: (b70d4ba3, 'tcp://0.0.0.0:10011') turning message relay requesting off
2014-06-04 04:26:26 30379 [Note] WSREP: evs::proto(b70d4ba3, GATHER, view_id(REG,b70d4ba3,126)): setting b9aa36a7 inactive in asymmetry elimination
2014-06-04 04:26:26 30379 [Note] WSREP: evs::proto(b70d4ba3, GATHER, view_id(REG,b70d4ba3,126)): before asym elimination
1 1 1 
2014-06-04 04:26:26 30379 [Note] WSREP: evs::proto(b70d4ba3, GATHER, view_id(REG,b70d4ba3,126)): after asym elimination
1 0 1 
2014-06-04 04:26:26 30379 [Note] WSREP: evs::proto(b70d4ba3, GATHER, view_id(REG,b70d4ba3,126)): sending install message
2014-06-04 04:26:26 30379 [Note] WSREP: evs::proto(b70d4ba3, GATHER, view_id(REG,b70d4ba3,126)): delayed b9aa36a7 requesting range [10,9]
2014-06-04 04:26:26 30379 [Note] WSREP: evs::proto(b70d4ba3, GATHER, view_id(REG,b70d4ba3,126)):  state change: GATHER -> INSTALL
2014-06-04 04:26:26 30379 [Note] WSREP: evs::proto(b70d4ba3, INSTALL, view_id(REG,b70d4ba3,126)):  state change: INSTALL -> OPERATIONAL
2014-06-04 04:26:26 30379 [Note] WSREP: evs::proto(b70d4ba3, INSTALL, view_id(REG,b70d4ba3,126)):  delivering view view(view_id(TRANS,b70d4ba3,126) memb {
    b70d4ba3,0
} joined {
} left {
} partitioned {
    b9aa36a7,0
})
2014-06-04 04:26:26 30379 [Note] WSREP: evs::proto(b70d4ba3, OPERATIONAL, view_id(REG,b70d4ba3,127)): delivering view view(view_id(REG,b70d4ba3,127) memb {
    b70d4ba3,0
    ba14c929,0
} joined {
    ba14c929,0
} left {
} partitioned {
    b9aa36a7,0
})
2014-06-04 04:26:26 30379 [Note] WSREP: declaring ba14c929 at tcp://10.0.2.15:10031 stable
2014-06-04 04:26:26 30379 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2014-06-04 04:26:26 30379 [Note] WSREP: Flow-control interval: [16, 16]
2014-06-04 04:26:26 30379 [Note] WSREP: Received NON-PRIMARY.
2014-06-04 04:26:26 30379 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 1293933)
2014-06-04 04:26:26 30379 [Note] WSREP: New cluster view: global state: b70e0409-eb8d-11e3-aeac-53684f5182d6:1293933, view# -1: non-Primary, number of nodes: 1, my index: 0, protocol version 2
2014-06-04 04:26:26 30379 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2014-06-04 04:26:26 30379 [Note] WSREP: evs::proto(b70d4ba3, OPERATIONAL, view_id(REG,b70d4ba3,127)):  detected new message source b9aa36a7
2014-06-04 04:26:26 30379 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 2
2014-06-04 04:26:26 30379 [Note] WSREP: evs::proto(b70d4ba3, OPERATIONAL, view_id(REG,b70d4ba3,127)):  shift to GATHER due to foreign message from b9aa36a7
2014-06-04 04:26:26 30379 [Note] WSREP: Flow-control interval: [23, 23]
2014-06-04 04:26:26 30379 [Note] WSREP: evs::proto(b70d4ba3, OPERATIONAL, view_id(REG,b70d4ba3,127)):  state change: OPERATIONAL -> GATHER
2014-06-04 04:26:26 30379 [Note] WSREP: Received NON-PRIMARY.
2014-06-04 04:26:26 30379 [Note] WSREP: New cluster view: global state: b70e0409-eb8d-11e3-aeac-53684f5182d6:1293933, view# -1: non-Primary, number of nodes: 2, my index: 0, protocol version 2
2014-06-04 04:26:26 30379 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2014-06-04 04:26:26 30379 [Note] WSREP: evs::proto(b70d4ba3, GATHER, view_id(REG,b70d4ba3,127)): sending install message
2014-06-04 04:26:26 30379 [Note] WSREP: evs::proto(b70d4ba3, GATHER, view_id(REG,b70d4ba3,127)):  state change: GATHER -> INSTALL
2014-06-04 04:26:26 30379 [Note] WSREP: evs::proto(b70d4ba3, INSTALL, view_id(REG,b70d4ba3,127)):  state change: INSTALL -> OPERATIONAL
2014-06-04 04:26:26 30379 [Note] WSREP: evs::proto(b70d4ba3, INSTALL, view_id(REG,b70d4ba3,127)):  delivering view view(view_id(TRANS,b70d4ba3,127) memb {
    b70d4ba3,0
    ba14c929,0
} joined {
} left {
} partitioned {
})
2014-06-04 04:26:26 30379 [Note] WSREP: evs::proto(b70d4ba3, OPERATIONAL, view_id(REG,b70d4ba3,128)): delivering view view(view_id(REG,b70d4ba3,128) memb {
    b70d4ba3,0
    b9aa36a7,0
    ba14c929,0
} joined {
    b9aa36a7,0
} left {
} partitioned {
})
2014-06-04 04:26:26 30379 [Note] WSREP: declaring b9aa36a7 at tcp://10.0.2.15:10021 stable
2014-06-04 04:26:26 30379 [Note] WSREP: declaring ba14c929 at tcp://10.0.2.15:10031 stable
2014-06-04 04:26:26 30379 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 3
2014-06-04 04:26:26 30379 [Note] WSREP: Flow-control interval: [28, 28]
2014-06-04 04:26:26 30379 [Note] WSREP: Received NON-PRIMARY.
2014-06-04 04:26:26 30379 [Note] WSREP: New cluster view: global state: b70e0409-eb8d-11e3-aeac-53684f5182d6:1293933, view# -1: non-Primary, number of nodes: 3, my index: 0, protocol version 2
2014-06-04 04:26:26 30379 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.

Log from node3 (ba14c929):

2014-06-04 04:26:16 30522 [Note] WSREP: gcomm: connecting to group 'my_wsrep_cluster', peer '10.0.2.15:10021'
2014-06-04 04:26:16 30522 [Note] WSREP: evs::proto(ba14c929, CLOSED, view_id(TRANS,ba14c929,124)):  state change: CLOSED -> JOINING
2014-06-04 04:26:16 30522 [Note] WSREP: (ba14c929, 'tcp://0.0.0.0:10031') turning message relay requesting on, nonlive peers: tcp://10.0.2.15:10011 
2014-06-04 04:26:17 30522 [Note] WSREP: evs::proto(ba14c929, JOINING, view_id(TRANS,ba14c929,124)):  detected new message source b9aa36a7
2014-06-04 04:26:17 30522 [Note] WSREP: evs::proto(ba14c929, JOINING, view_id(TRANS,ba14c929,124)):  shift to GATHER due to foreign message from b9aa36a7
2014-06-04 04:26:17 30522 [Note] WSREP: evs::proto(ba14c929, JOINING, view_id(TRANS,ba14c929,124)):  state change: JOINING -> GATHER
2014-06-04 04:26:17 30522 [Note] WSREP: evs::proto(ba14c929, GATHER, view_id(TRANS,ba14c929,124)):  detected new message source b70d4ba3
2014-06-04 04:26:17 30522 [Note] WSREP: evs::proto(ba14c929, GATHER, view_id(TRANS,ba14c929,124)):  shift to GATHER due to foreign message from b70d4ba3
2014-06-04 04:26:17 30522 [Note] WSREP: gcomm: connected
2014-06-04 04:26:17 30522 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
2014-06-04 04:26:17 30522 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
2014-06-04 04:26:17 30522 [Note] WSREP: Opened channel 'my_wsrep_cluster'
2014-06-04 04:26:17 30522 [Note] /tmp/galera/local3/mysql/sbin/mysqld: ready for connections.
Version: '5.6.17'  socket: '/tmp/galera/local3/mysql/var/mysqld.sock'  port: 3313  MySQL Community Server (GPL), wsrep_25.5.r4094
2014-06-04 04:26:18 30522 [Note] WSREP: (ba14c929, 'tcp://0.0.0.0:10031') reconnecting to b9aa36a7 (tcp://10.0.2.15:10021), attempt 0
2014-06-04 04:26:18 30522 [Note] WSREP: (ba14c929, 'tcp://0.0.0.0:10031') reconnecting to b70d4ba3 (tcp://10.0.2.15:10011), attempt 0
2014-06-04 04:26:19 30522 [Note] WSREP: (ba14c929, 'tcp://0.0.0.0:10031') reconnecting to b9aa36a7 (tcp://10.0.2.15:10021), attempt 0
2014-06-04 04:26:20 30522 [Note] WSREP: (ba14c929, 'tcp://0.0.0.0:10031') reconnecting to b70d4ba3 (tcp://10.0.2.15:10011), attempt 0
2014-06-04 04:26:23 30522 [Note] WSREP: (ba14c929, 'tcp://0.0.0.0:10031') turning message relay requesting off
2014-06-04 04:26:24 30522 [Warning] WSREP: evs::proto(ba14c929, GATHER, view_id(TRANS,ba14c929,124)) install timer expired
2014-06-04 04:26:24 30522 [Note] WSREP: evs::proto(ba14c929, GATHER, view_id(TRANS,ba14c929,124)): before inspection:
2014-06-04 04:26:24 30522 [Note] WSREP: evs::proto(ba14c929, GATHER, view_id(TRANS,ba14c929,124)): consensus: 0
2014-06-04 04:26:24 30522 [Note] WSREP: evs::proto(ba14c929, GATHER, view_id(TRANS,ba14c929,124)): repr     : 0
2014-06-04 04:26:24 30522 [Note] WSREP: evs::proto(ba14c929, GATHER, view_id(TRANS,ba14c929,124)): state dump for diagnosis:
evs::proto(evs::proto(ba14c929, GATHER, view_id(TRANS,ba14c929,124)), GATHER) {
current_view=view(view_id(TRANS,ba14c929,124) memb {
    ba14c929,0
} joined {
} left {
} partitioned {
}),
input_map=evs::input_map: {aru_seq=-1,safe_seq=-1,node_index=node: {idx=0,range=[0,-1],safe_seq=-1} ,msg_index=,recovery_index=},
fifo_seq=16,
last_sent=-1,
known:
b70d4ba3 at tcp://10.0.2.15:10011
{o=1,s=0,i=0,fs=43,jm=
{v=0,t=4,ut=255,o=1,s=9,sr=-1,as=9,f=4,src=b70d4ba3,srcvid=view_id(REG,b70d4ba3,126),ru=00000000,r=[-1,-1],fs=43,nl=(
    b70d4ba3, {o=1,s=0,f=0,ls=-1,vid=view_id(REG,b70d4ba3,126),ss=9,ir=[10,9],}
    b9aa36a7, {o=1,s=0,f=0,ls=-1,vid=view_id(REG,b70d4ba3,126),ss=9,ir=[10,9],}
    ba14c929, {o=1,s=0,f=0,ls=-1,vid=view_id(TRANS,ba14c929,124),ss=-1,ir=[0,-1],}
)
},
}
b9aa36a7 at tcp://10.0.2.15:10021
{o=1,s=0,i=0,fs=30,jm=
{v=0,t=4,ut=255,o=1,s=9,sr=-1,as=9,f=4,src=b9aa36a7,srcvid=view_id(REG,b70d4ba3,126),ru=00000000,r=[-1,-1],fs=30,nl=(
    b70d4ba3, {o=1,s=0,f=0,ls=-1,vid=view_id(REG,b70d4ba3,126),ss=9,ir=[10,9],}
    b9aa36a7, {o=1,s=0,f=0,ls=-1,vid=view_id(REG,b70d4ba3,126),ss=9,ir=[10,9],}
    ba14c929, {o=1,s=0,f=0,ls=-1,vid=view_id(REG,00000000,0),ss=-1,ir=[-1,-1],}
)
},
}
ba14c929 at 
{o=1,s=0,i=0,fs=-1,jm=
{v=0,t=4,ut=255,o=1,s=-1,sr=-1,as=-1,f=0,src=ba14c929,srcvid=view_id(TRANS,ba14c929,124),ru=00000000,r=[-1,-1],fs=16,nl=(
    b70d4ba3, {o=1,s=0,f=0,ls=-1,vid=view_id(REG,b70d4ba3,126),ss=9,ir=[10,9],}
    b9aa36a7, {o=1,s=0,f=0,ls=-1,vid=view_id(REG,b70d4ba3,126),ss=9,ir=[10,9],}
    ba14c929, {o=1,s=0,f=0,ls=-1,vid=view_id(TRANS,ba14c929,124),ss=-1,ir=[0,-1],}
)
},
}
 }
2014-06-04 04:26:24 30522 [Note] WSREP: evs::proto(ba14c929, GATHER, view_id(TRANS,ba14c929,124)):  setting source b9aa36a7 as inactive due to expired install timer
2014-06-04 04:26:24 30522 [Note] WSREP: no install message received
2014-06-04 04:26:24 30522 [Note] WSREP: evs::proto(ba14c929, GATHER, view_id(TRANS,ba14c929,124)): after inspection:
2014-06-04 04:26:24 30522 [Note] WSREP: evs::proto(ba14c929, GATHER, view_id(TRANS,ba14c929,124)): consensus: 0
2014-06-04 04:26:24 30522 [Note] WSREP: evs::proto(ba14c929, GATHER, view_id(TRANS,ba14c929,124)): repr     : 0
2014-06-04 04:26:26 30522 [Note] WSREP: evs::proto(ba14c929, GATHER, view_id(TRANS,ba14c929,124)):  state change: GATHER -> INSTALL
2014-06-04 04:26:26 30522 [Note] WSREP: evs::proto(ba14c929, INSTALL, view_id(TRANS,ba14c929,124)):  state change: INSTALL -> OPERATIONAL
2014-06-04 04:26:26 30522 [Note] WSREP: evs::proto(ba14c929, INSTALL, view_id(TRANS,ba14c929,124)):  delivering view view(view_id(TRANS,ba14c929,124) memb {
    ba14c929,0
} joined {
} left {
} partitioned {
})
2014-06-04 04:26:26 30522 [Note] WSREP: evs::proto(ba14c929, OPERATIONAL, view_id(REG,b70d4ba3,127)): delivering view view(view_id(REG,b70d4ba3,127) memb {
    b70d4ba3,0
    ba14c929,0
} joined {
    b70d4ba3,0
} left {
} partitioned {
})
2014-06-04 04:26:26 30522 [Note] WSREP: declaring b70d4ba3 at tcp://10.0.2.15:10011 stable
2014-06-04 04:26:26 30522 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 1, memb_num = 2
2014-06-04 04:26:26 30522 [Note] WSREP: Flow-control interval: [23, 23]
2014-06-04 04:26:26 30522 [Note] WSREP: Received NON-PRIMARY.
2014-06-04 04:26:26 30522 [Note] WSREP: evs::proto(ba14c929, OPERATIONAL, view_id(REG,b70d4ba3,127)):  detected new message source b9aa36a7
2014-06-04 04:26:26 30522 [Note] WSREP: evs::proto(ba14c929, OPERATIONAL, view_id(REG,b70d4ba3,127)):  shift to GATHER due to foreign message from b9aa36a7
2014-06-04 04:26:26 30522 [Note] WSREP: evs::proto(ba14c929, OPERATIONAL, view_id(REG,b70d4ba3,127)):  state change: OPERATIONAL -> GATHER
2014-06-04 04:26:26 30522 [Note] WSREP: New cluster view: global state: b70e0409-eb8d-11e3-aeac-53684f5182d6:1293933, view# -1: non-Primary, number of nodes: 2, my index: 1, protocol version -1
2014-06-04 04:26:26 30522 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2014-06-04 04:26:26 30522 [Note] WSREP: evs::proto(ba14c929, GATHER, view_id(REG,b70d4ba3,127)):  state change: GATHER -> INSTALL
2014-06-04 04:26:26 30522 [Note] WSREP: evs::proto(ba14c929, INSTALL, view_id(REG,b70d4ba3,127)):  state change: INSTALL -> OPERATIONAL
2014-06-04 04:26:26 30522 [Note] WSREP: evs::proto(ba14c929, INSTALL, view_id(REG,b70d4ba3,127)):  delivering view view(view_id(TRANS,b70d4ba3,127) memb {
    b70d4ba3,0
    ba14c929,0
} joined {
} left {
} partitioned {
})
2014-06-04 04:26:26 30522 [Note] WSREP: evs::proto(ba14c929, OPERATIONAL, view_id(REG,b70d4ba3,128)): delivering view view(view_id(REG,b70d4ba3,128) memb {
    b70d4ba3,0
    b9aa36a7,0
    ba14c929,0
} joined {
    b9aa36a7,0
} left {
} partitioned {
})
2014-06-04 04:26:26 30522 [Note] WSREP: declaring b70d4ba3 at tcp://10.0.2.15:10011 stable
2014-06-04 04:26:26 30522 [Note] WSREP: declaring b9aa36a7 at tcp://10.0.2.15:10021 stable
2014-06-04 04:26:26 30522 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 2, memb_num = 3
2014-06-04 04:26:26 30522 [Note] WSREP: Flow-control interval: [28, 28]
2014-06-04 04:26:26 30522 [Note] WSREP: Received NON-PRIMARY.
2014-06-04 04:26:26 30522 [Note] WSREP: New cluster view: global state: b70e0409-eb8d-11e3-aeac-53684f5182d6:1293933, view# -1: non-Primary, number of nodes: 3, my index: 2, protocol version -1
2014-06-04 04:26:26 30522 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
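
To follow the order of events across both logs, the EVS state transitions can be pulled out with a small script. This is only a sketch: the regular expression is written against the line format seen in the logs above, and `state_changes` is a hypothetical helper name, not part of Galera.

```python
import re

# Matches lines of the form seen in the logs above, e.g.
# "... WSREP: evs::proto(ba14c929, GATHER, view_id(...)):  state change: GATHER -> INSTALL"
STATE_CHANGE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) .*"
    r"evs::proto\((?P<node>[0-9a-f]{8}),.*"
    r"state change: (?P<src>\w+) -> (?P<dst>\w+)"
)

def state_changes(lines):
    """Return (timestamp, node, from_state, to_state) for each EVS state change line."""
    found = []
    for line in lines:
        m = STATE_CHANGE.search(line)
        if m:
            found.append((m["ts"], m["node"], m["src"], m["dst"]))
    return found

# Three lines copied from the node1 log; the middle one is not a state change.
sample = [
    "2014-06-04 04:26:21 30379 [Note] WSREP: evs::proto(b70d4ba3, OPERATIONAL, view_id(REG,b70d4ba3,126)):  state change: OPERATIONAL -> GATHER",
    "2014-06-04 04:26:26 30379 [Note] WSREP: Received NON-PRIMARY.",
    "2014-06-04 04:26:26 30379 [Note] WSREP: evs::proto(b70d4ba3, GATHER, view_id(REG,b70d4ba3,126)):  state change: GATHER -> INSTALL",
]

for row in state_changes(sample):
    print(row)
```

Running it over the concatenated node1 and node3 logs and sorting by timestamp gives a compact timeline of the GATHER/INSTALL/OPERATIONAL cycles.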

Joining node3 somehow makes the existing node1/node2 cluster split, possibly because of this asymmetry elimination step logged on node1:

2014-06-04 04:26:26 30379 [Note] WSREP: evs::proto(b70d4ba3, GATHER, view_id(REG,b70d4ba3,126)): after asym elimination
1 0 1 
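
One way to read the "1 1 1" and "1 0 1" rows is as per-node activity flags. Assuming the columns follow the known nodes sorted by ID (an assumption, since the log does not state the column order), the cleared column lines up with b9aa36a7 (node2), which agrees with the "setting b9aa36a7 inactive in asymmetry elimination" line in the node1 log. A minimal sketch of that reading, with `dropped_nodes` as a hypothetical helper:

```python
# Assumption: columns are ordered by node ID, as in the view member lists.
nodes = sorted(["b70d4ba3", "ba14c929", "b9aa36a7"])

def dropped_nodes(before, after, nodes):
    """Return the node IDs whose flag went from 1 to 0 between the two rows."""
    b = [int(x) for x in before.split()]
    a = [int(x) for x in after.split()]
    if not (len(b) == len(a) == len(nodes)):
        raise ValueError("row length does not match node count")
    return [n for n, x, y in zip(nodes, b, a) if x == 1 and y == 0]

print(dropped_nodes("1 1 1", "1 0 1", nodes))
```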

There are three issues here, which should be dealt with in the following order:

  1. Why is the PC (primary component) lost on the split/remerge?
  2. Why does the joining node3 decide to declare node2 nonoperational, causing the split?
  3. Asymmetry elimination should preferably favor nodes that are already in the current group.
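
For the first question, the point where the PC first goes away can be illustrated with a simplified majority rule. The sketch assumes that a component stays primary only if it contains strictly more than half of the members of the previous primary view, with equal weights. That is a simplification and not Galera's actual PC code, and `has_quorum` is a hypothetical name. The previous primary membership is inferred from the "view# 2: Primary, number of nodes: 2" line and the node IDs in the logs.

```python
def has_quorum(prev_primary, current_members):
    """Simplified equal-weight majority check (an assumption, not Galera's code)."""
    survivors = set(prev_primary) & set(current_members)
    return 2 * len(survivors) > len(prev_primary)

# Inferred from the logs: the last primary view held node1 and node2.
prev_primary = {"b70d4ba3", "b9aa36a7"}

# Regular views delivered on node1 after node3 starts joining.
view_127 = {"b70d4ba3", "ba14c929"}              # node2 partitioned
view_128 = {"b70d4ba3", "b9aa36a7", "ba14c929"}  # node2 merged back

print(has_quorum(prev_primary, view_127))
print(has_quorum(prev_primary, view_128))
```

Under this simplified rule, view 127 has no majority of the previous primary members, which is consistent with the first "Received NON-PRIMARY". The same rule would accept view 128, yet the logs still report non-Primary there, so the rule alone does not answer the question.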

from galera.

dirtysalt avatar dirtysalt commented on August 22, 2024

Fixed in 7e5ddba.


temeo avatar temeo commented on August 22, 2024

Reassigned for further EVS side investigation.

