Notes:
- TestInitialElection2A prints:
  `warning: term changed even though there were no failures, term1: 4, term2: 24`
  In test_test.go the tester just sleeps for a while here, so there is no network failure at this point.
```go
// sleep a bit to avoid racing with followers learning of the
// election, then check that all peers agree on the term.
time.Sleep(50 * time.Millisecond)
term1 := cfg.checkTerms()
if term1 < 1 {
	t.Fatalf("term is %v, but should be at least 1", term1)
}

// does the leader+term stay the same if there is no network failure?
time.Sleep(2 * RaftElectionTimeout)
term2 := cfg.checkTerms()
if term1 != term2 {
	fmt.Printf("warning: term changed even though there were no failures, term1: %d, term2: %d\n", term1, term2)
}
```
The lab page also says the tests check "for the leader to remain the leader if there are no failures", which my code could not do yet.
The mechanism that keeps the leader in place is the heartbeat: the empty AppendEntries RPC makes each Follower reset its election timeout. That is the cause of the warning.
Solution: in `ticker()`, the goroutine should sleep for the election timeout first, and only then check the state (Follower/Candidate) to decide whether to start an election. I had the order wrong and checked the state first. Stupid mistake; I should go back to primary school.
Spent so much time on TestBackup2B.
Flow:
The first index in the log is "1", not "0"; see details in the bugs below.
- Initialization.
- In a for loop, the tester keeps sending command `index*100` to the servers, 3 times in total.
- `nd, cmd := cfg.nCommitted(index)` counts the servers that think the log entry at `index` is committed: `nd` is the number of servers that have committed the entry, and `cmd` is the command at that index.
- `xindex := cfg.one(index*100, servers, false)` does one complete agreement. In a loop with a 10-second timeout, it first picks out the current leader and calls `Start(command)` to append the new log entry to the leader. Then, if a leader exists, it keeps checking in a 2-second timeout loop whether the other servers have committed the new entry, using `nd, cmd1 := cfg.nCommitted(index)`. Of course, the servers receive the log entries through the periodic heartbeats.
- At last, the agreement applies the command to the servers, 3 times in total.
- Only the leader committed logs; I forgot to commit logs on the Followers.
- The first index should be 1, which Figure 2 states; why didn't I read it more carefully before?
- Each command is sent to each peer just once.
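The "forgot to commit on the Followers" bug corresponds to the last AppendEntries rule in Figure 2 of the Raft paper. A minimal sketch of that rule as a pure function (the function name and parameters are mine, not the lab's API):

```go
package main

import "fmt"

// followerCommit applies the Figure 2 rule the bug missed: in the
// AppendEntries handler, after appending entries, the follower advances its
// own commitIndex to min(leaderCommit, index of last new entry), and never
// moves it backwards.
func followerCommit(commitIndex, leaderCommit, lastNewEntryIndex int) int {
	if leaderCommit > commitIndex {
		if leaderCommit < lastNewEntryIndex {
			return leaderCommit
		}
		return lastNewEntryIndex
	}
	return commitIndex
}

func main() {
	fmt.Println(followerCommit(3, 7, 5)) // 5: capped by the last new entry
	fmt.Println(followerCommit(3, 4, 5)) // 4: capped by leaderCommit
	fmt.Println(followerCommit(3, 2, 5)) // 3: never move backwards
}
```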
Test that a follower participates after disconnect and re-connect.
- After one of the servers disconnects from the network, the leader and the remaining servers can't agree. Can't the leader itself commit?
Solution: I forgot to commit the log entries on the Leader when entries are appended in `Start(command)`, so when the leader later counts the servers that have committed the entry at that index, the count is 1 too small (it doesn't count the leader itself).
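A sketch of the leader-side count that was 1 short. The `matchIndex` slice and `countReplicated` helper are illustrative stand-ins for the lab's bookkeeping, assuming `matchIndex[i]` holds the highest index known replicated on peer `i`:

```go
package main

import "fmt"

// countReplicated counts how many servers hold the entry at index n.
// The leader itself must be included: its own log always contains the entry,
// and forgetting this made the count 1 too small.
func countReplicated(matchIndex []int, me, n int) int {
	count := 1 // the leader's own copy
	for i, m := range matchIndex {
		if i != me && m >= n {
			count++
		}
	}
	return count
}

func main() {
	matchIndex := []int{0, 3, 3, 2, 1}                // leader is server 0
	fmt.Println(countReplicated(matchIndex, 0, 3))    // 3: leader plus servers 1 and 2
}
```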
- When the server re-connects to the network, the leader and the other servers can't agree. The re-connected server keeps hitting its election timeout and starting elections; the leader stops sending heartbeats, and the cluster breaks down.
Solution: when the server comes back, its term is larger than the other servers' terms (it kept asking for election while disconnected). So when the leader sends a heartbeat to the returning server, the server with the fresher term replies false. The leader must handle that reply by updating its own term and becoming a Follower, letting a new election start.
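A minimal sketch of that step-down rule from Figure 2 ("if RPC request or response contains term T > currentTerm: set currentTerm = T, convert to follower"). The `server` type and `maybeStepDown` name are illustrative:

```go
package main

import "fmt"

// server is an illustrative stand-in for the lab's Raft peer.
type server struct {
	currentTerm int
	votedFor    int // -1 means "voted for no one"
	state       string
}

// maybeStepDown handles a term seen in any RPC request or reply: a larger
// term forces this server to adopt it, clear its vote, and become a Follower.
func (s *server) maybeStepDown(seenTerm int) bool {
	if seenTerm > s.currentTerm {
		s.currentTerm = seenTerm // adopt the newer term
		s.votedFor = -1          // the old vote belonged to the old term
		s.state = "Follower"     // a stale leader must step down
		return true
	}
	return false
}

func main() {
	s := &server{currentTerm: 5, votedFor: 2, state: "Leader"}
	s.maybeStepDown(9)                  // reply from the rejoined server, term 9
	fmt.Println(s.state, s.currentTerm) // Follower 9
}
```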
- As the picture above shows, `lastIndex` fails to update, which means the logs were not appended successfully to the previously disconnected server.
Solution: in `AppendEntries()`, if `args.PrevLogIndex > len(rf.logEntry)-1`, the handler should not return immediately; otherwise, the new entries with older indexes will never be appended when the follower lacks the older entries.
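A sketch of the consistency check this bug lives in, with the log represented as a plain slice of entry terms and a dummy entry at index 0 so real entries start at index 1 (matching the notes). The function name is mine; the point is that a too-short log produces an explicit `false` reply so the leader can back `nextIndex` up and retry:

```go
package main

import "fmt"

// consistent reports whether the follower's log contains an entry at
// prevLogIndex whose term matches prevLogTerm. Returning false (instead of
// silently dropping the RPC) lets the leader back nextIndex up and resend
// the older entries the follower is missing.
func consistent(logTerms []int, prevLogIndex, prevLogTerm int) bool {
	if prevLogIndex > len(logTerms)-1 {
		return false // follower's log is too short: reply false, don't just return
	}
	return logTerms[prevLogIndex] == prevLogTerm
}

func main() {
	log := []int{0, 1, 1, 2}           // dummy at 0; entries at indexes 1..3
	fmt.Println(consistent(log, 5, 2)) // false: log too short
	fmt.Println(consistent(log, 3, 1)) // false: term mismatch at index 3
	fmt.Println(consistent(log, 3, 2)) // true: leader may append after index 3
}
```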
When most of the servers fail, all of the new entries stay uncommitted and are never applied. But when the servers come back, a new election starts and progress resumes.
When several commands are submitted concurrently, the leader must ensure that commands are processed one at a time, and that none is lost due to the concurrency.
Start -> add entry 101 to the leader in the network -> disconnect the leader -> add entries 102, 103, 104 privately to the disconnected old leader -> add entry 103 to the network -> disconnect the current leader -> reconnect the old leader -> add entry 104 to the network -> reconnect the second disconnected leader -> add entry 105 to the network.
- Mistakenly set `PrevLogIndex`, so in the `AppendEntries` RPC the rule from §5.3 of the paper (find the latest log entry on which leader and follower agree, and delete the follower's logs after that point) was not satisfied.
- When the current leader disconnects and the old leader comes back, the network starts an election, but the election keeps going for a long time.
Reason: a variable conflict.
Server\Round | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
---|---|---|---|---|---|---|---|---|
0 | [1]1* | [1]51* | [1]x | [1]x | [50]51 | [50]101 | [50]101 | [50]102 |
1 | [1]1 | x | [2]51* | [2]101* | [2]x | [2]x | [50]101 | [50]102 |
2 | [1]1 | x | [2]51 | [2]101 | [2]x | [2]x | [50]101 | [50]102 |
3 | [1]1 | x | [2]51 | [5]x | [50]51* | [50]101* | [50]101* | [50]102* |
4 | [1]1 | [1]51 | [25]x | [45]x | [50]51 | [50]101 | [50]101 | [50]102 |
This is the basic demo of the procedure. The number in each cell is the `CommitIndex`; `[x]` means the x-th term; `*` marks the actual leader in the network; a bare `x` means the server is disconnected. Italic means not committed; bold means committed.
One important point is between rounds 4 and 5, when the old leader S0 comes back. It first sends new logs (with the same term 1 as its old leadership) to servers S3 and S4. S3 and S4 both have a larger term, which makes S0 update its term. S3 is also more up-to-date, because it has logs with a larger term, so it will not accept the entries. S0 sends to S4, which has the same logs as itself, so no logs are added either. The election timeouts are therefore not reset, and elections start. S3 is the final winner thanks to its logs with the larger term, but it takes several terms to finish. Sometimes the test fails because no leader gets elected even after many rounds of elections. So I set the election timeouts to be as separated as possible, like `rf.electionTimeout = time.Millisecond * time.Duration(rand.Intn(300)+200)`, as §5.2 of the paper suggests.
One more point is the update of `rf.nextIndex[]`: going from round 4 to round 5, the new leader S3 needs to send 50 logs from term 2 to S0 and S4. The original method decrements `nextIndex` one entry at a time, but that costs so many RPCs that the test fails early. The details are in Bug 2.
- When the servers that were partitioned at first and a later-disconnected server are all brought back, the leader should be the later-disconnected server, because it has more up-to-date logs. Some problems existed in `RequestVote`.
Solved: I forgot to handle the "have already voted" situation when `args.Term > rf.currentTerm`.
- After bringing back the old leader and electing the up-to-date server as leader, too many conflicting entries make the one-by-one decrement of `nextIndex[i]` very slow, so the test fails.
Solution: as the paper notes, the follower can include in its reply the term of the conflicting entry and the first index it stores for that term. This reduces the back-up time a lot, and the test passes.
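A sketch of the follower's side of that fast back-up. The `reply` fields and function name are illustrative (the lab leaves the exact protocol to you); the log is again a slice of entry terms with a dummy entry at index 0:

```go
package main

import "fmt"

// reply carries the conflict information the follower reports on a mismatch:
// the term of its conflicting entry, and the first index it stores for that
// term. The leader can then jump nextIndex there in one step instead of
// decrementing once per RPC.
type reply struct {
	conflictTerm  int
	conflictIndex int
}

// followerConflict fills the reply for a term mismatch at prevLogIndex.
func followerConflict(logTerms []int, prevLogIndex int) reply {
	t := logTerms[prevLogIndex]
	first := prevLogIndex
	for first > 1 && logTerms[first-1] == t {
		first-- // walk back to the first entry of the conflicting term
	}
	return reply{conflictTerm: t, conflictIndex: first}
}

func main() {
	// follower log terms: indexes 1..6 hold terms 1,1,2,2,2,2
	log := []int{0, 1, 1, 2, 2, 2, 2}
	r := followerConflict(log, 6)
	fmt.Println(r.conflictTerm, r.conflictIndex) // 2 3: skip the whole term-2 run
}
```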
If 2B is done properly, 2C is very easy to complete: just finish `persist()` and `readPersist()`.
`TestFigure8Unreliable2C`, the toughest test in 2C, generates a large number of new logs while throwing the network into chaos. During my debugging, the terms became chaotic too. After reading the "Term confusion" section of the students-guide-to-raft linked from the lab page, I realized my code could not cope with old RPC replies: when the leader receives a reply, it should compare its current term with the term it sent in the original RPC arguments; if they differ, drop the reply and return. This works.
Main tasks:
- Write Snapshot codes
- Rewrite every variable involving index and term (the log is now offset by the snapshot). It's very annoying.
The first test doesn't need the InstallSnapshot RPC yet.
Most of the workload is rewriting the true (absolute) index and term in every place.
- In this crash test, a server crashes, and when it comes back it calls `Make()` again. `Make()` needs to set `rf.lastApplied = rf.lastIncludedIndex`, which is read from the persisted state.
- Almost the same as before, but this time set `rf.commitIndex = rf.lastIncludedIndex` after `Make()`.
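The two crash-recovery notes above can be sketched together as one restore step. The `raftLite` type and `restoreFromSnapshot` helper are illustrative stand-ins for what `Make()` does after `readPersist()`:

```go
package main

import "fmt"

// raftLite is an illustrative stand-in for the lab's Raft peer, holding just
// the fields involved in crash recovery with snapshots.
type raftLite struct {
	lastIncludedIndex int // read back from the persisted snapshot metadata
	lastApplied       int
	commitIndex       int
}

// restoreFromSnapshot applies the rule from the notes: after a restart, both
// lastApplied and commitIndex start at lastIncludedIndex, because everything
// up to the snapshot is already committed and already applied.
func (rf *raftLite) restoreFromSnapshot() {
	rf.lastApplied = rf.lastIncludedIndex // don't re-apply snapshotted entries
	rf.commitIndex = rf.lastIncludedIndex // they are certainly committed
}

func main() {
	rf := &raftLite{lastIncludedIndex: 42}
	rf.restoreFromSnapshot()
	fmt.Println(rf.lastApplied, rf.commitIndex) // 42 42
}
```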