
sofastack / sofa-jraft


A production-grade java implementation of RAFT consensus algorithm.

Home Page: https://www.sofastack.tech/projects/sofa-jraft/

License: Apache License 2.0

Languages: Java 99.93%, Shell 0.07%
Topics: raft-algorithm, raft-java, raft, sofastack, sofa-jraft, sofa-bolt, distributed-consensus-algorithms, java, consensus

sofa-jraft's Introduction

SOFAJRaft


Chinese (中文)

Overview

SOFAJRaft is a production-grade, high-performance Java implementation of the RAFT consensus algorithm that supports MULTI-RAFT-GROUP for high-load, low-latency scenarios. With SOFAJRaft you can focus on your business logic while SOFAJRaft handles all of the RAFT-related technical challenges. SOFAJRaft is user-friendly and ships with several examples, making it easy to understand and use.
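
For a feel of the API, here is a minimal sketch of bootstrapping one node of a three-node group, modeled on the bundled counter example (the data path, ports, and CounterStateMachine are illustrative assumptions):

    import java.io.File;
    import com.alipay.sofa.jraft.JRaftUtils;
    import com.alipay.sofa.jraft.Node;
    import com.alipay.sofa.jraft.RaftGroupService;
    import com.alipay.sofa.jraft.option.NodeOptions;

    String dataPath = "/tmp/server1";
    NodeOptions nodeOptions = new NodeOptions();
    nodeOptions.setFsm(new CounterStateMachine());                       // your StateMachine implementation
    nodeOptions.setLogUri(dataPath + File.separator + "log");            // raft log storage
    nodeOptions.setRaftMetaUri(dataPath + File.separator + "raft_meta"); // term / votedFor metadata
    nodeOptions.setSnapshotUri(dataPath + File.separator + "snapshot");  // snapshot storage
    nodeOptions.setInitialConf(JRaftUtils.getConfiguration("127.0.0.1:8081,127.0.0.1:8082,127.0.0.1:8083"));

    RaftGroupService service = new RaftGroupService(
        "counter", JRaftUtils.getPeerId("127.0.0.1:8081"), nodeOptions);
    Node node = service.start(); // starts the RPC server and the raft node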

Features

  • Leader election and priority-based semi-deterministic leader election
  • Log replication and recovery
  • Read-only member (learner)
  • Snapshot and log compaction
  • Cluster membership management: adding, removing, and replacing nodes, etc.
  • Leader transfer mechanism for reboots, load balancing, etc.
  • Symmetric network partition tolerance
  • Asymmetric network partition tolerance
  • Fault tolerance: a minority of failed nodes doesn't affect the overall availability of the system
  • Manual cluster recovery is available after a majority failure
  • Linearizable reads via ReadIndex/LeaseRead (see the sketch after this list)
  • Replication pipeline
  • Rich statistics to analyze the performance based on Metrics
  • Passed Jepsen consistency verification test
  • SOFAJRaft includes an embedded distributed KV storage implementation
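
As a sketch of the linearizable-read feature above, a read can be served through Node#readIndex without going through the raft log (the done and fsm names are illustrative assumptions; ReadIndexClosure is JRaft's callback type):

    node.readIndex(new byte[0], new ReadIndexClosure() {
        @Override
        public void run(Status status, long index, byte[] reqCtx) {
            if (status.isOk()) {
                // This replica has applied at least up to `index`,
                // so reading local state here is linearizable.
                done.success(fsm.getValue());
            } else {
                // Fall back, e.g. submit the read as a task through the raft log.
                done.failure(status.getErrorMsg());
            }
        }
    });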

Requirements

Compile requirement: JDK 8+ and Maven 3.2.5+.

Documents

Contribution

How to contribute

Acknowledgement

SOFAJRaft was ported from Baidu's braft with some optimizations and improvements. Thanks to the Baidu braft team for open-sourcing such a great C++ RAFT implementation.

License

SOFAJRaft is licensed under the Apache License 2.0. SOFAJRaft relies on some third-party components whose open-source licenses are also Apache License 2.0. In addition, SOFAJRaft directly references some code (possibly with minor changes) that is licensed under the Apache License 2.0, including:

  • NonBlockingHashMap/NonBlockingHashMapLong in JCTools
  • HashedWheelTimer in Netty; Netty's Pipeline design is also referenced
  • Efficient encoding/decoding of UTF8 String in Protobuf

Community

See our community materials.

Join the user group on Slack

Scan the QR code below with DingTalk (钉钉) to join the SOFAStack user group.

Scan the QR code below with WeChat (微信) to follow our official accounts.

Known Users

These are the companies using SOFAStack (in no particular order). Please leave a comment here to tell us about your scenario so we can make SOFAStack better.

蚂蚁集团 网商银行 恒生电子 数立信息 Paytm 天弘基金 **人保 信美相互 南京银行 民生银行 重庆农商行 中信证券 富滇银行 挖财 拍拍贷 OPPO金融 运满满 译筑科技 杭州米雅信息科技 邦道科技 申通快递 深圳大头兄弟文化 烽火科技 亚信科技 成都云智天下科技 上海溢米辅导 态赋科技 风一科技 武汉易企盈 极致医疗 京东 小象生鲜 北京云族佳 欣亿云网 山东网聪 深圳市诺安赛威 上扬软件 长沙点三 网易云音乐 虎牙直播 **移动 无纸科技 黄金钱包 独木桥网络 wueasy 北京攸乐科技 易宝支付 威马汽车 亿通国际 新华三 klilalagroup

sofa-jraft's People

Contributors

1294566108, alchemyding, brotherlu-xcq, caicancai, claire9910, cmonkey, fengjiachun, funky-eyes, gakkiyomi, haoyann, horizonzy, howie-xu, huangyunbin, hzh0425, javacodercff, killme2008, lfygh, masaimu, nobodyiam, pifuant, qiujiayu, seeflood, shibd, shihuili1218, slievrly, stenicholas, xiaoheng1, ye-xiaowei, yuyang0423, zongtanghu

sofa-jraft's Issues

Getting the list of live followers

As the leader, I want to get the live followers so that I can notify them to process data.

There is com.alipay.sofa.jraft.core.ReplicatorGroupImpl#failureReplicators, but the interface provides no method to retrieve it.
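
Since failureReplicators is not exposed, one possible workaround today (a sketch assuming a leader-side Node handle and a CliClientService; note it probes RPC connectivity, not replication progress):

    // On the leader, Node#listPeers() returns the current configuration.
    final List<PeerId> alive = new ArrayList<>();
    for (final PeerId peer : node.listPeers()) {
        // connect() returns false when the peer is unreachable.
        if (cliClientService.connect(peer.getEndpoint())) {
            alive.add(peer);
        }
    }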

System testing for the Windows environment

Compatibility issues found so far:

  1. File path names: some tests use temporary directories and directories with special characters.
  2. ProtoBufFile save returns false when the file already exists #54

1.2.5 release

  • Unit tests pass
  • Jepsen verification passes
  • Documentation updates
  • Publishing and release notes

Change the JDK version in Travis

We should use OpenJDK instead of Oracle JDK, because Oracle JDK now requires a paid license. OpenJDK 8 and OpenJDK 11 will be used, as these are the two officially long-term-maintained versions. OpenJDK 11 also brings many language changes, so the corresponding compatibility tests are needed.

Question: can one port serve multiple raft groups?

A few questions:
If I want multiple raft groups on a single node, can they all listen on one port and be distinguished by group id? Or does each raft group need to listen on its own port?
I also see an idx field in PeerId; is that field currently used for anything?

One more thing I don't understand: in the CounterClient example, the client's RPC call only uses the Endpoint (IP + port); the groupId isn't used at all. Does that mean one port can only correspond to one groupId?

cliClientService.getRpcClient().invokeWithCallback(leader.getEndpoint().toString(), request...
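
For reference, multiple groups can share one port, since the group id is carried in every request. A sketch, assuming the RaftGroupService constructor that accepts a shared RpcServer (the exact server-startup call differs across JRaft versions):

    final Endpoint addr = new Endpoint("127.0.0.1", 8081);
    final PeerId serverId = new PeerId(addr, 0);
    // One RPC server shared by every raft group on this node.
    final RpcServer rpcServer = RaftRpcServerFactory.createRaftRpcServer(addr);
    final RaftGroupService groupA = new RaftGroupService("group-a", serverId, optsA, rpcServer);
    final RaftGroupService groupB = new RaftGroupService("group-b", serverId, optsB, rpcServer);
    final Node nodeA = groupA.start(false); // false: don't start the shared server per group
    final Node nodeB = groupB.start(false);
    rpcServer.init(null); // start the shared server once (RpcServer implements Lifecycle<Void>)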

jraft build failure

OS: macOS
JDK: Oracle JDK 8
IDE: IDEA 2018

With the jmh benchmark framework's scope set to test, the project fails to build;
after changing it to provided, running the test cases fails.
(screenshot omitted)

Asynchronous calls on the same RpcClient are applied by the state machine in a different order

I want to submit a large number of requests from a single client. These requests are ordered and must not be reordered. If I call invokeWithCallback in a loop on the same client, jraft presumably cannot guarantee that the state machine applies them in the same order as the invocations. That leaves me with one-at-a-time synchronous calls, which are very slow.
Is there a recommended solution for this?
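
A common client-side pattern is to preserve ordering by chaining: send request N+1 only after request N completes, without blocking a thread. A sketch (invokeAsync is an assumed helper that wraps invokeWithCallback and completes the future from its callback; it is not a JRaft API):

    import java.util.concurrent.CompletableFuture;

    final class OrderedSubmitter {
        private CompletableFuture<Void> tail = CompletableFuture.completedFuture(null);

        // Requests go out strictly one after another, so the raft log (and
        // therefore the state machine) sees them in submission order.
        synchronized CompletableFuture<Object> submit(final Object request) {
            final CompletableFuture<Object> result = tail.thenCompose(ignored -> invokeAsync(request));
            tail = result.handle((r, t) -> null); // keep the chain alive even on failure
            return result;
        }

        private CompletableFuture<Object> invokeAsync(final Object request) {
            // Assumed helper: call invokeWithCallback and complete the
            // returned future from onResponse/onException.
            throw new UnsupportedOperationException("wire this to your rpc client");
        }
    }

This keeps at most one request in flight, trading throughput for ordering; true ordered pipelining would need sequencing support on the server side.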

CounterServer: after killing the leader process and restarting it a while later, it keeps flooding the logs

Sofa-Middleware-Log SLF4J : Actual binding is of type [ com.alipay.remoting Log4j2 ]
2019-03-19 11:35:54 [main] INFO log:30 - Sofa-Middleware-Log SLF4J : Actual binding is of type [ com.alipay.remoting Log4j2 ]
2019-03-19 11:35:56 [main] INFO FSMCallerImpl:188 - Starts FSMCaller successfully.
2019-03-19 11:35:56 [Jraft-FSMCaller-disruptor-0] INFO StateMachineAdapter:79 - onConfigurationCommitted: 127.0.0.1:8081,127.0.0.1:8082,127.0.0.1:8083
2019-03-19 11:35:56 [Jraft-FSMCaller-disruptor-0] INFO SnapshotExecutorImpl:435 - Node <counter/127.0.0.1:8081> onSnapshotLoadDone, last_included_index: 1
last_included_term: 1
peers: "127.0.0.1:8081"
peers: "127.0.0.1:8082"
peers: "127.0.0.1:8083"

2019-03-19 11:35:56 [main] INFO NodeImpl:793 - Node <counter/127.0.0.1:8081> init, term: 1, lastLogId: LogId [index=1, term=1], conf: 127.0.0.1:8081,127.0.0.1:8082,127.0.0.1:8083, old_conf:
2019-03-19 11:35:57 [main] INFO RaftGroupService:139 - Start the RaftGroupService successfully.
Started counter server at port:8081
2019-03-19 11:35:57 [Rpc-netty-server-worker-1-thread-1] WARN RaftRpcServerFactory:263 - JRaft SET bolt.rpc.dispatch-msg-list-in-default-executor to be false for replicator pipeline optimistic.
2019-03-19 11:35:57 [counter/127.0.0.1:8081-AppendEntriesThread0] INFO LocalRaftMetaStorage:121 - Save raft meta, path=/tmp/server1\raft_meta, term=2, votedFor=0.0.0.0:0, cost time=15 ms
2019-03-19 11:35:57 [counter/127.0.0.1:8081-AppendEntriesThread0] WARN NodeImpl:1589 - Node <counter/127.0.0.1:8081> reject term_unmatched AppendEntriesRequest from 127.0.0.1:8082 in term 2 prevLogIndex 2 prevLogTerm 2 localPrevLogTerm 0 lastLogIndex 1 entriesSize 0
2019-03-19 11:35:57 [Jraft-FSMCaller-disruptor-0] INFO StateMachineAdapter:89 - onStartFollowing: LeaderChangeContext [leaderId=127.0.0.1:8082, term=2, status=Status[ENEWLEADER<10011>: Raft node receives message from new leader with higher term.]]
2019-03-19 11:35:57 [Bolt-default-executor-6-thread-2] INFO NodeImpl:2695 - Node <counter/127.0.0.1:8081> received InstallSnapshotRequest lastIncludedLogIndex 2 lastIncludedLogTerm 2 from 127.0.0.1:8082 when lastLogId=LogId [index=1, term=1]
2019-03-19 11:35:57 [Bolt-conn-event-executor-5-thread-1] INFO ClientServiceConnectionEventProcessor:50 - Peer 127.0.0.1:8082 is connected
2019-03-19 11:35:57 [JRaft-Closure-Executor-1] INFO LocalSnapshotStorage:167 - Deleting snapshot /tmp/server1\snapshot\temp
2019-03-19 11:35:57 [JRaft-Closure-Executor-1] ERROR Utils:138 - Fail to close
java.io.IOException
at com.alipay.sofa.jraft.storage.snapshot.local.LocalSnapshotStorage.close(LocalSnapshotStorage.java:251) ~[classes/:?]
at com.alipay.sofa.jraft.storage.snapshot.local.LocalSnapshotWriter.close(LocalSnapshotWriter.java:98) ~[classes/:?]
at com.alipay.sofa.jraft.storage.snapshot.local.LocalSnapshotWriter.close(LocalSnapshotWriter.java:93) ~[classes/:?]
at com.alipay.sofa.jraft.util.Utils.closeQuietly(Utils.java:135) [classes/:?]
at com.alipay.sofa.jraft.storage.snapshot.local.LocalSnapshotCopier.internalCopy(LocalSnapshotCopier.java:113) [classes/:?]
at com.alipay.sofa.jraft.storage.snapshot.local.LocalSnapshotCopier.startCopy(LocalSnapshotCopier.java:85) [classes/:?]
at com.alipay.sofa.jraft.storage.snapshot.local.LocalSnapshotCopier$$Lambda$9/945205179.run(Unknown Source) [classes/:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_45]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_45]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_45]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_45]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_45]
2019-03-19 11:35:57 [Jraft-FSMCaller-disruptor-0] ERROR CounterStateMachine:124 - Raft error: %s
com.alipay.sofa.jraft.error.RaftException: ERROR_TYPE_SNAPSHOT
at com.alipay.sofa.jraft.storage.snapshot.SnapshotExecutorImpl.reportError(SnapshotExecutorImpl.java:660) ~[classes/:?]
at com.alipay.sofa.jraft.storage.snapshot.SnapshotExecutorImpl.loadDownloadingSnapshot(SnapshotExecutorImpl.java:499) ~[classes/:?]
at com.alipay.sofa.jraft.storage.snapshot.SnapshotExecutorImpl.installSnapshot(SnapshotExecutorImpl.java:484) ~[classes/:?]
at com.alipay.sofa.jraft.core.NodeImpl.handleInstallSnapshot(NodeImpl.java:2699) ~[classes/:?]
at com.alipay.sofa.jraft.rpc.impl.core.InstallSnapshotRequestProcessor.processRequest0(InstallSnapshotRequestProcessor.java:51) ~[classes/:?]
at com.alipay.sofa.jraft.rpc.impl.core.InstallSnapshotRequestProcessor.processRequest0(InstallSnapshotRequestProcessor.java:1) ~[classes/:?]
at com.alipay.sofa.jraft.rpc.impl.core.NodeRequestProcessor.processRequest(NodeRequestProcessor.java:58) ~[classes/:?]
at com.alipay.sofa.jraft.rpc.RpcRequestProcessor.handleRequest(RpcRequestProcessor.java:53) ~[classes/:?]
at com.alipay.sofa.jraft.rpc.RpcRequestProcessor.handleRequest(RpcRequestProcessor.java:1) ~[classes/:?]
at com.alipay.remoting.rpc.protocol.RpcRequestProcessor.dispatchToUserProcessor(RpcRequestProcessor.java:224) ~[bolt-1.5.3.jar:?]
at com.alipay.remoting.rpc.protocol.RpcRequestProcessor.doProcess(RpcRequestProcessor.java:145) ~[bolt-1.5.3.jar:?]
at com.alipay.remoting.rpc.protocol.RpcRequestProcessor$ProcessTask.run(RpcRequestProcessor.java:366) ~[bolt-1.5.3.jar:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_45]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_45]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_45]
2019-03-19 11:35:57 [Jraft-FSMCaller-disruptor-0] WARN NodeImpl:2014 - Node <counter/127.0.0.1:8081> got error=Error [type=ERROR_TYPE_SNAPSHOT, status=Status[EIO<1014>: Fail to sync writer]]
2019-03-19 11:35:57 [Jraft-FSMCaller-disruptor-0] INFO StateMachineAdapter:84 - onStopFollowing: LeaderChangeContext [leaderId=127.0.0.1:8082, term=2, status=Status[EBADNODE<10009>: Raft node(leader or candidate) is in error.]]
2019-03-19 11:35:58 [counter/127.0.0.1:8081-AppendEntriesThread0] WARN NodeImpl:1540 - Node <counter/127.0.0.1:8081> is not in active state, current term 2
2019-03-19 11:35:58 [Bolt-default-executor-6-thread-6] WARN NodeImpl:2663 - Node <counter/127.0.0.1:8081> ignore InstallSnapshotRequest as it is not in active state STATE_ERROR
(the two WARN lines above keep repeating continuously)

[PR] Fix a suspected bug in the PreVote flow?

I've recently been studying and implementing Raft myself. I was delighted to discover your team's open-source Java Raft library, read through the code overnight, and learned a great deal.

However, the pre-vote flow looks slightly off. As I understand Raft, pre-vote hinges on two points:
1. The term in a pre-vote request should be the initiator's own term, without term+1 (pre-vote exists precisely to stop a partitioned node from endlessly incrementing its term and disturbing the whole cluster).
2. A node receiving a pre-vote request should check whether lastLeaderTimestamp has exceeded the minimum election timeout; if not, the current leader is still considered valid and the pre-vote should be rejected (to preserve stability).

JRaft's implementation seems problematic on both counts, though the fault may lie in my understanding:
1. The initiator issues the pre-vote with term+1; why?
2. On the receiver, in handlePreVoteRequest, see the highlighted comment in the middle of the code below:

    do {
        if (request.getTerm() < this.currTerm) {
            LOG.info("Node {} ignore PreVote from {} in term {} currTerm {}", this.getNodeId(),
                request.getServerId(), request.getTerm(), this.currTerm);
            // A follower replicator may not be started when this node become leader, so we must check it.
            checkReplicator(candidateId);
            break;
        } else if (request.getTerm() == this.currTerm + 1) {

            // !!! From this point on, granting the vote depends only on the
            // log's term and index. Whether this node is a follower or the
            // leader, any pre-vote whose log is not older gets granted,
            // which defeats the purpose of pre-vote. !!!

            // A follower replicator may not be started when this node become leader, so we must check it.
            // check replicator state
            checkReplicator(candidateId);
        }
        doUnlock = false;
        this.writeLock.unlock();

        final LogId logId = this.logManager.getLastLogId(true);

        doUnlock = true;
        this.writeLock.lock();
        final LogId requestLastLogId = new LogId(request.getLastLogIndex(), request.getLastLogTerm());
        granted = (requestLastLogId.compareTo(logId) >= 0);

        LOG.info(
            "Node {} received PreVote from {} in term {} currTerm {} granted {}, request last logId: {}, current last logId: {}",
            this.getNodeId(), request.getServerId(), request.getTerm(), this.currTerm, granted,
            requestLastLogId, logId);
    } while (false);

    final RequestVoteResponse.Builder responseBuilder = RequestVoteResponse.newBuilder();
    responseBuilder.setTerm(this.currTerm);
    responseBuilder.setGranted(granted);
    return responseBuilder.build();

That is my point of confusion; if I've simply misread the code, I'd appreciate hearing your reasoning.

Based on my understanding, I've modified the source and would like to open a PR. The main changes:
1. handlePreVoteRequest: add an if (Utils.nowMs() - lastLeaderTimestamp <= options.getElectionTimeoutMs()) check at the top (implementing core point 2 above); see the sketch below.
2. handlePreVoteRequest: if this node is the leader and grants the pre-vote (term > currTerm && the request's log is not older), step down.
3. handleElectionTimeout (before calling preVote): remove the if (Utils.nowMs() - lastLeaderTimestamp <= options.getElectionTimeoutMs()) check; it seems redundant there since the election timeout has already fired, and that logic belongs on the receiver side.
4. preVote: req.setTerm(this.currTerm); without the +1!

Please let me know whether my understanding is off.
If it isn't, I hope you'll accept this PR as my small contribution.
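
For concreteness, the guard in change 1 would sit at the top of handlePreVoteRequest's do-while and look roughly like this (a sketch of the reporter's proposal, not the shipped code):

    // Reject the pre-vote if we heard from a valid leader within the
    // election timeout: the current leader is presumed still alive.
    if (Utils.nowMs() - this.lastLeaderTimestamp <= this.options.getElectionTimeoutMs()) {
        granted = false;
        break;
    }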

Make LogStorage extensible via SPI

Your question

Your scenes

Let users plug in their own LogStorage implementation; indeed, LogStorage is probably not the only component that should be SPI-extensible.

Your advice

Implement an SPI loading mechanism; ServiceLoader may need to be extended slightly.

Environment

1.2.4 release

  • Unit tests pass
  • Jepsen verification passes
  • Documentation updates
  • Publishing and release notes

Separate data and index in the LogStorage implementation

Stores like rocksdb/leveldb are generally only suitable for small key/value pairs; performance is not ideal when keys/values get large.

The current default LogStorage is RocksdbLogStorage, built directly on Rocksdb. Users can supply their own implementation, but the default still has room for optimization:

  1. Append log entries into segments, splitting files at a fixed size (say 1 GB). Each segment file is named after the log index of the first entry written to it.
  2. Keep the rocksdb key as the log index, but change the value from the raw log data to the entry's offset within a segment, (file_index, write_position); the value then shrinks to 12 bytes (8 + 4). See the sketch after this list.
  3. Truncating the log also becomes straightforward: delete at segment granularity (locating segments by file name) and remove the corresponding index entries. Some recovery handling is needed here.
  4. Segments can use mmap and group-commit fsync to improve performance.
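
As referenced in item 2, a sketch of the 12-byte index value (the names here are illustrative, not JRaft code):

    import java.nio.ByteBuffer;

    // The rocksdb key stays the log index; the value becomes a fixed
    // 12-byte pointer into a segment file instead of the raw log data.
    final class SegmentPointer {
        final long fileIndex;     // segment file name = first log index it holds
        final int  writePosition; // offset of the entry within the segment

        SegmentPointer(final long fileIndex, final int writePosition) {
            this.fileIndex = fileIndex;
            this.writePosition = writePosition;
        }

        byte[] encode() { // 8 + 4 = 12 bytes
            return ByteBuffer.allocate(12).putLong(this.fileIndex).putInt(this.writePosition).array();
        }

        static SegmentPointer decode(final byte[] value) {
            final ByteBuffer buf = ByteBuffer.wrap(value);
            return new SegmentPointer(buf.getLong(), buf.getInt());
        }
    }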

Documentation translation

All documentation needs to be translated into English, and the readme should be internationalized as well.

ignore PreVote because the leader's lease is still valid

hi,
Operating with two nodes, after several add-peer/remove-peer operations the log below appeared. How can the cluster be recovered from this state? Thanks.

2019-03-29 15:34:27,010 INFO
Bolt-default-executor-6-thread-16 - Node <test/127.0.0.1:8080> ignore PreVote from 127.0.0.1:8081 in term 5 currTerm 4, because the leader 127.0.0.1:8080's lease is still valid.

How mature is jraft, and how widely is it used?

I've read through the jraft source; it looks quite good, with many details carefully considered and a fairly complete Raft implementation.
I'd like to adopt jraft, but how do I convince my colleagues and managers?
Success stories, especially large-scale production deployments, would be persuasive.

Put metrics behind an interface

Allow pluggable metric libraries. The default is dropwizard; users could implement the interface to plug in alternatives such as micrometer.

The ReadLease handling does not look rigorous

After a quick read of the read-lease code, it appears that a lease read only checks whether the node's state is STATE_LEADER.

Consider this case. A {A,B,C} raft group starts out healthy. After a while, A and C become network-partitioned; {A,B} can still reach a majority, so the group keeps working. No new log is written after the partition, so all replicas hold exactly the same data. C's election timeout fires, it starts an election, reaches agreement with B, and becomes the new leader. At that moment A's heartbeat hasn't reached B yet, so A is a stale leader. Before A discovers the new leader, new log entries are written. A's state is still STATE_LEADER, so under the current code it keeps serving reads, and a stale read occurs.

After a new leader comes up, it should wait for a while before serving requests, to make sure the old leader's lease has expired; this wait should exceed the lease period.

I don't see such handling in the code at present, though I may not have looked carefully enough. A sketch of the suggestion follows.
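
A sketch of that suggestion (illustrative names, not JRaft code):

    // A fresh leader refuses lease-based reads until the previous
    // leader's lease must have expired.
    long leaderStartMs;  // recorded when this node becomes leader
    long leasePeriodMs;  // should exceed the old leader's lease period

    boolean safeForLeaseRead(final long nowMs) {
        // Checking state == STATE_LEADER alone is not enough: also wait
        // out one full lease period after winning the election.
        return nowMs - this.leaderStartMs > this.leasePeriodMs;
    }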

Why are nearly half of the 30 most recent stargazers brand-new accounts registered within the last two days?

https://zhuanlan.zhihu.com/p/61034386

Detailed screenshots of everything described below are provided in the Zhihu article above. I hope we can discuss this on the basis of facts; the merits will speak for themselves.

Last weekend and the weekend two weeks before, large numbers of accounts registered within the previous day or two, with no avatars and no real GitHub activity, starred the project in bursts. For example, at 4:18 p.m. Beijing time today, 13 of the 30 most recent stargazers were brand-new accounts registered today or yesterday.

These new accounts all star the same fixed handful of projects that differ completely in category and language, for example jraft plus an IT-management tool with no relation to the internet at all. Some accounts were even registered within two or three hours of each other and starred exactly the same four projects, jraft included.

The net effect of this dense burst of new-account stars is implausible star growth that pushed the project onto that day's trending list.

So: is Alipay the target of a "retaliatory star attack" by gray- or black-market operators? And why does Alipay appear to be the beneficiary of such an "attack"?

ProtoBufFile#save(Message msg, boolean sync) returns false

If the file at ProtoBufFile.path already exists, save will keep returning false.
LocalSnapshotCopier.internalCopy() calls both filter() and copyFile(file),
which can cause LocalSnapshotWriter.sync() to be called twice: the first call returns true and the second returns false.
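
One possible shape of a fix (a sketch, not the shipped code): write to a temporary file and atomically move it over the destination, so an existing file no longer makes the save fail:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.StandardCopyOption;

    static void atomicSave(final byte[] bytes, final Path target) throws IOException {
        final Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
        Files.write(tmp, bytes); // CREATE + TRUNCATE_EXISTING by default
        // REPLACE_EXISTING lets a second save over an existing file succeed.
        Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING, StandardCopyOption.ATOMIC_MOVE);
    }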

A question about multiple RheaKV regions sharing one StateMachine

I've just read through the RheaKV code and learned a great deal, but one thing puzzles me and I'd appreciate some guidance.

RheaKV uses a multi-raft design, yet all regions on a node share a single RocksDB state machine.
That means snapshots are taken at the granularity of all regions on the node.
So when a raft group's leader installs a snapshot on a follower, does it ship all of the data on the leader's node, covering every raft group?

I've looked at this for a long time without finding where data from different raft groups is kept apart. Any pointers would be appreciated.

Error returns from RaftMetaStorage are not handled

Thanks to the developers for their hard work.

Describe the bug

Raft's meta information is critical to its correct operation. LocalRaftMetaStorage does catch IO exceptions when persisting the meta, but it merely logs them and returns false, and none of the call sites of RaftMetaStorage check for a false return (for example here, here, and here). So an IO failure leaves nothing but a log line, and once the Raft node restarts, the recovered meta is wrong.

For example, suppose a node has already voted for node A as leader in term 100, but the meta save fails and the process crashes and restarts. After restarting it has forgotten that it voted for A in term 100 (it may recover a votedFor from an older term), so it can vote for a different node in term 100 again, violating Raft's guarantees.

I may be missing something; apologies if I've misunderstood.

Expected behavior

Handle the false return value when RaftMetaStorage fails to save the meta; perhaps the whole node should be stopped on a save failure. A sketch follows.
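
A sketch of that handling (illustrative, not shipped code):

    // Treat a failed meta save as fatal instead of ignoring the boolean,
    // since recovering a stale term/votedFor can lead to double voting.
    if (!this.metaStorage.setTermAndVotedFor(term, this.votedId)) {
        onError(new RaftException(EnumOutter.ErrorType.ERROR_TYPE_META,
            new Status(RaftError.EIO, "Fail to save raft meta.")));
    }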

Actual behavior

Nothing appears to handle the false returned by RaftMetaStorage after a failed meta save.

Steps to reproduce

For example, after the Raft node starts, delete the meta storage path entirely, or change its permissions so the Raft process cannot write to it.

That said, actually causing damage this way may not be easy: even if the meta save fails, the process stays correct as long as it doesn't restart, and even after a restart it can still recover the correct term as long as the leader is alive, no election happens, and it receives data from the leader. Still, the hazard is there.

Environment

  • SOFAJRaft version: v1.2.5
  • JVM version (e.g. java -version): any
  • OS version (e.g. uname -a): any
  • Maven version: any
  • IDE version: any

Sequence diagram

Could you provide a sequence diagram for counter-incrementAndGet?

Leader balance at Multi-Raft initialization

When implementing multi-raft with JRaft, there is currently no way at initialization time to designate which node should preferentially become leader. Node start order and other factors may therefore concentrate leaders on a single node. Is that possible, and could it be improved?
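
One possible mitigation is the priority-based election listed under Features: give the preferred node a higher election priority so it tends to win. A sketch, assuming the ip:port:idx:priority peer format that accompanies priority-based election:

    final Configuration conf = new Configuration();
    // Node :8081 gets priority 100, the others 60, so :8081 tends to become leader.
    conf.parse("127.0.0.1:8081:0:100,127.0.0.1:8082:0:60,127.0.0.1:8083:0:60");
    nodeOptions.setInitialConf(conf);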

Performance testing tools?

Is there a performance testing tool? We have several candidate raft implementations and need to pick one; a benchmarking tool or published benchmark results would help.

Suspected bug in NodeImpl#executeApplyingTasks

    if (!this.ballotBox.appendPendingTask(this.conf.getConf(),
            conf.isStable() ? null : conf.getOldConf(), task.done)) {
        Utils.runClosureInThread(task.done, new Status(RaftError.EINTERNAL, "Fail to append task."));
        return;
    }

Wouldn't continue be more appropriate than return here?

Questions about AddPeer

In the addPeer flow, suppose the new node has no data at all. You first start the new peer's node; note that its initialServerList must not contain only the new peer itself, and may include the existing cluster list,
because if it contained only the new peer, the node would elect itself leader.
If initialServerList includes the existing cluster list, then before the addPeer request is sent to the existing cluster, the new peer may hit its election timeout and initiate a pre-vote. The existing cluster's nodes will reject it because its term is smaller; the new peer steps down, waits for the next election timeout, pre-votes again, and so on. At any point in this loop, issuing addPeer to the old cluster's leader will bring the node into the group.

Is my understanding correct? Guidance appreciated.
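
For reference, the membership change itself goes through CliService against the current configuration, roughly like this (group name and addresses are illustrative):

    final CliService cliService = RaftServiceFactory.createAndInitCliService(new CliOptions());
    final Configuration conf = JRaftUtils.getConfiguration("127.0.0.1:8081,127.0.0.1:8082,127.0.0.1:8083");
    final PeerId newPeer = JRaftUtils.getPeerId("127.0.0.1:8084");
    // Asks the group's leader to replicate the configuration change and
    // catch the new peer up before it joins the quorum.
    final Status status = cliService.addPeer("counter", conf, newPeer);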

Question: what does the redirect() method in the example sub-project's CounterServer do?

Hi, while reading the example sub-project's source I couldn't quite understand the redirect function in the CounterServer class:

public ValueResponse redirect() {
    final ValueResponse response = new ValueResponse();
    response.setSuccess(false);
    if (this.node != null) {
        final PeerId leader = this.node.getLeaderId();
        if (leader != null) {
            response.setRedirect(leader.toString());
        }
    }
    return response;
}

It sets response.setRedirect(leader.toString()), but I don't see ValueResponse's getRedirect() method being called anywhere.

In the IncrementAndGetRequestProcessor class:

@Override
public void handleRequest(final BizContext bizCtx, final AsyncContext asyncCtx, final IncrementAndGetRequest request) {
    if (!this.counterServer.getFsm().isLeader()) {
        // This should redirect to the leader, but I don't see
        // ValueResponse#getRedirect() used anywhere to obtain the leader,
        // so how is the redirect actually accomplished?
        asyncCtx.sendResponse(this.counterServer.redirect());
        return;
    }
    ...
}

It feels like the redirect field in ValueResponse is never used. Is my understanding correct?
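
The field only matters if the client chooses to honor it. A sketch of such client-side handling (hypothetical; the bundled counter client instead refreshes the leader via the RouteTable):

    final ValueResponse resp = ...; // response returned by a follower
    if (!resp.isSuccess() && resp.getRedirect() != null) {
        final PeerId leader = new PeerId();
        if (leader.parse(resp.getRedirect())) {
            // Retry the original request against the real leader.
            rpcClient.invokeWithCallback(leader.getEndpoint().toString(), request, callback, timeoutMs);
        }
    }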

[Question] About ReadOnlyLeaseBased

Thanks to sofa for contributing sofa-jraft (jraft below). I've been reading the ReadIndex-related code and have a few questions.

tikv describes the following scenario,
with lease-based reads as the premise:
during a leader transfer, a network partition can leave the old leader without stepping down, so it can still serve read requests.
(Original description of the scenario.)

This scenario doesn't appear to be handled in jraft? (I've read the transferLeadershipTo-related methods.)

So, may I ask whether jraft has a problem in this scenario?

And could this scenario be optimized? (jraft's comments do state explicitly that lease read is unsafe, but lease read is, after all, more efficient than read index.)
