phxpaxos fails to start (issue on phxpaxos, closed)

tencent avatar tencent commented on May 18, 2024
phxpaxos fails to start

from phxpaxos.

Comments (9)

lynncui00 avatar lynncui00 commented on May 18, 2024

Judging from this log, part of the PaxosLog has been lost.
Turn on the full log level and post the complete startup log so we can take a look.

xinmingyao avatar xinmingyao commented on May 18, 2024

We have bSync set to false.

2016-11-22 09:03:51.11s CheckpointInstanceID 6497752

2016-11-22 09:03:51.11s DEBUG(0): PN8phxpaxos8LogStoreE::ParseFileID fileid 19 offset 52932087 checksum 2691965949
2016-11-22 09:03:51.11s Imp(0): PN8phxpaxos8LogStoreE::RebuildIndex START fileid 19 offset 52932087 checksum 2691965949
2016-11-22 09:03:51.11s Showy(0): PN8phxpaxos8LogStoreE::OpenFile ok, path ../storage/paxoslog/g0/vfile/19.f
2016-11-22 09:03:51.11s Showy(0): PN8phxpaxos8LogStoreE::RebuildIndexForOneFile rebuild one index ok, fileid 19 offset 52932087 instanceid 6570821 checksum 2691965949 buffer size 399 )
2016-11-22 09:03:51.11s Imp(0): PN8phxpaxos8LogStoreE::RebuildIndexForOneFile File Data End, fileid 19 offset 52932498
2016-11-22 09:03:51.11s DEBUG(0): PN8phxpaxos8LogStoreE::RebuildIndexForOneFile file not exist, filepath ../storage/paxoslog/g0/vfile/20.f
2016-11-22 09:03:51.11s Showy(0): PN8phxpaxos8LogStoreE::RebuildIndex END rebuild ok, nowfileid 20
2016-11-22 09:03:51.11s Showy(0): PN8phxpaxos8LogStoreE::OpenFile ok, path ../storage/paxoslog/g0/vfile/19.f
2016-11-22 09:03:51.11s Imp(0): PN8phxpaxos8LogStoreE::Init ok, path ../storage/paxoslog/g0/vfile fileid 19 meta checksum 1676158829 nowfilesize 104857600 nowfilewriteoffset 52932498 )
2016-11-22 09:03:51.11s Showy(0): PN8phxpaxos8DatabaseE::Init OK, db_path ../storage/paxoslog/g0
2016-11-22 09:03:51.11s Showy: PN8phxpaxos13MultiDatabaseE::Init OK, DBPath ../storage/paxoslog groupcount 1
2016-11-22 09:03:51.11s Showy: PN8phxpaxos5PNodeE::InitLogStorage OK, use default logstorage
2016-11-22 09:03:51.11s Showy: PN8phxpaxos5PNodeE::InitNetWork OK, use default network
2016-11-22 09:03:51.11s Imp(0): PN8phxpaxos18MasterStateMachineE::Init OK, master nodeid 2095519671909359521 version 6560819 expiretime 1222895188
2016-11-22 09:03:51.11s DEBUG(0): PN8phxpaxos8DatabaseE::GetFromLevelDB LevelDB.Get not found, instanceid 18446744073709551614
2016-11-22 09:03:51.11s Showy(0): PN8phxpaxos20SystemVariablesStoreE::Read DB.Get not found, groupidx 0
2016-11-22 09:03:51.11s Showy(0): PN8phxpaxos9SystemVSME::Init variables not exist
2016-11-22 09:03:51.11s Imp(0): PN8phxpaxos9SystemVSME::RefleshNodeID ip 10.10.122.228 port 8097 nodeid 16463482425872228257
2016-11-22 09:03:51.11s Imp(0): PN8phxpaxos9SystemVSME::RefleshNodeID ip 10.10.123.130 port 8097 nodeid 9402119685132001185
2016-11-22 09:03:51.11s Imp(0): PN8phxpaxos9SystemVSME::RefleshNodeID ip 10.10.123.153 port 8097 nodeid 11059444348004343713
2016-11-22 09:03:51.11s Imp(0): PN8phxpaxos6ConfigE::Init OK
2016-11-22 09:03:51.11s Showy(0): PN8phxpaxos11MasterDamonE::TryBeMaster Ohter as master, can't try be master, masterid 2095519671909359521 myid 9402119685132001185
2016-11-22 09:03:51.11s Showy(0): PN8phxpaxos11MasterDamonE::run TryBeMaster, sleep time 3299ms
2016-11-22 09:03:51.11s Showy(0): PN8phxpaxos8PaxosLogE::GetMaxInstanceIDFromLog OK, MaxInstanceID 6570821 groupidsx 0
2016-11-22 09:03:51.11s DEBUG(0): PN8phxpaxos8LogStoreE::ParseFileID fileid 19 offset 52932087 checksum 2691965949
2016-11-22 09:03:51.11s Showy(0): PN8phxpaxos8LogStoreE::OpenFile ok, path ../storage/paxoslog/g0/vfile/19.f
2016-11-22 09:03:51.11s Showy(0): PN8phxpaxos8LogStoreE::Read ok, fileid 19 offset 52932087 instanceid 6570821 buffer size 399
2016-11-22 09:03:51.11s Showy(0): PN8phxpaxos13AcceptorStateE::Load GroupIdx 0 InstanceID 6570821 PromiseID 246 PromiseNodeID 2095519671909359521 AccectpedID 246 AcceptedNodeID 2095519671909359521 ValueLen 359 Checksum 275208196
2016-11-22 09:03:51.11s Showy(0): PN8phxpaxos8AcceptorE::Init OK
2016-11-22 09:03:51.11s DEBUG(0): PN8phxpaxos8DatabaseE::GetFromLevelDB LevelDB.Get not found, instanceid 18446744073709551615
2016-11-22 09:03:51.11s ERR(0): PN8phxpaxos8DatabaseE::GetMinChosenInstanceID no min chosen instanceid
2016-11-22 09:03:51.11s DEBUG(0): PN8phxpaxos8LogStoreE::ParseFileID fileid 0 offset 34 checksum 4187667134
2016-11-22 09:03:51.11s Showy(0): PN8phxpaxos8LogStoreE::OpenFile ok, path ../storage/paxoslog/g0/vfile/0.f
2016-11-22 09:03:51.11s Showy(0): PN8phxpaxos8LogStoreE::Read ok, fileid 0 offset 34 instanceid 0 buffer size 70
2016-11-22 09:03:51.11s Showy(0): PN8phxpaxos7CleanerE::FixMinChosenInstanceID ok, old minchosen 0 fix minchosen 0
2016-11-22 09:03:51.11s Showy(0): PN8phxpaxos8InstanceE::Init Acceptor.OK, Log.InstanceID 6570821 Checkpoint.InstanceID 6497753
2016-11-22 09:03:51.11s DEBUG(0): PN8phxpaxos8LogStoreE::ParseFileID fileid 19 offset 24157577 checksum 2042402563
2016-11-22 09:03:51.11s Showy(0): PN8phxpaxos8LogStoreE::OpenFile ok, path ../storage/paxoslog/g0/vfile/19.f
2016-11-22 09:03:51.11s Showy(0): PN8phxpaxos8LogStoreE::Read ok, fileid 19 offset 24157577 instanceid 6497753 buffer size 399
2016-11-22 09:03:51.11s no need to sync checkpoint, skiptimes 1

2016-11-22 09:03:51.11s DEBUG(0): PN8phxpaxos8DatabaseE::GetFromLevelDB LevelDB.Get not found, instanceid 6497754
2016-11-22 09:03:51.11s Showy(0): PN8phxpaxos8PaxosLogE::ReadState DB.Get not found, groupidx 0
2016-11-22 09:03:51.11s ERR(0): PN8phxpaxos8InstanceE::PlayLog log read fail, instanceid 6497754 ret 1

lynncui00 avatar lynncui00 commented on May 18, 2024

LevelDB has lost one instance, 6497754; the cause is unknown.
Build the paxos_log_tools tool under the src/tools directory and check how much data is missing after 6497753.

xinmingyao avatar xinmingyao commented on May 18, 2024

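I checked the data between 6497753 and 6570821 (the maximum).
Missing: 6497754, 6498905, 6530370, 6540384, and then roughly 16660 entries between 65707722 and 6553500.
The disk is an ordinary SATA disk, and bSync is set to false.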
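
A check like the one described above can be sketched as follows. This is a minimal illustration, not the paxos_log_tools implementation: the function name `missing_ranges` is hypothetical, the input is assumed to be a plain list of instance IDs that were read back successfully, and the IDs are toy values rather than the ones from this thread.

```python
from typing import Iterable, List, Tuple


def missing_ranges(present: Iterable[int], first: int, last: int) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) ranges in [first, last] that are absent from `present`."""
    gaps: List[Tuple[int, int]] = []
    expected = first  # next ID we expect to see if nothing is missing
    for instance_id in sorted(set(present)):
        if instance_id < first:
            continue  # below the range of interest
        if instance_id > last:
            break  # above the range of interest
        if instance_id > expected:
            # Everything from `expected` up to the ID before this one is missing.
            gaps.append((expected, instance_id - 1))
        expected = instance_id + 1
    if expected <= last:
        # The tail of the range is missing.
        gaps.append((expected, last))
    return gaps


# Toy data (hypothetical): IDs 10..20 with 12 and 15..17 missing.
present_ids = [10, 11, 13, 14, 18, 19, 20]
print(missing_ranges(present_ids, 10, 20))
```

Reporting contiguous ranges rather than individual IDs keeps the output readable when a long run of instances is missing.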

lynncui00 avatar lynncui00 commented on May 18, 2024

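Was the machine rebooted during this period? If bSync is set to false and the machine reboots, problems like this can occur.

For now, the only fix is to delete the paxos log data and restart.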

xinmingyao avatar xinmingyao commented on May 18, 2024

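With bSync set to false, performance on SATA disks is rather poor.
Could phxpaxos provide a stop interface, so that the paxos log can be flushed to disk before a restart? Also, it would be more appropriate if only data at the tail could be lost rather than data in the middle; that way service startup would not be affected, and the lost data could be synced from other machines in the cluster.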

lynncui00 avatar lynncui00 commented on May 18, 2024

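A machine reboot gives no chance to write to disk, for example when the machine suddenly loses power.
Also, under normal circumstances data in the middle should not be lost; what happened here should count as an extreme anomaly.
To avoid losing data, there is currently no good method other than setting bSync to true. In addition, phxpaxos performs very poorly on SATA disks; if you need to run on SATA disks, we suggest writing your own storage module.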
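
The trade-off discussed here can be illustrated with a generic sketch. It assumes that a sync option such as bSync decides whether each write is forced to stable storage before returning; that is an assumption about the general technique, not a description of phxpaxos internals. The helper `append_record` and the file name are hypothetical.

```python
import os
import tempfile


def append_record(path: str, payload: bytes, sync: bool) -> int:
    """Append `payload` to `path` and return the file size afterwards.

    With sync=True the data is forced to stable storage via os.fsync before
    returning, which survives a power loss but costs a disk flush per write.
    With sync=False the data may sit in the OS page cache and can be lost if
    the machine goes down before the kernel writes it back.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        os.write(fd, payload)  # small payloads only; no partial-write loop in this sketch
        if sync:
            os.fsync(fd)
    finally:
        os.close(fd)
    return os.path.getsize(path)


with tempfile.TemporaryDirectory() as tmp_dir:
    log_path = os.path.join(tmp_dir, "example.log")  # hypothetical file name
    append_record(log_path, b"durable\n", sync=True)
    append_record(log_path, b"fast but at risk\n", sync=False)
    print(os.path.getsize(log_path))
```

On a rotating SATA disk each forced flush typically waits on the device, which is why syncing every write is slow there and why skipping it trades durability for throughput.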

lynncui00 avatar lynncui00 commented on May 18, 2024

See https://github.com/tencent-wechat/phxpaxos/wiki/%E5%A6%82%E4%BD%95%E4%BD%BF%E7%94%A8%E8%87%AA%E5%B7%B1%E7%9A%84%E5%AD%98%E5%82%A8%E4%BB%A5%E5%8F%8A%E7%BD%91%E7%BB%9C%E6%A8%A1%E5%9D%97%E6%9D%A5%E6%9E%84%E5%BB%BAPhxPaxos

https://github.com/tencent-wechat/phxpaxos/wiki/PhxPaxos%E4%BD%BF%E7%94%A8%E7%BB%8F%E9%AA%8C%E4%BB%A5%E5%8F%8A%E5%8F%AF%E8%83%BD%E9%81%87%E5%88%B0%E7%9A%84%E9%97%AE%E9%A2%98

xinmingyao avatar xinmingyao commented on May 18, 2024

For now I have added a method on top of paxos_log_tools that deletes everything from the first missing instance up to the maximum.
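
The selection step of that workaround can be sketched roughly as below. This is not the code that was added to paxos_log_tools; `truncation_range` is a hypothetical helper that only computes which IDs would be deleted, leaving the actual deletion to the storage tool, and the IDs are toy values.

```python
from typing import Iterable, Optional, Tuple


def truncation_range(present: Iterable[int], first: int, last: int) -> Optional[Tuple[int, int]]:
    """Return the inclusive (start, end) range to delete, or None if nothing is missing.

    `start` is the first ID in [first, last] that is absent from `present`;
    `end` is always `last`, so the log is left with a contiguous prefix.
    """
    have = set(present)
    for instance_id in range(first, last + 1):
        if instance_id not in have:
            return (instance_id, last)
    return None


# Toy data (hypothetical): 102 is the first hole, so 102..104 would be dropped.
print(truncation_range([100, 101, 103, 104], 100, 104))
```

Cutting at the first hole turns a log with gaps in the middle into one that is only missing its tail, which is the shape the earlier comment argued a node can recover from by catching up from its peers.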
