hdt3213 / rdb

355.0 16.0 74.0 549 KB

A Redis RDB parser implemented in Go, for secondary development and memory analysis

Home Page: https://www.cnblogs.com/Finley/p/16251360.html

License: Apache License 2.0

Languages: Go 98.89%, CSS 0.91%, Shell 0.20%
Topics: go, rdb, redis, parser, analyzer

rdb's People

Contributors

chenrui333, hdt3213, kovogo, kuzznya, makingl, niconical, zonewave

rdb's Issues

Stream type fails to parse in some cases

Steps to reproduce:

  1. Start a Redis instance: docker run -it --rm --name redistest -v $PWD/tempdata/:/data/ -p 6379:6379 redis:7.0 redis-server --save 10 1
  2. Run the following Go program to write data:
package main

import (
	"context"
	"fmt"

	redis "github.com/redis/go-redis/v9"
)

func main() {
	c := redis.NewClient(&redis.Options{
		Addr: "127.0.0.1:6379",
		DB:   0,
	})

	n := 9999
	for i := 0; i < n*2; i++ {
		err := c.XAdd(context.Background(), &redis.XAddArgs{
			Stream: "mytest",
			Values: []string{"info", `abcd`},
			MaxLen: int64(n),
			Approx: true,
		}).Err()
		if err != nil {
			panic(err)
		}
		fmt.Println(i)
	}
}
  3. Go to the tempdata directory above and run the analysis with rdb -c memory -o memory.csv dump.rdb, which reports an error:
error: read stream item id seq failed: -1 is not a uint

Encountered an error: cursor out of range

I tracked it down and found that the exception occurs while decoding one particular key.
Additional details: flag=18, typeListQuickList2, readQuickList2

The key is of type list, LLEN=624

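Statistics are inaccurate

Compared with other tools analyzing the same RDB file, the results differ too much. The computed total sizes are:

  1. redis-rdb-tools: 3849536069 (bytes), fairly close to the actual value
  2. This tool: 1617817854 (bytes), far from the actual value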

listpack over 8192 causes 'error: cursor out of range' panic error

Thanks for your awesome work! I found a bug while testing the rdb tool on my Mac with Redis 7.2.

  • redis
redis-server -v                                          
Redis server v=7.2.1 sha=00000000:0 malloc=libc bits=64 build=7b8617dd94058f85
  • test data
redis-benchmark -c 1  --dbnum 0 -d 10000 -r 2000 -t rpush
  • code should be
         //  https://github.com/HDT3213/rdb/blob/887c2ca2556ffff3be967e74147ec5fe428af8b1/core/listpack.go#L105
	case 0: // 1111 0000 -> str, 4 bytes len
		var lenBytes []byte
		lenBytes, err = readBytes(buf, cursor, 4)
		if err != nil {
			return nil, 0, err
		}
		//strLen := int(binary.BigEndian.Uint32(lenBytes))
		strLen := int(binary.LittleEndian.Uint32(lenBytes))
		result, err := readBytes(buf, cursor, strLen)
		if err != nil {
			return nil, 0, err
		}
		//length = readVarInt(buf, cursor) // read element length
		length = uint32(len(result)) // read element length
		return result, length, nil
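
The byte-order change in the snippet above can be seen in isolation. The following is a minimal sketch, assuming (as the proposed fix does) that the 4-byte length of a listpack 32-bit string is stored in little-endian order. `decodeLen32` is a hypothetical helper written for this illustration, not a function from the rdb repository.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodeLen32 is a hypothetical helper: it reads a 4-byte length
// stored in little-endian order, as the proposed fix assumes.
func decodeLen32(b []byte) (int, error) {
	if len(b) < 4 {
		return 0, fmt.Errorf("need 4 bytes, got %d", len(b))
	}
	return int(binary.LittleEndian.Uint32(b[:4])), nil
}

func main() {
	// 10000 (0x00002710) written in little-endian order.
	raw := []byte{0x10, 0x27, 0x00, 0x00}

	le, err := decodeLen32(raw)
	if err != nil {
		panic(err)
	}
	be := int(binary.BigEndian.Uint32(raw))

	fmt.Println("little-endian:", le) // 10000
	fmt.Println("big-endian:   ", be) // 270991360, far past the end of a small buffer
}
```

Read as big-endian, the same four bytes yield a length far larger than the buffer, which is consistent with a "cursor out of range" failure on the following read.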

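Error when parsing a stream in an RDB file of version 10

An error occurs when parsing a stream in an RDB file of version 10. Some details follow. The existing RDB data structures do not include one for stream. If there is reference documentation, please point me to it. Thank you very much.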
`readStreamListPacks version: 2 encode: listpack
readStreamEntries: length: 60
readStreamEntries index: 0 header: [0 0 1 137 147 155 165 86 0 0 0 0 0 0 0 0]
readStreamEntries firstId:1690398598486-0 ms: 1690398598486, seq: 0
line132 count: 17
line137 deleted: 0
line144 : fieldNum0: 4
line158 : masterFieldNames: [app_id client_id status payload]
client_status:.ead stream item flag failed: $a0346185-50b5-4d47-b2c8-004d2694ec52�1 (2

account_id 866f3b913df14d472007941600001010:
statusONLINE:
port20083 is not a uint
client_status:.eam item flag110 failed: $a0346185-50b5-4d47-b2c8-004d2694ec52�1 (2

Part of the actual data:
`*********:6379> XRANGE client_status_stream 1690398598486-0 1690508095589-5

    1. "1690398598486-0"
      1. "app_id"
      2. "123456"
      3. "client_id"
      4. "ChAUnX10KZBleXxHWtxmLimDEAEY6dQgIN2YBigC"
      5. "status"
      6. "OFFLINE"
      7. "payload"
      8. "\x12$************************************\x18\xd6\xca\xee\x9c\x991 \xa0\x99\x02(\x042\rclient_status:.\n\naccount_id\x12 866f3b913df14d472007941600001010:\x11\n\x06status\x12\aOFFLINE"
    1. "1690398600783-0"
      1. "payload"
      2. "\x12$************************************\x18\xce\xdc\xee\x9c\x991 \xa0\x99\x02(\x042\rclient_status:.\n\naccount_id\x12 866f3b913df14d472007941600001010:\x10\n\x06status\x12\x06ONLINE:\r\n\x04port\x12\x0520083"
      3. "client_id"
      4. "ChAUnX10KZBleXxHWtxmLimDEAEY6dQgIN2YBigC"
      5. "app_id"
      6. "123456"
      7. "status"
      8. "ONLINE"
        `

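OOM occurred; question about memory usage

A question: how are memory allocation and release handled? For example, how much memory is needed to analyze a 10 GB RDB file?

I ran into an OOM during use; at the time the file contained a 400 MB hash key.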

Is `v1.0.7` the latest release?

Hi! Homebrew maintainer here. We noticed that v1.0.7 was tagged, but is not yet marked as latest on the repo's releases page. I'm filing this issue to confirm if the release is ready to be marked as latest, so that we can go ahead and ship it. Thanks!

Error with version 10

Got a connection, launched process /Users/t/Library/Caches/JetBrains/GoLand2023.2/tmp/GoLand/___go_build_github_com_hdt3213_rdb (pid = 5053).
error: cursor out of range
Exiting.

Bug in the top-N big key statistics

// Append new object into tree set
// time complexity: O(n*log(m)), n is number of redis object, m is heap capacity. m is far less than n
func (h *redisTreeSet) Append(x model.RedisObject) {
	// if heap is full && x.Size > minSize, then pop min
	if h.set.Size() == h.capacity {
		min := h.GetMin()
		if min.GetSize() < x.GetSize() {
			h.set.Remove(min)
			h.set.Add(x)
		}
	} else {
		h.set.Add(x)
	}
}
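
The issue does not say what the bug is. One possible cause, assumed here and not confirmed by the source, is that a set ordered only by size treats objects of equal size as duplicates and drops them. The sketch below shows a top-N selection built on a min-heap from `container/heap`, which keeps entries of equal size. `sizedKey` is a hypothetical stand-in for `model.RedisObject`, and `topN` is an illustrative name, not part of the repository.

```go
package main

import (
	"container/heap"
	"fmt"
	"sort"
)

// sizedKey is a hypothetical stand-in for model.RedisObject.
type sizedKey struct {
	Key  string
	Size int
}

// minHeap orders entries by Size, smallest first. Unlike a set keyed
// on size, it keeps every entry even when sizes are equal.
type minHeap []sizedKey

func (h minHeap) Len() int            { return len(h) }
func (h minHeap) Less(i, j int) bool  { return h[i].Size < h[j].Size }
func (h minHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *minHeap) Push(x interface{}) { *h = append(*h, x.(sizedKey)) }
func (h *minHeap) Pop() interface{} {
	old := *h
	n := len(old)
	x := old[n-1]
	*h = old[:n-1]
	return x
}

// topN returns the n largest entries, largest first.
func topN(items []sizedKey, n int) []sizedKey {
	if n <= 0 {
		return nil
	}
	h := &minHeap{}
	for _, it := range items {
		if h.Len() < n {
			heap.Push(h, it)
		} else if (*h)[0].Size < it.Size {
			// Replace the current minimum and restore heap order.
			(*h)[0] = it
			heap.Fix(h, 0)
		}
	}
	out := append([]sizedKey(nil), (*h)...)
	sort.SliceStable(out, func(i, j int) bool { return out[i].Size > out[j].Size })
	return out
}

func main() {
	items := []sizedKey{{"a", 64}, {"b", 64}, {"c", 88}, {"d", 48}}
	for _, it := range topN(items, 3) {
		fmt.Println(it.Key, it.Size)
	}
}
```

With these inputs both 64-byte keys survive alongside the 88-byte key, so three results are returned for n = 3.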

Possible to try and skip malformed entries?

Hi, and thank you so much for this wonderful tool!

I have a question-

I have an RDB dump file that has at least one entry that seems to be invalid.
redis-check-rdb dump.rdb shows an error:

[offset 0] Checking RDB file dump.rdb
[offset 26] AUX FIELD redis-ver = '5.0.8'
[offset 40] AUX FIELD redis-bits = '64'
[offset 52] AUX FIELD ctime = '1681088401'
[offset 67] AUX FIELD used-mem = '1359262528'
[offset 83] AUX FIELD aof-preamble = '0'
[offset 85] Selecting DB ID 0
--- RDB ERROR DETECTED ---
[offset 51432] Internal error in RDB reading offset 0, function at rdb.c:2080 -> Ziplist integrity check failed.
[additional info] While doing: read-object-value
[additional info] Reading key 'badkey-x'
[additional info] Reading type 14 (quicklist)
[info] 87 keys read
[info] 1 expires
[info] 0 already expired
46161:C 10 Apr 2023 11:42:54.008 # Terminating server after rdb file reading failure.

I'm hoping to try to use or modify your tool to see if it might be possible to skip this badkey-x, and try to save as much of the rest of the data in the rdb file as possible.

When I try ./rdb -c aof -o dump.aof dump.rdb I get the following error:
error: panic: runtime error: slice bounds out of range [:10] with capacity 0

I'm currently digging into the source code to better understand how parsing works and whether it might be possible to skip over a broken entry.

If you have any advice on whether this is possible, or how to do it, I'd really appreciate your guidance.

Thank you again!

bigkeys

The top 10 bigkeys output shows only 6 keys:
rdb -c bigkey -n 10 127.0.0.1_6379.rdb
database,key,type,size,size_readable,element_count
0,one,string,88,88B,0
0,key:3275,string,72,72B,0
0,jaava,hash,66,66B,1
0,key:42,string,64,64B,0
0,key:1111,string,56,56B,0
0,java,string,48,48B,0

Will there be support for version 10 RDB files?

When running on a more recent version of a Redis RDB file, I receive the following error:

Traceback (most recent call last):
  File "/usr/local/bin/rdb", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/site-packages/rdbtools/cli/rdb.py", line 106, in main
    parser.parse(options.dump_file[0])
  File "/usr/local/lib/python3.8/site-packages/rdbtools/parser.py", line 394, in parse
    self.parse_fd(open(filename, "rb"))
  File "/usr/local/lib/python3.8/site-packages/rdbtools/parser.py", line 399, in parse_fd
    self.verify_version(f.read(4))
  File "/usr/local/lib/python3.8/site-packages/rdbtools/parser.py", line 963, in verify_version
    raise Exception('verify_version', 'Invalid RDB version number %d' % version)
Exception: ('verify_version', 'Invalid RDB version number 10')
Expecting value: line 1 column 1 (char 0)

Are there plans to expand the support to version 10?

Kind regards,
Jeff Groves

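Custom report generation

Thank you very much for providing this tool. I would like to ask whether the following report can be produced.
I want to read an RDB file (or connect to a Redis instance) and get a report
with the following header:
Key prefix, number of keys, memory used, number of keys without a TTL

We use key prefixes to tell services apart, so I need the above to check whether each service uses Redis properly.
With the features this tool currently offers, is it possible to generate such a custom report?
Thanks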
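
The aggregation behind such a report can be sketched independently of how the keys are read. The example below is an illustration under stated assumptions: `keyInfo`, `prefixStat`, and `aggregate` are hypothetical names, the prefix is taken to be everything before the first `:` separator, and the per-key size and TTL flag are assumed to come from whatever parser or client supplies them.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// keyInfo is a hypothetical per-key record; in practice the fields
// would be filled from a parsed RDB object or a live Redis query.
type keyInfo struct {
	Key    string
	Size   int
	HasTTL bool
}

// prefixStat is one row of the requested report.
type prefixStat struct {
	Prefix string
	Keys   int
	Bytes  int
	NoTTL  int
}

// prefixOf returns the part of key before the first sep,
// or the whole key when sep does not occur.
func prefixOf(key, sep string) string {
	if i := strings.Index(key, sep); i >= 0 {
		return key[:i]
	}
	return key
}

// aggregate groups keys by prefix and returns rows sorted by prefix.
func aggregate(keys []keyInfo, sep string) []prefixStat {
	byPrefix := map[string]*prefixStat{}
	for _, k := range keys {
		p := prefixOf(k.Key, sep)
		s, ok := byPrefix[p]
		if !ok {
			s = &prefixStat{Prefix: p}
			byPrefix[p] = s
		}
		s.Keys++
		s.Bytes += k.Size
		if !k.HasTTL {
			s.NoTTL++
		}
	}
	out := make([]prefixStat, 0, len(byPrefix))
	for _, s := range byPrefix {
		out = append(out, *s)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Prefix < out[j].Prefix })
	return out
}

func main() {
	keys := []keyInfo{
		{"svc1:a", 10, true},
		{"svc1:b", 20, false},
		{"svc2:x", 5, false},
	}
	fmt.Println("prefix,key_count,size_bytes,keys_without_ttl")
	for _, s := range aggregate(keys, ":") {
		fmt.Printf("%s,%d,%d,%d\n", s.Prefix, s.Keys, s.Bytes, s.NoTTL)
	}
}
```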

Integer-encoded strings are parsed incorrectly

These cases:

		case encodeInt8:
			b, err := dec.readByte()
			return []byte(strconv.Itoa(int(b))), err
		case encodeInt16:
			b, err := dec.readUint16()
			return []byte(strconv.Itoa(int(b))), err
		case encodeInt32:
			b, err := dec.readUint32()
			return []byte(strconv.Itoa(int(b))), err

should convert through int8, int16, and int32 first to recover the sign, and only then convert to int.
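
The effect of the missing sign conversion can be shown with a small self-contained sketch. The helper names `formatInt8`, `formatInt16`, and `formatInt32` are hypothetical and only illustrate the suggestion in this issue; they are not functions from the repository.

```go
package main

import (
	"fmt"
	"strconv"
)

// Each helper reinterprets the raw unsigned value as a signed integer
// of the same width before widening to int, so negative values survive.
func formatInt8(b byte) string    { return strconv.Itoa(int(int8(b))) }
func formatInt16(v uint16) string { return strconv.Itoa(int(int16(v))) }
func formatInt32(v uint32) string { return strconv.Itoa(int(int32(v))) }

func main() {
	var raw uint16 = 0xFFFF // the 16-bit two's-complement encoding of -1

	// Widening the unsigned value directly loses the sign.
	fmt.Println(strconv.Itoa(int(raw))) // 65535

	// Converting through the signed type first recovers it.
	fmt.Println(formatInt16(raw))        // -1
	fmt.Println(formatInt8(0xFF))        // -1
	fmt.Println(formatInt32(0xFFFFFFFE)) // -2
}
```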
