package utils
import (
"reflect"
"sync/atomic"
"unsafe"
)
// BytePool is used to concurrently obtain byte arrays of size cap.
type BytePool struct {
	// i is a round-robin cursor into caches/used; Get advances it with
	// atomic Add and resets it to 0 once it reaches len(caches).
	i atomic.Int64
	// caches holds max pre-carved chunks of one shared backing array,
	// laid out back-to-back so an index can be recovered from an address.
	caches [][]byte
	// used[j] reports whether caches[j] is currently checked out; Get
	// claims a chunk with CompareAndSwap(false, true), Put stores false.
	used []atomic.Bool
	// head is the address of the first byte of the shared backing array;
	// Put uses (addr-head)/cap to map a returned slice back to its slot.
	head uintptr
	// cap is the capacity of each individual chunk in bytes.
	cap int
}
// NewBytePool creates a byte array pool with max byte arrays, each with a capacity of cap.
// If sync.Pool is used, at least 24 bytes of memory will be allocated each time,
// see https://blog.mike.norgate.xyz/unlocking-go-slice-performance-navigating-sync-pool-for-enhanced-efficiency-7cb63b0b453e.
// By using the space-for-time method, zero memory allocation can be achieved.
// Here, a large byte array is first allocated, and then it is divided into small byte arrays.
// The address difference between the small byte arrays is the same.
// For each byte array, an atomic.Bool is used to determine whether it is in use.
// If it is false, it means it is not in use, and it is converted to true and returned to the caller;
// if it is true, it means it has been used.
// When the byte array is used,
// we find the corresponding atomic.Bool by the difference between its address and the first address of the large byte array,
// and then set it to false.
func NewBytePool(cap, max int) *BytePool {
	data := make([]byte, cap*max)
	caches := make([][]byte, max)
	for i := range caches {
		lo := i * cap
		// Three-index slice pins each chunk's capacity to exactly cap.
		// A plain data[lo:lo+cap][:0] would leave the chunk's capacity
		// running to the end of the backing array, so an append past cap
		// would silently overwrite the neighboring chunk's bytes instead
		// of reallocating.
		caches[i] = data[lo : lo : lo+cap]
	}
	// Record the backing array's base address so Put can map a returned
	// slice back to its slot index by pointer arithmetic.
	// NOTE(review): reflect.SliceHeader is deprecated since Go 1.20;
	// unsafe.SliceData is the modern replacement once the module targets 1.20+.
	rt := (*reflect.SliceHeader)(unsafe.Pointer(&data))
	return &BytePool{
		caches: caches,
		used:   make([]atomic.Bool, max),
		head:   rt.Data,
		cap:    cap,
	}
}
// Get returns a free byte array from the pool as a zero-length slice
// with capacity cap. It busy-spins until a chunk becomes available:
// the cursor is advanced round-robin and each candidate slot is claimed
// with a compare-and-swap, so two callers can never receive the same chunk.
func (b *BytePool) Get() []byte {
	limit := int64(len(b.caches))
	for {
		idx := b.i.Add(1)
		if idx >= limit {
			// Wrap the shared cursor; a concurrent wrap is harmless
			// because the CAS below is what actually claims a slot.
			idx = 0
			b.i.Store(0)
		}
		if !b.used[idx].CompareAndSwap(false, true) {
			continue // slot already checked out; try the next one
		}
		return b.caches[idx][:0]
	}
}
// Put returns x to the pool, marking its slot free for a future Get.
//
// The slot index is recovered from the slice's data pointer: chunks sit
// back-to-back in one backing array, so (addr-head)/cap identifies the
// chunk. Slices whose data pointer does not fall on a chunk boundary of
// this pool's backing array — including a pool slice that append grew
// past cap and therefore reallocated — are ignored, since computing an
// index from them would panic or free the wrong slot.
func (b *BytePool) Put(x []byte) {
	rt := (*reflect.SliceHeader)(unsafe.Pointer(&x))
	if rt.Data < b.head {
		return // nil or not from this pool's backing array
	}
	off := int(rt.Data - b.head)
	if off%b.cap != 0 || off/b.cap >= len(b.used) {
		return // reallocated by append, or a foreign slice
	}
	b.used[off/b.cap].Store(false)
}