
lotusdblabs / lotusdb


Most advanced key-value database written in Go, extremely fast, compatible with LSM tree and B+ tree.

Home Page: https://lotusdblabs.github.io

License: Apache License 2.0

Go 99.86% Shell 0.14%
lsm-tree bptree kv-store golang database storage

lotusdb's People

Contributors

ahdong2007, akiozihao, calebgcc, eahitechnology, mikkelhjuul, roseduan, saint-yellow, tensshinet, testwill, tobehardest, weedsfrenzy, wsyingang, wziww, yanxiaoqi932


lotusdb's Issues

whatever dirLock.Release returns in cf.Close will be discarded

In cf.Close, whatever dirLock.Release returns is discarded: err is overwritten on every loop iteration and then overwritten again by the result of cf.Sync().

// Close closes the current column family.
func (cf *ColumnFamily) Close() error {
	atomic.StoreUint32(&cf.closed, 1)
	close(cf.closedC)
	var err error
	for _, dirLock := range cf.dirLocks {
		err = dirLock.Release()
	}
	// sync all contents.
	err = cf.Sync()
	return err
}

Data Race Problem

==================
WARNING: DATA RACE
Read at 0x00c0000ac288 by goroutine 33:
  github.com/flower-corp/lotusdb.(*valueLog).handleCompaction()
      go/pkg/mod/github.com/flower-corp/[email protected]/vlog.go:278 +0x1b8
  github.com/flower-corp/lotusdb.openValueLog.func2()
      go/pkg/mod/github.com/flower-corp/[email protected]/vlog.go:115 +0x38

Previous write at 0x00c0000ac288 by main goroutine:
  github.com/flower-corp/lotusdb.(*LotusDB).OpenColumnFamily()
      go/pkg/mod/github.com/flower-corp/[email protected]/cf.go:138 +0x4f4
  github.com/flower-corp/lotusdb.Open()
      go/pkg/mod/github.com/flower-corp/[email protected]/db.go:39 +0x170

Goroutine 33 (running) created at:
  github.com/flower-corp/lotusdb.openValueLog()
      go/pkg/mod/github.com/flower-corp/[email protected]/vlog.go:115 +0x584
  github.com/flower-corp/lotusdb.(*LotusDB).OpenColumnFamily()
      go/pkg/mod/github.com/flower-corp/[email protected]/cf.go:133 +0x49c
  github.com/flower-corp/lotusdb.Open()
      go/pkg/mod/github.com/flower-corp/[email protected]/db.go:39 +0x170
==================
Found 1 data race(s)

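How are lookups for missing keys handled?

How are large numbers of requests for nonexistent keys handled? From a quick look at the code, if a key does not exist, the lookup has to go through the memtables and the on-disk B+ tree, or fetch data from the log. I did not see a Bloom filter being used to guard against such requests.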
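For readers unfamiliar with the idea, here is a minimal Bloom filter sketch. It is not part of lotusdb; the type `bloom`, the double-hashing scheme, and the sizes are assumptions chosen for illustration. A negative answer from `MayContain` is definitive, so a lookup could skip the index and log entirely in that case.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bloom is a fixed-size Bloom filter using double hashing.
type bloom struct {
	bits []uint64
	m    uint64 // number of bits
	k    uint64 // number of probes per key
}

func newBloom(m, k uint64) *bloom {
	return &bloom{bits: make([]uint64, (m+63)/64), m: m, k: k}
}

// hashes splits one 64-bit FNV-1a hash into two probe seeds.
func hashes(key []byte) (uint64, uint64) {
	h := fnv.New64a()
	h.Write(key)
	sum := h.Sum64()
	return sum & 0xffffffff, (sum >> 32) | 1 // second seed is forced odd
}

// Add sets the k bits that correspond to key.
func (b *bloom) Add(key []byte) {
	h1, h2 := hashes(key)
	for i := uint64(0); i < b.k; i++ {
		pos := (h1 + i*h2) % b.m
		b.bits[pos/64] |= 1 << (pos % 64)
	}
}

// MayContain reports false only if key was definitely never added.
func (b *bloom) MayContain(key []byte) bool {
	h1, h2 := hashes(key)
	for i := uint64(0); i < b.k; i++ {
		pos := (h1 + i*h2) % b.m
		if b.bits[pos/64]&(1<<(pos%64)) == 0 {
			return false
		}
	}
	return true
}

func main() {
	f := newBloom(1024, 4)
	f.Add([]byte("Lumia"))
	fmt.Println(f.MayContain([]byte("Lumia"))) // true: no false negatives
}
```

A positive answer can be a false positive, so it still has to be confirmed by the real lookup.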

Adding Dockerfile

Hello,
I am wondering if there is a need for a Docker image/container in this repo?

Benchmark improvements

Hi, just came across lotusdb and I like it (need to find out how to deep-store maps and other complex structures without resorting to plain serialization).

I noticed util.RandomValue(1024) is being called in the hot paths of the pico benchmarks.

Note that this might add significant overhead and skew the whole benchmark. Perhaps use a large precomputed table (1 GB or more) of random values, generated after the benchmark starts but before timing begins.
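The suggestion could look roughly like the sketch below. `makeValues`, `put`, and `benchmarkPut` are hypothetical names, the map stands in for the database, and the pool is far smaller than the 1 GB proposed above to keep the example light.

```go
package main

import (
	"fmt"
	"math/rand"
	"testing"
)

// makeValues pre-generates n random values of the given size.
func makeValues(n, size int) [][]byte {
	rng := rand.New(rand.NewSource(1))
	vals := make([][]byte, n)
	for i := range vals {
		vals[i] = make([]byte, size)
		rng.Read(vals[i])
	}
	return vals
}

// put is a hypothetical stand-in for the operation under test.
func put(store map[int][]byte, k int, v []byte) { store[k] = v }

func benchmarkPut(b *testing.B) {
	vals := makeValues(1024, 1024) // setup cost, excluded from timing
	store := make(map[int][]byte)
	b.ResetTimer() // measure only the loop below
	for i := 0; i < b.N; i++ {
		put(store, i%len(vals), vals[i%len(vals)])
	}
}

func main() {
	res := testing.Benchmark(benchmarkPut)
	fmt.Println(res.N > 0) // true
}
```

Cycling through a fixed pool keeps value generation out of the measured loop; the trade-off is that values repeat once the pool is exhausted.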

The Arena function growBufSize() may have a concurrency issue

https://github.com/flower-corp/lotusdb/blob/0f5d9df69c7fdcd1e8e3054e2a63f3a36441feca/arenaskl/arena.go#L125-L129

When different goroutines reach growBufSize() at the same time, the newBuf allocated by one goroutine may be replaced by another's, because swapping buf for newBuf is not atomic.
I found this error after modifying skl_test.go as shown below, and I don't know whether it is designed that way.

// TestConcurrentBasic tests concurrent writes followed by concurrent reads.
func TestConcurrentBasic(t *testing.T) {
	const n = 1000

	// Set testing flag to make it easier to trigger unusual race conditions.
	// l := NewSkiplist(NewArena(arenaSize))
	// Use a very small arena so that the buffer has to grow.
	l := NewSkiplist(NewArena(10))
	l.testing = true

	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()

			var it Iterator
			it.Init(l)

			it.Put([]byte(fmt.Sprintf("%05d", i)), newValue(i))
		}(i)
	}
	wg.Wait()

	// Check values. Concurrent reads.
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()

			var it Iterator
			it.Init(l)

			found := it.Seek([]byte(fmt.Sprintf("%05d", i)))
			require.True(t, found)
			require.EqualValues(t, newValue(i), it.Value())
		}(i)
	}
	wg.Wait()
	require.Equal(t, n, length(l))
	require.Equal(t, n, lengthRev(l))
}

Is LotusDB better than RoseDB?

Hello,

I am trying to determine which is the latest DB that I should be trying to use for an embedded application in development:

RoseDB
or
LotusDB

miniDB seems to be a stripped-down version of RoseDB.

Can you please let me know which is best to use?

Bug : Delete function

After calling Delete on a key, calling Exist on the same key returns true; it should return false.

func TestDB_Delete_Exist(t *testing.T){
	// Create an instance of the Options struct to pass to the Open function.
	options := DefaultOptions
	options.DirPath = t.TempDir()
	defer os.RemoveAll(options.DirPath)
	// Call the Open function for testing
	db, err := Open(options)
	assert.NoError(t, err, "Open should not return an error")
	assert.NotNil(t, db, "db should not be nil")
	// Call the Put function for testing
	writeOptions := &WriteOptions{
		Sync:       true,
		DisableWal: false,
	}
	k, v := []byte("Lumia"), []byte("Qian")
	err = db.Put(k, v, writeOptions)
	assert.NoError(t, err, "Put should not return an error")
	// Call the Exist function for testing
	isExist, err := db.Exist(k)
	assert.NoError(t, err, "Exist should not return an error")
	assert.Equal(t, true, isExist, "expected isExist is true")
	// Call the Delete function for testing
	err = db.Delete(k, writeOptions)
	assert.NoError(t, err, "Delete should not return an error")
	// Call the Exist function for testing
	isExist, err = db.Exist(k)
	assert.NoError(t, err, "Exist should not return an error")
	assert.Equal(t, false, isExist, "expected isExist is false")
	// Delete a key that does not exist
	err = db.Delete([]byte("Hello"), writeOptions)
	assert.NoError(t, err, "Delete should not return an error")
	// Call the Close function for testing
	err = db.Close()
	assert.NoError(t, err, "Close should not return an error")
}

Support Transaction

Our architecture is suitable for supporting transactions.
(Time travel too?)

TODO: Define Behavior for key, value

  • Clarify the behavior of key and value when they are nil or len=0 for different indexType.
  • Modify the test cases in hashtable_test to directly request the index instead of going through the DB.
  • Add tests for different indexType in the DB, and test get in scenarios involving pendingWrites, activateMem, oldMem, and index.

The latest lotusdb does not match go.etcd.io/bbolt v1.3.6

The latest lotusdb calls bucket.Put as if it returned two values, which does not match go.etcd.io/bbolt v1.3.6:
https://github.com/flower-corp/lotusdb/blob/9bd9047f19ee551993c4ba85fee3d12ab57b3455/index/bptree.go#L115

func (b *BPTree) Put(key, value []byte) (err error) {
	var tx *bbolt.Tx
	if tx, err = b.db.Begin(true); err != nil {
		return
	}
	bucket := tx.Bucket(b.opts.BucketName)
	if _, err = bucket.Put(key, value); err != nil { // this call expects bucket.Put to return two values
		_ = tx.Rollback()
		return
	}
	return tx.Commit()
}

But Bucket.Put in bucket.go of go.etcd.io/bbolt returns only one value: https://github.com/etcd-io/bbolt/blob/fd5535f71f488dda0915f610b6ca8c77c9ca2c59/bucket.go#L280

func (b *Bucket) Put(key []byte, value []byte) error {
	if b.tx.db == nil {
		return ErrTxClosed
	} else if !b.Writable() {
		return ErrTxNotWritable
	} else if len(key) == 0 {
		return ErrKeyRequired
	} else if len(key) > MaxKeySize {
		return ErrKeyTooLarge
	} else if int64(len(value)) > MaxValueSize {
		return ErrValueTooLarge
	}

	// Move cursor to correct position.
	c := b.Cursor()
	k, _, flags := c.seek(key)

	// Return an error if there is an existing key with a bucket value.
	if bytes.Equal(key, k) && (flags&bucketLeafFlag) != 0 {
		return ErrIncompatibleValue
	}

	// Insert into node.
	key = cloneBytes(key)
	c.node().put(key, key, value, 0, 0)

	return nil
}

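waitMemtableSpace has a concurrency problem

When different batches call Commit at the same time, they share the same db *DB object internally. While executing func (db *DB) waitMemtableSpace(), if several of them see activeMem.isFull as true, each of them will create a new activeMem at the same time.
The proposed fix was attached as code (not preserved here).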

example can not run

https://github.com/flower-corp/lotusdb/blob/main/examples/basic/basic_operation.go

../../../go/pkg/mod/github.com/flower-corp/[email protected]/index/bptree.go:114:12: assignment mismatch: 2 variables but bucket.Put returns 1 value
../../../go/pkg/mod/github.com/flower-corp/[email protected]/index/bptree.go:147:19: assignment mismatch: 2 variables but bucket.Put returns 1 value
../../../go/pkg/mod/github.com/flower-corp/[email protected]/index/bptree.go:185:19: assignment mismatch: 2 variables but bucket.Delete returns 1 value
../../../go/pkg/mod/github.com/flower-corp/[email protected]/index/bptree.go:205:10: assignment mismatch: 2 variables but tx.Bucket(b.opts.BucketName).Delete returns 1 value

Comparison to pebble

Could you provide a detailed comparison to pebble? I can understand how it's easy to improve over badger and bbolt, both have clear issues. I'm interested in what kind of performance and feature improvement you provide over pebble, which in my testing is very high quality.

I'm wary of this comment on Reddit, but it might be worth demonstrating it with some benchmarks against pebble.

Thanks!

cannot compile

go build lotusTest.go

github.com/flower-corp/lotusdb/index

../../../pkg/mod/github.com/flower-corp/[email protected]/index/bptree.go:114:14: assignment mismatch: 2 variables but bucket.Put returns 1 value
../../../pkg/mod/github.com/flower-corp/[email protected]/index/bptree.go:147:22: assignment mismatch: 2 variables but bucket.Put returns 1 value
../../../pkg/mod/github.com/flower-corp/[email protected]/index/bptree.go:185:22: assignment mismatch: 2 variables but bucket.Delete returns 1 value
../../../pkg/mod/github.com/flower-corp/[email protected]/index/bptree.go:205:13: assignment mismatch: 2 variables but tx.Bucket(b.opts.BucketName).Delete returns 1 value

I think the issue is fixed in the current repository, but release 1.0 still has the problem.

example shows error

# github.com/flower-corp/lotusdb/index
/root/go/pkg/mod/github.com/flower-corp/[email protected]/index/bptree.go:114:14: assignment mismatch: 2 variables but bucket.Put returns 1 value
/root/go/pkg/mod/github.com/flower-corp/[email protected]/index/bptree.go:147:22: assignment mismatch: 2 variables but bucket.Put returns 1 value
/root/go/pkg/mod/github.com/flower-corp/[email protected]/index/bptree.go:185:22: assignment mismatch: 2 variables but bucket.Delete returns 1 value
/root/go/pkg/mod/github.com/flower-corp/[email protected]/index/bptree.go:205:13: assignment mismatch: 2 variables but tx.Bucket(b.opts.BucketName).Delete returns 1 value

log in lotusdb

Investigate the commonly used logging packages in Go and choose one for our project.

Main goals:

  • lightweight
  • fast
  • simple to use

Use as a cache on disk

I've been looking for a cache implementation that stores its data on disk and have not found one. From what I can tell, LSM trees do not seem suitable, while B+ trees usually have performance issues with writes. Is your hybrid B+ tree and LSM approach feasible for this problem? In particular, it must be possible to keep the total disk storage in use below a certain threshold (maybe exempting the WALs from this requirement) and to add some kind of eviction policy, for example LRU or LFU. This is usually not feasible with a pure LSM.
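As a rough illustration of the requirement, the sketch below tracks keys in recency order and evicts the least recently used ones once a byte budget is exceeded. It is not a lotusdb feature: `budgetLRU` and its `evict` hook are hypothetical, and the sizes are logical value sizes, so how quickly disk space is actually reclaimed would depend on the store's compaction.

```go
package main

import (
	"container/list"
	"fmt"
)

type entry struct {
	key  string
	size int
}

// budgetLRU keeps keys in recency order and evicts from the cold end
// until the tracked total fits the budget.
type budgetLRU struct {
	budget, used int
	order        *list.List // front = most recently used
	items        map[string]*list.Element
	evict        func(key string) // hypothetical hook, e.g. delete from the store
}

func newBudgetLRU(budget int, evict func(string)) *budgetLRU {
	return &budgetLRU{
		budget: budget,
		order:  list.New(),
		items:  make(map[string]*list.Element),
		evict:  evict,
	}
}

// Touch marks key as recently used.
func (c *budgetLRU) Touch(key string) {
	if el, ok := c.items[key]; ok {
		c.order.MoveToFront(el)
	}
}

// Add records a write of size bytes and evicts cold keys if over budget.
func (c *budgetLRU) Add(key string, size int) {
	if el, ok := c.items[key]; ok {
		e := el.Value.(*entry)
		c.used += size - e.size
		e.size = size
		c.order.MoveToFront(el)
	} else {
		c.items[key] = c.order.PushFront(&entry{key, size})
		c.used += size
	}
	for c.used > c.budget && c.order.Len() > 1 {
		e := c.order.Remove(c.order.Back()).(*entry)
		delete(c.items, e.key)
		c.used -= e.size
		c.evict(e.key)
	}
}

func main() {
	c := newBudgetLRU(10, func(k string) { fmt.Println("evict", k) })
	c.Add("a", 4)
	c.Add("b", 4)
	c.Touch("a")
	c.Add("c", 4) // prints "evict b"
}
```

The loop keeps at least one entry, so a single value larger than the budget is still admitted rather than evicted immediately.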

Bug: read lock is never released

If record == nil, the if statement is not entered, so the RLock is never followed by an RUnlock. The number of waiting readers on this lock then keeps growing, and when it reaches 98 the program panics.
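Because the original code was posted only as a screenshot, the following is a generic sketch of the usual remedy rather than the actual patch: pair every RLock with a deferred RUnlock so that all return paths, including the nil-record one, release the lock. The `store` and `record` types are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

type record struct{ value []byte }

// store is a hypothetical map-backed index guarded by an RWMutex.
type store struct {
	mu   sync.RWMutex
	data map[string]*record
}

// get returns the value for key. The deferred RUnlock runs on every
// return path, so a missing record cannot leak the read lock.
func (s *store) get(key string) ([]byte, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	rec := s.data[key]
	if rec == nil {
		return nil, false
	}
	return rec.value, true
}

// set stores a value under the write lock.
func (s *store) set(key string, v []byte) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.data[key] = &record{value: v}
}

func main() {
	s := &store{data: make(map[string]*record)}
	for i := 0; i < 200; i++ {
		s.get("missing") // many misses, none of which holds the lock afterwards
	}
	s.set("k", []byte("v")) // would block forever if a read lock had leaked
	v, ok := s.get("k")
	fmt.Println(string(v), ok) // v true
}
```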
