Coder Social home page Coder Social logo

jibby's Introduction

jibby - High-performance streaming JSON-to-BSON decoder in Go

Go Reference Go Report Card Github Actions codecov License

jibby: A general term to describe an exceptionally positive vibe, attitude, or influence.

~ Urban Dictionary

The jibby package provide high-performance conversion of JSON objects to BSON documents. Key features include:

  • stream decoding - white space delimited or from a JSON array container
  • no reflection
  • minimal abstraction
  • minimal copy
  • allocation-friendly

Examples

import (
	"bufio"
	"bytes"
	"log"

	"github.com/xdg-go/jibby"
)

func ExampleUnmarshal() {
	json := `{"a": 1, "b": "foo"}`
	bson := make([]byte, 0, 256)

	bson, err := jibby.Unmarshal([]byte(json), bson)
	if err != nil {
		log.Fatal(err)
	}
}

func ExampleDecoder_Decode() {
	json := `{"a": 1, "b": "foo"}`
	bson := make([]byte, 0, 256)

	jsonReader := bufio.NewReaderSize(bytes.NewReader([]byte(json)), 8192)
	jib, err := jibby.NewDecoder(jsonReader)
	if err != nil {
		log.Fatal(err)
	}

	bson, err = jib.Decode(bson)
	if err != nil {
		log.Fatal(err)
	}
}

Extended JSON

Jibby optionally supports the MongoDB Extended JSON v2 format. There is limited support for the v1 format -- specifically, the $type and $regex keys use heuristics to determine whether these are extended JSON or MongoDB query operators.

Escape sequences are not supported in Extended JSON keys or number formats, only in naturally textual fields like $symbol, $code, etc. In practice, MongoDB Extended JSON generators should never output escape sequences in keys and number fields anyway.

Limitations

  • Maximum depth defaults to 200 levels of nesting (but is configurable)
  • Only well-formed UTF-8 encoding (including optional BOM) is supported.
  • Numbers (floats and ints) must conform to formats/limits of Go's strconv library.
  • Escape sequences not supported in extended JSON keys and some extended JSON values.

Testing

Jibby is extensively tested.

Jibby's JSON-to-BSON output is compared against reference output from the MongoDB Go driver. Extended JSON conversion is tested against the MongoDB BSON Corpus.

JSON parsing support is tested against data sets from Nicholas Seriot's Parsing JSON is a Minefield article. It behaves correctly against all "y" (must support) tests and "n" (must error) tests. It passes all "i" (implementation defined) tests except for cases exceeding Go's numerical precision or with invalid/unsupported Unicode encoding.

Performance

Performance varies based on the shape of the input data.

For a 92 MB mixed JSON dataset with some extended JSON:

           jibby 283.46 MB/s
   jibby extjson 207.42 MB/s
   driver bsonrw 43.77 MB/s
naive json->bson 43.25 MB/s

For a 4.3 MB pure JSON dataset with lots of arrays:

           jibby 107.15 MB/s
   jibby extjson 123.76 MB/s
   driver bsonrw 25.68 MB/s
naive json->bson 32.78 MB/s

The jibby and jibby extjson figures are jibby without and with extended JSON enabled, respectively. The driver bsonrw figures use the MongoDB Go driver in a streaming mode with bsonrw.NewExtJSONValueReader. The naive json->bson figures use Go's encoding/json to decode to map[string]interface{} and the Go driver's bson.Marshal function.

Copyright and License

Copyright 2020 by David A. Golden. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"). You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0

jibby's People

Contributors

xdg avatar utagai avatar flowhater avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.