encodingx / binary

Marshal and unmarshal binary message and file formats in Go

License: GNU General Public License v3.0

Marshal and Unmarshal Binary Formats in Go

The Go standard library features packages for converting between structs and various message and file formats. Packages encoding/json and encoding/xml are well known for their convenient Marshal() and Unmarshal() functions, which leverage the declarative nature of struct tags. However, encoding/binary has no such functions, leaving developers without an accessible and intuitive way to work with binary formats.

This module supplies the ubiquitous Marshal() and Unmarshal() function signatures so that developers can define custom binary formats using only struct tags and avoid custom bit manipulation.

User Story

As a Go developer implementing a binary message or file format,
I want a pair of functions "Marshal/Unmarshal" like those in "encoding/json"
that convert a struct into a series of bits in a byte slice and vice versa,
so that I can avoid the complexities of custom bit manipulation.

See the rest of the behaviour-driven specifications of this module below.

Binary Formats

Message and file formats specify how bits are arranged to encode information. Control over individual bits or groups smaller than a byte is often required to put together and take apart these binary structures.

Message Format Examples

The anatomy of the TCP/IP headers at the beginning of every internet datagram ("packet") is described in RFC 791 (Internet Protocol) and RFC 793 (Transmission Control Protocol).

File Format Examples

Binary file formats are not significantly different from message formats from an application developer's perspective. RFC 1952 describes the GZIP File Format Specification.

Working with Binary Formats in Go

The smallest data types Go provides are byte (an alias of uint8, an unsigned 8-bit integer) and bool, both of which occupy eight bits. Manipulating data at a scale smaller than eight bits requires bitwise logical and shift operators such as AND (&), OR (|), left shift (<<), and right shift (>>).
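
As a rough illustration of the manual bit manipulation described above, the following sketch packs two 4-bit values into one byte and splits them apart again, using only the standard library. The helper names packVersionIHL and unpackVersionIHL are hypothetical and are not part of this module.

```go
package main

import "fmt"

// packVersionIHL (hypothetical helper) places a 4-bit version in the high
// nibble and a 4-bit header length in the low nibble of a single byte.
// Each input is masked to its lowest four bits before being combined.
func packVersionIHL(version, ihl uint8) byte {
	return (version&0x0F)<<4 | (ihl & 0x0F)
}

// unpackVersionIHL (hypothetical helper) reverses packVersionIHL.
func unpackVersionIHL(b byte) (version, ihl uint8) {
	return b >> 4, b & 0x0F
}

func main() {
	b := packVersionIHL(4, 5)
	fmt.Printf("%08b\n", b) // 01000101

	version, ihl := unpackVersionIHL(b)
	fmt.Println(version, ihl) // 4 5
}
```

Every additional bit field needs its own mask and shift, which is the bookkeeping that struct tags are meant to replace.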

Relevant Questions Posted on Stack Overflow

Suggestions on Stack Overflow are limited to the use of bitwise operators.

Behaviour-Driven Specifications

Feature: Marshal and Unmarshal

    As a Go developer implementing a binary message or file format,
    I want a pair of functions "Marshal/Unmarshal" like those in "encoding/json"
    that convert a struct into a series of bits in a byte slice and vice versa,
    so that I can avoid the complexities of custom bit manipulation.

    Background:
        # Ubiquitous language
        Given a message or file "format"
            """
            A format specifies how bits are arranged to encode information.
            """
        And the format is a series of "bit fields"
            """
            A bit field is one or more adjacent bits representing a value,
            and should not be confused with struct fields.
            """
        And adjacent bit fields are grouped into "words"
            """
            A word is a series of bits that can be simultaneously processed
            by a given computer architecture and programming language.
            """

        # Define format-structs
        And a format is represented by a type definition of a "format-struct"
        And the format-struct nests one or more exported "word-structs"
        And the words are tagged to indicate their lengths in number of bits
            type RFC791InternetHeaderFormatWithoutOptions struct {
                RFC791InternetHeaderFormatWord0 `word:"32"`
                RFC791InternetHeaderFormatWord1 `word:"32"`
                RFC791InternetHeaderFormatWord2 `word:"32"`
                RFC791InternetHeaderFormatWord3 `word:"32"`
                RFC791InternetHeaderFormatWord4 `word:"32"`
            }
        And the length of each word is a multiple of eight in the range [8, 64]

        # Define word-structs
        And each word-struct has exported field(s) corresponding to bit field(s)
        And the fields are of unsigned integer or boolean types
        And the fields are tagged to indicate the lengths of those bit fields
            type RFC791InternetHeaderFormatWord0 struct {
                Version     uint8  `bitfield:"4"`
                IHL         uint8  `bitfield:"4"`
                Precedence  uint8  `bitfield:"3"`
                Delay       bool   `bitfield:"1"`
                Throughput  bool   `bitfield:"1"`
                Reliability bool   `bitfield:"1"`
                Reserved    uint8  `bitfield:"2"`
                TotalLength uint16 `bitfield:"16"`
            }
        And the length of each bit field does not overflow the type of the field
            """
            A bit field overflows a type
            when it is long enough to represent values
            outside the set of values of the type.
            """
        And the sum of lengths of all fields is equal to the length of that word
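
The last two rules of the Background (no bit field overflows its type, and the bit fields add up to the word length) can be checked with reflection. The sketch below is one possible way to do that with the standard library only. It is not the module's implementation, and the names word0, capacity, and wordLength are hypothetical.

```go
package main

import (
	"fmt"
	"reflect"
	"strconv"
)

// word0 mirrors RFC791InternetHeaderFormatWord0 from the specification.
type word0 struct {
	Version     uint8  `bitfield:"4"`
	IHL         uint8  `bitfield:"4"`
	Precedence  uint8  `bitfield:"3"`
	Delay       bool   `bitfield:"1"`
	Throughput  bool   `bitfield:"1"`
	Reliability bool   `bitfield:"1"`
	Reserved    uint8  `bitfield:"2"`
	TotalLength uint16 `bitfield:"16"`
}

// capacity (hypothetical helper) reports how many bits a field type can
// hold. Types other than bool and sized unsigned integers yield 0.
func capacity(t reflect.Type) int {
	switch t.Kind() {
	case reflect.Bool:
		return 1
	case reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64:
		return t.Bits()
	}
	return 0
}

// wordLength (hypothetical helper) sums the "bitfield" tags of a struct
// and rejects any tag that is missing, invalid, or too long for its type.
func wordLength(v interface{}) (int, error) {
	t := reflect.TypeOf(v)
	if t == nil || t.Kind() != reflect.Struct {
		return 0, fmt.Errorf("expected a struct value, got %T", v)
	}
	total := 0
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		n, err := strconv.Atoi(f.Tag.Get("bitfield"))
		if err != nil || n < 1 {
			return 0, fmt.Errorf("field %s: missing or invalid bitfield tag", f.Name)
		}
		if n > capacity(f.Type) {
			return 0, fmt.Errorf("field %s: %d bits overflow type %s", f.Name, n, f.Type)
		}
		total += n
	}
	return total, nil
}

func main() {
	n, err := wordLength(word0{})
	fmt.Println(n, err) // 32 <nil>
}
```

The result, 32, matches the `word:"32"` tag on the corresponding word in the format-struct.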

Marshal

    Scenario: Marshal a struct into a byte slice
        Given a format-struct variable representing a binary message or file
            internetHeader := RFC791InternetHeaderFormatWithoutOptions{
                RFC791InternetHeaderFormatWord0{
                    Version: 4,
                    IHL:     5,
                    // ...
                },
                // ...
            }
        And the struct field values do not overflow corresponding bit fields
            """
            A struct field value overflows its corresponding bit field
            when it falls outside the range of values
            that can be represented by that bit field given its length.
            """
        When I pass to function Marshal() a pointer to that struct variable
            var (
                bytes []byte
                e     error
            )

            bytes, e = binary.Marshal(&internetHeader)
        Then Marshal() should return a slice of bytes and a nil error
        And I should see struct field values reflected as bits in those bytes
            log.Printf("%08b", bytes)
            // [01000101 ...]

            log.Println(e == nil)
            // true
        And I should see that the lengths of the slice and the format are equal
            """
            The length of a format is the sum of lengths of the words in it.
            The length of a word is the sum of lengths of the bit fields in it.
            """
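
To make the scenario concrete without relying on the module's internals, the sketch below packs the bit fields of one 32-bit word by hand and rejects values that overflow their bit fields. The type field and the function packWord32 are hypothetical names. The big-endian byte order is an assumption, chosen because it reproduces the first byte 01000101 shown in the scenario. Note that encoding/binary here is the standard library package, not this module.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// field (hypothetical type) pairs a value with the length of its bit field.
type field struct {
	value uint64
	bits  uint
}

// packWord32 (hypothetical helper) concatenates fields, most significant
// first, into a 32-bit word and returns it as four big-endian bytes.
func packWord32(fields []field) ([]byte, error) {
	var (
		word uint32
		used uint
	)
	for i, f := range fields {
		if f.bits == 0 || f.bits > 32 {
			return nil, fmt.Errorf("field %d: invalid length %d", i, f.bits)
		}
		// A value overflows its bit field when it needs more than f.bits bits.
		if f.value >= 1<<f.bits {
			return nil, fmt.Errorf("field %d: value %d overflows %d bits", i, f.value, f.bits)
		}
		used += f.bits
		if used > 32 {
			return nil, fmt.Errorf("fields exceed 32 bits")
		}
		word = word<<f.bits | uint32(f.value)
	}
	if used != 32 {
		return nil, fmt.Errorf("fields total %d bits, want 32", used)
	}
	out := make([]byte, 4)
	binary.BigEndian.PutUint32(out, word)
	return out, nil
}

func main() {
	// Version, IHL, Precedence, Delay, Throughput, Reliability, Reserved,
	// TotalLength, with the bit lengths from RFC791InternetHeaderFormatWord0.
	out, err := packWord32([]field{
		{4, 4}, {5, 4}, {0, 3}, {0, 1}, {0, 1}, {0, 1}, {0, 2}, {20, 16},
	})
	fmt.Printf("%08b %v\n", out, err)
	// [01000101 00000000 00000000 00010100] <nil>

	_, err = packWord32([]field{{16, 4}, {0, 28}})
	fmt.Println(err != nil) // true: 16 does not fit in four bits
}
```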

Unmarshal

    Scenario: Unmarshal a byte slice into a struct
        Given a format-struct type representing a binary message or file format
            var internetHeader RFC791InternetHeaderFormatWithoutOptions
        And a slice of bytes containing a binary message or file
            var bytes []byte

            // ...

            log.Printf("%08b", bytes)
            // [01000101 ...]
        And the lengths of the slice and the format (measured in bits) are equal
        When I pass to function Unmarshal() the slice of bytes as an argument
        And I pass to the function a pointer to the struct as a second argument
            e := binary.Unmarshal(bytes, &internetHeader)
        Then Unmarshal() should return a nil error
        And I should see struct field values matching the bits in those bytes
            log.Println(e == nil)
            // true

            log.Println(internetHeader.RFC791InternetHeaderFormatWord0.Version)
            // 4

            log.Println(internetHeader.RFC791InternetHeaderFormatWord0.IHL)
            // 5
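
The reverse direction can be sketched the same way. The example below splits four bytes into bit-field values with masks and shifts, given the bit-field lengths in order. It illustrates the idea only and is not the module's implementation: unpackWord32 is a hypothetical name, big-endian byte order is assumed, and encoding/binary is the standard library package.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// unpackWord32 (hypothetical helper) reads a big-endian 32-bit word and
// splits it into values whose bit lengths are listed most significant first.
func unpackWord32(data []byte, lengths []uint) ([]uint32, error) {
	if len(data) != 4 {
		return nil, fmt.Errorf("got %d bytes, want 4", len(data))
	}
	var total uint
	for i, n := range lengths {
		if n == 0 || n > 32 {
			return nil, fmt.Errorf("field %d: invalid length %d", i, n)
		}
		total += n
	}
	if total != 32 {
		return nil, fmt.Errorf("fields total %d bits, want 32", total)
	}

	word := binary.BigEndian.Uint32(data)
	values := make([]uint32, len(lengths))
	remaining := uint(32) // bits not yet consumed, counted from the left
	for i, n := range lengths {
		remaining -= n
		mask := uint32(1)<<n - 1 // n low bits set; wraps to all ones when n is 32
		values[i] = word >> remaining & mask
	}
	return values, nil
}

func main() {
	data := []byte{0x45, 0x00, 0x00, 0x14}

	// Bit lengths of Version, IHL, Precedence, Delay, Throughput,
	// Reliability, Reserved, and TotalLength.
	values, err := unpackWord32(data, []uint{4, 4, 3, 1, 1, 1, 2, 16})
	fmt.Println(values, err) // [4 5 0 0 0 0 0 20] <nil>
}
```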

Performance and Optimisation

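This module is optimised for performance. In the benchmark run below (linux/arm64), Marshal() takes about 350 ns with 6 allocations (64 B) per operation, and Unmarshal() takes about 396 ns with 8 allocations (64 B) per operation.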

$ go test -cpuprofile cpu.prof -memprofile mem.prof -bench . -benchmem
goos: linux
goarch: arm64
pkg: github.com/encodingx/binary
BenchmarkMarshal-2     	 3435181	       349.9 ns/op	      64 B/op	       6 allocs/op
BenchmarkUnmarshal-2   	 3004425	       396.4 ns/op	      64 B/op	       8 allocs/op
PASS
ok  	github.com/encodingx/binary	3.288s

Contributors

joel-ling

binary's Issues

Nested types and arrays in struct

Hi!
Is it possible to use nested types and arrays in structs, like this?

type nested struct {
    A uint8  `bitfield:"4"`
    B uint32 `bitfield:"32"`
}

type mainstruct struct {
    data0       uint32 `word:"32"`
    data1       uint32 `word:"32"`
    nested      nested
    nestedArray [4]nested
}
