Snappier

Introduction

Snappier is a pure C# port of Google's Snappy compression algorithm. It is designed with speed as the primary goal, rather than compression ratio, and is ideal for compressing network traffic. Please see the Snappy README file for more details on Snappy.

Project Goals

The Snappier project aims to meet the following needs of the .NET community.

  • Cross-platform C# implementation for Linux and Windows, without P/Invoke or special OS installation requirements
  • Compatible with .NET Framework 4.6.1 and later, and with .NET Core 2.0 and later
  • Use .NET paradigms, including asynchronous stream support
  • Full compatibility with both block and stream formats
  • Near C++ level performance
    • Note: This is only possible on .NET Core 3.0 and later, with the aid of Span<T> and System.Runtime.Intrinsics.
    • .NET Core 2.1 is almost as fast; .NET Framework 4.6.1 is the slowest.
  • Keep allocations and garbage collection to a minimum using buffer pools

Installing

Simply add a NuGet package reference to the latest version of Snappier.

<PackageReference Include="Snappier" Version="1.0.0" />

or

dotnet add package Snappier

Block compression/decompression using a buffer you already own

using System;
using Snappier;

public class Program
{
    private static byte[] Data = {0, 1, 2}; // Wherever you get the data from

    public static void Main()
    {
        // This option assumes that you are managing buffers yourself in an efficient way.
        // In this example, we're using heap allocated byte arrays, however in most cases
        // you would get these buffers from a buffer pool like ArrayPool<byte> or MemoryPool<byte>.

        // Compression
        byte[] buffer = new byte[Snappy.GetMaxCompressedLength(Data.Length)];
        int compressedLength = Snappy.Compress(Data, buffer);
        Span<byte> compressed = buffer.AsSpan(0, compressedLength);

        // Decompression
        byte[] outputBuffer = new byte[Snappy.GetUncompressedLength(compressed)];
        int decompressedLength = Snappy.Decompress(compressed, outputBuffer);

        for (var i = 0; i < decompressedLength; i++)
        {
            // Do something with the data
        }
    }
}
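
The comments in the example above mention renting buffers from a pool instead of allocating arrays on the heap. A minimal sketch of that idea using ArrayPool<byte>.Shared follows. The class and method names (PooledSnappyExample, RoundTrip) are hypothetical, and the sketch assumes GetMaxCompressedLength accepts the input length as an int, which may differ in the version you have installed.

```csharp
using System;
using System.Buffers;
using Snappier;

public static class PooledSnappyExample
{
    // Compresses and then decompresses `data` using rented buffers,
    // returning a copy of the round-tripped bytes.
    public static byte[] RoundTrip(ReadOnlySpan<byte> data)
    {
        // Rent may hand back an array larger than requested, so always
        // track the real lengths returned by Compress and Decompress.
        byte[] compressedBuffer = ArrayPool<byte>.Shared.Rent(
            Snappy.GetMaxCompressedLength(data.Length)); // assumed int overload
        try
        {
            int compressedLength = Snappy.Compress(data, compressedBuffer);
            ReadOnlySpan<byte> compressed = compressedBuffer.AsSpan(0, compressedLength);

            byte[] outputBuffer = ArrayPool<byte>.Shared.Rent(
                Snappy.GetUncompressedLength(compressed));
            try
            {
                int decompressedLength = Snappy.Decompress(compressed, outputBuffer);
                return outputBuffer.AsSpan(0, decompressedLength).ToArray();
            }
            finally
            {
                ArrayPool<byte>.Shared.Return(outputBuffer);
            }
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(compressedBuffer);
        }
    }
}
```

The final ToArray() copy is only there to keep the sketch self-contained; in real code you would typically consume the rented span before returning the buffer to the pool.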

Block compression/decompression using a memory pool buffer

using System.Buffers;
using Snappier;

public class Program
{
    private static byte[] Data = {0, 1, 2}; // Wherever you get the data from

    public static void Main()
    {
        // This option uses `MemoryPool<byte>.Shared`. However, if you fail to
        // dispose of the returned buffers correctly it can result in memory leaks.
        // It is imperative to either call .Dispose() or use a using statement.

        // Compression
        using (IMemoryOwner<byte> compressed = Snappy.CompressToMemory(Data))
        {
            // Decompression
            using (IMemoryOwner<byte> decompressed = Snappy.DecompressToMemory(compressed.Memory.Span))
            {
                // Do something with the data
            }
        }
    }
}

Block compression/decompression using heap allocated byte[]

using Snappier;

public class Program
{
    private static byte[] Data = {0, 1, 2}; // Wherever you get the data from

    public static void Main()
    {
        // This is generally the least efficient option,
        // but in some cases may be the simplest to implement.

        // Compression
        byte[] compressed = Snappy.CompressToArray(Data);

        // Decompression
        byte[] decompressed = Snappy.DecompressToArray(compressed);
    }
}

Stream compression/decompression

Compressing or decompressing a stream follows the same paradigm as other compression streams in .NET. SnappyStream wraps an inner stream: to decompress, read from the SnappyStream; to compress, write to the SnappyStream.

This approach reads or writes the Snappy framing format designed for streaming. The input/output is not the same as the block method above. It includes additional headers and CRC32C checks.
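
One way to see the difference between the two formats is to look at the first bytes of the output. The Snappy framing format begins with a stream identifier chunk (0xff 0x06 0x00 0x00 followed by the ASCII text "sNaPpY"), while block output begins with the uncompressed length encoded as a varint. The sketch below checks for that identifier; the class and method names are hypothetical, and it assumes SnappyStream writes the identifier at the start of the stream, as the format requires.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using Snappier;

public static class FramingHeaderExample
{
    // Stream identifier chunk from the Snappy framing format:
    // chunk type 0xff, 3-byte little-endian length (6), then "sNaPpY".
    private static readonly byte[] StreamIdentifier =
        { 0xFF, 0x06, 0x00, 0x00, 0x73, 0x4E, 0x61, 0x50, 0x70, 0x59 };

    // Returns true if the data starts with the framing format identifier.
    public static bool StartsWithStreamIdentifier(ReadOnlySpan<byte> data) =>
        data.Length >= StreamIdentifier.Length &&
        data.Slice(0, StreamIdentifier.Length).SequenceEqual(StreamIdentifier.AsSpan());

    // Compresses a byte array using the framing (stream) format.
    public static byte[] CompressFramed(byte[] data)
    {
        using var output = new MemoryStream();
        using (var compressor = new SnappyStream(output, CompressionMode.Compress, true))
        {
            compressor.Write(data, 0, data.Length);
        } // Disposing flushes the remaining data to the inner stream
        return output.ToArray();
    }
}
```

A check like this can help diagnose the case where block-format data is accidentally passed to SnappyStream, or framed data to Snappy.Decompress.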

using System.IO;
using System.IO.Compression;
using System.Threading.Tasks;
using Snappier;

public class Program
{
    public static async Task Main()
    {
        using var fileStream = File.OpenRead("somefile.txt");

        // First, compression
        using var compressed = new MemoryStream();

        using (var compressor = new SnappyStream(compressed, CompressionMode.Compress, true)) {
            await fileStream.CopyToAsync(compressor);

            // Disposing the compressor also flushes the buffers to the inner stream
            // We pass true to the constructor above so that it doesn't close/dispose the inner stream
            // Alternatively, we could call compressor.Flush()
        }

        // Then, decompression

        compressed.Position = 0; // Reset to beginning of the stream so we can read
        using var decompressor = new SnappyStream(compressed, CompressionMode.Decompress);

        var buffer = new byte[65536];
        var bytesRead = decompressor.Read(buffer, 0, buffer.Length);
        while (bytesRead > 0)
        {
            // Do something with the data

            bytesRead = decompressor.Read(buffer, 0, buffer.Length);
        }
    }
}

Other Projects

There are other projects available for C#/.NET which implement Snappy compression.

  • Snappy.NET - Uses P/Invoke to C++ for great performance. However, it only works on Windows, and is a bit heap allocation heavy in some cases. It also hasn't been updated since 2014 (as of 10/2020). This project may still be the best choice if your project is on the legacy .NET Framework on Windows, where Snappier is much less performant.
  • IronSnappy - Another pure C# port, based on the Golang implementation instead of the C++ implementation.

snappier's Issues

"Data too long" and "Invalid copy offset" when decoding stream.

Hi,

I have a ~500mb snappy compressed (generated from Snappy.NET) file which I am unable to decompress with Snappier.
Decoding the file works fine in Snappy.NET, and with golang/snappy.

I fixed a bunch of bugs in Snappier trying to diagnose this (see #24), now am a bit stuck.

To load the file I'm using:

  using var compressed = File.OpenRead(@"c:\path\to\file.snappy");   // 515mb fails.
  using var decompressed = new MemoryStream();
  using (var decompressor = new SnappyStream(compressed, CompressionMode.Decompress, true))
  {
      decompressor.CopyTo(decompressed);
  }

And getting the following error.

System.IO.InvalidDataException
  HResult=0x80131501
  Message=Data too long
  Source=Snappier
  StackTrace:
   at Snappier.Internal.SnappyDecompressor.ThrowInvalidDataException(String message) in C:\src\Snappier\Snappier\Internal\SnappyDecompressor.cs:line 543
   at Snappier.Internal.SnappyDecompressor.AppendFromSelf(Byte* buffer, UInt32 copyOffset, Int64 length, SByte* pshufbFillPatterns) in C:\src\Snappier\Snappier\Internal\SnappyDecompressor.cs:line 574
   at Snappier.Internal.SnappyDecompressor.DecompressAllTags(ReadOnlySpan`1 inputSpan) in C:\src\Snappier\Snappier\Internal\SnappyDecompressor.cs:line 371
   at Snappier.Internal.SnappyDecompressor.Decompress(ReadOnlySpan`1 input) in C:\src\Snappier\Snappier\Internal\SnappyDecompressor.cs:line 63
   at Snappier.Internal.SnappyStreamDecompressor.Decompress(Span`1 buffer) in C:\src\Snappier\Snappier\Internal\SnappyStreamDecompressor.cs:line 120
   at Snappier.SnappyStream.ReadCore(Span`1 buffer) in C:\src\Snappier\Snappier\SnappyStream.cs:line 185
   at Snappier.SnappyStream.Read(Byte[] buffer, Int32 offset, Int32 count) in C:\src\Snappier\Snappier\SnappyStream.cs:line 154
   at System.IO.Stream.CopyTo(Stream destination, Int32 bufferSize)
   at Snappier.Tests.SnappyStreamTests.LoadEurope() in C:\src\Snappier\Snappier.Tests\SnappyStreamTests.cs:line 120

Any idea what could be causing this?
Sadly I can't provide the file causing the issue, though I have a similar file (~130mb) which works fine.

While trying to produce a test case I was able to come up with:

        [Fact]
        public void BigStream()
        {
            // Test to trigger "Invalid copy offset" error

            // Generate 6mb of fairly compressible, but not totally repeated, data.
            // 5mb doesn't cause an error. Purely incrementing values don't cause an error.
            var rawBytes = new byte[6_000_000];
            for (int i = 0; i < rawBytes.Length; i++)
            {
                rawBytes[i] = (byte)(i * i % 0xff);
            }

            using var output = new MemoryStream();
            using (var compressor = new SnappyStream(output, CompressionMode.Compress, true))
            {
                using var mem = new MemoryStream(rawBytes);
                mem.CopyTo(compressor);
            }

            output.Position = 0;

            byte[] decompressedBytes;
            using (var decompressor = new SnappyStream(output, CompressionMode.Decompress, true))
            {
                using var mem = new MemoryStream(rawBytes);
                decompressor.CopyTo(mem);
                decompressedBytes = mem.ToArray();
            }

            Assert.Equal(rawBytes.Length, decompressedBytes.Length);
            Assert.Equal(rawBytes, decompressedBytes);
        }

Which triggers a different (but likely related?) error.

System.IO.InvalidDataException
  HResult=0x80131501
  Message=Invalid copy offset
  Source=Snappier
  StackTrace:
   at Snappier.Internal.SnappyDecompressor.ThrowInvalidDataException(String message) in C:\src\Snappier\Snappier\Internal\SnappyDecompressor.cs:line 543
   at Snappier.Internal.SnappyDecompressor.AppendFromSelf(Byte* buffer, UInt32 copyOffset, Int64 length, SByte* pshufbFillPatterns) in C:\src\Snappier\Snappier\Internal\SnappyDecompressor.cs:line 570
   at Snappier.Internal.SnappyDecompressor.DecompressAllTags(ReadOnlySpan`1 inputSpan) in C:\src\Snappier\Snappier\Internal\SnappyDecompressor.cs:line 371
   at Snappier.Internal.SnappyDecompressor.Decompress(ReadOnlySpan`1 input) in C:\src\Snappier\Snappier\Internal\SnappyDecompressor.cs:line 63
   at Snappier.Internal.SnappyStreamDecompressor.Decompress(Span`1 buffer) in C:\src\Snappier\Snappier\Internal\SnappyStreamDecompressor.cs:line 120
   at Snappier.SnappyStream.ReadCore(Span`1 buffer) in C:\src\Snappier\Snappier\SnappyStream.cs:line 185
   at Snappier.SnappyStream.Read(Byte[] buffer, Int32 offset, Int32 count) in C:\src\Snappier\Snappier\SnappyStream.cs:line 154
   at System.IO.Stream.CopyTo(Stream destination, Int32 bufferSize)
   at Snappier.Tests.SnappyStreamTests.BigStream() in C:\src\Snappier\Snappier.Tests\SnappyStreamTests.cs:line

Note the "Invalid copy offset" vs "Data too long" messages.

Add licence type to NuGet Package

Is your feature request related to a problem? Please describe.

In Dependency-Track, this package is reported as not having a licence type specified.

Describe the solution you'd like

The PackageLicenseExpression property is set in the csproj so that the NuGet package carries the licence information. This can then be included in the SBOM so that analysis can occur in the appropriate tools.
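
For illustration, the requested change might look like the following csproj fragment. The SPDX identifier shown is an assumption and should be replaced with whichever identifier matches the repository's actual licence file.

```xml
<PropertyGroup>
  <!-- Assumption: substitute the SPDX identifier that matches the project's LICENSE file -->
  <PackageLicenseExpression>BSD-3-Clause</PackageLicenseExpression>
</PropertyGroup>
```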

Describe alternatives you've considered

Ignore them, bundle licence file into package & reference the file.

Additional context

N/A

A couple of questions

  1. Is this license compatible with the MIT license? My not-a-lawyer interpretation of it is that it probably is, but I'm considering using this in something MIT-licensed, so I wanted to make sure.

The next couple are just passing thoughts I had:

  2. I noticed at least one 256-bit buffer when I took a quick scan over the code. Is there anything to be gained by use of Vector256 and/or available intrinsics in .NET Core 3.0+?
  3. Related to 2, there's an implementation of CRC32 in there. Is it worth considering the Sse42.Crc32 intrinsic also available in .NET Core 3.0+?
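
For reference, here is a rough sketch of what the last question describes: computing CRC32C with the Sse42.Crc32 intrinsic when the hardware supports it, and falling back to a bitwise software loop otherwise. This is not Snappier's actual implementation; the class name Crc32CSketch is hypothetical, and a production version would process more than one byte per step. The Mask method applies the CRC masking defined by the Snappy framing format.

```csharp
using System;
using System.Runtime.Intrinsics.X86;

public static class Crc32CSketch
{
    // Reflected form of the CRC32C (Castagnoli) polynomial
    private const uint Polynomial = 0x82F63B78u;

    public static uint Compute(ReadOnlySpan<byte> data)
    {
        uint crc = 0xFFFFFFFFu;

        if (Sse42.IsSupported)
        {
            // Hardware path: the SSE4.2 crc32 instruction uses the CRC32C polynomial
            foreach (byte b in data)
            {
                crc = Sse42.Crc32(crc, b);
            }
        }
        else
        {
            // Software fallback: one bit at a time (slow, but portable)
            foreach (byte b in data)
            {
                crc ^= b;
                for (int bit = 0; bit < 8; bit++)
                {
                    crc = (crc & 1) != 0 ? (crc >> 1) ^ Polynomial : crc >> 1;
                }
            }
        }

        return ~crc;
    }

    // Masking step from the Snappy framing format:
    // rotate right by 15 bits, then add a constant (wrapping on overflow).
    public static uint Mask(uint crc) =>
        unchecked(((crc >> 15) | (crc << 17)) + 0xA282EAD8u);
}
```

Both paths should return the standard CRC32C check value 0xE3069283 for the ASCII input "123456789".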
