javareedsolomon's Introduction

JavaReedSolomon

This is a simple and efficient Reed-Solomon implementation in Java, which was originally built at Backblaze. There is an overview of how the algorithm works in my blog post.

The ReedSolomon class does the encoding and decoding, and is supported by Matrix, which does matrix arithmetic, and Galois, which is a finite field over 8-bit values.

For examples of how to use ReedSolomon, take a look at SampleEncoder and SampleDecoder. They show, in a very simple way, how to break a file into shards and encode parity, and then how to take a subset of the shards and reconstruct the original file.
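
The "break a file into shards" step can be sketched without touching the library at all. The following is a minimal, hypothetical illustration (the class name, the 4-byte length prefix, and the zero padding are assumptions for this sketch, not a description of what SampleEncoder does): it pads the data so it divides evenly, copies it into equal-size data shards, and leaves room for parity shards that an encoder would fill in.

```java
import java.nio.ByteBuffer;

final class ShardSplitSketch {
    /** Bytes used to record the original length, so padding can be stripped later. */
    static final int LENGTH_PREFIX = 4;

    /**
     * Splits data into dataShards equal-size shards and allocates
     * parityShards empty shards of the same size after them.
     */
    static byte[][] split(byte[] data, int dataShards, int parityShards) {
        int storedSize = LENGTH_PREFIX + data.length;
        int shardSize = (storedSize + dataShards - 1) / dataShards; // round up
        byte[] padded = new byte[shardSize * dataShards];           // tail stays zero
        ByteBuffer.wrap(padded).putInt(data.length).put(data);

        byte[][] shards = new byte[dataShards + parityShards][shardSize];
        for (int i = 0; i < dataShards; i++) {
            System.arraycopy(padded, i * shardSize, shards[i], 0, shardSize);
        }
        return shards; // parity shards are still all zero at this point
    }

    /** Reassembles the original bytes from the (intact) data shards. */
    static byte[] join(byte[][] shards, int dataShards) {
        int shardSize = shards[0].length;
        ByteBuffer all = ByteBuffer.allocate(shardSize * dataShards);
        for (int i = 0; i < dataShards; i++) {
            all.put(shards[i]);
        }
        all.flip();
        byte[] data = new byte[all.getInt()]; // length prefix written by split()
        all.get(data);
        return data;
    }
}
```

Computing the parity shards, and rebuilding missing shards before calling something like `join`, is the part the ReedSolomon class handles.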

There is a Gradle build file to make a jar and run the tests. Running it is simple. Just type: gradle build

We would like to send out a special thanks to James Plank at the University of Tennessee at Knoxville for his useful papers on erasure coding. If you'd like an intro to how it all works, take a look at this introductory paper.

This project is limited to a pure Java implementation. If you need more speed, and can handle some assembly-language programming, you may be interested in using the Intel SIMD instructions to speed up the Galois field multiplication. You can read more about that in the paper on Screaming Fast Galois Field Arithmetic.

Performance Notes

The performance of the inner loop depends on the specific processor you're running on. There are twelve different permutations of the loop in this library, and the ReedSolomonBenchmark class will tell you which one is fastest for your particular application. The number of parity and data shards in the benchmark, as well as the buffer sizes, match the usage at Backblaze. You can set the parameters of the benchmark to match your specific use before choosing a loop implementation.
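
Judging by the class names in the results below, the twelve variants appear to be the six orderings of three nested loops (over bytes, inputs, and outputs) combined with two ways of multiplying (Exp and Table). The sketch below is a hypothetical illustration of two such orderings, not the library's code: the class and method names are made up, and the shift-and-XOR multiply with polynomial 0x11D is an assumption. Both methods compute the same result and differ only in memory access pattern.

```java
import java.util.Arrays;

final class LoopOrderSketch {
    /** Shift-and-XOR multiply in GF(2^8); 0x11D is an assumed reduction polynomial. */
    static int gfMultiply(int a, int b) {
        int product = 0;
        for (int i = 0; i < 8; i++) {
            if ((b & 1) != 0) product ^= a;
            b >>= 1;
            a <<= 1;
            if ((a & 0x100) != 0) a ^= 0x11D; // reduce when the degree-8 term appears
        }
        return product;
    }

    /** Byte loop innermost: each (input, output) pair uses one fixed coefficient. */
    static void inputOutputByte(byte[][] matrix, byte[][] in, byte[][] out, int n) {
        for (byte[] row : out) Arrays.fill(row, 0, n, (byte) 0);
        for (int i = 0; i < in.length; i++) {
            for (int o = 0; o < out.length; o++) {
                int m = matrix[o][i] & 0xFF;
                for (int b = 0; b < n; b++) {
                    out[o][b] ^= (byte) gfMultiply(m, in[i][b] & 0xFF);
                }
            }
        }
    }

    /** Byte loop outermost: same arithmetic, different traversal of memory. */
    static void byteOutputInput(byte[][] matrix, byte[][] in, byte[][] out, int n) {
        for (int b = 0; b < n; b++) {
            for (int o = 0; o < out.length; o++) {
                int acc = 0;
                for (int i = 0; i < in.length; i++) {
                    acc ^= gfMultiply(matrix[o][i] & 0xFF, in[i][b] & 0xFF);
                }
                out[o][b] = (byte) acc;
            }
        }
    }
}
```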

These are the speeds I got running the benchmark on a Backblaze storage pod:

    ByteInputOutputExpCodingLoop         95.2 MB/s
    ByteInputOutputTableCodingLoop      107.0 MB/s
    ByteOutputInputExpCodingLoop        130.3 MB/s
    ByteOutputInputTableCodingLoop      181.4 MB/s
    InputByteOutputExpCodingLoop         94.4 MB/s
    InputByteOutputTableCodingLoop      138.3 MB/s
    InputOutputByteExpCodingLoop        200.4 MB/s
    InputOutputByteTableCodingLoop      525.7 MB/s
    OutputByteInputExpCodingLoop        143.7 MB/s
    OutputByteInputTableCodingLoop      209.5 MB/s
    OutputInputByteExpCodingLoop        217.6 MB/s
    OutputInputByteTableCodingLoop      515.7 MB/s

[Figure: bar chart of benchmark results]

javareedsolomon's People

Contributors

akalin, alanfranz, brianb-backblaze, bwbeach, iblech, trel


javareedsolomon's Issues

Precalculate Galois.multiply results

You can pre-calculate results of Galois.multiply into a 64KB 2D table. This also gets rid of the branching.

I am porting the library to Go, and for me that gives a nice speed-up (~30%), and I suspect it will be even more in Java.
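
A minimal sketch of the precomputed table described above. The class and method names are hypothetical, and the reduction polynomial 0x11D is an assumption rather than something confirmed in this thread. The table holds 256 x 256 one-byte products (64 KB of entries), so a multiply becomes a single two-dimensional lookup with no zero check.

```java
final class MultiplyTableSketch {
    static final int FIELD_SIZE = 256;

    /** Reference shift-and-XOR multiply in GF(2^8), used only to fill the table. */
    static int slowMultiply(int a, int b) {
        int product = 0;
        for (int i = 0; i < 8; i++) {
            if ((b & 1) != 0) product ^= a;
            b >>= 1;
            a <<= 1;
            if ((a & 0x100) != 0) a ^= 0x11D; // assumed reduction polynomial
        }
        return product;
    }

    /** Builds table[a][b] = a * b for every pair of field elements. */
    static byte[][] buildTable() {
        byte[][] table = new byte[FIELD_SIZE][FIELD_SIZE];
        for (int a = 0; a < FIELD_SIZE; a++) {
            for (int b = 0; b < FIELD_SIZE; b++) {
                table[a][b] = (byte) slowMultiply(a, b);
            }
        }
        return table;
    }

    /** Branch-free multiply: mask to unsigned indices, then look up. */
    static byte multiply(byte[][] table, byte a, byte b) {
        return table[a & 0xFF][b & 0xFF];
    }
}
```

Rows for 0 and 1 come out as all zeros and the identity, so the special cases that a log/exp implementation has to branch on are absorbed into the table.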

Packaged JAR Releases

If there are no plans for publishing this to the Central Maven Repo (see #15), then there should be a section for the built JAR files to download in order to make it easier to use this library and to promote its use, which would hopefully lead to developers actually taking care of this repository.

SSE3 Parallel Galois Multiplication

(This is not an issue with the Java implementation, but something you might find interesting. If you don't, feel free to close the issue.)

Since I may eventually do an assembler version of the main loop, I was researching whether there was anything out there, and I found this: Screaming Fast Galois Field Arithmetic, which can do constant multiplications using two 16-byte tables, which is perfect for SSE3. [their implementation]

This requires the main loop to be slightly different (in Go):

    // output should be cleared, or first iRow loop should overwrite
    for c := 0; c < r.DataShards; c++ {
        in := inputs[c]
        for iRow := 0; iRow < outputCount; iRow++ {
            m := matrixRows[iRow][c]
            o := outputs[iRow]
            for iByte := 0; iByte < byteCount; iByte++ {
                o[iByte] ^= galMultiply(m, in[iByte])
            }
        }
    }

Now the one value (m) is a constant in the entire inner loop, so I should be able to do the multiplication with the byte[256][16] table, which only requires two look-ups for the entire inner loop.

I have generated the tables needed, and I will begin writing the assembler when I get a bit more time.
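
The split-table idea can be illustrated in scalar Java. This is a hypothetical sketch, not the paper's or the poster's implementation: names are made up, polynomial 0x11D is assumed, and the low and high halves are kept as two separate 16-entry tables per constant. Because multiplication distributes over XOR, m * x equals m * (low nibble of x) XOR m * (high nibble of x shifted back into place), so two small lookups per byte are enough once m is fixed.

```java
final class NibbleTableSketch {
    /** Reference shift-and-XOR multiply in GF(2^8) with an assumed polynomial. */
    static int slowMultiply(int a, int b) {
        int product = 0;
        for (int i = 0; i < 8; i++) {
            if ((b & 1) != 0) product ^= a;
            b >>= 1;
            a <<= 1;
            if ((a & 0x100) != 0) a ^= 0x11D;
        }
        return product;
    }

    /** LOW[m][n] = m * n and HIGH[m][n] = m * (n << 4), for n in 0..15. */
    static final byte[][] LOW = new byte[256][16];
    static final byte[][] HIGH = new byte[256][16];

    static {
        for (int m = 0; m < 256; m++) {
            for (int n = 0; n < 16; n++) {
                LOW[m][n] = (byte) slowMultiply(m, n);
                HIGH[m][n] = (byte) slowMultiply(m, n << 4);
            }
        }
    }

    /** out[i] ^= m * in[i], with the constant m fixed for the whole loop. */
    static void multiplyAccumulate(int m, byte[] in, byte[] out) {
        byte[] low = LOW[m];
        byte[] high = HIGH[m];
        for (int i = 0; i < in.length; i++) {
            int x = in[i] & 0xFF;
            out[i] ^= (byte) (low[x & 0x0F] ^ high[x >>> 4]);
        }
    }
}
```

A SIMD version would perform the two lookups for 16 input bytes at once with a byte-shuffle instruction; the scalar loop here only demonstrates that the decomposition gives the right products.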

Gradle Build Fail

I'm trying to benchmark throughput using this Java Reed-Solomon code. Correct me if I'm wrong in assuming that using the Gradle build file to run the tests mentioned in the README is the best way to do that. When I run gradle build as indicated in the README, I run into the following issue:

    FAILURE: Build failed with an exception.

    * Where:
    Build file '.../JavaReedSolomon/build.gradle' line: 8

    * What went wrong:
    A problem occurred evaluating root project 'JavaReedSolomon'.
    > Could not find method testCompile() for arguments [{group=junit, name=junit, version=4.+}] on object of type org.gradle.api.internal.artifacts.dsl.dependencies.DefaultDependencyHandler.

    * Try:
    > Run with --stacktrace option to get the stack trace.
    > Run with --info or --debug option to get more log output.
    > Run with --scan to get full insights.

    * Get more help at https://help.gradle.org

    BUILD FAILED in 910ms

I'm wondering if this is an issue related to the JavaReedSolomon project itself. Any help would be appreciated to help me run the tests.

Extra Information:

  • I had just installed and checked updates for Gradle.
  • I am running on an M1 Mac with MacOS Monterey (12.3).
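
A likely cause, judging from the error text: the testCompile dependency configuration was removed in Gradle 7.0, and testImplementation is its replacement. A minimal sketch of the change, assuming line 8 of build.gradle declares the JUnit dependency exactly as it appears in the error message (the rest of the build file is not shown in this thread):

```groovy
dependencies {
    // was: testCompile group: 'junit', name: 'junit', version: '4.+'
    // testCompile no longer exists in Gradle 7+, so use testImplementation
    testImplementation group: 'junit', name: 'junit', version: '4.+'
}
```

Running the build with an older Gradle release that still accepts testCompile would be the other way around this.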

dataShardCount >= 257 creates singular matrix

I was quite surprised when I found this.

After researching a bit, I found this in the blog post:

The math—and the code—works with any numbers as long as you have at least one data shard and don’t have more than 256 shards total.

If there is no way for this to work with larger shard counts, it should be mentioned in the library, and you should check it in the constructor.

Also, is the above correct? Shouldn't it be "and don’t have more than 256 DATA shards total"? I don't have any problem creating an encoder with 256 data and 256 parity shards, but as soon as there are more than 256 data shards it fails.
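
One way to see why 257 data shards cannot work, under the assumption (not confirmed in this thread) that the coding matrix starts from a Vandermonde-style matrix whose entry at row r, column c is r^c in GF(2^8): every element of a 256-element field satisfies x^256 = x, so column 256 repeats column 1, and a matrix with two equal columns is singular. The sketch below checks that numerically; the names and the polynomial 0x11D are hypothetical choices for the illustration.

```java
final class ColumnRepeatSketch {
    /** Shift-and-XOR multiply in GF(2^8) with an assumed reduction polynomial. */
    static int gfMultiply(int a, int b) {
        int product = 0;
        for (int i = 0; i < 8; i++) {
            if ((b & 1) != 0) product ^= a;
            b >>= 1;
            a <<= 1;
            if ((a & 0x100) != 0) a ^= 0x11D;
        }
        return product;
    }

    /** a^n by repeated multiplication, with a^0 defined as 1. */
    static int gfPow(int a, int n) {
        int result = 1;
        for (int i = 0; i < n; i++) {
            result = gfMultiply(result, a);
        }
        return result;
    }

    /** True if columns c1 and c2 of the matrix [r][c] = r^c agree for every r in 0..255. */
    static boolean columnsEqual(int c1, int c2) {
        for (int r = 0; r < 256; r++) {
            if (gfPow(r, c1) != gfPow(r, c2)) return false;
        }
        return true;
    }
}
```

Here `columnsEqual(1, 256)` returns true, which is why a constructor check that rejects more than 256 data shards would be a reasonable guard.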

License for image files

Hi there,

I want to know the license of the image files in your blog post:
https://www.backblaze.com/blog/reed-solomon/
Such as:
https://www.backblaze.com/blog/wp-content/uploads/2015/06/blog-rs-7.png

The reason is that one of the images is used in a GitHub project that I'm trying to package for Debian.
Debian requires the license of every file in a project to be confirmed before the project can be included.

Your blog post already states that the source code is under the MIT license, so I guess the image files are also under the MIT license. Could you kindly confirm it? Thank you!

Cheers,
Roger

[help wanted] some questions

Hello, I have some questions about generating the logarithm table in your code.

In the generateLogTable function, why does the expression b = ((b - FIELD_SIZE) ^ polynomial); ensure that b is set to FIELD_SIZE different and non-repeating values?
Looking forward to your explanation, thank you.
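
A sketch of how such a table can be generated, written from the expression quoted above rather than copied from the library (names are hypothetical, and the value 29 used in the usage note is an assumption about the polynomial). Shifting b left multiplies it by x. When the result reaches FIELD_SIZE, subtracting FIELD_SIZE clears the x^8 term and the XOR adds the polynomial's low-order terms in its place, which is reduction modulo the field polynomial. The expression alone does not guarantee distinct values: that only holds when the polynomial is primitive, so that the powers of x cycle through all 255 non-zero elements before repeating.

```java
import java.util.Arrays;

final class LogTableSketch {
    static final int FIELD_SIZE = 256;

    /**
     * Walks the powers of x (the value 2): log[b] = k means b equals 2^k.
     * Index 0 keeps the value -1 because zero has no logarithm.
     */
    static short[] generateLogTable(int polynomial) {
        short[] log = new short[FIELD_SIZE];
        Arrays.fill(log, (short) -1);
        int b = 1;
        for (int k = 0; k < FIELD_SIZE - 1; k++) {
            if (log[b] != -1) {
                // A repeat before 255 steps means x does not generate the field.
                throw new IllegalArgumentException("polynomial is not primitive");
            }
            log[b] = (short) k;
            b <<= 1;                               // multiply by x
            if (b >= FIELD_SIZE) {                 // an x^8 term appeared
                b = (b - FIELD_SIZE) ^ polynomial; // swap x^8 for the low-order terms
            }
        }
        return log;
    }
}
```

With 29 (x^8 + x^4 + x^3 + x^2 + 1 once the implicit x^8 term is included), `generateLogTable(29)` fills every index from 1 to 255 exactly once. With 1 (that is, x^8 + 1), the sequence returns to 1 after eight steps and the method throws.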
