Community.Archives

A collection of libraries that support fast and efficient forward-only reading of various popular archives.

  • 🚀 Fast and efficient: Extracts only the matched files. Forward-only access. Uses Task to offload I/O to separate threads.
  • 😀 Licensed under MIT. Similar projects are licensed under GPL.
  • 😍 100% test coverage

Supported archive formats

Format | Package (NuGet) | Documentation
Ar | Community.Archives.Ar | Get started
Cpio | Community.Archives.Cpio | Get started
Rpm | Community.Archives.Rpm | Get started
Tar | Community.Archives.Tar | Get started
Apk | Community.Archives.Apk | Get started

Supported frameworks

  • .NET Standard 2.1
  • .NET 5
  • .NET 6

The libraries run on any platform supported by the above frameworks, including Windows, Linux, and macOS.

Getting started

Each package exports an implementation of IArchiveReader.

Extract all or specific files

var reader = new TarArchiveReader(); // or RpmArchiveReader or ...

await foreach (
    var entry in reader
        .GetFileEntriesAsync(stream, IArchiveReader.MATCH_ALL_FILES)
) {
    // entry.Name
    // entry.Content
    Console.WriteLine($"Found file {entry.Name} ({entry.Content.Length} bytes)");
}

Extract specific files only

var reader = new TarArchiveReader(); // or RpmArchiveReader or ...

// use regular expressions to match entries (path + file name)
await foreach (
    var entry in reader
        .GetFileEntriesAsync(stream, "[.]md$", "[.]txt$")
) {
    // found a Markdown or text file
}

Extract metadata of the archive

var reader = new RpmArchiveReader();

var metaData = await reader.GetMetaDataAsync(stream);

Console.WriteLine(metaData.Package); // for example: "gh"
Console.WriteLine(metaData.Version); // for example: "2.4.0"

❗ Only rpm archives contain metadata. Check reader.SupportsMetaData at runtime, or consult the documentation of the reader, before calling GetMetaDataAsync.
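
The runtime check can be wrapped in a small helper. This is a minimal sketch, not part of the library: the class ArchiveMetaData and the method TryPrintAsync are hypothetical names, and the namespace Community.Archives.Core as well as SupportsMetaData being a bool property on IArchiveReader are assumptions.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;
using Community.Archives.Core; // assumption: namespace that exports IArchiveReader

public static class ArchiveMetaData
{
    // Hypothetical helper: prints package name and version only when the
    // reader reports that it can supply metadata. Returns false otherwise.
    public static async Task<bool> TryPrintAsync(IArchiveReader reader, Stream stream)
    {
        if (!reader.SupportsMetaData)
        {
            Console.WriteLine("This archive format carries no metadata.");
            return false;
        }

        var metaData = await reader.GetMetaDataAsync(stream);
        Console.WriteLine($"{metaData.Package} {metaData.Version}");
        return true;
    }
}
```

Code that handles several formats through the interface can call the helper without knowing which concrete reader it was given.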

Recommended usage

The implementations of IArchiveReader allow forward-only access of supported archives.

But why forward-only and not random-access?

None of these archive formats has a central index of files, so in the worst case the complete archive must be scanned to find a file. In addition, archives like tar are usually compressed. Decompressing them is easy, but because the tar archive is compressed as a whole rather than file by file, random access would require decompressing the entire archive.
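
The effect of whole-archive compression is visible with the framework's own GZipStream. The sketch below uses only the base class library; the in-memory payload is a stand-in for real tar data, and the final comment about passing the stream to a reader is an assumption about typical usage rather than documented behavior.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Build a small gzip payload in memory so the sketch runs without any files.
var compressed = new MemoryStream();
using (var gzipOut = new GZipStream(compressed, CompressionLevel.Optimal, leaveOpen: true))
{
    var payload = Encoding.UTF8.GetBytes("stand-in for tar data");
    gzipOut.Write(payload, 0, payload.Length);
}
compressed.Position = 0;

// The decompressing stream can be read, but it cannot be repositioned:
// it is forward-only by construction.
using var decompressed = new GZipStream(compressed, CompressionMode.Decompress);
Console.WriteLine($"CanRead: {decompressed.CanRead}, CanSeek: {decompressed.CanSeek}");

// A stream like `decompressed` is what would be handed to a reader, for example
// reader.GetFileEntriesAsync(decompressed, IArchiveReader.MATCH_ALL_FILES),
// assuming the real payload is a tar archive.
```

Because the decompressed stream cannot seek, a reader that only moves forward can consume it directly, without first buffering the whole archive.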

There are many different archive extractors (for example 7z) that can easily extract any modern archive.

The purpose of IArchiveReader is to quickly and efficiently find and extract one or more files without native or heavyweight dependencies such as RecursiveExtractor or SharpZipLib.

IArchiveReader will only allocate memory (byte[]) for matched files.

Usage with Dependency Injection

You can either register IArchiveReader together with a single implementation of it or, if you use multiple implementations in the same project, register each implementation directly. All implementations use virtual methods, so you can mock the classes with your favorite mocking framework.
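
Both registration styles could look roughly like this with Microsoft.Extensions.DependencyInjection. This is a sketch under assumptions: the namespaces Community.Archives.Core, Community.Archives.Tar and Community.Archives.Rpm are guesses based on the package names, and the transient lifetime is an illustrative choice, not a recommendation from the project.

```csharp
using Microsoft.Extensions.DependencyInjection;
using Community.Archives.Core; // assumption: namespace of IArchiveReader
using Community.Archives.Rpm;  // assumption: namespace of RpmArchiveReader
using Community.Archives.Tar;  // assumption: namespace of TarArchiveReader

var services = new ServiceCollection();

// Style 1: a single implementation, resolved through the interface.
services.AddTransient<IArchiveReader, TarArchiveReader>();

// Style 2: several formats in one project, each resolved by its concrete type.
services.AddTransient<TarArchiveReader>();
services.AddTransient<RpmArchiveReader>();

using var provider = services.BuildServiceProvider();

// Consumers ask for whichever registration fits their use case.
IArchiveReader reader = provider.GetRequiredService<IArchiveReader>();
RpmArchiveReader rpmReader = provider.GetRequiredService<RpmArchiveReader>();
```

Registering concrete types avoids ambiguity when more than one format is in play, while the interface registration keeps consumers decoupled from the format when only one is needed.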

Found a bug? Have a suggestion?

Please create an issue and attach the file, provided it is not confidential and does not contain personally identifiable information (PII).

Pull requests are always welcome 😍

License

This software is released under the MIT License.
