
timo-reymann / deterministic-zip


Simple (almost drop-in) replacement for zip that produces deterministic files.

License: Other

Makefile 7.31% Go 86.53% Shell 3.34% Dockerfile 2.82%
go zip deterministic reproducible-builds deterministic-zip

deterministic-zip's Introduction

deterministic-zip




Features

  • Drop-in replacement for zip
  • Removes all metadata from added files
  • Immutable zip util

Installation

Automatic install

bash <(curl -sS https://raw.githubusercontent.com/timo-reymann/deterministic-zip/main/installer)

Manual

Linux (64-bit)

curl -LO https://github.com/timo-reymann/deterministic-zip/releases/download/$(curl -Lso /dev/null -w %{url_effective} https://github.com/timo-reymann/deterministic-zip/releases/latest | grep -o '[^/]*$')/deterministic-zip_linux-amd64 && \
chmod +x deterministic-zip_linux-amd64 && \
sudo mv deterministic-zip_linux-amd64 /usr/local/bin/deterministic-zip

Darwin (Intel)

Using brew:
brew tap timo-reymann/deterministic-zip
brew install deterministic-zip
Manual:
curl -LO https://github.com/timo-reymann/deterministic-zip/releases/download/$(curl -Lso /dev/null -w %{url_effective} https://github.com/timo-reymann/deterministic-zip/releases/latest | grep -o '[^/]*$')/deterministic-zip_darwin-amd64 && \
chmod +x deterministic-zip_darwin-amd64 && \
sudo mv deterministic-zip_darwin-amd64 /usr/local/bin/deterministic-zip

Install with go

go install github.com/timo-reymann/deterministic-zip@latest

Install with pip(x)

With pipx, you can install deterministic-zip and use it as a standalone command:

pipx install deterministic-zip-go

If you want to call it from Python via the subprocess module, you can install it with pip:

pip install deterministic-zip-go

And use the package like this:

import subprocess

from deterministic_zip_go import exec

# Run process and prefix stdout and stderr
exec.exec_with_templated_output(["--help"])

# Create a subprocess, specifying how to handle stdout, stderr
exec.create_subprocess(["--help"], stdout=subprocess.PIPE, stderr=subprocess.PIPE)

# Run the command with suppressed output and return the finished process instance,
# which can also be used to check whether the call was successful
exec.exec_silently(["--version"])

Docker

Please check the Containerized section in Usage for more details.

Supported platforms

The following platforms are supported (and have prebuilt binaries / ready to use integration):

  • Linux
    • 32-bit
    • 64-bit
    • ARM 64-bit
    • ARM 32-bit
  • Darwin
    • 64-bit
    • ARM (M1/M2)
  • Windows
    • ARM
    • 32-bit
    • 64-bit
  • FreeBSD
    • 32-bit
    • 64-bit
    • ARM 64-bit
    • ARM 32-bit
  • OpenBSD
    • 32-bit
    • 64-bit
  • OCI compatible container engines (Docker, podman etc)
    • ARM
    • 64-bit
  • CircleCI
  • GitHub Actions

Where to find the latest release for your platform

Binaries

Binaries for all of these can be found on the latest release page.

Docker

For the Docker image, check Docker Hub.

CI Provider

Usage

Command Line

If you installed the binary via the releases page, the install script, or go, you can just run deterministic-zip as a command.

deterministic-zip -h

Containerized

Please be aware that the image contains just the binary: no OS, libs or anything else. It also runs as root to be able to zip files no matter the ownership. Feel free to build your own images based on it as well.

Using the container directly

If you want to use the tool on a platform that is not supported yet, or don't want to install the tool locally, you can also mount your folder at /workspace, which is the default working directory. Then you can just execute commands as you want to.

docker run -v "$PWD":/workspace timoreymann/deterministic-zip:latest

Integrating into your CI image

If you want to integrate the tool directly into your build image, you can also benefit from automatic updates by tools like renovatebot or dependabot. Using Docker's built-in features, you can copy the binary directly from the image.

FROM base-image:tag
# do your customizations
COPY --from=timoreymann/deterministic-zip:latest /deterministic-zip /usr/bin/deterministic-zip

Motivation

Why another zip-tool? What is this deterministic stuff?!

Deterministic here means that the hash of the zip file won't change unless the contents of the zip file change.

This means only the content counts, not the metadata. You can achieve this with zip, yes.

The remaining problem is that the file order is almost unpredictable and zip is very platform-specific, so you end up with a bunch of crazy shell pipelines. And I am not even talking about Windows at this point.

This is where this tool comes in: it is intended to be a drop-in replacement for zip in your build process.
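
To illustrate the two ingredients mentioned above (a stable entry order and stripped metadata), here is a minimal sketch using Python's standard zipfile module. It is not how deterministic-zip is implemented; the helper names build_deterministic_zip and sha256_hex, the fixed timestamp and the fixed permissions are assumptions made purely for this example.

```python
import hashlib
import io
import zipfile

# Fixed timestamp for every entry (the earliest date the zip format can store).
FIXED_DATE_TIME = (1980, 1, 1, 0, 0, 0)


def build_deterministic_zip(files: dict[str, bytes]) -> bytes:
    """Build a zip in memory from a {name: content} mapping (hypothetical helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # Sort names so the entry order does not depend on insertion order.
        for name in sorted(files):
            info = zipfile.ZipInfo(filename=name, date_time=FIXED_DATE_TIME)
            info.compress_type = zipfile.ZIP_DEFLATED
            info.external_attr = 0o644 << 16  # same permissions for every entry
            info.create_system = 3  # always record "Unix", whatever the host is
            archive.writestr(info, files[name])
    return buffer.getvalue()


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 hex digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


# Same content, different insertion order.
first = build_deterministic_zip({"b.txt": b"world", "a.txt": b"hello"})
second = build_deterministic_zip({"a.txt": b"hello", "b.txt": b"world"})
# Different content.
changed = build_deterministic_zip({"a.txt": b"hello", "b.txt": b"changed"})

print(sha256_hex(first) == sha256_hex(second))   # hashes match
print(sha256_hex(first) == sha256_hex(changed))  # hashes differ
```

The first two archives are byte-identical, so their hashes match, while changing a file's content changes the hash. Compressed bytes may still differ between versions of the compression library, so treat this as an illustration of the idea only.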

The primary use cases are:

  • Zipping serverless code
  • Backups or other files that get rsynced

Want to know more about the topic of deterministic/reproducible builds?

I can recommend the following resources:

Documentation

How reliable is it?

Of course, it is not as reliable as zip, which is battle-proven and has been executed billions of times.

Even though I rely heavily on the Go stdlib, this software can of course have bugs. You are welcome to report them and help make it even more stable. There will be tests to cover most use cases, but in the end this is still starting from scratch, so if you need advanced features or just don't feel comfortable using this tool, don't use it!

Differences between zip and deterministic-zip

Please see docs/differences

Contributing

I love your input! I want to make contributing to this project as easy and transparent as possible, whether it's:

  • Reporting a bug
  • Discussing the current state of the configuration
  • Submitting a fix
  • Proposing new features
  • Becoming a maintainer

To get started please read the Contribution Guidelines.

Development

Requirements

Test

make test-coverage-report

Build

make build

Alternatives

As far as I know the following (GitHub) projects exist:

All in all, they are simply not what I needed. My favourite is Rust, because it's just simply dropping in a binary, which is very convenient, especially when it comes to Docker builds.

The main problem all these solutions share is that, in my opinion, useful features such as exclude patterns, which I use regularly, are simply not implemented, and I REALLY love glob patterns.

Credits

This whole project wouldn't be possible without the great work of the following libraries:

deterministic-zip's People

Contributors

nathanklick, renovate[bot], timo-reymann


deterministic-zip's Issues

Create GitHub action

Description

Currently there is already a CircleCI Orb for deterministic-zip, so it would be nice to also provide a GitHub Action for users who use GitHub Actions for their deployments etc.

References

#2

Add an option to recreate the archive if needed with -FS

Description

Currently, using -FS / --filesync appends newly created files to the end of the archive, which makes it non-deterministic. An extra option that deletes the archive and recreates it if needed would be nice to have.

References

No response

Dependency Dashboard

This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

This repository currently has no open or pending branches.

Detected dependencies

circleci
.circleci/config.yml
  • github-cli 2.3.0
  • docker 2.6.0
  • codecov 4.1.0
  • cimg/go 1.22
  • cimg/python 3.12
dockerfile
Dockerfile
gomod
go.mod
  • go 1.19
  • github.com/gobwas/glob v0.2.3
  • github.com/spf13/pflag v1.0.5

  • Check this box to trigger a request for Renovate to run again on this repository

Add tests unzipping file content

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Description

Verify that the zipped files are extractable to prevent bugs like the one with the directories being created as files.
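
A round-trip test of this kind could look roughly like the following sketch. It uses Python's standard zipfile module and builds a small sample archive in memory as a stand-in for real tool output; the helper name extract_and_list is hypothetical, and the project's actual tests may be structured differently.

```python
import io
import tempfile
import zipfile
from pathlib import Path


def extract_and_list(zip_bytes: bytes) -> dict[str, str]:
    """Extract an archive into a temp dir and report each path as 'dir' or 'file'."""
    with tempfile.TemporaryDirectory() as target:
        with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
            # testzip() returns the first corrupt member name, or None if all are fine.
            assert archive.testzip() is None
            archive.extractall(target)
        root = Path(target)
        return {
            path.relative_to(root).as_posix(): "dir" if path.is_dir() else "file"
            for path in sorted(root.rglob("*"))
        }


# Sample archive with one directory entry and one file (stand-in for tool output).
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w") as archive:
    archive.writestr("docs/", b"")
    archive.writestr("docs/readme.txt", b"hello")

result = extract_and_list(sample.getvalue())
print(result)
```

Checking the type of every extracted path is what would catch a directory that was accidentally written as a regular file.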

[Bug]: bad zipfile offset

What happened?

I'm zipping a folder using: deterministic-zip -r ../logs-extension-api.zip . -x "*.rush/*" -x "*__tests__/*" -x "*.log" -v.
The zip is generated without issue, but when I do: unzip ./logs-extension-api.zip it says:

Archive:  ./logs-extension-api.zip
.rush/temp/operation/zip/state.json:  mismatching "local" filename (extensions-api.js),
         continuing with "central" filename version
  inflating: .rush/temp/operation/zip/state.json  
  error:  invalid compressed data to inflate
file #2:  bad zipfile offset (local header sig):  148
file #3:  bad zipfile offset (local header sig):  1028
file #4:  bad zipfile offset (local header sig):  4378
file #5:  bad zipfile offset (local header sig):  5210
file #6:  bad zipfile offset (local header sig):  5844
file #7:  bad zipfile offset (local header sig):  6663
file #8:  bad zipfile offset (local header sig):  7399
file #9:  bad zipfile offset (local header sig):  8076
file #10:  bad zipfile offset (local header sig):  8754
file #11:  bad zipfile offset (local header sig):  9311
file #12:  bad zipfile offset (local header sig):  9708
file #13:  bad zipfile offset (local header sig):  9846
file #14:  bad zipfile offset (local header sig):  10430
file #15:  bad zipfile offset (local header sig):  11177
file #16:  bad zipfile offset (local header sig):  22049
file #17:  bad zipfile offset (local header sig):  22374
file #18:  bad zipfile offset (local header sig):  23003
file #19:  bad zipfile offset (local header sig):  23383

I tried to remove the -x options and it works perfectly.

OS

Darwin ARM

Problem description

I think repeating the -x option causes a problem in the zip file.

Expected behaviour

Should unzip without issue

Actual behaviour

Running unzip fails with the same output as shown under "What happened?" above.

Steps to Reproduce

Run the command with one or more -x flags.

Important Factoids

No response

References

No response

I read and agree to the contribution guidelines

  • I read and agree to the contribution guidelines

argument list too long

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Used versions

  • Host OS: Darwin

Problem description

Trying to zip some files like this: deterministic-zip -D package.zip node_modules/**/*

Expected Behavior

Files are added to the zip

Actual Behavior

I get this error
zsh: argument list too long: deterministic-zip

Steps to Reproduce

Important Factoids

References

  • #0000

[Bug]: Latest v1 release is broken

What happened?

The binary that the installer script tries to download does not exist in the latest v1 release.

OS

Ubuntu

Problem description

The following link is generated by the installer script, which points to a binary file that does not exist.

curl -LsS https://github.com/timo-reymann/deterministic-zip/releases/download/v1/deterministic-zip_linux-amd64

Output:
Not Found

Expected behaviour

Correct binary to be downloaded.

Actual behaviour

Link to binary is broken.

Steps to Reproduce

curl -LsS https://github.com/timo-reymann/deterministic-zip/releases/download/v1/deterministic-zip_linux-amd64

Important Factoids

No response

References

No response

I read and agree to the contribution guidelines

  • I read and agree to the contribution guidelines

I would like to see docker images and github actions

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Description

I like this tool because it solves a problem I have when building Lambda functions for AWS: I only want to deploy them if anything changes, so deterministic builds are very helpful.

But it would be great if there were Docker images and GitHub Actions I could use :-)

I may be able to implement this, but are you interested in seeing this in the project?

[Bug]: Ignore own file for zipping

What happened?

When you zip a directory that also contains the target zip file, deterministic-zip will try to add the currently open file, which results in an infinitely growing file.

OS

Ubuntu

Problem description

This only happens if the zip file is located in the directory being zipped; nothing else is required.

Expected behaviour

When deterministic-zip encounters its own output file in the file list, the file should be ignored.
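
The expected behaviour can be sketched as follows. This is only an illustration in Python, not the tool's actual code; the helper name collect_files and the choice to compare resolved paths are assumptions made for this example.

```python
import os
import tempfile
from pathlib import Path


def collect_files(root: Path, archive: Path) -> list[Path]:
    """Return all files below root in a stable order, skipping the archive itself."""
    archive_resolved = archive.resolve()
    collected = []
    for current_dir, dir_names, file_names in os.walk(root):
        dir_names.sort()  # keep the traversal order stable
        for file_name in sorted(file_names):
            candidate = Path(current_dir) / file_name
            # Compare resolved paths so "./test.zip" and "test.zip" are treated alike.
            if candidate.resolve() == archive_resolved:
                continue
            collected.append(candidate)
    return collected


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.txt").write_text("hello")
    (root / "test.zip").write_bytes(b"")  # stands in for the open output archive
    names = [path.name for path in collect_files(root, root / "test.zip")]

print(names)
```

Filtering the output path before any file is opened avoids reading an archive that is still being written.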

Actual behaviour

The application tries to add the zip file while it is still growing, effectively resulting in an endless loop.

Steps to Reproduce

  1. Create a folder with some files
  2. Try to zip the current folder, e.g., deterministic-zip -r test.zip .
  3. The tool will try to add the file until the process is killed

Important Factoids

No response

References

No response

I read and agree to the contribution guidelines

  • I read and agree to the contribution guidelines

Include directory entries in the zip file

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Description

Due to the limitations of some zip implementations, it is sometimes necessary to include directory entries in the zip file. Additionally, in certain use cases an empty directory should be created when the zip file is extracted.
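
As a rough illustration of what a directory entry is, the following sketch writes one with Python's standard zipfile module: an entry whose name ends in a slash and that carries no data. The helper name add_directory_entry, the fixed timestamp and the permission bits are assumptions for this example and say nothing about how deterministic-zip would implement the feature.

```python
import io
import zipfile

# Fixed timestamp so the entry metadata stays reproducible.
FIXED_DATE_TIME = (1980, 1, 1, 0, 0, 0)


def add_directory_entry(archive: zipfile.ZipFile, name: str) -> None:
    """Write an explicit, empty directory entry (hypothetical helper)."""
    dir_name = name.rstrip("/") + "/"  # directory entries end with a slash
    info = zipfile.ZipInfo(filename=dir_name, date_time=FIXED_DATE_TIME)
    # Unix directory mode in the high bits plus the MS-DOS directory flag.
    info.external_attr = (0o40755 << 16) | 0x10
    archive.writestr(info, b"")


buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    add_directory_entry(archive, "empty")
    archive.writestr(zipfile.ZipInfo("src/main.txt", FIXED_DATE_TIME), b"content")

with zipfile.ZipFile(buffer) as archive:
    entries = [(info.filename, info.is_dir()) for info in archive.infolist()]

print(entries)
```

With such an entry present, extracting the archive creates the empty directory even though it contains no files.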

References

Make output less verbose by default

Description

By default, deterministic-zip outputs information about all files added, which clutters the output.

It would be nicer to display this information only in verbose mode, which would also make the output more consistent and easier to read.

Proposal for new outputs in verbose mode:

Action | Sample output
Added to file set | + {file-name} added to file set by '{feature name}'
Removed from file set by filter | - {file-name} removed from file set by '{feature name}'
Feature loaded | ◉ {Feature feature-name is active}
Feature not loaded | ◯ {Feature feature-name is not active}

Regular debug outputs start with debug:, directly added in log tool.

If it makes output easier to read at first glance, colored output, if supported by the target system, would be a benefit.

Code changes

  • Source files are not modifiable from outside
  • Source files can be changed with Add/Remove method
  • Add debug prefix for debug outputs, making it easy to distinguish

Benefits of new output format

Clear and consistent output allows effortless troubleshooting and makes debugging more straightforward if, e.g., the final zip in CI looks weird.

References

No response
