generate-git-repo

Generate Git repositories for testing purposes.

If you have tooling/scripts that read and interact with Git repositories, you might want to test them against reproducible variations.

Example

CLI:

cat ./example-input.json | generate-git-repo --bare ./path-to-new-repo

Result:

[Image: Example]

example-input.json:

[
  { "type": "commit",   "id": "a", "message": "Initial commit" },
  { "type": "commit",   "id": "b", "message": "Commit B",      "parents": ["a"] },
  { "type": "commit",   "id": "c", "message": "Commit C",      "parents": ["a"] },
  { "type": "merge",    "id": "d", "commits": ["b", "c"],      "tags": ["1.0.0"] },
  { "type": "commit",   "id": "e", "message": "Commit E",      "parents": ["d"],      "branches": ["master"] },
  { "type": "config",   "all_name": "Danny", "all_email": "[email protected]" },
  { "type": "commit",   "id": "f", "message": "Commit F",      "parents": ["d"] },
  { "type": "branch",   "name": "pull-request", "on": "f" }
]

Write a program to generate the input, and pipe it into generate-git-repo

The motivation for using stdin is to allow other programs to generate the list of commands. This is the intended purpose of generate-git-repo.

Suppose you have a Node.js script create-tree.js. It declares a recursive tree.

node ./create-tree.js | generate-git-repo --bare ./crazy-tree-repo

Or, if you'd rather save a compiled list of commands and feed that into the program, you can do that too.
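As a rough illustration of such a generator, here is a minimal Python sketch that plays the same role as create-tree.js. The file name create_tree.py, the function build_commands, the id scheme (root, root-l, root-r, ...) and the choice to put a branch on every leaf are all assumptions made for this example. Only the command fields (type, id, message, parents, branches) come from the documentation below.

```python
import json
import sys


def build_commands(depth):
    """Return commit commands forming a binary tree of commits of the given depth."""
    commands = [{"type": "commit", "id": "root", "message": "Initial commit"}]

    def add_children(parent_id, level):
        if level == depth:
            return
        for side in ("l", "r"):
            child_id = f"{parent_id}-{side}"
            command = {
                "type": "commit",
                "id": child_id,
                "message": f"Commit {child_id}",
                "parents": [parent_id],
            }
            # Assumption: give each leaf a branch so that every commit
            # stays reachable from a ref in the generated repository.
            if level + 1 == depth:
                command["branches"] = [child_id]
            commands.append(command)
            add_children(child_id, level + 1)

    add_children("root", 0)
    return commands


if __name__ == "__main__":
    # Write the command list as JSON to stdout so it can be piped.
    json.dump(build_commands(depth=3), sys.stdout, indent=2)
    sys.stdout.write("\n")
```

It would be used the same way as the Node.js script: python3 ./create_tree.py | generate-git-repo --bare ./crazy-tree-repo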

Command documentation

WORK IN PROGRESS

"type": "commit"

Creates a commit.

Fields:

  • id - Required. Commit identifier. This is NOT a Git commit hash, but rather a way for the generator to internally keep track of commits.
  • message - Optional. Commit message. Can be a single line, or multiple lines. If not specified, the message is the commit identifier.
  • parents - Optional. A list of parent commits. If not specified, creates an orphaned commit (e.g. for initial commits).
  • tree - Optional. An object where the key is the path, and the value is the file contents. It specifies the files and directories that should be in the commit. If not specified, the commit uses the default set of files (none by default).
  • branches - Optional. A list of branch names. All listed branch names will be set to this commit. Branches can also be created in the "type": "branch" command.
  • tags - Optional. A list of tag names. All listed tag names will be set as lightweight tags to this commit. Tags can also be created in the "type": "tag" command.

For any tree string values, the encoding is UTF-8.

Example:

{
  "type": "commit",
  "id": "4",
  "message": "Test",
  "parents": ["1", "2", "3"],
  
  "tree": {
    "hello.txt": "world",
    "directory/foo.txt": "bar",
    "directory/fizz.txt": "buzz"
  }
}
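To make the path keys in the example above more concrete, here is a small Python sketch of how a flat tree mapping implies directories. It is only an illustration, not the generator's actual implementation. The function name expand_tree is hypothetical, and splitting paths on "/" is an assumption based on the example.

```python
def expand_tree(tree):
    """Group a flat {path: contents} mapping into nested directories."""
    root = {}
    for path, contents in tree.items():
        # Everything before the last "/" names the directories to descend into.
        *directories, filename = path.split("/")
        node = root
        for name in directories:
            node = node.setdefault(name, {})
        # String values are encoded as UTF-8, as stated above.
        node[filename] = contents.encode("utf-8")
    return root
```

For the example commit, this groups foo.txt and fizz.txt under a single directory entry, next to hello.txt at the top level.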

"type": "branch"

Creates a branch at the reference. Branches can also be created in the "type": "commit" command.

Fields:

  • name - Required. The name of the branch. e.g. master
  • on - Required. Where to create the branch.
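If you generate the command list with a script, it can help to check branch targets before piping the list in. The Python sketch below assumes that "on" names the id of a commit or merge defined by an earlier command, as in the example input. The documentation above does not spell out what else "on" may accept, so treat this as a hypothetical helper, not the tool's own validation.

```python
def check_branch_targets(commands):
    """Return names of branches whose "on" target was not defined earlier."""
    known_ids = set()
    dangling = []
    for command in commands:
        if command["type"] in ("commit", "merge"):
            known_ids.add(command["id"])
        elif command["type"] == "branch" and command["on"] not in known_ids:
            dangling.append(command["name"])
    return dangling
```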

"type": "config"

Applies miscellaneous configuration, such as author/committer/tagger information.

Note: fields prefixed with all_ have the lowest precedence. e.g. If author_name and all_name are both set, then author_name will apply to the author, and all_name will apply to the committer and tagger.

Author/committer/tagger fields:

  • all_name - Optional. Sets the author, committer and tagger name.
  • all_email - Optional. Sets the author, committer and tagger email.
  • author_name - Optional. Sets the author name.
  • author_email - Optional. Sets the author email.
  • committer_name - Optional. Sets the committer name.
  • committer_email - Optional. Sets the committer email.
  • tagger_name - Optional. Sets the tagger name.
  • tagger_email - Optional. Sets the tagger email.

Other fields:

  • tree - Optional. A recursive object. Sets the default tree. (see "type": "commit" documentation)
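The precedence rule in the note above can be sketched in a few lines of Python. This is an illustration of the rule as documented, not the generator's code. The function name resolve_identity is hypothetical, and the sketch does not model several config commands being applied in sequence or any built-in defaults.

```python
def resolve_identity(config, role):
    """Pick the name and email for a role: "author", "committer" or "tagger"."""
    # A role-specific field wins; the all_ field is only a fallback.
    name = config.get(f"{role}_name", config.get("all_name"))
    email = config.get(f"{role}_email", config.get("all_email"))
    return name, email
```

With all_name and author_name both set, the author gets author_name while the committer and tagger fall back to all_name.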

"type": "merge"

Merges a commit with one or more other commits. Fast-forwards by default.

Commit fields:

  • id - Required. Commit identifier. This is NOT a Git commit hash, but rather a way for the generator to internally keep track of commits.
  • commits - Required. A list of commits to merge to. Can be a single commit, but is usually two or more commits. Cannot be empty.
  • message - Optional. Only used for merge commits, ignored for fast-forwards. Commit message. Can be a single line, or multiple lines. If not specified, the message is Merge commits '<commit1>', '<commit2>', ....
  • tree - Optional. Only used for merge commits, ignored for fast-forwards. An object where the key is the path, and the value is the file contents. It specifies the files and directories that should be in the commit. If not specified, the commit uses the default set of files (none by default).
  • branches - Optional. A list of branch names. All listed branch names will be set to this commit. Branches can also be created in the "type": "branch" command.
  • tags - Optional. A list of tag names. All listed tag names will be set as lightweight tags to this commit. Tags can also be created in the "type": "tag" command.
  • no_ff - Optional. If set to true, will always create a merge commit (disables fast-forward merges). Fast-forwards are enabled by default (i.e. "no_ff": false).
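As a sketch of how a script might emit merge commands, the Python below builds a merge command and reproduces the default message format described above. Both function names are hypothetical, and it is an assumption that the names in the default message are the generator's commit identifiers and not Git hashes.

```python
def default_merge_message(commits):
    """Format the documented default message for a merge commit."""
    quoted = ", ".join(f"'{commit_id}'" for commit_id in commits)
    return f"Merge commits {quoted}"


def merge_command(merge_id, commits, no_ff=False):
    """Build a "merge" command; "commits" must not be empty."""
    if not commits:
        raise ValueError("commits cannot be empty")
    command = {"type": "merge", "id": merge_id, "commits": list(commits)}
    # Only emit no_ff when it differs from the default (false).
    if no_ff:
        command["no_ff"] = True
    return command
```

Setting no_ff=True is the way to guarantee that a merge commit (and therefore a message and tree) is actually created.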

"FAQ"

Couldn't I just run a bunch of git commands to generate a test repo?

Yes, you definitely could! But in my experience, this is quite tricky for repos with non-trivial commit graphs.

Creating commits with git commands in a linear fashion works well, but you hit a snag as soon as you have branches and merges. If you want to create branching commits that hop across different commit parents, you could try git checkout <HASH> && git commit ... or git commit-tree, but you'd need to implement a lookup to remember the commit hashes. You could also tag or branch every commit you need to go back to, as long as you delete those refs afterwards. Either way, it's a lot of overhead work.

If you want to additionally populate each commit with different files, you'd also have to worry about the index (or resort to git plumbing commands like git hash-object and git mktree).

But I should be testing a logic layer in my program instead of the Git repository directly, right?

So in other words: [Mock Git DB] <-> [Logic] <-> [Tests] instead of [Actual Git DB] <-> [Logic] <-> [Tests]

And that's entirely valid.

The reason you may decide not to do it this way is you'd need to create the mock in the first place. It's the exact same problem with unit-testing functions that call SQL databases - most people just end up standing up a local database and doing a minimal amount of tests to ensure that integration works. Sure, leave the domain logic to unit tests, but the integration testing has to happen somewhere.

With Git, you'd need to reimplement all the stuff you're interested in: read commit logs, files, lists of branches, etc. And if your program calls the git commands directly, you might need to implement command-line arg parsing as well, or find some other way to abstract it in your program.

Why Rust?

I wanted a program that's standalone, portable across platforms, and not bound to any language runtime. It should additionally not require interfacing with the git command (based on my own benchmarks, using git plumbing commands can be up to 6x slower for large repos!).

The libgit2 bindings for Rust are quite solid, and they're used in popular projects like Cargo, Rust's package manager.

Building

You'll need Rust and Cargo installed on your machine. The easiest way to do this is with rustup: https://rustup.rs/

Dependencies

Cargo will handle all dependencies automatically. If there are problems building git2, it might be because of a missing C compiler. One way to quickly resolve this type of build error may be to install gcc, clang or similar. See also: https://crates.io/crates/cc

This project depends on git2: https://github.com/alexcrichton/git2-rs

Running locally

The following will run a debug build.

cargo run -- <arguments go here>

Installing locally

The following command will build the release binary and place it in a location controlled by Cargo. If Cargo was installed via rustup, this location should be in your PATH.

cargo install --path .
