godaddy / tartufo

Searches through git repositories for high entropy strings and secrets, digging deep into commit history

Home Page: https://tartufo.readthedocs.io/

License: GNU General Public License v2.0

Dockerfile 0.27% Shell 0.34% Python 99.40%
security-scanner security git secrets-detection secrets-scan entropy-checking secrets entropy hacktoberfest security-tools

tartufo's Introduction

tartufo searches through git repositories for secrets, digging deep into commit history and branches. This is effective at finding secrets accidentally committed. tartufo also can be used by git pre-commit scripts to screen changes for secrets before they are committed to the repository.

This tool goes through the entire commit history of each branch and checks each diff from each commit for secrets, both by regex and by entropy. For entropy checks, tartufo evaluates the Shannon entropy over both the base64 character set and the hexadecimal character set for every blob of text longer than 20 characters composed of those characters in each diff. If a high-entropy string longer than 20 characters is detected at any point, it is printed to the screen.
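As a rough illustration of the entropy check described above, here is a minimal sketch. The two character sets are the ones that appear in tartufo/scanner.py later on this page; the function names and the way runs are split are assumptions made for this example, not tartufo's actual implementation, and no scoring thresholds are shown because none are stated here.

```python
import math

BASE64_CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/="
HEX_CHARS = "1234567890abcdefABCDEF"
MIN_LENGTH = 20  # only runs longer than this are considered


def shannon_entropy(data: str, charset: str) -> float:
    """Return the Shannon entropy of `data` in bits per character.

    Only characters that belong to `charset` contribute to the sum.
    """
    if not data:
        return 0.0
    entropy = 0.0
    for char in charset:
        p = data.count(char) / len(data)
        if p > 0:
            entropy -= p * math.log2(p)
    return entropy


def candidate_strings(text: str, charset: str, min_length: int = MIN_LENGTH):
    """Yield runs of `charset` characters that are longer than `min_length`."""
    run = ""
    for char in text:
        if char in charset:
            run += char
        else:
            if len(run) > min_length:
                yield run
            run = ""
    if len(run) > min_length:
        yield run
```

A string made of one repeated character scores 0 bits per character, while a string that uses all 16 hex digits equally often scores 4, so random-looking tokens stand out from ordinary text.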

Documentation

Our main documentation site is hosted by Read The Docs, at https://tartufo.readthedocs.io.

Usage

Usage: tartufo [OPTIONS] COMMAND [ARGS]...

  Find secrets hidden in the depths of git.

  Tartufo will, by default, scan the entire history of a git repository for
  any text which looks like a secret, password, credential, etc. It can also
  be made to work in pre-commit mode, for scanning blobs of text as a pre-
  commit hook.

Options:
  --default-regexes / --no-default-regexes
                                  Whether to include the default regex list
                                  when configuring search patterns. Only
                                  applicable if --rules is also specified.
                                  [default: default-regexes]
  --entropy / --no-entropy        Enable entropy checks.  [default: entropy]
  --regex / --no-regex            Enable high signal regexes checks.
                                  [default: regex]
  --scan-filenames / --no-scan-filenames
                                  Check the names of files being scanned as
                                  well as their contents.  [default: scan-
                                  filenames]
  -of, --output-format [json|compact|text|report]
                                  Specify the format in which the output needs
                                  to be generated `--output-format
                                  json/compact/text/report`. Either `json`,
                                  `compact`, `text` or `report` can be
                                  specified. If not provided (default) the
                                  output will be generated in `text` format.
  -od, --output-dir DIRECTORY     If specified, all issues will be written out
                                  as individual JSON files to a uniquely named
                                  directory under this one. This will help
                                  with keeping the results of individual runs
                                  of tartufo separated.
  -td, --temp-dir DIRECTORY       If specified, temporary files will be
                                  written to the specified path
  --buffer-size INTEGER           Maximum number of issue to buffer in memory
                                  before shifting to temporary file buffering
                                  [default: 10000]
  --git-rules-repo TEXT           A file path, or git URL, pointing to a git
                                  repository containing regex rules to be used
                                  for scanning. By default, all .json files
                                  will be loaded from the root of that
                                  repository. --git-rules-files can be used to
                                  override this behavior and load specific
                                  files.
  --git-rules-files TEXT          Used in conjunction with --git-rules-repo,
                                  specify glob-style patterns for files from
                                  which to load the regex rules. Can be
                                  specified multiple times.
  --config FILE                   Read configuration from specified file.
                                  [default: tartufo.toml]
  --target-config/--no-target-config
                                  Enable or Disable processing of the config file in the
                                  repository or folder being scanned
                                  i.e. config files like tartufo.toml or pyproject.toml
                                  [default: target-config]
  -q, --quiet / --no-quiet        Quiet mode. No outputs are reported if the
                                  scan is successful and doesn't find any
                                  issues
  -v, --verbose                   Display more verbose output. Specifying this
                                  option multiple times will incrementally
                                  increase the amount of output.
  --log-timestamps / --no-log-timestamps
                                  Enable or disable timestamps in logging
                                  messages.  [default: log-timestamps]
  --entropy-sensitivity INTEGER RANGE
                                  Modify entropy detection sensitivity. This
                                  is expressed as on a scale of 0 to 100,
                                  where 0 means "totally nonrandom" and 100
                                  means "totally random". Decreasing the
                                  scanner's sensitivity increases the
                                  likelihood that a given string will be
                                  identified as suspicious.  [default: 75;
                                  0<=x<=100]
  --color / --no-color            Enable or disable terminal color. If not
                                  provided (default), enabled if output is a
                                  terminal (TTY).
  -V, --version                   Show the version and exit.
  -h, --help                      Show this message and exit.

Commands:
  pre-commit        Scan staged changes in a pre-commit hook.
  scan-remote-repo  Automatically clone and scan a remote git repository.
  scan-folder       Scan a folder.
  scan-local-repo   Scan a repository already cloned to your local system.

Contributing

All contributors and contributions are welcome! Please see our contributing docs for more information.

Attributions

This project was inspired by and built off of the work done by Dylan Ayrey on the truffleHog project.

tartufo's People

Contributors

bandrel, burgessa23, christarazi, cooperhammond, dclayton-godaddy, dependabot[bot], dxa4481, emayuri-godaddy, icco, janerikcarlsen, jgowdy, jhall1-godaddy, jwilhelm-godaddy, mayuriesha, mdayanc-godaddy, milo-minderbinder, mxhenry-godaddy, namithasind, paulodiovani, rami-alloush, rashmi-k-a, rbailey-godaddy, rdrey, renovate[bot], rscottbailey, sjacoby-godaddy, stephengroat, surbhishah, sushantmimani, tarkatronic

tartufo's Issues

Convert CI/CD to GitHub Actions

Feature Request

Is your feature request related to a problem? Please describe.

We currently use TravisCI for our builds. It's got a whole host of problems; far too many to go into right now. Suffice it to say, we don't want to use it anymore.

Describe the solution you'd like

All of the current functionality performed by Travis should be replicated by GitHub Actions. As an added bonus, we may be able to get the individual test steps (different tox environments) listed as separate build checks, making it easier to discover exactly what failed.

Describe alternatives you've considered

We've looked at CircleCI, but it's had its own set of problems with reliability, and recently on asherah we've had issues with visibility of builds. Plus, GitHub Actions is built right in here, and works super well.

Teachability, Documentation, Adoption, Migration Strategy

Very little should need to be changed in the docs for this. In fact, I'm not sure our CI/CD pipeline is documented at all right now. But we can leave that for another issue...

Add a command to automatically create an exclusion list

Feature Request

Is your feature request related to a problem? Please describe.

Right now, creating an exclusion list can be very time consuming, especially for a repository with a long history.
#71 (comment) calls that out. @sjacoby-godaddy wrote up a bunch of bash scripting to essentially do just this already. We should not require all users to implement this on their own.

Describe the solution you'd like

I should be able to run something like tartufo create-exclusion-list which will gather the signatures of all currently matched strings, and either

A. Print them to the command line, or
B. Inject them into the config file, creating one if it does not exist.

Heck, we could probably even give the user the option of which they prefer.

The generated list should probably contain comments showing what is excluded, to make auditing easier. For example:

[tool.tartufo]
exclude-signatures = [
  "842533f44cf32fb3d93a7f56227977aa4f16caafe82ace8f5f4de27d750c1ec1",  # tests/test_scanner.py: d15627104d...
  "d039c652f27c4d42026c5b5c9be31bfe368b283b24248e98d92f131272580053",  # tartufo/scanner.py: ABCDEFGHIJ...
]
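A minimal sketch of how such a commented list could be rendered. The input shape (tuples of signature, file path, and matched string) and the function name are assumptions for illustration; how tartufo computes signatures is not shown here.

```python
def format_exclusions(findings):
    """Render findings as a [tool.tartufo] exclude-signatures snippet.

    `findings` is an iterable of (signature, file_path, matched_string)
    tuples; this shape is hypothetical and chosen only for the example.
    """
    lines = ["[tool.tartufo]", "exclude-signatures = ["]
    for signature, file_path, matched in findings:
        # Show only the first ten characters so the audit comment
        # does not repeat the full matched string.
        preview = matched[:10] + "..."
        lines.append(f'  "{signature}",  # {file_path}: {preview}')
    lines.append("]")
    return "\n".join(lines)
```

The result could either be printed to the command line (option A) or written into the config file (option B).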

Addition of Pull request template

📃 Summary

Community profile suggests the use of a Pull request template

Expected documentation

Pull request template should exist and provide value to the PR process

Add trojan packages to blacklist

Feature Request

In addition to flagging disclosure of secrets, we could leverage tartufo to flag usage of known-bad libraries etc.

Is your feature request related to a problem? Please describe.

Regarding https://www.zdnet.com/article/two-malicious-python-libraries-removed-from-pypi/ which identifies python3-dateutil and jeIlyfish (note funky spelling of the latter), it seems useful to blacklist these names so we'd catch any accidental usage.

Describe the solution you'd like

Add these package names to list of blacklisted expressions.

Describe alternatives you've considered

I haven't considered alternatives -- this is a knee-jerk response. ;-)

Teachability, Documentation, Adoption, Migration Strategy

As a bonus, it would be nice if we were able to explain why these strings are blacklisted, but that presumably would require more infrastructure than just adding them to the existing blacklist. It is unfortunate that a user, seeing tartufo scream about "python3-dateutil" (for example), really won't have any idea why the error is being raised.

One approach might be to have multiple lists and report the name of the list as part of the finding, so that you might see "PGP secret key (default blacklist)" and "python3-dateutil (trojan blacklist)" tags or something similar. (Or just reformulate the existing list to include a tag/description/explanation with each pattern.)
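The "multiple lists" idea could look roughly like the sketch below. The list names follow the tags suggested above; the data structure, the function name, and the exact patterns are assumptions for illustration only and do not reflect tartufo's existing pattern file.

```python
import re

# Hypothetical layout: list name -> {finding name -> regex}
PATTERN_LISTS = {
    "default blacklist": {
        "PGP secret key": r"-----BEGIN PGP PRIVATE KEY BLOCK-----",
    },
    "trojan blacklist": {
        "python3-dateutil": r"\bpython3-dateutil\b",
        "jeIlyfish": r"\bjeIlyfish\b",
    },
}


def find_matches(text: str):
    """Return findings tagged with the name of the list they came from."""
    findings = []
    for list_name, patterns in PATTERN_LISTS.items():
        for finding_name, pattern in patterns.items():
            if re.search(pattern, text):
                findings.append(f"{finding_name} ({list_name})")
    return findings
```

With this shape, a user who sees "python3-dateutil (trojan blacklist)" at least knows which list raised the finding.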

Support Scanning Directory

Feature Request

Is your feature request related to a problem? Please describe.

We would like to scan the current directory. From this scan, we can remediate any active secret usage and allow tools like bfg to clean git history. We only care about the secrets currently being used.

Describe the solution you'd like

A CLI command like tartufo scan ..

Describe alternatives you've considered

Rumor has it we can use bfg to temporarily clean history, then rerun tartufo, but that is painful.

Teachability, Documentation, Adoption, Migration Strategy

Something like, "To run a scan against a directory, use tartufo scan [{options}] {directory}."

File Extension-based Matching

Feature Request

Is your feature request related to a problem? Please describe.

Detecting literal passwords in files such as YAML, JSON, and JS requires different matching for each format.

Examples:

  • yaml: password: mypassword
  • json: "password": "mypassword"
  • js: {password: 'mypassword'}

Describe the solution you'd like

Add the file extension to the regexes file.

- message: contains a password or token
  match: '(password|passwd|pwd|token|pass)(["])?[=:]'
  conditions:
    - file: '*.js'
      match: '(password|passwd|pwd|token|pass)([''" \t]*)[=:]([ \t]*)(([''"][^''"])|([\[]))'
    - file: '*.json'
      match: '(password|passwd|pwd|token|pass)([''" \t]*)[=:]([ \t]*)(([''"][^''"])|([\[]))'
    - file: '*.py'
      match: '(password|passwd|pwd|token|pass)([''" \t]*)[=:]([ \t]*)(([''"][^''"])|([\[]))'
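A minimal sketch of how a rule shaped like the one above might be evaluated. The dictionary mirrors the proposed structure, with the YAML '' escapes unfolded into plain single quotes; the function names and the choice of fnmatch for the file globs are assumptions, not an existing tartufo feature.

```python
import fnmatch
import re

# Pattern proposed above for files with quoted values
QUOTED_VALUE = r"""(password|passwd|pwd|token|pass)(['" \t]*)[=:]([ \t]*)((['"][^'"])|([\[]))"""

RULE = {
    "message": "contains a password or token",
    "match": r'(password|passwd|pwd|token|pass)(["])?[=:]',
    "conditions": [
        {"file": "*.js", "match": QUOTED_VALUE},
        {"file": "*.json", "match": QUOTED_VALUE},
        {"file": "*.py", "match": QUOTED_VALUE},
    ],
}


def pattern_for(rule: dict, file_name: str):
    """Pick the first condition whose glob matches, else the default pattern."""
    for condition in rule["conditions"]:
        if fnmatch.fnmatch(file_name, condition["file"]):
            return re.compile(condition["match"])
    return re.compile(rule["match"])


def line_matches(rule: dict, file_name: str, line: str) -> bool:
    """Report whether `line` triggers the rule for this kind of file."""
    return pattern_for(rule, file_name).search(line) is not None
```

For a .js file, a line such as password: getenv() is not flagged because no quoted literal follows the colon, while the default pattern would flag the same line in a file type without a specific condition.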

Describe alternatives you've considered

Teachability, Documentation, Adoption, Migration Strategy

Optimize handling of temp files (vs --cleanup)

This annoys me so much. ;-)

  1. Can we make --cleanup the default behavior, at least if no findings are found? 99.999% of the time the repo is clean and the scratch directory is empty, so why should the user have to expend extra effort to remove the empty directory?

Frankly, I'm not sure that --cleanup shouldn't be the default all the time, anyway. Findings are reported on stdout anyway, so the utility here seems to be the case where you expect to have massive numbers of findings and want to save them conveniently for later review.

My proposal would be this:

  • eliminate --cleanup (both the command flag and the util.clean_outputs() function)
  • push creation of output_dir from scanner's find_staged() and find_strings() into handle_results() where it is done on a just-in-time basis when needed
  • add a new flag like --save-to-dir which enables calling handle_results(), and which defaults to false/none, which means we never call handle_results() unless the user asks to save the files, and never create the directory unless we have something to put into it.
  2. In addition, or alternatively, could we at least include "tartufo" in the name (say, as a prefix argument to mkdtemp)?
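A minimal sketch of the just-in-time idea, combined with the "tartufo" prefix suggestion. The function name and the one-JSON-file-per-issue layout are assumptions for illustration; this is not the current handle_results() code.

```python
import json
import pathlib
import tempfile
from typing import Iterable, Optional


def save_results(issues: Iterable[dict], parent: Optional[str] = None) -> Optional[pathlib.Path]:
    """Write each issue to its own JSON file, creating the directory lazily.

    Returns the directory that was created, or None when there was
    nothing to save (in which case no directory is ever created).
    """
    output_dir = None
    for index, issue in enumerate(issues):
        if output_dir is None:
            # Create the directory only once the first issue shows up,
            # and make its origin obvious from the name.
            output_dir = pathlib.Path(tempfile.mkdtemp(prefix="tartufo-", dir=parent))
        (output_dir / f"{index}.json").write_text(json.dumps(issue))
    return output_dir
```

With this shape there is nothing to clean up after a clean scan, which is the common case described above.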

Allow Entropy b64/hex Score Override

Feature Request

Is your feature request related to a problem? Please describe.

Add the ability to override the current hard coded entropy b64/hex scores. This will enable us to specify the level of entropy risk.

Describe the solution you'd like

Add b64/hex entropy options to toml config files or CLI.

Describe alternatives you've considered

Teachability, Documentation, Adoption, Migration Strategy

Remove support for Python 3.5

Python 3.5 is about to hit EOL.
Python 3.6 is better and generally available.
We already enforce >= 3.6 for aws-okta-processor.
Python 3.6 has the shiny happy new dictionary implementation.
Python 3.6 has blake2 built in, so one fewer external dependency.
Python 3.6 has class variable annotations. We can stop using the comments for type annotations. Yay!

Let's do this.

Allow configuration via file

To help with guaranteed repeatability, it should be possible to configure tartufo by way of a configuration file, as opposed to always having to specify options on the command line.

My personal recommendation would be to add the configuration to the pyproject.toml file exclusively, in a [tool.tartufo] section, as with black. But other standard places which would be acceptable include:

  • setup.cfg in a [tartufo] section
  • .tartufo.yml

Some tools allow configuring from 3 distinct locations. I think that, since we are starting fresh, a single location would be ideal. And given that pyproject.toml is becoming The Way for configuration, that is my recommendation.

The goal is that all command line options should be configurable in this file, or files, such that simply running tartufo will always produce the same results, with the desired options.

The current documentation is fragmented, scattered, and confusing

📃 Summary

I wrote most of the documentation myself, but even I don't know where most of the bits are. It's confusing and not entirely helpful for the most part.

Expected documentation

I think it would be helpful to take a holistic view at revamping the docs to be more user friendly. With so much changing in v2.0, I think this is the perfect opportunity to work on this.

Would be good to have an option to exclude commits related to adding/removing git submodules

Feature Request

Would be good to have an option to exclude commits related to adding/removing git submodules

Is your feature request related to a problem? Please describe.

When adding submodules, git adds a diff which includes the commit hash of the submodule, similar to the following:

@@ -0,0 +1 @@\n+Subproject commit 821b2b1037dd11d4afb61e1eed870ca4e2fbf608\n

This commit is getting reported by tartufo scan.

An example of an issue reported by tartufo:

{
  "file_path": "packages/a-submodule-package",
  "matched_string": "821b2b1037dd11d4afb61e1eed870ca4e2fbf608",
  "diff": "@@ -0,0 +1 @@\n+Subproject commit 821b2b1037dd11d4afb61e1eed870ca4e2fbf608\n",
  "signature": "eb27ebd567f1d17149e4cc9b2221490e0d5b651c121f06c22a0364444ce8b28c",
  "issue_type": "High Entropy",
  "issue_detail": null,
  "commit_time": "2019-11-07 15:22:22",
  "commit_message": "add submodule",
  "commit_hash": "e10234695b3cdbd79cc9575111c730cd2863bf3f",
  "branch": "origin/master"
}

More information on git submodules can be found here - https://git-scm.com/book/en/v2/Git-Tools-Submodules

Describe the solution you'd like

The diff matches the following pattern:
@@ -0,0 +1 @@\n+Subproject commit 821b2b1037dd11d4afb61e1eed870ca4e2fbf608\n

I think that if tartufo excludes this particular diff pattern, it may solve the issue.
Also look at https://git-scm.com/book/en/v2/Git-Tools-Submodules to check whether there is a better alternative.
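One way to exclude that diff pattern is to drop the "Subproject commit" lines before the entropy check runs. The sketch below assumes the diff is available as a plain string, as in the JSON above; the function name is hypothetical.

```python
import re

# An added or removed line of the form "Subproject commit <40 hex digits>"
SUBMODULE_LINE = re.compile(r"^[+-]Subproject commit [0-9a-f]{40}$")


def strip_submodule_lines(diff: str) -> str:
    """Remove submodule pointer lines from a diff before it is scanned."""
    kept = [line for line in diff.splitlines() if not SUBMODULE_LINE.match(line)]
    return "\n".join(kept)
```

Only lines that consist entirely of the submodule pointer are dropped, so a 40-character hex string anywhere else in the diff is still scanned.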

Describe alternatives you've considered

As a workaround I excluded the submodules paths (Of the directory, not the files within the directories).

Teachability, Documentation, Adoption, Migration Strategy

tests.test_scanner.ScannerTests.test_return_correct_commit_hash scans actual git history

๐Ÿ› Bug Report

The test currently looks like this:

    def test_return_correct_commit_hash(self):
        """FIXME: Split this test out into multiple smaller tests w/o real clone
        FIXME: Also, this test will continue to grow slower the more times we commit

        Necessary:
            * Make sure all commits are checked (done)
            * Make sure all branches are checked
            * Make sure `diff_worker` flags bad diffs
            * Make sure all bad diffs are returned
        """
        # Start at commit d15627104d07846ac2914a976e8e347a663bbd9b, which
        # is immediately followed by a secret inserting commit:
        # https://github.com/dxa4481/truffleHog/commit/9ed54617547cfca783e0f81f8dc5c927e3d1e345
        since_commit = "d15627104d07846ac2914a976e8e347a663bbd9b"
        commit_w_secret = "9ed54617547cfca783e0f81f8dc5c927e3d1e345"
        xcheck_commit_w_scrt_comment = "OH no a secret"

        tmp_stdout = six.StringIO()
        bak_stdout = sys.stdout

        # Redirect STDOUT, run scan and re-establish STDOUT
        sys.stdout = tmp_stdout
        try:
            # We have to clone tartufo mostly because TravisCI only does a shallow clone
            repo_path = util.clone_git_repo("https://github.com/godaddy/tartufo.git")
            try:
                scanner.find_strings(
                    str(repo_path),
                    since_commit=since_commit,
                    print_json=True,
                    suppress_output=False,
                )
            finally:
                shutil.rmtree(repo_path)
        finally:
            sys.stdout = bak_stdout

        json_result_list = tmp_stdout.getvalue().split("\n")
        results = [json.loads(r) for r in json_result_list if bool(r.strip())]
        filtered_results = [
            result for result in results if result["commit_hash"] == commit_w_secret
        ]
        self.assertEqual(1, len(filtered_results))
        self.assertEqual(commit_w_secret, filtered_results[0]["commit_hash"])
        # Additionally, we cross-validate the commit comment matches the expected comment
        self.assertEqual(
            xcheck_commit_w_scrt_comment, filtered_results[0]["commit"].strip()
        )

No mocks, reading from the file system, checking the actual git history. The biggest problem with this is that, as the history of this project grows, this test continues to take longer.

This needs to be fixed, which may include some actual code cleanup, as the current code is not the most testable.

Regex-based Exclusions

Feature Request

Is your feature request related to a problem? Please describe.

To avoid noise in a scan, we'd like to provide exclusions using regular expressions on the contents of files.

Describe the solution you'd like

An exclusions file that can contain the regular expression exclusion rules.

For example, to ignore a git hash in a url:

'([ =''/"]+)[a-z0-9]{40}([/''" ])?$'
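A minimal sketch of applying such exclusion rules line by line. The pattern is the one above with the '' escapes unfolded into plain single quotes; the list and function names are assumptions for illustration.

```python
import re

# Exclusion rules as they might be loaded from an exclusions file
EXCLUSIONS = [
    re.compile(r"""([ ='/"]+)[a-z0-9]{40}([/'" ])?$"""),
]


def is_excluded(line: str) -> bool:
    """Return True when any exclusion rule matches the line."""
    return any(pattern.search(line) for pattern in EXCLUSIONS)
```

Note that this particular pattern excludes any 40-character run of lowercase letters and digits at the end of a line, not only git hashes, so it trades some coverage for less noise.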

Describe alternatives you've considered

Teachability, Documentation, Adoption, Migration Strategy

The new implementation is a bit slower than the old

๐Ÿ› Bug Report

When self-scanning tartufo, I found that it was taking a bit longer (using time tartufo to measure) to run a full scan. In general, it changed from about 13 seconds to about 20 seconds. I did some preliminary testing and wasn't able to find any obvious reasons why. It'd be good to go through at some point with a fine-toothed profiling comb and see what can be optimized in here.

On a fun note, I found that the @lru_cache decorators I used did shave about a second off the run time! Yay!

It's worth mentioning that I built the new implementation with the intention of optimizing for memory rather than speed, using things such as generators etc. I don't know if some of this ended up being detrimental to the speed. But may be worth investigating.

To Reproduce

Run the following commands and you'll be able to see the discrepancy.

git checkout v1.x
time tartufo
git checkout master
time tartufo

Expected Behavior

The new version should be a similar speed, if not faster. Certainly not nearly twice as slow.

Documentation should be expanded and published to RTD

📃 Summary

Right now, the documentation consists of nothing but a README.md. While this covers some of the basics, it does not go very deep into things like methodology, configuration, code structure, etc.

Expected documentation

It would be nice to have a full documentation site published to https://readthedocs.org/. This is the standard for Python projects.

There is a Getting Started guide here: https://docs.readthedocs.io/en/stable/intro/getting-started-with-sphinx.html

If a `pyproject.toml` is present, `tartufo.toml` is never checked for configuration

๐Ÿ› Bug Report

In Python projects, the pyproject.toml file is becoming somewhat ubiquitous. It is being used for configuration of all forms of tooling, and even tartufo allows users to place their config in this file. However, if users opt to, instead, put their tartufo config into a tartufo.toml file when they already have a pyproject.toml file, this will have no effect.

This is especially problematic once users start using hash-based signatures. These signatures, themselves, will get detected and reported as high entropy strings, meaning that the tartufo configuration file itself needs to be excluded from scanning. In practice, it should generally be perfectly acceptable to exclude the tartufo.toml, but excluding an entire pyproject.toml, which could have some legitimate secrets, should be avoided.

To Reproduce

  • Create a pyproject.toml file, even an empty one
  • Create a tartufo.toml file, with a [tool.tartufo] section and actual config rules
  • Run tartufo
  • The configuration found in the tartufo.toml will not be read nor used.

Expected Behavior

If a tartufo.toml is present, that should likely take precedence over pyproject.toml. Also, a pyproject.toml with no [tool.tartufo] section should not prevent the tartufo.toml from being read.

The simplest and most obvious fix might just be to swap the order in which the files are checked: Check for a tartufo.toml first; if that doesn't exist, check for a pyproject.toml. This will need to be done in both the config.read_pyproject.toml, and in the code that checks repositories for their local exclusion rules. (Does this still exist? Did this get removed in the work up to 2.0?)

Refine the error message when unable to read from remote repository

Feature Request

Refine the error message when unable to read from remote repository

Is your feature request related to a problem? Please describe.

If the remote git repository is not reachable, either because of a network or VPN problem or because the remote is not configured correctly, running tartufo scan results in an error like the following:

Traceback (most recent call last):
  File "/usr/local/bin/tartufo", line 10, in <module>
    sys.exit(main())
  File "/Library/Python/3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/Library/Python/3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/Library/Python/3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Library/Python/3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Library/Python/3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/Library/Python/3.8/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context().obj, *args, **kwargs)
  File "/Library/Python/3.8/site-packages/click/decorators.py", line 21, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/Users/sambhav/Library/Python/3.8/lib/python/site-packages/tartufo/commands/scan_local_repo.py", line 40, in main
    issues = scanner.scan()
  File "/Users/sambhav/Library/Python/3.8/lib/python/site-packages/tartufo/scanner.py", line 188, in scan
    for chunk in self.chunks:
  File "/Users/sambhav/Library/Python/3.8/lib/python/site-packages/tartufo/scanner.py", line 306, in chunks
    branches = self._repo.remotes.origin.fetch()
  File "/Library/Python/3.8/site-packages/git/remote.py", line 797, in fetch
    res = self._get_fetch_info_from_stderr(proc, progress)
  File "/Library/Python/3.8/site-packages/git/remote.py", line 676, in _get_fetch_info_from_stderr
    proc.wait(stderr=stderr_text)
  File "/Library/Python/3.8/site-packages/git/cmd.py", line 408, in wait
    raise GitCommandError(self.args, status, errstr)
git.exc.GitCommandError: Cmd('git') failed due to: exit code(128)
  cmdline: git fetch -v origin
  stderr: 'fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.'

Describe the solution you'd like

It would be better if tartufo wraps the error and gives a more refined error message when this happens.
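A minimal sketch of wrapping the failure. GitCommandError is the GitPython exception shown in the traceback above; RemoteUnreachable, fetch_remote, and the message wording are hypothetical names introduced for this example, and the fallback class exists only so the sketch runs where GitPython is not installed.

```python
try:
    from git.exc import GitCommandError  # provided by GitPython
except ImportError:
    class GitCommandError(Exception):
        """Stand-in so this sketch still runs without GitPython."""


class RemoteUnreachable(Exception):
    """Hypothetical error type that carries a user-facing message."""


def fetch_remote(fetch):
    """Call `fetch` and translate git failures into a readable error.

    `fetch` is any zero-argument callable, for example a bound
    `repo.remotes.origin.fetch` as seen in the traceback above.
    """
    try:
        return fetch()
    except GitCommandError as exc:
        raise RemoteUnreachable(
            "Unable to fetch from the remote repository. Check your network "
            "or VPN connection and that the 'origin' remote is configured correctly."
        ) from exc
```

The command-line entry point could then catch RemoteUnreachable and print only its message, while the original GitCommandError stays available as the exception's cause for verbose output.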

Describe alternatives you've considered

NA

Teachability, Documentation, Adoption, Migration Strategy

Users would see a clearer error message and would be able to identify the cause of the issue quickly.

Add support for tagging repository with scan results

Feature Request

After scanning a repository with tartufo, I would like to add an assertion to the repository that a scan was completed (successfully). Presumably this assertion would include the tartufo version and arguments.

Is your feature request related to a problem? Please describe.

Suppose I am migrating a repository to a new environment. I, the owner, will have run tartufo to ensure the repo is "clean" prior to migration. It would be nice to avoid re-running a potentially time- and resource-intensive full history scan.

Describe the solution you'd like

  • When a scan completes and finds no problems, it optionally creates a tag (or whatever) in the target repository that asserts this fact.
  • Finding a tartufo assertion in the commit history (as a tag?) would be more convenient than, and at least as secure as, communicating a head commit OOB and claiming "it's clean"
  • (scope creep already!) Optionally, subsequent tartufo scans could be short-circuited by not progressing backwards in time beyond an encountered successful scan assertion.

This presents an open problem (caveat emptor) regarding treatment of encountered scan tags:

  • Do I trust the person who created the tag (not to have falsified it)?
  • Do I like/accept (or can I even see) the flags used by the scan?

Describe alternatives you've considered

I haven't, really. But "repeating scans in their entirety to trust-but-verify" seems heavy-handed, and "well just email me a commit and tell me you did it" seems a little flaky.

Teachability, Documentation, Adoption, Migration Strategy

Probably something like:

tartufo --tag-if-passed ...

and optionally

tartufo --stop-at-previous-scan ...

Drop Python 2 support

Feature Request

Is your feature request related to a problem? Please describe.

Python 2.7 is officially EOL as of Jan 1, 2020. And we have to go through all kinds of effort to keep things working against it. We should explicitly stop that.

Describe the solution you'd like

Down with Python 2.7! Remove the future imports! Remove the backports!

And since we'll then be Python 3.5+, we can start using PEP 484 style type hints for function signatures, instead of all the comments. We will have to keep the comments for variables though, until such time as we exclusively support 3.6+.
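To make the difference concrete, here is the same function written both ways. The function itself is a made-up example, not code from tartufo.

```python
from typing import List


# Comment-style hints, as required while Python 2.7 is still supported
def find_long_words_py2(text, min_length=20):
    # type: (str, int) -> List[str]
    return [word for word in text.split() if len(word) > min_length]


# PEP 484 annotations in the signature, possible once Python 2 is dropped
def find_long_words(text: str, min_length: int = 20) -> List[str]:
    # Variable annotations need Python 3.6+, so on 3.5 this stays a comment
    results = []  # type: List[str]
    for word in text.split():
        if len(word) > min_length:
            results.append(word)
    return results
```

Both forms are understood by type checkers; the second keeps the types next to the parameters they describe.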

Describe alternatives you've considered

The alternative is... not doing it. Which is just not so much an option.

Teachability, Documentation, Adoption, Migration Strategy

This will be a new major version, so it will be easy to say that version 2.0+ only supports Python 3.5+, and if Python 2.7 support is needed, then users will have to stick to 1.x.

Diff output shows additions as deletions and vice versa

๐Ÿ› Bug Report

It appears that the diff output from tartufo is produced in the wrong direction; lines which were added in a commit are shown as deleted, and lines which were deleted in the commit are shown as added. For example:

~~~~~~~~~~~~~~~~~~~~~
Reason: High Entropy
Filepath: tartufo/scanner.py
Branch: origin/master
Date: 2019-12-10 12:34:56
Hash: 2b46d80fcff3defb102dd0a1b844c7cbb521767a
Commit: Remove backport imports

@@ -1,12 +1,15 @@
 # -*- coding: utf-8 -*-

+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
 import datetime
 import enum
 import hashlib
 import json
 import math
 import os
-import pathlib
 import tempfile
 import uuid
 from typing import cast, Dict, Iterable, List, Optional, Pattern, Set, Union
@@ -17,6 +20,11 @@ import toml
 from tartufo import config
 from tartufo.util import style_ok, style_warning

+try:
+    import pathlib
+except ImportError:
+    import pathlib2 as pathlib  # type: ignore
+

 BASE64_CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/="
 HEX_CHARS = "1234567890abcdefABCDEF"

~~~~~~~~~~~~~~~~~~~~~

As you can see from the commit message, this actually removed the backward compatibility imports, but this shows that they were added in this commit.

I tested this in truffleHog as well, and it exhibits the same behavior.

To Reproduce

From the root of a tartufo clone, run the following:

tartufo --repo-path . --since-commit 16312b6935d98cfb24c9493dcdfa85593196f065 --cleanup

Expected Behavior

The diff should show code alterations as they actually happened, rather than reversed.
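To make the direction problem concrete, here is a minimal sketch using the standard library's difflib rather than GitPython, so it does not show where the bug lives in tartufo. It only illustrates that swapping the two sides of a diff turns every deletion into an addition. The helper name `changed_lines` and the sample lines are assumptions made for the example.

```python
import difflib

# Contents before and after a commit that removes one import.
parent = ["import os", "import pathlib", "import tempfile"]
commit = ["import os", "import tempfile"]


def changed_lines(old, new):
    """Return only the added/removed lines of a unified diff from old to new."""
    return [
        line
        for line in difflib.unified_diff(old, new, lineterm="")
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


# Diffing parent -> commit reports the import as removed.
forward = changed_lines(parent, commit)

# Swapping the arguments reports the same line as added,
# which matches the symptom described in this report.
reversed_ = changed_lines(commit, parent)
```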

Environment

I have the following configuration in my pyproject.toml (the same which is checked into the tartufo repo):

[tool.tartufo]
repo-path = "."
json = false
cleanup = true
regex = true
entropy = true

Support scanning bare repos

Currently, attempting to scan a bare git repo, using --repo_path, produces an error along the lines of:

Traceback (most recent call last):
  File "/home/jwilhelm/Documents/workspace/tartufo/.venv/bin/tartufo", line 11, in <module>
    load_entry_point('tartufo', 'console_scripts', 'tartufo')()
  File "/home/jwilhelm/Documents/workspace/tartufo/tartufo/cli.py", line 58, in main
    path_exclusions=path_exclusions,
  File "/home/jwilhelm/Documents/workspace/tartufo/tartufo/scanner.py", line 287, in find_strings
    for curr_commit in repo.iter_commits(branch_name, max_count=max_depth):
  File "/home/jwilhelm/Documents/workspace/tartufo/.venv/lib64/python3.7/site-packages/git/objects/commit.py", line 278, in _iter_from_process_or_stream
    finalize_process(proc_or_stream)
  File "/home/jwilhelm/Documents/workspace/tartufo/.venv/lib64/python3.7/site-packages/git/util.py", line 332, in finalize_process
    proc.wait(**kwargs)
  File "/home/jwilhelm/Documents/workspace/tartufo/.venv/lib64/python3.7/site-packages/git/cmd.py", line 414, in wait
    raise GitCommandError(self.args, status, errstr)
git.exc.GitCommandError: Cmd('git') failed due to: exit code(128)
  cmdline: git rev-list --max-count=1000000 1/head --
  stderr: 'fatal: bad revision '1/head'

This is because a bare repo is a wholly different structure from a normal clone, and produces different results from git operations. The specific problem causing the error here is this:

>>> repo = git.Repo('tartufo.git')
>>> for branch in repo.remotes.origin.fetch():
...     print(branch.name)
... 
master
split_tests
1/head
10/head
11/head
16/head
19/head
2/head
20/head
21/head
22/head
23/head
24/head
25/head
3/head
8/head
9/head
v0.0.1
v0.0.2
>>> repo = git.Repo('../tartufo')
>>> for branch in repo.remotes.origin.fetch():
...     print(branch.name)
... 
origin/master
origin/split_tests
>>>

All of the X/head references are not actual valid git revisions, and so tartufo chokes on them.

We should figure out a way to either scan all refs, or only scan actual branches.
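One possible shape for the "only scan actual branches" option is sketched below. It assumes the problematic refs always look like `<number>/head`, as in the listing above; the helper name `scannable_refs` is hypothetical, and a real fix would more likely ask git which refs are branches instead of pattern matching on names.

```python
import re
from typing import Iterable, List

# Assumption: pull request refs appear as "<number>/head" in a bare repo.
PULL_REQUEST_REF = re.compile(r"\d+/head")


def scannable_refs(ref_names: Iterable[str]) -> List[str]:
    """Drop refs that look like pull request heads, keep everything else."""
    return [name for name in ref_names if not PULL_REQUEST_REF.fullmatch(name)]
```

Note that this keeps tags such as v0.0.1 in the list; whether those should be scanned is a separate decision.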

The searchOrg script should become a first class citizen

Feature Request

Is your feature request related to a problem? Please describe.

Right now, searchOrg.py exists under the scripts folder as a completely isolated entity. As the code evolves, this is guaranteed to break, because it's not currently testable. It's also not ideal to run, since you have to manually install extra dependencies, which are not documented.

Also, the inclusion of all the regex rules in the script is somewhat conspicuous, since all others have been moved out to an external module.

This current structure is just not maintainable.

Describe the solution you'd like

The searchOrg script should be migrated either to a console_script, the way tartufo itself is exposed, or moved out to an external package. Something like tartufo-org-scanner.

Describe alternatives you've considered

The only other option I can see would be to leave it where it is and try to keep it up to date in-place. But this is less than ideal, for reasons mentioned up in the problem section.

Teachability, Documentation, Adoption, Migration Strategy

Given that there is currently no documentation for this script, I don't think there's any "migration path", per se. We would just be implementing a new entry point; a whole new feature.

The code itself is poorly and inconsistently documented

📃 Summary

Most of the code, including code currently in PR (#67), does not have docstrings. This makes the code overall harder to understand, and makes for a steeper developer onboarding curve.

Expected documentation

All modules, all functions, all classes, and all methods should have docstrings describing them using conventions specified in PEP 257, and Sphinx Annotations.

Ideally this will be checked and enforced via a pylint plugin, if such a thing exists. I'm pretty sure it will, since one does exist for flake8.

Remove log location output if no security risks are detected

Feature Request

Is your feature request related to a problem? Please describe.

No.

Describe the solution you'd like

By default, when no security risks are discovered Tartufo should not output a log location. This should be opt-in using a --output-log-location flag.

Describe alternatives you've considered

Nothing.

Teachability, Documentation, Adoption, Migration Strategy

I envision something like tartufo --regex --output-log-location to show log location when no risks are found. Otherwise, no output when no risks are found.

When cloning a remote repo, config should be read from its pyproject.toml

We will soon be able to read default configuration from a local pyproject.toml file, but this leaves a bit of a gap:

When you invoke tartufo with a git_url, it will clone that url into a local folder, and then scan that folder. But if that repo contains a pyproject.toml with a [tool.tartufo] section, we don't respect any of those options.

I don't think it makes sense, in this scenario, to read all options. But I think it does make sense to scan select options. The options I think we should attempt to read are:

  • rules -- files containing regexes
  • include-files -- Files to be included in the scan
  • exclude-files -- Files to be excluded from the scan
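A minimal sketch of that filtering step follows. It assumes the cloned repository's pyproject.toml has already been parsed into a dictionary by whatever TOML reader is in use; the function and constant names are hypothetical.

```python
from typing import Any, Dict, Mapping

# The only keys honoured from a cloned repository's [tool.tartufo] table.
REMOTE_SAFE_OPTIONS = ("rules", "include-files", "exclude-files")


def select_remote_options(pyproject: Mapping[str, Any]) -> Dict[str, Any]:
    """Pick the allowed options out of a parsed pyproject.toml mapping."""
    section = pyproject.get("tool", {}).get("tartufo", {})
    return {key: section[key] for key in REMOTE_SAFE_OPTIONS if key in section}
```

Anything else in the remote file, such as repo-path or json, is ignored, so a scanned repository cannot change how the scan itself is invoked.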

Config is no longer read from remote repositories upon scan

🐛 Bug Report

In v1.x, a target repository would be checked for the presence of a pyproject.toml/tartufo.toml in order to gather that repository's exclusion rules: https://github.com/godaddy/tartufo/blob/v1.x/tartufo/scanner.py#L360-L385

This functionality appears to have been missed in the lead up to 2.0.

To Reproduce

Running against tartufo itself will reproduce this issue:

> tartufo -od ~/temp scan-remote-repo git@github.com:godaddy/tartufo.git
# ... massive amounts of output here from detected issues ...
Results have been saved in ~/temp/tartufo-scan-results-2020-09-21T09:03:51.974672
>

Expected Behavior

tartufo should read the configuration file(s) from the remote repository before scanning in order to utilize its exclusion and inclusion rules. Both path-based and signature-based rules should be taken into consideration.

Code Example

N/A

Environment

This is being run from an empty directory with no explicit configuration.

CI/CD fails on master during deploy

The error we see is:

Uploading distributions to https://upload.pypi.org/legacy/
Uploading tartufo-0.0.1-py2.py3-none-any.whl
100%|██████████████████████████████████████| 20.2k/20.2k [00:00<00:00, 73.9kB/s]
NOTE: Try --verbose to see response content.
HTTPError: 403 Client Error: Invalid or non-existent authentication information. for url: https://upload.pypi.org/legacy/
The command "./scripts/build_and_deploy_wheel.sh" exited with 1.

The problems here are two-fold:

  1. We do not have any credentials specified for PyPi
  2. We are attempting to publish an artifact on every merge to master, instead of only when we want to make a new release.

For an example of a working implementation using TravisCI, see here: https://github.com/godaddy/django-snow/blob/master/.travis.yml#L41-L55

This has an encrypted/secure password stored in the config file, and only attempts to push to PyPi when a new release is tagged in the form X.Y.Z or vX.Y.Z.

Running `tartufo` with no arguments produces an error

(.venv) [jwilhelm@LMIT-JWILH tartufo]$ tartufo
Traceback (most recent call last):
  File "/Users/jwilhelm/Documents/workspace/tartufo/.venv/bin/tartufo", line 11, in <module>
    load_entry_point('tartufo', 'console_scripts', 'tartufo')()
  File "/Users/jwilhelm/Documents/workspace/tartufo/tartufo/tartufo.py", line 35, in main
    rules_regexes = configure_regexes_from_args(args, truffleHogRegexes.regexChecks.regexes)
AttributeError: module 'truffleHogRegexes' has no attribute 'regexChecks'
(.venv) [jwilhelm@LMIT-JWILH tartufo]$

This is after a fresh pip install -e .[tests], ensuring that I have the latest everything installed.

Alternatively, running tartufo --help works as expected.

(.venv) [jwilhelm@LMIT-JWILH tartufo]$ tartufo --help
usage: tartufo [-h] [--json] [--git-rules-repo GIT_RULES_REPO]
               [--git-rules GIT_RULES_FILENAMES [GIT_RULES_FILENAMES ...]]
               [--rules RULES_FILENAMES [RULES_FILENAMES ...]]
               [--default-regexes [BOOLEAN]] [--entropy [BOOLEAN]]
               [--regex [BOOLEAN]] [--since_commit SINCE_COMMIT]
               [--max_depth MAX_DEPTH] [--branch BRANCH]
               [-i INCLUDE_PATHS_FILE] [-x EXCLUDE_PATHS_FILE]
               [--repo_path REPO_PATH] [--cleanup] [--pre_commit]
               [git_url]
...etc...

Running tartufo from docker does not scan target repo

🐛 Bug Report

Running Tartufo from the docker image in the recommended manner results in Tartufo scanning itself in the docker image, not the target repository

To Reproduce

docker pull godaddy/tartufo
docker run -t --rm -v "$PWD:/git" godaddy/tartufo --repo_path .

results in tartufo scanning the tartufo directory in the container, not the mounted repo.

Changing the invocation to reference the mountpoint correctly fixes the scan target issue, but leads to failing ssh

docker run -t --rm -v "$PWD:/git" godaddy/tartufo --repo-path /git
Traceback (most recent call last):
...
cmdline: git fetch -v origin
stderr: 'error: cannot run ssh: No such file or directory

Expected Behavior

Docker container should be able to execute the above command successfully and without user interaction

Code Example

At minimum requires the addition of

apk add openssh-client

to the Dockerfile

Whitelist needs more granularity

ISBAT: define down to the line and column of the file whether to ignore a violation. Current implementation of ignoring an entire file is too broad.

Tests fail on MacOS

🐛 Bug Report

Tests pass successfully on Linux, but fail to pass on MacOS. Currently 4 tests are failing:

       4 failed
         - tests/test_cli.py:56 ProcessIssuesTest.test_output_dir_is_called_out
         - tests/test_pre_commit.py:10 PreCommitTests.test_scan_is_executed_against_current_working_directory
         - tests/test_scan_local_repo.py:23 ScanLocalRepoTests.test_scan_exits_gracefully_when_target_is_not_git_repo
         - tests/test_scan_remote_repo.py:98 ScanRemoteRepoTests.test_subdir_of_work_dir_is_passed_to_clone_repo

An example error output is:

E           AssertionError: Expected call: clone_git_repo('git@github.com:godaddy/tartufo.git', PosixPath('/var/folders/p3/v5952rgx353dpjss7_fbldy00000gp/T/tmp2whea6u6/tartufo.git'))
E           Actual call: clone_git_repo('git@github.com:godaddy/tartufo.git', PosixPath('/private/var/folders/p3/v5952rgx353dpjss7_fbldy00000gp/T/tmp2whea6u6/tartufo.git'))

To Reproduce

On MacOS, run the tests either with pytest, or tox.

Expected Behavior

Tests should behave consistently across platforms!
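The expected and actual paths in the error above differ only by a /private prefix, which is what you get on macOS when one side resolves the /var symlink and the other does not. A hedged sketch of one way to make such a comparison platform independent is to resolve the temporary directory before building the expected path. The variable names are illustrative and not taken from the test suite.

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    raw = pathlib.Path(tmp)

    # resolve() follows symlinks, so on macOS /var/... becomes /private/var/...
    # while on Linux the path is typically unchanged.
    resolved = raw.resolve()

    # Build the expectation from the resolved path so that both sides of
    # the assertion are computed the same way on every platform.
    expected_clone_path = resolved / "tartufo.git"
```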

Code Example

N/A

Environment

MacOS

Additive whitelist/blacklist support

ISBAT: define additional whitelist/blacklist rules or expressions to be included beyond the base ones by providing a list of additive rules defined in a file. This would allow the augmentation of the base rules on a case-by-case basis without replacing the existing rules or having to copy the existing rules into a new rules file, which is what the current --rules switch requires.

Allow regex rules to be stored in a separate repository, specified on the command line

Feature Request

Is your feature request related to a problem? Please describe.

When attempting to scan a great number of repos, I don't have a good way of storing a global list of regular expressions to search for.

Describe the solution you'd like

I would like to be able to build a separate git repository consisting exclusively of rules files. Then, have a way to specify on the command line, where that repository exists. Something like

> tartufo [email protected]:tarkatronic/lots-of-rules.git ...

This would clone the specified repo, read in the rules, and use them as supplemental regex checks.

Report out of the list of exclusions

Feature Request

I would like CLI output and/or an output.txt file generated with the list of exclusions that were in effect when the scan was run.

Is your feature request related to a problem? Please describe.

When the scan is run with exclusions and comes back clean, there is no output to the CLI.

Describe the solution you'd like

Output to CLI:
date:time
'02/20/2020 - 100% Clean'
List of exclusions:
excludethisfile.txt

Describe alternatives you've considered

Teachability, Documentation, Adoption, Migration Strategy

Add logging to tartufo

Feature Request

Is your feature request related to a problem? Please describe.

It's hard to see into what tartufo is doing because right now there is no logging available anywhere. This also makes debugging errors much more difficult. Even as I'm working on development, I usually do shotgun print debugging, wishing I could just add a debug log message.

Describe the solution you'd like

  • There should be log messages liberally sprinkled throughout the code base with appropriate log levels (info, debug, warning, error).
  • There should be a -v/--verbose option, which can be repeated for extra verbosity (-vvv), AND/OR a -l LOGLEVEL for example -l info to specify desired log level
  • There should be a -q/--quiet option if there is default output, which mutes all output
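The options above could map onto log levels roughly as follows. This is only a sketch: it uses argparse from the standard library to stay self-contained, and the default level of ERROR, the one-step-per-`-v` mapping, and the function names are all assumptions rather than a settled design.

```python
import argparse
import logging


def build_parser():
    parser = argparse.ArgumentParser(prog="tartufo")
    group = parser.add_mutually_exclusive_group()
    # -v may be repeated: -v, -vv, -vvv
    group.add_argument("-v", "--verbose", action="count", default=0)
    group.add_argument("-q", "--quiet", action="store_true")
    return parser


def log_level(args):
    """Translate parsed flags into a logging level."""
    if args.quiet:
        # Above CRITICAL, so nothing is emitted.
        return logging.CRITICAL + 1
    levels = [logging.ERROR, logging.WARNING, logging.INFO, logging.DEBUG]
    # Extra -v flags beyond the most verbose level are simply ignored.
    return levels[min(args.verbose, len(levels) - 1)]
```

With this mapping, `logging.basicConfig(level=log_level(args))` would be the only wiring needed at startup.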

One possibility I've looked at that might make this easier is click-log.

This might also help with #37, #42, and #43.

Given a thorough implementation this could also help with #11.

Consider using sphinx-click

📃 Summary

Right now, the usage shown both in the README.md and the docs/configuration.rst is copy-pasted by hand. This is error prone and can easily become outdated. Especially as #39 gets implemented, this situation will only become worse. It'd be great if we could automate this somehow.

sphinx-click claims to provide that. So, we should try it out and see if we can make it work for our needs. That would be ideal!

Expected documentation

We will want automatically updated documentation of the usage of the primary command, as well as all sub-commands.

Split functionality into sub-commands?

Feature Request

Is your feature request related to a problem? Please describe.

This results from some thoughts I've had on a few issues:

  1. The difference between specifying git_url and --repo-path is nebulous at best. Especially since you can specify a filesystem path as git_url, and it will just re-clone that to a temporary directory.
  2. Neither of those options applies when using --pre-commit.
  3. We've talked, in #23, about integrating the searchOrg functionality into core, as opposed to a hacky script. Adding another --github-org-url is... not great.

Describe the solution you'd like

The main solution I have come up with is to use sub-commands. So then we would end up with, for example:

$ tartufo scan-remote <git url>
$ tartufo scan-local <filesystem path>
$ tartufo pre-commit
$ tartufo scan-github-org

Most options would stay top-level/global, so available to all commands. So you would specify for example:

$ tartufo --regex --entropy --json scan-local ./my/awesome/repo
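For discussion purposes, the proposed layout can be sketched with argparse sub-parsers from the standard library. This is not a proposal for the actual implementation; the argument names `git_url` and `repo_path` are placeholders.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="tartufo")
    # Global options stay on the top-level parser and precede the sub-command.
    parser.add_argument("--regex", action="store_true")
    parser.add_argument("--entropy", action="store_true")
    parser.add_argument("--json", action="store_true")

    commands = parser.add_subparsers(dest="command")
    commands.add_parser("scan-remote").add_argument("git_url")
    commands.add_parser("scan-local").add_argument("repo_path")
    commands.add_parser("pre-commit")
    commands.add_parser("scan-github-org")
    return parser


args = build_parser().parse_args(
    ["--regex", "--entropy", "--json", "scan-local", "./my/awesome/repo"]
)
```

Each sub-command only accepts the positional arguments that make sense for it, which removes the ambiguity between git_url and --repo-path described above.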

Describe alternatives you've considered

The only alternative I can think of is to split this into multiple commands... which kinda sucks.

Teachability, Documentation, Adoption, Migration Strategy

This would definitely be a breaking change, and so would require a major version change. Luckily, this is already planned for 2.0, with a bunch of other breaking changes. We also now have a Read The Docs site set up, which will help with documentation. I've also just found a sphinx-click plugin which might help us automagically document the usage. But I haven't used it yet, so I can't be absolutely sure.

This would be a huge change, so I would love to have some discussion on the topic, and see if this even makes sense, before anybody just dives into it.

Report output on clean or successful scan

Feature Request

Need output to the screen and/or a text file when scan results are clean.

Example:
scan run: 02-20-2020
scan result: 100% Clean. Success

or

'02/20/2020 - 100% Clean'

Is your feature request related to a problem? Please describe.

If the scan is clear, the command line returns nothing. It is probably safe to assume the scan was clean, but I would like explicit CLI feedback.

Describe the solution you'd like

CLI returns when no errors/issues:

'date:time- 100% Clean'

Describe alternatives you've considered

Teachability, Documentation, Adoption, Migration Strategy

Consider switching away from GitPython

Feature Request

Is your feature request related to a problem? Please describe.

GitPython is "in maintenance mode", and has no active development or even active bugfixing. It does not play well with Windows. And it's actually mostly a wrapper around the git command line application. Which seems... strange.

Describe the solution you'd like

I would like to see us move away from using GitPython, with the hopes of adding first-class Windows support and improving some of the performance of this application.

Describe alternatives you've considered

One option is for us to take over maintenance of GitPython. Honestly, that's just not realistic. So let's scratch that one immediately.

A larger discussion around this can be found in the dvc project. A lot of good information on the topic can be found in that thread, along with two alternatives: dulwich and pygit2.

I am leaning toward dulwich at the moment. It's pure Python, and looks a little higher level than pygit2. In general I'm okay with low level APIs, but it seems to me that pygit2 is just a bit too low level/primitive. For example, this is how you clone a repo over SSH. I would rather not have to get that far into the weeds.

Teachability, Documentation, Adoption, Migration Strategy

This change should be transparent to the user, with the only changes being actual Windows support and hopefully improved performance.

Files with symbols are not omitted when added to the exclusion paths list

🐛 Bug Report

Given a file with this full path:

Tests/Classes/SomeClass+Tests.swift

and an entry into the file path exclusions list that matches the above, tartufo will still output results for that file.

If you try to escape the symbol, the same result as above is achieved.

The only way to omit the file in question from the results is (as far as I can tell) to wild-card the entire parent directory.

To Reproduce

Create a file with a symbol in the name
Add a high entropy value to the file
Add the file's path to an exclusion list
Run tartufo with exclusion paths

Expected Behavior

No output should be shown for the excluded path

Expand CI to run across multiple platforms

Feature Request

Is your feature request related to a problem? Please describe.

Recently, I've been doing most of my dev work for tartufo on a Linux machine. Everything was great and hunky-dory, until I moved over to my Mac and tried running the test suite there. That was when #94 reared its ugly head. We should be able to safe-guard against these types of things happening.

Describe the solution you'd like

The testing matrix for tartufo should expand to run on as many platforms as we can make it. We should be able to, at minimum, run on both Ubuntu and Mac OS. I believe it is also possible to run GitHub Actions against Windows, but I am uncertain if our underlying gitpython dependency supports Windows. It's probably worth trying out though! Ultimately, Windows support would be great!

Describe alternatives you've considered

The only alternative I can see is manually testing on multiple platforms. And that's just not great.

Teachability, Documentation, Adoption, Migration Strategy

Nothing should be necessary!

Docker release script tags pre-releases as "latest"

🐛 Bug Report

After releasing v2.0.0a1 it was brought up that the latest tag for the tartufo Docker image now points to this alpha version. (https://hub.docker.com/layers/godaddy/tartufo/latest/images/sha256-c003fb74266289a49ad5304ae31e99ca7e42b980c8d2633b4a1585ce9ad84ce9?context=repo)

This should not be the case, since this was not a final release. It should still be pointing to the previous release: v1.1.2

To Reproduce

Issue a pre-release from the project

Expected Behavior

Only final releases should get tagged as "latest".
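A hedged sketch of that rule follows. The actual release script is a shell script and is not shown in this report, so this Python version only illustrates the decision. It assumes a final release is any version made of dot-separated numbers with an optional leading v; the function name is hypothetical.

```python
import re

# Assumption: final releases look like 1.1.2 or v1.1.2, and anything with
# a suffix such as a1, b2 or rc1 is a pre-release.
FINAL_RELEASE = re.compile(r"v?\d+(\.\d+)*")


def docker_tags(version):
    """Return the image tags to push for a given release version."""
    tags = [version]
    if FINAL_RELEASE.fullmatch(version):
        tags.append("latest")
    return tags
```

Under this rule, v2.0.0a1 would be pushed only under its own tag, and latest would keep pointing at v1.1.2.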

Specifying a non-git path throws a big error

🐛 Bug Report

When the path specified to scan is not a git repository, tartufo throws a big ugly error message. This message can even come in two different flavors, depending how you call it!

To Reproduce

Example 1: Specifying as a GIT_URL

$ tartufo ~/
Traceback (most recent call last):
  File "/Users/jwilhelm/Documents/workspace/godaddy/tartufo/.venv/bin/tartufo", line 11, in <module>
    load_entry_point('tartufo', 'console_scripts', 'tartufo')()
  File "/Users/jwilhelm/Documents/workspace/godaddy/tartufo/.venv/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/Users/jwilhelm/Documents/workspace/godaddy/tartufo/.venv/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/Users/jwilhelm/Documents/workspace/godaddy/tartufo/.venv/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/jwilhelm/Documents/workspace/godaddy/tartufo/.venv/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/Users/jwilhelm/Documents/workspace/godaddy/tartufo/.venv/lib/python3.7/site-packages/click/decorators.py", line 21, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/Users/jwilhelm/Documents/workspace/godaddy/tartufo/tartufo/cli.py", line 151, in main
    repo_path = util.clone_git_repo(options.git_url)
  File "/Users/jwilhelm/Documents/workspace/godaddy/tartufo/tartufo/util.py", line 62, in clone_git_repo
    git.Repo.clone_from(git_url, project_path)
  File "/Users/jwilhelm/Documents/workspace/godaddy/tartufo/.venv/lib/python3.7/site-packages/git/repo/base.py", line 1019, in clone_from
    return cls._clone(git, url, to_path, GitCmdObjectDB, progress, multi_options, **kwargs)
  File "/Users/jwilhelm/Documents/workspace/godaddy/tartufo/.venv/lib/python3.7/site-packages/git/repo/base.py", line 960, in _clone
    finalize_process(proc, stderr=stderr)
  File "/Users/jwilhelm/Documents/workspace/godaddy/tartufo/.venv/lib/python3.7/site-packages/git/util.py", line 328, in finalize_process
    proc.wait(**kwargs)
  File "/Users/jwilhelm/Documents/workspace/godaddy/tartufo/.venv/lib/python3.7/site-packages/git/cmd.py", line 408, in wait
    raise GitCommandError(self.args, status, errstr)
git.exc.GitCommandError: Cmd('git') failed due to: exit code(128)
  cmdline: git clone -v /Users/jwilhelm/ /var/folders/p3/v5952rgx353dpjss7_fbldy00000gp/T/tmptgway4qe
  stderr: 'fatal: repository '/Users/jwilhelm/' does not exist

Example 2: Specifying as a --repo-path

$ tartufo --repo-path ~/
Traceback (most recent call last):
  File "/Users/jwilhelm/Documents/workspace/godaddy/tartufo/.venv/bin/tartufo", line 11, in <module>
    load_entry_point('tartufo', 'console_scripts', 'tartufo')()
  File "/Users/jwilhelm/Documents/workspace/godaddy/tartufo/.venv/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/Users/jwilhelm/Documents/workspace/godaddy/tartufo/.venv/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/Users/jwilhelm/Documents/workspace/godaddy/tartufo/.venv/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/jwilhelm/Documents/workspace/godaddy/tartufo/.venv/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/Users/jwilhelm/Documents/workspace/godaddy/tartufo/.venv/lib/python3.7/site-packages/click/decorators.py", line 21, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/Users/jwilhelm/Documents/workspace/godaddy/tartufo/tartufo/cli.py", line 153, in main
    repo_scanner = scanner.GitRepoScanner(options, repo_path)
  File "/Users/jwilhelm/Documents/workspace/godaddy/tartufo/tartufo/scanner.py", line 244, in __init__
    self._repo = self.load_repo(self.repo_path)
  File "/Users/jwilhelm/Documents/workspace/godaddy/tartufo/tartufo/scanner.py", line 266, in load_repo
    return git.Repo(repo_path)
  File "/Users/jwilhelm/Documents/workspace/godaddy/tartufo/.venv/lib/python3.7/site-packages/git/repo/base.py", line 181, in __init__
    raise InvalidGitRepositoryError(epath)
git.exc.InvalidGitRepositoryError: /Users/jwilhelm

Expected Behavior

This should, instead, give a nice friendly error message along the lines of /Users/jwilhelm is not a valid git repository.
