mazen160 / secrets-patterns-db

Secrets Patterns DB: The largest open-source Database for detecting secrets, API keys, passwords, tokens, and more.

Home Page: https://mazinahmed.net/blog/secrets-patterns-db/

License: Creative Commons Attribution Share Alike 4.0 International

Python 69.06% Shell 0.16% JavaScript 2.27% PowerShell 28.51%
gitleaks regex regular-expression regular-expressions secrets secrets-detection trufflehog trufflehog3

secrets-patterns-db's Introduction

πŸ—„οΈ Secrets Patterns Database πŸ—„οΈ

The largest open-source database for detecting secrets, API keys, passwords, tokens, and more. Use secrets-patterns-db to feed your secret scanning engine with regex patterns for identifying secrets.


🚀 Features

  • Over 1600 regular expressions for detecting secrets, passwords, API keys, tokens, and more.
  • Format agnostic. A single format that supports secret detection tools, including Trufflehog and Gitleaks.
  • Tested and reviewed Regular expressions.
  • Categorized by confidence levels of each pattern.
  • All regular expressions are tested against ReDoS attacks.
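
To make the idea of feeding patterns to a scanning engine concrete, here is a minimal sketch in Python. It assumes each rule carries a name, a regex, and a confidence field. The first entry reuses the docker_registry_auth rule proposed in the issues section of this page; the second rule, the "low" confidence label, and the scan helper are hypothetical and only there for illustration.

```python
import re

# Entries shaped like the database rules (name, regex, confidence).
PATTERNS = [
    {
        "name": "docker_registry_auth",
        "regex": r'auth":.*"(?:[A-Za-z0-9+\/]{4})*(?:[A-Za-z0-9+\/]{4}|[A-Za-z0-9+\/]{3}=|[A-Za-z0-9+\/]{2}={2})',
        "confidence": "high",
    },
    {
        "name": "generic_password_assignment",  # hypothetical rule, not from the database
        "regex": r"(?i)password\s*=\s*\S+",
        "confidence": "low",  # assumed confidence label
    },
]


def scan(text, patterns, confidence=None):
    """Return (rule name, matched text) pairs, optionally for one confidence level only."""
    findings = []
    for rule in patterns:
        if confidence is not None and rule["confidence"] != confidence:
            continue
        for match in re.finditer(rule["regex"], text):
            findings.append((rule["name"], match.group(0)))
    return findings


sample = '{"auths": {"registry.example.com": {"auth": "dXNlcjpwYXNz"}}}\npassword = hunter2\n'
all_findings = scan(sample, PATTERNS)
high_only = scan(sample, PATTERNS, confidence="high")
print(all_findings)
print(high_only)
```

Filtering on confidence is one way a scanner might trade recall for fewer false positives; a real engine would likely precompile the regexes and report line numbers as well.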

❔ Why?

Few resources online provide regular expression patterns for secrets. TruffleHog offers ~700 built-in rules and GitLeaks offers ~60. While that's a good start, it's not enough: maintaining the rules and keeping up with new secret patterns takes a lot of work.

I have collected and curated Regular Expressions Patterns for Secrets, API Tokens, Keys, and Passwords. I'm open-sourcing the database I built (Secrets-Patterns-DB), and hope that security teams contribute to it!

The Secrets-Patterns-DB contains over 1600 Regular Expressions. I have also written scripts to validate the regexes against ReDoS attacks and CI jobs to load and validate them, and I manually cleaned up invalid ones.

It's in Beta. There's a lot of room for improvement on the project. I'm looking forward to your Pull Requests and Issues on GitHub to enhance Secrets-Patterns-DB for everyone.

Are you planning to enhance secrets detection in your AppSec program? Please take some time to contribute to the project! 🙏


💻 Contribution

Contributions are always welcome! Please feel free to report issues on GitHub and create Pull Requests for new features.

📌 Ideas to Start on

Would you like to contribute to secrets-patterns-db? Here are some ideas to start with:

  • Support severity
  • Categorize patterns by type?
  • Categorize patterns by tags?
  • Support more tools?

Using

For Trufflehog v2:

$> ./convert-rules.py --db ../db/rules-stable.yml --type trufflehog

For Gitleaks:

$> ./convert-rules.py --db ../db/rules-stable.yml --type gitleaks

Optional: --export sets the output filename; the extension is added according to the type (gitleaks = toml, trufflehog = json).
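
As a rough illustration of what converting the database for Trufflehog v2 with convert-rules.py could involve, the sketch below turns rules in the nested patterns / pattern layout into a JSON object that maps each rule name to its regex. This is not the code of convert-rules.py. It assumes Trufflehog v2 reads custom rules in that name-to-regex shape, and the function name and the one-rule example database are hypothetical.

```python
import json


def to_trufflehog_v2(db: dict) -> str:
    """Render a rules database as a JSON object of rule name -> regex.

    Assumption: Trufflehog v2 accepts custom rules in this shape.
    """
    rules = {}
    for entry in db["patterns"]:
        pattern = entry["pattern"]
        rules[pattern["name"]] = pattern["regex"]
    return json.dumps(rules, indent=2)


# Hypothetical one-rule database in the nested layout (patterns -> pattern -> fields).
example_db = {
    "patterns": [
        {"pattern": {"name": "example_token", "regex": "tok_[0-9a-f]{32}", "confidence": "low"}},
    ]
}

print(to_trufflehog_v2(example_db))
```

Note that the confidence field is dropped in this shape, so any filtering by confidence would have to happen before the conversion.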

📄 License

This work is licensed under a Creative Commons Attribution 4.0 International License.

💚 Author

Mazin Ahmed

secrets-patterns-db's People

Contributors

awkspace, el-prova, ethorneloe, jatrost, mazen160

secrets-patterns-db's Issues

Add container registry rule

Hey I very much like this project and I'm using it in one of my own.

I found one rule to be missing and I'm unsure where to place it.

  - pattern:
      name: docker_registry_auth
      regex: auth":.*"(?:[A-Za-z0-9+\/]{4})*(?:[A-Za-z0-9+\/]{4}|[A-Za-z0-9+\/]{3}=|[A-Za-z0-9+\/]{2}={2})
      confidence: high

Could you add it or tell me where I need to insert it?

cheers

nuclei support

Hi, how do I use the two files below with nuclei?

nuclei-regexes.yml
nuclei-generic-1.yaml

BBOT Integration

Awesome work on this repo!

I have integrated this into a new BBOT module, and the results look very promising. The module enables secrets-patterns-db to search web page content (html, .js files, etc.) across an organization's entire attack surface.

The main challenge is weeding out the false positives. Right now there are quite a few of these, as you can see even by running the BBOT module against www.example.com:

git clone -b secrets-patterns-db https://github.com/blacklanternsecurity/bbot && cd bbot
pip install poetry; poetry install; poetry shell

bbot -m secrets -t http://www.example.com

We at Black Lantern would love to contribute to this repo. Would you be interested in working with us to find a solution to the false positives? Our first thought was to maintain a sort of blacklist. This would be simple but difficult to maintain. A more advanced approach might involve deep learning, such as what the guys at Praetorian did with Nosey Parker (sadly they did not open source their machine learning stuff).

EDIT: I made a PR: #4

Again, great work on this, and curious to hear your thoughts!

Inconsistent Indentation in YAML Files

There are two YAML files with the same content but with different levels of indentation for the key-value pairs. This causes confusion and can make the YAML files harder to read and maintain.

From db/:

patterns:
  - pattern:
      key: value

From datasets/:

patterns:
  - pattern:
    key: value

The different indentation levels affect the structure of the data, and it is a best practice to maintain consistent indentation throughout a YAML file for readability and maintainability.
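
To illustrate the structural difference, here is a small Python sketch. It assumes the two snippets parse the way a standard YAML parser would load them: in the db/ layout the fields are nested under pattern, while in the datasets/ layout pattern is null and the fields are its siblings. The normalize_entry helper is hypothetical and is not part of the repository.

```python
def normalize_entry(entry: dict) -> dict:
    """Return an entry in the nested db/ layout: {"pattern": {...fields...}}."""
    nested = entry.get("pattern")
    if isinstance(nested, dict):
        # Already nested (db/ layout); return a shallow copy.
        return {"pattern": dict(nested)}
    # Flat datasets/ layout: "pattern" is null and the fields sit beside it.
    fields = {key: value for key, value in entry.items() if key != "pattern"}
    return {"pattern": fields}


# What the two YAML snippets are assumed to parse to.
db_style = {"patterns": [{"pattern": {"key": "value"}}]}
datasets_style = {"patterns": [{"pattern": None, "key": "value"}]}

normalized_db = [normalize_entry(e) for e in db_style["patterns"]]
normalized_datasets = [normalize_entry(e) for e in datasets_style["patterns"]]
print(normalized_db == normalized_datasets)
```

A loader that accepts both layouts could apply such a step before validation, although fixing the indentation in the files themselves would remove the need for it.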

How to create new dataset

Hi @mazen160,

How can we create new datasets so that rules-stable.yml is updated with the new resources?
Your blog only explains how to create an export for GitLeaks or TruffleHog (and this is missing from the README :-) ).

Kind regards

Wrong character class for URLs regexp

Some regular expressions in the stable-rules db include the character class [.-_], which is parsed as the range from . to _ (./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_)

The intended character class is probably [._-], with the hyphen placed last so that it is taken literally.

Examples:

- name: AWS API Gateway
  regex: '[0-9a-z]+.execute-api.[0-9a-z.-_]+.amazonaws.com'
- name: AWS CloudFront
  regex: '[0-9a-z.-_]+.cloudfront.net'
- name: AWS EC2 External
  regex: ec2-[0-9a-z.-_]+.compute(-1)?.amazonaws.com
- name: AWS EC2 Internal
  regex: '[0-9a-z.-_]+.compute(-1)?.internal'
- name: AWS ELB
  regex: '[0-9a-z.-_]+.elb.amazonaws.com'
- name: AWS ElasticCache
  regex: '[0-9a-z.-_]+.cache.amazonaws.com'
- name: AWS RDS
  regex: '[0-9a-z.-_]+.rds.amazonaws.com'
- name: AWS S3 Bucket
  regex: s3://[0-9a-z.-_/]+
- name: AWS S3 Endpoint
  regex: '[a-zA-Z0-9.-_]+.s3.[a-zA-Z0-9.-_]+.amazonaws.com'
- name: Tru - 2
  regex: (?:tru).{0,40}\b([0-9a-zA-Z.-_]{26})\b
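
The difference between the two character classes can be checked with a short Python sketch. The matched_chars helper and the sample strings are made up for illustration; only the classes [.-_] and [._-] and the [0-9a-z.-_] fragment come from the examples above.

```python
import re


def matched_chars(char_class: str) -> str:
    """Return every printable ASCII character that the given class matches."""
    pattern = re.compile(char_class)
    return "".join(c for c in map(chr, range(0x20, 0x7F)) if pattern.fullmatch(c))


range_class = matched_chars(r"[.-_]")    # hyphen in the middle: a range from '.' to '_'
literal_class = matched_chars(r"[._-]")  # hyphen last: three literal characters

print(range_class)
print(literal_class)

# Effect on a hostname-like fragment: the range form accepts uppercase letters
# and ':' but rejects a plain hyphen, the opposite of what was intended.
buggy = re.compile(r"[0-9a-z.-_]+")
fixed = re.compile(r"[0-9a-z._-]+")
print(buggy.fullmatch("EXAMPLE:HOST") is not None, fixed.fullmatch("EXAMPLE:HOST") is not None)
print(buggy.fullmatch("my-host") is not None, fixed.fullmatch("my-host") is not None)
```

The range form matches 50 characters, including digits, uppercase letters, and several punctuation marks, and it does not match the hyphen at all, because the hyphen sorts just before the dot in ASCII.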

Query regarding license

Apologies for opening an annoying issue on what looks like a useful and early endeavour. I just want to avoid odd problems for users later.

The README appears to imply the rules have been collected from other projects, including Trufflehog.

I have collected and curated Regular Expressions Patterns for Secrets

The database itself indicates this is licensed under a Creative Commons license

This work is licensed under a Creative Commons Attribution 4.0 International License.

However, Trufflehog is licensed under the AGPL, which carries certain conditions, including preserving copyright notices, disclosing source and, importantly, carrying the same license.

I may be mistaken in the above, but I'd love you to clarify. It may be worth updating the README to avoid the confusion, or it may mean changing the license and adhering to the upstream terms.
