Searches through git repositories for high entropy strings and secrets, digging deep into commit history

Overview

truffleHog

codecov

Searches through git repositories for secrets, digging deep into commit history and branches. This is effective at finding secrets accidentally committed.

Join The Slack

Have questions? Feedback? Jump in slack and hang out with me

https://join.slack.com/t/trufflehog-community/shared_invite/zt-pw2qbi43-Aa86hkiimstfdKH9UCpPzQ

NEW

truffleHog previously functioned by running entropy checks on git diffs. This functionality still exists, but high signal regex checks have been added, and the ability to suppress entropy checking has also been added.

truffleHog --regex --entropy=False https://github.com/dxa4481/truffleHog.git

or

truffleHog file:///user/dxa4481/codeprojects/truffleHog/

With the --include_paths and --exclude_paths options, it is also possible to limit scanning to a subset of objects in the Git history by defining regular expressions (one per line) in a file to match the targeted object paths. To illustrate, see the example include and exclude files below:

include-patterns.txt:

src/
# lines beginning with "#" are treated as comments and are ignored
gradle/
# regexes must match the entire path, but can use python's regex syntax for
# case-insensitive matching and other advanced options
(?i).*\.(properties|conf|ini|txt|y(a)?ml)$
(.*/)?id_[rd]sa$

exclude-patterns.txt:

(.*/)?\.classpath$
.*\.jmx$
(.*/)?test/(.*/)?resources/

These filter files could then be applied by:

trufflehog --include_paths include-patterns.txt --exclude_paths exclude-patterns.txt file://path/to/my/repo.git

With these filters, issues found in files in the root-level src directory would be reported, unless they had the .classpath or .jmx extension, or if they were found in the src/test/dev/resources/ directory, for example. Additional usage information is provided when calling trufflehog with the -h or --help options.

These features help cut down on noise, and makes the tool easier to shove into a devops pipeline.

Example

Install

pip install truffleHog

Customizing

Custom regexes can be added with the following flag --rules /path/to/rules. This should be a json file of the following format:

{
    "RSA private key": "-----BEGIN EC PRIVATE KEY-----"
}

Things like subdomain enumeration, s3 bucket detection, and other useful regexes highly custom to the situation can be added.

Feel free to also contribute high signal regexes upstream that you think will benefit the community. Things like Azure keys, Twilio keys, Google Compute keys, are welcome, provided a high signal regex can be constructed.

trufflehog's base rule set sources from https://github.com/dxa4481/truffleHogRegexes/blob/master/truffleHogRegexes/regexes.json

To explicitly allow particular secrets (e.g. self-signed keys used only for local testing) you can provide an allow list --allow /path/to/allow in the following format:

{
    "local self signed test key": "-----BEGIN EC PRIVATE KEY-----\nfoobar123\n-----END EC PRIVATE KEY-----",
    "git cherry pick SHAs": "regex:Cherry picked from .*",
}

Note that values beginning with regex: will be used as regular expressions. Values without this will be literal, with some automatic conversions (e.g. flexible newlines).

How it works

This module will go through the entire commit history of each branch, and check each diff from each commit, and check for secrets. This is both by regex and by entropy. For entropy checks, truffleHog will evaluate the shannon entropy for both the base64 char set and hexidecimal char set for every blob of text greater than 20 characters comprised of those character sets in each diff. If at any point a high entropy string >20 characters is detected, it will print to the screen.

Help

usage: trufflehog [-h] [--json] [--regex] [--rules RULES] [--allow ALLOW]
                  [--entropy DO_ENTROPY] [--since_commit SINCE_COMMIT]
                  [--max_depth MAX_DEPTH]
                  git_url

Find secrets hidden in the depths of git.

positional arguments:
  git_url               URL for secret searching

optional arguments:
  -h, --help            show this help message and exit
  --json                Output in JSON
  --regex               Enable high signal regex checks
  --rules RULES         Ignore default regexes and source from json list file
  --allow ALLOW         Explicitly allow regexes from json list file
  --entropy DO_ENTROPY  Enable entropy checks
  --since_commit SINCE_COMMIT
                        Only scan from a given commit hash
  --branch BRANCH       Scans only the selected branch
  --max_depth MAX_DEPTH
                        The max commit depth to go back when searching for
                        secrets
  -i INCLUDE_PATHS_FILE, --include_paths INCLUDE_PATHS_FILE
                        File with regular expressions (one per line), at least
                        one of which must match a Git object path in order for
                        it to be scanned; lines starting with "#" are treated
                        as comments and are ignored. If empty or not provided
                        (default), all Git object paths are included unless
                        otherwise excluded via the --exclude_paths option.
  -x EXCLUDE_PATHS_FILE, --exclude_paths EXCLUDE_PATHS_FILE
                        File with regular expressions (one per line), none of
                        which may match a Git object path in order for it to
                        be scanned; lines starting with "#" are treated as
                        comments and are ignored. If empty or not provided
                        (default), no Git object paths are excluded unless
                        effectively excluded via the --include_paths option.

Running with Docker

First, enter the directory containing the git repository

cd /path/to/git

To launch the trufflehog with the docker image, run the following"

docker run --rm -v "$(pwd):/proj" dxa4481/trufflehog file:///proj

-v mounts the current working dir (pwd) to the /proj dir in the Docker container

file:///proj references that very same /proj dir in the container (which is also set as the default working dir in the Dockerfile)

Wishlist

  • A way to detect and not scan binary diffs
  • Don't rescan diffs if already looked at in another branch
  • A since commit X feature
  • Print the file affected
Comments
  • fix #8 - add `--include` and `--exclude` options

    fix #8 - add `--include` and `--exclude` options

    Fixes issue #8 by adding --include_paths and --exclude_paths options that allow the user to limit scanning to a subset of objects in the Git history by defining regular expressions (one per line) in a file to match the targeted object paths.

    If provided, the --include_paths option should point to a file with regular expressions (one per line), at least one of which must match a Git object path in order for it to be scanned. If empty or not provided (default), all Git object paths are included (unless otherwise excluded via the --exclude_paths option).

    Likewise, the --exclude_paths option, when provided, should point to a file with regular expressions, none of which may match a Git object path in order for it to be scanned. If empty or not provided (default), no Git object paths are excluded (unless effectively excluded via the --include_paths option).

    In either file, lines starting with "#" are treated as comments and are ignored.

    opened by milo-minderbinder 22
  • fix --since_commit parameter

    fix --since_commit parameter

    Hi, how can I contribute to this project? I was running truffleHog and using the --since_commit parameter, however it was buggy and did not work as expected. I made a very small change, and it worked as expected. Do you accept PRs or should I just tell you the change so you can verify it?

    opened by fahrishb 18
  • The regex functionality is not working as expected

    The regex functionality is not working as expected

    I git cloned the truffleHog repository. Changed my regexChecks.py file to look like below:

    import re
    
    regexes = {
        "Slack Token XOXP": re.compile('xoxp.*'),
        "Slack Token XOXB": re.compile('xoxb.*'),
        "Slack Token XOXO": re.compile('xoxo.*'),
        "Slack Token XOXA": re.compile('xoxa.*'),
        "AWS API Key": re.compile('AKIA.*'),
        "Private key": re.compile('-----BEGIN PRIVATE KEY-----.*')
    }
    

    I then installed the libraries required to run the tool by typing pip install -r requirements.txt. My requirements.txt file looked like below:

    GitPython==2.1.5
    gitdb2==2.0.2
    smmap2==2.0.2
    

    Finally, I ran the tool by typing - python truffleHog.py --regex --entropy=False https://github.com/secretuser1/secretrepo.git

    It printed out the Private Key, Slack Token XOXP and Slack Token XOXB. It should have also printed out the AWS key here - https://github.com/secretuser1/secretrepo/blob/master/secretfile.txt#L2 but it did not, even though the regex is present.

    Any idea why?

    opened by anshumanbh 14
  • Adding the capability for scanning a directory

    Adding the capability for scanning a directory

    This PR adds the capability for truffleHog to recursively scan a directory instead of a Git repository with all its history. This can be useful in CI pipelines or other situations where it is desirable to scan the codebase at a single point in time. Additionally, it can also be used to scan code that is not stored in Git.

    I've done some minor refactoring to the existing scanning code to reduce code duplication.

    opened by runako 14
  • ValueError: unknown reasons (During run application on EC2 RedHat)

    ValueError: unknown reasons (During run application on EC2 RedHat)

    If anyone can help I will be appreciate! Describe the bug Having an error during running the app on EC2 RedHat : ValueError: unknown reasons

    I installed on redhat ec2 instance trufflehog. By default there python 2.7 and 3.6 Trufflehog was installed from pip3. pip3 freeze shows that everything installed : gitdb==4.0.5 gitdb2==4.0.2 GitPython==3.0.6 smmap==3.0.5 truffleHog==2.2.1 truffleHogRegexes==0.0.7

    After installation and running next command ( just to check does it work or not) trufflehog --regex --entropy=False https://github.com/dxa4481/truffleHog.git ( I got next error) : Traceback (most recent call last): File "/usr/local/bin/trufflehog", line 11, in sys.exit(main()) File "/usr/local/lib64/python3.6/site-packages/truffleHog/truffleHog.py", line 93, in main surpress_output=False, branch=args.branch, repo_path=args.repo_path, path_inclusions=path_inclusions, path_exclusions=path_exclusions, allow=allow) File "/usr/local/lib64/python3.6/site-packages/truffleHog/truffleHog.py", line 351, in find_strings diff_hash = hashlib.md5((str(prev_commit) + str(curr_commit)).encode('utf-8')).digest() ValueError: unknown reasons

    opened by RepositoryOfCode 12
  •  Git issue when trying to scan cloned project in Apple-mac: trufflehog file:///

    Git issue when trying to scan cloned project in Apple-mac: trufflehog file:///

    Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/3.6/bin/trufflehog", line 10, in sys.exit(main()) File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/truffleHog/truffleHog.py", line 82, in main surpress_output=False, branch=args.branch, repo_path=args.repo_path, path_inclusions=path_inclusions, path_exclusions=path_exclusions) File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/truffleHog/truffleHog.py", line 309, in find_strings project_path = clone_git_repo(git_url) File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/truffleHog/truffleHog.py", line 152, in clone_git_repo Repo.clone_from(git_url, project_path) File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/git/repo/base.py", line 925, in clone_from return cls._clone(git, url, to_path, GitCmdObjectDB, progress, **kwargs) File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/git/repo/base.py", line 880, in _clone finalize_process(proc, stderr=stderr) File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/git/util.py", line 341, in finalize_process proc.wait(**kwargs) File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/git/cmd.py", line 291, in wait raise GitCommandError(self.args, status, errstr) git.exc.GitCommandError: Cmd('git') failed due to: exit code(128) cmdline: git clone -v file:///GitHub/Github-guardian/ /var/folders/37/md_m401d073bw1vt7q863pbw0000gn/T/tmpchgjs_ln stderr: 'Cloning into '/var/folders/37/md_m401d073bw1vt7q863pbw0000gn/T/tmpchgjs_ln'... fatal: '/GitHub/Github-guardian/' does not appear to be a git repository fatal: Could not read from remote repository.

    Please make sure you have the correct access rights and the repository exists. '

    opened by dgurazada 12
  • Depth limits are needed to prevent long jobs

    Depth limits are needed to prevent long jobs

    When leveraging trufflehog for repo scans, it would be helpful to introduce the concept of depth limits, to ensure that when a scan is performed, it only goes to a certain number of commits back. On a test repository that I have, there is a huge number of commits dating back to 2014, and the job is running well more than 24 hours to go deep across all of them.

    opened by dend 11
  • WindowsError: [Error 5] Access is denied

    WindowsError: [Error 5] Access is denied

    Traceback (most recent call last): File "trufflehog.py", line 106, in <module> find_strings(args.git_url) File "trufflehog.py", line 98, in find_strings shutil.rmtree(project_path) File "C:\Python27\lib\shutil.py", line 247, in rmtree rmtree(fullname, ignore_errors, onerror) File "C:\Python27\lib\shutil.py", line 247, in rmtree rmtree(fullname, ignore_errors, onerror) File "C:\Python27\lib\shutil.py", line 247, in rmtree rmtree(fullname, ignore_errors, onerror) File "C:\Python27\lib\shutil.py", line 252, in rmtree onerror(os.remove, fullname, sys.exc_info()) File "C:\Python27\lib\shutil.py", line 250, in rmtree os.remove(fullname) WindowsError: [Error 5] Access is denied: 'temp\\[uuid]\\.git\\objects\\pack\\pack-[uuid].idx'

    When scanning some repos. (This one crashes half way through, This one crashes at startup)

    opened by Peter-Maguire 11
  • i cant see the result

    i cant see the result

    1. See error

    {"level":"debug","msg":"Cloning remote Git repo without authentication","time":"2022-04-05T16:19:28Z"} {"level":"debug","msg":"Git repo local path: /tmp/trufflehog944564607","time":"2022-04-05T16:23:19Z"}

    2022/04/05 16:44:16 [updater parent] prog exited with 1

    I can see the result if found or note even if I use --json I cant see the saved file its always clone the repo in tmp folder after finish scanning it should delete the cloned folder in the tmp

    bug 
    opened by abramas 10
  • Hardcoded thresholds of 20 in get_strings_of_set()

    Hardcoded thresholds of 20 in get_strings_of_set()

    threshold keyword variable is declared and used on the last if statement Line 39, but not in the first else statement Line 35

    def get_strings_of_set(word, char_set, threshold=20):
        count = 0
        letters = ""
        strings = []
        for char in word:
            if char in char_set:
                letters += char
                count += 1
            else:
                if count > 20:
                    strings.append(letters)
                letters = ""
                count = 0
        if count > threshold:
            strings.append(letters)
    
    opened by bandrel 10
  • gitdb update breaks trufflehog

    gitdb update breaks trufflehog

    Probably related to #198

    We install inside a docker container using:

    $ pip install truffleHog==2.0.99
    

    We run:

    $ trufflehog --regex --entropy=False .
    

    Starting today this errored with:

    Traceback (most recent call last):
       File "/usr/local/bin/trufflehog", line 5, in <module>
         from truffleHog.truffleHog import main
       File "/usr/local/lib/python3.8/site-packages/truffleHog/truffleHog.py", line 17, in <module>
         from git import Repo
       File "/usr/local/lib/python3.8/site-packages/git/__init__.py", line 38, in <module>
         from git.config import GitConfigParser  # @NoMove @IgnorePep8
       File "/usr/local/lib/python3.8/site-packages/git/config.py", line 16, in <module>
         from git.compat import (
       File "/usr/local/lib/python3.8/site-packages/git/compat.py", line 16, in <module>
         from gitdb.utils.compat import (
     ModuleNotFoundError: No module named 'gitdb.utils.compat'
    

    A quick dive down the dependency tree showed that the trufflehog dependency on gitpython-2.1.1 (here) is pulling in gitdb2-3.0.2 (here) which has removed the gitdb.utils.compat (PR)

    Our fix for now is to use (may be useful to others):

    pip install gitdb2==3.0.0 truffleHog==2.0.99
    
    opened by danieldooley 9
  • Use access-token endpoint for validity check

    Use access-token endpoint for validity check

    This PR fixes the issue https://github.com/trufflesecurity/trufflehog/issues/990, it should correctly report keys as valid even if they are missing the user_read scope.

    opened by clonsdale-canva 1
  • Buildkite token validation missing tokens without user_read scope

    Buildkite token validation missing tokens without user_read scope

    Community Note

    • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
    • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
    • If you are interested in working on this issue or have submitted a pull request, please leave a comment

    TruffleHog Version

    3.21.0

    Expected Behavior

    Buildkite token is reported as valid

    Actual Behavior

    Buildkite token is not validated as the API call fails due to missing user_read scope

    Additional Context

    The logic to check if a buildkite token is valid will send out an API call to the /user endpoint https://github.com/trufflesecurity/trufflehog/blob/009756dce61948a66cf90a8b14018460c91ab4f0/pkg/detectors/buildkite/buildkite.go#L51. This will miss all tokens which do not have the read_user scope.

    Instead, we can use the access-token endpoint, which will return 200 for any valid token, and report on the scopes present / ID of the token - https://buildkite.com/docs/apis/rest-api/access-token.

    bug 
    opened by clonsdale-canva 0
  • Add max-depth limit to GitHub subcommand

    Add max-depth limit to GitHub subcommand

    Community Note

    • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
    • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
    • If you are interested in working on this issue or have submitted a pull request, please leave a comment

    Description

    Ability to limit the depth of the commit history being scanned for GitHub users We need the ability to set a --max-depth= limit to GitHub sub command.

    Problem to be Addressed

    It is very noisy for large GitHub enterprises to detect new issues due to the inability to ignore historical commit history that one has already remediated. Results have to be saved into a spreadsheet or database and then diff'd to see what has changed.

    Description of the Preferred Solution

    The ability to set a --max-depth= limit to GitHub sub command. This would be very beneficial when attempting to scan a GitHub enterprise repositories as a group.

    Additional Context

    References

    • #0000
    enhancement 
    opened by dwilliamsstc 0
  • go install - missing dot in first path element

    go install - missing dot in first path element

    build github.com/trufflesecurity/trufflehog/v3: cannot load embed: malformed module path "embed": missing dot in first path element

    Community Note

    • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
    • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
    • If you are interested in working on this issue or have submitted a pull request, please leave a comment

    TruffleHog Version

    Trace Output

    Expected Behavior

    Actual Behavior

    Steps to Reproduce

    Distributor ID: Elementary Description: elementary OS 6.1 Jólnir Release: 6.1 Codename: jolnir

    Additional Context

    References

    • #0000
    bug 
    opened by rip752 0
  • Run certain Detector Type

    Run certain Detector Type

    trufflehog version: trufflehog dev

    Currently I am running trufflehog as a pre-commit hook with all possible Detector type. Is it possible to only run few Detector types , say AWS keys, Private keys as such?

    bug 
    opened by Priyadhana 0
Releases(v3.21.0)
Description Basic Recon tool for beginners. Especially those who faces issue on how to recon or what all tools to use

Description Basic Recon tool for beginners. Especially those who faces issue on how to recon or what all tools to use. Will try to add atleast 10 more tools currently use 7 sources to gather domains.

Harinder Singh 7 Jan 03, 2022
CVE-2021-41773 Path Traversal for Apache 2.4.49

CVE-2021-41773 Path Traversal for Apache 2.4.49

ac1d 3 Oct 20, 2021
Anti Supercookie - Confusing the ISP & Escaping the Supercookie

Confusing the ISP & Escaping the Supercookie

Baris Dincer 2 Nov 22, 2022
Directory Traversal in Afterlogic webmail aurora and pro

CVE-2021-26294 Exploit Directory Traversal in Afterlogic webmail aurora and pro . Description: AfterLogic Aurora and WebMail Pro products with 7.7.9 a

Ashish Kunwar 8 Nov 09, 2022
compact and speedy hash cracker for md5, sha1, and sha256 hashes

hash-cracker hash cracker is a multi-functional and compact...hash cracking tool...that supports dictionary attacks against three kinds of hashes: md5

Abdullah Ansari 3 Feb 22, 2022
A Python r2pipe script to automatically create a Frida hook to intercept TLS traffic for Flutter based apps

boring-flutter A Python r2pipe script to automatically create a Frida hook to intercept TLS traffic for Flutter based apps. Currently only supporting

Hamza 64 Oct 18, 2022
POC for detecting the Log4Shell (Log4J RCE) vulnerability

Interactsh An OOB interaction gathering server and client library Features • Usage • Interactsh Client • Interactsh Server • Interactsh Integration •

ProjectDiscovery 2.1k Jan 08, 2023
Statistical Random Number Generator Attack Against The Kirchhoff-law-johnson-noise (Kljn) Secure Key Exchange Protocol

Statistical Random Number Generator Attack Against The Kirchhoff-law-johnson-noise (Kljn) Secure Key Exchange Protocol

zeze 1 Jan 13, 2022
How to exploit a double free vulnerability in 2021. 'Use-After-Free for Dummies'

This bug doesn’t exist on x86: Exploiting an ARM-only race condition How to exploit a double free and get a shell. "Use-After-Free for dummies" In thi

Stephen Tong 1.2k Dec 25, 2022
Apache OFBiz rmi反序列化EXP(CVE-2021-26295)

Apache OFBiz rmi反序列化EXP(CVE-2021-26295) 目前仅支持nc弹shell 将ysoserial.jar放置在同目录下,py3运行,根据提示输入漏洞url,你的vps地址和端口 第二次使用建议删除exp.ot 本工具仅用于安全测试,禁止未授权非法攻击站点,否则后果自负

15 Nov 09, 2022
一款Web在线自动免杀工具

一款利用加载器以及Python反序列化绕过AV的在线免杀工具 因为打包方式的局限性,不能跨平台,若要生成exe格式的只能在Windows下运行本项目 打包速度有点慢,提交后稍等一会 开发环境及运行 前端使用Bootstrap框架,后端使用Django框架 。

yhy 172 Nov 28, 2022
Visius Heimdall is a tool that checks for risks on your cloud infrastructure

Heimdall Cloud Checker 🇧🇷 About Visius is a Brazilian cybersecurity startup that follows the signs of the crimson thunder ;) 🎸 ! As we value open s

visius 48 Jun 20, 2022
This tool was created in order to automate some basic OSINT tasks for penetration testing assingments.

This tool was created in order to automate some basic OSINT tasks for penetration testing assingments. The main feature that I haven't seen much anywhere is the downloadd google dork function where t

Tobias 5 May 31, 2022
Python exploit code for CVE-2021-4034 (pwnkit)

Python3 code to exploit CVE-2021-4034 (PWNKIT). This was an exercise in "can I make this work in Python?", and not meant as a robust exploit. It Works

Joe Ammond 92 Dec 29, 2022
A Python Tool that uses Shodan API's to perform quick recon for vulnerabilities

Shodan Quick Recon A Python Tool that uses Shodan API's to perform quick recon for vulnerabilities Configuration You must edit the python code, and in

Black Hat Ethical Hacking 5 Aug 09, 2022
Python script to tamper with pages to test for Log4J Shell vulnerability.

log4jShell Scanner This shell script scans a vulnerable web application that is using a version of apache-log4j 2.15.0. This application is a static

GoVanguard 8 Oct 20, 2022
evtx-hunter helps to quickly spot interesting security-related activity in Windows Event Viewer (EVTX) files.

Introduction evtx-hunter helps to quickly spot interesting security-related activity in Windows Event Viewer (EVTX) files. It can process a high numbe

NVISO 116 Dec 29, 2022
Apk Framework Detector

🚀🚀🚀Program helps you to detect the major framework or technology used in writing any android app. Just provide the apk 😇😇

Daniel Agyapong 10 Dec 07, 2022
A small script to export all AWAF policies from a BIG-IP device

This script leverages BIG-IP iControl REST API to export ALL AWAF policies in the system and saves them locally. The policies can be exported in the following formats: xml, plc and json.

3 Feb 03, 2022
CVE-2021-43936 is a critical vulnerability (CVSS3 10.0) leading to Remote Code Execution (RCE) in WebHMI Firmware.

CVE-2021-43936 CVE-2021-43936 is a critical vulnerability (CVSS3 10.0) leading to Remote Code Execution (RCE) in WebHMI Firmware. This vulnerability w

Jeremiasz Pluta 8 Jul 05, 2022