tazeg / hscan

Scans recursively a path to match given sha1 checksums.

Home Page: https://jeffprod.com

License: MIT License

Go 96.72% Shell 3.28%
forensics forensics-investigations forensic-analysis golang sha1 sha1sum

HSCAN

Recursively scans a path for files matching a given set of SHA-1 checksums. Useful for finding duplicate files, or for sorting files into relevant, irrelevant, and unknown.

USAGE

hscan -d <PATH> -db <PATH>
-d string
      Directory to scan recursively
-db string
      Directory containing text files with sha1 to search (1 checksum by line)
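The usage text above has the shape produced by Go's standard flag package, so the two options could be declared roughly as follows. This is a minimal sketch, not hscan's actual source: the options struct, the parseArgs helper, and the rule that both flags are required are assumptions made for illustration.

```go
package main

import (
	"flag"
	"fmt"
)

// options holds the two settings from the usage text (hypothetical type).
type options struct {
	scanDir string // value of -d
	dbDir   string // value of -db
}

// parseArgs parses args (without the program name) into options.
// Requiring both flags is an assumption of this sketch.
func parseArgs(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("hscan", flag.ContinueOnError)
	fs.StringVar(&o.scanDir, "d", "", "Directory to scan recursively")
	fs.StringVar(&o.dbDir, "db", "", "Directory containing text files with sha1 to search (1 checksum by line)")
	if err := fs.Parse(args); err != nil {
		return o, err
	}
	if o.scanDir == "" || o.dbDir == "" {
		return o, fmt.Errorf("both -d and -db are required")
	}
	return o, nil
}

func main() {
	// Fixed arguments keep the sketch runnable; a real tool would pass os.Args[1:].
	o, err := parseArgs([]string{"-d", "test", "-db", "dbpath"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("scan %q using checksum lists in %q\n", o.scanDir, o.dbDir)
}
```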

EXAMPLE

Suppose you have the file dbpath/sha1.txt:

fed5cdfb1c9b121ea6d042dd54842407df3b4a6b
64725786589f263f0ecc1da55c2bcac7eb18e681
12d81f50767d4e09aa7877da077ad9d1b915d75b

To search the directory test/ for files with those checksums:

hscan -d test -db dbpath

# result :
Loading database file "dbpath/sha1.txt"... 3 uniq checksum found in "46.975µs"

Scanning path "tmp"...
  1964 files - 0 unreadable files - 492 dirs - 0 unreadable dirs - 3 matches

RESULT
  sha1tmp.txt                              : 3 matches
  Total                                    : 3 matches

Done in 292.09673ms

Matching files, unknown files, and errors are written to result.csv in real time:

# sha1,dbfile,filename,error
dff8a1731f59ccad056b346102d1e1d014b843f3,nsrl_uniq.txt,/home/jeff/tmp/.vscode/settings.json,
0841f15b7436126cb2877b094d632dbc2707eda0,,/home/jeff/tmp/img_20190502_175115.jpg,
98fb7452234c1d7666a54a53eb7340e501d8c173,sha1test.txt,/home/jeff/tmp/602352874.jpg,
,,/home/jeff/tmp/mysqltmp/undo_001,open /home/jeff/tmp/mysqltmp/undo_001: permission denied

A SQLite3 database named result.db with the same data as the CSV is created at the end of the process.

INSTALL

Get the latest release, or download and build from source:

git config --global --add url."git@github.com:".insteadOf "https://github.com/"
go get github.com/Tazeg/hscan
cd ~/go/src/github.com/Tazeg/hscan

# Linux
env GOOS=linux GOARCH=amd64 go build hscan.go

# Windows
env GOOS=windows GOARCH=amd64 go build -o hscan.exe hscan.go

# Raspberry Pi
env GOARM=7 GOARCH=arm go build hscan.go

go install

TEST

go test

BENCHMARKS

Tested on:

  • OS: Linux
  • Storage: 128 GB SSD + 2 TB HDD
  • CPU: Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz
  • Memory: 32 GB

Loading a 1.2 GB NIST/NSRL file containing 29,459,433 unique checksums took 22.14 s. Scanning the 2 TB HDD and the 128 GB SSD took 1h32m34s. Scan time depends on the data stored and the free space on the drives. Further tests will be done shortly.

$> hscan -d / -db bases_hash/
Loading database file "bases_hash/nsrl_sha1_uniq.txt"... 29459433 uniq checksum found in "22.146464941s"

Scanning path "/"...
  2012574 files - 12091 unreadable files - 274715 dirs - 2510 unreadable dirs - 287870 matches

RESULT
  nsrl_sha1_uniq.txt                       : 287870 matches
  Total                                    : 287870 matches

Done in 1h32m34.505006098s
