c-sto / recursebuster Goto Github PK

View Code? Open in Web Editor NEW

243.0 9.0 34.0 3.8 MB

rapid content discovery tool for recursively querying webservers, handy in pentesting and web application assessments

License: The Unlicense

Go 100.00%

gobuster recursive content-discovery

recursebuster's Introduction

RecurseBuster

It's like gobuster, but recursive!

I wanted a recursive directory brute forcer that was fast, and had certain features (like header blacklisting, and recursion). It didn't exist, so I started writing this. In reality, I'll probably merge a lot of the functionality into github.com/swarley7/gograbber since that solves a similar problem and has cool features that I don't want to implement (phantomjs, ugh). For now, it will do.

Installation

Ye olde go get install should work. Same command to update:

go get -u github.com/c-sto/recursebuster

Important releases will also be tagged and uploaded.

NOTE Since tagged releases have started, some old versions which have been obtained with go get -u seem to be broken. Removing the folder and starting again seems to work:

rm -rf $GOPATH/src/github.com/c-sto/recursebuster
rm $GOPATH/bin/recursebuster
go get -u github.com/c-sto/recursebuster

Usage

I wanted it to be fairly straightforward, but scope creep etc. Basic usage is just like gobuster:

recursebuster -u https://google.com -w wordlist.txt

This will run a recursive-HEAD-spider-assisted search with a single thread on google.com using the wordlist specified above. Results will print to screen, but more importantly, will be written to a file 'busted.txt'.

Features

HEAD Based Checks

For servers the support it, HEAD based checks speed up content discovery considerably, since no body is required to be transferred. The default logic is to use a HEAD request to determine if something exists. If it seems to exist, a GET is sent to retrieve and verify. If there are sensitive pages that perform actions (AKA, ones that don't really follow the HTTP Verb Spec), a file containing a list of exact URLS that should not requested can be blacklisted with the -blacklist flag.

Recursion

When a directory is identified, it gets added to the queue to be brute-forced. By default, one directory is brute-forced at a time, however you can 'cancel' a directory interactively by hitting 'ctrl+x' if in UI mode. If you're not in UI mode (-noui), you need to have added the directory to the blacklist.

Spider Assistance

Since we are getting the page content anyway, why not use it to our advantage? Some basic checks are done to look for links within the HTML response. The links are added, and any directories identified added too. By default, only the supplied host is whitelisted, so any links that go off-site (rather, to a different domain) are ignored. You can specify a file that contains a list of whitelisted domains that you are OK with including into the spider with the -whitelist flag.

Speed

Gobuster is pretty fast when you smash -t 200, but who would do that? One of my goals for this was to keep performance on-par with gobuster where possible. On most webservers, recursebuster seems to be faster, even though it sends both a HEAD and a GET request. This means you will hit WAF limits really quickly, and is why by default it's -t 1.

Proxy options

The ability to use a proxy is fairly useful in several situations. Not having to drop tools on a host in order to scan through it is always useful - recursebuster also works through burp if you specify it as a http proxy. When using Recursebuster to supplement the burp sitemap - use the -stitemap option to send only the 'found' or interesting responses to burp, this should help avoid filling up your HTTP History with 404's.

Usage args

Idk why you might want these, just run it with -h 2>&1 and grep for the keyword. Here they are anyway:

  -ajax
        Add the X-Requested-With: XMLHttpRequest header to all requests
  -all
        Show, and write the result of all checks
  -appendslash
        Append a / to all directory bruteforce requests (like extension, but slash instead of .yourthing)
  -auth string
        Basic auth. Supply this with the base64 encoded portion to be placed after the word 'Basic' in the Authorization header.
  -bad string
        Responses to consider 'bad' or 'not found'. Comma-separated. This works the opposite way of gobuster! (default "404")
  -badheader value
        Check for presence of this header. If an exact match is found, the response is considered bad.Supply as key:value. Can specify multiple - eg '-badheader Location:cats -badheader X-ATT-DeviceId:XXXXX'
  -blacklist string
        Blacklist of prefixes to not check. Will not check on exact matches.
  -canary string
        Custom value to use to check for wildcards
  -clean
        Output clean URLs to the output file for easy loading into other tools and whatnot.
  -cookies string
        Any cookies to include with requests. This is smashed into the cookies header, so copy straight from burp I guess.
  -debug
        Enable debugging
  -ext string
        Extensions to append to checks. Multiple extensions can be specified, comma separate them.
  -headers value
        Additional headers to include with request. Supply as key:value. Can specify multiple - eg '-headers X-Forwarded-For:127.0.01 -headers X-ATT-DeviceId:XXXXX'
  -https
        Use HTTPS instead of HTTP.
  -iL string
        File to use as an input list of URL's to start from
  -k    Ignore SSL check
  -len
        Show, and write the length of the response
  -methods string
        Methods to use for checks. Multiple methods can be specified, comma separate them. Requests will be sent with an empty body (unless body is specified) (default "GET")
  -nobase
        Don't perform a request to the base URL
  -noget
        Do not perform a GET request (only use HEAD request/response)
  -nohead
        Don't optimize GET requests with a HEAD (only send the GET)
  -norecursion
        Disable recursion, just work on the specified directory. Also disables spider function.
  -nospider
        Don't search the page body for links, and directories to add to the spider queue.
  -nostartstop
        Don't show start/stop info messages
  -nostatus
        Don't print status info (for if it messes with the terminal)
  -noui
        Don't use sexy ui
  -nowildcard
        Don't perform wildcard checks for soft 404 detection
  -o string
        Local file to dump into (default "./busted.txt")
  -proxy string
        Proxy configuration options in the form ip:port eg: 127.0.0.1:9050. Note! If you want this to work with burp/use it with a HTTP proxy, specify as http://ip:port
  -ratio float
        Similarity ratio to the 404 canary page. (default 0.95)
  -redirect
        Follow redirects
  -sitemap
        Send 'good' requests to the configured proxy. Requires the proxy flag to be set. ***NOTE: with this option, the proxy is ONLY used for good requests - all other requests go out as normal!***
  -t int
        Number of concurrent threads (default 1)
  -timeout int
        Timeout (seconds) for HTTP/TCP connections (default 20)
  -u string
        Url to spider
  -ua string
        User agent to use when sending requests. (default "RecurseBuster/1.5.11")
  -v int
        Verbosity level for output messages.
  -version
        Show version number and exit
  -w string
        Wordlist to use for bruteforce. Blank for spider only
  -whitelist string
        Whitelist of domains to include in brute-force

Credits:

OJ/TheColonial: Hack the planet!!!!

Swarley: Hack the planet!!!!!

Hackers: Hack the planet!!!!

recursebuster's People

Contributors

Stargazers

Watchers

recursebuster's Issues

Sort output

Currently it works best if you sort -u busted.txt > sorted.txt to view the rough sitemap discovered. Ideally I'd like to write it sorted to avoid this step....

Don't recurse down dots

like it says in title, don't do recursion/dir bruting on dotted paths. Probably an opt-out or in option I guess... (suggested by @l0ss)

Feature Request: Whitelisting Status Codes

We have a blacklisting flag but it would be nice to have a whitelist of status codes. That way we don't need to blacklist all status codes to get 200s.

colour based results (diff colour for 3xx, 2xx, 5xx etc)

enforce order on provided list (-iL)

input list is kind of out of order apparently, either add the ability to enforce ordering and take a performance hit, or just explicitly say 'lol this will be out of order' on invoke or something. (another by @l0ss )

Add ability to use basic auth

Errors swallowed when using fancy UI

When you encounter a panic with the fancy UI up, the error details are swallowed by the terminal resetting. It also totally breaks the terminal afterwards, which is really cool. Ideally, that won't happen.

To verify, add a timed async function to panic ~5 seconds after starting, and try and get the error details.

Filter HTTP Responses

Is it possible to have only HTTP 200 responses be returned?

Broken MIME header in server response prevents function

Found a server that responds with a strange header, seems to cause Golang's net/http lib to be unhappy.

Sample response header:

HTTP/1.1 500 Internal Server Error
Content-Length: 42
Content-Type: application/json; charset=utf-8
ETag: W/"etag-here"
XXXXXXXXXXXXXXXXXXXXXXXXXX
request-context: appId=cid-v1:guid-here
Access-Control-Allow-Origin: *
XXXXXXXXXXXXXXXXXXXXX

Error that will occur:

Get https://server.com/url: net/http: HTTP/1.x transport connection broken: malformed MIME header line: XXXXXXXXXXXXXXXXXXXXXXXXXX

Case sensitivity check to optimise un-optimised wordlists

as per title, would be nice to be able to provide a case sensitive wordlist that intelligently removes duplicates based on previous responses (or maybe a test upfront on the first 'good' result).

Should be reasonably easy logic wise for both bits..

Detection:
on first 'good' response, send 2 extra requests:
1 random case
2 inverse of previous request
if responses to all 3 requests (including the original) are identical (can leverage the current diffing function for this), it's case insensitive

filtering:
if the page is case insensitive, pass all the words through a tolower or toupper filter as they are read out of the wordlist. The first few words will probably already be in the queue, I'm not aware of a way of looking through the contents of a channel without emptying it, though...

Robots detection adds HTML content when presented with 404/soft 404

As in the title:

Issue should be pretty clear in the robots function.

Investigate Extension race condition

panic: sync: WaitGroup is reused before previous Wait has returned

goroutine 1 [running]:
sync.(*WaitGroup).Wait(0xc42013a1d0)
        /usr/lib/go-1.10/src/sync/waitgroup.go:131 +0xbb
main.main()
       /go/src/github.com/C-Sto/recursebuster/main.go:209 +0x17e0

url.Parse before send to channel

Hi, I think you should add url.Parse for url parse from html before send it to channel because something like this can happen.

GOOD: GET Found https://xxx.io/www.xxx.com --><!-- Last Published: Wed Sep 25 2019 00:19:19 GMT+0000 (UTC) --><html data-wf-domain="xxx.io" data-wf-page="5c6f4e9e89c36802e87289f8" data-wf-site="5c6eefaaeddf9248ac13bc72"><head><meta charset="utf-8"/phpmyadmin2017%2F [301 Moved Permanently] Length: 182 http://xxx.io/www.xxx.com%20--><!--%20Last%20Published:%20Wed%20Sep%2025%202019%2000:19:19%20GMT+0000%20(UTC)%20--><html%20data-wf-domain="xxx.io"%20data-wf-page="5c6f4e9e89c36802e87289f8"%20data-wf-site="5c6eefaaeddf9248ac13bc72"><head><meta%20charset="utf-8"/phpmyadmin2017

Add ability to use arbitrary headers

Feature Request: Flag option to parse robots and add to request queue

Was hoping this feature would also work with the norecursion flag too.

Flag as 'bad' using body value

As in title, use body val to override any other good checks (example given was F5 Blocked)

Proxy with sitemap not working correctly

recursebuster -u 'https://cunnnnnntttttt' -vhost 'cunnnnnntttttt' -proxy http://127.0.0.1:8080 -ua 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3754.0 Safari/537.36' -sitemap -nohead -badheader 'Content-Length: 503' -k -t 5 -w ~/tools/SecLists/Discovery/Web-Content/quickhits.txt

Apparently sends all (including bad) requests to proxy. Investigation required.

Show % completed of wordlist bf

Probably read worlist into null on init and counts the words. Then update output based on count of words sent to goroutines

Make dirbust work without having to manually add / at the end of initial url/domain

Input List for content discovery

Add a Sitemap-friendly option

Having the option to use recursebuster with Burp is very cool, but all of the failed / 404'd requests will very quickly clog up the sitemap when used with even the smallest of wordlists.

Could you add some logic to only forward requests that respond with a good status code (e.g. non-404) through the proxy.

I'm thinking something like a -SitemapFriendly flag, which will require the Proxy flag to be set, and then maybe performs the initial HEAD request directly (not via the proxy) and then depending on the response code/data send the followup GET request via the proxy.

Too many open fd's

Can't reproduce, but more than one have indicated they have this issue

Panic on URL evaluation

Command line:
~/go/bin/recursebuster -u http://$line -w bang.txt -k -noui -t 100 -timeout 5 -norecursion -o dirs

panic: runtime error: invalid memory address or nil pointer dereferenceGET:https://domain.com/path
[signal SIGSEGV: segmentation violation code=0x1 addr=0x48 pc=0x755d43]

goroutine 17920 [running]:
github.com/c-sto/recursebuster/librecursebuster.(*State).evaluateURL(0xc42016e240, 0x838b2c, 0x3, 0xc42152e0f0, 0x2f, 0xc420154870, 0x2f, 0xc4210ec000, 0x33, 0x0, ...)
        /root/go/src/github.com/c-sto/recursebuster/librecursebuster/net.go:144 +0xe3
github.com/c-sto/recursebuster/librecursebuster.(*State).testURL(0xc42016e240, 0x838b2c, 0x3, 0xc42152e0f0, 0x2f, 0xc420154870)
        /root/go/src/github.com/c-sto/recursebuster/librecursebuster/logic.go:122 +0x163
created by github.com/c-sto/recursebuster/librecursebuster.(*State).dirBust
        /root/go/src/github.com/c-sto/recursebuster/librecursebuster/logic.go:244 +0x6fc

Segfault while running in GUI mode

While running recursebuster with most of the default options (-ext html,asp, -w, -u, -k), I recieved the following error during a run:

INFO: Finished dirbusting: https://[x]/
INFO: Dirbusting https://[x]/index.html/
GOOD: Found GET https://[x]/index.html/%20 [200 OK]

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x71af21]

goroutine 9 [running]:
github.com/jroimartin/gocui.(*View).draw(0xc42020e2d0, 0x2518, 0xc420046d80)
        /home/ss23/go/src/github.com/jroimartin/gocui/view.go:328 +0x991
github.com/jroimartin/gocui.(*Gui).draw(0xc420f80000, 0xc42020e2d0, 0x0, 0x0)
        /home/ss23/go/src/github.com/jroimartin/gocui/gui.go:581 +0xcc
github.com/jroimartin/gocui.(*Gui).flush(0xc420f80000, 0x0, 0x0)
        /home/ss23/go/src/github.com/jroimartin/gocui/gui.go:462 +0x20e
github.com/jroimartin/gocui.(*Gui).MainLoop(0xc420f80000, 0x0, 0x0)
        /home/ss23/go/src/github.com/jroimartin/gocui/gui.go:384 +0x224
github.com/c-sto/recursebuster/librecursebuster.(*State).StartUI(0xc420166360, 0xc420f5bab0, 0xc420072360)
        /home/ss23/go/src/github.com/c-sto/recursebuster/librecursebuster/ui.go:71 +0x4e5
created by main.main
        /home/ss23/go/src/github.com/c-sto/recursebuster/main.go:100 +0x146a

Could be a bug in gocui, but I thought I'd create the bug here in case it's something fixable on the recursebuster end.

This was built recently (28th of November) using the latest version of recursebuster.

thanks

Make testsed count update more frequently

too long to wait for update, impatient!!!!