
substats's Introduction


The idea was to take every domain in public bug bounty scopes, enumerate all of their subdomains with amass, analyze the results, and generate a few wordlists that may be helpful.

I collected more than a million subdomains for nearly 3000 domains from bug bounty scopes, among them google, paypal, apple, and many others. The scope data for the analysis came from https://github.com/arkadiyt/bounty-targets-data.

What will you find in this repository?

Summary

When gathering information about a particular scope, whether for a bug bounty or a private assessment, you always want to find as much as possible. There are many examples of companies being hacked through critical vulnerabilities on servers that were found through subdomain enumeration.

The great thing about subdomains is that most of them carry some meaning or convention hidden in the name, like:

  • dev.services
  • api
  • staging
  • ds1-eu-central.portal
  • us-vpn-poc
  • blo01-01m01-sw01 - even this nasty one(!)

So if there were a way to work out how these names are generated, they might be easier to find. For example, you could generate a massive wordlist with all possible combinations for us-vpn-poc. But how big would that wordlist be? Counting only its 8 letters, there are at least 26^8, or 208827064576, different combinations. Do you need to iterate over all of them? I'm not sure, and it would take about 242 days, most of a year, even at 10000 subdomains per second. And for ds1-eu-central, millennia upon millennia.
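The arithmetic above can be checked with a short sketch. It assumes, as the estimate does, a 26-letter alphabet, counts only the letters in each name, and takes a constant rate of 10000 lookups per second. The function name `brute_force_seconds` is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY


def brute_force_seconds(name: str, rate_per_second: int = 10_000) -> float:
    """Seconds needed to try every lowercase-letter string of the same letter count.

    Assumption: only the letters in `name` are counted (digits and hyphens
    are ignored), so this is a lower bound, like the "at least 26^8" estimate.
    """
    letters = sum(1 for ch in name if ch.isalpha())
    return 26 ** letters / rate_per_second


short = brute_force_seconds("us-vpn-poc")
longer = brute_force_seconds("ds1-eu-central")

print(f"us-vpn-poc: {26 ** 8} combinations, about {short / SECONDS_PER_DAY:.0f} days")
print(f"ds1-eu-central: about {longer / SECONDS_PER_YEAR:.0f} years")
```

Because digits and hyphens are left out of the alphabet, the real search space is larger than what this prints.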

To make finding subdomains easier, there are a lot of different wordlists of popular subdomain names that help researchers find targets quickly.

There are also many excellent tools for enumerating subdomains, each with its own built-in wordlist.

So the idea was to take every domain in public bug bounty scopes, enumerate all of their subdomains with amass, analyze the results, and generate a few wordlists that may be helpful.

So, what are the top 10 subdomains for each level?

In column 0, I replaced www with the next most popular subdomain.

0 1 2 3 4 5
api mail ns matching c aws
m cust mail tms aws c
dev spider r isp tms net
mail insight ctr my paas tms
staging search stage internal k8s on
test storage ll aws s0
autodiscover fr np dmz us
stage us staff cloud internal
app m compute community
blog fwd c us
support my dev api

In general, the most popular subdomains have self-describing names (api, mail, aws, search, etc.) that refer directly to their purpose. In addition, all top masks contain only Latin characters.
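A per-level ranking like the one above could be produced roughly as follows. This is a minimal sketch, not the script used for this repository: it assumes the root domain of each hostname is already known, and it assumes level 0 is the label closest to the root domain. The helper names and the example.com hostnames are made up for illustration.

```python
from collections import Counter, defaultdict


def labels_by_level(hostname: str, root_domain: str) -> list[str]:
    """Return the labels left of root_domain, closest-to-root first.

    Assumption: level 0 is the label right next to the root domain.
    """
    hostname = hostname.lower().rstrip(".")
    suffix = "." + root_domain.lower().rstrip(".")
    if not hostname.endswith(suffix):
        return []  # the bare root domain or an unrelated host
    prefix = hostname[: -len(suffix)]
    return [label for label in reversed(prefix.split(".")) if label]


def top_per_level(pairs, n=10):
    """Map level -> the n most frequent labels seen on that level."""
    counts = defaultdict(Counter)
    for hostname, root in pairs:
        for level, label in enumerate(labels_by_level(hostname, root)):
            counts[level][label] += 1
    return {
        level: [label for label, _ in counter.most_common(n)]
        for level, counter in sorted(counts.items())
    }


# Made-up sample data, only to show the shape of the result.
sample = [
    ("api.example.com", "example.com"),
    ("dev.api.example.com", "example.com"),
    ("api.example.org", "example.org"),
    ("mail.example.org", "example.org"),
]
top = top_per_level(sample, n=2)
print(top)
```

On real data the root domain would have to come from the scope list, since it cannot be guessed reliably from the hostname alone.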

What is the most common length of subdomains on each level?

On most levels, the most common length is 3-4 characters.
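A length distribution like this can be computed with a few lines. The sketch below assumes the input is a list of subdomains with the root domain already removed, and that level 0 is the label closest to the root. The function name `common_lengths` and the sample values are made up for illustration.

```python
from collections import Counter


def common_lengths(prefixes, n=2):
    """Map level -> the n most common label lengths on that level.

    Assumption: each prefix is a subdomain with the root domain removed,
    for example "ds1-eu-central.portal", and level 0 is the rightmost label.
    """
    lengths = {}
    for prefix in prefixes:
        labels = [part for part in prefix.strip().lower().split(".") if part]
        for level, label in enumerate(reversed(labels)):
            lengths.setdefault(level, Counter())[len(label)] += 1
    return {
        level: [size for size, _ in counter.most_common(n)]
        for level, counter in sorted(lengths.items())
    }


# Made-up sample input, only to show the shape of the result.
sample_prefixes = ["api", "dev.api", "mail", "ds1-eu-central.portal", "m"]
result = common_lengths(sample_prefixes)
print(result)
```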

So this is it!

You can download the full lists from the following links:

Link Word count Info
all 401668 All valid collected subdomains with the root domain removed.
all_unchecked 997285 All collected subdomains with the root domain removed.
complex 67265 Words used in complex subdomain names like mon01-dev-test, so the list contains words like mon01, dev, test.
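As a rough illustration of how a list like complex could be built, the sketch below splits hyphenated labels into their component words and counts them. Splitting on hyphens only is an assumption based on the mon01-dev-test example, and the function name `complex_words` is made up. The actual list may have been generated differently.

```python
from collections import Counter


def complex_words(labels):
    """Count the words that make up hyphenated ("complex") labels.

    Assumption: a label is complex if it contains at least one hyphen,
    and its words are the non-empty pieces between hyphens.
    """
    counts = Counter()
    for label in labels:
        parts = [part for part in label.lower().split("-") if part]
        if len(parts) > 1:  # skip simple labels such as "api"
            counts.update(parts)
    return counts


words = complex_words(["mon01-dev-test", "us-vpn-poc", "api", "dev-test"])
for word, count in words.most_common():
    print(word, count)
```

Simple labels such as api are skipped here, so only words that appear inside compound names are counted.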

In the wordlists folder you can find lists for each subdomain level, from 1 up to 9*.

*Numbering begins at 0

substats's People

Contributors

zzzteph
