web-traffic-generator's Introduction

web-traffic-generator

A quick and dirty HTTP/S "organic" traffic generator.

About

Just a simple (poorly written) Python script that aimlessly "browses" the internet by starting at pre-defined ROOT_URLS and randomly "clicking" links on pages until the pre-defined MAX_DEPTH is met.

I created this as a noise generator to use for an Incident Response / Network Defense simulation. The only issue is that my simulation environment uses multiple IDS/IPS/NGFW devices that will not pass and log simple TCPreplays of canned traffic. I needed the traffic to be as organic as possible, essentially mimicking real users browsing the web.

Tested on Ubuntu 14.04 & 16.04 minimal, but should work on any system with Python installed.

How it works

About as simple as it gets...
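In outline, the random "clicking" described under About can be sketched as below. This is only an illustration, not the code in gen.py: FAKE_WEB, browse, and get_links are hypothetical names, and an in-memory dictionary stands in for real pages so the sketch runs without network access.

```python
import random

# Hypothetical in-memory "web": each URL maps to the links found on that page.
FAKE_WEB = {
    "https://a.example": ["https://a.example/1", "https://a.example/2"],
    "https://a.example/1": ["https://a.example/2"],
    "https://a.example/2": [],
}


def browse(url, depth, get_links=FAKE_WEB.get, visited=None):
    """Follow one random link per page until depth runs out or a page has no links."""
    visited = [] if visited is None else visited
    visited.append(url)
    links = get_links(url) or []
    if depth <= 1 or not links:
        return visited
    return browse(random.choice(links), depth - 1, get_links, visited)


path = browse("https://a.example", depth=3)
print(" -> ".join(path))
```

In a real run, get_links would fetch the page and scrape its links instead of reading from a dictionary.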

First, specify a few settings at the top of the script...

  • MAX_DEPTH = 10, MIN_DEPTH = 5 Starting from each root URL (e.g. www.yahoo.com), our generator will click to a depth randomly selected between MIN_DEPTH and MAX_DEPTH.
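A rough sketch of that depth selection, assuming both bounds are inclusive (the script itself may do this differently). The helper name pick_depth is hypothetical:

```python
import random

# Values mirror the defaults described above.
MIN_DEPTH = 5
MAX_DEPTH = 10


def pick_depth(min_depth=MIN_DEPTH, max_depth=MAX_DEPTH):
    """Return a random click depth; random.randint includes both bounds."""
    return random.randint(min_depth, max_depth)


depth = pick_depth()
print("Browsing to depth", depth)
```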

The interval between HTTP GET requests is chosen at random between the following two variables...

  • MIN_WAIT = 5 Wait a minimum of 5 seconds between requests... Be careful about making requests too quickly, as that tends to piss off web servers.

  • MAX_WAIT = 10 I think you get the point.

  • DEBUG = False A poor man's logger. Set to True for verbose realtime printing to console for debugging or development. I'll incorporate proper logging later on (maybe).

  • ROOT_URLS = [url1,url2,url3] The list of root URLs to start from when browsing. Randomly selected.

  • blacklist = [".gif", "intent/tweet", "badlink", etc...] A blacklist of strings that we check every link against. If the link contains any of the strings in this list, it's discarded. Useful to avoid things that are not traffic-generator friendly like "Tweet this!" links or links to image files.

  • userAgent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_3).......' You guessed it, the user-agent our headless browser hands over to the web server. You can probably leave it set to the default, but feel free to change it. I would strongly suggest using a common, valid one, or else you'll likely get rate-limited quickly.
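To make the blacklist and wait settings concrete, here is a minimal sketch of how a link might be checked against the blacklist and how a pause might be picked. The helper names is_allowed and pick_wait are hypothetical, the example.com links are made up, and substring matching is assumed from the description above:

```python
import random

MIN_WAIT = 5
MAX_WAIT = 10
# Example entries taken from the list above.
blacklist = [".gif", "intent/tweet", "badlink"]


def is_allowed(link, blocked=blacklist):
    """Return True when the link contains none of the blacklisted substrings."""
    return not any(fragment in link for fragment in blocked)


def pick_wait(min_wait=MIN_WAIT, max_wait=MAX_WAIT):
    """Return a random pause in whole seconds, bounds inclusive."""
    return random.randint(min_wait, max_wait)


links = [
    "https://example.com/article",
    "https://example.com/logo.gif",
    "https://twitter.com/intent/tweet?text=hi",
]
valid = [link for link in links if is_allowed(link)]
print(valid)  # ['https://example.com/article']
print("Would pause for", pick_wait(), "seconds")
```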

Dependencies

Only thing you need and might not have is requests. Grab it with

sudo pip install requests

Usage

Create your config file first:

cp config.py.template config.py

Run the generator:

python gen.py

Troubleshooting and debugging

To get more deets on what is happening under the hood, change the DEBUG variable in config.py from False to True. This provides the following output...

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Traffic generator started
Diving between 3 and 10 links deep into 489 different root URLs,
Waiting between 5 and 10 seconds between requests.
This script will run indefinitely. Ctrl+C to stop.
Randomly selecting one of 489 URLs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Recursively browsing [https://arstechnica.com] ~~~ [depth = 7]
  Requesting page...
  Page size: 77.6KB
  Data meter: 77.6KB
  Good requests: 1
  Bad reqeusts: 0
  Scraping page for links
  Found 171 valid links
  Pausing for 7 seconds...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Recursively browsing [https://arstechnica.com/author/jon-brodkin/] ~~~ [depth = 6]
  Requesting page...
  Page size: 75.7KB
  Data meter: 153.3KB
  Good requests: 2
  Bad reqeusts: 0
  Scraping page for links
  Found 168 valid links
  Pausing for 9 seconds...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Recursively browsing [https://arstechnica.com/information-technology/2020/01/directv-races-to-decommission-broken-boeing-satellite-before-it-explodes/] ~~~ [depth = 5]
  Requesting page...
  Page size: 43.8KB
  Data meter: 197.1KB
  Good requests: 3
  Bad reqeusts: 0
  Scraping page for links
  Found 32 valid links
  Pausing for 8 seconds...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Recursively browsing [https://www.facebook.com/sharer.php?u=https%3A%2F%2Farstechnica.com%2F%3Fpost_type%3Dpost%26p%3D1647915] ~~~ [depth = 4]
  Requesting page...
  Page size: 64.2KB
  Data meter: 261.2KB
  Good requests: 4
  Bad reqeusts: 0
  Scraping page for links
  Found 0 valid links
  Stopping and blacklisting: no links

The last URL attempted is a good example of what happens when a particular URL fails, in this case because no valid links were found on the page. We simply add it to our config.blacklist array in memory and continue browsing. This prevents a known bad URL from returning to the queue.
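The blacklisting step described above could look something like this. It is a sketch under assumptions, not the script's actual code: handle_result is a hypothetical helper and the URL is a placeholder.

```python
blacklist = [".gif", "intent/tweet"]


def handle_result(url, links, blocked=blacklist):
    """Blacklist a URL that yielded no links; return True if browsing can go deeper."""
    if not links:
        # Only the in-memory list changes; nothing is written back to config.py.
        blocked.append(url)
        return False
    return True


ok = handle_result("https://example.com/dead-end", [])
print(ok)         # False
print(blacklist)  # ['.gif', 'intent/tweet', 'https://example.com/dead-end']
```

Because blacklist entries are matched as substrings, any later link containing the dead-end URL would be discarded as well.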

web-traffic-generator's People

Contributors

ecapuano, shyftxero, waywardone

web-traffic-generator's Issues

A bit of a bug?

I uploaded "config.py" and "gen.py" to a host and ran "gen.py", and it returned a 404 error. After some research, I ended up adding print "Content-type: text/html\n\n" right after #!/usr/bin/python. It no longer returned a 404, just a blank page in the browser.
I don't know if this is how the script was designed to work, since I didn't see any confirmation of whether it worked or not. How is the script supposed to show that it executed? Thanks.

How to use.

How can one use this bot, step by step? Can we put it on an online server? Will it be an app that one needs to run every time it is needed? Could it be automated and run even when the admin is very ill?

The software is not working

After adding the URL and then the proxies listed in Notepad, the website's analytics still show no traffic! I don't know why, but I can say it's not working.

I can't make config.py work

What I've done already:
git clone https://github.com/ecapuano/web-traffic-generator.git

cd web-traffic-generator

and

I go straight to

python gen.py

How do I create config.py from the template? I'm trying to create config.py but can't work it out.

cp config.py.template config.py

Requests

Can't grab requests, as my Mac says the command is invalid.
