avalz / waf-a-mole

A guided mutation-based fuzzer for ML-based Web Application Firewalls

License: MIT License

Shell 0.08% Python 99.92%
web web-security web-application-firewall machine-learning adversarial-machine-learning

waf-a-mole's Introduction

WAF-A-MoLE

A guided mutation-based fuzzer for ML-based Web Application Firewalls, inspired by AFL and based on the FuzzingBook by Andreas Zeller et al.

Given an input SQL injection query, it tries to produce a semantic invariant query that is able to bypass the target WAF. You can use this tool for assessing the robustness of your product by letting WAF-A-MoLE explore the solution space to find dangerous "blind spots" left uncovered by the target classifier.


Architecture

WAF-A-MoLE Architecture

WAF-A-MoLE takes an initial payload and inserts it into the payload Pool, which manages a priority queue ordered by the WAF confidence score assigned to each payload.

At each iteration, the head of the payload Pool is passed to the Fuzzer, where it is randomly mutated by applying one of the available mutation operators.
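This Pool/Fuzzer loop can be sketched as follows. This is a simplified illustration, not the project's actual API: `classify` and `mutate` stand in for the target WAF and the Fuzzer's mutation operators.

```python
import heapq

def evade(initial_payload, classify, mutate, threshold=0.5, max_rounds=1000):
    # Min-heap keyed by WAF confidence: the payload the WAF is least
    # sure about is mutated first (guided search).
    pool = [(classify(initial_payload), initial_payload)]
    for _ in range(max_rounds):
        confidence, payload = heapq.heappop(pool)
        if confidence < threshold:
            return payload  # classified as benign: evasion found
        mutated = mutate(payload)  # apply one random mutation operator
        heapq.heappush(pool, (classify(mutated), mutated))
        heapq.heappush(pool, (confidence, payload))  # parent stays in the pool
    return None  # round budget exhausted without evading
```

The priority queue is what makes the search *guided* rather than purely random: mutation effort concentrates on the payloads closest to crossing the classification threshold.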

Mutation operators

Mutation operators are all semantics-preserving; they leverage the expressive power of the SQL language (in this version, MySQL).

Below are the mutation operators available in the current version of WAF-A-MoLE.

| Mutation | Example (before → after) |
|---|---|
| Case Swapping | `admin' OR 1=1#` → `admin' oR 1=1#` |
| Whitespace Substitution | `admin' OR 1=1#` → `admin'\t\rOR\n1=1#` |
| Comment Injection | `admin' OR 1=1#` → `admin'/**/OR 1=1#` |
| Comment Rewriting | `admin'/**/OR 1=1#` → `admin'/*xyz*/OR 1=1#abc` |
| Integer Encoding | `admin' OR 1=1#` → `admin' OR 0x1=(SELECT 1)#` |
| Operator Swapping | `admin' OR 1=1#` → `admin' OR 1 LIKE 1#` |
| Logical Invariant | `admin' OR 1=1#` → `admin' OR 1=1 AND 0<1#` |
| Number Shuffling | `admin' OR 1=1#` → `admin' OR 2=2#` |
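As a concrete illustration, whitespace substitution can be sketched as below. This is a simplified stand-alone version, not the project's implementation:

```python
import random

# MySQL treats these characters interchangeably as token separators,
# so swapping one for another preserves the query's semantics while
# changing its byte representation.
WHITESPACE = [" ", "\t", "\n", "\r"]

def whitespace_substitution(payload: str) -> str:
    """Replace each space in the payload with a random whitespace character."""
    return "".join(random.choice(WHITESPACE) if c == " " else c
                   for c in payload)
```

Operators like this are cheap to apply and compose, which is why the fuzzer can explore a large space of equivalent payloads quickly.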

How to cite us

WAF-A-MoLE implements the methodology presented in "WAF-A-MoLE: Evading Web Application Firewalls through Adversarial Machine Learning". A pre-print of our article can also be found on arXiv.

If you want to cite us, please use the following (BibTeX) reference:

@inproceedings{demetrio20wafamole,
  title={WAF-A-MoLE: evading web application firewalls through adversarial machine learning},
  author={Demetrio, Luca and Valenza, Andrea and Costa, Gabriele and Lagorio, Giovanni},
  booktitle={Proceedings of the 35th Annual ACM Symposium on Applied Computing},
  pages={1745--1752},
  year={2020}
}

Running WAF-A-MoLE

Prerequisites

Setup

pip install -r requirements.txt

Sample Usage

You can evaluate the robustness of your own WAF, or try WAF-A-MoLE against some example classifiers. For a custom WAF, have a look at the Model class: your custom model needs to implement this class in order to be evaluated by WAF-A-MoLE. We already provide wrappers for scikit-learn and Keras classifiers that can be extended to fit your feature extraction phase (if any).

Help

wafamole --help

Usage: wafamole [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  evade  Launch WAF-A-MoLE against a target classifier.

wafamole evade --help

Usage: wafamole evade [OPTIONS] MODEL_PATH PAYLOAD

  Launch WAF-A-MoLE against a target classifier.

Options:
  -T, --model-type TEXT     Type of classifier to load
  -t, --timeout INTEGER     Timeout when evading the model
  -r, --max-rounds INTEGER  Maximum number of fuzzing rounds
  -s, --round-size INTEGER  Fuzzing step size for each round (parallel fuzzing
                            steps)
  --threshold FLOAT         Classification threshold of the target WAF [0.5]
  --random-engine TEXT      Use random transformations instead of evolution
                            engine. Set the number of trials
  --output-path TEXT        Location where to save the results of the random
                            engine. NOT USED WITH REGULAR EVOLUTION ENGINE
  --help                    Show this message and exit.

Evading example models

We provide some pre-trained models you can have fun with, located in wafamole/models/custom/example_models. The classifiers we used are listed in the table below.

| Classifier name | Algorithm |
|---|---|
| WafBrain | Recurrent Neural Network |
| ML-Based-WAF | Non-Linear SVM |
| ML-Based-WAF | Stochastic Gradient Descent |
| ML-Based-WAF | AdaBoost |
| Token-based | Naive Bayes |
| Token-based | Random Forest |
| Token-based | Linear SVM |
| Token-based | Gaussian SVM |
| SQLiGoT | Directed Proportional Gaussian SVM |
| SQLiGoT | Directed Unproportional Gaussian SVM |
| SQLiGoT | Undirected Proportional Gaussian SVM |
| SQLiGoT | Undirected Unproportional Gaussian SVM |

In addition to ML-based WAFs, WAF-A-MoLE also supports rule-based WAFs. Specifically, it provides a wrapper for the ModSecurity WAF equipped with the OWASP Core Rule Set (CRS), based on the pymodsecurity project.

WAF-BRAIN - Recurrent Neural Network

Bypass the pre-trained WAF-Brain classifier using an `admin' OR 1=1#` equivalent.

wafamole evade --model-type waf-brain wafamole/models/custom/example_models/waf-brain.h5  "admin' OR 1=1#"

ML-Based-WAF - Non-Linear SVM (with original WAF-A-MoLE dataset)

Bypass the pre-trained ML-Based-WAF SVM classifier using an `admin' OR 1=1#` equivalent.

wafamole evade --model-type mlbasedwaf wafamole/models/custom/example_models/mlbasedwaf_svc.dump  "admin' OR 1=1#"

ML-Based-WAF - Non-Linear SVM (with SQLiV5/SQLiV3 datasets)

Bypass the pre-trained ML-Based-WAF SVM classifier using an `admin' OR 1=1#` equivalent. Note that SQLiV5 is a dataset sourced from Kaggle, expanded with a series of queries generated by WAF-A-MoLE itself, as a proof of concept that WAF-A-MoLE queries can enhance the robustness of a WAF through retraining. Use mlbasedwaf_svc_sqliv3.dump to bypass the WAF trained with the original Kaggle dataset (SQLiV3).

wafamole evade --model-type mlbasedwaf wafamole/models/custom/example_models/mlbasedwaf_svc_sqliv5.dump  "admin' OR 1=1#"

ML-Based-WAF - Stochastic Gradient Descent (SGD)

Bypass the pre-trained ML-Based-WAF SGD classifier using an `admin' OR 1=1#` equivalent.

wafamole evade --model-type mlbasedwaf wafamole/models/custom/example_models/mlbasedwaf_sgd.dump  "admin' OR 1=1#"

ML-Based-WAF - AdaBoost

Bypass the pre-trained ML-Based-WAF AdaBoost classifier using an `admin' OR 1=1#` equivalent (takes longer than the other models, at around 2 to 5 minutes of runtime).

wafamole evade --model-type mlbasedwaf wafamole/models/custom/example_models/mlbasedwaf_ada.dump  "admin' OR 1=1#"

Token-based - Naive Bayes

Bypass the pre-trained token-based Naive Bayes classifier using an `admin' OR 1=1#` equivalent.

wafamole evade --model-type token wafamole/models/custom/example_models/naive_bayes_trained.dump  "admin' OR 1=1#"

Token-based - Random Forest

Bypass the pre-trained token-based Random Forest classifier using an `admin' OR 1=1#` equivalent.

wafamole evade --model-type token wafamole/models/custom/example_models/random_forest_trained.dump  "admin' OR 1=1#"

Token-based - Linear SVM

Bypass the pre-trained token-based Linear SVM classifier using an `admin' OR 1=1#` equivalent.

wafamole evade --model-type token wafamole/models/custom/example_models/lin_svm_trained.dump  "admin' OR 1=1#"

Token-based - Gaussian SVM

Bypass the pre-trained token-based Gaussian SVM classifier using an `admin' OR 1=1#` equivalent.

wafamole evade --model-type token wafamole/models/custom/example_models/gauss_svm_trained.dump  "admin' OR 1=1#"

SQLiGoT

Bypass the pre-trained SQLiGoT classifier using an `admin' OR 1=1#` equivalent. Use DP, UP, DU, or UU for (respectively) Directed Proportional, Undirected Proportional, Directed Unproportional, and Undirected Unproportional.

wafamole evade --model-type DP wafamole/models/custom/example_models/graph_directed_proportional_sqligot "admin' OR 1=1#"

OWASP ModSecurity CRS - Rule-based WAF

Bypass the OWASP ModSecurity CRS using an `admin' OR 1=1#` equivalent. The user also needs to specify the Paranoia Level, as well as the path where the CRS rules are located (e.g., /etc/coreruleset).

wafamole evade --model-type modsecurity_pl[1-4] /etc/coreruleset "admin' OR 1=1#"

BEFORE LAUNCHING EVALUATION ON SQLiGoT

These classifiers are more robust than the others, as the feature extraction phase produces vectors with a more complex structure, and all pre-trained classifiers have been strongly regularized. It may take hours for some variants to produce a payload that achieves evasion (see the Benchmark section).

Note on newer ML-Based-WAF models

Some models based on a slightly modified version of vladan-stojnic's ML-Based-WAF have recently been added, coming from wafamole++, an extension of WAF-A-MoLE by nidnogg. Testing the AdaBoost model might take longer than usual (around 2 to 5 minutes).

Some variants are trained with the SQLiV5.json dataset, while most use the original WAF-A-MoLE SQL injection dataset by default.

A Google Colaboratory notebook is provided with the training routines for some of these models, using the original WAF-A-MoLE dataset (modified to the SQLiV5 format). Any dataset can be used, as long as it is in the same format as SQLiV5.json.

Custom adapters

First, create a custom Model class that implements the extract_features and classify methods.

class YourCustomModel(Model):
    def extract_features(self, value: str):
        # Turn the raw payload into the feature vector your classifier expects
        feature_vector = your_custom_feature_function(value)
        return feature_vector

    def classify(self, value):
        # Return the confidence that `value` is a SQL injection
        confidence = your_confidence_eval(value)
        return confidence

Then, create an object from the model and instantiate an engine object that uses your model class.

model = YourCustomModel()  # your init
engine = EvasionEngine(model)
result = engine.evaluate(payload, max_rounds, round_size, timeout, threshold)

Benchmark

We evaluated WAF-A-MoLE against all our example models.

The plot below shows the time it took for WAF-A-MoLE to mutate the admin' OR 1=1# payload until it was accepted by each classifier as benign.

On the x axis we have time (in seconds, logarithmic scale). On the y axis we have the confidence value, i.e., how sure a classifier is that a given payload is a SQL injection (in percentage).

Notice that being "50% sure" that a payload is a SQL injection is equivalent to flipping a coin. This is the usual classification threshold: if the confidence is lower, the payload is classified as benign.

Benchmark over time

Experiments were performed on DigitalOcean Standard Droplets.

Contribute

Questions, bug reports and pull requests are welcome.

In particular, if you are interested in expanding this project, we are looking for the following contributions:

  1. New WAF adapters
  2. New mutation operators
  3. New search algorithms

Team

waf-a-mole's People

Contributors: avalz, biagiom, nidnogg, zangobot, zxgio

waf-a-mole's Issues

Parametrize ModSecurity Paranoia Level

Currently, the Paranoia Level for the ModSecurity wrapper is hardcoded.

We need a smarter way to provide the Paranoia Level from the CLI.

Possible solution: rewrite the CLI to have one command per model, instead of a single evade command.

Unable to run wafamole due to the error, Can not load keras model

I am running wafamole in a virtual machine with Ubuntu Desktop 24.04 LTS, downloaded from https://ubuntu.com/download/desktop. I then ran git clone https://github.com/AvalZ/WAF-A-MoLE, installed the dependencies from requirements.txt (all requirements were satisfied), and ran python setup.py install, which completed successfully.

I then tried to run wafamole with the command wafamole evade --model-type waf-brain models/custom/example_models/waf-brain.h5 "admin' OR 1=1#" and received the error "Can not load keras model." I have attached a screenshot.

Please kindly advise and assist.

Wafamole Error

SqlFuzzer issues

Hi,

I wanted to use your solution in a project I am working on, and I noticed that the SqlFuzzer sometimes doesn't produce valid SQL. I wanted to double-check whether someone else has had the same issues and whether my assumptions are correct.

  1. In the comment_rewriting method, when rewriting multiline comments in this payload:
    select */**/ from mytable (a select * command with added comments)
    the method identifies the first occurrence of */ as the end of the multiline comment and generates invalid SQL:
    select blah*/**/ from mytable

  2. The swap_keywords method, when selecting the mapping "OR" => [" OR ", "||"] in a payload containing the WAITFOR SQL command, will transform it into WAITF OR.
    (Of course, this is not an issue if WAITFOR is not used in the payload, or if the SQL dialect in use does not support it.)
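The first problem can be reproduced with a sketch of the reported behavior. This is hypothetical code illustrating the bug report, not the project's actual implementation:

```python
def buggy_comment_rewriting(payload: str, filler: str = "blah") -> str:
    # Assumes the first "*/" in the string closes a multiline comment.
    # In "select */**/ from mytable" the first "*/" is actually the "*"
    # of "select *" followed by the "/" of "/**/", so the filler lands
    # outside the comment and the query becomes invalid SQL.
    end = payload.find("*/")
    if end == -1:
        return payload
    return payload[:end] + filler + payload[end:]

# → "select blah*/**/ from mytable" (invalid SQL)
print(buggy_comment_rewriting("select */**/ from mytable"))
```

A correct rewriter would have to locate a matching `/* ... */` pair, rather than scanning for the first `*/` in isolation.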

Troubles running sample models with evade

Hey there!

So I've run into this issue when we're attempting to run most of the example models (pretty much all except WAF Brain):

https://pastebin.com/wS6DXJG4

Switching to a Python 3.5 installation and installing scikit-learn 0.21.1, as suggested by the warning, gives us a plethora of syntax errors instead, which I could post here as well.

I noticed that whenever I first run setup.py, I get a "file wafamole.py (for module wafamole) not found" warning at the end of the process, after which I'm able to run setup.py install as if nothing had happened. Is that file intended to be missing, and/or could it have something to do with the evade errors found afterwards?

Error running 'wafamole --help'

Hi team,

I followed the setup instruction by:

  1. installing the required modules
  2. running python setup.py install
  3. verifying the installation with wafamole --help

Traceback error: ModuleNotFoundError: No module named 'wafamole.models.custom'
