ait-aecid / aecid-alert-aggregation

A method for grouping, clustering, and merging semi-structured alerts

License: GNU General Public License v3.0

Languages: Python 75.63%, R 24.37%
Topics: alerts, clustering, hids, ids, json, log-data, logs, machine-learning, nids, unsupervised, alert-aggregation, meta-alerts, semi-structured, similarity, intrusion-detection

aecid-alert-aggregation

A method for grouping, clustering, and merging semi-structured alerts. To get started, just clone this repository and execute

python3 aggregate.py

to run the aecid-alert-aggregation with the default input files and configurations. To change the configuration, edit the aggregate_config.py file. The input files provided in this repository are alerts generated by aminer and Wazuh IDS that were used to analyze the AIT-LDSv1.1.

When the Python script runs, it prints the current status of the aggregation to the console. In its standard configuration, the script runs for several minutes and then writes the generated meta-alerts to the directory specified in the configuration file. The output should look as follows:

ubuntu@ubuntu:~/aecid-alert-aggregation$ python3 aggregate.py
delta = 0.5: 125 groups in ['data/ossec/ossec_cup.json', 'data/aminer/aminer_cup.txt']
delta = 5: 19 groups in ['data/ossec/ossec_cup.json', 'data/aminer/aminer_cup.txt']
delta = 0.5: 19 groups in ['data/ossec/ossec_onion.json', 'data/aminer/aminer_onion.txt']
delta = 5: 11 groups in ['data/ossec/ossec_onion.json', 'data/aminer/aminer_onion.txt']
delta = 0.5: 885 groups in ['data/ossec/ossec_insect.json', 'data/aminer/aminer_insect.txt']
delta = 5: 14 groups in ['data/ossec/ossec_insect.json', 'data/aminer/aminer_insect.txt']
delta = 0.5: 919 groups in ['data/ossec/ossec_spiral.json', 'data/aminer/aminer_spiral.txt']
delta = 5: 23 groups in ['data/ossec/ossec_spiral.json', 'data/aminer/aminer_spiral.txt']
Now processing file 1/4...
 Processing groups with delta = 0.5
  Processed group 1/125 from {'nmap'} phase with 2 alerts. New meta-alert 0 generated. (sim=-1.0)
  Processed group 2/125 from {'nmap'} phase with 2 alerts. New meta-alert 1 generated. (sim=0.0)
  Processed group 3/125 from {'nikto'} phase with 19603 alerts. New meta-alert 2 generated. (sim=0.0)
  Processed group 4/125 from {'vrfy'} phase with 2 alerts. New meta-alert 3 generated. (sim=0.0)
  Processed group 5/125 from {'vrfy'} phase with 38 alerts. New meta-alert 4 generated. (sim=0.0)
  Processed group 6/125 from {'vrfy'} phase with 100 alerts. Add group to meta-alert 4 (sim=0.44) representing {'vrfy'}
  Processed group 7/125 from {'vrfy'} phase with 80 alerts. Add group to meta-alert 4 (sim=1.0) representing {'vrfy'}
  Processed group 8/125 from {'vrfy'} phase with 80 alerts. Add group to meta-alert 4 (sim=1.0) representing {'vrfy'}
  ...

 Processing groups with delta = 5
  Processed group 1/23 from {'nmap'} phase with 4 alerts. Add group to meta-alert 22 (sim=0.6) representing {'nmap'}
  Processed group 2/23 from {'nikto'} phase with 6150 alerts. Add group to meta-alert 49 (sim=0.97) representing {'nikto'}
  Processed group 3/23 from {'vrfy'} phase with 766 alerts. Add group to meta-alert 25 (sim=0.82) representing {'vrfy'}
  Processed group 4/23 from {'hydra'} phase with 156 alerts. Add group to meta-alert 26 (sim=0.54) representing {'hydra', 'upload'}
  Processed group 5/23 from {'hydra'} phase with 101 alerts. Add group to meta-alert 26 (sim=0.52) representing {'hydra', 'upload'}
  Processed group 6/23 from {'hydra'} phase with 125 alerts. Add group to meta-alert 26 (sim=0.49) representing {'hydra', 'upload'}
  Processed group 7/23 from {'hydra'} phase with 107 alerts. Add group to meta-alert 26 (sim=0.54) representing {'hydra', 'upload'}
  Processed group 8/23 from {'hydra'} phase with 43 alerts. Add group to meta-alert 26 (sim=0.46) representing {'hydra', 'upload'}
  Processed group 9/23 from {'hydra'} phase with 148 alerts. Add group to meta-alert 26 (sim=0.49) representing {'hydra', 'upload'}
  Processed group 10/23 from {'hydra'} phase with 14 alerts. Add group to meta-alert 27 (sim=0.61) representing {'hydra'}
  Processed group 11/23 from {'hydra'} phase with 131 alerts. Add group to meta-alert 26 (sim=0.48) representing {'hydra', 'upload'}
  Processed group 12/23 from {'hydra'} phase with 126 alerts. Add group to meta-alert 26 (sim=0.54) representing {'hydra', 'upload'}
  Processed group 13/23 from {'hydra'} phase with 32 alerts. Add group to meta-alert 26 (sim=0.5) representing {'hydra', 'upload'}
  Processed group 14/23 from {'hydra'} phase with 184 alerts. Add group to meta-alert 26 (sim=0.5) representing {'hydra', 'upload'}
  Processed group 15/23 from {'hydra'} phase with 79 alerts. Add group to meta-alert 26 (sim=0.54) representing {'hydra', 'upload'}
  Processed group 16/23 from {'upload'} phase with 4 alerts. Add group to meta-alert 37 (sim=0.58) representing {'upload'}
  Processed group 17/23 from {'upload'} phase with 6 alerts. New meta-alert 55 generated. (sim=0.24)
  Processed group 18/23 from {'upload'} phase with 1 alerts. Add group to meta-alert 24 (sim=0.35) representing {'hydra', 'vrfy', 'upload'}
  Processed group 19/23 from {'exploit'} phase with 2 alerts. Add group to meta-alert 24 (sim=0.7) representing {'hydra', 'exploit', 'vrfy', 'upload'}
  Processed group 20/23 from {'exploit'} phase with 2 alerts. Add group to meta-alert 28 (sim=0.62) representing {'exploit'}
  Processed group 21/23 from {'exploit'} phase with 3 alerts. Add group to meta-alert 29 (sim=0.66) representing {'exploit'}
  Processed group 22/23 from {'exploit'} phase with 3 alerts. Add group to meta-alert 29 (sim=0.75) representing {'exploit'}
  Processed group 23/23 from {'exploit'} phase with 4 alerts. Add group to meta-alert 30 (sim=0.58) representing {'exploit'}

Results:
 delta = 0.5: 41 meta-alerts generated
 delta = 5: 15 meta-alerts generated

Meta-alerts are stored in data/out/aggregate/meta_alerts.txt

Alerts are stored in data/out/aggregate/alerts.txt

Each "Processed group" line of the output corresponds to a group of alerts that occur in close temporal proximity. For each group, the attack phase and the group size (i.e., the number of alerts in the group) are printed. The output also indicates whether the group is added to an existing meta-alert, in which case it shows the similarity between the group and the meta-alert as well as the attack phases represented by that meta-alert, or whether a new meta-alert is generated. As stated at the end of the output, 41 meta-alerts were generated for a time delta of 0.5 seconds and 15 meta-alerts for a time delta of 5 seconds. Use the deltas parameter in the aggregate_config.py file to specify the time deltas.
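The effect of the time delta on the number of groups can be illustrated with a minimal sketch. It assumes that a gap larger than delta between two consecutive alerts starts a new group; the exact grouping rule used by aggregate.py is not spelled out here, and the function name group_by_delta and the timestamps are hypothetical.

```python
from typing import List


def group_by_delta(timestamps: List[float], delta: float) -> List[List[float]]:
    """Split timestamps into groups (assumed rule, not taken from the repository):
    an alert joins the current group if it follows the previous alert
    within `delta` seconds, otherwise it starts a new group."""
    groups: List[List[float]] = []
    for ts in sorted(timestamps):
        if groups and ts - groups[-1][-1] <= delta:
            groups[-1].append(ts)
        else:
            groups.append([ts])
    return groups


# Hypothetical alert timestamps in seconds.
alert_times = [0.0, 0.2, 0.4, 3.0, 3.3, 9.0]

small = group_by_delta(alert_times, 0.5)
large = group_by_delta(alert_times, 5)
print(len(small), "groups for delta = 0.5")
print(len(large), "groups for delta = 5")
```

Under this assumption a larger delta can only merge groups, never split them, which matches the pattern in the output above where delta = 5 yields fewer groups than delta = 0.5 for every input file.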

The directory 'samples' contains several examples that are useful for understanding the aggregation technique. The samples include:

  • sample_similarity.py: similarities of sample alerts
  • sample_group_similarity.py: similarities of sample alert groups
  • sample_merge.py: aggregation of sample alerts
  • sample_group_merge.py: aggregation of sample alert groups
  • sample_hierarchical_clustering.py: execution of the hierarchical clustering method on sample data
  • sample.py: execution of incremental meta-alert generation on sample data (corresponds to scenario 2 in paper)
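The incremental meta-alert generation that sample.py demonstrates can be sketched roughly as follows. This is a simplified illustration, not the repository's implementation: groups are reduced to sets of hypothetical alert type names, Jaccard similarity stands in for the actual similarity measure, merging is a plain set union, and the threshold value is an assumption.

```python
from typing import List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two sets (stand-in for the real group similarity)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def assign_groups(
    groups: List[Set[str]], threshold: float
) -> Tuple[List[Set[str]], List[int]]:
    """Process groups one by one: add each group to the most similar existing
    meta-alert if the similarity reaches the threshold, else create a new one."""
    meta_alerts: List[Set[str]] = []
    assignments: List[int] = []
    for group in groups:
        best_idx, best_sim = -1, -1.0
        for idx, meta in enumerate(meta_alerts):
            sim = jaccard(group, meta)
            if sim > best_sim:
                best_idx, best_sim = idx, sim
        if best_idx >= 0 and best_sim >= threshold:
            meta_alerts[best_idx] |= group  # merge by union (assumption)
            assignments.append(best_idx)
        else:
            meta_alerts.append(set(group))
            assignments.append(len(meta_alerts) - 1)
    return meta_alerts, assignments


# Hypothetical alert groups, each described by the alert types it contains.
sample_groups = [
    {"port_scan", "new_ip"},
    {"port_scan", "new_ip", "syn_flood"},
    {"login_failed", "new_user"},
    {"login_failed", "new_user"},
]

metas, assigned = assign_groups(sample_groups, threshold=0.3)
print(assigned)    # index of the meta-alert each group ended up in
print(len(metas))  # number of meta-alerts generated
```

Because groups are processed in order, the resulting meta-alerts can depend on the order in which groups arrive; a lower threshold produces fewer, broader meta-alerts.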

The directory 'evaluation' contains several scripts that measure the performance of the approach. Note that the respective configurations are inside the scripts instead of the aggregate_config.py file. Evaluation scripts include:

  • mds.py: generates a similarity matrix for multi-dimensional scaling
  • hierarchical_clustering.py: generates an R script for plotting a dendrogram
  • evaluate.py: uses unsupervised clustering for meta-alert generation
  • cross_validation.py: uses supervised training for alert classification
  • noise_evaluate.py: measures the robustness of the approach
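To give an idea of what a dendrogram built from a similarity matrix represents, here is a small pure-Python sketch of agglomerative clustering. It is not taken from hierarchical_clustering.py: single linkage is an assumed linkage criterion, the conversion of similarity to distance as 1 - similarity is an assumption, and the matrix values are made up.

```python
from typing import List, Tuple

Merge = Tuple[List[int], List[int], float]


def single_linkage(dist: List[List[float]]) -> List[Merge]:
    """Repeatedly merge the two closest clusters (single linkage) and
    record each merge as (left members, right members, distance)."""
    clusters: List[List[int]] = [[i] for i in range(len(dist))]
    merges: List[Merge] = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance of the closest pair of members.
                d = min(dist[a][b] for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical pairwise similarities between four alert groups.
similarity = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
distance = [[1.0 - s for s in row] for row in similarity]

merges = single_linkage(distance)
for left, right, d in merges:
    print(f"merge {left} + {right} at distance {d:.2f}")
```

Each recorded merge corresponds to one junction in a dendrogram, with the merge distance giving the height of that junction.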

To execute a sample or evaluation script, first copy it into the main directory, e.g.:

cp samples/sample.py ./sample.py
python3 sample.py

The output will be generated on the console or in the respective directory in data/out.

If you use any resources from this repository, please cite the following publications:

Contributors

dependabot[bot], landauermax, martineian, skopikf

aecid-alert-aggregation's Issues

KeyError: 'test' when running sample.py

Hi there,
I am running sample.py to understand the algorithm.

I get this error:

delta = 1: 8 groups in ['data/sample/test_cup.txt']
delta = 10: 2 groups in ['data/sample/test_cup.txt']
delta = 50: 1 groups in ['data/sample/test_cup.txt']
delta = 100: 1 groups in ['data/sample/test_cup.txt']
delta = 1: 16 groups in ['data/sample/test_spiral.txt']
delta = 10: 4 groups in ['data/sample/test_spiral.txt']
delta = 50: 2 groups in ['data/sample/test_spiral.txt']
delta = 100: 1 groups in ['data/sample/test_spiral.txt']
Now processing file 1/2...
 Processing groups with delta=1
Traceback (most recent call last):
  File "/home/robomotic/DevOps/github/aecid-alert-aggregation/samples/sample.py", line 32, in <module>
    label.label_group(group)
  File "/home/robomotic/DevOps/github/aecid-alert-aggregation/preprocessing/label.py", line 31, in label_group
    attack_label = get_phase(logfile, ts)
  File "/home/robomotic/DevOps/github/aecid-alert-aggregation/attacktimes.py", line 36, in get_phase
    for test_phase, interval in phase[scenario].items():
KeyError: 'test'
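The traceback shows the lookup phase[scenario] in attacktimes.py failing because no entry exists for the key 'test'. The following sketch reproduces that kind of failure with a more descriptive message. The contents of the phase table, the (start, end) interval layout, the "unknown" fallback, and the function name get_phase_checked are all hypothetical; only the phase[scenario].items() access pattern is taken from the traceback.

```python
from typing import Dict, Tuple

# Hypothetical stand-in for the phase table in attacktimes.py:
# scenario name -> attack phase -> (start, end) interval in seconds.
phase: Dict[str, Dict[str, Tuple[float, float]]] = {
    "cup": {"nmap": (0.0, 10.0), "nikto": (10.0, 20.0)},
}


def get_phase_checked(scenario: str, ts: float) -> str:
    """Return the attack phase active at time ts, failing with a clear
    message when the scenario has no entry in the phase table."""
    if scenario not in phase:
        known = ", ".join(sorted(phase))
        raise KeyError(
            f"no attack times for scenario {scenario!r}; known scenarios: {known}"
        )
    for name, (start, end) in phase[scenario].items():
        if start <= ts < end:
            return name
    return "unknown"  # hypothetical fallback label


print(get_phase_checked("cup", 12.0))
try:
    get_phase_checked("test", 12.0)
except KeyError as err:
    print("lookup failed:", err)
```

A check like this does not fix the missing entry, but it makes it easier to see which scenario name was derived from the input file and which names the table actually contains.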

Question about requirement.txt

cdifflib 1.2.5 and editdistance 0.5.3 support Python only up to version 3.7, but astropy 5.3.3 requires Python 3.9 or later. Which Python version should be used?
