The library estimates statistical properties of packets grouped by network flow or by device, given a .pcap file and a target object identifier.
The identifier must be specified at the .pcap processing stage, where packet-related features (packet size, inter-arrival time, and direction) are extracted. For example, to extract device-level stats, run:
export PYTHONPATH=.
python pcap_parsing/main.py \
--pcapfile=traffic_dumps/iot_amazon_echo.pcap \
--identifier='44:65:0d:56:cc:d3'
To process a single flow, pass the flow identifier and add the --flow_level flag:
python pcap_parsing/main.py \
--pcapfile=traffic_dumps/skypeLANhome.pcap \
--identifier="UDP 192.168.0.102:18826 192.168.0.105:26454" \
--flow_level
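As a rough illustration of what the extraction step computes, the sketch below derives the three features for one device from already-parsed packet records. It is not the code in pcap_parsing/main.py: the `Packet` record, the `extract_features` helper, the peer MAC addresses, the 1/0 direction encoding, and the choice to measure inter-arrival time across both directions are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Packet:
    """Hypothetical pre-parsed packet record (not the library's own type)."""
    timestamp: float  # capture time, seconds
    size: int         # packet length, bytes
    src: str          # source identifier (a MAC address here)
    dst: str          # destination identifier


def extract_features(packets: List[Packet], identifier: str) -> List[Tuple[int, float, int]]:
    """Return (size, inter-arrival time, direction) for packets touching `identifier`.

    Direction is 1 when `identifier` is the sender and 0 when it is the receiver
    (an assumed encoding). Inter-arrival time is measured between consecutive
    matching packets; the first matching packet gets 0.0.
    """
    features = []
    previous_time = None
    for pkt in sorted(packets, key=lambda p: p.timestamp):
        if identifier not in (pkt.src, pkt.dst):
            continue  # packet belongs to another device
        iat = 0.0 if previous_time is None else pkt.timestamp - previous_time
        direction = 1 if pkt.src == identifier else 0
        features.append((pkt.size, iat, direction))
        previous_time = pkt.timestamp
    return features


device = "44:65:0d:56:cc:d3"
router = "aa:bb:cc:dd:ee:ff"  # made-up peer address
capture = [
    Packet(0.00, 60, device, router),
    Packet(0.25, 1500, router, device),
    Packet(0.30, 52, "11:22:33:44:55:66", router),  # unrelated traffic, skipped
    Packet(1.00, 60, device, router),
]
features = extract_features(capture, device)
for row in features:
    print(row)
```

For a flow-level identifier, the same loop would presumably match on the protocol and the two IP:port endpoints rather than on a MAC address.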
Given the target stats, there are two approaches to modeling them:
- Train two hidden Markov models (one for each traffic direction), which are already sufficient to recreate network packets of the given flow/device.
python hmm_generator/train_evaluate_hmm.py \
--dataset="traffic_dumps/iot_amazon_echo_44:65:0d:56:cc:d3.csv"
- Train two Gaussian mixtures that map packet features to mixture centroids, effectively transforming the initial features into discrete sequences, which are then processed with a dedicated sequence model. This can be viewed as a decomposition of the HMM framework.
Fit Gaussian mixtures:
python features/train_quantizer.py \
--dataset="traffic_dumps/iot_amazon_echo_44:65:0d:56:cc:d3.csv"
This allows us to easily use various sequence models, like Markov chains:
python markov_baseline/train_evaluate_markov.py \
--dataset="traffic_dumps/iot_amazon_echo_44:65:0d:56:cc:d3.csv" \
--quantizer_path="obj/iot_amazon_echo_44:65:0d:56:cc:d3"
or autoregressive neural networks, either recurrent (RNN) or temporal convolutional (TCN):
python nn_generators/train_generator.py \
--dataset="traffic_dumps/iot_amazon_echo_44:65:0d:56:cc:d3.csv" \
--quantizer_path="obj/iot_amazon_echo_44:65:0d:56:cc:d3" \
--generator_name=RNN
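The quantize-then-model idea behind the second approach can be sketched in plain Python. This is a toy illustration under stated assumptions, not the repository's implementation: the mixture parameters in `COMPONENTS` are made up rather than fitted, the feature vector is reduced to (packet size, inter-arrival time), and the helper names `quantize`, `fit_markov_chain`, and `sample_chain` are hypothetical. Each feature vector is assigned to its most probable mixture component, a first-order Markov chain is estimated over the resulting discrete states, and sampled states are mapped back to component means.

```python
import math
import random
from collections import Counter, defaultdict

# Hypothetical mixture components over (packet size in bytes, inter-arrival time in s).
# In practice these would come from a fitted Gaussian mixture; the values are made up.
COMPONENTS = [
    {"weight": 0.5, "mean": (60.0, 0.05), "var": (100.0, 0.01)},
    {"weight": 0.3, "mean": (1400.0, 0.01), "var": (10000.0, 0.001)},
    {"weight": 0.2, "mean": (500.0, 1.0), "var": (40000.0, 0.25)},
]


def log_score(x, comp):
    """Log of weight * diagonal-Gaussian density of x under one component."""
    score = math.log(comp["weight"])
    for value, mean, var in zip(x, comp["mean"], comp["var"]):
        score += -0.5 * (math.log(2 * math.pi * var) + (value - mean) ** 2 / var)
    return score


def quantize(features):
    """Map each feature vector to the index of its most probable component."""
    return [max(range(len(COMPONENTS)), key=lambda k: log_score(x, COMPONENTS[k]))
            for x in features]


def fit_markov_chain(states):
    """Estimate first-order transition probabilities by counting state pairs."""
    counts = defaultdict(Counter)
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    return {s: {t: n / sum(c.values()) for t, n in c.items()} for s, c in counts.items()}


def sample_chain(transitions, start, length, rng):
    """Generate a state sequence by walking the transition table."""
    sequence = [start]
    while len(sequence) < length and sequence[-1] in transitions:
        options = transitions[sequence[-1]]
        sequence.append(rng.choices(list(options), weights=list(options.values()))[0])
    return sequence


observed = [(62, 0.04), (1380, 0.012), (1420, 0.008), (58, 0.06),
            (520, 0.9), (60, 0.05), (1400, 0.01), (64, 0.03)]
states = quantize(observed)
transitions = fit_markov_chain(states)
generated = sample_chain(transitions, states[0], 10, random.Random(0))
# Map generated states back to component means to obtain synthetic packet features.
synthetic = [COMPONENTS[k]["mean"] for k in generated]
print(states)
print(transitions)
print(synthetic)
```

Because the sequence model only ever sees component indices, the Markov chain here could be swapped for an RNN or TCN without changing the quantization step.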
The code for the paper below is available at this tag:
- Bikmukhamedov R., Nadeev A., Maione G., and Striccoli D., "Comparison of HMM and RNN models for network traffic modeling", Internet Technology Letters, 2020. DOI: 10.1002/itl2.147