hgascon / pulsar

Protocol Learning and Stateful Fuzzing

License: BSD 3-Clause "New" or "Revised" License

Python 98.99% Shell 0.83% R 0.18%
protocol-learning fuzzing security vulnerability-identification networking simulation

pulsar's Introduction

PULSAR

Protocol Learning, Simulation and Stateful Fuzzer

Pulsar is a network fuzzer with automatic protocol learning and simulation capabilities. The tool models a protocol through machine learning techniques such as clustering and Markov models. These models can then be used to simulate communication between Pulsar and a real client or server: the semantically correct messages they generate, combined with a series of fuzzing primitives, make it possible to test the implementation of an unknown protocol for errors in deeper states of its protocol state machine.
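As a toy illustration of the simulation idea (not Pulsar's actual model format), a Markov chain over message clusters can drive a session by sampling the next state from learned transition probabilities. All state names and probabilities below are invented for illustration:

```python
import random

# Toy Markov chain over message-cluster states; Pulsar learns transition
# probabilities like these from captured network traces.
transitions = {
    "START": {"HELLO": 1.0},
    "HELLO": {"AUTH": 0.7, "DATA": 0.3},
    "AUTH":  {"DATA": 1.0},
    "DATA":  {"DATA": 0.6, "END": 0.4},
}

def step(state):
    """Sample the next state, or return None from a terminal state."""
    nxt = transitions.get(state)
    if not nxt:
        return None
    states, probs = zip(*nxt.items())
    return random.choices(states, weights=probs)[0]

# Walk the chain from START until the session ends.
state = "START"
while state is not None and state != "END":
    state = step(state)
```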

For detailed information about the method implemented by Pulsar, you can read the following publications:

Pulsar: Stateful Black-Box Fuzzing of Proprietary Network Protocols
Hugo Gascon, Christian Wressnegger, Fabian Yamaguchi, Daniel Arp and Konrad Rieck
Proc. of 11th EAI International Conference on Security and Privacy in Communication Networks (SECURECOMM) October 2015

Learning Stateful Models for Network Honeypots
Tammo Krueger, Hugo Gascon, Nicole Krämer and Konrad Rieck
ACM Workshop on Security and Artificial Intelligence (AISEC) October 2012

                 _
     _ __  _   _| |___  __ _ _ __
    | '_ \| | | | / __|/ _` | '__|
    | |_) | |_| | \__ \ (_| | |
    | .__/ \__,_|_|___/\__,_|_|  v0.1-dev
    |_|

usage: pulsar.py [-h] [-c CONF] [-l] [-p PCAP] [-b BINARIES] [-a] [-x]
                 [-o OUT] [-d DIMENSION] [-s] [-z] [-m MODEL]

Protocol Learning and Stateful Fuzzing

optional arguments:
  -h, --help            show this help message and exit
  -c CONF, --conf CONF  Change default directory for configuration files. If
                        no directory is given, the files from 'pulsar/conf'
                        will be read.

MODEL LEARNING:
  -l, --learner         Learn a model from a set of network traces.
  -p PCAP, --pcap PCAP  tcpdump output file (pcap) or list of files separated
                        by commas to use as input data for a new model.
  -b BINARIES, --binaries BINARIES
                        Name of binaries to process from the cuckoo storage
                        dir separated with commas.
  -a, --all-binaries    Generate models for all binaries from the cuckoo
                        storage dir (cuckoo/storage/binaries).
  -x, --process         Process derrick files through the functions defined in
                        utils/preprocessing/derrick.py.
  -o OUT, --out OUT     Change output directory for generated models. If no
                        directory is given, the model will be written to the
                        'models' directory.
  -d DIMENSION, --dimension DIMENSION
                        Number of components to be used for NMF clustering.

SIMULATION & FUZZING:
  -s, --simulate        Simulate communication based on a given model.
  -z, --fuzzer          Start a fuzzing session based on a given model.
  -m MODEL, --model MODEL
                        Path of the dir containing the model files to be
                        loaded for simulation or fuzzing.

Configuration

The directory pulsar/conf contains a series of configuration files that define the parameters required by each of Pulsar's methods for automatic learning, simulation and fuzzing.

Examples

Generate the model of a communication channel from individual PCAP files or the recorded traces of one or more binaries run by cuckoo sandbox:

$> pulsar.py -l -p file.pcap (1 pcap file)
$> pulsar.py -b 016169EBEBF1CEC2AAD6C7F0D0EE9026 (1 or more binaries from cuckoo storage)
$> pulsar.py -a (all binaries from cuckoo storage)

Simulate a communication channel based on a learnt model:

$> pulsar.py -s -m model_file

Initiate a fuzzing session against a target given the model of its communication channel:

$> pulsar.py -z -m model_file

pulsar's People

Contributors

dasbruns, hgascon, kaiserd, littletrojan, who3411


pulsar's Issues

Pulsar fails on certain PCAP files

Hi,

I have tested Pulsar on 2 PCAP files: one 3.9 GB (https://download.netresec.com/pcap/maccdc-2011/maccdc2011_00010_20110312194033.pcap.gz) and one 1.4 GB (not publicly available). The smaller one runs to completion but the larger one does not, with the following error:

Error in intI(i, n = x@Dim[1], dn[[1]], give.dn = FALSE) :
index larger than maximal 0
Calls: loadPrismaData ... callGeneric -> eval -> eval -> [ -> [ -> subCsp_rows -> intI
Execution halted
Error during clustering (not enough data?)
Cluster file not generated: ~/Documents/Fuzzing/pulsar/models/maccdc2011/maccdc2011.cluster
Exiting learning module...

Larger file output:

> # reading arguments
> cmd_args<- commandArgs(TRUE)
> prisma_dir<-cmd_args[1]
> capture_dir<-cmd_args[2]
> clusters_file<-cmd_args[3]
> nmf_ncomp<-cmd_args[4]
> print(cmd_args)
[1] "modules/PRISMA/R"                                                          
[2] "~/Documents/Fuzzing/pulsar/models/maccdc2011/maccdc2011"        
[3] "~/Documents/Fuzzing/pulsar/models/maccdc2011/maccdc2011.cluster"
[4] "0"                                                                         
> 
> # store the current directory
> initial_dir<-getwd()
> 
> # load necessary libraries
> # library(PRISMA)
> library(Matrix)
> 
> # change to prisma src dir and load scripts
> setwd(prisma_dir) 
> source("prisma.R")
> source("dimensionEstimation.R")
> source("matrixFactorization.R") 
> setwd(initial_dir)
> 
> # load the dataset
> data = loadPrismaData(capture_dir)
Reading data...
Splitting ngrams...
Calc indices...
Setup matrix...
to check: 2 
Error in intI(i, n = x@Dim[1], dn[[1]], give.dn = FALSE) : 
  index larger than maximal 0
Calls: loadPrismaData ... callGeneric -> eval -> eval -> [ -> [ -> subCsp_rows -> intI
Execution halted
Error during clustering (not enough data?)
Cluster file not generated: ~/Documents/Fuzzing/pulsar/models/maccdc2011/maccdc2011.cluster
Exiting learning module...

Smaller file output:

 > # reading arguments
> cmd_args<- commandArgs(TRUE)
> prisma_dir<-cmd_args[1]
> capture_dir<-cmd_args[2]
> clusters_file<-cmd_args[3]
> nmf_ncomp<-cmd_args[4]
> print(cmd_args)
[1] "modules/PRISMA/R"                                              
[2] "~/Documents/Fuzzing/pulsar/models/test/test"
[3] "~/Documents/Fuzzing/pulsar/models/test/test.cluster"
[4] "0"                                                             
> 
> # store the current directory
> initial_dir<-getwd()
> 
> # load necessary libraries
> # library(PRISMA)
> library(Matrix)
> 
> # change to prisma src dir and load scripts
> setwd(prisma_dir) 
> source("prisma.R")
> source("dimensionEstimation.R")
> source("matrixFactorization.R") 
> setwd(initial_dir)
> 
> # load the dataset
> data = loadPrismaData(capture_dir)
Reading data...
Splitting ngrams...
Calc indices...
Setup matrix...
to check: 551 
to check: 518 
to check: 480 
to check: 479 
to check: 478 
to check: 476 
to check: 455 
to check: 430 
to check: 406 
to check: 404 
to check: 366 
to check: 365 
to check: 346 
to check: 345 
to check: 320 
to check: 319 
to check: 317 
to check: 266 
to check: 264 
to check: 262 
to check: 261 
to check: 241 
to check: 240 
to check: 238 
to check: 221 
to check: 206 
to check: 200 
to check: 198 
to check: 191 
to check: 190 
to check: 171 
to check: 168 
to check: 162 
to check: 157 
to check: 156 
to check: 155 
to check: 154 
to check: 153 
to check: 149 
to check: 148 
to check: 147 
to check: 146 
to check: 145 
to check: 143 
to check: 141 
to check: 139 
to check: 138 
to check: 132 
to check: 130 
to check: 129 
to check: 120 
to check: 119 
to check: 118 
to check: 116 
to check: 114 
to check: 113 
to check: 111 
to check: 110 
to check: 109 
to check: 106 
to check: 105 
to check: 104 
to check: 103 
to check: 102 
to check: 100 
to check: 99 
to check: 98 
to check: 97 
to check: 96 
to check: 95 
to check: 93 
to check: 91 
to check: 87 
to check: 86 
to check: 83 
to check: 81 
to check: 79 
to check: 78 
to check: 76 
to check: 75 
to check: 74 
to check: 73 
to check: 72 
to check: 69 
to check: 68 
to check: 65 
to check: 63 
to check: 62 
to check: 60 
to check: 59 
to check: 58 
to check: 57 
to check: 56 
to check: 55 
to check: 54 
to check: 53 
to check: 52 
to check: 50 
to check: 48 
to check: 47 
to check: 46 
to check: 45 
to check: 44 
to check: 43 
to check: 42 
to check: 41 
to check: 40 
to check: 38 
to check: 37 
to check: 36 
to check: 35 
to check: 33 
to check: 32 
to check: 31 
to check: 30 
to check: 29 
to check: 28 
to check: 27 
to check: 26 
to check: 24 
to check: 23 
to check: 22 
to check: 21 
to check: 20 
to check: 19 
to check: 18 
to check: 15 
to check: 13 
to check: 12 
to check: 11 
to check: 10 
to check: 8 
to check: 7 
to check: 6 
to check: 5 
to check: 4 
to check: 3 
to check: 2 
to check: 1 
> 
> # estimate number of components
> #dim = calcEstimateDimension(data$unprocessed)
> #cat("Estimated dimension:", estimateDimension(dim), "\n")
> #ncomp = estimateDimension(dim)
> 
> dimension = estimateDimension(data)
> ncomp <- dimension[[2]]
> if (ncomp == 0) {
+     ncomp <- strtoi(nmf_ncomp)
+ }
> print (ncomp)
[1] 26
> # find NMF decomposition
> pmf = prismaNMF(data, ncomp)
Error: 769.1271 
Error: 732.2996 
Error: 728.6931 
Error: 703.4451 
Error: 703.8828 
Error: 705.781 
Error: 688.984 
Error: 688.8207 
> 
> #compute and write clusters to a file
> clusters = calcDatacluster(pmf)
> write.table(clusters, clusters_file, row.names=FALSE, col.names=FALSE)
> 
Colouring 21 states:
14.UAC|18.UAS
START|8.UAC
START|13.UAC
START|12.UAC
None.UAS|None.UAS
None.UAS|None.UAC
None.UAC|None.UAS
START|4.UAC
START|5.UAC
26.UAS|26.UAC
START|None.UAC
START|26.UAC
8.UAC|26.UAS
None.UAC|None.UAC
START|1.UAC
START|19.UAC
START|22.UAC
26.UAC|26.UAS
26.UAS|8.UAC
START|2.UAC
START|24.UAC

Error in rep(1:N, sapply(ngrams[-total], length)) : invalid 'times' argument

I tried Pulsar on an FTP capture but was unable to get past this.

Calls: loadPrismaData -> readPrismaInput -> readFSally -> sparseMatrix
Execution halted
Error during clustering (not enough data?)
Cluster file not generated: /home/santhosh/pulsar/models/44304/44304.cluster
Exiting learning module...

I have tried different pcap files with unique IPs and ports for source and destination, but no states are generated.

The pcap file size is about 300M. Error in rep(1:N, sapply(ngrams[-total], length)) : invalid 'times' argument

I have tried many different network protocols, each as an individual pcap file, but the following errors occur. Can you help me? Thank you.

  1. os: ubuntu20.04
  2. Language: python3.6.9
  3. Execute command: python3 pulsar.py -l -p icmp3.pcap
  4. Packet size: ~300M
  5. Data packets are from the same original address to the same destination address
  6. The error is as follows:

data = loadPrismaData(capture_dir)
Reading data...
Splitting ngrams...
Calc indices...
Setup matrix...
Error in rep(1:N, sapply(ngrams[-total], length)) :
invalid 'times' argument
Calls: loadPrismaData -> readPrismaInput -> readFSally -> sparseMatrix
Execution halted

“Error during clustering (not enough data?)”

Hi, hgascon. When I ran python pulsar.py -l -p 11.pcap, I got the error "Error during clustering (not enough data?)". My output is as follows:
`~/pulsar$ python pulsar.py -l -p 11.pcap

             _
 _ __  _   _| |___  __ _ _ __
| '_ \| | | | / __|/ _` | '__|
| |_) | |_| | \__ \ (_| | |
| .__/ \__,_|_|___/\__,_|_|  v0.1-dev
|_|

Creating dir /home/wx/pulsar/models/11
Extracting DERRICK files from 11
Generating PRISMA input files from 11
Clustering data...
sh: 1: R: not found
Error during clustering (not enough data?)
Cluster file not generated: /home/wx/pulsar/models/11/11.cluster
Exiting learning module...
`
My pcap file only includes the packets corresponding to a connection between a client and a server at specific ports. I have also tried the pcap file from the link in #7 (comment), but I get the same error. I don't know which step I did wrong. Looking forward to your reply.

source("prisma.R") error

I am going to use Pulsar to learn a model from a pcap file. I have a test.pcap in the Pulsar directory, and when I run python pulsar.py -l -p test.pcap the following exception is raised:

# store the current directory
initial_dir<-getwd()

# load necessary libraries
library(PRISMA)
library(Matrix)

# change to prisma src dir and load scripts
setwd(prisma_dir)
source("prisma.R")
Error in file(filename, "r", encoding = encoding) :
cannot open the connection
Calls: source -> file
In addition: Warning message:
In file(filename, "r", encoding = encoding) :
cannot open file 'prisma.R': No such file or directory
Execution halted
Error during clustering (not enough data?)
Cluster file not generated: /usr/local/src/pulsar-master/models/test/test.cluster
Exiting learning module...

I find the following codes will cause error:
source("prisma.R")
source("dimensionEstimation.R")
source("matrixFactorization.R")

I just installed PRISMA from R, but I can't find the three ".R" files. How can I solve this problem? Thanks!

Add check for matrix dimensions > 0

To avoid such error, add a check for the size of the matrix that is given to PRISMA as input:

> data = loadPrismaData(capture_dir)
Reading data...
Splitting ngrams...
Calc indices...
Setup matrix...
to check: 2 
Error in intI(i, n = x@Dim[1], dn[[1]], give.dn = FALSE) : 
index larger than maximal 0
Calls: loadPrismaData ... callGeneric -> eval -> eval -> [ -> [ -> subCsp_rows -> intI
Execution halted

If the data matrix passed to sparse.cor has no features, i.e. it is a "0 x number of documents" matrix, the result is a meaningless row-index vector toCheck = [1 0], which leads to the error message when executing mat[toCheck, ]. To skip the feature correlation step, call data = loadPrismaData(capture_dir, skipFeatureCorrelation=TRUE), which should at least prevent the error.
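The guard suggested above boils down to refusing to index an empty matrix. A sketch of the idea in Python (the real check would live in PRISMA's R code, so this is only an illustration):

```python
import numpy as np

def safe_feature_correlation(mat):
    """Return the feature correlation matrix, or None when there are no
    feature rows (a "0 x documents" matrix), mirroring the proposed check."""
    if mat.shape[0] == 0:
        return None  # nothing to correlate; avoids an out-of-range row index
    return np.corrcoef(mat)

print(safe_feature_correlation(np.empty((0, 5))))  # None
```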

Native SSL/TLS Support

Hello,

Really interested in trying this tool out, but I was wondering: were there ever any plans to support protocols running inside SSL/TLS, or SSL/TLS itself, such as TLS 1.3?

loadPrismaData Error

I am going to use Pulsar to learn a model from a pcap file. I have a test.pcap in the Pulsar directory, and when I run python pulsar.py -l -p test.pcap the following exception is raised:

source("prisma.R")
source("dimensionEstimation.R")
source("matrixFactorization.R")
setwd(initial_dir)

# load the dataset
print(capture_dir)
[1] "/usr/local/src/pulsar/models/test/test"
data = loadPrismaData(capture_dir)
Reading data...
Splitting ngrams...
Calc indices...
Setup matrix...
to check: 2
Error in intI(i, n = x@Dim[1], dn[[1]], give.dn = FALSE) :
index larger than maximal 0
Calls: loadPrismaData ... callGeneric -> eval -> eval -> [ -> [ -> subCsp_rows -> intI
Execution halted
Traceback (most recent call last):
File "pulsar.py", line 97, in
mg.generate_model()
File "/usr/local/src/pulsar/pulsar/core/model.py", line 59, in generate_model
self._generate_model_pcaps()
File "/usr/local/src/pulsar/pulsar/core/model.py", line 78, in _generate_model_pcaps
self._build_model_files(pcap_noext)
File "/usr/local/src/pulsar/pulsar/core/model.py", line 185, in _build_model_files
urllib.unquote(self.whitespace), self.subrules)
File "/usr/local/src/pulsar/pulsar/core/model.py", line 227, in init
self.dh = data.DataHandler(datapath, ngram, whitespace)
File "/usr/local/src/pulsar/pulsar/core/data.py", line 19, in init
self._readClusterAssignments()
File "/usr/local/src/pulsar/pulsar/core/data.py", line 35, in _readClusterAssignments
self.N, skipFirstLine=False)
File "/usr/local/src/pulsar/pulsar/core/data.py", line 78, in _processData
f = file(fname, "r")
IOError: [Errno 2] No such file or directory: '/usr/local/src/pulsar/models/test/test.cluster'

I noticed the code crashes in curCor = sparse.cor(mat[toCheck, ]). Since you are also the author of PRISMA, I am wondering whether this problem is related to my pcap file or whether you need more information. Many thanks. It would also be great if you could share your test data.

Bug: KeyError: '1'

>>> Selecting next state in OFS mode...
State selected: ('None.UAS|None.UAC', 12712)
>>> Searching template with no rules
Probability-based selection from templates: ['9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '30', '31', '32', '33', '34']
fields_to_fuzz:  [ 0  4  6  7  9 10 11]
>>> FUZZING msg...
>>> SELECTED TEMPLATE: 19
>>> TRANSITION TO STATE: None.UAS|None.UAC
>>> STATUS: OK, TRANSITION: 24;38;19
socket: [Errno 32] Broken pipe in send operation
>>> RESETING MODEL...

[*] Connected to server...
>>> Selecting next state in OFS mode...
State selected: ('START|None.UAC', 4236)
>>> Searching template with no rules
Probability-based selection from templates: ['1']
Traceback (most recent call last):
  File "./pulsar.py", line 115, in <module>
    f.run()
  File "/home/roel/fuzzer/pulsar/pulsar/core/fuzzer.py", line 113, in run
    snd_message = self.lens.transitionSelf()
  File "/home/roel/fuzzer/pulsar/pulsar/core/lens.py", line 895, in transitionSelf
    return self.transition(self.role, True)
  File "/home/roel/fuzzer/pulsar/pulsar/core/lens.py", line 991, in transition
    (state, template, msg, fields, transition) = self._transition_to_state(next_state)
  File "/home/roel/fuzzer/pulsar/pulsar/core/lens.py", line 1037, in _transition_to_state
    self.fuzzer)
  File "/home/roel/fuzzer/pulsar/pulsar/core/lens.py", line 656, in create_fuzzed_message
    fuzz_fields = fuzzer.get_fuzz_fields(next_template.ID)
  File "/home/roel/fuzzer/pulsar/pulsar/core/fuzzer.py", line 337, in get_fuzz_fields
    fuzz_mask_int = self.tracker[template_id]
KeyError: '1'
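The crash happens because self.tracker has no entry for template '1'. A defensive lookup with a default mask would avoid the KeyError; a hedged sketch (the tracker contents are invented, and whether 0 is a sensible default depends on the fuzzer's mask semantics):

```python
# Hypothetical per-template fuzz-mask table, keyed by template id as a
# string (as in the traceback). Template '1' was never registered.
tracker = {"19": 0b1011, "24": 0b0110}

template_id = "1"

# dict.get with a default avoids the KeyError seen in get_fuzz_fields()
fuzz_mask_int = tracker.get(template_id, 0)
print(fuzz_mask_int)  # 0
```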

Error in 1:ncol(data) : argument of length 0

I am going to use Pulsar to learn a model from a pcap file. I have a test2.pcap in the Pulsar directory, and when I run python pulsar.py -l -p test2.pcap the following exception is raised:

source("prisma.R")
source("dimensionEstimation.R")
source("matrixFactorization.R")
setwd(initial_dir)

# load the dataset
data = loadPrismaData(capture_dir)
data = loadPrismaData(capture_dir, skipFeatureCorrelation=TRUE)
Reading data...
Splitting ngrams...
Calc indices...
Setup matrix...
Error in 1:ncol(data) : argument of length 0
Calls: loadPrismaData ... preprocessPrismaData -> duplicateRemover -> sapply -> lapply
Execution halted
Error during clustering (not enough data?)
Cluster file not generated: /home/mjy/pulsar/models/test2/test2.cluster
Exiting learning module...

And the test pcap are here.

Bug in filter.py?

I was testing Pulsar with PCAP files containing DNS traffic and discovered that the resulting Markov model was incorrect: it only contained "START|X.UAC" states and no "X.UAS" states, even though DNS responses were also captured. In the .harry file there were also only UAC entries, which are mapped to the actual DNS responses:

dialogId        msgNumber       origin  type
0       0       UAC     1%ea%81%83%00%01 ...
1       0       UAC     %f5%0d%81%83%00%01 ...
2       0       UAC     %0c%a9%81%80%00%01 ...
3       0       UAC     %9d%e6%81%80%00%01 ...
4       0       UAC     %a5k%81%80%00%01 ...
5       0       UAC     p%06%81%80%00%01%00 ...
6       0       UAC     %edM%81%80%00%01%00%00 .. 
7       0       UAC     K%bb%81%80%00%01%00%02% ...
8       0       UAC     %db(%81%83%00%01%00%00%00 ...

However, after changing

for (k, msg) in src_dst_dic.iteritems():
    if response[k] == 0:
        mergedMessages.append(msg)
mergedMessages.sort(key=lambda m: m.ntime)

in singlePassCheck() in filter.py to

for (k, msg) in src_dst_dic.iteritems():
    mergedMessages.append(msg)
mergedMessages.sort(key=lambda m: m.ntime)

it seems to work: the .harry file also contains the UAS entries and the corresponding packets are mapped correctly. However, I don't know if this change breaks something, e.g. when using TCP. Is there some configuration option that needs to be set when using UDP?

.cluster file needs to be modified to relate messages to cluster numbers

I am trying to understand the implementation of PRISMA and Pulsar for my research. The clusterAssignments handling in pulsar.core.data.DataHandler assumes that the .cluster file relates every message (line) to a cluster number, but in fact they are not yet related. As a result, the itunes-xbmc example does not seem to create a model very well.

For instance, the data format of .cluster file which is expected by pulsar.core.data.DataHandler is :

Cluster number belonging to message 1(line 1)
Cluster number belonging to message 2(line 2)
Cluster number belonging to message 3(line 3)
…
Cluster number belonging to message n(line n) (=last message)

But currently, the .cluster file data format is:

Cluster number belonging to message ?(line ?) (=1st unique message's cluster number)
Cluster number belonging to message ?(line ?) (=2nd unique message's cluster number)
Cluster number belonging to message ?(line ?) (=3rd unique message's cluster number)
….
Cluster number belonging to message ?(line ?) (=last unique message's cluster number)

As a test, I printed the contents of the clusters variable in pulsar/core/cluster_generator.R (the contents of clusters are written to the .cluster file). The result is as follows:

> #capture_dir = “models/itunes-xbmc/itunes-xbmc”
…
> clusters = calcDatacluster(pmf)
> clusters
  line2  line98 line167 line563 line787 line273 line487 line451 line173 line181 
      1       6      10       5       4       3       1       4      10      10 
line569 line577 line793 line801 line493 line501 line457 line471 line465 line177 
      5       5       7       7       1       7       7       7       7      10 
line171 line169 line573 line567 line565 line797 line791 line497 line461 line277 
     10      10       5       5       5       7       2       1       5       3 
line491 line455 line789 line275 line185 line583 line805 line507 line469  line21 
      1       2       4       3      10       5       2       1       2       9 
line677 line291 line489 line453   line1 line165 line785 line271 line449 line485 
      2       2       1       4       4       4       4       4       4       4 
line561 line183 line289 line467 line503 line579  line97   line3 line671 line653 
      4       4       4       4       4       4       4       4       4       4 
  line5  line11  line19  line15   line9   line7  line17 line179 line575 line799 
      9       9       9       9       9       9       9      10       5       7 
line499 line463 line285 line667 line279 line287 line283 line655 line661 line669 
      1       5       3       8       3       3       3       8       8       8 
line777 line665 line659 line657  line10   line6   line8 
      8       8       8       8       6       6       6 

With the current .cluster file data format, many messages are not related to a cluster number. To fix this issue, the cluster numbers of the unique messages need to be mapped to all messages. The unique messages can be obtained from the uniqueClasses variable of prisma.R's duplicateRemover function, and names(data$remapper) in pulsar/core/cluster_generator.R relates uniqueClasses (data$remapper relates all messages).

My proposed correction procedure is as follows (please see the PR sent later for more information → #21):

  1. Use names(data$remapper) and uniqueClasses to map unique messages to all messages(It's called lines).
  2. Use uniqueClasses and clusters to map unique messages cluster number to all messages cluster number(It's called lineClusters).
  3. Write lineClusters to .cluster file.
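The three steps above amount to a chain of dictionary lookups. A minimal Python sketch of the idea (all names and values here are illustrative, not Pulsar's actual data structures):

```python
# Illustrative remapping: duplicate messages collapse onto a representative
# unique message, and only the uniques carry a cluster label. Propagating
# those labels back yields one cluster number per original line.
unique_clusters = {"line2": 1, "line98": 6, "line167": 10}  # per-unique labels

# line -> representative unique message (duplicates collapse onto one key)
line_to_unique = {"line2": "line2", "line4": "line2",
                  "line98": "line98", "line167": "line167"}

# steps 1 + 2: map every line to its unique message's cluster number
line_clusters = {line: unique_clusters[uniq]
                 for line, uniq in line_to_unique.items()}

# step 3: emit one cluster number per line, in line order
for line in sorted(line_clusters, key=lambda l: int(l[4:])):
    print(line, line_clusters[line])
```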

Thanks for taking your time reading this.

I am not a native speaker so some of my expression might not be accurate. Sorry for this inconvenience.

Problem when running pulsar in fuzzing mode

Pulsar looks like a really useful tool and I'm very keen to get it up and running as I haven't found anything else that is such a good match for what we require. I can run it in training mode and it runs as expected, but when I run it in fuzzing mode it terminates with an error.

I have verified the issue with two different pcap files: one downloaded from https://download.netresec.com/pcap/4sics-2015/4SICS-GeekLounge-151022.pcap and another that I captured from local traffic. The error is (with some added debug print statements, which offset some of the lines in the stack trace from the actual line numbers in the unaltered code):

            _
 _ __  _   _| |___  __ _ _ __
| '_ \| | | | / __|/ _` | '__|
| |_) | |_| | \__ \ (_| | |
| .__/ \__,_|_|___/\__,_|_|  v0.1-dev
|_|

RESETING MODEL...
host=192.168.56.101 port=9999

[*] Connected to server...

Selecting next state in OFS mode...
State selected: ('START|None.UAC', 8180)
Probability-based selection from templates: ['353', '354', '355', '356', '357', '358', '362', '364', '365', '366', '367', '368', '369', '370', '371', '372', '373', '374', '376', '378', '379', '381', '382', '384']

fields_len=3 # Documents/Fuzzers/pulsar/pulsar/core/fuzzer.py:338

fuzz_mask_int=7 # Documents/Fuzzers/pulsar/pulsar/core/fuzzer.py:339

num=[-8] # /usr/lib/python2.7/dist-packages/numpy/core/numeric.py:2259

Traceback (most recent call last):
File "./pulsar.py", line 115, in
f.run()
File "Documents/Fuzzers/pulsar/pulsar/core/fuzzer.py", line 113, in run
snd_message = self.lens.transitionSelf()
File "Documents/Fuzzers/pulsar/pulsar/core/lens.py", line 895, in transitionSelf
return self.transition(self.role, True)
File "Documents/Fuzzers/pulsar/pulsar/core/lens.py", line 991, in transition
(state, template, msg, fields, transition) = self._transition_to_state(next_state)
File "Documents/Fuzzers/pulsar/pulsar/core/lens.py", line 1037, in _transition_to_state
self.fuzzer)
File "Documents/Fuzzers/pulsar/pulsar/core/lens.py", line 656, in create_fuzzed_message
fuzz_fields = fuzzer.get_fuzz_fields(next_template.ID)
File "Documents/Fuzzers/pulsar/pulsar/core/fuzzer.py", line 341, in get_fuzz_fields
fields_len)
File "/usr/lib/python2.7/dist-packages/numpy/core/numeric.py", line 2260, in binary_repr
poswidth = len(bin(-num)[2:])
TypeError: only integer scalar arrays can be converted to a scalar index

I don't think it would be a fruitful exercise for me to try and debug it, so I was wondering if anyone else had come across this particular error.
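A plausible culprit, judging from the debug output above (num=[-8] looks like a one-element numpy array), is that np.binary_repr receives an array instead of a Python int; converting to a scalar first would avoid the TypeError. This is a hedged sketch of that idea, not the actual Pulsar fix:

```python
import numpy as np

# The debug output suggests the fuzz mask arrives as a 1-element array.
fuzz_mask = np.array([7])

# np.binary_repr expects a plain scalar; passing an array can raise
# "only integer scalar arrays can be converted to a scalar index".
mask_int = int(fuzz_mask.item())  # extract a Python int first
bits = np.binary_repr(mask_int, width=3)
print(bits)  # '111'
```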

I am running it on Debian 4.13.4-2kali1 (kali-rolling 2017.2)

Thanks,
Martin.

Err: No active connection to close

I tried to simulate a connection based on a model I generated earlier and got the following error:
Err: set the Simulator role to 'client' or 'server' in /home/emi/pulsar/test
I went to fuzzer.py in the conf directory and changed the role from client to server and vice versa, but I am still getting the same error.
When I try to use the same model to start a fuzzing session I get:
socket.error: [Errno 110] Connection timed out.
Can you please help me?

Unable to run learning mode

Dear Hugo

I am trying to run your app in learning mode for fuzzing, but I am facing an issue. Here is the output of the application run:

server:~/pulsar# ./pulsar.py -l -p tcap_fuzzing.pcap

             _
 _ __  _   _| |___  __ _ _ __
| '_ \| | | | / __|/ _` | '__|
| |_) | |_| | \__ \ (_| | |
| .__/ \__,_|_|___/\__,_|_|  v0.1-dev
|_|

Extracting DERRICK files from tcap_fuzzing
Generating PRISMA input files from tcap_fuzzing
Clustering data...

R version 3.0.2 (2013-09-25) -- "Frisbee Sailing"
Copyright (C) 2013 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

# reading arguments
cmd_args<- commandArgs(TRUE)
prisma_dir<-cmd_args[1]
capture_dir<-cmd_args[2]
clusters_file<-cmd_args[3]
nmf_ncomp<-cmd_args[4]
print(cmd_args)
[1] "modules/PRISMA/R"
[2] "/root/pulsar/models/tcap_fuzzing/tcap_fuzzing"
[3] "/root/pulsar/models/tcap_fuzzing/tcap_fuzzing.cluster"
[4] "0"

# store the current directory
initial_dir<-getwd()

# load necessary libraries
library(PRISMA)
library(Matrix)

# change to prisma src dir and load scripts
setwd(prisma_dir)
Error in setwd(prisma_dir) : cannot change working directory
Execution halted
Error during clustering (not enough data?)
Cluster file not generated: /root/pulsar/models/tcap_fuzzing/tcap_fuzzing.cluster
Exiting learning module...
server:~/pulsar#

TypeError: Expected str, got bytes

Traceback (most recent call last):
File "pulsar.py", line 98, in
mg.generate_model()
File "/home/x/pulsar/pulsar/core/model.py", line 59, in generate_model
self._generate_model_pcaps()
File "/home/x/pulsar/pulsar/core/model.py", line 76, in _generate_model_pcaps
self._generate_prisma_input(pcap_noext)
File "/home/x/pulsar/pulsar/core/model.py", line 159, in _generate_prisma_input
h.generate_prisma_input(drk_file)
File "/home/x/pulsar/pulsar/core/harry.py", line 88, in generate_prisma_input
doSingleWrite(filteredMessages, base)
File "/home/x/pulsar/pulsar/core/harry.py", line 67, in doSingleWrite
sallyInputFile = sally.rawWrite(fMessages, theBase, self.ngram)
File "/home/x/pulsar/pulsar/core/sally.py", line 36, in rawWrite
return rawWriteText(messages, path)
File "/home/x/pulsar/pulsar/core/sally.py", line 63, in rawWriteText
raw = urllib.parse.unquote(m.msg)
File "/usr/lib/python3.8/urllib/parse.py", line 643, in unquote
raise TypeError('Expected str, got bytes')
TypeError: Expected str, got bytes
ntp.zip

here is the pcap file
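For reference, the traceback shows urllib.parse.unquote() receiving bytes, which Python 3 rejects. A minimal sketch of the two usual remedies (the message content here is invented):

```python
from urllib.parse import unquote, unquote_to_bytes

msg = b"OPTIONS%20%2A%20RTSP/1.0"  # illustrative percent-encoded payload

# Remedy 1: decode to str before unquoting (latin-1 maps all bytes 1:1)
text = unquote(msg.decode("latin-1"))
print(text)  # OPTIONS * RTSP/1.0

# Remedy 2: stay in bytes for binary protocol data
raw = unquote_to_bytes(msg)
print(raw)   # b'OPTIONS * RTSP/1.0'
```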

TimeoutError: [Errno 110] Connection timed out

May I ask how the example case (itunes-xbmc) you provided passes the fuzz test?

When I type the command pulsar.py -z -m /home/csuerlk/Pulsar/pulsar/models/itunes-xbmc, I encounter the following problem:

Traceback (most recent call last):
File "pulsar.py", line 115, in
f.run()
File "/home/csuerlk/Pulsar/pulsar/pulsar/core/fuzzer.py", line 111, in run
self.timeout, self.bsize)
File "/home/csuerlk/Pulsar/pulsar/pulsar/common/networking.py", line 11, in init
self.connection.connect((host, port))
TimeoutError: [Errno 110] Connection timed out

itunes-xbmc fuzzing example

I'm trying to use the itunes-xbmc model provided in the models/ folder as an example for fuzzing. I've created configuration files for the server and client and when I launch a fuzzing session, I see the following from the server side:

[ ] Waiting for client connection...
[*] Connected to client...
>>> RESETING MODEL...
>>> RECEIVING message... 
>>> RECEIVED message:
################################################################################
OPTIONS%20%2A%20RTSP/1.0%0D%0ACSeq%3A%201%0D%0AUser-Agent%3A%20iTunes/11.4%20%28Macintosh%3B%20OS%20X%2010.9.5%29%0D%0AClient-Instance%3A%209516327BC34A8004%0D%0ADACP-ID%3A%209516327BC34A8004%0D%0AActive-Remote%3A%202924970602%0D%0AApple-Challenge%3A%20Fp%2BwsheUfktDySHpateTWA%0D%0A%0D%0A
################################################################################
>>> Consuming RECEIVED msg of length 211
>>> EXACT MATCHED TEMPLATE: 25
>>> TRANSITION TO STATE: START|1.UAC
>>> STATUS: OK
[1502218306, 1, 'OK', 'START|1.UAC', '25', [], []]
>>> Selecting next MOST probable state from: [('1.UAC|2.UAC', 1)]
>>> STATUS: NO TRANSITION, TRANSITION: None
>>> RECEIVING message... 

And the following from the client side:

>>> RESETING MODEL...

[*] Connected to server...
Selecting next MOST probable state from: [('START|1.UAC', 1)]
Probability-based selection from templates: ['25']
>>> SELECTED TEMPLATE: 25
>>> TRANSITION TO STATE: START|1.UAC

>>> SENDING msg:
OPTIONS * RTSP/1.0
CSeq: 1
User-Agent: iTunes/11.4 (Macintosh; OS X 10.9.5)
Client-Instance: 9516327BC34A8004
DACP-ID: 9516327BC34A8004
Active-Remote: 2924970602
Apple-Challenge: Fp+wsheUfktDySHpateTWA

The server-side is started with the following command:
python pulsar.py -c pulsar/conf/server_fuzz -z -m ./models/itunes-xbmc

The client-side is started with the following command:
python pulsar.py -c pulsar/conf/client_fuzz -z -m ./models/itunes-xbmc

At this point both sides hang: the server is waiting for a message, but the client never sends one. Is there something I'm doing wrong? Thanks!

test error?

I set up Core FTP LE on Windows and a VSFTPD service under Linux; the two can communicate. I then captured the FTP packets, stored the capture in the pulsar directory as ftp.pcap, and tried to generate a model automatically, but the following error appears.
image

Tips for installation

Since the readme doesn't explain how to install the dependencies and the requirements.txt isn't compatible with pip, I wonder if there are any instructions on how to get this working?

Greetings
gt

Question; how to capture using mysqldump or derrick

I have tried various combinations of pcap dumps with mysqldump and derrick. None of the files makes it through correctly; at best I am getting "Error during clustering (not enough data?)" and "to check: 2".

Setup:

1 server running at IP 127.0.0.1 port 30000
1 client connecting to IP 127.0.0.1 port 30000

Things tried:

sudo tcpdump -i any -s 0 -n -w out.pcap
sudo tcpdump -i lo -s 0 src host 127.0.0.1 -w out.pcap
sudo /usr/local/bin/derrick -i lo -l out.pcap -b 999999999 -m -t 999999999
sudo dumpcap -i lo -w /tmp/out.pcap

Would you be able to give some command line examples of how to correctly use tcpdump, derrick, or wireshark in this setup, and clarify how much data should be captured? Also, is capturing both server and client traffic in the same pcap file fine?

Thank you!

[Question] Hidden Markov Models

Hi everyone,

I have a quick question regarding the models used in the code.

The Readme states:

The tool allows to model a protocol through machine learning techniques, such as clustering and hidden Markov models.

In the code, I found Markov chains but no trace of hidden Markov models. Is the statement in the Readme to be understood more generally (as in "you could use the code to generate hidden Markov models"), or are hidden Markov models actually generated by Pulsar?

Thanks in advance!
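For what it's worth, the session models visible in the fuzzing logs (transitions such as START|1.UAC) behave like an ordinary, fully observable Markov chain: states are seen directly and transition probabilities are estimated from counts. A minimal sketch of that estimation, as an illustration rather than actual Pulsar code (markov_chain is a hypothetical helper):

```python
from collections import Counter, defaultdict

def markov_chain(sequences):
    """Estimate transition probabilities of a visible Markov chain
    from observed state sequences.

    Every state is directly observed, which is what the session
    models in the logs look like; a hidden Markov model would add an
    unobserved state layer with per-state emission distributions.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        for src, dst in zip(seq, seq[1:]):
            counts[src][dst] += 1
    return {
        src: {dst: n / sum(c.values()) for dst, n in c.items()}
        for src, c in counts.items()
    }

# e.g. two observed sessions over template states:
chains = markov_chain([["START", "1.UAC", "2.UAC"],
                       ["START", "1.UAC", "1.UAC"]])
```

Here chains["1.UAC"] splits probability evenly between 2.UAC and 1.UAC, mirroring the "MOST probable state" selection seen in the logs.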
