octopus-platform / joern-tools

Python utilities for joern

License: GNU General Public License v3.0

Python 95.91% Groovy 4.09%

joern-tools's Issues

Missing files

I would like to use joern-tools and am currently learning how it works. When I run joern-tools/examples/subtreeEmbed.sh, several Python files are reported as missing:

./subtreeEmbed.sh: line 1: ./lookup.py: No such file or directory
./subtreeEmbed.sh: line 1: ./getAst.py: No such file or directory
./subtreeEmbed.sh: line 1: ./astlabel.py: No such file or directory
./subtreeEmbed.sh: line 1: ./ast2features.py: No such file or directory
./subtreeEmbed.sh: line 1: ./demux.py: No such file or directory
./subtreeEmbed.sh: line 3: sally: command not found

I found files with similar names in the /joern-tools/build/scripts-2.7 folder. Are the files in this folder the same as the missing ones?
Can you please let me know where I can find these files, especially getAst.py?
Thank you!

EmbeddingLoader.py dependency

When finding the nearest neighbor in the VLC tutorial, I execute the following example:

joern-list-funcs -p VLCEyeTVPluginInitialize | awk -F "\t" '{print $2}' | joern-knn

An ImportError is raised, stating that there is no module named "sklearn.datasets":

    "from sklearn.datasets import load_svmlight_file"

I thought the only dependency for joern-tools was pygraphviz.

After installing scikit-learn v0.15.1 to provide the module "sklearn.datasets", I get an error at _svmlight_format.c line 2505: "ValueError: Feature indices in SVMlight/LibSVM data file should be sorted and unique."

Which version of scikit-learn is being used for sklearn.datasets?

joern-apiembedder and joern-knn

When finding the nearest neighbor in the VLC tutorial, I execute the following example:

joern-list-funcs -p VLCEyeTVPluginInitialize | awk -F "\t" '{print $2}' | joern-knn

I get an error at _svmlight_format.c line 2505: "ValueError: Feature indices in SVMlight/LibSVM data file should be sorted and unique."

I've identified that when the SVM file is created by joern-apiembedder, the key-value pairs are in descending order by key; therefore, when the keys are parsed, the current index will be less than the previous index. Thus, the error is triggered.

For example, a line from the SVM file is:

3382 1:1.00000 0:1.00000 #3382

The current key, 0, is less than the previous key, 1, so the exception is thrown.

I get this error regardless of the source code repository I'm analyzing.

Thoughts on how to correct this, so that I can properly use joern-embedder and joern-knn together?
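
One possible workaround, sketched here in Python rather than taken from joern-tools itself, is to post-process the SVM file so that the key-value pairs on each line appear in ascending key order before joern-knn reads it. The function name sort_svmlight_line is hypothetical, and the sketch assumes each line has the shape shown above: a label, space-separated index:value pairs, and an optional trailing "#" comment. It does not remove duplicate indices.

```python
def sort_svmlight_line(line):
    """Return the line with its index:value pairs sorted by ascending index.

    Hypothetical helper; assumes 'label idx:val idx:val ... #comment'.
    """
    # Split off the optional trailing comment (everything after the first '#').
    body, sep, comment = line.rstrip("\n").partition("#")
    tokens = body.split()
    if not tokens:
        return line.rstrip("\n")
    label, features = tokens[0], tokens[1:]
    # Sort by the integer feature index, keeping the value text untouched.
    features.sort(key=lambda pair: int(pair.split(":", 1)[0]))
    result = " ".join([label] + features)
    if sep:
        result += " #" + comment
    return result


if __name__ == "__main__":
    print(sort_svmlight_line("3382 1:1.00000 0:1.00000 #3382"))
```

Applied to every line of the file, this would turn the example line into one whose keys ascend from 0 to 1. Whether the ordering should instead be fixed where joern-apiembedder writes the file is a question for the maintainers.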

Support plotting of call graphs

It would be nice if joern-plot-proggraph could also plot call graphs to get a quick overview of how functions are linked to one another by calls.

No module named gremlin

When I execute joern-lookup, I get the following error:

Traceback (most recent call last):
  File "/usr/local/bin/joern-lookup", line 4, in <module>
    __import__('pkg_resources').run_script('joerntools==0.1', 'joern-lookup')
  File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 719, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 1504, in run_script
    exec(code, namespace, namespace)
  File "/usr/local/lib/python2.7/dist-packages/joerntools-0.1-py2.7.egg/EGG-INFO/scripts/joern-lookup", line 3, in <module>
    from joerntools.shelltool.LookupTool import LookupTool
  File "/usr/local/lib/python2.7/dist-packages/joerntools-0.1-py2.7.egg/joerntools/shelltool/LookupTool.py", line 4, in <module>
    from TraversalTool import TraversalTool
  File "/usr/local/lib/python2.7/dist-packages/joerntools-0.1-py2.7.egg/joerntools/shelltool/TraversalTool.py", line 3, in <module>
    from joerntools.shelltool.JoernTool import JoernTool
  File "/usr/local/lib/python2.7/dist-packages/joerntools-0.1-py2.7.egg/joerntools/shelltool/JoernTool.py", line 3, in <module>
    from joerntools.DBInterface import DBInterface
  File "/usr/local/lib/python2.7/dist-packages/joerntools-0.1-py2.7.egg/joerntools/DBInterface.py", line 2, in <module>
    from joern.all import JoernSteps
  File "/usr/local/lib/python2.7/dist-packages/joern-0.1-py2.7.egg/joern/all.py", line 2, in <module>
    from py2neo.ext.gremlin import Gremlin
ImportError: No module named gremlin

How can I solve this problem?
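
As a diagnostic aid, a small script can report which of the modules named in these issues fail to import in the current environment. This is only a sketch: missing_modules is a hypothetical helper, and the module list is taken from the error messages quoted on this page, not from an official joern-tools requirements file.

```python
import importlib


def missing_modules(names):
    """Return the subset of module names that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Also covers the case where a parent package is absent.
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Module names as they appear in the issues above (assumed list).
    candidates = ["pygraphviz", "sklearn.datasets", "py2neo.ext.gremlin"]
    for name in missing_modules(candidates):
        print("cannot import: " + name)
```

If py2neo.ext.gremlin is reported as missing while py2neo itself imports, the installed py2neo release probably does not ship that extension, which would be consistent with the traceback above.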

Multiline queries support

For example, a query on a single line works fine:

echo "fun1().fun2()" | joern-lookup -g

The same query split across two lines fails on an unexpected query:

echo "fun1()
.fun2()" | joern-lookup -g
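
Until multi-line queries are supported, one workaround is to collapse the query onto a single line before piping it to joern-lookup. The sketch below assumes that joern-lookup treats each input line as a separate query, which is inferred from the failure above and not confirmed; collapse_query is a hypothetical helper. Joining without a separator suits chained calls such as the example, but it could change the meaning of a query in which a line break separates two tokens.

```python
def collapse_query(query):
    """Join a multi-line query into one line, trimming each line."""
    return "".join(part.strip() for part in query.splitlines())


if __name__ == "__main__":
    import sys

    # Read the whole query from stdin and emit it as a single line.
    print(collapse_query(sys.stdin.read()))
```

Saved under a hypothetical name such as collapse.py, the script would sit between echo and joern-lookup in the pipeline.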

The scalability advantage of SallyBasedEmbedder

Here is the time cost of the GlobalAPIEmbedding operation in chucky-ng.
By changing its embedder, I got two different results for the time cost.

EmbedderType                                   Libpng-1.2.44  Libtiff-3.9.3  Pidgin-2.7.3  Firefox-4.0(/js)  Linux-2.6.34.13(/fs)
APIEmbedder using SallyBasedEmbedder           20s            39s            1m25s         11m50s            11m21s
SimplifiedAPIEmbedder using PythonAPIEmbedder  11s            32s            14m30s        93m48s            114m29s

This means the current PythonAPIEmbedder.py or SimplifiedAPIEmbedder.py still has a scalability problem when processing large code bases. SallyBasedEmbedder is by far the best choice for large code bases.
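
To make the gap concrete, the timings can be converted to seconds and compared as ratios. The following Python 3 sketch uses only the figures from the table; parse_duration, slowdown and TIMINGS are illustrative names, not part of joern-tools or chucky-ng.

```python
import re


def parse_duration(text):
    """Convert a duration such as '11m50s' or '20s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(?:(\d+)s)?", text)
    if not text or not match:
        raise ValueError("unrecognised duration: %r" % text)
    minutes, seconds = match.groups()
    return int(minutes or 0) * 60 + int(seconds or 0)


# (SallyBasedEmbedder, PythonAPIEmbedder) timings copied from the table.
TIMINGS = {
    "Libpng-1.2.44": ("20s", "11s"),
    "Libtiff-3.9.3": ("39s", "32s"),
    "Pidgin-2.7.3": ("1m25s", "14m30s"),
    "Firefox-4.0(/js)": ("11m50s", "93m48s"),
    "Linux-2.6.34.13(/fs)": ("11m21s", "114m29s"),
}


def slowdown(project):
    """Ratio of PythonAPIEmbedder time to SallyBasedEmbedder time."""
    sally, python = TIMINGS[project]
    return parse_duration(python) / parse_duration(sally)


if __name__ == "__main__":
    for project in TIMINGS:
        print("%-22s %.2fx" % (project, slowdown(project)))
```

A ratio below 1 means PythonAPIEmbedder was faster, which holds for the two small libraries. For Pidgin, Firefox and Linux the ratio is well above 1, matching the conclusion above.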