This repository contains the Explainability Tool developed by Pluribus One in the context of the AssureMOSS project.
Our tool detects security-relevant GitHub commits (Java code only), e.g., code changes that fix a vulnerability, and shows how individual source code tokens influenced the decision. A video demonstration of the tool can be found here.
Our tool is based on the JavaBERT-uncased model (link to the related publication on arXiv), which we fine-tuned on the commit classification task. Explanations are obtained by applying the Layer Integrated Gradients method.
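To give an intuition for the attribution method, here is a minimal sketch of plain Integrated Gradients on a toy function. It is not the tool's implementation: the tool applies the layer variant to the fine-tuned model, whereas `toy_model`, its weights, the all-zero baseline, and the finite-difference gradients are assumptions made only for illustration.

```python
import math


def toy_model(x):
    """Hypothetical stand-in for a classifier: sigmoid of a weighted sum."""
    weights = [2.0, -1.0, 0.5]
    z = sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def numerical_grad(f, x, eps=1e-6):
    """Central finite-difference gradient of f at x."""
    grads = []
    for i in range(len(x)):
        hi = list(x)
        lo = list(x)
        hi[i] += eps
        lo[i] -= eps
        grads.append((f(hi) - f(lo)) / (2 * eps))
    return grads


def integrated_gradients(f, x, baseline, steps=200):
    """Approximate Integrated Gradients with a midpoint Riemann sum."""
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        # Point on the straight path from the baseline to the input.
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grads = numerical_grad(f, point)
        total = [t + g for t, g in zip(total, grads)]
    # Scale the averaged gradient by the input-baseline difference.
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, total)]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
attributions = integrated_gradients(toy_model, x, baseline)
```

The attributions sum (approximately) to the difference between the model output at the input and at the baseline, and their signs indicate whether a feature pushes the score up or down.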
The Explainability Tool app is implemented with FastAPI and can be launched in several ways. For instance, you can use uvicorn:
uvicorn main:app --host 0.0.0.0 --port 8000
Before launching the tool, you can choose whether to run the model on CPU or GPU by changing the DEVICE parameter in main.py.
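As a rough sketch, the value of DEVICE could be chosen automatically instead of being edited by hand. The helper `pick_device` below is hypothetical and not part of the repository, and the strings "cpu" and "cuda" are assumed to be the values main.py expects (they are the usual PyTorch device names).

```python
def pick_device(prefer_gpu=True):
    """Return "cuda" when a GPU is requested and usable, otherwise "cpu".

    Hypothetical helper: falls back to "cpu" if PyTorch is not installed.
    """
    try:
        import torch
    except ImportError:
        return "cpu"
    if prefer_gpu and torch.cuda.is_available():
        return "cuda"
    return "cpu"


# Assumed usage: assign the result to the DEVICE parameter in main.py.
DEVICE = pick_device()
```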
The first launch might be slower because the model will be downloaded.
To analyze a commit, paste its GitHub commit URL. If the commit belongs to a large repository, you might experience a slight delay on its first analysis, as the entire repository needs to be downloaded.
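For reference, a GitHub commit URL has the form https://github.com/OWNER/REPO/commit/SHA. The sketch below shows one way such a URL could be validated and split into its parts before any download starts. The names `CommitRef` and `parse_commit_url` are hypothetical, and this is not necessarily how the tool parses its input.

```python
import re
from typing import NamedTuple


class CommitRef(NamedTuple):
    owner: str
    repo: str
    sha: str


# Accepts full or abbreviated hexadecimal commit hashes (7 to 40 characters).
_COMMIT_URL = re.compile(
    r"^https://github\.com/(?P<owner>[^/]+)/(?P<repo>[^/]+)"
    r"/commit/(?P<sha>[0-9a-fA-F]{7,40})/?$"
)


def parse_commit_url(url: str) -> CommitRef:
    """Split a GitHub commit URL into owner, repository, and commit hash."""
    match = _COMMIT_URL.match(url.strip())
    if match is None:
        raise ValueError(f"not a GitHub commit URL: {url!r}")
    return CommitRef(match["owner"], match["repo"], match["sha"].lower())
```

Rejecting malformed input early avoids cloning a repository only to find out that the commit cannot be located.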
In the visualizations, a commit diff is shown as follows:
changed lines start with either a + or a − symbol, depending on whether they belong to the new version of the modified methods or the previous one.
If a line is present in both versions, an additional line starting with ? helps to identify added or removed characters (again, with + or − symbols, respectively).
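The layout described above resembles the output of `difflib.ndiff` from the Python standard library. Whether the tool actually relies on it is an assumption. The sketch below only illustrates the format, using a made-up one-line Java change and a hypothetical helper `ndiff_lines`. Note that `difflib` emits an ASCII hyphen for removed lines.

```python
import difflib


def ndiff_lines(old, new):
    """Return an ndiff-style diff of two lists of lines, without trailing newlines."""
    return [line.rstrip("\n") for line in difflib.ndiff(old, new)]


# Made-up example: a single character is added to an existing line.
old_version = ["int limit = 10;"]
new_version = ["int limit = 100;"]

diff = ndiff_lines(old_version, new_version)
for line in diff:
    print(line)
```

The first two printed lines are the old and new versions of the statement, and the `?` line that follows marks the position of the inserted character with a `+`.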
On each diff line, tokens that influence the classifier’s decision towards the positive
class (i.e., security-relevant) are highlighted in green, whereas tokens that push toward the other class
are highlighted in red.
Neutral tokens are not highlighted.
The colour intensity reflects the weight that each token carries in the decision.
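The highlighting rule can be sketched as a mapping from per-token attribution scores to a colour and an intensity. The function `token_highlights` is hypothetical, and scaling by the largest absolute score is an assumed normalisation. The tool may compute intensity differently.

```python
def token_highlights(tokens, scores):
    """Map each token to (token, colour, intensity).

    Positive scores become green, negative scores become red, and zero
    scores are left unhighlighted. Intensity lies in [0, 1] and is
    relative to the largest absolute score (an assumed normalisation).
    """
    peak = max((abs(s) for s in scores), default=0.0)
    highlights = []
    for token, score in zip(tokens, scores):
        if peak == 0.0 or score == 0.0:
            highlights.append((token, None, 0.0))
        else:
            colour = "green" if score > 0 else "red"
            highlights.append((token, colour, abs(score) / peak))
    return highlights


# Made-up tokens and scores, for illustration only.
example = token_highlights(["escape", "(", "input", ")"], [0.8, 0.0, -0.4, 0.0])
```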