yunlongs / goshawk


Goshawk is a static analysis tool that detects memory corruption bugs in C source code. It uses NLP to infer custom memory management (MM) functions, uses data flow analysis to abstract their behaviors, and then applies these summaries to enhance bug detection.

Home Page: https://goshawk.code-analysis.org/

Python 9.30% CMake 0.09% C++ 90.52% Makefile 0.01% C 0.03% Dockerfile 0.06%

goshawk's Introduction

News

  • Docker build support!
  • Add bug list
  • Goshawk now supports Clang-15.0.0

Found bug list

To see the bugs found by Goshawk, along with some details about them, visit the bug_list page.

Code Structure

Directories

  • data_process: The scripts for pre-processing, parsing and normalizing the function prototypes.
  • model: The pre-trained Siamese network, which can be used directly to classify functions.
  • plugins: Clang and CSA plugins used by Goshawk.
  • plugins_src: The source code of the Clang plugins.
  • subword_dataset: The learned subword vocabulary and embeddings for function prototype segmentation, and the official MM function list.

Main Scripts

  • run.py: The entry point of Goshawk; runs each step of the analysis.
  • train.py: Train the Siamese network.
  • cal_metric.py: Evaluate the accuracy of the trained model.
  • similarity_inference.py: Use the trained Siamese network to generate a similarity score for each function prototype.
  • mysegment.py: The ULM-based function prototype segmentation algorithm.
  • frontend_checker.py: Validate the MM functions according to their function prototypes and data flow.

Ⅰ. Environment Setup

Ⅰ.A Docker build (recommended)

Directly use our image released on DockerHub:

docker pull mmmiracle/goshawk

Or build the Docker image yourself:

docker build -t goshawk .

Ⅰ.B Manual build

robin-map
python 3.7+
tensorflow = 2.1
CodeChecker
Clang v15.0.0

Download the subword embeddings to the directory subword_dataset/word_embedding.

You can install robin-map from https://github.com/Tessil/robin-map.

You can install CodeChecker from https://github.com/Ericsson/codechecker.

You can download Clang-15.0.0 from this link, or compile Clang 15.0.0 yourself.

Ⅱ. How to use

Ⅱ.A Record compilation commands of your target project.

Before using this tool, you need to record the compilation command used for each source file of the project. All further analysis is based on these compilation commands.

We can use CodeChecker to record them. For projects built with a Makefile, wrap the make command in CodeChecker log -b to record the compilation process:

export CC=clang
export CXX=clang++
CodeChecker log -b "make CC=clang HOSTCC=clang -j$(nproc)" -o compilation.json

Remember to set your default compilers to clang and clang++.

The compilation commands will be recorded in the file compilation.json.
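Before starting a long analysis, it can help to sanity-check the recorded file. The sketch below assumes compilation.json follows the usual JSON compilation database layout (a list of entries with "directory", "file" and either "command" or "arguments"); the helper names check_entries and check_compilation_db are made up for illustration and are not part of Goshawk or CodeChecker.

```python
import json
from pathlib import Path

# Keys assumed to be present in every compilation database entry.
REQUIRED_KEYS = {"directory", "file"}


def check_entries(entries):
    """Return a list of human-readable problems found in the entries."""
    problems = []
    if not entries:
        problems.append("no compilation commands were recorded")
    for i, entry in enumerate(entries):
        missing = REQUIRED_KEYS - set(entry)
        if missing:
            problems.append(f"entry {i}: missing {sorted(missing)}")
        # An entry carries either a "command" string or an "arguments" list.
        command = entry.get("command") or " ".join(entry.get("arguments", []))
        parts = command.split()
        if not parts:
            problems.append(f"entry {i}: no command or arguments")
            continue
        compiler = Path(parts[0]).name
        if not compiler.startswith("clang"):
            problems.append(f"entry {i}: compiled with {compiler}, not clang")
    return problems


def check_compilation_db(path):
    """Load a compilation database from disk and check its entries."""
    return check_entries(json.loads(Path(path).read_text()))
```

Calling check_compilation_db("compilation.json") in the project directory returns an empty list when every entry looks usable.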

Ⅱ.B Run the full phases of Goshawk to analyze a target project.

Note: For large projects such as the Linux kernel, make sure there is at least 300 GB of free space on your hard disk.
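A quick way to check the free space up front is sketched below. The function name has_enough_space is hypothetical, and which directory Goshawk actually writes to is an assumption here; pass whichever path holds the Goshawk checkout and its output.

```python
import shutil


def has_enough_space(path, required_gb=300):
    """True if the filesystem holding `path` has at least `required_gb` GB free."""
    free_gb = shutil.disk_usage(path).free / 10**9
    return free_gb >= required_gb
```

For example, has_enough_space(".") checks the filesystem of the current directory against the 300 GB figure mentioned above.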

Currently, you only need one command to analyze a project by Goshawk:

python3 run.py target_project_path

Make sure that a compilation.json file for your project exists under target_project_path.

The MM functions and their corresponding MOSs (memory operation synopses) will be generated in output/alloc and output/free.

Note: All the MM functions in AllocNormalFile.txt, AllocCustomFile.txt, FreeNormalFile.txt and FreeCustomFile.txt are considered customized. They are split into *NormalFile.txt and *CustomFile.txt because the MM functions in *NormalFile.txt behave like malloc or free. This is an implementation decision.

The bug detection results will be generated at output/report_html/index.html.
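Judging from the sample records quoted in the issues further down this page, each line of the generated allocation files is a JSON object with the keys funcname, returned_object and param_object. The sketch below reads one such line; treating param_object as a flat list of (parameter index, object path) pairs is an inference from those samples, and parse_alloc_record is a hypothetical helper, not a Goshawk API.

```python
import json


def parse_alloc_record(line):
    """Split one allocation record into name, returned objects and parameter objects."""
    record = json.loads(line)
    flat = record["param_object"]
    # Assumption: param_object alternates parameter index and object path.
    pairs = list(zip(flat[0::2], flat[1::2]))
    return record["funcname"], record["returned_object"], pairs
```

Applied to the dh_key2buf sample, this yields the pair (2, "pbuf_out"), which reads as "the object is handed back through pbuf_out at parameter index 2".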

Ⅲ. Useful components in Goshawk

Ⅲ.A Function Prototype Segmentation

The function normalize_prototype_file(in_file, out_file) in normalize.py segments function prototypes.

It segments and normalizes the function prototypes in in_file and saves the results to out_file.

For example,
    before: void * kmalloc_array(size_t n, size_t size, gfp_t flags)
    after:  <cls> <ptr> kmalloc array ( <noptr> n <dot> <noptr> size <dot> <noptr> flags )
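To make the shape of the normalized output concrete, here is a toy approximation that reproduces the example above. It is not Goshawk's algorithm: the real segmentation uses a learned ULM subword vocabulary (see mysegment.py), whereas this sketch simply splits identifiers on underscores. The token meanings (<ptr>/<noptr> marking pointer types, <dot> separating parameters) are inferred from the single example, and toy_normalize is a made-up name.

```python
import re

# Return type (lazy), function name, then the parenthesised parameter list.
PROTOTYPE_RE = re.compile(r"^(?P<ret>.*?)(?P<name>\w+)\s*\((?P<params>.*)\)\s*$")


def pointer_tag(type_text):
    """Tag a type as pointer or non-pointer based on the presence of '*'."""
    return "<ptr>" if "*" in type_text else "<noptr>"


def split_words(identifier):
    """Stand-in for subword segmentation: split on underscores."""
    return [word for word in identifier.split("_") if word]


def toy_normalize(prototype):
    match = PROTOTYPE_RE.match(prototype.strip())
    if match is None:
        raise ValueError(f"not a function prototype: {prototype!r}")
    tokens = ["<cls>", pointer_tag(match["ret"])]
    tokens += split_words(match["name"])
    tokens.append("(")
    params = [p.strip() for p in match["params"].split(",") if p.strip()]
    for index, param in enumerate(params):
        if index > 0:
            tokens.append("<dot>")
        tokens.append(pointer_tag(param))
        # The parameter name is taken to be the last identifier in the declaration.
        name = re.search(r"(\w+)\s*$", param)
        if name:
            tokens += split_words(name.group(1))
    tokens.append(")")
    return " ".join(tokens)
```

Because the splitting rule is only a stand-in, identifiers without underscores (for example camelCase names) stay unsegmented here, which is exactly the case a learned subword vocabulary is meant to handle.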

Ⅲ.B Re-train the Siamese network for your own target function identification task (e.g., MM functions, crypto functions, ...).

1. Prepare your training function prototype dataset.

Taking crypto functions as an example, the dataset should contain the prototypes of the crypto functions you collected, one function prototype per line.

crypto.txt
-------------
int crypto_aead_encrypt(struct aead_request *req)
int crypto_aead_decrypt(struct aead_request *req)
static int crypto_aegis128_encrypt_generic(struct aead_request *req)
static int crypto_aegis128_decrypt_simd(struct aead_request *req)
void crypto_aegis128_encrypt_chunk_neon(void *state, void *dst, const void *src,unsigned int size)
void crypto_aegis128_decrypt_chunk_neon(void *state, void *dst, const void *src,unsigned int size)
static int crypto_authenc_esn_encrypt(struct aead_request *req)
static int crypto_authenc_esn_decrypt(struct aead_request *req)
...

2. Train the Siamese network.

We have implemented re-training of the Siamese network in the script Re-train.py. It takes two arguments:

  • the training corpus
  • the name of your trained model

For example:

python Re-train.py crypto.txt crypto

After training finishes, your trained model, named "crypto", is saved in the directory "model/crypto".

3. Infer similarities.

The pre-trained models are saved in the directory "model"; you can use them directly to infer similarities for other functions.

We have implemented this in the script similarity_inference.py.

You can call the function similarity_inference(model_name, filename) to infer similarities for the functions whose prototypes are saved in the file filename.

Here, model_name should be the name of a model saved in the directory model.

For example, suppose a file named test.func contains the following functions:

test.func
---------
void * mem_malloc(unsigned long size)
void mem_free(void *ptr)
void CAST_set_key(CAST_KEY *key, int len, const unsigned char *data)

We can call the function similarity_inference to infer similarities for them.

from similarity_inference import similarity_inference
similarity_inference("alloc", "test.func") # Infer the similarity for allocation functions.

The results are saved in "temp/func_name_similarity":

temp/func_name_similarity
----
mem_malloc 0.938829920833657
mem_free -0.9019584597976495
cast_set_key -0.9085114460471964
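The scores can then be filtered to pick out likely matches. The sketch below assumes each line holds a function name and a score separated by whitespace, as in the listing above. The 0.5 cut-off and the helper names are illustrative assumptions, not values or functions taken from Goshawk.

```python
def load_similarities(lines):
    """Parse 'name score' lines into a dict mapping name to float score."""
    scores = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        name, score = line.rsplit(maxsplit=1)
        scores[name] = float(score)
    return scores


def likely_matches(scores, threshold=0.5):
    """Names whose score reaches the (hypothetical) threshold, sorted by name."""
    return sorted(name for name, score in scores.items() if score >= threshold)
```

With the three sample lines, only mem_malloc scores above the cut-off for the "alloc" model, while mem_free and cast_set_key score strongly negative.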

Citation

We release Goshawk source code in the hope of benefiting others. If you find this project useful, please consider citing:

@INPROCEEDINGS {Goshawk,
    author = {Y. Lyu and Y. Fang and Y. Zhang and Q. Sun and S. Ma and E. Bertino and K. Lu and J. Li},
    booktitle = {2022 IEEE Symposium on Security and Privacy (SP)},
    title = {Goshawk: Hunting Memory Corruptions via Structure-Aware and Object-Centric Memory Operation Synopsis},
    year = {2022},
    issn = {2375-1207},
    pages = {1566-1566},
    doi = {10.1109/SP46214.2022.00137},
    url = {https://doi.ieeecomputersociety.org/10.1109/SP46214.2022.00137},
    publisher = {IEEE Computer Society},
    address = {Los Alamitos, CA, USA},
    month = {may}
}

If your research work is inspired by or benefits from the NLP based function similarity inference module in Goshawk, please also consider citing:

@INPROCEEDINGS{SparrowHawk,
    author = {Lyu, Yunlong and Gao, Wang and Ma, Siqi and Sun, Qibin and Li, Juanru},
    title = {SparrowHawk: Memory Safety Flaw Detection via Data-Driven Source Code Annotation},
    year = {2021},
    isbn = {978-3-030-88322-5},
    publisher = {Springer-Verlag},
    address = {Berlin, Heidelberg},
    url = {https://doi.org/10.1007/978-3-030-88323-2_7},
    doi = {10.1007/978-3-030-88323-2_7},
    booktitle = {Information Security and Cryptology: 17th International Conference, Inscrypt 2021, Virtual Event, August 12–14, 2021, Revised Selected Papers},
    pages = {129–148},
    numpages = {20},
}

goshawk's People

Contributors

kylin-1, shangzhixu, yunlongs


goshawk's Issues

JSON parse error

When checking redis-stable, line 53 of parse_call_graph.py, caller_funcname = json.loads(caller.strip())["funcname"], raises an error:

json.decoder.JSONDecodeError: Expecting ',' delimiter: line 1 column

The content of caller.strip() was:

{"return_type": "class std::basic_string<char>", "funcname": "std::string_literals::operator""s", "params": "const char *@__str,unsigned long@__len,", "file" :"/usr/lib/gcc/aarch64-linux-gnu/7.3.0/../../../../include/c++/7.3.0/bits/basic_string.h", "begin": [6660, 5], "end": [6663, 48]}

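Here the funcname value has an extra trailing "s". The function is the C++ literal operator operator""s, and its unescaped double quotes end the JSON string early, so the parser trips over the s that follows.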
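One possible workaround on the reading side is sketched below: if a record fails to parse, escape the quotes inside the funcname value and try again. This assumes the "params" key always directly follows "funcname", as it does in the record above; loads_tolerant is a hypothetical helper, not code from parse_call_graph.py. Escaping the name where the record is written would be the cleaner fix.

```python
import json
import re

# Captures the funcname value up to the start of the following "params" key.
FUNCNAME_RE = re.compile(r'("funcname":\s*")(.*?)(",\s*"params")')


def loads_tolerant(line):
    """Parse a call-graph record, repairing unescaped quotes in funcname if needed."""
    try:
        return json.loads(line)
    except json.JSONDecodeError:
        repaired = FUNCNAME_RE.sub(
            lambda m: m.group(1) + m.group(2).replace('"', '\\"') + m.group(3),
            line,
            count=1,
        )
        return json.loads(repaired)
```

Well-formed records take the first branch unchanged, so the repair only runs on lines that already failed to parse.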

The result is empty

I successfully configured the environment and ran Goshawk, then tried openssl as a test. The commit I used was b33c48b75aaf33c93aeda42d7138616b9e6a64cb, an earlier version. The run completed, but the results were inconsistent with the official sample results, and the number of bugs detected was 0. I don't know whether there really are no bugs or whether I made a mistake. Here are some of the results:

This is the content of the AllocCustomizedFile.txt file:

{"funcname": "ossl_cmp_certrep_new", "returned_object": ["->extraCerts"], "param_object": []}
{"funcname": "dsa_new_intern", "returned_object": ["->lock"], "param_object": []}
{"funcname": "dsa_do_sign_int", "returned_object": ["->r", "->s"], "param_object": []}
{"funcname": "dh_new_intern", "returned_object": ["->lock"], "param_object": []}
{"funcname": "EC_GROUP_new_from_ecparameters", "returned_object": ["->seed"], "param_object": []}
{"funcname": "d2i_ECDSA_SIG", "returned_object": ["->r", "->s"], "param_object": [1, "psig"]}
{"funcname": "rsa_multip_info_new", "returned_object": ["->d", "->pp", "->r", "->t"], "param_object": []}
{"funcname": "policy_data_new", "returned_object": ["->expected_policy_set"], "param_object": []}
{"funcname": "make_IPAddressFamily", "returned_object": ["->ipAddressChoice"], "param_object": []}
{"funcname": "kdf_data_new", "returned_object": ["->lock"], "param_object": []}
{"funcname": "kmac_new", "returned_object": ["->ctx"], "param_object": []}
{"funcname": "ssl_cert_new", "returned_object": ["->lock"], "param_object": []}
{"funcname": "dsa_dupctx", "returned_object": ["->mdctx"], "param_object": []}
{"funcname": "cms_kari_create_ephemeral_key", "returned_object": [], "param_object": [1, "->pctx"]}
{"funcname": "OSSL_CRMF_MSG_create_popo", "returned_object": [], "param_object": [2, "->popo"]}
{"funcname": "dh_key2buf", "returned_object": [], "param_object": [2, "pbuf_out"]}
{"funcname": "hmac_ctx_alloc_mds", "returned_object": [], "param_object": [1, "->i_ctx", 1, "->md_ctx", 1, "->o_ctx"]}
{"funcname": "PKCS12_pbe_crypt", "returned_object": [], "param_object": [6, "data"]}
{"funcname": "pkey_get_rsa", "returned_object": [], "param_object": [2, "rsa"]}
{"funcname": "pkey_get_eckey", "returned_object": [], "param_object": [2, "eckey"]}
{"funcname": "SRP_create_verifier_BN_ex", "returned_object": [], "param_object": [4, "verifier"]}
{"funcname": "X509v3_add_ext", "returned_object": [], "param_object": [1, "x"]}
{"funcname": "X509_add_cert_new", "returned_object": [], "param_object": [1, "sk"]}
{"funcname": "ossl_prov_macctx_load_from_params", "returned_object": [], "param_object": [1, "macctx"]}
{"funcname": "ssl3_cbc_copy_mac", "returned_object": [], "param_object": [4, "mac"]}
{"funcname": "ssl_create_cipher_list", "returned_object": [], "param_object": [3, "cipher_list"]}

And these are some examples from the FreeCustomizedFile.txt file:

{"funcname": "sk_danetls_record_free", "param_names": [0, "sk"], "member_name": ["sk->data"]}
{"funcname": "sk_danetls_record_pop_free", "param_names": [0, "sk"], "member_name": ["sk->data"]}
{"funcname": "lh_SSL_SESSION_free", "param_names": [0, "lh"], "member_name": ["lh->b"]}
{"funcname": "lh_X509_NAME_free", "param_names": [0, "lh"], "member_name": ["lh->b"]}
{"funcname": "ssl_evp_cipher_free", "param_names": [0, "cipher"], "member_name": ["cipher->lock", "cipher->prov", "cipher->prov->error_strings", "cipher->prov->module", "cipher->prov->module->filename", "cipher->prov->module->loaded_filename", "cipher->prov->module->lock", "cipher->prov->name", "cipher->prov->operation_bits", "cipher->prov->parameters", "cipher->prov->parameters->data", "cipher->prov->path"]}
{"funcname": "sk_EX_CALLBACK_free", "param_names": [0, "sk"], "member_name": ["sk->data"]}

And this is the HTML report:

[image]

If you need the run logs, I have saved them here.

Bug report: Goshawk crashes on some files when checking coreutils-6.11

I used Goshawk to check coreutils-6.11 with the following commands:
wget https://mirrors.tuna.tsinghua.edu.cn/gnu/coreutils/coreutils-6.11.tar.gz
tar -zxvf coreutils-6.11.tar.gz
cd coreutils-6.11
./configure
CodeChecker log -b "make -j20" -o compilation.json
cd /Goshawk
python3 run.py /coreutils-6.11
coreutils-6.11 has 259 files in total, and Goshawk crashes on a dozen or so of them. A quick investigation suggests the problem is the statement QualType sizeTy = TotalSize.getAsSymbol()->getType(); in the ModelReallocMem function. At the crash site, TotalSize.getAsSymbol() returns a null pointer, and calling getType() on a null pointer crashes. I added a check for a null TotalSize.getAsSymbol() before this statement, which fixed the problem. Here is my modified code:
// Get the value of the size argument.
SVal TotalSize = C.getSVal(Arg1);
if (!TotalSize.getAs<DefinedOrUnknownSVal>())
    return nullptr;

if (!TotalSize.getAsSymbol())
    return nullptr;
QualType sizeTy = TotalSize.getAsSymbol()->getType();
if (DebugMode){llvm::errs()<<"size type:\t";sizeTy.dump();llvm::errs()<<"\n";}

// Compare the size argument to 0.
DefinedOrUnknownSVal SizeZero =
svalBuilder.evalEQ(State, TotalSize.castAs<DefinedOrUnknownSVal>(),
                   svalBuilder.makeIntValWithWidth(sizeTy, 0));
if (DebugMode){llvm::errs()<<"Compare the size argument to 0.";}

I hope you can fix this bug soon.

Bug report and suggested fix: Goshawk crashes on some files when checking coreutils-6.11

I used Goshawk to check coreutils-6.11 with the following commands:
wget https://mirrors.tuna.tsinghua.edu.cn/gnu/coreutils/coreutils-6.11.tar.gz
tar -zxvf coreutils-6.11.tar.gz
cd coreutils-6.11
./configure
CodeChecker log -b "make -j20" -o compilation.json
cd /Goshawk
python3 run.py /coreutils-6.11
coreutils-6.11 has 259 files in total, and Goshawk crashes on a dozen or so of them. A quick investigation suggests the problem is the statement QualType ptrTy = Arg0Val.getAsSymbol()->getType(); in the ModelReallocMem function. At the crash site, Arg0Val.getAsSymbol() returns a null pointer, and calling getType() on a null pointer crashes. I added a check for a null Arg0Val.getAsSymbol() before this statement, which fixed the problem. Here is my modified code:
ProgramStateRef MemMisuseChecker::ModelReallocMem(CheckerContext &C, const CallEvent &Call, ProgramStateRef State) const {
if (!State)
    return nullptr;
const Expr *expr = Call.getOriginExpr();
if (!expr)
    return nullptr;
const CallExpr *CE = dyn_cast<CallExpr>(expr);
if (!CE)
    return nullptr;

if(DebugMode) {llvm::errs()<<"Enter a realloc function\n";}

// If the number of arguments less than 2, it could not be a realloc function.
if (CE->getNumArgs()<2)
    return nullptr;

const Expr *arg0Expr = CE->getArg(0);
if(DebugMode) {llvm::errs()<<"Arg0Expr dump:";arg0Expr->dump();llvm::errs()<<"\n";}
SVal Arg0Val = C.getSVal(arg0Expr);
if (!Arg0Val.getAs<DefinedOrUnknownSVal>())
    return nullptr;
if(!Arg0Val.getAsSymbol())
    return nullptr;
QualType ptrTy = Arg0Val.getAsSymbol()->getType();

DefinedOrUnknownSVal arg0Val = Arg0Val.castAs<DefinedOrUnknownSVal>();

SValBuilder &svalBuilder = C.getSValBuilder();

DefinedOrUnknownSVal PtrEQ =
    svalBuilder.evalEQ(State, arg0Val, svalBuilder.makeNullWithType(ptrTy));

const Expr *Arg1 = CE->getArg(1);
if(DebugMode) {llvm::errs()<<"Arg1Expr dump:";Arg1->dump();llvm::errs()<<"\n";}

// Get the value of the size argument.
SVal TotalSize = C.getSVal(Arg1);
if (!TotalSize.getAs<DefinedOrUnknownSVal>())
    return nullptr;

if (!TotalSize.getAsSymbol())
    return nullptr;
QualType sizeTy = TotalSize.getAsSymbol()->getType();
if (DebugMode){llvm::errs()<<"size type:\t";sizeTy.dump();llvm::errs()<<"\n";}

// Compare the size argument to 0.
DefinedOrUnknownSVal SizeZero =
svalBuilder.evalEQ(State, TotalSize.castAs<DefinedOrUnknownSVal>(),
                   svalBuilder.makeIntValWithWidth(sizeTy, 0));
if (DebugMode){llvm::errs()<<"Compare the size argument to 0.";}

ProgramStateRef StatePtrIsNull, StatePtrNotNull;
std::tie(StatePtrIsNull, StatePtrNotNull) = State->assume(PtrEQ);
ProgramStateRef StateSizeIsZero, StateSizeNotZero;
std::tie(StateSizeIsZero, StateSizeNotZero) = State->assume(SizeZero);
// We only assume exceptional states if they are definitely true; if the
// state is under-constrained, assume regular realloc behavior.
bool PrtIsNull = StatePtrIsNull && !StatePtrNotNull;
bool SizeIsZero = StateSizeIsZero && !StateSizeNotZero;

  // If the ptr is NULL and the size is not 0, the call is equivalent to
// malloc(size).
if (PrtIsNull && !SizeIsZero) {
    ProgramStateRef stateMalloc = ModelMallocNormal(C, Call, State);
    return stateMalloc;
}

// If the reallocated ptr is NULL and size is 0, this function do nothing.
if (PrtIsNull && SizeIsZero)
    return State;

// Get the from and to pointer symbols as in toPtr = realloc(fromPtr, size).
assert(!PrtIsNull);
SymbolRef FromPtr = arg0Val.getAsSymbol();
SVal RetVal = C.getSVal(CE);
SymbolRef ToPtr = RetVal.getAsSymbol();
if (!FromPtr || !ToPtr)
    return nullptr;

if (SizeIsZero)
// If size was equal to 0, either NULL or a pointer suitable to be passed
// to free() is returned. 
    if (ProgramStateRef stateFree =
            ModelFreeNormal(C, Call, StateSizeIsZero))
        return stateFree;

// Normal behavior of realloc
if (ProgramStateRef stateFree = ModelFreeNormal(C, Call, State)) {
    ProgramStateRef stateRealloc = ModelMallocNormal(C, Call,stateFree);
    if (!stateRealloc)
        return nullptr;
    return stateRealloc;
}
return nullptr;

}

The part I changed is:
if(!Arg0Val.getAsSymbol())
    return nullptr;

I hope you can fix this bug soon.

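Error when checking a simple self-written cpp file containing a use-after-free with Goshawk

Hello, I wrote two C files, each containing a use-after-free bug, and ran Goshawk on them. At step 3, Goshawk reports FileNotFoundError: [Errno 2] No such file or directory: 'temp/free_check.txt' and the check cannot complete. Below are my code and some configuration details.

Goshawk failure output:
-----------------------------------------------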

Step3: Identify Deallocation Functions from source code Start!

WARNING:tensorflow:No training configuration found in save file, so the model was not compiled. Compile it manually.
extracting time: 0.43344902992248535
Call Graph read finished!
total func number:9
clang -fsyntax-only -Xclang -load -Xclang /project/osteacher/Goshawk/plugins/FreeNullCheck.so -Xclang -plugin -Xclang free-check -Xclang -plugin-arg-free-check -Xclang /project/osteacher/Goshawk/temp/candidate_free.txt -Xclang -plugin-arg-free-check -Xclang /project/osteacher/Goshawk/temp/free_check.txt -Xclang -plugin-arg-free-check -Xclang /project/osteacher/Goshawk/temp/visited.txt -Wall -o figure figure.c
clang -fsyntax-only -Xclang -load -Xclang /project/osteacher/Goshawk/plugins/FreeNullCheck.so -Xclang -plugin -Xclang free-check -Xclang -plugin-arg-free-check -Xclang /project/osteacher/Goshawk/temp/candidate_free.txt -Xclang -plugin-arg-free-check -Xclang /project/osteacher/Goshawk/temp/free_check.txt -Xclang -plugin-arg-free-check -Xclang /project/osteacher/Goshawk/temp/visited.txt -Wall -o figure1 figure1.c
Traceback (most recent call last):
  File "run.py", line 256, in <module>
    Step_3_Free()
  File "run.py", line 132, in Step_3_Free
    cleanup_free_null_check(config.free_check_file)
  File "/project/osteacher/Goshawk/utils.py", line 26, in cleanup_free_null_check
    with open(file) as f:
FileNotFoundError: [Errno 2] No such file or directory: 'temp/free_check.txt'

The Makefile I wrote for the two C files:

CC = gcc
CFLAGS = -Wall

all: figure figure1

figure: figure.c
	$(CC) $(CFLAGS) -o figure figure.c

figure1: figure1.c
	$(CC) $(CFLAGS) -o figure1 figure1.c

clean:
	rm -f figure figure1

One of the files, which contains a use-after-free bug:

#include <stdlib.h>
#include <stdio.h>

typedef struct node {
    int val;
    struct node *next;
} Node;

Node* create() {
    return (Node*)malloc(sizeof(Node));
}

void myfree(Node **n) {
    free(*n);
    *n = NULL;
}

int main() {
    Node *n1, *n2;
    n1 = create();
    n1->val = 10;
    n2 = create();
    n2->val = 5;
    int y, i = 0;
    scanf("%d", &y);

    while (y < 10) {
        int x;
        scanf("%d", &x);

        while (x > 10) {
            i++;
            if (i > 4)
                x += i;
            else
                x -= i;

            if (x % 2 == 0)
                n2 = n1;
        }

        y += x;
    }

    myfree(&n1);
    printf("%d", n2->val);
    myfree(&n2);
}

The generated compilation.json file:

[
    {
        "directory": "/project/osteacher/test_case",
        "command": "/usr/local/bin/clang -Wall -o figure figure.c",
        "file": "figure.c"
    },
    {
        "directory": "/project/osteacher/test_case",
        "command": "/usr/local/bin/clang -Wall -o figure1 figure1.c",
        "file": "figure1.c"
    }
]

Error When Run

Hi, I just tried to use Goshawk but ran into a problem. I set up the environment in Docker, and I'm sure the JSON file exists and is not empty. But when I run:
python3 ./run.py ./openssl
I get:
[image]

Bug report: Goshawk crashes on some files when checking coreutils-6.11

I used Goshawk to check coreutils-6.11 with the following commands:
wget https://mirrors.tuna.tsinghua.edu.cn/gnu/coreutils/coreutils-6.11.tar.gz
tar -zxvf coreutils-6.11.tar.gz
cd coreutils-6.11
./configure
CodeChecker log -b "make -j20" -o compilation.json
cd /Goshawk
python3 run.py /coreutils-6.11
coreutils-6.11 has 259 files in total, and Goshawk crashes on a dozen or so of them. A quick investigation suggests the problem is the statement QualType ptrTy = Arg0Val.getAsSymbol()->getType(); in the ModelReallocMem function. At the crash site, Arg0Val.getAsSymbol() returns a null pointer, and calling getType() on a null pointer crashes. I added a check for a null Arg0Val.getAsSymbol() before this statement, which fixed the problem. Here is my modified code:
const Expr *arg0Expr = CE->getArg(0);
if(DebugMode) {llvm::errs()<<"Arg0Expr dump:";arg0Expr->dump();llvm::errs()<<"\n";}
SVal Arg0Val = C.getSVal(arg0Expr);
if (!Arg0Val.getAs<DefinedOrUnknownSVal>())
    return nullptr;
if(!Arg0Val.getAsSymbol())
    return nullptr;
QualType ptrTy = Arg0Val.getAsSymbol()->getType();

DefinedOrUnknownSVal arg0Val = Arg0Val.castAs<DefinedOrUnknownSVal>();

SValBuilder &svalBuilder = C.getSValBuilder();

I hope you can fix this bug soon.

Step_3_Free did not generate the file temp/free_check.txt

My test code is as follows:
#include <stdio.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
struct auth {
    char name[32];
    int auth;
};

void free_s(void *ptr)
{
    return free(ptr);
}

void uaf()
{
    printf("enter uaf demo\n");
    int size = 100;
    struct auth *pauth = malloc(100);
    if (!pauth)
    {
        fprintf(stderr, "%s\n", strerror(errno));
        exit(EXIT_FAILURE);
    }
    pauth->auth = 1;

    free_s(pauth);

    if (pauth && pauth->auth) {
        printf("uaf success\n");
    }
}

void double_free()
{
    printf("enter double free demo\n");
    int size = 100;
    void *ptr = malloc(100);
    if (!ptr)
    {
        fprintf(stderr, "%s\n", strerror(errno));
        exit(EXIT_FAILURE);
    }

    free(ptr);
    free(ptr);
}

int main()
{
    uaf();
    double_free();
}

An error occurred while running Goshawk:

extracting time: 0.44608020782470703
Call Graph read finished!
total func number:9
clang -fsyntax-only -Xclang -load -Xclang /home/Goshawk/plugins/FreeNullCheck.so -Xclang -plugin -Xclang free-check -Xclang -plugin-arg-free-check -Xclang /home/Goshawk/temp/candidate_free.txt -Xclang -plugin-arg-free-check -Xclang /home/Goshawk/temp/free_check.txt -Xclang -plugin-arg-free-check -Xclang /home/Goshawk/temp/visited.txt -c test.c -o test.o
Traceback (most recent call last):
  File "/home/Goshawk/run.py", line 251, in <module>
    Step_3_Free()
  File "/home/Goshawk/run.py", line 127, in Step_3_Free
    cleanup_free_null_check(config.free_check_file)
  File "/home/Goshawk/utils.py", line 26, in cleanup_free_null_check
    with open(file) as f:
FileNotFoundError: [Errno 2] No such file or directory: 'temp/free_check.txt'
