Failure to raise binaries of Csmith-generated sources (llvm-mctoll issue, open)

Hanseltu commented on June 28, 2024

from llvm-mctoll.

Comments (4)

Hanseltu commented on June 28, 2024

You are welcome @bharadwajy! I am happy I can help here!

By the way, the versions of the source code reduced with C-Reduce are as follows.

The source whose GCC-compiled binary fails to raise:

int a;
char(f)() {
  unsigned char c = 0;
  return a * c;
}
int main() {return 0;}

and the source whose clang-compiled binary fails to raise:

void a() {
  int b[5] = {0,0,0,0,0};
  b[4] ^= 253L;
}
int main() {return 0;}

Regarding the correctness of the translation, may I ask a further question? To verify the translation process, I think two fundamental requirements must be met. The first comes from the lifting tool itself: the lifted IR should be recompilable. The second is that we need to define what the correct behavior is after executing the compiled/recompiled binary. For llvm-mctoll we can verify correctness, because the IR it lifts can be recompiled, which satisfies the first requirement, and the programs generated by Csmith satisfy the second. However, for other lifting tools (e.g., retdec), the lifted LLVM IR cannot be recompiled because of some issues (e.g., avast/retdec#529). Is it possible to cross-check the lifted LLVM IR directly, without recompiling it? Do you have any suggestions?

I will try to keep filing new bug reports as separate issues in my spare time. Thanks!

Best,
Haoxin

bharadwajy commented on June 28, 2024

Thanks for your interest in the project and for your question.

If the input to llvm-mctoll is a legitimate binary, whether built from a randomly generated source or otherwise, I'd like llvm-mctoll to be able to raise it correctly. So it should not matter whether the source code was generated by Csmith or by other means, as long as it compiles to a well-behaved and correct binary.

In practice, Csmith could serve as a feeder of test cases for llvm-mctoll.

The examples you provided expose:

- an unhandled memory instruction while raising the gcc-generated binary
- a bug in the pass "Unify function exit nodes" while raising the clang-generated binary

Thanks for the bug report. I'll plan to look at them. However, if you or anyone else can help out before I get to them, I'd very much appreciate the help.

Hanseltu commented on June 28, 2024

Hi @bharadwajy. Thanks for your reply and bug confirmation.

Yeah, I think llvm-mctoll is a cool tool that has many benefits over other existing lifting tools (e.g., mcsema and retdec), and I would be happy to see it become stronger and more scalable.

I am sorry I cannot help with the implementation, but if you need more test cases that trigger different assertion failures, I would like to help by finding more useful test cases (with their reduced versions) to assist debugging. Do you need such cases? If so, is it better to file a new issue for each failure, or to pack them all into a compressed file and upload it here?

Best,
Haoxin

bharadwajy commented on June 28, 2024

Thanks @Hanseltu. The more test cases we have to make the tool robust and useful, the better.

I'd prefer that you create one issue per kind of failure, with sources (either C or assembly) that are as minimal as possible, to help focus on the actual failure. It would also help if the sources incorporate a way to verify the correctness of the translation. Currently the tests are set up to raise a given binary and recompile the raised IR back to the x64 target. The output of the original binary and that of the recompiled binary are then compared to verify the correctness of the raised IR. So, if your bug report sources can output some verifiable set of results, or provide some other way to verify the correctness of the raised IR, it would be very helpful.

Thanks again for your offer to help.
