llvm-mctoll's Introduction

Introduction

This tool statically (AOT) translates (or raises) binaries to LLVM IR.

Current Status

Llvm-mctoll is capable of raising X86-64 and Arm32 Linux/ELF libraries and executables to LLVM IR. Support for raising Windows, OS X and C++ binaries has not yet been added. At this time X86-64 support is more mature than Arm32.

Development and primary testing are done on Ubuntu 22.04. Testing is also done on Ubuntu 20.04. The tool is expected to build and run on Ubuntu 18.04, 17.10, 17.04 and 16.04, CentOS 7.5, Debian 10, Windows 10, and OS X to raise Linux/ELF binaries.

Triple        VarArgs  FuncProto  StackFrame  JumpTables  SharedLibs  C++
x86_64-linux  X        X          X           X           X
arm-linux     X        X          X           X           X
  • VarArgs: function calls with variable arguments (such as printf)
  • FuncProto: function prototype discovery
  • StackFrame: stack frame abstraction
  • JumpTables: switch statements with jump tables
  • SharedLibs: shared libraries
  • C++: vtables, name mangling and exception handling

Known Issues

SIMD instructions such as SSE, AVX, and Neon cannot be raised at this time. For X86-64 you can sometimes work around this issue by compiling the binary to raise with SSE disabled (clang -mno-sse).

Most testing is done using binaries compiled for Linux using LLVM. We have done only limited testing with GCC compiled code.

Getting Started

There are no dependencies outside of LLVM to build llvm-mctoll. The following instructions assume you will build LLVM with Ninja.

Support for raising X86-64 and Arm32 binaries is enabled by building LLVM's X86 and ARM targets. The tool is not built unless at least one of the X86 or ARM LLVM targets is built.

Building as part of the LLVM tree

  1. On Linux and OS X build from a command prompt such as a bash shell. On Windows build from an x64 Native Tools Command Prompt. See LLVM's Visual Studio guide.

  2. Clone the LLVM and mctoll git repositories

git clone https://github.com/llvm/llvm-project.git
cd llvm-project && git clone -b master https://github.com/microsoft/llvm-mctoll.git llvm/tools/llvm-mctoll
  3. The commit recorded in llvm-project-git-commit-to-use.txt is the tested version of LLVM to build against. If you use a different version of LLVM you might encounter build errors.
git checkout <hash from llvm-project-git-commit-to-use.txt>
  4. Configure LLVM by enabling Clang and lld. See LLVM CMake Variables for more information on LLVM's cmake options.
cmake -S llvm -B <build-dir> -G "Ninja" \
  -DLLVM_TARGETS_TO_BUILD="X86;ARM"  \
  -DLLVM_ENABLE_PROJECTS="clang;lld" \
  -DLLVM_ENABLE_ASSERTIONS=true      \
  -DCLANG_DEFAULT_PIE_ON_LINUX=OFF   \
  -DCMAKE_BUILD_TYPE=<build-type>

clang-tidy checks can be enabled for the llvm-mctoll project sources by using the additional cmake option -DMCTOLL_CLANG_TIDY.

  5. Build llvm-mctoll
cmake --build <build-dir> -- llvm-mctoll
  6. Run the unit tests (Linux only)
ninja check-mctoll
  7. Building Release without assertions
cmake -S llvm -B <build-dir> -G "Ninja"  \
      -DLLVM_TARGETS_TO_BUILD="X86;ARM"  \
      -DLLVM_ENABLE_PROJECTS="clang;lld" \
      -DCLANG_DEFAULT_PIE_ON_LINUX=OFF   \
      -DCMAKE_BUILD_TYPE=Release         \
      -DLLVM_ENABLE_DUMP=true

Usage

Command                                                                Description
-dh or --help                                                          Display available options
-d <binary>                                                            Generate LLVM IR for a binary and place the result in <binary>-dis.ll
--filter-functions-file=<file>                                         Text file with C functions to exclude or include during raising
--include-files=[file1,file2,file3,...] or -I file1 -I file2 -I file3  Specify full path of one or more files with function prototypes to use
-debug                                                                 Print all debug output
-debug-only=mctoll                                                     Print the LLVM IR after each pass of the raiser
-debug-only=prototypes                                                 Print ignored duplicate function prototypes in --include-files

Raising a binary to LLVM IR

This is what you came here for :-). Please file an issue if you find a problem.

llvm-mctoll -d a.out

See usage document for additional details of command-line options.

Checking correctness of translation

The easiest way to check the raised LLVM IR <binary>-dis.ll is correct is to compile the IR to an executable using clang and run the resulting executable. The tests in the repository follow this methodology.
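The comparison step of that methodology can be automated. The following is a minimal sketch, assuming a POSIX system; run_capture and same_behavior are hypothetical helper names that are not part of llvm-mctoll, and the commands passed in are placeholders for the original binary and the one rebuilt from the raised IR.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>

/* Run cmd through the shell, copy up to cap-1 bytes of its stdout into out,
 * and return its exit code, or -1 if it could not be run or did not exit
 * normally. Only the first cap-1 bytes of output are kept. */
int run_capture(const char *cmd, char *out, size_t cap) {
    assert(cmd != NULL && out != NULL && cap > 0);
    FILE *p = popen(cmd, "r");
    if (p == NULL)
        return -1;
    size_t n = fread(out, 1, cap - 1, p);
    out[n] = '\0';
    /* Drain any remaining output so the child can finish writing. */
    char scratch[256];
    while (fread(scratch, 1, sizeof scratch, p) > 0) {
    }
    int status = pclose(p);
    if (status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}

/* Return 1 when both commands exit with the same code and print the same
 * (possibly truncated) stdout, 0 otherwise. */
int same_behavior(const char *original_cmd, const char *raised_cmd) {
    char a[4096], b[4096];
    int ra = run_capture(original_cmd, a, sizeof a);
    int rb = run_capture(raised_cmd, b, sizeof b);
    return ra != -1 && ra == rb && strcmp(a, b) == 0;
}
```

A typical use would be to build the raised IR with clang (for example clang <binary>-dis.ll -o <raised-binary>) and then call same_behavior with the two executables' command lines. Matching stdout and exit code on a few inputs is only evidence of correctness, not proof.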

Acknowledgements

Please use the following reference when citing this work: Raising Binaries to LLVM IR with MCTOLL (WIP).

 @inproceedings{10.1145/3316482.3326354,
    author = {Yadavalli, S. Bharadwaj and Smith, Aaron},
    title = {Raising Binaries to LLVM IR with MCTOLL (WIP Paper)},
    year = {2019},
    isbn = {9781450367240},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3316482.3326354},
    doi = {10.1145/3316482.3326354},
    booktitle = {Proceedings of the 20th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems},
    pages = {213–218},
    numpages = {6},
    keywords = {Code Generation, LLVM IR, Binary Translation},
    location = {Phoenix, AZ, USA},
    series = {LCTES 2019}
 }

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.

When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

llvm-mctoll's People

Contributors

aaronsm, bharadwajy, li-xin-yi, martin-fink, microsoft-github-policy-service[bot], rcorcs, sv99, tathanhdinh, trass3r, y-nak, yang-xifeng, yaoxiaocc

llvm-mctoll's Issues

Raising strings

I generated a binary from C code that involved a char * and an assignment to it. When it is raised using llvm-mctoll, the string constant does not appear as a global string, nor does the IR have any variables of type i8*. All the variables present are either i64 or i32. Please let me know whether this is a bug or whether support for it has not yet been added.

Incorrect function return types for code compiled with clang (coremark)

Grab coremark from github.

Modify /path/to/coremark/linux64/core_portme.mak file as below:
-CC = gcc
+CC = clang

Compile for X86:
/path/to/coremark/make PORT_DIR=linux64

Use llvm-mctoll to translate coremark and compare the function return type.

For get_seed_args:
Function prototype of source code:
ee_s32 get_seed_args(int i, int argc, char *argv[])
Function prototype parsed:
declare dso_local void @get_seed_args(i32 %arg1, i32 %arg2, i64 %arg3)

memcpy support

Memory copy operations exist in various flavors.
They may be implemented as a loop, as something like rep movsq (%rsi), %es:(%rdi), or with vector instructions.

They occur quite often even without explicit memcpy, e.g. initializing stack arrays from constant data or object copies.
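As an illustration of such implicit copies, here is a small C sketch. The names Packet, copy_packet and sum_table are made up for this example. Neither function calls memcpy in the source, yet depending on the compiler and optimization level both may be lowered to a memcpy call, a rep movs sequence, or vector moves.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 64-byte object (on typical ABIs) used to show an object copy. */
struct Packet {
    int id;
    char payload[60];
};

/* Struct assignment: a block copy without any explicit memcpy call. */
struct Packet copy_packet(const struct Packet *src) {
    assert(src != NULL);
    struct Packet dst = *src;
    return dst;
}

/* Stack array initialized from constant data: compilers often emit the
 * initializer as read-only data and copy it onto the stack. */
int sum_table(void) {
    int table[16] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16};
    int sum = 0;
    for (size_t i = 0; i < sizeof table / sizeof table[0]; ++i)
        sum += table[i];
    return sum;
}
```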

Unable to checkout using LLVMVersion.txt hash

When trying to check out the hash in LLVMVersion.txt:

git checkout dca920a904c27f4c86e909ef2e4e343d48168cca

An error is received.

fatal: reference is not a tree: dca920a904c27f4c86e909ef2e4e343d48168cca

I think the issue is the commit was reverted.

x86 support

Just curious, any plans for 32bit support?
Seems like the calling convention is hard-coded in X86RegisterUtils.

Malformed LLVM module

Hello,

When raising an ELF generated with:

// foo.c
struct Foo {
    struct Foo *next;
};

struct Foo* __attribute__ ((noinline)) foo(struct Foo *f) {
    while (f->next != 0) {
        f = f->next;
    }
    return f;
}

int main() {
    return foo(10);
}

// compile
gcc -O3 foo.c -o foo

and the CFG is something like this:
[Figure: CFG of foo]

I've got the following result:

define dso_local i64 @foo(i64 %arg1) {
entry:
  %0 = inttoptr i64 %arg1 to i64*
  %memload = load i64, i64* %0, align 8
  %1 = and i64 %memload, %memload
  %highbit = and i64 -9223372036854775808, %1
  %SF = icmp ne i64 %highbit, 0
  %ZF = icmp eq i64 %1, 0
  %CmpZF_JNE = icmp eq i1 %ZF, false
  br i1 %CmpZF_JNE, label %entry, label %bb.1     ; so it may branch to entry

bb.1:                                             ; preds = %entry
  ret i64 %arg1
}

define dso_local i64 @main() {
entry:
  %0 = zext i32 10 to i64
  %RAX = tail call i64 @foo(i64 %0)
  ret i64 %RAX
}

While I believe that the result of @foo is not correct (the loop cannot exit since %memload is always the same), the module is not even valid LLVM IR, because there is a branch to the entry block in @foo.

The most naive fix that I can think of is to create an empty entry block before the real one, like this:

define dso_local i64 @foo(i64 %arg1) {
entry:
  br label %real_entry

real_entry:
  %0 = inttoptr i64 %arg1 to i64*
  %memload = load i64, i64* %0, align 8
  %1 = and i64 %memload, %memload
  %highbit = and i64 -9223372036854775808, %1
  %SF = icmp ne i64 %highbit, 0
  %ZF = icmp eq i64 %1, 0
  %CmpZF_JNE = icmp eq i1 %ZF, false
  br i1 %CmpZF_JNE, label %real_entry, label %bb.1

bb.1:                                             ; preds = %real_entry
  ret i64 %arg1
}

I'm still very new to the project and I don't know where I should start. Maybe in a "validation pass" which checks whether there is a branch to the entry block?

Thanks in advance.
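The check and the naive fix discussed in this issue can be sketched on a toy control-flow graph. This is an illustration only: struct Block, entry_has_predecessor and add_fresh_entry are hypothetical names and do not correspond to llvm-mctoll's or LLVM's data structures. The sketch relies on one fact about LLVM IR, namely that the verifier rejects a function whose entry block has predecessors.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SUCCS 2

/* Toy basic block: indices of up to two successors, -1 for an unused slot.
 * Block 0 is always the entry block. */
struct Block {
    int succ[MAX_SUCCS];
};

/* Validation: return 1 if any block (including the entry itself)
 * branches to the entry block. */
int entry_has_predecessor(const struct Block *blocks, size_t n) {
    assert(blocks != NULL);
    for (size_t i = 0; i < n; ++i)
        for (int s = 0; s < MAX_SUCCS; ++s)
            if (blocks[i].succ[s] == 0)
                return 1;
    return 0;
}

/* Fix: write a new CFG into out (which must have room for n + 1 blocks)
 * whose block 0 is an empty entry that jumps to the old entry, now block 1.
 * All old indices are shifted by one. Returns the new block count. */
size_t add_fresh_entry(const struct Block *in, size_t n, struct Block *out) {
    assert(in != NULL && out != NULL);
    out[0].succ[0] = 1;
    out[0].succ[1] = -1;
    for (size_t i = 0; i < n; ++i)
        for (int s = 0; s < MAX_SUCCS; ++s)
            out[i + 1].succ[s] = in[i].succ[s] < 0 ? -1 : in[i].succ[s] + 1;
    return n + 1;
}
```

For the raised @foo above, the toy CFG has two blocks: the entry, which branches to itself or to bb.1, and bb.1, which returns. The check flags it, and after add_fresh_entry the loop targets the old entry (the real_entry block in the proposed IR) instead of the new one.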

undefined reference to llvm::MachineInstr::dump

I tried building this project exactly as described in the README (I copied and pasted the 7 commands), but I got the following linker error:

[ 97%] Linking CXX executable ../../bin/llvm-mctoll
../../lib/libX86Raiser.a(X86MachineInstructionRaiser.cpp.o): In function `X86MachineInstructionRaiser::raiseDirectBranchMachineInstr(ControlTransferInfo*)':
X86MachineInstructionRaiser.cpp:(.text._ZN27X86MachineInstructionRaiser29raiseDirectBranchMachineInstrEP19ControlTransferInfo+0x724): undefined reference to `llvm::MachineInstr::dump() const'
X86MachineInstructionRaiser.cpp:(.text._ZN27X86MachineInstructionRaiser29raiseDirectBranchMachineInstrEP19ControlTransferInfo+0x909): undefined reference to `llvm::MachineInstr::dump() const'
../../lib/libX86Raiser.a(X86MachineInstructionRaiser.cpp.o): In function `X86MachineInstructionRaiser::raiseCompareMachineInstr(llvm::MachineInstr const&, llvm::BasicBlock*, bool, llvm::Value*)':
X86MachineInstructionRaiser.cpp:(.text._ZN27X86MachineInstructionRaiser24raiseCompareMachineInstrERKN4llvm12MachineInstrEPNS0_10BasicBlockEbPNS0_5ValueE+0xb8): undefined reference to `llvm::MachineInstr::dump() const'
X86MachineInstructionRaiser.cpp:(.text._ZN27X86MachineInstructionRaiser24raiseCompareMachineInstrERKN4llvm12MachineInstrEPNS0_10BasicBlockEbPNS0_5ValueE+0x185): undefined reference to `llvm::MachineInstr::dump() const'
../../lib/libX86Raiser.a(X86MachineInstructionRaiser.cpp.o): In function `X86MachineInstructionRaiser::raiseSetCCMachineInstr(llvm::MachineInstr const&, llvm::BasicBlock*)':
X86MachineInstructionRaiser.cpp:(.text._ZN27X86MachineInstructionRaiser22raiseSetCCMachineInstrERKN4llvm12MachineInstrEPNS0_10BasicBlockE+0x18a): undefined reference to `llvm::MachineInstr::dump() const'
../../lib/libX86Raiser.a(X86MachineInstructionRaiser.cpp.o):X86MachineInstructionRaiser.cpp:(.text._ZN27X86MachineInstructionRaiser33raiseBinaryOpImmToRegMachineInstrERKN4llvm12MachineInstrEPNS0_10BasicBlockE+0x1f4): more undefined references to `llvm::MachineInstr::dump() const' follow
CMakeFiles/llvm-mctoll.dir/MCInstRaiser.cpp.o: In function `MCInstRaiser::RaiseMCInst(llvm::MCInstrInfo const&, llvm::MachineFunction&, llvm::MCInst, unsigned long)':
MCInstRaiser.cpp:(.text._ZN12MCInstRaiser11RaiseMCInstERKN4llvm11MCInstrInfoERNS0_15MachineFunctionENS0_6MCInstEm+0x149): undefined reference to `llvm::MCOperand::dump() const'
CMakeFiles/llvm-mctoll.dir/MCInstRaiser.cpp.o: In function `MCInstRaiser::buildCFG(llvm::MachineFunction&, llvm::MCInstrAnalysis const*, llvm::MCInstrInfo const*)':
MCInstRaiser.cpp:(.text._ZN12MCInstRaiser8buildCFGERN4llvm15MachineFunctionEPKNS0_15MCInstrAnalysisEPKNS0_11MCInstrInfoE+0x5db): undefined reference to `llvm::MachineFunction::dump() const'
../../lib/libARMRaiser.a(ARMFunctionPrototype.cpp.o): In function `ARMFunctionPrototype::discover(llvm::MachineFunction&)':
ARMFunctionPrototype.cpp:(.text._ZN20ARMFunctionPrototype8discoverERN4llvm15MachineFunctionE+0x1bc): undefined reference to `llvm::MachineFunction::dump() const'
ARMFunctionPrototype.cpp:(.text._ZN20ARMFunctionPrototype8discoverERN4llvm15MachineFunctionE+0x1c4): undefined reference to `llvm::Value::dump() const'
../../lib/libARMRaiser.a(ARMEliminatePrologEpilog.cpp.o): In function `ARMEliminatePrologEpilog::eliminate()':
ARMEliminatePrologEpilog.cpp:(.text._ZN24ARMEliminatePrologEpilog9eliminateEv+0x95): undefined reference to `llvm::MachineFunction::dump() const'
ARMEliminatePrologEpilog.cpp:(.text._ZN24ARMEliminatePrologEpilog9eliminateEv+0x9e): undefined reference to `llvm::Value::dump() const'
collect2: error: ld returned 1 exit status
tools/llvm-mctoll/CMakeFiles/llvm-mctoll.dir/build.make:545: recipe for target 'bin/llvm-mctoll' failed
make[2]: *** [bin/llvm-mctoll] Error 1
CMakeFiles/Makefile2:62533: recipe for target 'tools/llvm-mctoll/CMakeFiles/llvm-mctoll.dir/all' failed
make[1]: *** [tools/llvm-mctoll/CMakeFiles/llvm-mctoll.dir/all] Error 2
Makefile:151: recipe for target 'all' failed
make: *** [all] Error 2

The important parts being:

undefined reference to `llvm::MachineInstr::dump() const'
undefined reference to `llvm::MachineFunction::dump() const'
undefined reference to `llvm::MCOperand::dump() const'

I believe the problem lies in step 6, "Run cmake command that you usually use to build llvm". LLVM has quite a lot of options and I may be missing some that are important.

For example, judging from llvm::MachineInstr::dump and LLVM_DUMP_METHOD, it looks like all dump methods are stripped in Release builds, so you probably need to build in Debug mode or add some other option to preserve them in the output libraries.

Can you please add more information to the README about which build arguments are needed in step 6 for a successful build?
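The mechanism behind this kind of linker error can be shown in a few lines of C. LLVM guards its dump() definitions with the preprocessor condition used below; everything else in the sketch (dump_value, trace_value, dump_enabled, DUMP_ENABLED) is a hypothetical stand-in and not LLVM or llvm-mctoll code. When NDEBUG is defined and LLVM_ENABLE_DUMP is not, the definition disappears, so any unguarded call to it fails at link time.

```c
#include <assert.h>
#include <stdio.h>

/* Same shape of condition that LLVM places around dump() definitions. */
#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
#define DUMP_ENABLED 1
/* Stand-in for a dump() method: only exists in this configuration. */
void dump_value(int v) { fprintf(stderr, "value = %d\n", v); }
#else
#define DUMP_ENABLED 0
#endif

/* A caller that must link in every configuration guards the call site
 * with the same condition as the definition. */
int trace_value(int v) {
#if DUMP_ENABLED
    dump_value(v);
#endif
    return v;
}

/* Report whether the dump helper was compiled in. */
int dump_enabled(void) { return DUMP_ENABLED; }
```

Compiling this file with -DNDEBUG removes dump_value, and adding -DLLVM_ENABLE_DUMP brings it back. In an LLVM build that macro is controlled by the LLVM_ENABLE_DUMP cmake option, which the Release configuration in the README sets.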

Add support to raise relocatable binaries (.o)

// test.c
#include <stdlib.h>

int test()
{
    void* a = malloc(8);
    return 0;
}

$ clang -target x86_64 -c test.c
$ llvm-mctoll -d test.o

Assertion failed: (F && "Unexpected null function pointer encountered"), function getCalledFunctionUsingTextReloc, file /llvm-project/llvm/tools/llvm-mctoll/MachineFunctionRaiser.cpp, line 86.
0 llvm-mctoll 0x00000001025d6a8c llvm::sys::PrintStackTrace(llvm::raw_ostream&) + 60
1 llvm-mctoll 0x00000001025d7009 PrintStackTraceSignalHandler(void*) + 25
2 llvm-mctoll 0x00000001025d4c46 llvm::sys::RunSignalHandlers() + 118
3 llvm-mctoll 0x00000001025dab0c SignalHandler(int) + 252
4 libsystem_platform.dylib 0x00007fff6478d42d _sigtramp + 29
5 llvm-mctoll 0x00000001039a77e8 DisableLazyLoading + 1038232
6 libsystem_c.dylib 0x00007fff64662a1c abort + 120
7 libsystem_c.dylib 0x00007fff64661cd6 err + 0
8 llvm-mctoll 0x000000010106b3f7 ModuleRaiser::getCalledFunctionUsingTextReloc(unsigned long long, unsigned long long) const + 407
9 llvm-mctoll 0x0000000102658153 X86MachineInstructionRaiser::getCalledFunction(llvm::MachineInstr const&) + 643
10 llvm-mctoll 0x0000000102656f99 X86MachineInstructionRaiser::getRaisedFunctionPrototype() + 3081
11 llvm-mctoll 0x000000010106b5ee ModuleRaiser::runMachineFunctionPasses() + 222
12 llvm-mctoll 0x000000010107f5d5 DisassembleObject(llvm::object::ObjectFile const*, bool) + 15813
13 llvm-mctoll 0x000000010107b507 DumpObject(llvm::object::ObjectFile*, llvm::object::Archive const*) + 583
14 llvm-mctoll 0x0000000101073342 DumpInput(llvm::StringRef) + 306
15 llvm-mctoll 0x00000001010731e9 void (std::__1::for_each<std::__1::__wrap_iter<std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator >>, void ()(llvm::StringRef)>(std::__1::__wrap_iter<std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator >>, std::__1::__wrap_iter<std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator >>, void ()(llvm::StringRef)))(llvm::StringRef) + 89
16 llvm-mctoll 0x0000000101072ec5 main + 997
17 libdyld.dylib 0x00007fff645947fd start + 1
Stack dump:
0. Program arguments: ./llvm-mctoll -d test.o
Abort trap: 6

Windows binaries support

Did some quick-and-dirty tests with Windows x64 executables: https://github.com/Trass3r/llvm-mctoll/commits/coff
It does reach the MachineInst level but after that all the assertions kick in.
Biggest changes are required in X86MachineInstructionRaiserUtils and the hard-coded calling convention registers in X86RegisterUtils.

Also requires SSE etc support for the startup code, standard C runtime etc.
Use cl -nologo -MT -GS- -GR- -EHs-c- test.cpp -link /opt:ref /ENTRY:main /NODEFAULTLIB with some simple code (no runtime) to generate a tiny executable for testing.

__declspec(noinline)
int sum(int *arr, int n)
{
  int sum = 0;
  for (int i = 0; i < n; ++i)
    sum += arr[i];
  return sum;
}

int main()
{
  int arr[] = {0, 1, 2, 3};
  return sum(arr, 4);
}

SSE2 support is also sort of required to use -O1 or -O2 as that's used by msvc even for trivial memcpy or init code.

IDA-Pro

I have no idea where I can download this decompiler. It seems that it is not free.

Test suite failures

The following assertion fails when trying to run even the test suite, on Ubuntu 18.04 under the Windows Subsystem for Linux:

llvm-mctoll/X86/X86ModuleRaiser.cpp:66: virtual bool X86ModuleRaiser::collectDynamicRelocations(): Assertion `(DotRelaDotPltShdr.get()->sh_info == DotGotDotPltSec.getIndex()) && ".rela.plt does not refer .got.plt section"' failed.

$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.1 LTS"

Several llvm-mctoll tests fail with Release build

[Tracker Issue]

How to repro:

  1. cmake -G "Ninja" -DCMAKE_INSTALL_PREFIX=/full/path/to/install/llvm /full/path/to/src/llvm -DLLVM_TARGETS_TO_BUILD="X86;ARM" -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_DUMP=ON -DLLVM_ENABLE_ASSERTIONS=ON
  2. ninja check-mctoll

This results in multiple test failures.

Error raised when running "ninja check-mctoll"

When running the command "ninja check-mctoll", I hit a problem that is different from Issue #53. I tried many things but none of them worked, so I have to ask you for help. The check does not pass: 12 unexpected failures occur, and the reason for each failure is as follows:
FAIL: mctoll :: smoke_test/ARM/factorial-test.c (36 of 145)
******************** TEST 'mctoll :: smoke_test/ARM/factorial-test.c' FAILED ********************
Script:

: 'RUN: at line 1'; /home/user02/CGCL/llvm-mctoll/llvm-project/build/bin/clang /home/user02/CGCL/llvm-mctoll/llvm-project/llvm/tools/llvm-mctoll/test/smoke_test/ARM/../Inputs/factorial.c -o /home/user02/CGCL/llvm-mctoll/llvm-project/build/tools/llvm-mctoll/test/smoke_test/ARM/Output/factorial-test.c.tmp.so --target=arm-linux-gnueabi -fuse-ld=lld -shared
: 'RUN: at line 2'; /home/user02/CGCL/llvm-mctoll/llvm-project/build/bin/llvm-mctoll -d /home/user02/CGCL/llvm-mctoll/llvm-project/build/tools/llvm-mctoll/test/smoke_test/ARM/Output/factorial-test.c.tmp.so
: 'RUN: at line 3'; /home/user02/CGCL/llvm-mctoll/llvm-project/build/bin/clang -o /home/user02/CGCL/llvm-mctoll/llvm-project/build/tools/llvm-mctoll/test/smoke_test/ARM/Output/factorial-test.c.tmp1 /home/user02/CGCL/llvm-mctoll/llvm-project/llvm/tools/llvm-mctoll/test/smoke_test/ARM/factorial-test.c /home/user02/CGCL/llvm-mctoll/llvm-project/build/tools/llvm-mctoll/test/smoke_test/ARM/Output/factorial-test.c.tmp-dis.ll -mx32
: 'RUN: at line 4'; /home/user02/CGCL/llvm-mctoll/llvm-project/build/tools/llvm-mctoll/test/smoke_test/ARM/Output/factorial-test.c.tmp1 2>&1 | /home/user02/CGCL/llvm-mctoll/llvm-project/build/bin/FileCheck /home/user02/CGCL/llvm-mctoll/llvm-project/llvm/tools/llvm-mctoll/test/smoke_test/ARM/factorial-test.c

Exit Code: 1

Command Output (stderr):

ld.lld: error: cannot open crti.o: No such file or directory
ld.lld: error: cannot open crtbeginS.o: No such file or directory
ld.lld: error: unable to find library -lgcc
ld.lld: error: unable to find library -lgcc_s
ld.lld: error: unable to find library -lc
ld.lld: error: unable to find library -lgcc
ld.lld: error: unable to find library -lgcc_s
ld.lld: error: cannot open crtendS.o: No such file or directory
ld.lld: error: cannot open crtn.o: No such file or directory
clang-11: error: linker command failed with exit code 1 (use -v to see invocation)

I think this is an ld.lld problem, but I don't know how to solve it, and I don't know whether it will influence the results of llvm-mctoll. Any help would be appreciated! Thanks a lot!

external functions

For C++ code the most important change is probably:

diff --git a/ExternalFunctions.cpp b/ExternalFunctions.cpp
index 9bdf6fb..8cfcd8c 100644
--- a/ExternalFunctions.cpp
+++ b/ExternalFunctions.cpp
@@ -29,7 +29,10 @@ const std::map<StringRef, ExternalFunctions::RetAndArgs>
         {"puts", {"i32", {"i8*"}, false}},
         {"free", {"void", {"i8*"}, false}},
         {"atoi", {"i32", {"i8*"}, false}},
-        {"exit", {"void", {"i32"}, false}}};
+        {"exit", {"void", {"i32"}, false}},
+        {"_Znwm", {"i8*", {"i64"}, false}}, // new
+        {"_ZdlPv", {"void", {"i8*"}, false}} // delete
+        };
 
 // Construct and return a Function* corresponding to a known external function
 Function *ExternalFunctions::Create(StringRef &CFuncName, ModuleRaiser &MR) {

I'd suggest adding that for the time being.

But in general that approach does not scale of course (different compilers, name mangling, calling conventions, architecture, missing llvm attributes, parameter names, ...).
I've seen it before in fcd. Then they switched to parsing a header file with libclang, which is better but pulls in the whole clang dependency.
I imagine relying on a separately generated bitcode file with all the declarations, like declare i8* @_Znwm(i64 %size), could be a clean solution.
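To make the scaling concern concrete, here is a sketch of the hard-coded table approach in C. It mirrors only the entries visible in the diff above; struct Proto, known_protos and find_proto are hypothetical names, not the actual ExternalFunctions implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One known external function: return type, argument types (NULL-padded),
 * and whether it takes variable arguments. Types are LLVM IR type names. */
struct Proto {
    const char *name;
    const char *ret;
    const char *args[2];
    int is_vararg;
};

/* Entries taken from the diff above; every new symbol needs a new row. */
static const struct Proto known_protos[] = {
    {"puts",   "i32",  {"i8*", NULL}, 0},
    {"free",   "void", {"i8*", NULL}, 0},
    {"atoi",   "i32",  {"i8*", NULL}, 0},
    {"exit",   "void", {"i32", NULL}, 0},
    {"_Znwm",  "i8*",  {"i64", NULL}, 0}, /* operator new(unsigned long) */
    {"_ZdlPv", "void", {"i8*", NULL}, 0}, /* operator delete(void*) */
};

/* Linear lookup by symbol name; returns NULL for unknown symbols. */
const struct Proto *find_proto(const char *name) {
    assert(name != NULL);
    for (size_t i = 0; i < sizeof known_protos / sizeof known_protos[0]; ++i)
        if (strcmp(known_protos[i].name, name) == 0)
            return &known_protos[i];
    return NULL;
}
```

Any symbol missing from the table, for example _Znam (operator new[]) or the same operators under a different mangling scheme, makes find_proto return NULL. That is the maintenance burden a generated declarations file would avoid.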

Tried to use llvm-mctoll but it fails

$llvm-mctoll -d a.out 
llvm-mctoll: /XXXX/llvm_git/llvm/tools/llvm-mctoll/MachineFunctionRaiser.cpp:264: bool ModuleRaiser::collectDynamicRelocations(): Assertion `(DotRelaDotPltShdr.get()->sh_info == DotGotDotPltSec.getIndex()) && ".rela.plt does not refer .got.plt section"' failed.
LLVMSymbolizer: error reading file: No such file or directory
#0 0x00000000030a5181 (llvm-mctoll+0x30a5181)
#1 0x00000000030a5212 (llvm-mctoll+0x30a5212)
#2 0x00000000030a324d (llvm-mctoll+0x30a324d)
#3 0x00000000030a4c25 (llvm-mctoll+0x30a4c25)
#4 0x00007f04bf91e5e0 __restore_rt (/lib64/libpthread.so.0+0xf5e0)
#5 0x00007f04be5131f7 __GI_raise (/lib64/libc.so.6+0x351f7)
#6 0x00007f04be5148e8 __GI_abort (/lib64/libc.so.6+0x368e8)
#7 0x00007f04be50c266 __assert_fail_base (/lib64/libc.so.6+0x2e266)
#8 0x00007f04be50c312 (/lib64/libc.so.6+0x2e312)
#9 0x000000000049a285 (llvm-mctoll+0x49a285)
#10 0x000000000041c00f (llvm-mctoll+0x41c00f)
#11 0x000000000040c451 (llvm-mctoll+0x40c451)
#12 0x0000000000411938 (llvm-mctoll+0x411938)
#13 0x00000000004120e6 (llvm-mctoll+0x4120e6)
#14 0x000000000042653d (llvm-mctoll+0x42653d)
#15 0x000000000041292b (llvm-mctoll+0x41292b)
#16 0x00007f04be4ffc05 __libc_start_main (/lib64/libc.so.6+0x21c05)
#17 0x0000000000407a39 (llvm-mctoll+0x407a39)
Stack dump:
0.	Program arguments: llvm-mctoll -d a.out 
Aborted

I cloned the llvm, clang and mctoll sources today and made sure they are all at the newest master, but it still aborts with this crash. It really hurts.

a.out is just a compiled hello world program.
gcc version 4.8.5, CentOS 7

cat a.out | base64

You can decode the output below and test with it.


f0VMRgIBAQAAAAAAAAAAAAIAPgABAAAATARAAAAAAABAAAAAAAAAAFgRAAAAAAAAAAAAAEAAOAAJ
AEAAHAAbAAYAAAAFAAAAQAAAAAAAAABAAEAAAAAAAEAAQAAAAAAA+AEAAAAAAAD4AQAAAAAAAAgA
AAAAAAAAAwAAAAQAAAA4AgAAAAAAADgCQAAAAAAAOAJAAAAAAAAcAAAAAAAAABwAAAAAAAAAAQAA
AAAAAAABAAAABQAAAAAAAAAAAAAAAABAAAAAAAAAAEAAAAAAAAQHAAAAAAAABAcAAAAAAAAAACAA
AAAAAAEAAAAGAAAAEA4AAAAAAAAQDmAAAAAAABAOYAAAAAAAJAIAAAAAAAAoAgAAAAAAAAAAIAAA
AAAAAgAAAAYAAAAoDgAAAAAAACgOYAAAAAAAKA5gAAAAAADQAQAAAAAAANABAAAAAAAACAAAAAAA
AAAEAAAABAAAAFQCAAAAAAAAVAJAAAAAAABUAkAAAAAAAEQAAAAAAAAARAAAAAAAAAAEAAAAAAAA
AFDldGQEAAAA4AUAAAAAAADgBUAAAAAAAOAFQAAAAAAANAAAAAAAAAA0AAAAAAAAAAQAAAAAAAAA
UeV0ZAYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAEAAAAAAAAABS
5XRkBAAAABAOAAAAAAAAEA5gAAAAAAAQDmAAAAAAAPABAAAAAAAA8AEAAAAAAAABAAAAAAAAAC9s
aWI2NC9sZC1saW51eC14ODYtNjQuc28uMgAEAAAAEAAAAAEAAABHTlUAAAAAAAIAAAAGAAAAIAAA
AAQAAAAUAAAAAwAAAEdOVQDk1RMblav545i47YeKgWlKQpwKRgEAAAABAAAAAQAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACwAAABIAAAAAAAAAAAAAAAAAAAAA
AAAAEAAAABIAAAAAAAAAAAAAAAAAAAAAAAAAIgAAACAAAAAAAAAAAAAAAAAAAAAAAAAAAGxpYmMu
c28uNgBwdXRzAF9fbGliY19zdGFydF9tYWluAF9fZ21vbl9zdGFydF9fAEdMSUJDXzIuMi41AAAA
AAIAAgAAAAAAAQABAAEAAAAQAAAAAAAAAHUaaQkAAAIAMQAAAAAAAAD4D2AAAAAAAAYAAAADAAAA
AAAAAAAAAAAYEGAAAAAAAAcAAAABAAAAAAAAAAAAAAAgEGAAAAAAAAcAAAACAAAAAAAAAAAAAAAo
EGAAAAAAAAcAAAADAAAAAAAAAAAAAABIg+wISIsFDQwgAEiFwHQF6DsAAABIg8QIwwAAAAAAAP81
AgwgAP8lBAwgAA8fQAD/JQIMIABoAAAAAOng/////yX6CyAAaAEAAADp0P////8l8gsgAGgCAAAA
6cD///+/0AVAAOnG////ZpAx7UmJ0V5IieJIg+TwUFRJx8CwBUAASMfBQAVAAEjHx0AEQADoq///
//RmkA8fhAAAAAAAuD8QYABVSC04EGAASIP4DkiJ5XcCXcO4AAAAAEiFwHT0Xb84EGAA/+APH4AA
AAAAuDgQYABVSC04EGAASMH4A0iJ5UiJwkjB6j9IAdBI0fh1Al3DugAAAABIhdJ09F1Iica/OBBg
AP/iDx+AAAAAAIA9PQsgAAB1EVVIieXofv///13GBSoLIAAB88MPH0AASIM9CAkgAAB0HrgAAAAA
SIXAdBRVvyAOYABIieX/0F3pe////w8fAOlz////Dx8AQVdBif9BVkmJ9kFVSYnVQVRMjSW4CCAA
VUiNLbgIIABTTCnlMdtIwf0DSIPsCOht/v//SIXtdB4PH4QAAAAAAEyJ6kyJ9kSJ/0H/FNxIg8MB
SDnrdepIg8QIW11BXEFdQV5BX8NmZi4PH4QAAAAAAPPDZpBIg+wISIPECMMAAAABAAIAAAAAAAAA
AAAAAAAAaGVsbG8gd29ybGQhAAAAAAEbAzs0AAAABQAAACD+//+AAAAAYP7//6gAAABs/v//UAAA
AGD////AAAAA0P///wgBAAAAAAAAFAAAAAAAAAABelIAAXgQARsMBwiQAQcQFAAAABwAAAAU/v//
KgAAAAAAAAAAAAAAFAAAAAAAAAABelIAAXgQARsMBwiQAQAAJAAAABwAAACY/f//QAAAAAAOEEYO
GEoPC3cIgAA/GjsqMyQiAAAAABQAAABEAAAAsP3//woAAAAAAAAAAAAAAEQAAABcAAAAmP7//2UA
AAAAQg4QjwJFDhiOA0UOII0ERQ4ojAVIDjCGBkgOOIMHTQ5AbA44QQ4wQQ4oQg4gQg4YQg4QQg4I
ABQAAACkAAAAwP7//wIAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAEAVAAAAAAADwBEAAAAAAAAAAAAAAAAAAAQAAAAAAAAABAAAAAAAAAAwAAAAAAAAA
4ANAAAAAAAANAAAAAAAAALQFQAAAAAAAGQAAAAAAAAAQDmAAAAAAABsAAAAAAAAACAAAAAAAAAAa
AAAAAAAAABgOYAAAAAAAHAAAAAAAAAAIAAAAAAAAAPX+/28AAAAAmAJAAAAAAAAFAAAAAAAAABgD
QAAAAAAABgAAAAAAAAC4AkAAAAAAAAoAAAAAAAAAPQAAAAAAAAALAAAAAAAAABgAAAAAAAAAFQAA
AAAAAAAAAAAAAAAAAAMAAAAAAAAAABBgAAAAAAACAAAAAAAAAEgAAAAAAAAAFAAAAAAAAAAHAAAA
AAAAABcAAAAAAAAAmANAAAAAAAAHAAAAAAAAAIADQAAAAAAACAAAAAAAAAAYAAAAAAAAAAkAAAAA
AAAAGAAAAAAAAAD+//9vAAAAAGADQAAAAAAA////bwAAAAABAAAAAAAAAPD//28AAAAAVgNAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACgOYAAAAAAA
AAAAAAAAAAAAAAAAAAAAABYEQAAAAAAAJgRAAAAAAAA2BEAAAAAAAAAAAABHQ0M6IChHTlUpIDQu
OC41IDIwMTUwNjIzIChSZWQgSGF0IDQuOC41LTQpAAAuc2hzdHJ0YWIALmludGVycAAubm90ZS5B
QkktdGFnAC5ub3RlLmdudS5idWlsZC1pZAAuZ251Lmhhc2gALmR5bnN5bQAuZHluc3RyAC5nbnUu
dmVyc2lvbgAuZ251LnZlcnNpb25fcgAucmVsYS5keW4ALnJlbGEucGx0AC5pbml0AC50ZXh0AC5m
aW5pAC5yb2RhdGEALmVoX2ZyYW1lX2hkcgAuZWhfZnJhbWUALmluaXRfYXJyYXkALmZpbmlfYXJy
YXkALmpjcgAuZHluYW1pYwAuZ290AC5nb3QucGx0AC5kYXRhAC5ic3MALmNvbW1lbnQAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAsAAAABAAAAAgAAAAAAAAA4AkAAAAAAADgCAAAAAAAAHAAAAAAAAAAAAAAAAAAAAAEAAAAAAAAA
AAAAAAAAAAATAAAABwAAAAIAAAAAAAAAVAJAAAAAAABUAgAAAAAAACAAAAAAAAAAAAAAAAAAAAAE
AAAAAAAAAAAAAAAAAAAAIQAAAAcAAAACAAAAAAAAAHQCQAAAAAAAdAIAAAAAAAAkAAAAAAAAAAAA
AAAAAAAABAAAAAAAAAAAAAAAAAAAADQAAAD2//9vAgAAAAAAAACYAkAAAAAAAJgCAAAAAAAAHAAA
AAAAAAAFAAAAAAAAAAgAAAAAAAAAAAAAAAAAAAA+AAAACwAAAAIAAAAAAAAAuAJAAAAAAAC4AgAA
AAAAAGAAAAAAAAAABgAAAAEAAAAIAAAAAAAAABgAAAAAAAAARgAAAAMAAAACAAAAAAAAABgDQAAA
AAAAGAMAAAAAAAA9AAAAAAAAAAAAAAAAAAAAAQAAAAAAAAAAAAAAAAAAAE4AAAD///9vAgAAAAAA
AABWA0AAAAAAAFYDAAAAAAAACAAAAAAAAAAFAAAAAAAAAAIAAAAAAAAAAgAAAAAAAABbAAAA/v//
bwIAAAAAAAAAYANAAAAAAABgAwAAAAAAACAAAAAAAAAABgAAAAEAAAAIAAAAAAAAAAAAAAAAAAAA
agAAAAQAAAACAAAAAAAAAIADQAAAAAAAgAMAAAAAAAAYAAAAAAAAAAUAAAAAAAAACAAAAAAAAAAY
AAAAAAAAAHQAAAAEAAAAAgAAAAAAAACYA0AAAAAAAJgDAAAAAAAASAAAAAAAAAAFAAAADAAAAAgA
AAAAAAAAGAAAAAAAAAB+AAAAAQAAAAYAAAAAAAAA4ANAAAAAAADgAwAAAAAAABoAAAAAAAAAAAAA
AAAAAAAEAAAAAAAAAAAAAAAAAAAAeQAAAAEAAAAGAAAAAAAAAAAEQAAAAAAAAAQAAAAAAABAAAAA
AAAAAAAAAAAAAAAAEAAAAAAAAAAQAAAAAAAAAIQAAAABAAAABgAAAAAAAABABEAAAAAAAEAEAAAA
AAAAdAEAAAAAAAAAAAAAAAAAABAAAAAAAAAAAAAAAAAAAACKAAAAAQAAAAYAAAAAAAAAtAVAAAAA
AAC0BQAAAAAAAAkAAAAAAAAAAAAAAAAAAAAEAAAAAAAAAAAAAAAAAAAAkAAAAAEAAAACAAAAAAAA
AMAFQAAAAAAAwAUAAAAAAAAdAAAAAAAAAAAAAAAAAAAACAAAAAAAAAAAAAAAAAAAAJgAAAABAAAA
AgAAAAAAAADgBUAAAAAAAOAFAAAAAAAANAAAAAAAAAAAAAAAAAAAAAQAAAAAAAAAAAAAAAAAAACm
AAAAAQAAAAIAAAAAAAAAGAZAAAAAAAAYBgAAAAAAAOwAAAAAAAAAAAAAAAAAAAAIAAAAAAAAAAAA
AAAAAAAAsAAAAA4AAAADAAAAAAAAABAOYAAAAAAAEA4AAAAAAAAIAAAAAAAAAAAAAAAAAAAACAAA
AAAAAAAAAAAAAAAAALwAAAAPAAAAAwAAAAAAAAAYDmAAAAAAABgOAAAAAAAACAAAAAAAAAAAAAAA
AAAAAAgAAAAAAAAAAAAAAAAAAADIAAAAAQAAAAMAAAAAAAAAIA5gAAAAAAAgDgAAAAAAAAgAAAAA
AAAAAAAAAAAAAAAIAAAAAAAAAAAAAAAAAAAAzQAAAAYAAAADAAAAAAAAACgOYAAAAAAAKA4AAAAA
AADQAQAAAAAAAAYAAAAAAAAACAAAAAAAAAAQAAAAAAAAANYAAAABAAAAAwAAAAAAAAD4D2AAAAAA
APgPAAAAAAAACAAAAAAAAAAAAAAAAAAAAAgAAAAAAAAACAAAAAAAAADbAAAAAQAAAAMAAAAAAAAA
ABBgAAAAAAAAEAAAAAAAADAAAAAAAAAAAAAAAAAAAAAIAAAAAAAAAAgAAAAAAAAA5AAAAAEAAAAD
AAAAAAAAADAQYAAAAAAAMBAAAAAAAAAEAAAAAAAAAAAAAAAAAAAABAAAAAAAAAAAAAAAAAAAAOoA
AAAIAAAAAwAAAAAAAAA0EGAAAAAAADQQAAAAAAAABAAAAAAAAAAAAAAAAAAAAAQAAAAAAAAAAAAA
AAAAAADvAAAAAQAAADAAAAAAAAAAAAAAAAAAAAA0EAAAAAAAACwAAAAAAAAAAAAAAAAAAAABAAAA
AAAAAAEAAAAAAAAAAQAAAAMAAAAAAAAAAAAAAAAAAAAAAAAAYBAAAAAAAAD4AAAAAAAAAAAAAAAA
AAAAAQAAAAAAAAAAAAAAAAAAAA==


Failed to get effective address value (coremark)

This assertion comes up with the latest mctoll and coremark source from GitHub:

bool X86MachineInstructionRaiser::raiseLEAMachineInstr(const llvm::MachineInstr&): Assertion `(EffectiveAddrValue != nullptr) && "Failed to get effective address value"' failed.

Error raised when running "ninja check-mctoll"

It seems to be a clang-related issue. I have looked it up but couldn't find a solution that really works. (My clang version is 6.0.0.) Any help would be appreciated!

[0/1] Running the llvm-mctoll tests
FAIL: mctoll :: smoke_test/test-empty-mbb.c (79 of 96)
******************** TEST 'mctoll :: smoke_test/test-empty-mbb.c' FAILED ********************
Script:
--
: 'RUN: at line 3';   /home/lol/research/json/llvm-project/build/bin/clang -o /home/lol/research/json/llvm-project/build/tools/llvm-mctoll/test/smoke_test/Output/test-empty-mbb.c.tmp-ld-opt /home/lol/research/json/llvm-project/llvm/tools/llvm-mctoll/test/smoke_test/test-empty-mbb.c -O2 -mno-sse -fuse-ld=ld
: 'RUN: at line 4';   /home/lol/research/json/llvm-project/build/bin/llvm-mctoll -d /home/lol/research/json/llvm-project/build/tools/llvm-mctoll/test/smoke_test/Output/test-empty-mbb.c.tmp-ld-opt
: 'RUN: at line 5';   /home/lol/research/json/llvm-project/build/bin/clang -o /home/lol/research/json/llvm-project/build/tools/llvm-mctoll/test/smoke_test/Output/test-empty-mbb.c.tmp-ld-opt-dis /home/lol/research/json/llvm-project/build/tools/llvm-mctoll/test/smoke_test/Output/test-empty-mbb.c.tmp-ld-opt-dis.ll
: 'RUN: at line 6';   /home/lol/research/json/llvm-project/build/tools/llvm-mctoll/test/smoke_test/Output/test-empty-mbb.c.tmp-ld-opt-dis 2>&1 | /home/lol/research/json/llvm-project/build/bin/FileCheck /home/lol/research/json/llvm-project/llvm/tools/llvm-mctoll/test/smoke_test/test-empty-mbb.c -check-prefix=CHECK-LD
: 'RUN: at line 14';   /home/lol/research/json/llvm-project/build/bin/clang -o /home/lol/research/json/llvm-project/build/tools/llvm-mctoll/test/smoke_test/Output/test-empty-mbb.c.tmp-lld-opt /home/lol/research/json/llvm-project/llvm/tools/llvm-mctoll/test/smoke_test/test-empty-mbb.c -O2 -mno-sse -fuse-ld=lld
: 'RUN: at line 15';   /home/lol/research/json/llvm-project/build/bin/llvm-mctoll -d /home/lol/research/json/llvm-project/build/tools/llvm-mctoll/test/smoke_test/Output/test-empty-mbb.c.tmp-lld-opt
: 'RUN: at line 16';   /home/lol/research/json/llvm-project/build/bin/clang -o /home/lol/research/json/llvm-project/build/tools/llvm-mctoll/test/smoke_test/Output/test-empty-mbb.c.tmp-lld-opt-dis /home/lol/research/json/llvm-project/build/tools/llvm-mctoll/test/smoke_test/Output/test-empty-mbb.c.tmp-lld-opt-dis.ll
: 'RUN: at line 17';   /home/lol/research/json/llvm-project/build/tools/llvm-mctoll/test/smoke_test/Output/test-empty-mbb.c.tmp-lld-opt-dis 2>&1 | /home/lol/research/json/llvm-project/build/bin/FileCheck /home/lol/research/json/llvm-project/llvm/tools/llvm-mctoll/test/smoke_test/test-empty-mbb.c -check-prefix=CHECK-LLD
--
Exit Code: 1

Command Output (stderr):
--
warning: overriding the module target triple with x86_64-unknown-linux-gnu [-Woverride-module]
1 warning generated.
clang-10: error: invalid linker name in argument '-fuse-ld=lld'

--

********************

Testing Time: 68.53s
********************
Failing Tests (1):
    mctoll :: smoke_test/test-empty-mbb.c

  Expected Passes    : 95
  Unexpected Failures: 1
FAILED: tools/llvm-mctoll/test/CMakeFiles/check-mctoll 
cd /home/lol/research/json/llvm-project/build/tools/llvm-mctoll/test && /usr/bin/python3.7 /home/lol/research/json/llvm-project/build/./bin/llvm-lit -sv --param llvm_site_config=/home/lol/research/json/llvm-project/build/tools/llvm-mctoll/test/lit.site.cfg --param llvm_unit_site_config=/home/lol/research/json/llvm-project/build/tools/llvm-mctoll/test/Unit/lit.site.cfg /home/lol/research/json/llvm-project/build/tools/llvm-mctoll/test
ninja: build stopped: subcommand failed.

ARM Target dependency

With -DLLVM_TARGETS_TO_BUILD=X86 I get the following error. It might be good to make the build more modular:

MachineFunctionRaiser.cpp.obj : error LNK2019: unresolved external symbol InitializeARMMachineInstructionRaiser referenced in function "private: void __cdecl MachineFunctionRaiser::init(unsigned __int64,unsigned __int64)" (?init@MachineFunctionRaiser@@AEAAX_K0@Z)

Function pointer support

// test.c
void (*func_call)(int, int);
void __attribute__((noinline)) call_test(int op1, int op2) {
}

void __attribute__((noinline)) call_me(int a, int b) {
    (*func_call)(a, b);
}

int main(int argc, char **argv) {
    func_call = call_test;
    call_me(4, 5);
    return 0;
}

Assertion failed: (false && "Unhandled call or branch found during function " "prototype discovery"), function getRaisedFunctionPrototype, file llvm-mctoll/X86/X86FuncPrototypeDiscovery.cpp, line 455.

better loop support

Loops usually generate some inc/dec instructions instead of add when -Os is used:

int sum(int* arr, int n) {
	int sum = 0;
	for (int i = 0; i < n; ++i)
		sum += arr[i];
	return sum;
}

SSE2

This is really necessary, since x86-64 compilers use SSE2 by default.
At least the move operations should be implemented for #39 and #38.

Build error after API change

Hi guys,

Just wanted to notify you that a (minor) build error happens with latest versions of LLVM/Clang:
llvm-mctoll/src/llvm/tools/llvm-mctoll/MachODump.cpp:6832:65: error: no matching function... « llvm::DIContext::getLineInfoForAddress(uint64_t&) 

Kind regards,
plowsec

Building Integration Tests

The test_suite directory is automatically generated!

All modifications should happen to test_suite_generator

I don't know what this means, and I could not find the test_suite_generator directory.
How is the test_suite directory generated automatically?

Building llvm-mctoll on Windows

Following the discussion in #30, I thought it best to open a new issue to discuss support for compiling and running on Windows, as well as support for the Windows PE/COFF format.

Bug in raising division instructions.

I have been using llvm-mctoll for my project, which involves raising binaries to IR. I get an error when I try to raise a binary generated from C code that performs a division on a variable. It works fine if both operands are constants.
For example, the following code snippet works fine:

int main() {
   int a = 4/2;
   return 0;
}

But, there seems to be an issue with variable operands.
As an example:

int main() {
  int a = 10;
  a = a/2;
  return 0;
}

While raising the binary, the following error occurs:

*** Generic instruction not raised :   IDIV32r $ecx, <0x557f28d346e8>, implicit-def $eax, implicit-def $edx, implicit-def $eflags, implicit $eax, implicit $edx

Please let me know if this is not implemented or there is an issue from my side.
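
For reference, the instruction in the error message (IDIV32r) divides the 64-bit signed value held in EDX:EAX by its operand, leaving the quotient in EAX and the remainder in EDX. The sketch below models that behaviour in C. The names `idiv32_result` and `model_idiv32` are hypothetical and are not part of llvm-mctoll; divide-by-zero and quotient overflow are deliberately left out.

```c
#include <stdint.h>

/* Result of a 32-bit signed divide: quotient goes to EAX, remainder to EDX. */
typedef struct {
    int32_t quotient;   /* value left in EAX */
    int32_t remainder;  /* value left in EDX */
} idiv32_result;

/* Hypothetical model of IDIV r/m32: divides the signed 64-bit value EDX:EAX
 * by `divisor`. Divide-by-zero and quotient overflow (which fault on real
 * hardware) are not modelled, so callers must avoid them. */
idiv32_result model_idiv32(int32_t edx, uint32_t eax, int32_t divisor) {
    /* Build the 64-bit dividend from the high (EDX) and low (EAX) halves. */
    uint64_t bits = ((uint64_t)(uint32_t)edx << 32) | eax;
    int64_t dividend = (int64_t)bits;

    idiv32_result r;
    /* C99 division truncates toward zero, matching IDIV. */
    r.quotient = (int32_t)(dividend / divisor);
    r.remainder = (int32_t)(dividend % divisor);
    return r;
}
```

For `a = a/2` with `a = 10`, the compiler sign-extends EAX into EDX first, so the call corresponds to `model_idiv32(0, 10, 2)`.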

when run occurs error

Hi,
when I try to generate LLVM IR for a binary, it causes the following errors:
anna@ubuntu:/llvm/build/llvm/bin$ gcc hello.c
anna@ubuntu:/llvm/build/llvm/bin$ ./llvm-mctoll -d a.out
llvm-mctoll: /home/anna/llvm/src/llvm/tools/llvm-mctoll/MachineFunctionRaiser.cpp:264: bool ModuleRaiser::collectDynamicRelocations(): Assertion `(DotRelaDotPltShdr.get()->sh_info == DotGotDotPltSec.getIndex()) && ".rela.plt does not refer .got.plt section"' failed.
./llvm-mctoll[0x3088eb3]
./llvm-mctoll[0x3088f44]
./llvm-mctoll[0x3086eb1]
./llvm-mctoll[0x3088958]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x10340)[0x7f38a07f8340]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x39)[0x7f389fc38f79]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x148)[0x7f389fc3c388]
/lib/x86_64-linux-gnu/libc.so.6(+0x2fe36)[0x7f389fc31e36]
/lib/x86_64-linux-gnu/libc.so.6(+0x2fee2)[0x7f389fc31ee2]
./llvm-mctoll[0x499d52]
./llvm-mctoll[0x41c4bb]
./llvm-mctoll[0x40c5a2]
./llvm-mctoll[0x411d43]
./llvm-mctoll[0x412545]
./llvm-mctoll[0x426acb]
./llvm-mctoll[0x412daa]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f389fc23ec5]
./llvm-mctoll[0x407a89]
Stack dump:
0. Program arguments: ./llvm-mctoll -d a.out
Aborted (core dumped)

Also, the OS is "Linux ubuntu 3.13.0-32-generic #57-Ubuntu SMP Tue Jul 15 03:51:08 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux"

I don't know where the problem is. Could you give me some suggestions?
Best regards

Mctoll crashes when running raising ARM binary

Running mctoll on this program:

long g1;

__attribute__((noinline)) int func(int a, char b, short c, long d, long e, int f, int g) {
    int temp = (a + f) * (int) c - (int) e;
    int temp2 = temp + (int) (b * b);
    int tf = (d + g) * 3 + temp + temp2;

    return b + tf + temp2;
}

int a1, f1;
long e1;
char b1;
short c1;
long d1;

int main(int argc, char **argv) {
    return (int) func (a1, b1, c1, d1, e1, f1, g1);
}

Compiled with arm-linux-gnueabi-gcc version 7.5 (-Os) gives this crash and backtrace:

llvm-mctoll: /home/collison/Raiser/llvm-project/llvm/include/llvm/ADT/ilist_iterator.h:138: llvm::ilist_iterator<OptionsT, IsReverse, IsConst>::reference llvm::ilist_iterator<OptionsT, IsReverse, IsConst>::operator*() const [with OptionsT = llvm::ilist_detail::node_options<llvm::MachineBasicBlock, true, false, void>; bool IsReverse = true; bool IsConst = false; llvm::ilist_iterator<OptionsT, IsReverse, IsConst>::reference = llvm::MachineBasicBlock&]: Assertion `!NodePtr->isKnownSentinel()' failed.
#0 0x0000563df73c5955 llvm::sys::PrintStackTrace(llvm::raw_ostream&) /home/collison/Raiser/llvm-project/llvm/lib/Support/Unix/Signals.inc:564:0
#1 0x0000563df73c59e8 PrintStackTraceSignalHandler(void*) /home/collison/Raiser/llvm-project/llvm/lib/Support/Unix/Signals.inc:625:0
#2 0x0000563df73c3779 llvm::sys::RunSignalHandlers() /home/collison/Raiser/llvm-project/llvm/lib/Support/Signals.cpp:68:0
#3 0x0000563df73c52d2 SignalHandler(int) /home/collison/Raiser/llvm-project/llvm/lib/Support/Unix/Signals.inc:406:0
#4 0x00007f4a3b385890 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x12890)
#5 0x00007f4a3a681e97 gsignal /build/glibc-OTsEL5/glibc-2.27/signal/../sysdeps/unix/sysv/linux/raise.c:51:0
#6 0x00007f4a3a683801 abort /build/glibc-OTsEL5/glibc-2.27/stdlib/abort.c:81:0
#7 0x00007f4a3a67339a __assert_fail_base /build/glibc-OTsEL5/glibc-2.27/assert/assert.c:89:0
#8 0x00007f4a3a673412 (/lib/x86_64-linux-gnu/libc.so.6+0x30412)
#9 0x0000563df60fec2d llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::MachineBasicBlock, true, false, void>, true, false>::operator*() const /home/collison/Raiser/llvm-project/llvm/include/llvm/ADT/ilist_iterator.h:139:0
#10 0x0000563df60fdd81 llvm::simple_ilist<llvm::MachineBasicBlock>::back() /home/collison/Raiser/llvm-project/llvm/include/llvm/ADT/simple_ilist.h:140:0
#11 0x0000563df60fd19a llvm::MachineFunction::back() /home/collison/Raiser/llvm-project/llvm/include/llvm/CodeGen/MachineFunction.h:726:0
#12 0x0000563df60fabdb MCInstRaiser::buildCFG(llvm::MachineFunction&, llvm::MCInstrAnalysis const*, llvm::MCInstrInfo const*) /home/collison/Raiser/llvm-project/llvm/tools/llvm-mctoll/MCInstRaiser.cpp:118:0
#13 0x0000563df60f62c5 ModuleRaiser::runMachineFunctionPasses() /home/collison/Raiser/llvm-project/llvm/tools/llvm-mctoll/MachineFunctionRaiser.cpp:109:0
#14 0x0000563df60657ad DisassembleObject(llvm::object::ObjectFile const*, bool) /home/collison/Raiser/llvm-project/llvm/tools/llvm-mctoll/llvm-mctoll.cpp:1443:0
#15 0x0000563df60675b4 DumpObject(llvm::object::ObjectFile*, llvm::object::Archive const*) /home/collison/Raiser/llvm-project/llvm/tools/llvm-mctoll/llvm-mctoll.cpp:1734:0
#16 0x0000563df6067bdf DumpInput(llvm::StringRef) /home/collison/Raiser/llvm-project/llvm/tools/llvm-mctoll/llvm-mctoll.cpp:1794:0
#17 0x0000563df607a214 void (std::for_each<__gnu_cxx::__normal_iterator<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, std::vector<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, std::allocator<std::__cxx11::basic_string<char, std::char_traits, std::allocator > > > >, void ()(llvm::StringRef)>(__gnu_cxx::__normal_iterator<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, std::vector<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, std::allocator<std::__cxx11::basic_string<char, std::char_traits, std::allocator > > > >, __gnu_cxx::__normal_iterator<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, std::vector<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, std::allocator<std::__cxx11::basic_string<char, std::char_traits, std::allocator > > > >, void ()(llvm::StringRef)))(llvm::StringRef) /usr/include/c++/7/bits/stl_algo.h:3883:0
#18 0x0000563df6067fb3 main /home/collison/Raiser/llvm-project/llvm/tools/llvm-mctoll/llvm-mctoll.cpp:1843:0
#19 0x00007f4a3a664b97 __libc_start_main /build/glibc-OTsEL5/glibc-2.27/csu/../csu/libc-start.c:344:0
#20 0x0000563df605faea _start (../../../../../build/bin/llvm-mctoll+0x30daea)

NOTE: The X86 build compiles but raises the binary incorrectly due to the number of arguments. It appears that the X86 raiser does not handle more than four parameters.
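
As background for the argument-count observation: the System V AMD64 ABI passes the first six integer or pointer arguments in registers, while the 32-bit ARM procedure call standard passes the first four word-sized arguments in r0 to r3, so the seven-argument `func` above spills to the stack on both targets. The following sketch tabulates this. The function names are hypothetical, and it only covers arguments that fit in a single integer register.

```c
#include <stddef.h>

/* Integer argument registers, in order, for the System V AMD64 ABI. */
static const char *const SYSV_AMD64_INT_REGS[] = {
    "rdi", "rsi", "rdx", "rcx", "r8", "r9"
};

/* Integer argument registers, in order, for the 32-bit ARM AAPCS. */
static const char *const AAPCS32_INT_REGS[] = {"r0", "r1", "r2", "r3"};

/* Hypothetical helper: where the index-th (0-based) integer argument lives
 * on x86-64 Linux. Anything past the register set is passed on the stack. */
const char *sysv_amd64_arg_location(size_t index) {
    size_t count = sizeof SYSV_AMD64_INT_REGS / sizeof SYSV_AMD64_INT_REGS[0];
    return index < count ? SYSV_AMD64_INT_REGS[index] : "stack";
}

/* Hypothetical helper: same question for word-sized arguments on ARM32. */
const char *aapcs32_arg_location(size_t index) {
    size_t count = sizeof AAPCS32_INT_REGS / sizeof AAPCS32_INT_REGS[0];
    return index < count ? AAPCS32_INT_REGS[index] : "stack";
}
```

In the program above, argument `g` (index 6) is stack-passed on both targets, and `e`, `f`, and `g` (indices 4 to 6) are stack-passed on ARM32.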

Raising simple arm binary err

I believe I have built llvm-mctoll successfully, and I am trying to raise a simple arm32 binary.

I ran into some problems along the way and would appreciate some help.

$ cat test.c   # the binary source file
#include <stdio.h>
char g_buffer[100];
int main(int argc, char **argv)
{
    int val_0;
    int val_1 = 1;
    int val_2 = 2;
    int val_sum  = val_0 + val_1 + val_2;

    g_buffer[0] = 10;
    printf("run ret %d \n", val_sum);
    return val_sum;
}

$ arm-linux-gnueabihf-gcc -marm test.c  # Thumb instructions do not seem to be supported; they cause a "cannot decode instruction" error when raising
$ llvm-mctoll -d --print-after-all a.out

; a.out:	file format ELF32-arm-little

Disassembling section 

Function call_weak_fn:

Function main:
Parsed MCInst List
0x10b64
0x0030
0x10b90
0x10b8c
0x10b40
0x0028
0x10b64
0x10b60
0x10b14
0x003c
0x10b2e
0x10ae4
0x0024
0x10b1a
0x10b0e
Generated CFG
ARMFunctionPrototype start.
ARMFunctionPrototype end.
Parsed MCInst List
0x10a8c
0x002c
0x0070
0x1091a
0x10910
Generated CFG
ARMFunctionPrototype start.
ARMFunctionPrototype end.
ARMEliminatePrologEpilog start.
ARMEliminatePrologEpilog end.
ARMEliminatePrologEpilog start.
ARMEliminatePrologEpilog end.
$ cat a.out-dis.ll 
; ModuleID = 'a.out'
source_filename = "a.out"

declare void @call_weak_fn()

declare i32 @main(i32, i32)

The a.out-dis.ll file is apparently missing the definition of main and the declaration of printf.

I also tried raising the x86 version built from the same test.c.
It succeeds:

$ llvm-mctoll -d x86_a.out && cat x86_a.out-dis.ll 
; ModuleID = 'x86_a.out'
source_filename = "x86_a.out"

@g_buffer = common dso_local global [100 x i8] zeroinitializer, align 32
@RO-String = private constant [13 x i8] c"run ret %d \0A\00", align 1

declare dso_local i32 @printf(i8*, ...)

define dso_local i32 @main(i32 %arg1, i64 %arg2) {
entry:
  %0 = alloca i64, align 8
  %StackAdj = alloca i64, align 8
  %1 = alloca i32, align 4
  %2 = alloca i32, align 4
  %3 = alloca i32, align 4
  %4 = alloca i32, align 4
  %5 = alloca i32, align 4
  %6 = ptrtoint i64* %0 to i64
  store i32 %arg1, i32* %1, align 4
  store i64 %arg2, i64* %StackAdj, align 8
  store i32 1, i32* %2, align 4
  store i32 2, i32* %3, align 4
  %7 = load i32, i32* %4, align 4
  %8 = load i32, i32* %2, align 4
  %9 = add nsw i32 %7, %8
  %10 = shl i32 1, 31
  %11 = and i32 %10, %9
  %SF = icmp eq i32 %11, %10
  %ZF = icmp eq i32 %9, 0
  %12 = load i32, i32* %3, align 4
  %13 = add nsw i32 %12, %9
  %14 = shl i32 1, 31
  %15 = and i32 %14, %13
  %SF1 = icmp eq i32 %15, %14
  %ZF2 = icmp eq i32 %13, 0
  store i32 %13, i32* %5, align 4
  %16 = getelementptr inbounds [100 x i8], [100 x i8]* @g_buffer, i32 0, i32 0
  store i8 10, i8* %16, align 1
  %17 = load i32, i32* %5, align 4
  %18 = ptrtoint [13 x i8]* @RO-String to i64
  %19 = inttoptr i64 %18 to i8*
  %20 = call i32 (i8*, ...) @printf(i8* %19, i32 %17, i32 %9)
  %21 = load i32, i32* %5, align 4
  ret i32 %21
}

Questions:

  1. How can I fix the arm32 binary raising problem?
  2. Does llvm-mctoll not support Thumb instructions yet?

Incorrect LLVM IR generated from 32-bit ARM binary in thumb mode

As I was trying to lift a 32-bit ARM binary (in thumb mode) with mctoll, the generated LLVM IR only contained function declarations as shown below:

; ModuleID = 'raceflight_REVO.elf'
source_filename = "raceflight_REVO.elf"
declare i32 @taskMainPidLoopCheck()
declare i32 @taskUpdateRxMain()
..........
declare i32 @__errno()

My guess:
My guess is that mctoll does not take the instruction-set mode of the ARM binary into consideration, i.e. it assumes every input ARM binary is in ARM mode. By convention, in a Thumb-mode ARM binary the least significant bit of a function address or branch destination is 1 (it is 0 in ARM mode).

For example:

Info about the testing binary:

ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           ARM
  Version:                           0x1
  Entry point address:               0x8018a3d
  Start of program headers:          52 (bytes into file)
  Start of section headers:          898248 (bytes into file)
  Flags:                             0x5000402, Version5 EABI, hard-float ABI, <unknown>
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         5
  Size of section headers:           40 (bytes)
  Number of section headers:         20
  Section header string table index: 17

As we see, the entry point address is 0x8018a3d (the last digit is 1) even if it appears as 0x8018a3c (the last digit is 0) in IDA Pro as shown below:

[Image: IDA Pro view of the entry point]

If mctoll disassembles the binary without considering Thumb mode, it may treat 0x8018a3d (and every other function start address whose least significant bit is 1) as a disassembly failure and discard it. In the end, only function declarations are left in the LLVM IR.
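
The convention described above can be sketched in a few lines of C. This is only an illustration of the address encoding, not code from mctoll, and both function names are hypothetical: bit 0 of the address selects Thumb mode, and the real instruction address is obtained by clearing it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when bit 0 marks the target as Thumb code. */
bool is_thumb_address(uint32_t addr) {
    return (addr & 1u) != 0;
}

/* Hypothetical helper: the address at which decoding should actually start,
 * i.e. the symbol or entry address with the Thumb marker bit cleared. */
uint32_t instruction_address(uint32_t addr) {
    return addr & ~UINT32_C(1);
}
```

For the entry point in the ELF header above, `is_thumb_address(0x8018a3d)` is true and `instruction_address(0x8018a3d)` yields 0x8018a3c, the address IDA Pro displays.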

For testing, here is the binary I used as input:
raceflight_REVO.zip

aborted for linux x64 binary (hello, world)

First of all, thanks for your efforts!
However, llvm-mctoll aborted when translating a Linux x64 program (hello).

hello.c
#include <stdio.h>
int main(int argc, char** argv)
{
    printf("is me !\n");
}

build:
gcc -o main main.c

aborted log:
./llvm-mctoll -d -print-after-all ~/Downloads/main

; /home/steward/Downloads/main: file format ELF64-x86-64

Disassembling section .text

Function main:
Running buildCFG

Machine code for function main: TracksLiveness

bb.0:
PUSH64r $rbp, <0x56348d44d4f8>, implicit-def $rsp, implicit $rsp
$rbp = MOV64rr $rsp, <0x56348d45c428>
$rsp = SUB64ri8 $rsp(tied-def 0), 16, <0x56348d45c528>, implicit-def $eflags
MOV32mr $rbp, 1, $noreg, -4, $noreg, $edi, <0x56348d45c648>
MOV64mr $rbp, 1, $noreg, -16, $noreg, $rsi, <0x56348d45c768>
$rdi = LEA64r $rip, 1, $noreg, 158, $noreg, <0x56348d45c888>
CALL64pcrel32 -363, <0x56348d45c9a8>, implicit $rsp, implicit $ssp
$eax = MOV32ri 0, <0x56348d45cac8>
LEAVE64 <0x56348d45cbe8>, implicit-def $rbp, implicit-def $rsp, implicit $rbp, implicit $rsp
RETQ <0x56348d45cd08>
NOOPW $rax, 1, $rax, 0, $cs, <0x56348d45de38>
NOOPL $rax, 1, $noreg, 0, $noreg, <0x56348d45df58>

End machine code for function main.

llvm-mctoll: /home/steward/Downloads/src/llvm/tools/llvm-mctoll/X86/X86MachineInstructionRaiser.cpp:1291: llvm::Value* X86MachineInstructionRaiser::getMemoryAddressExprValue(const llvm::MachineInstr&, llvm::BasicBlock*): Assertion `((BaseReg == X86::NoRegister) && (IndexReg == X86::NoRegister) && (ScaleAmt == 1)) && "Unhandled addressing mode in memory addr expression calculation"' failed.
./llvm-mctoll(+0x2f952ca)[0x5634896562ca]
./llvm-mctoll(+0x2f9535d)[0x56348965635d]
./llvm-mctoll(+0x2f93349)[0x563489654349]
./llvm-mctoll(+0x2f94d68)[0x563489655d68]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x110c0)[0x7f6202e2f0c0]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcf)[0x7f62019c4fff]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x16a)[0x7f62019c642a]
/lib/x86_64-linux-gnu/libc.so.6(+0x2be67)[0x7f62019bde67]
/lib/x86_64-linux-gnu/libc.so.6(+0x2bf12)[0x7f62019bdf12]
./llvm-mctoll(+0x15d81db)[0x563487c991db]
./llvm-mctoll(+0x15daba5)[0x563487c9bba5]
./llvm-mctoll(+0x15e0e9f)[0x563487ca1e9f]
./llvm-mctoll(+0x15e22b7)[0x563487ca32b7]
./llvm-mctoll(+0x15e2779)[0x563487ca3779]
./llvm-mctoll(+0x15e27ee)[0x563487ca37ee]
./llvm-mctoll(+0x46b010)[0x563486b2c010]
./llvm-mctoll(+0x46b759)[0x563486b2c759]
./llvm-mctoll(+0x3e15b5)[0x563486aa25b5]
./llvm-mctoll(+0x3e38a6)[0x563486aa48a6]
./llvm-mctoll(+0x3e4053)[0x563486aa5053]
./llvm-mctoll(+0x3f844e)[0x563486ab944e]
./llvm-mctoll(+0x3e48a4)[0x563486aa58a4]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f62019b22e1]
./llvm-mctoll(+0x3d999a)[0x563486a9a99a]
Stack dump:
0. Program arguments: ./llvm-mctoll -d -print-after-all /home/steward/Downloads/main
Aborted

Question 1: How can I fix this issue?
Question 2: Is it possible to support Win32 x86?

Thanks

Raising Go binary - Assertion failed: No called function prototype found while determining return type

$ cat > hello.go <<EOF
package main
import "fmt"
func main() {
        fmt.Println("Hello, world!")
}
EOF
$ go version && go build
go version go1.13 linux/amd64
$ ~/src/llvm-project/build/bin/llvm-mctoll -d helloworld 
llvm-mctoll: /home/xaionaro/src/llvm-project/llvm/tools/llvm-mctoll/X86/X86FuncPrototypeDiscovery.cpp:612: llvm::Type* X86MachineInstructionRaiser::getReturnTypeFromMBB(const llvm::MachineBasicBlock&, bool&): Assertion `(CalledFunc != nullptr) && "No called function prototype found while determining return type"' failed.
 #0 0x000000000102061a llvm::sys::PrintStackTrace(llvm::raw_ostream&) (/home/xaionaro/src/llvm-project/build/bin/llvm-mctoll+0x102061a)
 #1 0x000000000101e604 llvm::sys::RunSignalHandlers() (/home/xaionaro/src/llvm-project/build/bin/llvm-mctoll+0x101e604)
 #2 0x000000000101eb0d SignalHandler(int) (/home/xaionaro/src/llvm-project/build/bin/llvm-mctoll+0x101eb0d)
 #3 0x00007fdcbaf00b20 __restore_rt (/lib64/libpthread.so.0+0x14b20)
 #4 0x00007fdcba9a7625 raise (/lib64/libc.so.6+0x3c625)
 #5 0x00007fdcba9908d9 abort (/lib64/libc.so.6+0x258d9)
 #6 0x00007fdcba9907a9 _nl_load_domain.cold (/lib64/libc.so.6+0x257a9)
 #7 0x00007fdcba99fa66 (/lib64/libc.so.6+0x34a66)
 #8 0x0000000001046b3c X86MachineInstructionRaiser::getReturnTypeFromMBB(llvm::MachineBasicBlock const&, bool&) (/home/xaionaro/src/llvm-project/build/bin/llvm-mctoll+0x1046b3c)
 #9 0x0000000001046b8e X86MachineInstructionRaiser::getReachingReturnType(llvm::MachineBasicBlock const&) (/home/xaionaro/src/llvm-project/build/bin/llvm-mctoll+0x1046b8e)
#10 0x0000000001047d25 X86MachineInstructionRaiser::getFunctionReturnType() (/home/xaionaro/src/llvm-project/build/bin/llvm-mctoll+0x1047d25)
#11 0x0000000001048d78 X86MachineInstructionRaiser::getRaisedFunctionPrototype() (/home/xaionaro/src/llvm-project/build/bin/llvm-mctoll+0x1048d78)
#12 0x00000000004896ba ModuleRaiser::runMachineFunctionPasses() (/home/xaionaro/src/llvm-project/build/bin/llvm-mctoll+0x4896ba)
#13 0x000000000044fbc6 DisassembleObject(llvm::object::ObjectFile const*, bool) (.constprop.0) (/home/xaionaro/src/llvm-project/build/bin/llvm-mctoll+0x44fbc6)
#14 0x000000000040c5a4 main (/home/xaionaro/src/llvm-project/build/bin/llvm-mctoll+0x40c5a4)
#15 0x00007fdcba9921a3 __libc_start_main (/lib64/libc.so.6+0x271a3)
#16 0x000000000043cede _start (/home/xaionaro/src/llvm-project/build/bin/llvm-mctoll+0x43cede)
Stack dump:
0.      Program arguments: /home/xaionaro/src/llvm-project/build/bin/llvm-mctoll -d helloworld 
Aborted (core dumped)
$ ~/src/llvm-project/build/bin/llvm-mctoll --version
LLVM (http://llvm.org/):
  LLVM version 11.0.0git
  Optimized build with assertions.
  Default target: x86_64-unknown-linux-gnu
  Host CPU: skylake

  Registered Targets:
    arm     - ARM
    armeb   - ARM (big endian)
    thumb   - Thumb
    thumbeb - Thumb (big endian)
    x86     - 32-bit X86: Pentium-Pro and above
    x86-64  - 64-bit X86: EM64T and AMD64

cmov from memory

int foo(bool b, int* arr)
{
	int res = 0;
	__asm__(R"(
		test  %1, %1
		cmove %2, %0
	)"
	: "+r"(res)
	: "r"(b), "m"(arr[1])
	: "cc"
	);
	return res;
}

Came up in #38.

Run llvm-mctoll occurs error

Hi, when I try to generate LLVM IR for ls, it causes the following errors:
(ubuntu 16.04 x64 LTS)

build/llvm/bin 
➜ ./llvm-mctoll -d ~/Desktop/ls 
*** WARNING Out of range target not added.
*** WARNING Out of range target not added.
*** WARNING Out of range target not added.
*** WARNING Out of range target not added.
*** WARNING Out of range target not added.
*** WARNING Out of range target not added.
*** WARNING Out of range target not added.
*** WARNING Out of range target not added.
*** WARNING Out of range target not added.
*** WARNING Out of range target not added.
*** WARNING Out of range target not added.
*** WARNING Out of range target not added.
*** WARNING Out of range target not added.
*** WARNING Out of range target not added.
*** WARNING Out of range target not added.
*** WARNING Out of range target not added.
*** WARNING Out of range target not added.
*** WARNING Out of range target not added.
./llvm-mctoll[0x30b2ae1]
./llvm-mctoll[0x30b2b72]
./llvm-mctoll[0x30b0adf]
./llvm-mctoll[0x30b2586]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7fa669234390]
./llvm-mctoll[0x4ad5aa]
./llvm-mctoll[0x15e41ed]
./llvm-mctoll[0x15e45d3]
./llvm-mctoll[0x4a67fe]
./llvm-mctoll[0x41cdb8]
./llvm-mctoll[0x41f243]
./llvm-mctoll[0x41fa45]
./llvm-mctoll[0x433fcb]
./llvm-mctoll[0x4202aa]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7fa6685d8830]
./llvm-mctoll[0x414f89]
Stack dump:
0.	Program arguments: ./llvm-mctoll -d /home/sk/Desktop/ls 
[1]    27055 segmentation fault (core dumped)  ./llvm-mctoll -d ~/Desktop/ls

And here is the binary's info:

build/llvm/bin 
➜ file ~/Desktop/ls
/home/sk/Desktop/ls: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=d0bc0fb9b3f60f72bbad3c5a1d24c9e2a1fde775, stripped

when make occurs errors

Hi.
I tried to build the tool to translate ARM to LLVM IR. But when I run the sixth step, "make llvm-mctoll", it fails with the following errors:
"collect2: error: ld returned 1 exit status
make[2]: *** [bin/llvm-mctoll] error 1
make[1]: ***[tools/llvm-mctoll/CMakeFiles/llvm-mctoll.dir/all] error 2
make : ***[all] error 2
"
I don't know where the problem is. Could you give me some suggestions?
Best regards

vtable assignment

With the #33 patch it's possible to lift a simple C++ program involving vtables.

// clang++ -fno-exceptions -fno-rtti -Os -mno-sse vtable.cpp -o vtable
#include <stdint.h>
struct Base
{
	virtual ~Base() = default;
	__attribute__((noinline))
	int publicfoo(uint64_t a, uint64_t b, uint64_t c)
	{
		int d = 5; //foo(a, b);
		return d + 5;
	}
private:
	virtual int foo(uint64_t a, uint64_t b) { return 0; }
	int a;
};

int main()
{
	Base b;
	return 3 + b.publicfoo(1,2,3);
}

lea    rdi,[rsp+0x8]
mov    QWORD PTR [rdi],0x402018
mov    esi,0x1
mov    edx,0x2
mov    ecx,0x3
call   Base::publicfoo

  %0 = alloca i64, i32 2, align 8
  %1 = ptrtoint i64* %0 to i64
  %2 = inttoptr i64 %1 to i32*
  store i32 4202520, i32* %2, align 8

  1. It gets the assignment wrong, storing 32 bits instead of the 64 bits required by QWORD PTR.
  2. It does not recognize the link to the .rodata section.
  3. (minor) It does not recognize the publicfoo parameters correctly (would require inspecting the callsite according to the calling convention).
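
To illustrate why the store width in point 1 matters, here is a small C sketch, independent of mctoll and using hypothetical helper names. Writing only 32 bits into a 64-bit vtable-pointer slot leaves the other four bytes untouched, so the slot only holds 0x402018 (decimal 4202520) if those bytes happened to be zero already.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: emulate `store i32` into a 64-bit slot.
 * Only four of the slot's eight bytes are overwritten. */
uint64_t store_low32(uint64_t slot, uint32_t value) {
    memcpy(&slot, &value, sizeof value);
    return slot;
}

/* Hypothetical helper: emulate the `mov QWORD PTR` from the disassembly.
 * All eight bytes of the slot are overwritten. */
uint64_t store_full64(uint64_t slot, uint64_t value) {
    memcpy(&slot, &value, sizeof value);
    return slot;
}
```

Starting from a slot filled with stale bytes such as 0xAAAAAAAAAAAAAAAA, `store_full64` produces exactly 0x402018, whereas `store_low32` leaves half of the stale bytes in place and therefore yields an invalid vtable pointer.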

But other than that it works fine and produces nice output, kudos!

Support for __stack_chk_fail

Hello all!

On certain Linux distributions, gcc enables stack protection by default. This makes the compiler insert canaries and call the __stack_chk_fail function to detect overflows. llvm-mctoll fails to raise even basic binaries compiled with -fstack-protector. I guess one could add handling in ExternalFunctions.cpp or add a note to the README to inform the user. I would be interested in hearing your thoughts on fixing this.
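
A small reproducer may help when discussing this. The sketch below is an assumption about a convenient test case, not something taken from the llvm-mctoll test suite; the file name `canary.c` and the function `copy_name` are hypothetical. A function with a local character buffer is the typical trigger for canary insertion, and gcc's `-fno-stack-protector` flag disables it, which may serve as a workaround until the raiser handles __stack_chk_fail.

```c
#include <stdio.h>
#include <string.h>

/* Build twice and compare the resulting binaries, for example:
 *   gcc -fstack-protector-all canary.c -o with_canary
 *   gcc -fno-stack-protector  canary.c -o without_canary
 * The first build is expected to reference __stack_chk_fail. */

/* Copies `src` into a fixed-size local buffer (truncating if needed) and
 * returns the length of the stored string. The local array is what makes
 * the compiler guard this frame with a canary. */
size_t copy_name(const char *src) {
    char buf[16];
    snprintf(buf, sizeof buf, "%s", src);
    return strlen(buf);
}
```

Calling `copy_name` from a `main` function and raising both binaries should show whether the stack-protector code is the only obstacle.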

llvm 9.0 doesn't have setCategory function?

I built this project with the latest llvm/clang (9.0) and got an error:

class llvm::cl::Option has no member named setCategory

in file link

I checked the LLVM 9.0 source code and found that there is a function named addCategory instead (link). I replaced setCategory with addCategory, and it worked.

Failed to decode instruction on armeabi-v7a shared library

I'm hoping to use llvm-mctoll to rebuild a legacy Android armeabi-v7a binary library for 64-bit support. Not sure if this is really an expected goal of the project or not?

I've built the current master with cmake -G "Ninja" -DCMAKE_INSTALL_PREFIX=/mnt/c/Users/anl/llvm-mctoll/install/llvm /mnt/c/Users/anl/llvm-mctoll/src/llvm/ -DLLVM_TARGETS_TO_BUILD="ARM", which appears to have built correctly.

Trying to run it against the native library however results in a lot of warnings

./llvm-mctoll/build/llvm/bin/llvm-mctoll -d libproxy.so
**** Warning: Failed to decode instruction
    c1c4:       d4 ff ff ff <unknown>
**** Warning: Failed to decode instruction
    c920:       d4 fa ff ff <unknown>
**** Warning: Failed to decode instruction
    c924:       e4 fb ff ff <unknown>
**** Warning: Failed to decode instruction
    c928:       50 fb ff ff <unknown>
**** Warning: Failed to decode instruction
    c92c:       58 fa ff ff <unknown>
**** Warning: Failed to decode instruction
    d87c:       14 f9 ff ff <unknown>
**** Warning: Failed to decode instruction
    d884:       4c f8 ff ff <unknown>

< A couple of pages of this >

**** Warning: Failed to decode instruction
   385a4:       78 ec ff ff <unknown>
**** Warning : Index 1dd8 not found
**** Warning : Index 319d8 not found

FWIW this is the library I'm hoping to convert/rebuild: https://gitlab.com/alelec/navdy/alelec_navdy_client/blob/master/src/main/jniLibs/armeabi-v7a/libproxy.so

Thanks.

dead code elimination

Looks like a final DCE pass should be added?
Hello world becomes:

define dso_local i32 @main(i32 %arg1) {
entry:
  %0 = alloca i64, align 8
  %1 = bitcast [10 x i8]* @RO-String to i8*
  %2 = call i32 (i8*, ...) @printf(i8* %1, i32 %arg1)
  ret i32 0
}

Even the paper shows quite a few dead llvm instructions.

Run llvm-mctoll occurs error

Hi, I tested llvm-mctoll on the Juliet database (Juliet Test Suite for C/C++)
https://samate.nist.gov/SARD/testsuite.php
but I get an error message when I run llvm-mctoll on a binary:

./llvm-mctoll ~/test/mctoll/double_pointer_18_v2/CWE457
llvm-mctoll: /home/vmware/src/llvm-project/llvm/tools/llvm-mctoll/X86/X86MachineInstructionRaiser.cpp:645: unsigned int X86MachineInstructionRaiser::find64BitSuperReg(unsigned int): Assertion `SuperRegFound && "Super register not found"' failed.

Do you know what "Assertion `SuperRegFound && "Super register not found"' failed." means and how to fix it?
I tested it on Ubuntu 16 and 18 and got the same result.

I also tested on the dyninst and ls binaries.
They produced different error messages and also failed to convert to IR:

./llvm-mctoll ~/dyninst/bin/cfg_to_dot
llvm-mctoll: /home/vmware/src/llvm-project/llvm/tools/llvm-mctoll/X86/X86MachineInstructionRaiser.cpp:1425: llvm::Value* X86MachineInstructionRaiser::getMemoryAddressExprValue(const llvm::MachineInstr&): Assertion `MI.getOperand(MemoryRefOpIndex + X86::AddrSegmentReg).getReg() == X86::NoRegister && "Expect no segment register"' failed.

./llvm-mctoll -d ~/Desktop/ls
llvm-mctoll: /home/vmware/src/llvm-project/llvm/tools/llvm-mctoll/X86/X86JumpTables.cpp:221: bool X86MachineInstructionRaiser::raiseMachineJumpTable(): Assertion `(JmpTblBaseCalcMBB.pred_size() == 1) && "Expect a single predecessor during jump table discovery"' failed.

Mode that outputs to stdout

The suggestion from another issue is to be able to output to stdout in order to pipe to other tools.

$ llvm-mctoll -d a.out > opt ...

Build error after recent commits

I tried to build llvm/mctoll, but it failed with the following error:

[100%] Building CXX object tools/llvm-mctoll/CMakeFiles/llvm-mctoll.dir/llvm-mctoll.cpp.o
/home/harshit/Documents/BinaryObfuscation/new_llvm-mctoll/src/llvm/tools/llvm-mctoll/llvm-mctoll.cpp: In function ‘void DisassembleObject(const llvm::object::ObjectFile*, bool)’:
/home/harshit/Documents/BinaryObfuscation/new_llvm-mctoll/src/llvm/tools/llvm-mctoll/llvm-mctoll.cpp:1105:74: error: invalid conversion from ‘std::unique_ptr<llvm::TargetMachine>::pointer {aka llvm::TargetMachine*}’ to ‘const llvm::LLVMTargetMachine*’ [-fpermissive]
   MachineModuleInfo *machineModuleInfo = new MachineModuleInfo(Target.get());
                                                                ~~~~~~~~~~^~
In file included from /home/harshit/Documents/BinaryObfuscation/new_llvm-mctoll/src/llvm/tools/llvm-mctoll/ModuleRaiser.h:18:0,
                 from /home/harshit/Documents/BinaryObfuscation/new_llvm-mctoll/src/llvm/tools/llvm-mctoll/MachineInstructionRaiser.h:20,
                 from /home/harshit/Documents/BinaryObfuscation/new_llvm-mctoll/src/llvm/tools/llvm-mctoll/MachineFunctionRaiser.h:19,
                 from /home/harshit/Documents/BinaryObfuscation/new_llvm-mctoll/src/llvm/tools/llvm-mctoll/llvm-mctoll.cpp:14:
/home/harshit/Documents/BinaryObfuscation/new_llvm-mctoll/src/llvm/include/llvm/CodeGen/MachineModuleInfo.h:148:12: note:   initializing argument 1 of ‘llvm::MachineModuleInfo::MachineModuleInfo(const llvm::LLVMTargetMachine*)’
   explicit MachineModuleInfo(const LLVMTargetMachine *TM = nullptr);
            ^~~~~~~~~~~~~~~~~
tools/llvm-mctoll/CMakeFiles/llvm-mctoll.dir/build.make:62: recipe for target 'tools/llvm-mctoll/CMakeFiles/llvm-mctoll.dir/llvm-mctoll.cpp.o' failed
make[3]: *** [tools/llvm-mctoll/CMakeFiles/llvm-mctoll.dir/llvm-mctoll.cpp.o] Error 1
CMakeFiles/Makefile2:62164: recipe for target 'tools/llvm-mctoll/CMakeFiles/llvm-mctoll.dir/all' failed
make[2]: *** [tools/llvm-mctoll/CMakeFiles/llvm-mctoll.dir/all] Error 2
CMakeFiles/Makefile2:62176: recipe for target 'tools/llvm-mctoll/CMakeFiles/llvm-mctoll.dir/rule' failed
make[1]: *** [tools/llvm-mctoll/CMakeFiles/llvm-mctoll.dir/rule] Error 2
Makefile:14681: recipe for target 'llvm-mctoll' failed
make: *** [llvm-mctoll] Error 2

It used to build fine earlier.
I think the problem might have been introduced while adding support for raising idiv with a register source operand in X86MachineInstructionRaiser.cpp.
Thanks,
Harshit

Conditional branch to target MachineBasicBlock failed

Assembly from X86 coremark:

  400d31: 49 8b 5d 00                   movq    (%r13), %rbx
  400d35: 48 85 db                      testq   %rbx, %rbx
  400d38: 74 21                         je      33 <core_bench_list+0x59b>
  400d3a: 66 0f 1f 44 00 00             nopw    (%rax,%rax)
  400d40: 49 8b 45 08                   movq    8(%r13), %rax
  400d44: 0f bf 38                      movswl  (%rax), %edi
  400d47: 41 0f b7 f7                   movzwl  %r15w, %esi
  400d4b: e8 c0 29 00 00                callq   10688 <crc16>
  400d50: 41 89 c7                      movl    %eax, %r15d
  400d53: 48 8b 1b                      movq    (%rbx), %rbx
  400d56: 48 85 db                      testq   %rbx, %rbx
  400d59: 75 e5                         jne     -27 <core_bench_list+0x580>
  400d5b: 44 89 f8                      movl    %r15d, %eax
  400d5e: 48 83 c4 38                   addq    $56, %rsp
  400d62: 5b                            popq    %rbx
  400d63: 41 5c                         popq    %r12
  400d65: 41 5d                         popq    %r13
  400d67: 41 5e                         popq    %r14
  400d69: 41 5f                         popq    %r15
  400d6b: 5d                            popq    %rbp
  400d6c: c3                            retq

Dump from llvm-mctoll:

bb.113:
; predecessors: %bb.80
  successors: %bb.116, %bb.114, %bb.115

  $rbx = MOV64rm $r13, 1, $noreg, 0, $noreg, <0x55e9aeab55c8>
  TEST64rr $rbx, $rbx, <0x55e9aeab56e8>, implicit-def $eflags
  JCC_1 33, 4, <0x55e9aeab5808>, implicit $eflags

bb.114:
; predecessors: %bb.113


bb.115:
; predecessors: %bb.115, %bb.113
  successors: %bb.115, %bb.116

  $rax = MOV64rm $r13, 1, $noreg, 8, $noreg, <0x55e9aeab5ba8>
  $edi = MOVSX32rm16 $rax, 1, $noreg, 0, $noreg, <0x55e9aeab5cc8>
  $esi = MOVZX32rr16 $r15w, <0x55e9aeab5de8>
  CALL64pcrel32 10688, <0x55e9aeab5f08>, implicit $rsp, implicit $ssp
  $r15d = MOV32rr $eax, <0x55e9aeab6028>
  $rbx = MOV64rm $rbx, 1, $noreg, 0, $noreg, <0x55e9aeab6148>
  TEST64rr $rbx, $rbx, <0x55e9aeab6268>, implicit-def $eflags
  JCC_1 -27, 5, <0x55e9aeab6388>, implicit $eflags

The else-branch target MBB of the JE conditional branch is bb.114, but bb.114 is empty. The branch should go to bb.115 instead.
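
The branch targets in the listing above can be checked by hand: an x86 relative branch lands at the address of the next instruction plus the signed displacement. Below is a minimal sketch of that arithmetic, assuming the 2-byte short encodings shown in the listing (74 21 and 75 e5). The helper name branchTarget is hypothetical and is not an llvm-mctoll function.

```cpp
#include <cstdint>

// Hypothetical helper (not part of llvm-mctoll): target of an x86 relative
// branch = address of the next instruction + signed displacement.
constexpr std::uint64_t branchTarget(std::uint64_t InstrAddr,
                                     unsigned InstrSize, std::int64_t Disp) {
  // Unsigned wrap-around makes negative displacements subtract correctly.
  return InstrAddr + InstrSize + static_cast<std::uint64_t>(Disp);
}

// je 33 at 0x400d38 (bytes 74 21): taken target is the movl after the loop.
static_assert(branchTarget(0x400d38, 2, 33) == 0x400d5b, "je target");

// jne -27 at 0x400d59 (bytes 75 e5): loops back to movq 8(%r13), %rax.
static_assert(branchTarget(0x400d59, 2, -27) == 0x400d40, "jne target");
```

So the taken target of JE is 0x400d5b, and its fall-through is the nopw at 0x400d3a, which is immediately followed by 0x400d40, the address JNE loops back to.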

Push Imm

Sorry, but are Push Imm instructions not translated (raised)?

PUSH64i32 8049862, <0x199bb0eb708>, implicit-def $rsp, implicit $rsp

thanks

MOVZX32rm16 loads 4 bytes

I noticed in one of the existing tests that MOVZX32rm16, for example, loads 4 bytes, then truncates to 16 bits and extends to 32 bits, because it uses the destination register width:

PointerType::get(getPhysRegOperandType(MI, LoadOpIndex), 0);

This usually works because x86 is little-endian, but it can still go wrong if the intended 2 bytes sit at the very end of a page, since the extra 2 bytes would then be read from the next page.
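
To illustrate the point, here is a small sketch that models both load widths over an explicit little-endian byte buffer. The function names and the example bytes are hypothetical and are not taken from llvm-mctoll. Both variants yield the same value, but the wide one also reads two bytes that the instruction never asked for.

```cpp
#include <cstdint>

// Models a 2-byte little-endian load followed by zero extension to 32 bits.
// Only P[0] and P[1] are read.
constexpr std::uint32_t load16ZExt(const std::uint8_t *P) {
  return static_cast<std::uint32_t>(P[0]) |
         (static_cast<std::uint32_t>(P[1]) << 8);
}

// Models the pattern described above: a 4-byte load, truncation to 16 bits,
// then zero extension. P[2] and P[3] are read even though they are discarded.
constexpr std::uint32_t load32TruncZExt(const std::uint8_t *P) {
  return (static_cast<std::uint32_t>(P[0]) |
          (static_cast<std::uint32_t>(P[1]) << 8) |
          (static_cast<std::uint32_t>(P[2]) << 16) |
          (static_cast<std::uint32_t>(P[3]) << 24)) &
         0xffffu;
}

// Hypothetical memory contents: the 16-bit value 0x1234 followed by two
// unrelated bytes.
constexpr std::uint8_t Example[4] = {0x34, 0x12, 0xcd, 0xab};

static_assert(load16ZExt(Example) == 0x1234u, "narrow load");
static_assert(load32TruncZExt(Example) == 0x1234u, "wide load + truncate");
```

The results agree only because the low-order bytes come first in memory. If Example[2] and Example[3] were in an unmapped page, the wide variant would fault while the narrow one would not.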
