nyx-fuzz / libxdc
The fastest Intel-PT decoder for fuzzing
License: MIT License
Just running make on a fresh checkout fails with the error "Fatal error: can't create build/cfg.o: No such file or directory".
Either update the README to say that mkdir -p build must be run before make, or run mkdir -p build in the Makefile itself.
I want to try this as a static library on Windows. My project is a VS project that uses the libipt static library, and I want to try libxdc to get better performance. How can I do this? Thanks.
I'm trying to build my own fuzzer using the perf_event_open API for capturing the bitstream as a fun side project.
I have a lot of questions about library usage, partly because I lack background information (I have only skimmed the Intel manual on the subject).
Let's start with basic library usage as far as I understand it.
Create decoder -> run decoder -> decoder fills the bitmap according to the trace data, where 1 means branch taken and 0 means not taken.
That much is clear to me. What counts as a branch is, I presume, left to the capture configuration.
The arguments could be documented more clearly, though; I had to check the source to find out what they meant.
libxdc_init:
Filter: Are these physical addresses or virtual addresses?
page_cache_fetch: Its purpose is to fetch data from the fuzzed target. However, the argument names and their purpose are not documented, so I'm guessing what is going on based on the tests.
libxdc_decode: Pretty clear.
libxdc_register_bb_callback: Called for each NEW basic block, e.g. function calls?
libxdc_register_edge_callback: Called on each branch, e.g. if statements, switch cases?
I made a simple test case but haven't had any success so far,
which might be due to improper capture setup, improper library usage, or both.
It's relatively small (~200 LOC); I can post it if you want to take a look (I need to clean it up first, though).
Is libxdc thread safe? It's not documented either way, and I noticed that page_cache_lock and page_cache_unlock are empty functions, which may or may not be a complete red herring!
I'm seeing a strange issue with libxdc reporting truncated addresses:
Starting trace from 0xffffffffc05f803c.
Writing 8 bytes of input to 0xffffffffc05fa010
Starting fuzz loop
[TRACER cpuid] RIP: 0xffffffffc05f808f
CPUID leaf 13371337
Harness signal on finish
Stopping fuzz loop.
PSB
MODE
MODE
FUP c05f803c (TNT: 0)
VMCS
PSBEND
PGE c05f803c (TNT: 0)
[IPT] 0xffffffffffffffff -> 0xc05f803c
FUP c05f803c (TNT: 0)
MODE
disasm(c05f803c,c05f803c) TNT: 0
DIS mode 64
PTCOV fetch page of 0xc05f803c
I saved the buffer, and ptdump parses it without problems:
0000000000000000 psb
0000000000000010 pad
0000000000000011 pad
0000000000000012 pad
0000000000000013 mode.tsx
0000000000000015 mode.exec cs.l
0000000000000017 fup 6: ffffffffc05f803c
0000000000000020 pad
0000000000000021 pad
0000000000000022 pad
0000000000000023 pad
0000000000000024 pad
0000000000000025 pad
0000000000000026 pip 1a2606000, nr cr3 00000001a2606000
000000000000002e pad
000000000000002f pad
0000000000000030 pad
0000000000000031 pad
0000000000000032 pad
0000000000000033 pad
0000000000000034 pad
0000000000000035 pad
0000000000000036 vmcs 3db4bc000 vmcs 00000003db4bc000
000000000000003d pad
000000000000003e pad
000000000000003f pad
0000000000000040 cbr 29
0000000000000044 psbend
0000000000000046 pad
0000000000000047 tip.pge 6: ffffffffc05f803c
0000000000000050 pad
0000000000000051 pad
0000000000000052 pad
0000000000000053 pad
0000000000000054 pad
0000000000000055 pad
0000000000000056 pad
0000000000000057 fup 6: ffffffffc05f803c
0000000000000060 tip.pgd 0: ????????????????
0000000000000061 pad
0000000000000062 pad
0000000000000063 pad
0000000000000064 pad
0000000000000065 pad
0000000000000066 pad
0000000000000067 pad
0000000000000068 pad
0000000000000069 pad
000000000000006a pad
000000000000006b pad
000000000000006c pad
000000000000006d pad
000000000000006e pad
000000000000006f pad
0000000000000070 cbr 29
0000000000000074 pad
0000000000000075 mode.exec cs.l
0000000000000077 tip.pge 6: ffffffffc05f803c
0000000000000080 pad
0000000000000081 pad
0000000000000082 pad
0000000000000083 pad
0000000000000084 pad
0000000000000085 pad
0000000000000086 pad
0000000000000087 fup 6: ffffffffc05f803c
0000000000000090 tip.pgd 0: ????????????????
0000000000000091 pad
0000000000000092 pad
0000000000000093 pad
0000000000000094 pad
0000000000000095 pad
0000000000000096 pad
0000000000000097 pad
0000000000000098 pad
0000000000000099 pad
000000000000009a pad
000000000000009b pad
000000000000009c pad
000000000000009d pad
000000000000009e pad
000000000000009f pad
00000000000000a0 cbr 29
00000000000000a4 pad
00000000000000a5 mode.exec cs.l
00000000000000a7 tip.pge 6: ffffffffc05f803c
00000000000000b0 tnt.8 .!!
00000000000000b1 pad
00000000000000b2 pad
00000000000000b3 pad
00000000000000b4 pad
00000000000000b5 pad
00000000000000b6 pad
00000000000000b7 fup 6: ffffffffc05f808f
00000000000000c0 tip.pgd 0: ????????????????
00000000000000c1 pad
00000000000000c2 pad
00000000000000c3 pad
00000000000000c4 pad
00000000000000c5 pad
00000000000000c6 pad
00000000000000c7 pad
00000000000000c8 pad
00000000000000c9 pad
00000000000000ca pad
00000000000000cb pad
00000000000000cc pad
00000000000000cd pad
00000000000000ce pad
00000000000000cf pad
Strangely enough, the same code works without problems on another machine. The only difference is that the problem occurs on Ubuntu 20.04, whereas on Debian Buster it works just fine. I tried switching to the same compiler on Ubuntu, but it made no difference. Do you have any idea what the issue might be?
I'm getting an error as follows:
ERR: TNT 1052361 at position <0xffffffffc0739083,0xffffffffc0739083>
Any hints on how to debug this?
While using libxdc to gather basic-block information, AFL reports a low stability score. Currently I register a bb callback and feed the dst address received in that function to AFL as a location that was instrumented. Stability hovers around 18%. In my setup interrupts are blocked, and the code being fuzzed is tiny with no external calls. The low stability score only shows up when using PT+libxdc; breakpoint-based tracing yields stability in the ~95% range.
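As a point of comparison, here is a minimal sketch of AFL's classic edge-map update driven by a stream of destination addresses. It is not taken from libxdc or from the reporter's code: the struct, the function names, and the address hash are all hypothetical. It shows the two pieces of per-run state that must be reset before every execution for identical traces to produce identical maps: the map itself and the previous location.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAP_SIZE (1u << 16)   /* AFL's default coverage map size */

struct cov_state {
    uint8_t  map[MAP_SIZE];
    uint32_t prev_loc;        /* previous location, shifted right by one */
};

/* Hypothetical hash: fold a 64-bit address into a map index. */
uint32_t loc_from_addr(uint64_t addr)
{
    addr ^= addr >> 33;
    addr *= 0xff51afd7ed558ccdULL;
    addr ^= addr >> 33;
    return (uint32_t)(addr & (MAP_SIZE - 1));
}

/* AFL-style update: count the edge (previous block -> this block). */
void cov_on_block(struct cov_state *s, uint64_t dst)
{
    uint32_t cur = loc_from_addr(dst);
    s->map[cur ^ s->prev_loc]++;
    s->prev_loc = cur >> 1;
}

/* Reset all per-run state, then replay one trace of block addresses. */
void cov_run(struct cov_state *s, const uint64_t *trace, size_t n)
{
    assert(s != NULL && (trace != NULL || n == 0));
    memset(s, 0, sizeof *s);
    for (size_t i = 0; i < n; i++)
        cov_on_block(s, trace[i]);
}
```

If prev_loc (or any decoder-side state) carries over from one run to the next, the first edge of each run lands in a different slot, which AFL would report as instability even for a deterministic target.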
Hey guys, so I'm running into an issue and I'm a bit stuck. I have a 64k PT buffer recorded by Xen, and ptdump seems to be able to parse the buffer no problem.
00000000000010a0 psb
00000000000010b0 pad
00000000000010b1 pad
00000000000010b2 pad
00000000000010b3 mode.tsx
00000000000010b5 mode.exec cs.l
00000000000010b7 fup 3: 00007f918d853264
00000000000010be pad
00000000000010bf pad
00000000000010c0 pad
00000000000010c1 pad
00000000000010c2 pad
00000000000010c3 pad
00000000000010c4 pad
00000000000010c5 pad
00000000000010c6 pip b2619800, nr cr3 00000000b2619800
00000000000010ce pad
00000000000010cf pad
00000000000010d0 pad
00000000000010d1 pad
00000000000010d2 pad
00000000000010d3 pad
00000000000010d4 pad
00000000000010d5 pad
00000000000010d6 vmcs 2b5b75000 vmcs 00000002b5b75000
00000000000010dd pad
00000000000010de pad
00000000000010df pad
00000000000010e0 pad
00000000000010e1 pad
00000000000010e2 pad
00000000000010e3 pad
00000000000010e4 pad
00000000000010e5 pad
00000000000010e6 tsc 412444672534
00000000000010ee pad
00000000000010ef pad
00000000000010f0 cbr 8
00000000000010f4 psbend
00000000000010f6 tnt.8 ..!.!.
:
I've enabled DEBUG_TRACES in libxdc, but this is as far as it gets:
PSB
MODE
MODE
FUP 7f918d853264 (TNT: 0)
VMCS
Afterwards libxdc_decode just returns with the value 4. I have the buffer saved to a file if that helps with debugging this further.
Currently the page_cache_fetch function only receives the virtual address to be retrieved. This is sufficient for small traces where the target process is known, but if we are tracing across processes or between kernel and userspace, we need to know which page table to use to translate the virtual address and grab the underlying page. As this information is carried in the PT buffer, keeping an "active pt" variable should add very little overhead on the libxdc side.
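A minimal sketch of what this proposal could look like, assuming the decoder tracks the CR3 value from PIP packets (such as the "pip 1a2606000" entry in the ptdump output earlier on this page) and passes it to the fetch callback. None of these names exist in libxdc: the callback type, the cache struct, and demo_fetch are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT  12
#define CACHE_SLOTS 64

/* Hypothetical extended callback: like a page fetch, plus the active CR3. */
typedef void *(*fetch_with_cr3_fn)(void *opaque, uint64_t cr3,
                                   uint64_t page_addr, bool *success);

struct cache_entry { bool valid; uint64_t cr3; uint64_t page; void *data; };

struct cr3_page_cache {
    struct cache_entry slots[CACHE_SLOTS];
    fetch_with_cr3_fn  fetch;
    void              *opaque;
    uint64_t           active_cr3;   /* updated whenever a PIP packet is seen */
};

void cache_set_cr3(struct cr3_page_cache *c, uint64_t cr3)
{
    c->active_cr3 = cr3;
}

/* Return the cached page for vaddr under the active CR3, fetching on a miss. */
void *cache_lookup(struct cr3_page_cache *c, uint64_t vaddr)
{
    assert(c != NULL && c->fetch != NULL);
    uint64_t page = vaddr >> PAGE_SHIFT;
    size_t idx = (size_t)((page ^ (c->active_cr3 >> PAGE_SHIFT)) % CACHE_SLOTS);
    struct cache_entry *e = &c->slots[idx];

    if (e->valid && e->cr3 == c->active_cr3 && e->page == page)
        return e->data;

    bool ok = false;
    void *data = c->fetch(c->opaque, c->active_cr3, page << PAGE_SHIFT, &ok);
    if (!ok)
        return NULL;
    e->valid = true;
    e->cr3   = c->active_cr3;
    e->page  = page;
    e->data  = data;
    return data;
}

/* Demo fetcher for illustration only: counts calls, one page per CR3. */
static uint8_t demo_page_a[1 << PAGE_SHIFT], demo_page_b[1 << PAGE_SHIFT];

void *demo_fetch(void *opaque, uint64_t cr3, uint64_t page_addr, bool *success)
{
    (void)page_addr;
    ++*(int *)opaque;
    *success = true;
    return cr3 == 0x1a2606000ULL ? (void *)demo_page_a : (void *)demo_page_b;
}
```

Keying the cache on (CR3, page) rather than on the page alone means the same virtual address in two address spaces resolves to two different entries, at the cost of one extra comparison per lookup.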
libxdc is great work among Intel PT decoders, particularly for hardware-assisted fuzzing. I noticed that the evaluation in this repo shows libxdc to be faster than Ptrix, which rebuilds coverage without disassembling. libxdc uses many micro-optimizations to speed up rebuilding coverage, so I'm interested in whether it would also be faster at rebuilding Ptrix-style path coverage without disassembly.
Hi,
You explain in the README that libxdc must be given a callback that lets it "request memory":
"To disassemble the target, a callback page_cache_fetch_fptr has to be provided that allows libxdc to request memory"
This is not clear, at least to me. Do you mean the default page cache of the operating system? If so, why do you need it? I would very much appreciate detailed documentation on why it is needed. Intel's own PT decoder (libipt) does not require it, so what is the benefit here?
Hey guys,
so I'm trying to verify that the edge information I get from libxdc matches what I expect, and so far it doesn't. When single-stepped with MTF and disassembled instruction by instruction with Capstone, the traced target code flows like this:
0: 0xffffffffc03af03c movsx [7, next: 0xffffffffc03af043] 0f be 0d cd 1f 00 00 bf 04 00 00 00 89 c8 99 ...............
1: 0xffffffffc03af043 mov [5, next: 0xffffffffc03af048] bf 04 00 00 00 89 c8 99 f7 ff 83 fa 03 74 1f .............t.
2: 0xffffffffc03af048 mov [2, next: 0xffffffffc03af04a] 89 c8 99 f7 ff 83 fa 03 74 1f 83 fa 02 74 15 ........t....t.
3: 0xffffffffc03af04a cdq [1, next: 0xffffffffc03af04b] 99 f7 ff 83 fa 03 74 1f 83 fa 02 74 15 89 ce ......t....t...
4: 0xffffffffc03af04b idiv [2, next: 0xffffffffc03af04d] f7 ff 83 fa 03 74 1f 83 fa 02 74 15 89 ce 40 .....t....t...@
5: 0xffffffffc03af04d cmp [3, next: 0xffffffffc03af050] 83 fa 03 74 1f 83 fa 02 74 15 89 ce 40 80 e6 ...t....t...@..
6: 0xffffffffc03af050 je [2, next: 0xffffffffc03af052] 74 1f 83 fa 02 74 15 89 ce 40 80 e6 03 74 09 t....t...@...t.
7: 0xffffffffc03af052 cmp [3, next: 0xffffffffc03af055] 83 fa 02 74 15 89 ce 40 80 e6 03 74 09 ff ca ...t...@...t...
8: 0xffffffffc03af055 je [2, next: 0xffffffffc03af057] 74 15 89 ce 40 80 e6 03 74 09 ff ca 75 10 83 t...@...t...u..
9: 0xffffffffc03af057 mov [2, next: 0xffffffffc03af059] 89 ce 40 80 e6 03 74 09 ff ca 75 10 83 c1 0c ..@...t...u....
10: 0xffffffffc03af059 and [4, next: 0xffffffffc03af05d] 40 80 e6 03 74 09 ff ca 75 10 83 c1 0c eb 0b @...t...u......
11: 0xffffffffc03af05d je [2, next: 0xffffffffc03af05f] 74 09 ff ca 75 10 83 c1 0c eb 0b ff c1 eb 07 t...u..........
12: 0xffffffffc03af068 inc [2, next: 0xffffffffc03af06a] ff c1 eb 07 6b c9 0c eb 02 ff c9 48 8b 05 96 ....k......H...
13: 0xffffffffc03af06a jmp [2, next: 0xffffffffc03af06c] eb 07 6b c9 0c eb 02 ff c9 48 8b 05 96 1f 00 ..k......H.....
14: 0xffffffffc03af073 mov [7, next: 0xffffffffc03af07a] 48 8b 05 96 1f 00 00 48 39 05 7f 1f 00 00 75 H......H9.....u
15: 0xffffffffc03af07a cmp [7, next: 0xffffffffc03af081] 48 39 05 7f 1f 00 00 75 07 89 0c 25 00 00 00 H9.....u...%...
16: 0xffffffffc03af081 jne [2, next: 0xffffffffc03af083] 75 07 89 0c 25 00 00 00 00 b8 37 13 37 13 0f u...%.....7.7..
17: 0xffffffffc03af08a mov [5, next: 0xffffffffc03af08f] b8 37 13 37 13 0f a2 31 f6 48 c7 c7 77 00 3b .7.7...1.H..w.;
18: 0xffffffffc03af08f cpuid [2, next: 0xffffffffc03af091] 0f a2 31 f6 48 c7 c7 77 00 3b c0 e8 f4 56 d6 ..1.H..w.;...V.
The full decode log with the disassembly is:
PSB
MODE
MODE
FUP ffffffffc03af03c (TNT: 0)
VMCS
PSBEND
PGE ffffffffc03af03c (TNT: 0)
[IPT] 0xffffffffffffffff -> 0xffffffffc03af03c
FUP ffffffffc03af03c (TNT: 0)
disasm(ffffffffc03af03c,ffffffffc03af03c) TNT: 0
[IPT] Caching page 0xffffffffc03af
DISASM @ 0xffffffffc03af03e add byte ptr [rax], al
[IPT] Cached page found 0xffffffffc03af
DISASM @ 0xffffffffc03af045 cmp qword ptr [rip + 0x1f7f], rax
[IPT] Cached page found 0xffffffffc03af
DISASM @ 0xffffffffc03af047 jne 0xffffffffc03af04e
DISASM FOUND COFI
PGD ffffffffc03af03c (TNT: 0)
disasm(ffffffffc03af03c,ffffffffc03af03c) TNT: 0
[IPT] 0xffffffffc03af03c -> 0xffffffffffffffff
MODE
PGE ffffffffc03af03c (TNT: 0)
[IPT] 0xffffffffffffffff -> 0xffffffffc03af03c
FUP ffffffffc03af03c (TNT: 0)
disasm(ffffffffc03af03c,ffffffffc03af03c) TNT: 0
PGD ffffffffc03af03c (TNT: 0)
disasm(ffffffffc03af03c,ffffffffc03af03c) TNT: 0
[IPT] 0xffffffffc03af03c -> 0xffffffffffffffff
MODE
PGE ffffffffc03af03c (TNT: 0)
[IPT] 0xffffffffffffffff -> 0xffffffffc03af03c
TNT 16
FUP ffffffffc03af08f (TNT: 3)
disasm(ffffffffc03af03c,ffffffffc03af08f) TNT: 3
[IPT] 0xffffffffc03af045 -> 0xffffffffc03af047
[IPT] Cached page found 0xffffffffc03af
DISASM @ 0xffffffffc03af04e mov dword ptr [0], ecx
[IPT] Cached page found 0xffffffffc03af
DISASM @ 0xffffffffc03af053 mov eax, 0x13371337
[IPT] Cached page found 0xffffffffc03af
DISASM @ 0xffffffffc03af055 cpuid
[IPT] Cached page found 0xffffffffc03af
DISASM @ 0xffffffffc03af057 xor esi, esi
[IPT] Cached page found 0xffffffffc03af
DISASM @ 0xffffffffc03af05e mov rdi, -0x3fc4ff89
[IPT] Cached page found 0xffffffffc03af
DISASM @ 0xffffffffc03af063 call 0xffffffff81114757
DISASM FOUND COFI
[IPT] 0xffffffffc03af05e -> 0xffffffff81114757
[IPT] Caching page 0xffffffff81114
The disassembly looks off compared to what it should be. Just at the start it starts to disassemble:
disasm(ffffffffc03af03c,ffffffffc03af03c) TNT: 0
[IPT] Caching page 0xffffffffc03af
DISASM @ 0xffffffffc03af03e
when there is no instruction at that location. Why does it start to disassemble from 0xffffffffc03af03e instead of 0xffffffffc03af03c?
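One reading of the two listings, offered as a hypothesis rather than a confirmed diagnosis: the DISASM lines seem to print the address after each instruction (03c + 2 = 03e, 03e + 7 = 045, 045 + 2 = 047), and the instructions shown (add byte ptr [rax], al, then cmp qword ptr [rip + 0x1f7f], rax, then jne) match the Capstone bytes starting at 0xffffffffc03af078, that is, 00 00 followed by 48 39 05 7f 1f 00 00 and 75 07. That is exactly 0x3c bytes past the start address. The sketch below shows the arithmetic; the helper names are hypothetical, and the assumption that the decoder adds the in-page offset to whatever pointer the fetch callback returns is not verified against libxdc's source.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 0x1000ULL

/* Start of the 4 KiB page containing addr. */
uint64_t page_base(uint64_t addr)
{
    return addr & ~(PAGE_SIZE - 1);
}

/* Offset of addr within its page. */
uint64_t page_offset(uint64_t addr)
{
    return addr & (PAGE_SIZE - 1);
}

/* Address whose bytes are actually read if the consumer adds the in-page
 * offset to a pointer that already points at addr instead of at the
 * start of its page (the offset is then applied twice). */
uint64_t effective_addr_if_unaligned(uint64_t addr)
{
    uint64_t eff = addr + page_offset(addr);
    assert(eff >= addr);   /* no wrap-around for the addresses used here */
    return eff;
}
```

For 0xffffffffc03af03c the page base is 0xffffffffc03af000, the offset is 0x3c, and the doubled offset lands on 0xffffffffc03af078. If this reading is right, returning a pointer to the page base from the fetch callback would be the thing to try.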
A new issue I've encountered: the PT buffer is being processed, as far as I can tell, but there are still no calls to the bb or page-cache callback functions.
This is my init code:
uint64_t filter[4][2] = {0};
void* bitmap = malloc(0x10000);
libxdc_t* decoder = libxdc_init(filter, &page_cache_fetch, NULL, bitmap, 0x10000);
libxdc_register_bb_callback(decoder, &trace_log, NULL);
ret = libxdc_decode(decoder, buf, pt_buf_size);
libxdc_free(decoder);
free(bitmap);
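One thing worth checking in the init code above is the all-zero filter. The sketch below is a hypothetical range check, not libxdc's implementation: it assumes each row of filter[4][2] is a half-open [start, end) address range and that only addresses inside some range are disassembled. Under those assumptions a zeroed filter matches no address at all, which would be consistent with the callbacks never firing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical check: is addr inside any of the four [start, end) ranges?
 * Rows with start == end are empty and never match. */
bool in_any_range(uint64_t filter[4][2], uint64_t addr)
{
    assert(filter != NULL);
    for (size_t i = 0; i < 4; i++) {
        if (filter[i][0] <= addr && addr < filter[i][1])
            return true;
    }
    return false;
}
```

Setting the first row to a range that covers the traced code, for example the kernel text region containing 0xffffffff8114160c from the log below, is a cheap experiment to rule this in or out.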
The processing stops with an error message after a bit. With DEBUG_TRACES enabled I see this at the end.
disasm(ffffffff816f3b07,0) TNT: 30270
TNT 5a
TIP ffffffff8114160c (TNT: 30275)
disasm(ffffffff810e4403,0) TNT: 30275
TNT 4
TIP ffffffff811415be (TNT: 30276)
disasm(ffffffff8114160c,0) TNT: 30276
TNT e
TIP ffffffff811415e5 (TNT: 30278)
disasm(ffffffff811415be,0) TNT: 30278
ERR: TNT 30278
It seems to have gotten quite far into the trace. Any recommendation on how to further debug this?
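One way to cross-check the TNT counter in such a log by hand is to decode the short TNT bytes yourself. The sketch below follows the short TNT layout from the Intel SDM (bit 0 is 0, the highest set bit is a stop marker, and the bits between them are taken/not-taken flags, oldest first). It assumes the hex value printed after "TNT" in the debug log is the raw packet byte; decode_short_tnt is a hypothetical helper, not a libxdc function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one short TNT byte.
 * out[i] is 1 for taken, 0 for not taken, oldest branch first.
 * Returns the number of TNT bits (1..6), or 0 if the byte is not a
 * short TNT packet (odd byte, PAD 0x00, or extension prefix 0x02). */
size_t decode_short_tnt(uint8_t byte, uint8_t out[6])
{
    assert(out != NULL);
    if (byte & 1u)
        return 0;                       /* header bit must be 0 */

    int stop = 7;                       /* find the stop bit */
    while (stop > 0 && !(byte & (1u << stop)))
        stop--;
    if (stop < 2)
        return 0;                       /* no payload bits below the stop bit */

    size_t n = 0;
    for (int bit = stop - 1; bit >= 1; bit--)
        out[n++] = (uint8_t)((byte >> bit) & 1u);
    return n;
}
```

Decoding 0x5a, 0x04 and 0x0e this way gives 5, 1 and 2 branch bits, which matches the counter in the log moving from 30270 to 30275, 30276 and 30278. So the counter itself looks consistent up to the point of the error.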