intel / libipt
libipt - an Intel(R) Processor Trace decoder library
License: Other
Intel(R) Processor Trace Decoder Library
========================================

The Intel Processor Trace (Intel PT) Decoder Library is Intel's reference
implementation for decoding Intel PT. It can be used as a standalone library
or it can be partially or fully integrated into your tool.

The library comes with a set of sample tools built on top of it and a test
system built on top of the sample tools. The samples demonstrate how to use
the library and may serve as a starting point for integrating the library
into your tool.

Contents
--------

  README         this file

  libipt         A packet encoder/decoder library

Optional Contents and Samples
-----------------------------

  ptdump         Example implementation of a packet dumper

  ptxed          Example implementation of a trace disassembler

  ptseg          A simple tool to find surrounding PSB packets

  pttc           A trace test generator

  ptunit         A simple unit test system

  sideband       A sideband correlation library

  pevent         A library for reading/writing Linux perf event records

  script         A collection of scripts

  test           A collection of tests

  include        A collection of substitute headers

  doc            A document describing the build
                 A document describing how to get started
                 A document describing the usage of the decoder library
                 A document describing how to capture trace
                 A document describing pttc

  doc/man        Man pages for the encoder/decoder library

Dependencies
------------

We use cmake for building.

  cmake          The cross-platform open-source build system.
                 http://www.cmake.org

Other packages you need for some of the above optional components.

  xed            The Intel x86 instruction encoder and decoder.
                 https://github.com/intelxed/xed
                 This is needed to build and run ptxed.

  yasm           The Yasm Modular Assembler
                 http://github.com/yasm
                 This is needed to run pttc.

  pandoc         A universal document converter
                 http://pandoc.org
                 This is needed for man pages.
I am just wondering whether it is possible to decode the packet stream while the traced program is still being recorded, and how difficult it would be to do this?
Hi Markus,
Is there a way to load code into the image from a memory address of the current process? It looks like pt_image_add might be what I want, but this is not exposed in libipt.
I'm trying to avoid having to dump the VDSO to a file, just to immediately load it back in.
Come to think of it, if I can load code in from memory, then since my process is tracing itself, I can load all of the code sections in from memory, avoiding filesystem accesses entirely.
Thanks
Hello, I want to use the decoding library to decode the packets. ptxed uses the information from perf (perf script -D) to reconstruct the control flow. What if I use perf_event_open() to get the perf events to decode, so that I do not need to dump out the data? I have no idea how to set the traced image for the decoder in this case. By the way, if someone could explain what the traced image is, I would appreciate it. Thanks a lot.
Hello,
I want to experiment with ptdump and ptxed; however, I'm lost on the build instructions.
My understanding is the syntax is:
cmake [options] /path/to/source
but I'm stuck trying to figure out how to invoke any of the options. Can someone please elaborate or, better, add some more detail to the build doc? :)
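For what it's worth, the usual out-of-tree flow is roughly the following. This is only a sketch: the option names PTDUMP/PTXED and the XED_INCLUDE/XED_LIBDIR variables are assumptions based on the project's cmake files, so check `cmake -L` in your build directory for the authoritative list of cache variables.

```shell
# Configure in a separate build directory; -D<OPTION>=<value> sets options.
mkdir -p build && cd build
cmake -DPTDUMP=ON -DPTXED=ON \
      -DXED_INCLUDE=/path/to/xed/include \
      -DXED_LIBDIR=/path/to/xed/lib \
      /path/to/processor-trace
make
```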
Thanks,
-Steve
Hello!
I've noticed that when I trace user code (using attr.exclude_kernel = 1) and there's a system call in the trace, ptxed will generate an error message upon returning to user space:
$ ptxed --cpu auto, --block-decoder --block:end-on-call --block:end-on-jump \
--block:show-blocks --pt ...
...
0000557a22886c5c mov edi, dword ptr [rdi+0x18]
0000557a22886c5f call 0x557a22878f58
[block]
0000557a22878f58 jmp qword ptr [rip+0x2f1c7a]
[block]
00007fde76ddfe00 mov eax, 0x10
00007fde76ddfe05 syscall
[disabled]
[3ec80, 7fde76ddfe07: error: expected tracing enabled event] <--------- this
[block]
00007ffe85dc5a20 jle 0x7ffe85dc5b1d
[block]
Is that supposed to happen? If not, I'm wondering whether it's a bug in ptxed, the Linux kernel I'm using (4.9.30-2+deb9u5), or the chip itself?
Any ideas?
Hi,
It seems there is a potential resource leak at https://github.com/01org/processor-trace/blob/master/libipt/src/posix/pt_section_posix.c#L204
The code is as follows:
file = fdopen(fd, "rb"); // <-- **attaches a stream to the file descriptor**
if (!file)
goto out_fd;
/* We need to keep the file open on success. It will be closed when
* the section is unmapped.
*/
errcode = pt_sec_file_map(section, file);
if (!errcode) {
section->mcount = 1;
return pt_section_unlock(section); // <-- **returns without closing the file**
}
I hope you can take a look.
Best wishes
I would like to point out that an identifier like "__INTEL_PT_H__" may not fit the naming conventions of the C language standard: identifiers beginning with two underscores are reserved for the implementation.
Would you like to adjust your selection of unique names?
CMake configuration fails because it cannot find the pt_blk_get_image.3.md file under doc/man.
Hi,
I've written a small tracer with Intel PT. It works well except for one feature: I'd like to annotate the resulting trace with TSC values, and I use the 'pt_blk_time' function for this. Sometimes I get a TSC value smaller than the previous one. I tried to minimize the threshold for CYC packets, but it didn't help. Moreover, when I look at the PT dump with the ptdump utility, I get a lot of error messages. What do these messages mean? What am I doing wrong?
0000000000000e14 1b cyc 3 tsc 0053f7523d1a81fc
0000000000000e15 d4 tnt.8 !.!.!.
0000000000000e16 1b cyc 3 tsc 0053f7523d1a81fe
0000000000000e17 d4 tnt.8 !.!.!.
0000000000000e18 1b cyc 3 tsc 0053f7523d1a8200
0000000000000e19 0c tnt.8 !.
0000000000000e1a 0b cyc 1 tsc 0053f7523d1a8200 <<<<<
[e1b: error calibrating time: bad configuration] <<<<<<
[e1b: error updating time: bad configuration] <<<<<
0000000000000e1b 5985 mtc 85 tsc 0053f7523d1a8200 <<<<<
0000000000000e1d d4 tnt.8 !.!.!.
0000000000000e1e 1b cyc 3 tsc 0053f7523d1a8190 <<<<<
0000000000000e1f d4 tnt.8 !.!.!.
0000000000000e20 1b cyc 3 tsc 0053f7523d1a8192
--Sergey
Hi, in the case of virtualization, the intel_pt event cannot be found. Is this because the virtual machine cannot directly access the PT hardware, and if so, what should I do?
perf list | grep intel_pt
perf version 4.18.20
Ubuntu version: 4.18.0-16-generic
Hardware: Skylake
I am trying to capture PT packets for a specific function (drive_machine) in the executable memcached.
Command I used to capture the filtered trace:
perf record -e intel_pt/tsc=0,cyc=0,mtc=0/u --filter="filter drive_machine @ ./memcached" ./memcached -p 12345 -l 127.0.0.1 -t 1
Test Setup:
I am using a client workload that sends some requests to the memcached server. Note: I am using the server in single threaded mode.
Result: With filtering, the PT-trace reports that the function drive_machine was called only once during the execution.
perf script --itrace=e shows no errors.
I captured all the IPs using perf script --itrace=i0 and confirmed that the starting address of the function drive_machine appeared only once indicating this function was called once during this execution.
When I ran without filtering, the PT-trace reports that the function drive_machine was called 103 times using the same client workload (No change in the setup).
perf script --itrace=e shows no errors.
When I captured the IPs using perf script --itrace=i0, it shows me the 103 executions of the function drive_machine.
The issue is: Address Filtering is reporting only a subset (only 1) of the executions of the function that I am tracing.
Am I doing something wrong with address filtering? I would imagine address filtering would not work at all if I were doing something wrong. But since I do get an execution trace (only one, though), I suspect something else is going on here. Any help would be appreciated. Thanks!
I have also tried the above on earlier versions of perf
perf version: 4.13.16
Ubuntu version: 4.13.0-46-generic
Hardware: Skylake
And see the same result.
I tried decoding traces recorded with the perf flag exclude_kernel = 1 set; however, this computes invalid instruction pointers (pointers to unmapped memory). Is this mode supposed to work in the current state with the sideband library?
Hello,
This has been a recurring problem for me when I am trying to use ptxed to generate decoded instructions from a perf.data file.
When I use the command script/perf-read-aux.bash, the intel-pt traces are (presumably) extracted correctly and a file perf.data-aux-idx4.bin is generated (4 being the CPU number).
Now when I try to generate the traced memory image and simultaneously send this output to ptxed using the below command --
script/perf-read-image.bash | xargs ptxed --cpu auto --pt perf.data-aux-idx4.bin
I get an error message: ptxed: warning: failed to open 0]:: No such file or directory.
I cannot understand why this is happening. I already have the file in the correct directory. I have also added all the necessary environment variables. Can you guide me on this problem ?
Hello..
I don't know if this is the right place to report this, since it is not really an issue with the decoding. If it isn't, please refer me to the right place to post it.
I don't know whether this is specific to a thread's start_routine, or whether it applies to all callback functions that a binary provides to a library, but I will present my case:
Assume this test code (from https://www.geeksforgeeks.org/multithreading-c-2/):
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

// Let us create a global variable to change it in threads
int g = 0;

// The function to be executed by all threads
void *myThreadFun(void *vargp)
{
    // Store the value argument passed to this thread
    int *myid = (int *)vargp;

    // Let us create a static variable to observe its changes
    static int s = 0;

    // Change static and global variables
    // Print the argument, static and global variables
    printf("Thread ID: %d, Static: %d, Global: %d\n", *myid, ++s, ++g);

    if (g > 2)
        printf("Test\n");

    return NULL;  // the start routine must return a void *
}

int main()
{
    int i;
    pthread_t tid;

    // Let us create three threads
    for (i = 0; i < 3; i++)
        pthread_create(&tid, NULL, myThreadFun, (void *)&i);

    pthread_exit(NULL);
    return 0;
}
I want to trace the function myThreadFun, so I do this:
perf record -m 512,10000 -e intel_pt//u -T --switch-events --filter 'filter myThreadFun @ ./test' -- ./test
Then I decode with perf script, but the trace is empty?!
Now when I trace, without the filters:
perf record -m 512,10000 -e intel_pt//u -T --switch-events -- ./test
I can see myThreadFun in the perf script output.
Is this a bug? or am I doing something wrong?
Thanks!
Hi all,
I am unable to find Chapter 11 in the Intel Architecture Instruction Set Extensions Programming Reference, as stated in doc/getting_started.md:
For detailed information about Intel PT, please refer to chapter 11 of the Intel Architecture Instruction Set Extensions Programming Reference at http://www.intel.com/products/processor/manuals/.
This is the link to the PDF I am referring to: https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf . It only goes up to Chapter 6. Could someone kindly point me in the right direction, please? Thank you.
Hi,
This isn't really a bug per se, but it's something I don't think is clear. I'm hoping we can improve the docs.
I'm collecting traces which are very short (~96 bytes). I'm also re-using the same perf file descriptor with multiple ioctl calls to turn tracing on and off, which seems to have something to do with this.
In most cases, the second trace I collect fails to pt_blk_sync_forward(), returning -pte_eos. Subsequent operations then return -pte_bad_query.
If I use ptdump on one of these problem traces dumped to disk, I get no output. With the --no-sync option, I see packets, but there is no psb packet present. Presumably this is because these packets are emitted at a time interval, and my tracing session is too short to see one.
What should one do in such a scenario? Is it OK to have no psb packet? Should I seek back to offset 0 in the trace if I can't sync forward? If I do need to have a psb packet, is there a way to force one into the packet stream somehow?
Thanks.
Hi,
I was reading:
https://github.com/01org/processor-trace/blob/master/doc/howto_libipt.md
and was wondering how to fill in the cpu field of the config struct.
Looking at ptdump, it uses a private function, pt_cpu_read():
https://github.com/01org/processor-trace/blob/92124616abb6ceea8b726a661c9fb38c0d3c10e6/ptdump/src/ptdump.c#L1822
Presumably this fills in the field based on the current CPU.
This seems useful. Should it be exposed in libipt?
Thanks
On OpenBSD there is a conflict with the truncate(2) system call:
http://man.openbsd.org/OpenBSD-current/man2/truncate.2
[ 3%] Building C object libipt/CMakeFiles/libipt.dir/src/pt_sync.c.o
cd /home/edd/research/hwt_experiment/processor-trace/build/libipt && /usr/bin/cc -DFEATURE_THREADS -DPT_VERSION_BUILD=0 -DPT_VERSION_EXT=\"\" -DPT_VERSION_MAJOR=1 -DPT_VERSION_MINOR=6 -Dlibipt_EXPORTS -I/home/edd/research/hwt_experiment/processor-trace/include -I/home/edd/research/hwt_experiment/processor-trace/build/libipt/include -I/home/edd/research/hwt_experiment/processor-trace/include/posix -I/home/edd/research/hwt_experiment/processor-trace/libipt/internal/include -I/home/edd/research/hwt_experiment/processor-trace/libipt/internal/include/posix -std=c99 -fvisibility=hidden -fPIC -o CMakeFiles/libipt.dir/src/pt_sync.c.o -c /home/edd/research/hwt_experiment/processor-trace/libipt/src/pt_sync.c
/home/edd/research/hwt_experiment/processor-trace/libipt/src/pt_sync.c:47: error: conflicting types for 'truncate'
/usr/include/sys/types.h:217: error: previous declaration of 'truncate' was here
Thanks
When I tried to run the following commands from the tutorial in the doc:
$ perf record -e intel_pt//u -T --switch-events ...
$ script/perf-read-aux.bash
$ script/perf-read-sideband.bash
$ ptdump $(script/perf-get-opts.bash) perf.data-aux-idx0.bin
[...]
$ ptxed $(script/perf-get-opts.bash -m perf.data-sideband-cpu0.pevent)
--pevent:vdso... --event:tick --pt perf.data-aux-idx0.bin
it reports that:
ptdump: unknown option: --pevent:time-shift
ptxed: unknown option: --pevent:time-shift
After working around #10, I get the following compilation error on OpenBSD:
[ 3%] Building C object libipt/CMakeFiles/libipt.dir/src/posix/pt_section_posix.c.o
cd /home/edd/research/hwt_experiment/processor-trace/build/libipt && /usr/bin/cc -DFEATURE_THREADS -DPT_VERSION_BUILD=0 -DPT_VERSION_EXT=\"\" -DPT_VERSION_MAJOR=1 -DPT_VERSION_MINOR=6 -Dlibipt_EXPORTS -I/home/edd/research/hwt_experiment/processor-trace/include -I/home/edd/research/hwt_experiment/processor-trace/build/libipt/include -I/home/edd/research/hwt_experiment/processor-trace/include/posix -I/home/edd/research/hwt_experiment/processor-trace/libipt/internal/include -I/home/edd/research/hwt_experiment/processor-trace/libipt/internal/include/posix -std=c99 -fvisibility=hidden -fPIC -o CMakeFiles/libipt.dir/src/posix/pt_section_posix.c.o -c /home/edd/research/hwt_experiment/processor-trace/libipt/src/posix/pt_section_posix.c
/home/edd/research/hwt_experiment/processor-trace/libipt/src/posix/pt_section_posix.c: In function 'pt_sec_posix_map':
/home/edd/research/hwt_experiment/processor-trace/libipt/src/posix/pt_section_posix.c:114: error: 'PAGE_SIZE' undeclared (first use in this function)
/home/edd/research/hwt_experiment/processor-trace/libipt/src/posix/pt_section_posix.c:114: error: (Each undeclared identifier is reported only once
/home/edd/research/hwt_experiment/processor-trace/libipt/src/posix/pt_section_posix.c:114: error: for each function it appears in.)
Arguably, the page size should not be a compile-time constant, as this means traces from machines with different page sizes cannot be decoded.
Hi,
This is not an issue. I am reading the documentation of Intel PT, and I have a question about deferred TIPs.
May I know if it is possible to disable the deferred TIP optimization? Thanks.
Hello,
I am trying to use libipt in combination with address range filtering. I only add the address ranges to be traced as sections to libipt. When there is a call within the traced range that redirects the control flow to a non-traced region, libipt returns a nomap error, which in turn requires a call to sync_forward to recover the decoder state. When calls to non-traced regions are frequent, this results in skipping a considerable amount of the trace while decoding (because of the sync_forward calls upon nomap errors).
I was wondering if there is any way of jumping to PGE packets instead of PSB packets to avoid such cases.
Building ptdump on debian-testing, I get:
[ 16%] Building C object CMakeFiles/ptdump.dir/src/ptdump.o
/home/vext01/research/hwt_experiment/deps/src/processor-trace/ptdump/src/ptdump.c: In function ‘version’:
/home/vext01/research/hwt_experiment/deps/src/processor-trace/ptdump/src/ptdump.c:211:15: error: ‘PT_VERSION_MAJOR’ undeclared (first use in this function)
name, PT_VERSION_MAJOR, PT_VERSION_MINOR, PT_VERSION_BUILD,
^~~~~~~~~~~~~~~~
/home/vext01/research/hwt_experiment/deps/src/processor-trace/ptdump/src/ptdump.c:211:15: note: each undeclared identifier is reported only once for each function it appears in
/home/vext01/research/hwt_experiment/deps/src/processor-trace/ptdump/src/ptdump.c:211:33: error: ‘PT_VERSION_MINOR’ undeclared (first use in this function)
name, PT_VERSION_MAJOR, PT_VERSION_MINOR, PT_VERSION_BUILD,
^~~~~~~~~~~~~~~~
/home/vext01/research/hwt_experiment/deps/src/processor-trace/ptdump/src/ptdump.c:211:51: error: ‘PT_VERSION_BUILD’ undeclared (first use in this function)
name, PT_VERSION_MAJOR, PT_VERSION_MINOR, PT_VERSION_BUILD,
^~~~~~~~~~~~~~~~
/home/vext01/research/hwt_experiment/deps/src/processor-trace/ptdump/src/ptdump.c:212:9: error: ‘PT_VERSION_EXT’ undeclared (first use in this function)
PT_VERSION_EXT, v.major, v.minor, v.build, v.ext);
For now, I have just commented out the printf that uses these macros.
Howdy, this is not an issue but a question about the usage of this decoding library. Recently, I have wanted to use the decoding library to recover the blocks that were executed during tracing. I tried the block decoder, but it was a little slow. Since I don't care which exact instructions were executed, is there a good way to quickly map the raw trace (e.g., TNT, TIP) to the branch decisions in the binary? Thanks a lot!
The `pt_pkt_sync_backward()' function does not sync backward from the current decoder position. Instead, it attempts to sync backward from the last sync position. However, since the sync position is not updated by calls to pt_pkt_next(), it will always attempt to sync backward from the first sync point.
This seems to be due to the following assignment before the call to pt_sync_backward():
pos = decoder->sync;
It looks like a bug, and the assignment should have been:
pos = decoder->pos;
Am I understanding correctly that this also allows logging hypervisor instructions? If so, is root access to the host or the guest required?
Hello,
When I decode a PT trace from raw packets, how can I determine whether a conditional move instruction (e.g., cmove) was executed? For example, an instruction like "int a = b==0? c : d" is compiled to "cmp XXX, XXX; cmovez XXX, XXX".
I also tried ptxed. It prints the conditional move instruction even when it is supposedly not executed.
Thank you so much!
The documentation in doc/howto_libipt.md mistakenly refers to the non-existent pts_event status flag. All occurrences should be replaced with pts_event_pending.
When decoding such addresses, ptdump prints a message like
*** ERROR: having problems with printing the payload.
0020 fup
and stops processing. It looks like the problem is caused by the fact that raw decoded IP addresses encoded with pt_ipc_sext_48 are not sign-extended (that is mentioned in the doc, and sign extension is done in the pt_last_ip_update_ip() function), while pt_print_strprint_ip_packet() expects sign-extended addresses (bits 47-63 are either all 1s or all 0s).
Some declarations in the C header files should be wrapped in 'extern "C"' for C++ tools, shouldn't they?
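For reference, the standard pattern is to guard the public header declarations so that C++ translation units see C linkage (a sketch; pt_example_function is a hypothetical declaration, not a libipt API):

```c
#ifdef __cplusplus
extern "C" {
#endif

extern int pt_example_function(void);  /* hypothetical declaration */

#ifdef __cplusplus
}
#endif
```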
When I try to dump a file using ptdump on Windows:
ptdump c:\path\to\file
I get:
ptdump.exe: failed to open c: 2.
When I try to open with a relative path it works.
Does the library, along with ptxed, support decoding perf.data generated by perf version 4.17? I can see the script/perf-read-*.bash scripts generate incorrect information. I tried fixing it; I adjusted perf-read-aux.bash to read the correct fields in the new perf script -D output, but I am not sure whether the library will be able to decode it?
Thanks
Mansour.
Would it be possible to add the LEAVE instruction as a possible class for pt_insn? Specifically, add a handler for 0xC9 in the instruction decoder. It would be useful to know when the stack frame was released, to more easily detect embedded calls. Is there another way to know that a LEAVE instruction has been executed that I'm not aware of?
Hi,
I've been trying to interface directly with the kernel to get traces. The documentation seems to be pointing me in the right direction, but I have some questions. Perhaps we can feed back the outcome of this discussion into improving the docs.
When using the AUX area, its size and offset have to be filled into the perf_event_mmap_page, which is mapped together with the DATA area. This requires the DATA area to be mapped read-write and hence configured as linear buffer. In our example, we configure the AUX area as circular buffer.
Did you mean "In our example, we configure the AUX area as linear buffer"? If I understand correctly, you are saying that if you want to use the AUX area, you need to use a linear buffer.
base = mmap(NULL, (1+2**n) * PAGE_SIZE, PROT_WRITE, MAP_SHARED, fd, 0);
...
header->aux_size = (2**m) * PAGE_SIZE;
- What is n here? Does it have to match some in-kernel allocation size, or am I free to "request" however many pages I like?
- Is m distinct from n, or is that a typo?
- I assume 2**n is "two to the power n"? If so, why not 1<<n?
I wonder if a minimal standalone program would help? If we can work out the answers to the above questions, I don't mind writing such a program and raising a pull request.
Thanks
Hello,
We are trying to generate the instruction trace for a dotnet core application using Intel PT. Unfortunately, we are having trouble decoding the Intel PT trace to produce the assembly instructions that were executed during trace collection. Our main problem is related to just-in-time compiled code. I have enabled perf map output (export COMPlus_PerfMapEnabled=1 for dotnet core), and the corresponding /tmp/perf-*.map files are created. Still, when I decode the intel_pt trace using libipt, following the guidelines from here, I could not decode the assembly instructions from the just-in-time compiled code. For example, I get the following errors:
[a84ef, 7fa7bf0c3b00: error: no memory mapped at this address]
I was wondering how to decode instruction traces from such jitted code.
Our procedure to decode the instruction trace is as follows:
perf record -e intel_pt//u -T --switch-events -- dotnet run -c Release
~/git-repos/libipt/script/perf-read-aux.bash
~/git-repos/libipt/script/perf-read-sideband.bash
ptxed $(~/git-repos/libipt/script/perf-get-opts.bash -m perf.data-sideband-cpu32.pevent) --event:tick --pt perf.data-aux-idx32.bin > intel_pt.txt
Thank you in advance.
Processor trace is now in Chapter 35 of the SDM instead of Chapter 36, which is what doc/howto_libipt.md mentions.
There is a field named 'private' in the struct "pt_packet_unknown", and 'private' is a reserved word in C++. This causes a compilation error when I try to compile with g++. I am not sure whether the library is meant to be usable from C++, though.
In certain cases it is required to know the target address of the final instruction in a block. For instance, if the last instruction is a CALL instruction, one would want to know its target IP. This information is currently not available in struct pt_block or via the block decoder's API.
One may try to figure out the target IP by fetching the next block from the decoder, but that would be wrong if an asynchronous event occurred (e.g. a HW interrupt) and execution continued elsewhere. Inspecting the trace dump revealed that the target IP information is available in the trace (a TIP followed by a FUP, both with content matching the target IP). Furthermore, it seems decoder->ip is indeed equal to the target IP at the point where the block is ready to be returned, right before processing of trailing events for the block takes place (pt_blk_collect()).
We suggest the following patch to the block decoder interface and implementation for providing the required information:
diff --git a/libipt/include/intel-pt.h.in b/libipt/include/intel-pt.h.in
index 2a8f3df..fbe85db 100644
--- a/libipt/include/intel-pt.h.in
+++ b/libipt/include/intel-pt.h.in
@@ -2122,6 +2122,10 @@ struct pt_block {
*/
uint64_t end_ip;
+ /* The IP of the decoder after processing the block. This address is
+ * the target IP of the last instruction in the block. */
+ uint64_t post_ip;
+
/** The image section that contains the instructions in this block.
*
* A value of zero means that the section did not have an identifier.
diff --git a/libipt/src/pt_block_decoder.c b/libipt/src/pt_block_decoder.c
index 8dffbc2..3c9ca40 100644
--- a/libipt/src/pt_block_decoder.c
+++ b/libipt/src/pt_block_decoder.c
@@ -2932,6 +2932,8 @@ static int pt_blk_collect(struct pt_block_decoder *decoder,
if (errcode < 0)
return errcode;
+ block->post_ip = decoder->ip;
+
/* We may still have events left that trigger on the current IP.
*
* This IP lies outside of @block but events typically bind to the IP of
I'm currently working on an instruction-flow "watchdog" system where I need to use libipt to detect potential execution-flow mismatches at runtime (Control-Flow Integrity, in essence).
I need to use perf snapshot mode to periodically take snapshots (sending SIGUSR2 at a fixed interval) of a program's instruction flow, extract the basic blocks (using --block:show-blocks in ptxed), and compare them against a set of legal block sequences.
My current workflow is the following:
1. sudo perf record -e intel_pt//u --snapshot --switch-events -T ./binary-to-trace
2. sudo libipt/script/perf-read-aux.bash -S   (-S for snapshot mode)
3. sudo libipt/script/perf-read-sideband.bash
4. sudo libipt/build/bin/ptxed $(sudo libipt/script/perf-get-opts.bash) --block:show-blocks --pt perf.data-aux-idx0.bin --elf binary-to-trace > flow.txt
No matter what I do, all I get as output is a sequence of "error: no memory mapped at this address" messages with the corresponding addresses. So my questions are:
Hi,
In the README, there is no information about which processors support PT. I'm wondering whether Intel Broadwell processors support it?
Would it be possible to update the README with the supported processors?
Thanks
Hi,
I've written a Rust library to collect an Intel PT trace. If I dump the trace I've collected to file and use ptxed on it:
$ ./c_deps/inst/bin/ptxed --pt out.pt --elf target/debug/examples/simple_example:0x55e4f3a15000 --elf target/debug/examples/simple_example:0x55e4f3c93e40 --elf /home/vext01/research/hwtracer/c_deps/inst/lib/libipt.so.1:0x7f96c970f000 --elf /home/vext01/research/hwtracer/c_deps/inst/lib/libipt.so.1:0x7f96c9931a40 --elf /lib/x86_64-linux-gnu/libdl.so.2:0x7f96c950b000 --elf /lib/x86_64-linux-gnu/libdl.so.2:0x7f96c970dd60 --elf /lib/x86_64-linux-gnu/librt.so.1:0x7f96c9303000 --elf /lib/x86_64-linux-gnu/librt.so.1:0x7f96c9509d58 --elf /lib/x86_64-linux-gnu/libpthread.so.0:0x7f96c90e6000 --elf /lib/x86_64-linux-gnu/libpthread.so.0:0x7f96c92fdb78 --elf /lib/x86_64-linux-gnu/libgcc_s.so.1:0x7f96c8ecf000 --elf /lib/x86_64-linux-gnu/libgcc_s.so.1:0x7f96c90e4db8 --elf /lib/x86_64-linux-gnu/libc.so.6:0x7f96c8b30000 --elf /lib/x86_64-linux-gnu/libc.so.6:0x7f96c8ec57c8 --elf /lib64/ld-linux-x86-64.so.2:0x7f96c9934000 --elf /lib64/ld-linux-x86-64.so.2:0x7f96c9b57bc0 | less
[enabled]
[exec mode: 64-bit]
00007f96c8c10e07 cmp rax, 0xfffffffffffff001
00007f96c8c10e0d jnb 0x7f96c8c10e10
00007f96c8c10e0f ret
000055e4f3a247d9 test eax, eax
000055e4f3a247db jns 0x55e4f3a247e5
000055e4f3a247e5 cmp dword ptr [rbp-0x4], 0x0
000055e4f3a247e9 jz 0x55e4f3a24803
... lots of code
000055e4f3a1f0de mov rdx, rax
000055e4f3a1f0e1 call 0x55e4f3a1bb68
000055e4f3a1bb68 jmp qword ptr [rip+0x27c302]
00007f96c8c58e10 [fetch error: bad image]
[75, 7f96c8c58e10: reconstruct error: decoder out of sync]
00007f96c8c58e62 cmp dl, 0x10
00007f96c8c58e65 jnb 0x7f96c8c58e7e
... loads more code
I'm wondering why I'm seeing [fetch error: bad image].
I'm tracing a loop. There are other areas of my trace where the problem location (00007f96c8c58e10) does get decoded OK, e.g.:
...
000055e4f3a1bb68 jmp qword ptr [rip+0x27c302]
00007f96c8c58e10 mov rax, rdi
00007f96c8c58e13 cmp rdx, 0x20
...
versus.
000055e4f3a1bb68 jmp qword ptr [rip+0x27c302]
00007f96c8c58e10 [fetch error: bad image]
[75, 7f96c8c58e10: reconstruct error: decoder out of sync]
The ptxed invocation you see above is automatically generated. It loads PT_LOAD sections as reported by dl_iterate_phdr(3), but skips linux-vdso.so.1 (which is a fake shared object not existing on disk, AFAICS). Does that sound right?
I've also tried with --cpu auto, as I'm both collecting and decoding on the same system. No joy.
Any ideas why I see that error?
Thanks
In the documentation, there is a sentence:
This generates a file called perf.data that contains the Intel PT trace, the sideband information, and some metadata. To process the trace with ptxed, we extract the Intel PT trace into one file per thread or cpu.
I get confused here. I tried the script, and it generated files like "perf.data-aux-idx0.bin", which should be per-CPU. But as shown in the decoded trace produced by the "perf script" command, there is clear thread information. So how can I extract the trace per thread? Or, alternatively, how can I add the tid of each instruction to the trace generated by ptxed?
I also tried the '--per-thread' option (see the command below) when doing the perf recording, but the resulting trace only contains one thread.
perf record -e intel_pt//u --per-thread --filter='filter main @ ./test1, filter func @ ./test1' ./test1
Thanks!!
Hello Markus. I think I've found a libipt bug here.
Here's a ptxed snippet using --block-decoder --block:end-on-call --block:end-on-jump --block:show-blocks:
...
[block]
00007ffd317b1a20 jle 0x7ffd317b1b1d
[block]
00007ffd317b1b1d test edi, edi
00007ffd317b1b1f lea r13, ptr [rbp-0x1c]
00007ffd317b1b23 jnz 0x7ffd317b1b08
[block]
00007ffd317b1b25 mov r12d, dword ptr [rbx]
[block]
00007ffd317b1b28 test r12b, 0x1
00007ffd317b1b2c jnz 0x7ffd317b1c2c
[block]
00007ffd317b1b32 mov eax, dword ptr [rip-0x2ab4]
00007ffd317b1b38 mov dword ptr [rbp-0x1c], eax
00007ffd317b1b3b mov rax, qword ptr [rip-0x2a9a]
...
Notice how the block starting at 00007ffd317b1b25 has been split in two after a (seemingly innocuous) mov instruction.
It just so happens that this is the second time we've been at address 00007ffd317b1b25. Let's look at the first time:
[block]
00007ffd317b1b25 mov r12d, dword ptr [rbx]
[disabled]
[resumed]
[block]
00007ffd317b1b28 test r12b, 0x1
00007ffd317b1b2c jnz 0x7ffd317b1c2c
[block]
00007ffd317b1b32 mov eax, dword ptr [rip-0x2ab4]
00007ffd317b1b38 mov dword ptr [rbp-0x1c], eax
...
So the first time we were here, there was an asynchronous interrupt after the mov, causing an otherwise contiguous block to be split. It appears that this split gets cached, and every time we see this code, it will appear split, regardless of whether an interrupt actually occurred.
We can prove this theory by diffing the ptxed output before and after adding --iscache-limit 0 to the ptxed invocation:
--- before 2018-04-24 15:24:41.774672502 +0100
+++ after 2018-04-24 15:24:42.106672747 +0100
@@ -775,7 +775,6 @@
00007ffd317b1b23 jnz 0x7ffd317b1b08
[block]
00007ffd317b1b25 mov r12d, dword ptr [rbx]
-[block]
00007ffd317b1b28 test r12b, 0x1
00007ffd317b1b2c jnz 0x7ffd317b1c2c
[block]
I found this because it was causing issues in my hwtracer test suite. hwtracer doesn't yet use a cache, so it disagreed with ptxed (it's taken me all day to figure out what's going on!).
I think it's fine to split a block if it was interrupted, but not if it wasn't. My use case entails enumerating all of the start addresses of the blocks a trace passes through. I can easily ignore a spurious block if I see the async_disable event in the packet stream. However, I cannot know that a block is spurious when there is no event, as in the case above.
I think the fix is to only cache a block if it wasn't interrupted. If a block was interrupted, then you can't look in the cache either.
Thanks
Hi,
I am having a recurring problem when using perf with the Intel PT event. I am profiling on an Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz machine, x86_64 architecture, with 32 hardware threads and virtualization enabled. I specifically use programs from SpecCPU2006 for profiling.
The first time I profile one of the compiled SpecCPU2006 binaries, everything works fine and the perf.data file gets generated, as expected with Intel PT. Since SpecCPU2006 programs are computationally intensive (they use 100% of the CPU at all times), the perf.data files are naturally large: roughly 7-10 GB for most of the profiled programs.
However, when I try to profile the same compiled binary a second time, after the first run completed successfully, my server machine freezes. Sometimes this happens on the third or fourth profiling run (after the earlier runs completed successfully). The behavior is highly unpredictable. Once it happens, I cannot profile any more binaries until I have restarted the machine.
I have also posted the server error logs which I get once I see that the computer has stopped responding.
There is clearly an error message saying "Fixing recursive fault but reboot is needed!".
This happens particularly for large SpecCPU2006 binaries that take more than one minute to run without perf.
Is there any particular reason why this might happen? It should not be due to high CPU usage, since running the programs without perf, or with perf and any other hardware event (as listed by perf list), completes successfully. This only seems to happen with Intel PT.
How can I address this problem? I have been stuck on it for a long time now; some directions toward solving it would be helpful.
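In case it helps narrow things down, a couple of hedged mitigations to try; the buffer sizes are illustrative and ./my_spec_binary stands in for your workload:

```shell
# Mitigation 1: cap the AUX buffer so a single tracing run cannot
# exhaust kernel memory (-m <mmap-pages>,<aux-pages-or-size>):
perf record -e intel_pt//u -m ,128M -o perf.data.run1 ./my_spec_binary

# Mitigation 2: trace user space only with fewer timing packets,
# which reduces the data rate considerably:
perf record -e intel_pt/mtc=0/u -o perf.data.run2 ./my_spec_binary
```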
I've noticed that one way to distinguish threads when decoding is to correlate the data with the sideband information produced by the perf tool. Would it be possible to achieve that purely from PT packets?
In case it's not possible, do you know what would be the fastest way to produce a context switch trace that can be used for correlation with the PT data?
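Regarding the first question: to my knowledge the PT packet stream only identifies address spaces (PIP packets carry CR3 values, VMCS packets identify guests) and never carries tids, so some form of sideband is needed. For the second question, a sketch of recording switch information with perf, assuming a perf version that supports --switch-events (./workload is a placeholder):

```shell
# Record context-switch sideband alongside the PT data:
perf record -e intel_pt//u --switch-events -- ./workload

# The resulting PERF_RECORD_SWITCH records end up in the per-cpu
# sideband files and let the decoder attribute the trace to threads.
```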
Hello,
I am currently having an issue with sideband correlation when using ptxed. I do not have sideband losses when collecting Intel PT packets using perf events (I do not see any PERF_RECORD_LOST events).
Initially I see error messages like:
[perf.data-sideband-cpu0.pevent:00000000000053e8 sideband error: bad configuration]
The above error message repeats for 6-7 lines, after which I see the stream of error messages below, which occur periodically in my decoded instruction trace. Sometimes the xed decode error is 2 as well.
[xed decode error: (9) BAD_EVEX_UBIT]
[43f7, 401860: reconstruct error: decoder out of sync]
[43f7, 401b18: error: trace stream does not match query]
Let me explain what I am trying to do.
I am running an x86_64 VM on KVM/QEMU, using the same x86_64 system as my host. I start collecting Intel PT traces using the command below on the host:
perf kvm --guest --guestkallsyms=~/guest-kallsyms --guestmodules=~/guest-modules record -e intel_pt//
I am collecting packets on the host while running an application in the VM.
I then follow the steps suggested in the docs to generate the aux and sideband images. Finally, I run the command below to generate the instruction trace:
bin/ptxed $(script/perf-get-opts.bash -m perf.data-sideband-cpu0.pevent) --event:tick --pt perf.data-aux-idx0.bin
Please note that the perf-read-sideband.bash script generates two sideband.pevent files for me (one for reading the global records and one that is CPU-specific, which I supply to the command above).
The final instruction trace data is filled up with the error messages I mentioned at the start.
I have read about your troubleshooting options and was wondering which would be a good choice: increasing the MTC frequency or adding a TSC-offset value to the --pevent switch.
Let me know what could be a reason for such errors appearing all over my trace.
Note: in case it is of interest, I am attaching my PERF_RECORD_AUXTRACE_INFO record:
0x198 [0x98]: PERF_RECORD_AUXTRACE_INFO type: 1
PMU Type 7
Time Shift 31
Time Muliplier 596683345
Time Zero 18446744061403608139
Cap Time Zero 1
TSC bit 0x400
NoRETComp bit 0x800
Have sched_switch 3
Snapshot mode 0
Per-cpu maps 1
MTC bit 0x200
TSC:CTC numerator 300
TSC:CTC denominator 2
CYC bit 0x2
Thanks,
Arnab
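A couple of hedged things to try, based on the troubleshooting options mentioned in the question; the option names follow current ptxed --help output, and the offset and period values are illustrative only:

```shell
# 1) Apply a TSC offset so sideband records are processed slightly
#    ahead of the trace:
bin/ptxed $(script/perf-get-opts.bash -m perf.data-sideband-cpu0.pevent) \
    --pevent:tsc-offset 0x1000 \
    --event:tick --pt perf.data-aux-idx0.bin

# 2) Re-record with more frequent timing packets so trace and sideband
#    can be correlated more precisely (a smaller mtc_period means more
#    MTC packets):
perf kvm --guest --guestkallsyms=~/guest-kallsyms \
    --guestmodules=~/guest-modules record -e intel_pt/mtc_period=0/
```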
Hi, I am using ptdump to dump packets. However, I found that when I set the --lastip option, it always prints a last IP that is identical to the current IP. I checked the code in ptdump.c; there appears to be a logic error around ptdump.c:630-647: the code should query the last IP first and only then update it.