rust-bpf / rust-bcc
user-friendly rust bindings for the bpf compiler collection
License: MIT License
BPF_RINGBUF_OUTPUT support
briefly answer these questions:
what did you expect?
Does rust-bcc support BPF_RINGBUF_OUTPUT?
what actually happened?
I have BCC code that uses BPF_RINGBUF_OUTPUT, and I'd like to rewrite it using rust-bcc, but I couldn't find BPF_RINGBUF_OUTPUT in rust-bcc.
what steps can we take to reproduce the behavior you saw?
NONE
Checking in on any discussion of, or thoughts on, implementing the ability to attach to a raw socket or to attach XDP programs. I didn't see any related past issues, but was wondering whether there have been attempts at implementing these previously.
If anything, it may be something we'd like to contribute upstream eventually as well.
Thanks.
init_perf_map returns a PerfMap that can be polled for data. But what if I have two of them? E.g. in tcpretrans there is one for IPv4 and one for IPv6.
In #48 we received a report that the opensnoop example wasn't working. The exact changes in that PR led me to close without merge. At the time, we didn't have tests in TravisCI for this example. It tested working on my machine. #50 attempts to add that example to CI and we see the issue there.
This issue is for trying to track down why this seems to work for some systems but not others and lead to a resolution.
Systems known to be failing:
Fails with "oh no":
TravisCI VM infrastructure
OS Ubuntu Xenial
Kernel 4.15.0-1028-gcp x86_64
Rust 1.37.0
BCC 0.6.0->0.10.0
Fails with Segfault:
VMWare Fusion VM
OS Ubuntu Bionic
Kernel 4.15.0-58-generic x86_64
Rust 1.27.0
BCC 0.10.0
Systems shown to be working:
VMWare Fusion VM
OS Arch
Kernel 5.2.3-arch1-1-ARCH x86_64
Rust 1.36.0 / 1.37.0
BCC 0.10.0
VMWare Fusion VM
OS Debian Buster
Kernel 4.14.128 x86_64
Rust 1.37.0
BCC 0.10.0
Add support for bcc 0.16.0 to this crate.
Blocked by rust-bpf/bcc-sys#51
I encountered an issue when iterating over BPF table entries that resulted in an infinite loop. The loop walks the table, then restarts from the first entry instead of stopping.
The issue arises during the iteration of BPF entries in a table using table.iter(). In the loop, the same entries are processed repeatedly, so the iteration never reaches its expected end.
Faulty code snippet:
let mut bpf_entries: Vec<FilterRules> = Vec::new();
eprintln!("Starting to iterate BPF entries");
for entry in table.iter() {
    eprintln!("Processing a BPF entry");
    unsafe {
        let rule: FilterRules = ptr::read_unaligned(entry.key.as_ptr() as *const _);
        bpf_entries.push(rule);
    }
}
eprintln!("Finished processing BPF entries");
I anticipated that the iterator would process each entry in the BPF table exactly once, eventually reaching the end of the table and exiting the loop. If this is not the expected behavior (as the examples of the library would imply) this issue can simply be closed.
Workaround:
let mut bpf_entries: Vec<FilterRules> = Vec::new();
let mut seen_entries: std::collections::HashSet<Vec<u8>> = std::collections::HashSet::new();
eprintln!("Starting to iterate BPF entries");
for entry in table.iter() {
    eprintln!("Processing a BPF entry");
    unsafe {
        if seen_entries.contains(&entry.key) {
            eprintln!("Duplicate BPF entry found. Exiting loop.");
            break;
        }
        seen_entries.insert(entry.key.clone());
        let rule: FilterRules = ptr::read_unaligned(entry.key.as_ptr() as *const _);
        bpf_entries.push(rule);
    }
}
eprintln!("Finished processing BPF entries");
I would greatly appreciate clarification on whether the described behavior is expected or if it might be a potential bug. Specifically, is the iterator designed to restart from the beginning of the entries after each loop, or should it progress through each entry once and subsequently exit the loop?
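The workaround above can be factored into a small iterator adapter. The following is a minimal sketch using only the standard library; `until_repeat` is a hypothetical helper (not part of rust-bcc), and the cycling `Vec` merely stands in for a table iterator that wraps around.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Stops iterating as soon as a key is seen for the second time.
/// `key_of` extracts the part of each item used to detect repeats.
fn until_repeat<I, K, F>(iter: I, key_of: F) -> impl Iterator<Item = I::Item>
where
    I: Iterator,
    K: Eq + Hash,
    F: Fn(&I::Item) -> K,
{
    let mut seen = HashSet::new();
    // `HashSet::insert` returns false when the key was already present,
    // which ends the `take_while`.
    iter.take_while(move |item| seen.insert(key_of(item)))
}

fn main() {
    // Stand-in for a table iterator that restarts from the beginning forever.
    let keys: Vec<Vec<u8>> = vec![vec![1], vec![2], vec![3]];
    let wrapped = keys.into_iter().cycle();

    let unique: Vec<Vec<u8>> = until_repeat(wrapped, |k| k.clone()).collect();
    println!("{} unique entries", unique.len());
}
```

This only masks the symptom; it does not explain why the underlying iterator restarts.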
With the changes in #43 we now have a minimum repro for a case I've noticed in practical use of this library. In short, it seems there's some issue in the Drop implementations that results in kprobes remaining registered in debugfs after program exit.
This issue is specific to kernels prior to 4.17, where kprobes are registered through debugfs rather than through the perf event ABI. It is unclear if a similar issue exists in newer kernels.
However, for kernels like 4.9 and 4.14 we can easily reproduce this issue using the opensnoop example.
Steps to reproduce:
# the following command should show no registered kprobes
sudo cat /sys/kernel/debug/tracing/kprobe_events
# run the example and ctrl-c after some time, will see EBUSY on `write()`
sudo ./target/release/examples/opensnoop
# show the registered kprobes again, note they persist after program exit
sudo cat /sys/kernel/debug/tracing/kprobe_events
Error messages look like this:
write(-:kprobes/r_do_sys_open_bcc_16070): Device or resource busy
write(-:kprobes/p_do_sys_open_bcc_16070): Device or resource busy
Looking at the kernel source for 4.14.128, I see two cases where EBUSY could be returned:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/kernel/trace/trace_kprobe.c?h=v4.14.128#n524
We need to see if there's more we need to do as part of the Drop implementations to ensure the probes are cleaned up properly.
Add support for bcc 0.12.0
Blocked by: rust-bpf/bcc-sys#23
rust-bcc could add support for User Statically-Defined Tracing (USDT) probes, as the Python bcc wrapper does.
More on USDT probes:
First, bindings in bcc-sys are needed (see rust-bpf/bcc-sys#62).
Segfault for examples when running in a docker alpine:3.12 image.
The host OS is CentOS 7.8 (3.10.0-1127.19.1.el7.x86_64), and the Python bcc-tools run normally both on the host and in the container. rust-bcc's examples also run normally on the host.
running successfully
segfault happened on bpf_module_create_c_from_string
from dmesg
docker run --rm -it --privileged -v /lib/modules:/lib/modules:ro -v /sys:/sys:ro -v /usr/src:/usr/src:ro alpine:3.12
apk add bcc-tools bcc-dev
git clone https://github.com/rust-bpf/rust-bcc
cd rust-bcc
cargo build --example opensnoop
cargo run --example opensnoop
Table::new() can cause a segfault in safe code from dereferencing an arbitrary pointer. To illustrate:
/*
[dependencies]
bcc = "=0.0.32"
*/
use bcc::table::Table;
use std::os::raw::c_void;
fn main() {
let p = &mut 0usize as *mut _ as *mut c_void;
let mut table = Table::new(0, p);
println!("{}", table.key_size());
}
The function should probably be marked unsafe, since the other methods in Table depend on p being a valid BPF module pointer. Alternatively, it should be private (or pub(crate)), since it's really only useful in BPF::table().
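As an illustration of the first suggestion, here is a minimal sketch of an unsafe constructor with a documented safety contract. The `Table` type below is a hypothetical stand-in with made-up fields, not the crate's actual definition, and the sketch never dereferences the pointer.

```rust
use std::os::raw::c_void;

/// Hypothetical stand-in for the crate's table handle.
pub struct Table {
    id: usize,
    module: *mut c_void,
}

impl Table {
    /// # Safety
    /// `module` must point to a live BPF module for as long as the
    /// returned `Table` is used.
    pub unsafe fn new(id: usize, module: *mut c_void) -> Table {
        Table { id, module }
    }

    pub fn id(&self) -> usize {
        self.id
    }

    pub fn module_ptr(&self) -> *mut c_void {
        self.module
    }
}

fn main() {
    let mut fake_module = 0usize;
    let p = &mut fake_module as *mut usize as *mut c_void;
    // The caller now has to acknowledge the pointer contract explicitly.
    // This sketch never dereferences `p`, so the fake pointer is harmless here.
    let table = unsafe { Table::new(0, p) };
    println!("table id = {}, module = {:p}", table.id(), table.module_ptr());
}
```

With this shape, the program from the report would no longer compile without an `unsafe` block, which moves the responsibility for the pointer's validity to the caller.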
64-bit word size is hardcoded (let mut cpu_bytes: [u8; 8] = cpu.to_ne_bytes(); in src/core/perf_event_array/v0_4_0.rs:70:38).
Is there interest in also supporting 32-bit architectures, such as the Raspberry Pi in 32-bit (armhf) mode?
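One way to avoid hardcoding the width is to let the array length follow the target's `usize`. This is a minimal sketch with a hypothetical helper name, `cpu_key_bytes`; whether the map key should instead be a fixed-width integer is a separate question that this sketch does not answer.

```rust
use std::mem::size_of;

/// Native-endian bytes of a CPU index, sized for the target's `usize`
/// (8 bytes on 64-bit targets, 4 bytes on 32-bit targets such as armhf).
fn cpu_key_bytes(cpu: usize) -> [u8; size_of::<usize>()] {
    cpu.to_ne_bytes()
}

fn main() {
    let bytes = cpu_key_bytes(3);
    println!("{} bytes: {:?}", bytes.len(), bytes);
}
```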
A proposal by which we can support using rust-bcc with various versions of the bcc libraries.
The bcc library versions may vary based on the environment in which an application is running. Currently rust-bcc makes use of bcc-sys 0.4.0, which corresponds to bcc 0.4.0. The newest published version, bcc-sys 0.6.1, corresponds to bcc 0.5.0. Operating systems such as Arch and Debian Sid have bcc 0.6.x, which does not seem to be supported in the current bcc-sys crate.
We wish for rust-bcc to be usable across a number of different OSes and therefore across different versions of the bcc libraries. This RFC proposes a method by which we can achieve this goal.
Users specify a feature as part of their cargo manifest if they wish to use a bcc version other than the currently supported 0.4.0
The rust-bcc API shall remain consistent with prior behavior; as such, users who are running rust-bcc in combination with bcc 0.4.0 shall see no difference in usage
For this RFC it is proposed that:
The bcc-sys dependency is dropped in favor of build-time generation of bindings to the bcc libraries. This decouples our support of newer bcc versions from the release train of the bcc-sys library and ensures we are using bindings that match the bcc libraries present in the build environment
features are defined in the cargo manifest to match bcc versions, e.g. feature 0.4.0 will conditionally compile the rust-bcc library in a manner which supports bcc 0.4.0. This enables us to transparently handle breaking changes in the bcc library's API.
This will increase the complexity of this library by requiring us to handle version-specific details as we take on support of additional bcc library versions. There will likely be some code duplication as well, as there have been changes between 0.4.0 -> 0.6.1 in terms of the function signatures and return types, e.g. a file descriptor is returned from attach_uprobe instead of a pointer, therefore our Uprobe struct would store the file handle instead of the pointer.
Given multiple bcc versions which may have incompatible APIs, we need a way to match rust-bcc's behavior to the system bcc version. Conditional compilation via feature flags would enable the users of rust-bcc to match our library's behavior to their system bcc version.
Users with unsupported versions of the bcc libraries will not be able to use rust-bcc. This could result in fragmentation and competing implementations of our library's functionality.
None known
None known
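The feature-flag approach proposed in the RFC above could look roughly like the sketch below. The feature name `v0_6_0`, the module layout, and the point at which the uprobe handle changes from a pointer to a file descriptor are all assumptions made for illustration; the RFC itself does not pin these down.

```rust
/// Selected when the (hypothetical) `v0_6_0` cargo feature is enabled.
#[cfg(feature = "v0_6_0")]
mod uprobe {
    /// Newer API: attaching returns a file descriptor.
    pub struct Uprobe {
        pub fd: i32,
    }
    pub const BCC_API: &str = "0.6.0";
}

/// Default path, matching the currently supported bcc 0.4.0.
#[cfg(not(feature = "v0_6_0"))]
mod uprobe {
    use std::os::raw::c_void;
    /// Older API: attaching returns an opaque pointer.
    pub struct Uprobe {
        pub ptr: *mut c_void,
    }
    pub const BCC_API: &str = "0.4.0";
}

/// Reports which bcc API this build was compiled against.
pub fn bcc_api() -> &'static str {
    uprobe::BCC_API
}

fn main() {
    println!("compiled for bcc {}", bcc_api());
    println!("Uprobe handle is {} bytes", std::mem::size_of::<uprobe::Uprobe>());
}
```

Built without any feature, this reports the 0.4.0 path, so existing users would see no difference; a user on a newer system would opt in through the feature list of the dependency in their cargo manifest.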
Hi,
I'm working on an attempt to recreate tcpconnect.py. In part this is to help me teach myself rust and eBPF at the same time.
The "count" mode is something I'm having a bit of difficulty porting at the moment but I can get the other functionality of tcpconnect working.
Would rust-bcc accept partial implementations?
unresolved imports bcc::trace_parse, bcc::trace_read: no trace_parse in the root
briefly answer these questions:
what did you expect?
bcc::trace_parse can be accessed as examples/hello_bpf.rs did
what actually happened?
bcc::trace_parse cannot be accessed
what steps can we take to reproduce the behavior you saw?
bcc = "0.0.32", copy code from examples/hello_bpf.rs to main.rs, and run cargo run
Here https://github.com/rust-bpf/rust-bcc/blob/master/examples/tcpretrans.rs#L107
I noticed that it parses the address as 11.249.50.92 instead of 92.50.249.11.
Ipv4Addr::from(event.daddr.swap_bytes()) works.
thanks.
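For context, `Ipv4Addr::from(u32)` treats the most significant byte of the integer as the first octet. If the field holds network-order bytes reinterpreted as a native `u32` (an assumption consistent with the report above), the octets come out reversed on a little-endian host. A minimal sketch of a conversion that also works on big-endian hosts, using the hypothetical helper name `addr_from_network_order`:

```rust
use std::net::Ipv4Addr;

/// `raw` holds the address exactly as copied from the event struct,
/// i.e. network-order bytes reinterpreted as a native `u32`.
fn addr_from_network_order(raw: u32) -> Ipv4Addr {
    // `u32::from_be` swaps bytes on little-endian hosts and is a no-op
    // on big-endian hosts.
    Ipv4Addr::from(u32::from_be(raw))
}

fn main() {
    // The four bytes of 92.50.249.11 as they would sit in memory.
    let raw = u32::from_ne_bytes([92, 50, 249, 11]);
    println!("{}", addr_from_network_order(raw));
}
```

On a little-endian machine this is equivalent to the `swap_bytes()` fix mentioned above.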
Any thoughts on an implementation of get_syscall_fnname to go along with the use of ksysname? Or is there currently a way to work around this? Thank you, and sorry for my lack of knowledge.
https://github.com/iovisor/bcc/blob/master/tools/statsnoop.py#L116
Has a release been made for bcc 0.17.0 support yet? I saw #161 was merged recently, so it should be picked up in the next release.
briefly answer these questions:
fn main() -> Result<()> {
    let code = include_str!("netif_receive_skb.c").to_string();
    //let code = include_str!("dev_queue_xmit.c").to_string();
    let mut bpf = BPF::new(&code)?;
    Kprobe::new()
        .handler("kprobe____netif_receive_skb")
        .function("netif_receive_skb")
        .attach(&mut bpf)?;
    //Kprobe::new().handler("kprobe____dev_queue_xmit").function("dev_queue_xmit").attach(&mut bpf)?;
    let table = bpf.table("ipv4_send_bytes")?;
    println!("GOT TABLE");
    loop {
        let sleeptime = time::Duration::from_millis(1000);
        thread::sleep(sleeptime);
        for entry in table.iter() {
            // parse key struct (pair source addr -> dest addr)
            let key = parse_ipv4_struct(&entry.key);
            println!(
                "Received from {} to {} that converts to {} ",
                Ipv4Addr::from(key.saddr),
                Ipv4Addr::from(key.daddr),
                get_uint64(entry.value.clone())
            );
        }
    }
}
This is the Python code I am running, which shows the actual values. I don't understand why this is happening.
b = BPF(src_file = "netif_receive_skb.c")

def get_ipv4_session_key(k):
    return TCPSessionKey(laddr=inet_ntop(AF_INET, pack("I", k.saddr)),
                         daddr=inet_ntop(AF_INET, pack("I", k.daddr)))

# header
print("Tracing... Hit Ctrl-C to end.")

# output
do_exit = 0
ipv4_send_bytes = b["ipv4_send_bytes"]
while (1):
    ipv4_throughput = {}
    for k, v in ipv4_send_bytes.items():
        key = get_ipv4_session_key(k)
        ipv4_throughput[key] = v.value
    #ipv4_send_bytes.clear()
    for k, send_bytes in ipv4_throughput.items():
        #print(f"k {k} | snd {send_bytes}")
        print("%-21s %-21s %-6s" % (k.laddr, k.daddr, int(send_bytes)))
    try:
        sleep(0.05)
    except KeyboardInterrupt:
        pass; do_exit = 1
    if do_exit:
        exit()
Best Regards.
Raw string replacement costs a lot, so I wonder if there is a better way to minimize this kind of overhead.
Running bpf_table.iter() on an empty table returns an iterator with 1 entry.
briefly answer these questions:
what did you expect?
table.iter() on an empty table should return an empty iterator
what actually happened?
table.iter(): my loop ran once with an entry which was empty.
what steps can we take to reproduce the behavior you saw?
BPF_HASH(table, u32)
We should release 0.0.12 for the bugfix in #59
Looking for functionality to attach probes with pre-compiled .elf files.
I've been looking through the repo, and I'm not sure if this feature is supported (being a newcomer to Rust). But I'm looking to take pre-compiled BPF binaries in the form of ".o" or ".elf" or whatever and attach the probes without compiling. I'm looking at steps like these:
let mut module = BPF::new(code)?;
let uprobe_code = module.load_uprobe("count")?;
module.attach_uprobe(
"/lib/x86_64-linux-gnu/libc.so.6",
"strlen",
uprobe_code,
-1, /* all PIDs */
)?;
And am wondering if said .elf file could be substituted for uprobe_code.
If it doesn't have this functionality, I'd have to poke around the library a bit more first (and brush up a bit more on my Rust), but I might be willing to develop that if it doesn't seem like it would conflict with the overall project. Otherwise I may make a fork just for that specific use-case.
Thanks for the work you've done!
The key and value come from an on-stack Vec<u8>. The Vec may not be aligned for the key and value types.
Line 160 in c4e5ef8
Using std::ptr::read on these misaligned pointers may cause trouble. Should it use std::ptr::read_unaligned instead?
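To make the concern concrete: a `Vec<u8>` buffer is only guaranteed byte alignment, while `std::ptr::read` requires the pointer to be aligned for the target type. Below is a minimal sketch of the unaligned read; the `Key` struct and `read_key` helper are hypothetical and only stand in for whatever key type an example decodes.

```rust
use std::mem::size_of;
use std::ptr;

/// Hypothetical key layout shared with the BPF program.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
struct Key {
    pid: u32,
    cpu: u32,
}

/// Decodes a `Key` from raw table bytes without assuming any alignment.
fn read_key(bytes: &[u8]) -> Option<Key> {
    if bytes.len() < size_of::<Key>() {
        return None;
    }
    // SAFETY: the length was checked above, `Key` consists of plain integers
    // (every bit pattern is valid), and `read_unaligned` has no alignment
    // requirement on the source pointer.
    Some(unsafe { ptr::read_unaligned(bytes.as_ptr() as *const Key) })
}

fn main() {
    // One leading byte forces the key to start at an odd offset.
    let mut buf = vec![0u8];
    buf.extend_from_slice(&7u32.to_ne_bytes());
    buf.extend_from_slice(&3u32.to_ne_bytes());

    println!("{:?}", read_key(&buf[1..]));
}
```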
rust-bcc/examples/contextswitch.rs
Line 160 in f28fa17
It would be good if you could add an example using attach_tracepoint.
Summary:
I propose we deprecate init_perf_map() now that we have a PerfMapBuilder that gives us more flexibility over PerfMap configuration.
Justification:
Marking the old function as deprecated will guide others to the new builder pattern for configuration without breaking things by providing a compiler warning. After some time, we can then remove the function outright.
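The mechanism referred to here is Rust's `#[deprecated]` attribute. The sketch below shows the pattern with simplified, hypothetical stand-ins: the real `PerfMap`, `PerfMapBuilder`, and `init_perf_map` take a table and a callback, which are omitted, and the default of 8 pages is made up.

```rust
/// Hypothetical, simplified stand-in for the crate's perf map.
pub struct PerfMap {
    page_count: usize,
}

impl PerfMap {
    pub fn page_count(&self) -> usize {
        self.page_count
    }
}

/// Hypothetical, simplified stand-in for the builder.
pub struct PerfMapBuilder {
    page_count: usize,
}

impl PerfMapBuilder {
    pub fn new() -> Self {
        // Made-up default, for illustration only.
        PerfMapBuilder { page_count: 8 }
    }

    pub fn page_count(mut self, n: usize) -> Self {
        self.page_count = n;
        self
    }

    pub fn build(self) -> PerfMap {
        PerfMap { page_count: self.page_count }
    }
}

/// Old entry point, kept working but flagged at every call site.
#[deprecated(note = "use PerfMapBuilder instead")]
pub fn init_perf_map() -> PerfMap {
    PerfMapBuilder::new().build()
}

fn main() {
    // Without this `allow`, the call compiles but emits a deprecation warning.
    #[allow(deprecated)]
    let old = init_perf_map();
    let new = PerfMapBuilder::new().page_count(16).build();
    println!("old: {} pages, new: {} pages", old.page_count(), new.page_count());
}
```

Existing callers keep compiling and only see a warning that points them at the builder, which matches the gradual migration described above.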
Open Questions:
I'm getting an immediate segfault when I try to run any of the three examples, and I haven't made any modifications to the codebase. I'm using the latest version.
I can run the BCC python examples without any issues, so I think my BCC install, kernel version, and kernel configuration should be okay. I'm trying this on a pretty normal Thinkpad laptop running Arch Linux. I'm running these with something like sudo -E cargo run --release --example opensnoop. I tried both debug and release modes.
When I attach a debugger I see that the segfault appears to be at bcc::core::BPF::load mod.rs:174. I've attached a dump of the debugger state after catching the segfault.
Let me know what other information would be helpful to debug!
__strlen_avx2 0x00007f50da8f8715
<unknown> 0x00007f50dad3b38a
bpf_prog_load_xattr 0x00007f50dad3d12d
bpf_prog_load 0x00007f50dad3d341
bcc::core::BPF::load mod.rs:174
bcc::core::BPF::load_kprobe mod.rs:65
opensnoop::do_main opensnoop.rs:39
opensnoop::main opensnoop.rs:80
std::rt::lang_start::{{closure}} rt.rs:74
std::rt::lang_start_internal::{{closure}} rt.rs:59
std::panicking::try::do_call panicking.rs:310
__rust_maybe_catch_panic lib.rs:102
std::panicking::try panicking.rs:289
std::panic::catch_unwind panic.rs:398
std::rt::lang_start_internal rt.rs:58
std::rt::lang_start rt.rs:74
main 0x000055f93e7f554a
__libc_start_main 0x00007f50da7bd223
_start 0x000055f93e7ee16e
Signal = SIGSEGV (Segmentation fault)
log_buf = {alloc::vec::Vec<u8>}
version = {u32} 327687
license = {i8 * | 0x7f50dcd95b90} "GPL"
*license = {i8} 71
size = {i32} 664
start = {bcc_sys::bccapi::v0_8_0::bpf_insn *}
code = {u8} 191
_bitfield_1 = {bcc_sys::bccapi::v0_8_0::__BindgenBitfieldUnit<[u8; 1], u8>}
storage = {[u8; 1]}
[0] = {u8} 22
align = {[u8; 0]}
off = {i16} 0
imm = {i32} 0
cname = {std::ffi::c_str::CString}
inner = {alloc::boxed::Box<[u8]>}
data_ptr = {u8 * | 0x55f940b90a70} "trace_return\000"
*data_ptr = {u8} 116
length = {usize} 13
self = {bcc::core::BPF *}
p = {core::ffi::c_void * | 0x55f93f8770c0} 0x55f93f8770c0
*p = {core::ffi::c_void} (unknown: 2)
kprobes = {std::collections::hash::set::HashSet<bcc::core::kprobe::v0_6_0::Kprobe, std::collections::hash::map::RandomState>}
map = {std::collections::hash::map::HashMap<bcc::core::kprobe::v0_6_0::Kprobe, (), std::collections::hash::map::RandomState>}
hash_builder = {std::collections::hash::map::RandomState}
k0 = {u64} 18373245324916279028
k1 = {u64} 6511614963522513460
table = {std::collections::hash::table::RawTable<bcc::core::kprobe::v0_6_0::Kprobe, ()>}
capacity_mask = {usize} 18446744073709551615
size = {usize} 0
hashes = {std::collections::hash::table::TaggedHashUintPtr}
0 = {core::ptr::Unique<usize>}
pointer = {core::nonzero::NonZero<*const usize>}
0 = {usize * | 0x1} 0x1
*0 = {usize}
_marker = {core::marker::PhantomData<usize>}
marker = {core::marker::PhantomData<(bcc::core::kprobe::v0_6_0::Kprobe, ())>}
resize_policy = {std::collections::hash::map::DefaultResizePolicy}
uprobes = {std::collections::hash::set::HashSet<bcc::core::uprobe::v0_6_0::Uprobe, std::collections::hash::map::RandomState>}
map = {std::collections::hash::map::HashMap<bcc::core::uprobe::v0_6_0::Uprobe, (), std::collections::hash::map::RandomState>}
hash_builder = {std::collections::hash::map::RandomState}
k0 = {u64} 18373245324916279027
k1 = {u64} 6511614963522513460
table = {std::collections::hash::table::RawTable<bcc::core::uprobe::v0_6_0::Uprobe, ()>}
capacity_mask = {usize} 18446744073709551615
size = {usize} 0
hashes = {std::collections::hash::table::TaggedHashUintPtr}
0 = {core::ptr::Unique<usize>}
pointer = {core::nonzero::NonZero<*const usize>}
0 = {usize * | 0x1} 0x1
*0 = {usize}
_marker = {core::marker::PhantomData<usize>}
marker = {core::marker::PhantomData<(bcc::core::uprobe::v0_6_0::Uprobe, ())>}
resize_policy = {std::collections::hash::map::DefaultResizePolicy}
tracepoints = {std::collections::hash::set::HashSet<bcc::core::tracepoint::v0_6_0::Tracepoint, std::collections::hash::map::RandomState>}
map = {std::collections::hash::map::HashMap<bcc::core::tracepoint::v0_6_0::Tracepoint, (), std::collections::hash::map::RandomState>}
hash_builder = {std::collections::hash::map::RandomState}
k0 = {u64} 18373245324916279029
k1 = {u64} 6511614963522513460
table = {std::collections::hash::table::RawTable<bcc::core::tracepoint::v0_6_0::Tracepoint, ()>}
capacity_mask = {usize} 18446744073709551615
size = {usize} 0
hashes = {std::collections::hash::table::TaggedHashUintPtr}
0 = {core::ptr::Unique<usize>}
pointer = {core::nonzero::NonZero<*const usize>}
0 = {usize * | 0x1} 0x1
*0 = {usize}
_marker = {core::marker::PhantomData<usize>}
marker = {core::marker::PhantomData<(bcc::core::tracepoint::v0_6_0::Tracepoint, ())>}
resize_policy = {std::collections::hash::map::DefaultResizePolicy}
name = {&str} "trace_return"
prog_type = {u32} 2
log_level = {i32} 0
log_size = {u32} 0
We have some recent changes that have been merged. Opening this to track releasing 0.0.27