Comments (9)
I have been playing around with this a bit. The flaky behavior seems to originate in the kernel's WakeupEvents logic. I have not looked into the kernel code yet, but the current test fails from time to time unless I always send WakeupEvents + 1 events, in which case it consistently passes.
	// send follow-up events
	for i := 1; i < numEvents+1; i++ {
		_, _, err = prog.Test(internal.EmptyBPFContext)
		if err != nil {
			t.Fatal(err)
		}
	}
So perhaps this has to do with memory alignment of the map or something like that. I have tried varying numEvents and sampleSize, but changes there don't seem to make any difference.
from ebpf.
I think I found the cause. The WakeupEvents limit is per ring, and there is one ring per CPU. When we execute BPF_PROG_RUN multiple times, we sometimes write the 2 messages to different rings. If I log the CPU ID of the first and the follow-up events, I see:
=== RUN TestPerfReaderWakeupEvents
ret 7
ret 7
--- PASS: TestPerfReaderWakeupEvents (0.01s)
=== RUN TestPerfReaderWakeupEvents
ret 7
ret 7
--- PASS: TestPerfReaderWakeupEvents (0.01s)
=== RUN TestPerfReaderWakeupEvents
ret 7
ret 7
--- PASS: TestPerfReaderWakeupEvents (0.01s)
=== RUN TestPerfReaderWakeupEvents
ret 7
ret 7
--- PASS: TestPerfReaderWakeupEvents (0.01s)
=== RUN TestPerfReaderWakeupEvents
ret 7
ret 0
panic: test timed out after 1s
The numbers change from run to run, and it seems to be pure luck that the +1 I mentioned earlier happens to land on the same CPU as one of the ones before.
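To make the per-ring behavior described above concrete, here is a minimal sketch of a simplified model. It is not kernel or library code: the function name anyRingWakes and the counting rule (a ring wakes the reader only once that ring alone has received WakeupEvents samples) are assumptions used for illustration.

```go
package main

import "fmt"

// anyRingWakes is a hypothetical, simplified model: every CPU owns one ring
// with its own counter, and a wakeup happens only when a single ring has
// received wakeupEvents samples. cpuPerEvent lists the CPU each event ran on.
func anyRingWakes(cpuPerEvent []int, wakeupEvents int) bool {
	perRing := make(map[int]int)
	for _, cpu := range cpuPerEvent {
		perRing[cpu]++
		if perRing[cpu] >= wakeupEvents {
			return true
		}
	}
	return false
}

func main() {
	// Both events on CPU 7, as in the passing runs.
	fmt.Println(anyRingWakes([]int{7, 7}, 2)) // prints "true"
	// Events split across CPU 7 and CPU 0, as in the run that timed out.
	fmt.Println(anyRingWakes([]int{7, 0}, 2)) // prints "false"
}
```

Under this model the "ret 7 / ret 0" run never reaches the limit on either ring, which would match the observed timeout.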
A potential fix would be to add the following to the start of the test:
import extUnix "golang.org/x/sys/unix"
...
func TestPerfReaderWakeupEvents(t *testing.T) {
	// Lock goroutine to thread
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()
	// Save CPU affinity
	var set extUnix.CPUSet
	err := extUnix.SchedGetaffinity(0, &set)
	qt.Assert(t, qt.IsNil(err))
	// Schedule test to run on only CPU 0 (CPUSet{1} sets bit 0 of the mask)
	err = extUnix.SchedSetaffinity(0, &extUnix.CPUSet{1})
	qt.Assert(t, qt.IsNil(err))
	// Restore CPU affinity
	defer extUnix.SchedSetaffinity(0, &set)
Perhaps there are other alternatives (this doesn't win any beauty awards).
Could we send numCPUs * WakeupEvents events to ensure that at least one CPU gets woken up?
Yeah, that should also work, but I don't know if that defeats the purpose of the test. In my case you would be enqueuing 16 events to test a 2-event limit.
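As a rough check of the arithmetic, the sketch below compares the suggested numCPUs * WakeupEvents with the pigeonhole minimum. It assumes the same simplified per-ring counting model (one ring per CPU, each with its own counter). The helper name minEventsForWakeup is hypothetical, and the 8-CPU figure is an assumption inferred from "16 events to test a 2 event limit".

```go
package main

import "fmt"

// minEventsForWakeup is a hypothetical helper. By the pigeonhole principle,
// numCPUs*(wakeupEvents-1) events can still leave every ring one short of
// the limit, so one more event guarantees that some ring reaches it.
func minEventsForWakeup(numCPUs, wakeupEvents int) int {
	return numCPUs*(wakeupEvents-1) + 1
}

func main() {
	numCPUs, wakeupEvents := 8, 2 // assumed machine: 8 CPUs, limit of 2

	fmt.Println(numCPUs * wakeupEvents)                    // prints "16"
	fmt.Println(minEventsForWakeup(numCPUs, wakeupEvents)) // prints "9"
}
```

Either count is far above the limit being tested, so the concern about defeating the purpose of the test applies to both.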
The test was more for making sure it didn't wake up after 1 event.
I'm not sure we can control the CPU the eBPF program actually runs on by controlling the affinity of the userspace program.
> I'm not sure we can control the CPU the eBPF program actually runs on by controlling the affinity of the userspace program.
I tested the code I showed, and it seems to work, at least locally. By default the BPF program executes on the CPU making the syscall, although that isn't official and so not guaranteed.
Program.Run also has a parameter to pick a CPU to run on, but looking at the kernel, it only works for raw tracepoint programs. If we can change the program type for our sample prog, that might be an option. (torvalds/linux@1b4d60e)
> it only works for raw tracepoint programs
That would constrain what kernel versions we can test on though.
I'd be fine with both solutions. I remember that we have the same problem (samples submitted on the "wrong" CPU) in other places as well. Maybe we could reuse the user space code.
I think it's also fine to constrain this to a smaller number of kernel versions: we're testing that the plumbing we have ~ works. We don't need to / want to assert that the kernel isn't doing dodgy things (as we'd never see the end of it 😆 ).