Comments (8)
I tried the Fedora Kconfig and it appears to use a dummy console. If I add CONFIG_NO_HZ=y
to my regular config then I can boot, but if I select CONFIG_NO_HZ_IDLE=y
then I get a hang. Do you have a Kconfig that manifests your error in QEMU simulation (ie, without a PCIe graphics card)? Walking all the way across the lab to check if the machine boots is a headache :).
from riscv-linux.
For me, it booted but hangs after
"Starting logging: OK
Starting mdev...
sort: /sys/devices/platform/Fixed: No such file or directory
modprobe: can't change directory to '/lib/modules': No such file or directory
Initializing random number generator... done.
Starting network...
udhcpc (v1.24.2) started
Sending discover...
Sending select for 10.0.2.15...
Lease of 10.0.2.15 obtained, lease time 86400
deleting routers
adding dns 10.0.2.3
Starting dropbear sshd: OK"
Are you seeing the same hang ? I am seeing same hang for both
CONFIG_NO_HZ=y
or
CONFIG_NO_HZ=y
CONFIG_NO_HZ_IDLE=y
CONFIG_NO_HZ_COMMON=y
However, it hangs during the boot for spike/smp combination (for both configuration).
[ 0.020000] clocksource: Switched to clocksource riscv_clocksource
[ 0.020000] NET: Registered protocol family 2
[ 0.020000] TCP established hash table entries: 16384 (order: 5, 131072 bytes)
[ 0.020000] TCP bind hash table entries: 16384 (order: 6, 262144 bytes)
[ 0.020000] TCP: Hash tables configured (established 16384 bind 16384)
[ 0.020000] UDP hash table entries: 1024 (order: 3, 32768 bytes)
[ 0.020000] UDP-Lite hash table entries: 1024 (order: 3, 32768 bytes)
[ 0.030000] NET: Registered protocol family 1
[ 0.050000] Unpacking initramfs...
[ 0.080000] Initialise system trusted keyrings
[ 0.080000] workingset: timestamp_bits=62 max_order=19 bucket_order=0
[ 21.020000] INFO: rcu_sched self-detected stall on CPU
[ 21.020000] INFO: rcu_sched self-detected stall on CPU
[ 21.020000] 3-...!: (698636 ticks this GP) idle=2b6/2/0 softirq=1/1 fqs=0
[ 21.020000] (t=2100 jiffies g=-299 c=-300 q=1376)
[ 21.020000] rcu_sched kthread starved for 2100 jiffies! g18446744073709551317 c18446744073709551316 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=1
[ 21.020000] rcu_sched I 0 8 2 0x00000000
[ 21.020000] Call Trace:
[ 21.020000] [< (ptrval)>] __schedule+0x1c6/0x4ea
[ 21.020000] [< (ptrval)>] schedule+0x16/0x58
[ 21.020000] [< (ptrval)>] schedule_timeout+0x12a/0x248
[ 21.020000] [< (ptrval)>] rcu_gp_kthread+0x31e/0x6cc
[ 21.020000] [< (ptrval)>] kthread+0xcc/0xde
[ 21.020000] [< (ptrval)>] ret_from_syscall+0xa/0xe
[ 21.020000] Task dump for CPU 2:
[ 21.020000] swapper/2 R running task 0 0 1 0x00000008
[ 21.020000] Call Trace:
[ 21.020000] [< (ptrval)>] __schedule+0x1c6/0x4ea
[ 21.020000] Task dump for CPU 3:
[ 21.020000] swapper/3 R running task 0 0 1 0x00000008
[ 21.020000] Call Trace:
[ 21.020000] [< (ptrval)>] walk_stackframe+0x0/0xa2
[ 21.020000] [< (ptrval)>] show_stack+0x26/0x34
[ 21.020000] [< (ptrval)>] sched_show_task+0xa6/0xfc
[ 21.020000] [< (ptrval)>] dump_cpu_task+0x3a/0x48
[ 21.020000] [< (ptrval)>] rcu_dump_cpu_stacks+0x7a/0xb8
[ 21.020000] [< (ptrval)>] rcu_check_callbacks+0x4a0/0x660
[ 21.020000] [< (ptrval)>] update_process_times+0x1e/0x48
[ 21.020000] [< (ptrval)>] tick_nohz_handler+0x7c/0xd2
[ 21.020000] [< (ptrval)>] riscv_timer_interrupt+0x26/0x32
[ 21.020000] [< (ptrval)>] riscv_intc_irq+0xaa/0xd2
[ 21.020000] [< (ptrval)>] ret_from_syscall+0xa/0xe
[ 21.030000] 2-...!: (698635 ticks this GP) idle=31a/2/0 softirq=1/1 fqs=0
[ 21.040000] (t=2102 jiffies g=-299 c=-300 q=1376)
[ 21.040000] rcu_sched kthread starved for 2102 jiffies! g18446744073709551317 c18446744073709551316 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=3
[ 21.040000] rcu_sched R running task 0 8 2 0x00000000
I am running with fairly latest linux
commit 8fa96b8
Author: Palmer Dabbelt [email protected]
Date: Wed Feb 7 18:23:09 2018 -0800
irq-riscv-intc post-merge cleanups
Should I update anything else to verify?
from riscv-linux.
The difference is probably just my working directory being a mess, but I can reproduce some failures with
$ git diff
diff --git a/Makefile b/Makefile
index 408a4247fe64..6375cab35974 100644
--- a/Makefile
+++ b/Makefile
@@ -130,6 +130,7 @@ endif
$(vmlinux): $(linux_srcdir) $(linux_wrkdir)/.config $(buildroot_initramfs_sysroot_stamp)
$(MAKE) -C $< O=$(linux_wrkdir) \
+ CROSS_COMPILE=riscv64-unknown-linux-gnu- \
CONFIG_INITRAMFS_SOURCE="$(confdir)/initramfs.txt $(buildroot_initramfs_sysroot)" \
CONFIG_INITRAMFS_ROOT_UID=$(shell id -u) \
CONFIG_INITRAMFS_ROOT_GID=$(shell id -g) \
diff --git a/conf/linux_defconfig b/conf/linux_defconfig
index eff1c1731aa8..e0aac738869a 100644
--- a/conf/linux_defconfig
+++ b/conf/linux_defconfig
@@ -83,3 +83,6 @@ CONFIG_MTD=y
CONFIG_MTD_SPI_NOR=y
CONFIG_MTD_M25P80=y
CONFIG_MTD_BLOCK_RO=y
+CONFIG_NO_HZ=y
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_COMMON=y
diff --git a/linux b/linux
index 8fa96b887786..632fc702e0ef 160000
--- a/linux
+++ b/linux
@@ -1 +1 @@
-Subproject commit 8fa96b8877861fbcc7340f8f0a975569da254d93
+Subproject commit 632fc702e0efd519bf75f55c7f85ac7300e08dab
diff --git a/riscv-gnu-toolchain b/riscv-gnu-toolchain
--- a/riscv-gnu-toolchain
+++ b/riscv-gnu-toolchain
@@ -1 +1 @@
-Subproject commit b4dae89f85bf882852c6186b1284df11065bfcd9
+Subproject commit b4dae89f85bf882852c6186b1284df11065bfcd9-dirty
diff --git a/riscv-qemu b/riscv-qemu
index cd2c21d70361..6256f8ac6004 160000
--- a/riscv-qemu
+++ b/riscv-qemu
@@ -1 +1 @@
-Subproject commit cd2c21d70361f6887e265860c4f0dbb49ad323ca
+Subproject commit 6256f8ac6004677cc4652b1fcd77c839dfc05015-dirty
so I'm going to go figure out what's going on.
from riscv-linux.
@atishp04 is going to take a look, thanks!
from riscv-linux.
Just a quick update. I am able to get repeated stall warnings in qemu setup as well (with smp).
[ 21.020000] INFO: rcu_sched self-detected stall on CPU
[ 21.020000] INFO: rcu_sched self-detected stall on CPU
[ 21.020000] 3-...!: (698636 ticks this GP) idle=2b6/2/0 softirq=1/1 fqs=0
[ 21.020000] (t=2100 jiffies g=-299 c=-300 q=1376)
[ 21.020000] rcu_sched kthread starved for 2100 jiffies! g18446744073709551317 c18446744073709551316 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=1
[ 21.020000] rcu_sched I 0 8 2 0x00000000
[ 21.020000] Call Trace:
[ 21.020000] [< (ptrval)>] __schedule+0x1c6/0x4ea
[ 21.020000] [< (ptrval)>] schedule+0x16/0x58
[ 21.020000] [< (ptrval)>] schedule_timeout+0x12a/0x248
[ 21.020000] [< (ptrval)>] rcu_gp_kthread+0x31e/0x6cc
[ 21.020000] [< (ptrval)>] kthread+0xcc/0xde
[ 21.020000] [< (ptrval)>] ret_from_syscall+0xa/0xe
[ 21.020000] Task dump for CPU 2:
[ 21.020000] swapper/2 R running task 0 0 1 0x00000008
[ 21.020000] Call Trace:
[ 21.020000] [< (ptrval)>] __schedule+0x1c6/0x4ea
[ 21.020000] Task dump for CPU 3:
[ 21.020000] swapper/3 R running task 0 0 1 0x00000008
[ 21.020000] Call Trace:
[ 21.020000] [< (ptrval)>] walk_stackframe+0x0/0xa2
[ 21.020000] [< (ptrval)>] show_stack+0x26/0x34
[ 21.020000] [< (ptrval)>] sched_show_task+0xa6/0xfc
[ 21.020000] [< (ptrval)>] dump_cpu_task+0x3a/0x48
[ 21.020000] [< (ptrval)>] rcu_dump_cpu_stacks+0x7a/0xb8
[ 21.020000] [< (ptrval)>] rcu_check_callbacks+0x4a0/0x660
[ 21.020000] [< (ptrval)>] update_process_times+0x1e/0x48
[ 21.020000] [< (ptrval)>] tick_nohz_handler+0x7c/0xd2
[ 21.020000] [< (ptrval)>] riscv_timer_interrupt+0x26/0x32
[ 21.020000] [< (ptrval)>] riscv_intc_irq+0xaa/0xd2
[ 21.020000] [< (ptrval)>] ret_from_syscall+0xa/0xe
[ 21.030000] 2-...!: (698635 ticks this GP) idle=31a/2/0 softirq=1/1 fqs=0
[ 21.040000] (t=2102 jiffies g=-299 c=-300 q=1376)
[ 21.040000] rcu_sched kthread starved for 2102 jiffies! g18446744073709551317 c18446744073709551316 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=3
[ 21.040000] rcu_sched R running task 0 8 2 0x00000000
[ 21.040000] Call Trace:
[ 21.040000] [< (ptrval)>] __schedule+0x1c6/0x4ea
[ 21.040000] [< (ptrval)>] schedule+0x16/0x58
[ 21.040000] [< (ptrval)>] schedule_timeout+0x12a/0x248
[ 21.040000] [< (ptrval)>] rcu_gp_kthread+0x31e/0x6cc
[ 21.040000] [< (ptrval)>] kthread+0xcc/0xde
[ 21.040000] [< (ptrval)>] ret_from_syscall+0xa/0xe
[ 21.040000] Task dump for CPU 2:
[ 21.040000] swapper/2 R running task 0 0 1 0x00000008
[ 21.040000] Call Trace:
[ 21.040000] [< (ptrval)>] walk_stackframe+0x0/0xa2
[ 21.040000] [< (ptrval)>] show_stack+0x26/0x34
[ 21.040000] [< (ptrval)>] sched_show_task+0xa6/0xfc
[ 21.050000] [< (ptrval)>] dump_cpu_task+0x3a/0x48
[ 21.050000] [< (ptrval)>] rcu_dump_cpu_stacks+0x7a/0xb8
[ 21.050000] [< (ptrval)>] rcu_check_callbacks+0x4a0/0x660
[ 21.050000] [< (ptrval)>] update_process_times+0x1e/0x48
[ 21.050000] [< (ptrval)>] tick_nohz_handler+0x7c/0xd2
[ 21.050000] [< (ptrval)>] riscv_timer_interrupt+0x26/0x32
[ 21.050000] [< (ptrval)>] riscv_intc_irq+0xaa/0xd2
[ 21.050000] [< (ptrval)>] ret_from_syscall+0xa/0xe
[ 21.050000] Task dump for CPU 3:
[ 21.050000] swapper/3 R running task 0 0 1 0x00000008
[ 21.050000] Call Trace:
[ 21.050000] [< (ptrval)>] __schedule+0x1c6/0x4ea
The first one is after 21sec and rest of them keeps appearing after each ~63 secs. Still investigating.
from riscv-linux.
Got some time to look at the issue.
It looks like it's a self detected stall on one or more cpus where rcu_sched kthread couldn't run for more than 21 seconds. All other cpus seems to run the idle thread (swapper task).
arch_idle() function invokes wfi which puts the cpu to until "idle" mode until the next interrupt happens.
In periodic tick, the cpus are woken up by periodic timer tick. However, the no_hz idle case it should be only woken up when there is some task queued i.e. scheduler IPIs.
Here are possible causes:
1.All of the cpus are in idle. Thus nobody is forwarding the jiffies - This is not the case. We can see repeated stalls have correct jiffies.
2. The concerned cpu never returned from WFI after IPI. I am still investigating this cause. Doesn't seem likely because I have qemu logs which suggest the cpus do return from WFI after IPI.
-
The concerned cpu immediately returns from WFI thus looping continuously without giving any other thread chance to run.
-
My initial assumption was something wrong with timer_probe and clock source being initialized for each cpu. But that has been fixed with following patches.
http://lists.infradead.org/pipermail/linux-riscv/2018-April/000510.html
It is difficult to debug as I can't put printk in the arch_idle function as it is called so many times that serial console will hang in the first place. trace_printk is of no help as it is stuck during the boot itself. I am not aware of any method to access the trace buffer unless it's booted completely.
Looking for any ideas or suggestions to debug the problem further.
from riscv-linux.
Add an instruction or CSR to qemu which does printf on a register, then debug by doing csr writes in arch_idle? If you're feeling ambitious implement Liviu's semihosting proposal (already in openocd and should be findable on sw-dev, it's a small change to EBREAK translation)
from riscv-linux.
Fixed in
atishp04@fd1e893
We can close this issue now.
from riscv-linux.
Related Issues (20)
- Build fails with lack of auto-generated headers HOT 6
- what branch or tag is suitable for riscv-qemu HOT 2
- Documentation/riscv HOT 1
- WARNING: CPU: 0 PID: 0 at drivers/irqchip/irq-riscv-intc.c:158 irq_enable+0x32/0x58 HOT 4
- I can't do "make ARCH=riscv". HOT 2
- Asking for help! Error in execution of compiled : $Top/riscv-tools/lib/gcc/riscv64-unknown-elf/7.2.0/../../../../riscv64-unknown-elf/bin/ld:unsupported -shared HOT 3
- linker issue undefined reference to __lshrti3 HOT 9
- patch for PTRACE_SETREGSET bug HOT 4
- Linux Ash Shell issue
- icache not flushed correctly HOT 8
- riscv-linux compile issue in branch riscv-all HOT 1
- Stopping lo interface Oops HOT 2
- unknown architecture riscv. HOT 1
- GPIO & PWM spec for Unleashed HOT 2
- perf_tools support
- What instruction sets do I need to run Linux on my own risc-v CPU? HOT 7
- Any plan for upstreaming DMA32? HOT 2
- Kernel errors on loading modules
- Are there any substantial differences between this repo and upstream? HOT 2
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from riscv-linux.