Comments (26)
@mib1185 Docker Compose... Nothing too strange. My config, as well as system dbus and localtime are bound in volumes.
version: '3'
services:
  homeassistant:
    image: "ghcr.io/home-assistant/home-assistant"
    volumes:
      - /home/pi/ha-config:/config
      - /etc/localtime:/etc/localtime:ro
      - /run/dbus:/run/dbus:ro
      - /srv/homeassistant/quirks/:/srv/homeassistant/quirks/
    restart: unless-stopped
    privileged: true
    network_mode: host
    devices:
      - "/dev/ttyUSB0:/dev/ttyUSB0"
from core.
I attached gdb from my host OS. I do not have debug symbols but I was able to get this:
Thread 1 "hass" received signal SIGBUS, Bus error.
0xf5f6f9f6 in <orjson::serialize::per_type::unicode::StrSerializer as serde::ser::Serialize>::serialize ()
from target:/usr/local/lib/python3.12/site-packages/orjson/orjson.cpython-312-arm-linux-musleabihf.so
from core.
The just-released version 2024.6.1 works and fixed the issue for me. The stable tag can be used again :)
from core.
Do you have any more details, like log messages, command outputs, or anything else helpful?
from core.
Painfully little I can see. It starts, stops and repeats. No obvious error messaging.
homeassistant_1 | s6-rc: info: service s6rc-oneshot-runner: starting
homeassistant_1 | s6-rc: info: service s6rc-oneshot-runner successfully started
homeassistant_1 | s6-rc: info: service fix-attrs: starting
homeassistant_1 | s6-rc: info: service fix-attrs successfully started
homeassistant_1 | s6-rc: info: service legacy-cont-init: starting
homeassistant_1 | s6-rc: info: service legacy-cont-init successfully started
homeassistant_1 | s6-rc: info: service legacy-services: starting
homeassistant_1 | services-up: info: copying legacy longrun home-assistant (no readiness notification)
homeassistant_1 | s6-rc: info: service legacy-services successfully started
homeassistant_1 | [18:58:25] INFO: Home Assistant Core finish process exit code 256
homeassistant_1 | [18:58:25] INFO: Home Assistant Core finish process received signal 7
homeassistant_1 | s6-rc: info: service legacy-services: stopping
homeassistant_1 | s6-rc: info: service legacy-services successfully stopped
homeassistant_1 | s6-rc: info: service legacy-cont-init: stopping
homeassistant_1 | s6-rc: info: service legacy-cont-init successfully stopped
homeassistant_1 | s6-rc: info: service fix-attrs: stopping
homeassistant_1 | s6-rc: info: service fix-attrs successfully stopped
homeassistant_1 | s6-rc: info: service s6rc-oneshot-runner: stopping
homeassistant_1 | s6-rc: info: service s6rc-oneshot-runner successfully stopped
Oh I've just found home-assistant.log.fault. Looks like a python "Bus error"
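For anyone unfamiliar with that file: the name suggests it is written by Python's standard faulthandler module, which dumps Python tracebacks to a file when a fatal signal such as SIGBUS arrives. That is an assumption about how Home Assistant produces it, not something confirmed in this thread. A minimal sketch of the mechanism, using a temporary file as a stand-in for home-assistant.log.fault:

```python
import faulthandler
import tempfile


def capture_traceback_dump() -> str:
    """Write a faulthandler-style traceback dump to a file and return its text."""
    with tempfile.TemporaryFile(mode="w+") as fault_file:
        # With this enabled, a fatal signal (SIGSEGV, SIGBUS, ...) would make
        # the interpreter write all thread tracebacks to fault_file before dying.
        faulthandler.enable(file=fault_file)
        try:
            # Trigger the same kind of dump on demand, without crashing.
            faulthandler.dump_traceback(file=fault_file, all_threads=True)
        finally:
            # Disable before the file is closed so no stale descriptor is kept.
            faulthandler.disable()
        fault_file.seek(0)
        return fault_file.read()


if __name__ == "__main__":
    print(capture_traceback_dump())
```

The dump has the same "most recent call first" layout as the trace posted later in this thread, which is why that trace only shows Python frames and nothing from native extension code.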
from core.
@oliwarner from your initial post I gather you're using the homeassistant Docker image directly? But your log from #118507 (comment) shows some service handling around it, so how exactly do you run HA?
from core.
could you please provide the log of the container itself
from core.
This sounds like an OS-level error, SIGBUS, which killed the Python process. From what I read this can happen in various circumstances, e.g. accessing /dev/mem (do you use RPi GPIOs?), memory-related issues (unaligned access), or potential hardware problems.
What OS are you using? Are there errors showing up in the kernel log (dmesg)?
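As a quick cross-check of the "received signal 7" line in the container log, the signal number can be mapped to its name with Python's standard signal module. Signal numbers are platform-specific, so this should be run on the affected machine (on Linux for ARM and x86, 7 is SIGBUS). The helper name describe_signal is only illustrative:

```python
import signal


def describe_signal(number: int) -> str:
    """Return the symbolic name of a signal number on the current platform."""
    try:
        return signal.Signals(number).name
    except ValueError:
        # The number is not a valid signal on this platform.
        return f"unknown signal {number}"


if __name__ == "__main__":
    # The s6 log in this thread reported "received signal 7".
    print(describe_signal(7))
```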
from core.
Thanks for follow-up @agners
- Raspbian 11 (bullseye)
- No GPIOs but I do passthrough a USB device for zigbee (works fine in stable)
- No errors in dmesg
This is testing with the latest RC (which is somewhat further along than the initial reported one). Reverting back to stable still works.
Am I at the point where I need to start culling my existing config until I find what's wrong? Has there been a major Python environment upgrade in this HA release? Could it be an import error of a plugin that's not 3.12-compatible? I thought we were over that hill already but happy to be corrected.
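If it does come down to culling pieces of the config, one way to narrow it down without the whole process dying each time is to run the suspect operation once per item in a throwaway child interpreter and look for children killed by a signal. The sketch below is generic and makes assumptions: find_crashing_items and WORKER are hypothetical names, the items must be JSON-serializable, and the worker body is a placeholder for whatever operation is under suspicion.

```python
import json
import signal
import subprocess
import sys


def find_crashing_items(items, worker_source):
    """Run worker_source once per item in a fresh interpreter.

    Each item is passed to the child as JSON on stdin. Returns a list of
    (index, signal_name) for every child that was killed by a signal.
    """
    crashed = []
    for index, item in enumerate(items):
        result = subprocess.run(
            [sys.executable, "-c", worker_source],
            input=json.dumps(item).encode(),
            capture_output=True,
        )
        # On POSIX, a negative return code means "killed by signal -returncode".
        if result.returncode < 0:
            crashed.append((index, signal.Signals(-result.returncode).name))
    return crashed


# Hypothetical worker: read one item from stdin and exercise the suspect code.
WORKER = """
import json, sys
item = json.load(sys.stdin)
json.dumps(item)  # placeholder: swap in the operation under suspicion
"""

if __name__ == "__main__":
    print(find_crashing_items([{"example": "value"}], WORKER))
```

Because each item runs in its own process, a bus error only takes down that child, and the parent can report which item triggered it.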
from core.
Raspbian 11 (bullseye)
Is this 32-bit or 64-bit?
FWIW, in my test setup 2024.6.0b5 runs fine on a Raspberry Pi 3 Model B with HAOS 12.3 (32-bit) and Raspberry Pi 3 Model B+ with HAOS 12.3 (64-bit).
What I would try is cleaning the image completely. Sometimes the layers get corrupted in mysterious ways, especially on Raspberry Pis. So make sure to stop and remove the container, clean up/prune all the image layers, and download it again.
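The 32-bit versus 64-bit question can also be checked from inside the container, since the kernel and the container userland do not have to match. The gdb output elsewhere in this thread names orjson.cpython-312-arm-linux-musleabihf.so, which looks like a 32-bit ARM build, though that reading is an inference rather than something confirmed here. A small sketch using only the standard library (describe_runtime is an illustrative name):

```python
import platform
import struct
import sysconfig


def describe_runtime() -> dict:
    """Collect the facts that distinguish kernel architecture from userland."""
    return {
        # Architecture as reported by the kernel (uname), e.g. aarch64.
        "kernel_machine": platform.machine(),
        # Pointer width of this interpreter: 32 or 64.
        "pointer_bits": struct.calcsize("P") * 8,
        # Suffix used for compiled extension modules in this interpreter.
        "ext_suffix": sysconfig.get_config_var("EXT_SUFFIX"),
    }


if __name__ == "__main__":
    for key, value in describe_runtime().items():
        print(f"{key}: {value}")
```

It could be run with something like docker compose run homeassistant python and pasted in. A 64-bit kernel_machine together with pointer_bits of 32 would typically indicate a 32-bit image running on a 64-bit kernel.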
from core.
aarch64. I'm so sorry, I've accidentally misled you: this is a Raspi 4b. I completely forgot I upgraded it. I'll update the title.
I've deleted and re-downloaded the entire image stack. Same behaviour.
from core.
What container image/tag do you use exactly?
from core.
ghcr.io/home-assistant/home-assistant:rc for testing this, ghcr.io/home-assistant/home-assistant normally. (I've just noticed I've posted the non-rc version above - that's what I fall back to when the WAF dips too low and the family want their automatic lights back). My compose file is currently:
services:
  homeassistant:
    image: "ghcr.io/home-assistant/home-assistant:rc"
    volumes:
      - /home/pi/ha-config:/config
      - /etc/localtime:/etc/localtime:ro
      - /run/dbus:/run/dbus:ro
    restart: unless-stopped
    privileged: true
    network_mode: host
    devices:
      - "/dev/ttyUSB0:/dev/ttyUSB0"
- If I run it directly in debug mode with no config supplied (docker compose run homeassistant hass --debug) it starts up. That's obviously not storing anything anywhere and it has no plugins or existing configuration. I'm falling back to the working idea that the problem is a compatibility issue with a configured integration.
- If I run it in --recovery-mode with the right config, it's crashing still.
- I've turned on verbose logging, -v, and I now see INFO output I didn't see before, and the last thing to show (after the last python WARNING) is:
  INFO (MainThread) [homeassistant.helpers.storage] Migrating core.config_entries storage from 1.1 to 1.2
It's still crashing with a bus error. The last trace is full of things happening around the storage helpers, which I didn't appreciate before. Can't be a coincidence, right? The JSON file .storage/core.config_entries parses in Python okay but it's way too big (36k) to manually spot what the problem might be, and it's full of secrets so I can't upload it.
Thread 0xf7e89f24 (most recent call first):
File "/usr/local/lib/python3.12/linecache.py", line 72 in checkcache
File "/usr/local/lib/python3.12/traceback.py", line 434 in _extract_from_extended_frame_gen
File "/usr/local/lib/python3.12/traceback.py", line 395 in extract
File "/usr/local/lib/python3.12/traceback.py", line 232 in extract_stack
File "/usr/local/lib/python3.12/asyncio/base_events.py", line 448 in create_future
File "/usr/local/lib/python3.12/asyncio/futures.py", line 417 in wrap_future
File "/usr/local/lib/python3.12/asyncio/base_events.py", line 860 in run_in_executor
File "/usr/src/homeassistant/homeassistant/core.py", line 876 in async_add_executor_job
File "/usr/src/homeassistant/homeassistant/helpers/storage.py", line 545 in _async_write_data
File "/usr/src/homeassistant/homeassistant/helpers/storage.py", line 540 in _async_handle_write_data
File "/usr/src/homeassistant/homeassistant/helpers/storage.py", line 436 in async_save
File "/usr/src/homeassistant/homeassistant/helpers/storage.py", line 419 in _async_load_data
File "/usr/src/homeassistant/homeassistant/helpers/storage.py", line 309 in _async_load
File "/usr/src/homeassistant/homeassistant/helpers/storage.py", line 289 in async_load
File "/usr/src/homeassistant/homeassistant/config_entries.py", line 1770 in async_initialize
File "/usr/local/lib/python3.12/asyncio/events.py", line 88 in _run
File "/usr/local/lib/python3.12/asyncio/base_events.py", line 1980 in _run_once
File "/usr/local/lib/python3.12/asyncio/base_events.py", line 639 in run_forever
File "/usr/local/lib/python3.12/asyncio/base_events.py", line 672 in run_until_complete
File "/usr/src/homeassistant/homeassistant/runner.py", line 188 in run
File "/usr/src/homeassistant/homeassistant/__main__.py", line 209 in main
File "/usr/local/bin/hass", line 8 in <module>
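Since the file is full of secrets, one option for sharing its shape is to blank out every string value while keeping the structure and string lengths. This is only a sketch under assumptions: redact is a hypothetical helper, dictionary keys are left untouched (they may themselves be sensitive), and replacing the string contents could hide a crash that depends on the exact characters involved.

```python
import json


def redact(value):
    """Recursively replace every string value with 'x' repeated to the same length."""
    if isinstance(value, dict):
        return {key: redact(item) for key, item in value.items()}
    if isinstance(value, list):
        return [redact(item) for item in value]
    if isinstance(value, str):
        return "x" * len(value)
    # Numbers, booleans and null are kept as they are.
    return value


if __name__ == "__main__":
    sample = {"token": "s3cret", "ports": [8123, "1400"], "enabled": True}
    print(json.dumps(redact(sample), indent=2))
```

If the redacted copy still reproduces the crash, it can be uploaded; if it does not, that in itself hints that the string contents matter.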
from core.
ghcr.io/home-assistant/home-assistant:rc for testing this, ghcr.io/home-assistant/home-assistant normally. (I've just noticed I've posted the non-rc version above - that's what I fall back to when the WAF dips too low and the family want their automatic lights back). My compose file is currently:
This is using the multi-platform image. I guess when you use docker inspect on the image it indeed says aarch64, correct?
It's still crashing with a bus error. The last trace is full of things happening around the storage helpers, which I didn't appreciate before. Can't be a co-incidence, right? The JSON file .storage/core.config_entries parses in Python okay but it's way too big (36k) to manually spot what the problem might be, and it's full of secrets so I can't upload it.
Hm, yeah this makes it sound like this is an issue with orjson, the JSON parser used in HA. It has native parts which can cause crashes like this. Can you try to parse the file using orjson 3.10.3?
from core.
FWIW, Core 2024.6.0b5 runs fine here on Raspberry Pi 4/aarch64 with HAOS, but I guess this is related to the exact data at play 🤔
from core.
Nice idea but orjson 3.10.3 can parse /config/.storage/core.config_entries
# running in $ docker compose run homeassistant python
import orjson
from pathlib import Path
orjson.loads(Path('/config/.storage/core.config_entries').read_text())  # outputs a parsed copy
Do you know where this 1.1→1.2 migration code is? The storage classes are a bit overwhelming for somebody looking at them for the first time but if you can point me at the code that's actually running on here, perhaps I can slip a few debug statements into my copy and see what it outputs.
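Parsing only exercises the deserializer, whereas the gdb backtrace posted in this thread points at orjson's string serializer, and the migration writes the data back to disk. A round trip (parse, then serialize again) may therefore be a closer test. The sketch below is an assumption about what is worth trying, not the code path Home Assistant itself runs; roundtrip is an illustrative name, and the fallback to the standard json module exists only so the snippet still runs where orjson is not installed.

```python
import json
from pathlib import Path

try:
    import orjson  # the library under suspicion in this thread

    loads, dumps = orjson.loads, orjson.dumps
except ImportError:
    # Fallback so the sketch still runs without orjson; it proves nothing
    # about orjson itself in that case.
    loads = json.loads

    def dumps(obj):
        return json.dumps(obj).encode()


def roundtrip(path) -> int:
    """Parse a JSON file, serialize it again, and return the serialized size in bytes."""
    data = loads(Path(path).read_bytes())
    return len(dumps(data))


if __name__ == "__main__":
    print(roundtrip("/config/.storage/core.config_entries"))
```

On an affected system a failure here would be expected to kill the interpreter with a bus error rather than raise a Python exception.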
from core.
Nice idea but orjson 3.10.3 can parse /config/.storage/core.config_entries
You tried that on the target platform, correct? Doesn't docker compose run homeassistant python use the latest tag (instead of rc)? 🤔
Do you know where this 1.1→1.2 migration code is? The storage classes are a bit overwhelming for somebody looking at them for the first time but if you can point me at the code that's actually running on here, perhaps I can slip a few debug statements into my copy and see what it outputs.
Not sure, maybe @bdraco can help out here?
from core.
I have the same issue on a raspberry pi with 8GB of ram. I am running rocky 8 and use podman (as root) to run the container.
When using verbose logging the last line is indeed:
2024-06-05 21:53:33.051 INFO (MainThread) [homeassistant.helpers.storage] Migrating core.config_entries storage from 1.1 to 1.2
from core.
Can you try downgrading orjson to 3.9.15 in the container?
from core.
Do you know where this 1.1→1.2 migration code is?
- called in: core/homeassistant/helpers/storage.py, line 412 in 8099ea8
- the migration method itself: core/homeassistant/config_entries.py, lines 1599 to 1629 in 8099ea8
- saving migrated data to disk: core/homeassistant/helpers/storage.py, line 419 in 8099ea8
from core.
@bartv Since you seem to be able to reproduce it on demand, can you get a backtrace with gdb?
https://wiki.python.org/moin/DebuggingWithGdb
from core.
I was able to get past the migration by running the python.org python:3.12 container, creating a venv and installing the latest hass. When starting, it did the migration. When done, I started the official 2024.6.0 container and now it goes a lot further in the startup process before I again get the "Bus error":
2024-06-05 23:40:58.337 INFO (MainThread) [homeassistant.setup] Setup of domain rest_command took 0.00 seconds
2024-06-05 23:40:58.337 DEBUG (MainThread) [homeassistant.core] Bus:Handling <Event component_loaded[L]: component=rest_command>
2024-06-05 23:40:58.345 INFO (MainThread) [homeassistant.setup] Setting up application_credentials
2024-06-05 23:40:58.347 INFO (MainThread) [homeassistant.setup] Setup of domain application_credentials took 0.00 seconds
2024-06-05 23:40:58.347 DEBUG (MainThread) [homeassistant.core] Bus:Handling <Event component_loaded[L]: component=application_credentials>
Bus error (core dumped)
GDB is not working at the moment because it crashes much earlier:
(gdb) run /usr/local/bin/hass -c /config -v
Starting program: /usr/local/bin/python3 /usr/local/bin/hass -c /config -v
Program received signal SIGILL, Illegal instruction.
0xf73bb41e in ?? () from /lib/libcrypto.so.3
from core.
Downgrading to orjson 3.10.1 fixes the issue. It exists with both 3.10.2 and 3.10.3 (the dependency version with which 2024.6.0 was shipped).
I tried running it from a python:3.12 container (Debian-based with full libc) installed from pypi.org and that seems to start fine, with the side note that this install is not complete and it does not start all custom components.
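To see which orjson a given container actually has, the installed version can be read with the standard importlib.metadata module and compared against the versions reported as crashing above. The set below reflects only what was reported in this thread and is not an authoritative list; the function names are illustrative.

```python
from importlib import metadata

# Versions reported as crashing in this thread; not an authoritative list.
REPORTED_BAD = {"3.10.2", "3.10.3"}


def is_reported_bad(version: str) -> bool:
    """Return True if this orjson version was reported as crashing in this thread."""
    return version in REPORTED_BAD


def installed_orjson_version():
    """Return the installed orjson version string, or None if it is not installed."""
    try:
        return metadata.version("orjson")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_orjson_version()
    if version is None:
        print("orjson is not installed")
    else:
        print(f"orjson {version}, reported bad: {is_reported_bad(version)}")
```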
from core.
https://github.com/ijl/orjson/blob/master/src/serialize/per_type/unicode.rs
from core.
It looks like the original issue that caused us to revert was fixed, but a new issue was introduced in 3.10.2, so I opened a PR to revert to the last known good version.
from core.
Just letting you know I had the same issue. Luckily I made a complete backup a couple of months ago on a new drive. I will not update until this is fixed. Thanks guys.
from core.