Coder Social home page Coder Social logo

Comments (20)

DG12 avatar DG12 commented on June 9, 2024

what is the output from:

systemctl status influxdb

/bin/df

grep -v \# /etc/influxdb/influxdb.conf |uniq

sudo /usr/bin/du -d 2 /var/lib/influxdb |sort -n

influx -precision rfc3339 -database ruuvi -execute "select last(temperature),mac from ruuvi_measurements

curl --get 'http://localhost:8086/query?pretty=true' --data-urlencode "db=ruuvi" --data-urlencode "q=SELECT last(rssi),mac FROM ruuvi_measurements "

from ruuvicollector.

0KiB avatar 0KiB commented on June 9, 2024

Hey, theres log.

influx -precision rfc3339 -database ruuvi -execute "select last(temperature),mac from ruuvi_measurements

^ Gave nothing i had to control a+d

https://pastebin.com/WJMidgFM

from ruuvicollector.

DG12 avatar DG12 commented on June 9, 2024

Sorry, I seem to have left out the trailing quote ( " ) after ruuvi_measurements. control C would be better when a bash command like this seems to be hung. No worries.

Your responses:

systemctl status influxdb
 influxdb.service - InfluxDB is an open-source, distributed, time series database
   Loaded: loaded (/lib/systemd/system/influxdb.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2019-06-19 16:58:07 EEST; 50min ago
     Docs: https://docs.influxdata.com/influxdb/
 Main PID: 473 (influxd)
   CGroup: /system.slice/influxdb.service
           └─473 /usr/bin/influxd -config /etc/influxdb/influxdb.conf
 
Jun 19 17:46:58 raspberrypi influxd[473]: [httpd] 192.168.1.212, 192.168.1.212,::1 - - [19/Jun/2019:17:46:58 +0300] "GET /query?db=r

+++ Unfortunately above is an incomplete (even after scrolling right here) due to the fact that systemctl cuts log to fit size of the terminal window!+++

/bin/df
Filesystem     1K-blocks    Used Available Use% Mounted on
/dev/root       60910488 5650140  52714540  10% /

grep -v \# /etc/influxdb/influxdb.conf |uniq
[meta]
  dir = "/var/lib/influxdb/meta"
[data]
  dir = "/var/lib/influxdb/data"
  wal-dir = "/var/lib/influxdb/wal"
  series-id-set-cache-size = 100

sudo /usr/bin/du -d 2 /var/lib/influxdb |sort -n
31044   /var/lib/influxdb/wal/_internal
31060   /var/lib/influxdb/wal
36616   /var/lib/influxdb

curl --get 'http://localhost:8086/query?pretty=true' --data-urlencode "db=ruuvi" --data-urlencode "q=SELECT last(rssi),mac FROM ruuvi_measurements "
{    "results": [        {    "statement_id": 0,
            "series": [     {     "name": "ruuvi_measurements",
             "columns": [     "time", "last",  "mac" ],
             "values": [ [  "2019-06-19T14:26:37.437536735Z",  -91, "DFE5676531E9" 

This all shows that the influx data base is running and responding to updates from ruuvi collector..

from ruuvicollector.

DG12 avatar DG12 commented on June 9, 2024

How often does this happen? predictably?
I have had problems with the entire system stalling for a couple of minutes on a raspberry Pi zero w.

To determine if this is memory/swapping/paging problem we need to start recording statics. Add this SINGLE line to
/etc/cron.d/atsar


*/10 * * * * root test -x /usr/lib/atsar/atsa1 && (LOGDIR=/var/log/atsar; CURDAY=`date +\%d`;/usr/lib/atsar/atsadc 15 39 $LOGDIR/atsa$CURDAY )

Be sure to scroll to the right to get all the command ending with CURDAY)

Be sure the directory /var/log/atsar exits, if it doesn't then
sudo mkdir /var/log/atsar

Next time the error occurs we will use a command like
atsar -s 12:25 -e 12:40 -p
to show statistics.

from ruuvicollector.

DG12 avatar DG12 commented on June 9, 2024

Feels like linux or InfluxDB issue.

from ruuvicollector.

0KiB avatar 0KiB commented on June 9, 2024

Heres logs from atsar / others. So grafana will stop showing temps after about 2hours, last day i was able to run it for a good 8h seems like it will randomly crash or something. I also re-flashed rasbian and installed everything again but not helping at all.

While keeping on eye grafana
Grafana popups red notices on right corner saying
unauthorized
but works still after that, if not totally messed-up by something. (not sure why)

https://pastebin.com/tFAgWQbt

from ruuvicollector.

DG12 avatar DG12 commented on June 9, 2024

systemctl status shows complete information this time but not from a time that there was a problem!

The atsar command -s (start ) and -e (end) needs to include a period of a few minutes before the event and a few minutes after the problem event.

What is output from
grep --invert-match '^#' /home/pi/RuuviCollector/ruuvi-collector.properties |uniq

influx -precision rfc3339 -database ruuvi -execute "select mac,last(temperature) from ruuvi_measurements group by mac;"|grep 2019

from ruuvicollector.

0KiB avatar 0KiB commented on June 9, 2024

atsar included start / end just like you mentioned.

pi@raspberrypi:~ $ influx -precision rfc3339 -database ruuvi -execute "select mac,last(temperature) from ruuvi_measurements group by mac;"|grep 2019
2019-06-23T13:18:17.620637744Z 0052010C4003 24.3
2019-06-23T17:43:54.173533189Z DFE5676531E9 19.03
2019-06-23T17:43:51.950908515Z F161992C25B6 24.16
2019-06-23T17:43:49.227313382Z FD0531F180CB 23.83

grep --invert-match '^#' /home/pi/ruuvi-collector.properties |uniq
^
nothing...

from ruuvicollector.

DG12 avatar DG12 commented on June 9, 2024

I do not see atsar data covering a period of an event?

OK so you've got 4 active tags no biggie.
You haven't specified any custom ruuvi-collector.properties

influx
SHOW RETENTION POLICIES on ruuvi
quit

from ruuvicollector.

thestraycat avatar thestraycat commented on June 9, 2024

I also have the same issue! I have 3 ruuvi tags and running ruuvi collector from a RPI Zero W...I'm getting a lot of [no data] alerts in Grafana and the system crashes within a few days. Anything i can generate to help?

from ruuvicollector.

DG12 avatar DG12 commented on June 9, 2024

@thestraycat How often does this happen? predictably?
Please be more specific regarding "crashes".
Can you provide syslog messages (/var/log )? systemctl status ruuvicollector
How does it (you? ) recover?
+++++++++++++++
To determine if this is memory/swapping/paging problem we need to start recording statics. To set up to collect 15 second samples 39 times every 10 minutes add this SINGLE line to
/etc/cron.d/atsar

*/10 * * * * root (LOGDIR=/var/log/atsar; CURDAY=date +%d;/usr/lib/atsar/atsadc 15 39 $LOGDIR/atsa$CURDAY )
Be sure to scroll to the right to get all the command ending with CURDAY)

Be sure the directory /var/log/atsar exits, if it doesn't then
sudo mkdir /var/log/atsar

Next time the error occurs we will use a command SOMETHING LIKE
atsar -s 12:25 -e 12:40 -p
(where -s (start and -e (end) cover the failing period) to show statistics.
+++++++++++++++
What is output from
grep --invert-match '^#' /home/pi/RuuviCollector/ruuvi-collector.properties |uniq
+++++++++++++
top -d 10 |head

from ruuvicollector.

anakuron avatar anakuron commented on June 9, 2024

I have the same issue, fresh install on rpi4, ruuvi collector was fine for a couple of days but now it crashes after ~5h of use, there are no errors, but if I restart the collector it continues to work. I'm wondering does it have something to do with the bluez utils, mine's version 5.5

from ruuvicollector.

Scrin avatar Scrin commented on June 9, 2024

Are you running the latest version of the collector, and does it print anything in the log when it crashes?

I have now set up a 2nd Raspberry pi with the latest Raspbian (10, Buster) to see if I can reproduce this somehow, but without any additional details it's really difficult to pinpoint whether the issue is with the collector or something else

from ruuvicollector.

anakuron avatar anakuron commented on June 9, 2024

Are you running the latest version of the collector, and does it print anything in the log when it crashes?

Yes, latest version by cloning. To be more specific I don't think it technically crashes, it just stops writing the data. At some point, I see that there's no new data coming into Grafana so if I check the status of the RuuviCollector it seems to be running fine, no errors or anything. When I restart the service the data starts to flow in.

I'll keep it running and see if it produces any errors
is there some other logs I could check except the service logs from
journalctl -u RuuviCollector and systemctl status ruuvi-collector.service

from ruuvicollector.

Scrin avatar Scrin commented on June 9, 2024

RuuviCollector logs everything to the standard output, which can be seen with journalctl if running the collector as a systemd service. The collector should log all anomalies that prevent data from being saved as well as any exits (clean or not), so if nothing appears in the logs, it sounds like it's being stuck somewhere, could be that hcidump stops outputting data (without exiting) or something else.

if/when this happens again and the collector seems "stuck", could you try taking a thread dump and posting it here?
You can take a thread dump by first finding the PID of the process (for example, ps -ef | grep java the pid is in the 2nd column, after the username), then issue kill -3 PID (replace the PID with the PID of the java process) and then Java should print the thread dump to the standard output, which should appear in journalctl -u RuuviCollector in your case.

from ruuvicollector.

anakuron avatar anakuron commented on June 9, 2024

After the three "crashes" in a row during one day, the collector has been running happily for over a month without any problems. Last Friday it didn't get stuck but actually crashed and did close the service so I can't take the thread dump, However, There's a clear log this time.

The log states that there were some issues with the Bluetooth device, just restarting the service gets it running again.

Given the timeframe this can happen It seems to me that this could be very hard to debug, but I'm pretty sure both problems are related to the Bluetooth device in some way

What would be the recommended way to just restart the service when something like this occurs?

collector.service - RuuviCollector
   Loaded: loaded (/etc/systemd/system/ruuvi-collector.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Thu 2019-10-10 17:31:26 +08; 3 days ago
  Process: 307 ExecStart=/usr/bin/java -jar /home/pi/git/ruuvi-collector/target/ruuvi-collector-0.2.jar (code=exited, status=1/FAILURE)
 Main PID: 307 (code=exited, status=1/FAILURE)

Oct 10 17:31:15 kosong systemd[1]: Started RuuviCollector.
Oct 10 17:31:21 kosong RuuviCollector[307]: 2019-10-10 17:31:21.058 DEBUG [Main] Starting scan with: [hcitool, lescan, --duplicates, --passive]
Oct 10 17:31:21 kosong RuuviCollector[307]: 2019-10-10 17:31:21.089 DEBUG [Main] Starting dump with: [hcidump, --raw]
Oct 10 17:31:21 kosong RuuviCollector[307]: 2019-10-10 17:31:21.111 INFO  [Main] BLE listener started successfully, waiting for data...
Oct 10 17:31:21 kosong RuuviCollector[307]:  If you don't get any data, check that you are able to run 'hcitool lescan' and 'hcidump --raw' without issues
Oct 10 17:31:26 kosong RuuviCollector[307]: 2019-10-10 17:31:26.415 ERROR [Main] Can't attach to device hci0. No such device(19): Check that your bluetooth adapter is enabled and working properly
Oct 10 17:31:26 kosong RuuviCollector[307]: 2019-10-10 17:31:26.495 INFO  [Main] Unclean exit
Oct 10 17:31:26 kosong systemd[1]: ruuvi-collector.service: Main process exited, code=exited, status=1/FAILURE
Oct 10 17:31:26 kosong systemd[1]: ruuvi-collector.service: Failed with result 'exit-code'.

from ruuvicollector.

DG12 avatar DG12 commented on June 9, 2024

To the file:
/lib/systemd/system/ruuvicollector.service
please add

After=network.target
Restart=always
RestartSec=300

The RestartSec delays the restart after failure to prevent rapid crash, restart, crash...cycling if something (like full file system) goes really wrong.

from ruuvicollector.

DG12 avatar DG12 commented on June 9, 2024

Are there any related messages in other /var/log files?
What is in your /etc/rsyslog.conf and /etc/rsyslog.d/* ?

The entries you've shown only show startup related to RuuviCollector, none related to failure.

from ruuvicollector.

anakuron avatar anakuron commented on June 9, 2024

Thanks for the restart config, I found the problem for the crash from the syslog but after this, it has gotten stuck again, so I also managed to get the thread dump.

rsyslog.conf was the default, looks like this

rsyslog.d/ was empty by default

so on 10th when going through the syslog I found that around of the time of the drop there's something like this

Oct 10 17:17:22 kosong dhcpcd[523]: eth0: carrier lost

full log here

and it goes up and down for a few times and this seems to somehow cause the whole device to restart, after this restart it somehow manages to get back up and running, but I guess it won't bring the Ruuvicollector/Bluetooth device in working state.

it seems like other people have experienced similar problems caused by the router on RPI3 and RPI4, so maybe this is the case as well but it seems that this is not a problem with the collector but the raspberry itself.

so I have restarted the service on 14th around 12:00 and it has been running until 18:27
and then the dataflow has stopped to grafana. so I checked the logs and status

-- Logs begin at Thu 2019-02-14 18:11:59 +08, end at Wed 2019-10-16 15:47:18 +08. --
Oct 14 11:49:58 kosong systemd[1]: Started RuuviCollector.
Oct 14 11:49:59 kosong RuuviCollector[6884]: 2019-10-14 11:49:59.635 DEBUG [Main] Starting scan with: [hcitool, lescan, --duplicates, --passive]
Oct 14 11:49:59 kosong RuuviCollector[6884]: 2019-10-14 11:49:59.641 DEBUG [Main] Starting dump with: [hcidump, --raw]
Oct 14 11:49:59 kosong RuuviCollector[6884]: 2019-10-14 11:49:59.643 INFO  [Main] BLE listener started successfully, waiting for data...
Oct 14 11:49:59 kosong RuuviCollector[6884]:  If you don't get any data, check that you are able to run 'hcitool lescan' and 'hcidump --raw' without issues
Oct 14 11:50:00 kosong RuuviCollector[6884]: 2019-10-14 11:50:00.716 INFO  [Main] Successfully reading data from hcidump

root@kosong:~# systemctl status ruuvi-collector.service

● ruuvi-collector.service - RuuviCollector
   Loaded: loaded (/etc/systemd/system/ruuvi-collector.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2019-10-14 11:49:58 +08; 2 days ago
 Main PID: 6884 (java)
    Tasks: 25 (limit: 4915)
   Memory: 65.5M
   CGroup: /system.slice/ruuvi-collector.service
           ├─6884 /usr/bin/java -jar /home/pi/git/ruuvi-collector/target/ruuvi-collector-0.2.jar
           ├─6900 hcitool lescan --duplicates --passive
           └─6902 hcidump --raw

Oct 14 11:49:58 kosong systemd[1]: Started RuuviCollector.
Oct 14 11:49:59 kosong RuuviCollector[6884]: 2019-10-14 11:49:59.635 DEBUG [Main] Starting scan with: [hcitool, lescan, --duplicates, --passive]
Oct 14 11:49:59 kosong RuuviCollector[6884]: 2019-10-14 11:49:59.641 DEBUG [Main] Starting dump with: [hcidump, --raw]
Oct 14 11:49:59 kosong RuuviCollector[6884]: 2019-10-14 11:49:59.643 INFO  [Main] BLE listener started successfully, waiting for data...
Oct 14 11:49:59 kosong RuuviCollector[6884]:  If you don't get any data, check that you are able to run 'hcitool lescan' and 'hcidump --raw' without issues
Oct 14 11:50:00 kosong RuuviCollector[6884]: 2019-10-14 11:50:00.716 INFO  [Main] Successfully reading data from hcidump

so basically it's saying it's running fine
I killed the java process and got this

after that status looks like this

● ruuvi-collector.service - RuuviCollector
   Loaded: loaded (/etc/systemd/system/ruuvi-collector.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2019-10-14 11:49:58 +08; 2 days ago
 Main PID: 6884 (java)
    Tasks: 25 (limit: 4915)
   Memory: 65.7M
   CGroup: /system.slice/ruuvi-collector.service
           ├─6884 /usr/bin/java -jar /home/pi/git/ruuvi-collector/target/ruuvi-collector-0.2.jar
           ├─6900 hcitool lescan --duplicates --passive
           └─6902 hcidump --raw

Oct 16 15:49:47 kosong RuuviCollector[6884]: "G1 Conc#0" os_prio=0 cpu=0.05ms elapsed=187188.73s tid=0xb5f71c00 nid=0x1ae8 runnable
Oct 16 15:49:47 kosong RuuviCollector[6884]: "G1 Refine#0" os_prio=0 cpu=4.13ms elapsed=187188.73s tid=0xb5fd8400 nid=0x1ae9 runnable
Oct 16 15:49:47 kosong RuuviCollector[6884]: "G1 Refine#1" os_prio=0 cpu=0.08ms elapsed=187187.20s tid=0x71e9a800 nid=0x1afa runnable
Oct 16 15:49:47 kosong RuuviCollector[6884]: "G1 Young RemSet Sampling" os_prio=0 cpu=41612.63ms elapsed=187188.73s tid=0xb5fd9800 nid=0x1aea runnable
Oct 16 15:49:47 kosong RuuviCollector[6884]: "VM Periodic Task Thread" os_prio=0 cpu=107240.78ms elapsed=187188.56s tid=0x72b8d800 nid=0x1af2 waiting on condition
Oct 16 15:49:47 kosong RuuviCollector[6884]: JNI global refs: 14, weak refs: 0
Oct 16 15:49:47 kosong RuuviCollector[6884]: Heap
Oct 16 15:49:47 kosong RuuviCollector[6884]:  garbage-first heap   total 63488K, used 37583K [0x75c00000, 0xb3a00000)
Oct 16 15:49:47 kosong RuuviCollector[6884]:   region size 1024K, 34 young (34816K), 1 survivors (1024K)
Oct 16 15:49:47 kosong RuuviCollector[6884]:  Metaspace       used 10502K, capacity 10626K, committed 10648K, reserved 11568K

so now I just restarted the service and it's running for now.

from ruuvicollector.

Scrin avatar Scrin commented on June 9, 2024

Thanks for the thread dump. The relevant part:

Oct 16 15:49:47 kosong RuuviCollector[6884]: "main" #1 prio=5 os_prio=0 cpu=123603.16ms elapsed=187188.73s tid=0xb5f0a400 nid=0x1ae5 runnable  [0xb60f3000]
Oct 16 15:49:47 kosong RuuviCollector[6884]:    java.lang.Thread.State: RUNNABLE
Oct 16 15:49:47 kosong RuuviCollector[6884]:         at java.io.FileInputStream.readBytes([email protected]/Native Method)
Oct 16 15:49:47 kosong RuuviCollector[6884]:         at java.io.FileInputStream.read([email protected]/FileInputStream.java:279)
Oct 16 15:49:47 kosong RuuviCollector[6884]:         at java.io.BufferedInputStream.read1([email protected]/BufferedInputStream.java:290)
Oct 16 15:49:47 kosong RuuviCollector[6884]:         at java.io.BufferedInputStream.read([email protected]/BufferedInputStream.java:351)
Oct 16 15:49:47 kosong RuuviCollector[6884]:         - locked <0x75debea8> (a java.lang.ProcessImpl$ProcessPipeInputStream)
Oct 16 15:49:47 kosong RuuviCollector[6884]:         at sun.nio.cs.StreamDecoder.readBytes([email protected]/StreamDecoder.java:284)
Oct 16 15:49:47 kosong RuuviCollector[6884]:         at sun.nio.cs.StreamDecoder.implRead([email protected]/StreamDecoder.java:326)
Oct 16 15:49:47 kosong RuuviCollector[6884]:         at sun.nio.cs.StreamDecoder.read([email protected]/StreamDecoder.java:178)
Oct 16 15:49:47 kosong RuuviCollector[6884]:         - locked <0x75deff08> (a java.io.InputStreamReader)
Oct 16 15:49:47 kosong RuuviCollector[6884]:         at java.io.InputStreamReader.read([email protected]/InputStreamReader.java:185)
Oct 16 15:49:47 kosong RuuviCollector[6884]:         at java.io.BufferedReader.fill([email protected]/BufferedReader.java:161)
Oct 16 15:49:47 kosong RuuviCollector[6884]:         at java.io.BufferedReader.readLine([email protected]/BufferedReader.java:326)
Oct 16 15:49:47 kosong RuuviCollector[6884]:         - locked <0x75deff08> (a java.io.InputStreamReader)
Oct 16 15:49:47 kosong RuuviCollector[6884]:         at java.io.BufferedReader.readLine([email protected]/BufferedReader.java:392)
Oct 16 15:49:47 kosong RuuviCollector[6884]:         at fi.tkgwf.ruuvi.Main.run(Main.java:79)
Oct 16 15:49:47 kosong RuuviCollector[6884]:         at fi.tkgwf.ruuvi.Main.run(Main.java:70)
Oct 16 15:49:47 kosong RuuviCollector[6884]:         at fi.tkgwf.ruuvi.Main.main(Main.java:30)

indicates that the main thread is still alive and in an expected state, more specifically it's waiting for more output from the hcidump process, which also means the the hcidump process is still running (as it should be). However, for some reason hcidump doesn't seem to be outputting any more data for some reason that lies somewhere outside RuuviCollector itself, ie. the bluetooth device itself may have ended up in some weird state, another process may have taken control over bluetooth, or something like that.

When this happens, you can try to run hcidump yourself with hcidump --raw and see if you get any output and whether there's any change in RuuviCollector while doing it, or manually performing a scan with hcitool lescan --duplicates --passive and seeing if anything happens (this with RuuviCollector still running), there's a small chance that RuuviCollector may start outputting data again when you do this.

from ruuvicollector.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.