Coder Social home page Coder Social logo

ESP32 uart handling about ems-esp32 HOT 61 CLOSED

emsesp avatar emsesp commented on May 23, 2024
ESP32 uart handling

from ems-esp32.

Comments (61)

proddy avatar proddy commented on May 23, 2024

Good question. I had a prototype on ESP32 with the UART code written in ESP-IDF and core C last year but stopped to focus solely on ESP8266 as this was more cost effective for the gateway boards. Honestly i'd prefer the ESP32 (or ESP32-S) as it has so much more flash memory and bandwidth. My version 2.0 of EMS-ESP has a core that is fully compatible with the ESP32 apart from the UART library which needs to be re-coded and tested. When I release it for ESP8266 I'll see if I can get the ESP32 up and running too. If you're handy with programming let me know as I could do with some help in testing.

from ems-esp32.

ArwedL avatar ArwedL commented on May 23, 2024

With current master I already have it compiling for ESP32 (and Win32). As I told this is the platform with that I have to go (other tasks are also running and ESP8266 isn't an option). The only part missing is UART. Unfortunately my EPS32 (ESP32-Gateway from Olimex) doesn't allow for JTAG debugging. So I am not sure if I really want to develop UART stuff with printf style debugging...
Because of JTAG I decided to do all debugging on WIN32 platform. This approach works quite well (e.g. using FTDI 3V3 TTL UART - USB converter). But of course it fails for low-level components like UART handling (which of course is ESP32 specific and can't be emulated on Win32)...

from ems-esp32.

ArwedL avatar ArwedL commented on May 23, 2024

I will take your offer and prepare something on my side and would be happy if you could test it (assuming you have a device with JTAG debug possiblity)?
In general on ESP32 platform at least sending should be quite comfortable as Espressif SDK offers uart_write_bytes_with_break() function...

from ems-esp32.

proddy avatar proddy commented on May 23, 2024

It'll be great if you can get the ESP32 uart driver working. The tricky part is the timing and detecting the BRK signals.

Also it may be easier to work on my EMS-ESP version 2.0 branch which builds both ESP8266 and ESP32 and the UART code is isolated. It also builds and runs standalone without an ESP microcontroller which I use for most of the coding and testing, saving me a ton of time. It's still in 'alpha' stage at the moment and doesn't have all the devices like 1.9.x does. The web UI is still in development too which I'm adding very soon. Up to you, what ever is easiest.

I do have a ESP32-prog board that has the JTAG interface so can debug in Visual Studio Code on my Win10 box if needed.

from ems-esp32.

MichaelDvP avatar MichaelDvP commented on May 23, 2024

There is a nice example for ESP32 using events. If this works as descriped, it should be very easy
sending with uart_write_bytes_with_break() function and receiving by event UART_BREAK and read the complete string to the pEMSRxBuf.
Sadly this does not work on ESP8266 as the event and the tx-function is missing. But as i had some problems sending much (i simulate a RC20 with id0x19 and answer every poll with ack) I've tested a new tx_mode without waiting for the whole telegram sent out.
@proddy take a look at tx_mode 4 in this emsuart. Sending is mainly done by the rx_intr. This works very good and shows, that timing isn't that critical as supposed.
I think in most dokumentation there is a misunderstanding about the bus, because people think the telegram is echoed by the master, but we have a differential 2-wire-bus like RS485, all we send is also on the rx line, so we always receive our own sendings simultaniously, the master does not echo. I was confused by oszillograms showing a delay between tx and echo, especially if rx and tx overlap, which is impossible on a halfduplex-bus.

from ems-esp32.

proddy avatar proddy commented on May 23, 2024

Thanks @MichaelDvP for the ESP32 code. I remember looking at that a while ago and my first version is based loosely off the same design and in version 2.0

I'll take a look at your tx_mode 4. You're right in that the tx_mode 1 logic is quite complicated (with checking for timeouts and breaks) while the code for EMS+ and HT3 is very basic but works equally well. And yes, the master doesn't not echo, its just the x reading back the same data we sent along the Tx.

from ems-esp32.

MichaelDvP avatar MichaelDvP commented on May 23, 2024

Yes all three modes working well, rx and tx was never lost.
My problem was only when sending much more:
[APP] Uptime: 0 days 0 hours 1 minute 58 seconds [SYSTEM] Last reset reason: Software Watchdog [SYSTEM] Last reset info: Fatal exception:4 flag:3 (SOFT_WDT) epc1:0x4000dd38 epc2:0x00000000 epc3:0x00000000 excvaddr:0x00000000 depc:0x00000000
and these resets were related to how much i'm sending, so it has something to do with tx. I found nothing in the code which can cause the exception (only EMS_TX_TO_COUNT was initialized factor 10 to high, but it does not matter) and it happens in all 3 modes.
All the modes waiting until tx is finished and that seems sometimes to long. I think there are also other things taken long and sometimes it happens that all comes together and the watchdog resets. Mode 4 sends the first byte out and returns to calling function, that saves time and i don't have the watchdog-resets again (running 2 days now with ack every poll on 0x0B and 0x19).

from ems-esp32.

ArwedL avatar ArwedL commented on May 23, 2024

I did some analysis of emsuart.cpp to understand how it works.
`How does it work currently:

  • emsuart_init installs emsuart_recvTask with
    • system_os_task(emsuart_recvTask, EMSUART_recvTaskPrio, recvTaskQueue, EMSUART_recvTaskQueueLen)
  • Receiption in interrupt handler = emsuart_rx_intr_handler
    1. After detection of break pEMSRxBuf->buffer is filled
    2. system_os_post(EMSUART_recvTaskPrio, 0, 0); // call emsuart_recvTask()
      --> this calls finally parseTelegram in context of receive task (noisy data with length between 3 and 4 bytes (including break byte) is ignored)
  • Sending when?
    • sending by emsuart_tx_buffer function (called by ems.cpp and ems_utils.cpp at several places)
    • can send directly
  • Other functions
    • emsuart_stop & emsuart_start (public but not called)
    • emsuart_tx_brk (internal only - not in header - no external call --> good)
  • All transmit is triggered because telegrams are received
    • send functions like ems_setWarmWaterOnetime only push to EMS_TxQueue.push(EMS_TxTelegram)
      --> Single threaded but functions like ems_setWarmWaterOnetime are called from another thread
      --> Potential problem that CircularBuffer isn't multi-thread protected?`

By doing that I detected potential problem and I am wondering if my conclusion is true. In principal everything seems to happen inside context of receive task (even sending) --> Singlethreading --> Good. But functions like ems_setWarmWaterOnetime are called from another task context --> Multithreading with interface CircularBuffer which isn't multi-threading safe...
Is this a real problem or is my analysis wrong?
Just wondering...

from ems-esp32.

MichaelDvP avatar MichaelDvP commented on May 23, 2024

Sending when?

Sending is ony allowed in reply to a poll or request (within 20ms). emsuart_tx_buffer is (at least) only called from parseTelegram. All functions push the messages to the queue and parseTelegram picks it from the queue and sends it to emsuart_tx_buffer.

emsuart_stop & emsuart_start (public but not called)

Called from ems-esp.cpp only for OTA-Updates.

from ems-esp32.

ArwedL avatar ArwedL commented on May 23, 2024

Thanks for clarification regarding sending...
How do you see my supsicion of having a potential multi-threading problem in CircularBuffer?

from ems-esp32.

MichaelDvP avatar MichaelDvP commented on May 23, 2024

Afair CircularBuffer isn't used in v2. But that can only answer @proddy correctly, for now i'm not so familiar with v2.

from ems-esp32.

proddy avatar proddy commented on May 23, 2024

emsuart has it's own simple buffer queue storing the complete telegrams. When one if filled it is sent to the core emsesp. CircularBuffer is not used in Tx.

But now in v2 there are two queues (std::deque), one for Rx and one for Tx, both asynchronous using std::atomic to prevent data race conditions. A telegram is sent to Tx after a poll is received on the Rx line, the Rx disabled and the whole data sent as one block.

from ems-esp32.

proddy avatar proddy commented on May 23, 2024

also @nomis pointed me to his safe buffer implementation (https://github.com/nomis/EvohomeWirelessFW/blob/master/lib/InterruptSafeBuffer/InterruptSafeBuffer.h) which I'd like to try at some point too.

from ems-esp32.

nomis avatar nomis commented on May 23, 2024

Something that was frustrating about the Rx process is that it has to wait until the break finishes before it receives the message (which could delay Tx). Ideally it should use a timer to identify when there is nothing else being received and process the message sooner. It may be possible to use the UART timer to do this but it's character sized not bit sized.

I never got Tx working on my boiler so I don't know if any of my changes to Rx would break Tx.

from ems-esp32.

ArwedL avatar ArwedL commented on May 23, 2024

emsbus.zip
I added a first version of emsuart.cpp and .h with ESP32 support. It compiles for me - I didn't yet test it.
It is ESP32 only (dropped all ESP8266 stuff)!

from ems-esp32.

proddy avatar proddy commented on May 23, 2024

@ArwedL I've added the code to the v2 repo. Builds fine but causes a reset when the uart port is opened so need to debug further. It's probably easier if you work off my latest code base.

from ems-esp32.

ArwedL avatar ArwedL commented on May 23, 2024

The latest commit which I can see on v2 branch (https://github.com/proddy/EMS-ESP/tree/v2) is from 16.1.2020 - is this the latest codebase?

from ems-esp32.

proddy avatar proddy commented on May 23, 2024

I thought that branch was empty. I deleted it. No my EMS-ESP2 is in a private repo and not quite ready for the real world. I'll grant you access so can familiarize yourself with how the modules work.

from ems-esp32.

proddy avatar proddy commented on May 23, 2024

I made a snapshot of v2 in https://github.com/proddy/EMS-ESP/tree/v2

Note it's not backward compatible for v1. So first wipe the flash on the ESP then upload the new firmware, connect to USB/Serial with 115200 speed/baud, type system to go into the system menu and use the set commands to change the wifi settings. Use help if you get lost and remember to read the README file.

from ems-esp32.

proddy avatar proddy commented on May 23, 2024

Hi @ArwedL did you get any further debugging the ESP32 UART code?

from ems-esp32.

ArwedL avatar ArwedL commented on May 23, 2024

To be honest I hoped for your side to make debugging progress... Any info so far which you can share? I had to shift focus to other open points in my ESP32 project. Will see if I can spend some time.

from ems-esp32.

proddy avatar proddy commented on May 23, 2024

no worries, I'll have a go and fixing it up after I finished merging in the web code

from ems-esp32.

ArwedL avatar ArwedL commented on May 23, 2024

I spent some time and found the issue. For me it works now (only tested receiving) - see attached update emsuart.cpp
emsbus.zip
The problem was in line "buf_handle = xRingbufferCreate(512, RINGBUF_TYPE_NOSPLIT);" which has to be called before the recvTask and parseTask are created. I didn't had any other problem. uart_driver_install worked for me.

from ems-esp32.

proddy avatar proddy commented on May 23, 2024

nice! trying it out now...

from ems-esp32.

proddy avatar proddy commented on May 23, 2024

yes, it kinda works. Data comes in (Rx) but also a lot of miss-fires. I'll take a deeper look later this week.

Annotation 2020-05-18 222237

from ems-esp32.

MichaelDvP avatar MichaelDvP commented on May 23, 2024

I could get a wemos-d1-mini32 for testing. It fits in the BBQKees board (but not in the housing).
Settings for this module with this pinout are:
sensors.h:
SENSOR_GPIO = 18; // Wemos D1-32 for compatibility D5
system.h:
LED_GPIO = 2; // on Wemos D1-32
LED_ON = HIGH;
emsuart_esp32.h:
EMSUART_RXPIN 23 // Wemos D1-32 RX pin for compatibility D7
EMSUART_TXPIN 5 // Wemos D1-32 TX pin for compatibility D8

The communication-problem is a bug in the driver. The default interrupt gives on break-intr. only the queue-flag, but does not read the fifo to the buffer (see here). The buffer is only filled in line 808. On break-intr. the telegram is mainly in the fifo and stays there until fifo full intr. Solution is to set fifo-full to 1 with a lot of irq-calls, or use our own irq-routine.
Here is a small intr routine, rx is working but i can not see tx in the log.
uart.zip
terminal

from ems-esp32.

proddy avatar proddy commented on May 23, 2024

nice Michael! I had a quick try but couldn't see any Rx come in. I could be a fault on my side (broken wires). I'll have another go later.

from ems-esp32.

MichaelDvP avatar MichaelDvP commented on May 23, 2024

I have changed the rx/tx pins for my module, you have to change back.
But i think i know what's wrong with tx, i have to set conf0.txd_brk after the tx-buffer is filled and clear it in the irq. It also seems to work with esp8266. I'll test and let you know, stay tuned.

from ems-esp32.

proddy avatar proddy commented on May 23, 2024

it was a fault cable. I'm getting Rx in but Tx causes a crash. So I think you're on the right track! nice work btw.

from ems-esp32.

ArwedL avatar ArwedL commented on May 23, 2024

Also thanks from my side for improving the code. Obviously I don't find the time for making the solution bullet-proof...
And of course I assumed correct implementation from Espressif SDK (which seems not to be the case).

from ems-esp32.

MichaelDvP avatar MichaelDvP commented on May 23, 2024

Now with working tx. I have also used the same logic for 8266, saving a lot of interrupts, no waiting for tx done and works fine with my Buderus ems. Maybe some people can test with EMS+ and HT3, i don't believe that they realy need other timings. The file is without the other modes for better reading.
It seems the mixing module is asked many times for ems+ messages, but it's only ems. Is there a way to log the raw telegram, the message id 0xDB00 looks odd.

uart.zip
esp8266-EMS.log
esp32-ser.log

from ems-esp32.

proddy avatar proddy commented on May 23, 2024

nice work Michael! you've certainly been busy. I've incorporated your changes in alpha7 on the v2 branch. I did a quick test with both ESP8266 and ESP32 and Rx works but still I have issues with Tx which are on my side. I think its because I keep switching circuits and confusing the EMS bus :)

I did notice on the ESP32 when saving to SPIFFS (like making a setting change) you will get a corrupt telegram. So either in the EMSuart::stop() or EMSuart::restart() maybe flush the FIFO?

Those log messages from your mixer unit with 0xDB00 don't look correct indeed. It could very well be something in the code somewhere. To home in on the root cause try:

log trace 21 raw
which will show you the raw telegrams from the mixer (device id 0x21), or
log trace db00 if you just want to watch that particular telegram ID.

I'll continue testing here too

from ems-esp32.

MichaelDvP avatar MichaelDvP commented on May 23, 2024

I've only done a short test with esp32, but had no issues with the uart. Over the day i'm running 1.9.6b with modified uart in this new 8266 version. Seems stable.
The mixer is in my opinion a confusion with ems+. If i send a question like 0b A1 FF 00 01 D8 20 the mixer anwers with a empty ems 21 0b FF 00, but ems-esp awaits a ems+ and takes the CRC as telegram highbyte.

from ems-esp32.

proddy avatar proddy commented on May 23, 2024

I was able to simulate that and will adjust the logic. Question: should we just ignore these short messages? They can't be parsed because there is no type ID and no data.

from ems-esp32.

proddy avatar proddy commented on May 23, 2024

I couldn't get Tx working with the latest changes, so merged the old logic back and made it configurable. set tx_mode 4 is the newer tx code (in 2.0a8). If you have time could you compare with your working version and see what could be causing this?

from ems-esp32.

MichaelDvP avatar MichaelDvP commented on May 23, 2024

What issues do you have with tx? Only on 8266 or both? I see only
[telegram] Tx buffer full. Looks like Tx is not working?
but this is caused by filling the queue with more than 20 messages (including poll-acks) before the master polls. But the messages are send (log from 8266).
tx_log.txt
The raw log helps a lot to analyse. As you see the MM10-mixer is requested with ems+ telegrams, but can only understand ems (type in telegram[2]), so he answers with type-id FF and empty ems-telegram (i have modified telegram.ccp, line 272 to:
if (data[2] < 0xF0 || length < 6) { to avoid reading outside the telegram).
(btw, mixing.cpp, line 156 should be hc_ = device_id() - 0x20 + 1; )
I think it is better to request only the telegrams implemented in the device. So for MM10 request only 0xAB, for MM50,100... request 0x01D7 only for device 0x20, 0x01D8 only for device 0x21, and so on.
Regarding the corrupt telegrams when stopping the uart: The first telegram after reenable is always corrupt (to long or first bytes missing if buffers cleared), we can set a flag to drop this first telegram.
But the interrupt routine send in the zip drops telegrams longer than 33 bytes after enable the intr. the fifo is normaly filled by the bus.

from ems-esp32.

MichaelDvP avatar MichaelDvP commented on May 23, 2024

For me all modes working fine, test with change mode, wait a minute, show emsbus, change mode, etc., starting with mode 4:
EMS Bus protocol: Buderus, #telegrams received: 252, #Read requests sent: 87, #Write requests sent: 0, #CRC errors: 0 (0%)
Tx mode = 2
EMS Bus protocol: Buderus, #telegrams received: 348, #Read requests sent: 118, #Write requests sent: 0, #CRC errors: 0 (0%)
Tx mode = 1
EMS Bus protocol: Buderus, #telegrams received: 411, #Read requests sent: 129, #Write requests sent: 0, #CRC errors: 0 (0%)
Tx mode = 3
EMS Bus protocol: Buderus, #telegrams received: 448, #Read requests sent: 140, #Write requests sent: 0, #CRC errors: 0 (0%)
Tx mode = 4
EMS Bus protocol: Buderus, #telegrams received: 536, #Read requests sent: 172, #Write requests sent: 0, #CRC errors: 0 (0%)
I've used this routine:
emsuart_esp8266.cpp.txt

With the mixer i meant something like that:
mixing.cpp.txt

Is it possible to set some device-flags to the boilers. The 0xE3 .. 0xE9 are not supportet by my boiler. I think they are used by newer boilers or only heatpumps/condensors/compressors?

Also for the thermostat it should be fine if only the active hcs would be requested.

from ems-esp32.

proddy avatar proddy commented on May 23, 2024

What issues do you have with tx? Only on 8266 or both?

On both ESP8266 and ESP32 when the Tx is sent, there is no acknowledgment, just more Rx. If it works on your setup it must work on mine too so let me experiment a little more using your latest code. Worst case I'll bring out the scope and see what is being transmitted over the EMS line.

I see only "[telegram] Tx buffer full. Looks like Tx is not working?"

Here I wanted a way to detect if the Tx was not working and thought if the Tx queue is full (max 20) this must be a good sign. With a poll happening every 1-2 seconds and doing the queue check every minute it should be pretty fail-safe. Note the poll-acks are not stored as Tx messages so these are only the real read/write commands. But you're right, there will be times when there are a lot of messages in the queue (like after a 'refresh' command) so I'll need to find a better way. Any ideas?

i have modified telegram.ccp, line 272 to:
if (data[2] < 0xF0 || length < 6) { to avoid reading outside the telegram)

correct. I had fixed that in an earlier build and modified the whole logic.

btw, mixing.cpp, line 156 should be hc_ = device_id() - 0x20 + 1;

thanks, corrected it!

I think it is better to request only the telegrams implemented in the device. So for MM10 request only 0xAB, for MM50,100... request 0x01D7 only for device 0x20, 0x01D8 only for device 0x21, and so on.

This is a very good point and I had noticed it too. Same with the boiler (your 0xE3-E9 example). Maybe by using device flags like we did with the thermostat is the right approach. And this will also save on flooding the Tx queue with bogus messages that won't be answered anyway.

Also for the thermostat it should be fine if only the active hcs would be requested.

good idea. I'll work on this too.

Regarding the corrupt telegrams when stopping the uart: The first telegram after reenable is always corrupt (to long or first bytes missing if buffers cleared), we can set a flag to drop this first telegram.

Ok. I will test again. I just noticed after each send I would get a CRC error.

it's really nice all the help you're providing, much appreciated Michael.

from ems-esp32.

MichaelDvP avatar MichaelDvP commented on May 23, 2024

What issues do you have with tx? Only on 8266 or both?

On both ESP8266 and ESP32 when the Tx is sent, there is no acknowledgment, just more Rx.

Yes, i know what you mean. I've added some more log-messages to monitor tx and polls and some answers and acks are missing. missing_answer_and_ack.log (some messages are logged from telegram and emsesp, they are received/send single but appear double in the log).
To monitor the ack better i added a log for acks and test another version with the same tx, now working. working.log.
But i have not figured out yet what caused the problem.

Here I wanted a way to detect if the Tx was not working and thought if the Tx queue is full (max 20)
I've changed to 30 and with the mentioned changes in Mixing::Mixing the message does not show up again.

I'l send you a working uart as soon as i understand what's going wrong.

edit: BTW: the test with different modes was false, i doesn't realize that i have to reboot before tx_mode is active.

from ems-esp32.

proddy avatar proddy commented on May 23, 2024

ok. I'm working on improving the wifi and mqtt calls (I still get some dropouts). I've add the changes to the mixer and also adjusted the thermostat to only fetch the active heating circuits. And I made the check for 'Tx line' more robust. Just need to add device_types to the Boilers and I think I've covered all your comments. It'll be version a10.

from ems-esp32.

proddy avatar proddy commented on May 23, 2024

Also for the thermostat it should be fine if only the active hcs would be requested.

@MichaelDvP when you query a heating circuit on your thermostat that's not active, what does it come back with? Eg. read 2a8 for HC4 ?

from ems-esp32.

MichaelDvP avatar MichaelDvP commented on May 23, 2024

Here is the uart, working on 8266 in mode 4, on esp32 i'm getting a few CRC error at start and very rarely a "TX read failed", but the device answers, so it's in the echo readback.
uart.zip

@proddy i can't read 2a8 since i have no ems+. But these are the RC35 circuits, only hc2 is active:
Thermostat_hc1-4.log

I've also found some smaler bugs and typos, should i describe them in a issue or make a pr? (I've put a fork on github with the fixes (and few changes for me like sensor-mqtt)).

from ems-esp32.

MichaelDvP avatar MichaelDvP commented on May 23, 2024

For the esp32 add in the start-routine the line EMS_UART.idle_conf.rx_idle_thrhd = 12;
sometimes the break is recognized before the line goes up, The receiver takes the end of the break as startbit and reads a 0xFF at the start of the next telegram.

from ems-esp32.

proddy avatar proddy commented on May 23, 2024

@MichaelDvP thanks for the updates. I've merged them into a11 in the v2 branch. I've also made you a contributor so feel free to hack away directly in the project branch or push PRs.

A few comments on the changes

  • I still can't get Tx working on the ESP8266. I get Tx errors so will need to look into what is causing that on my system
  • I added an MQTT format called 'CUSTOM' to replace your 'MY'
  • the only change I didn't make is the MQTT changes in mixing.cpp. I don't like the static heap allocation. Was there a reason for this? If you're getting mqtt errors try qos 1

from ems-esp32.

MichaelDvP avatar MichaelDvP commented on May 23, 2024

@proddy Thank you for the invitation.

I get Tx errors

Strange, i've tested with the esp32 and a noname wemos-8266-clone (to have the original wemos with 1.9.6 as backup reference), both without tx issues now. Btw: I found, that esp32 resets the tx-break-bit by hardware after sending, no need to do it in the code, but in 8266 we have to clear it in the code.

MQTT format called 'CUSTOM'

the intention was mainly the sensor formatting, I have 7 sensors and sometimes change them (add one on a new place, or disconnect on that is useless), Then the numbering changes and my iobroker-scripts catch the wrong sensor. With json {id:temp,...} the sensors are fixed.

MQTT changes in mixing.cpp. I don't like the static heap

No mqtt-errors, the intention was a 1.9.5 compatible json, were every mixing device can add a own nest and keep the other nests. But now i think it's better to have a json for every device. But we have to uses the device-id (also represents the setting on the switch on MM100) for numbering, otherwise mixing_data1 with hc1 will be overwitten by wwc1. With device-id wwc1 (switch position 8) will become mixing_data8.

from ems-esp32.

proddy avatar proddy commented on May 23, 2024

TxErrors

using the latest v2 I still don't get Tx working. After each send I get a timeout error. I think I need to hook up the scope.

Annotation 2020-06-01 141944

sensors & MQTT

that's a valid use case for people with more than 1 sensor so happy with any changes you commit that make it better.

MQTT & mixing

yes, the ArdunioJson library uses only the copy constructor when they are char * or consts so there may be conflicts when building the json object. I'll improve this code.

telnet performance

I've been battling to find why the telnet is always not very responsive and I think its due to the Dallas one-wire library interfering with the wifi. Have you ever experienced this? I may re-write that piece of code too.

from ems-esp32.

MichaelDvP avatar MichaelDvP commented on May 23, 2024

Oh, i thought the issues are with tx_mode 4. Now I tested mode 1 a bit longer and can reproduce this errors, but only one error in a few minutes. There are timeouts and break-interruptions and collisions with the next telegram, resultion in bad CRC. Something seems to interrupt and delay the tx routine very long, is there somthing time consuming (like dallas) in another thread?
Do you have the errors also with mode 4?
this disablees all interrupt while sending, no error in the first 8 minutes now.
emsuart_esp8266 .cpp.txt

from ems-esp32.

proddy avatar proddy commented on May 23, 2024

I'll do some checking this afternoon. Also re-writing the sensor code as it seems to also block the wifi/lwip. There are a couple of delay() calls in the onewire library

from ems-esp32.

MichaelDvP avatar MichaelDvP commented on May 23, 2024

Yes, the onewire is burning a lot of time. Another thing: the RxService::loop() should process a telegram as fast as possible, but is delayed (RX_LOOP_WAIT), skip this.

from ems-esp32.

proddy avatar proddy commented on May 23, 2024

@MichaelDvP i added the RX_LOOP_WAIT because I thought it was slowing down the telnet, but in the end the culprit was the 1-wire library. So happy to remove it. Except, why should we process an Rx telegram as fast as possible? They're queued up and can be processed every 300ms without effecting any Tx

from ems-esp32.

MichaelDvP avatar MichaelDvP commented on May 23, 2024

Ups, you're right, i confused process_telegram an incoming_telegram. The short reaction time is on master poll to tx. I added code to check the time from receiving interupt to tx transmit call, and sometimes this takes more than 20 ms. I wonder what happens in this time?

from ems-esp32.

proddy avatar proddy commented on May 23, 2024

that is odd. Tx should happen immediately after a poll. What you could try is to comment out the sensor_.start() and sensors_.loop() in emsesp.cpp. This made my setup run a zillion times faster.

from ems-esp32.

MichaelDvP avatar MichaelDvP commented on May 23, 2024

@proddy

i added the RX_LOOP_WAIT because I thought it was slowing down the telnet

i thought about that, but we have only a few messages per second and if there is no message the function also returns. I see no benefit. To give telnet/wifi more time i think it's better to add a delay in the main loop. Since the tx-reaction is complete in the emsuart_recvTask, this tasks should have enough time to complete. I'm trying now a delay(MYESP_DELAY) as in 1.9 in the EMSESP::loop() and it seems to help. Also terminal seems more responsive.

from ems-esp32.

proddy avatar proddy commented on May 23, 2024

@MichaelDvP I found and fixed the issue that was causing Tx to fail with the old logic (tx_mode 1). The value of the timeout was too short. Should be 1760. I suspect a typo in the macro when it was copied over from 1.9.5.

I still can't get the newer Tx code to work (tx_mode 4). Giving me the same errors. Perhaps also a timing error?

Capture

from ems-esp32.

MichaelDvP avatar MichaelDvP commented on May 23, 2024

I don't believe it is the timeout value. I've changed it as i was working on uart, the 10 seems to me as a typo.
The timeout is counting loops, each loop is EMSUART_BUSY_WAIT long, which is 1/8 bittime. With EMS_TX_TO_COUNT set to 22 * 8 * 10 you get a timeout of 22 * 8 * 10*bittime/8, e.g. 220 bittimes or 22000 µs. If you wait so long for one byte, you'll get a collision with the next master-poll in the second byte. With 22 * 8 the timeout is 22 bittimes, 2200µs, a bit more than the EMS+ fixed wait.

The error messages indicates that there is no response from the destination, right?
So first question, did we send? Can we receive the echo or is it missing?
If there is a echo, what do we receive next, What message is recieved, that triggers the rx, but does not match the tx_waiting? We should log the raw telegrams including break direct in recvTask to see what's come in. For timing it can be usefull to increase the priority of the recvTask.

from ems-esp32.

MichaelDvP avatar MichaelDvP commented on May 23, 2024

I logged with syslog to see the tx-errors and there is a another strange thing. I get reboots every 2 hours (mark is set to 2h, can it be that?) and always after 4 complete tx-errors and the first error of 5th. But the time between retries is very long, seems the counter is'nt cleared in between.

Another thing: [system] and [network] logs with local time (mest), [emseesp] (and also [boiler], [thermostat]) logs with utc.
syslog.txt

from ems-esp32.

proddy avatar proddy commented on May 23, 2024

I don't believe it is the timeout value. I've changed it as i was working on uart, the 10 seems to me as a typo.
The timeout is counting loops, each loop is EMSUART_BUSY_WAIT long, which is 1/8 bittime. With EMS_TX_TO_COUNT set to 22 * 8 * 10 you get a timeout of 22 * 8 * 10*bittime/8, e.g. 220 bittimes or 22000 µs. If you wait so long for one byte, you'll get a collision with the next master-poll in the second byte. With 22 * 8 the timeout is 22 bittimes, 2200µs, a bit more than the EMS+ fixed wait.

The error messages indicates that there is no response from the destination, right?
So first question, did we send? Can we receive the echo or is it missing?
If there is a echo, what do we receive next, What message is recieved, that triggers the rx, but does not match the tx_waiting? We should log the raw telegrams including break direct in recvTask to see what's come in. For timing it can be usefull to increase the priority of the recvTask.

You're right, 1760ms is a long time within the loop. I'd rather just forget the "tx_mode 1-3" and work on your new and improved Tx logic and figure out why it doesn't work on my setup. I'll also ask BBQKees is he's willing to try out a few things on his boiler. Is there anything specific with your environment? There is a difference with timings between EMS+ and EMS1.0 and I'm on EMS1.0.

from ems-esp32.

proddy avatar proddy commented on May 23, 2024

I logged with syslog to see the tx-errors and there is a another strange thing. I get reboots every 2 hours (mark is set to 2h, can it be that?) and always after 4 complete tx-errors and the first error of 5th. But the time between retries is very long, seems the counter is'nt cleared in between.

Another thing: [system] and [network] logs with local time (mest), [emseesp] (and also [boiler], [thermostat]) logs with utc.
syslog.txt

I'll create a separate issue for this and track it there.

from ems-esp32.

MichaelDvP avatar MichaelDvP commented on May 23, 2024

I've logged the time from rx-intr to send and found the it's always this check:
if (millis() > (emsRxTime + EMS_RX_TO_TX_TIMEOUT)) { // send allowed within 20 ms
return EMS_TX_WTD_TIMEOUT;
that cause the error. I replaced this code by
LOG_DEBUG(F("Responsetime: %d"), uuid::get_uptime() - emsRxTime);
and it's mainly 0 or 1 but sometimes 29 ms (no other values), but does not give a collision with next telegram. We should skip this check completly.

from ems-esp32.

proddy avatar proddy commented on May 23, 2024

I see a responsetime of 0 and sometimes 1 with "tx_mode 1". I just can't get tx_mode 4 working, even by adding some delay after each bit write. I'm so busy with the web interface I don't really have time to get the scope out to see how the timings are off on my EMS 1.0 system.

from ems-esp32.

proddy avatar proddy commented on May 23, 2024

Closing this. Covered in emsesp/EMS-ESP#398

from ems-esp32.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.