MQTT client does not reconnect if MQTT server is temporarily down #42
Comments
I've reproduced this but can't fix it. If the MQTT broker is down, EMS-ESP will attempt to reconnect. This works, except that after a long wait (>10 mins on my system) it fails. The bug is in the underlying asyncmqttclient code, where it trips up after multiple calls and doesn't close the pipe. So the onDisconnect callback is never called and EMS-ESP assumes it's still connected. I spent some time trying to fix it but it's hard. If this is a major problem then one solution is to re-write the EMS-ESP logic to use what we had back in v1.9 and not use the mqtt library calls. |
@proddy |
On the other hand, MQTT was designed to support unstable network connections with delays. My other clients reconnect to the broker without a problem. In the short term I recommend focusing on the new API development, but in the medium term I would recommend finding a solution. |
I'll post a bug and some reproducible code to the owners of the library, and I think I'll re-write the logic and copy what esphome does: https://github.com/esphome/esphome/blob/07b3327102f7457f960940a4f5ceae8abb4686cd/esphome/components/mqtt/mqtt_client.cpp#L289 If it still hasn't reconnected, it reboots the ESP :-) BTW did you try out the new API from the other branch? |
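The "retry, then reboot as a last resort" idea mentioned above could be sketched roughly as follows. This is only an illustration under assumptions: `ReconnectWatchdog`, `Action` and the two timeout parameters are hypothetical names, not code from EMS-ESP, esphome or asyncmqttclient, and the caller is assumed to supply a monotonic millisecond uptime (such as Arduino's `millis()`) and the client's own view of whether it is connected.

```cpp
#include <cstdint>

// Hypothetical helper for illustration only; not part of EMS-ESP or esphome.
enum class Action { None, Reconnect, Reboot };

class ReconnectWatchdog {
  public:
    // retry_ms: minimum gap between reconnect attempts
    // reboot_ms: how long the link may stay down before a reboot is requested
    ReconnectWatchdog(uint32_t retry_ms, uint32_t reboot_ms)
        : retry_ms_(retry_ms), reboot_ms_(reboot_ms) {}

    // Call periodically from the main loop.
    // now_ms is a monotonic uptime in milliseconds; unsigned subtraction
    // keeps the comparisons correct across a 32-bit counter wrap.
    Action tick(uint32_t now_ms, bool connected) {
        if (connected) {
            down_ = false;  // link is healthy, forget any previous outage
            return Action::None;
        }
        if (!down_) {
            // First tick after the connection was lost: start the clocks
            // and try to reconnect straight away.
            down_       = true;
            down_since_ = now_ms;
            last_try_   = now_ms;
            return Action::Reconnect;
        }
        if (now_ms - down_since_ >= reboot_ms_) {
            return Action::Reboot;  // last resort after a long outage
        }
        if (now_ms - last_try_ >= retry_ms_) {
            last_try_ = now_ms;
            return Action::Reconnect;
        }
        return Action::None;
    }

  private:
    uint32_t retry_ms_;
    uint32_t reboot_ms_;
    bool     down_       = false;
    uint32_t down_since_ = 0;
    uint32_t last_try_   = 0;
};
```

In a firmware loop the caller would map `Action::Reconnect` to the client's connect call and `Action::Reboot` to a device restart. Note that this only helps if the `connected` flag is trustworthy, which is exactly what the missing onDisconnect callback undermines.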
Not yet, since I am struggling with the current API in my ioBroker adapter. Which branch do you mean: ft_https? And is the current API command structure working? |
I merged the new API format into dev. It's probably not perfect but the only way to see is if people use it. It's worth trying to reproduce the ERRCONRESET. You could create a small bash script using curl and a sleep. The commands are in the |
I use a 100 ms delay during the initialization phase, which runs on adapter start, when I read the detailed info for each field. I will test the new API version now. |
I also ran into this issue. Rebooting the ESP if all else fails seems like a good idea to me. Actually, would it be possible to include the option of having the ESP reboot weekly or even nightly? It is not ideal but it seems to me like it would solve a lot of reliability issues in the short term. |
I don't think that is a real solution. MQTT and REST API V3 are now working very reliably on the ESP32. Only during MQTT broker/server maintenance (>15-20 minutes) do I have this reconnection issue. In any case, the web UI and/or telnet still work, and a manual MQTT restart is always possible. In the medium term this should be solved to ensure stable long-term operation. |
I spent some time to try and reproduce this, first in EMS-ESP by leaving the MQTT Server off for >1hr and then with a small standalone ESP32 application to help troubleshoot. I wasn't able to get the same error - it reconnected successfully each time. I'm closing this issue and we can re-open if there are more cases which can be reproduced |
This has happened to me a couple of times in the last few days, are there any logs that could be helpful? I have a hunch it's when the EMS-ESP device starts up before the DHCP server - once the IP address is finally available, MQTT never tries to connect. Edit: EMS-ESP Version v3.4.1 |
The problem is I've never been able to reproduce this. If you have an almost 100% use case where it always fails we can provide a debug version with extra trace information. The MQTT service in EMS-ESP will always try and reconnect. Let me know if you want a special build |
I'll give it a bash over the next few days and let you know if I can reliably break it. |
I can reproduce it with these steps:
Disconnect Reason Happy to try a debug build if that would be helpful. |
ok, I'll use those steps too. I take it if you hit Save on the MQTT Settings page it does reconnect? |
Yes it does. |
@daviessm Do you have ipv6 enabled? |
The network has IPv6 enabled, broadcasts SLAAC and has a DHCPv6 server, but EMS-ESP seems to only give itself a link-local address:
|
Can you reproduce the issue with IPv6 disabled? I think it's this line. MQTT does not reconfigure if the IPv6 address arrives first and the IPv4 address comes later. |
With IPv6 disabled I can't reproduce the issue. Good spot. |
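The ordering problem described above (IPv6 address first, IPv4 later, MQTT never reconfigured) can be modelled in a few lines. This is a hypothetical sketch, not the actual EMS-ESP fix: `AddressTracker` and its methods are invented names, and it assumes the network layer reports each newly assigned address as a separate event.

```cpp
// Hypothetical model for illustration only; names are not taken from EMS-ESP.
class AddressTracker {
  public:
    // Record a newly assigned address.
    // Returns true when MQTT should be (re)configured: a new address family
    // has just become available and the client is not connected yet.
    bool on_got_ip(bool is_ipv6, bool mqtt_connected) {
        bool &have = is_ipv6 ? has_v6_ : has_v4_;
        const bool is_new = !have;  // first address of this family?
        have = true;
        return is_new && !mqtt_connected;
    }

    // Call when the link drops so that later address events count as new.
    void on_lost_network() {
        has_v4_ = false;
        has_v6_ = false;
    }

  private:
    bool has_v4_ = false;
    bool has_v6_ = false;
};
```

The point of the design is that the decision is made per address family rather than once per "network up" event, so a link-local IPv6 address arriving before the DHCP lease no longer suppresses the attempt that follows the IPv4 address.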
I suppose it's because the A secondary problem is that no global IPv6 address is being requested (through DHCP) or allocated (through SLAAC) - would you like me to raise another issue for that? Edit: SLAAC is covered in #283. |
global address: Let's wait for the next arduino core, there is a PR open. Normally |
Can you check the dev build from here? If it works for you, I'll make a PR. |
Yes that seems to solve the IPv6 problem for me. |
Sorry to be a pain but I noticed this morning that my MQTT connection was disconnected again. The DHCP/DNS/MQTT server was rebooted and since then there's been no MQTT connection. This time, there's only an IPv4 address shown in the UI - no IPv6, even though IPv6 is enabled. |
Looks like a network IPv6 issue. If EMS-ESP got no IPv6 address but the MQTT host is IPv6, or DNS gives back an IPv6 address, the connection can't work. Let's wait for the Arduino IPv6 update. Please keep us informed about the circumstances of every connection issue. |
Sounds good. My MQTT server is set to an IPv4 IP address (no DNS) if that makes any difference. |
I can reproduce this issue with EMS-ESP 3.6.2 on an ESP32. My MQTT broker is connected remotely via the WireGuard VPN of the AVM Fritzbox. When the internet connection fails (unreliable cable internet connection) and comes back after a few minutes, the EMS-ESP MQTT connection appears to stay down. |
3.6.2 is old (about 4 months and we move quick!). A lot has been done to address the MQTT reconnect issues. If you're experiencing the same with 3.6.5 then do let us know. |
Looks like 3.6.5 isn't released yet and there was no 3.6.3 so 3.6.2 is only one version behind the latest! |
With this version I do not have any stability problems anymore. Neither MQTT nor WiFi. |
I have still MQTT reconnection issues with 3.6.5-dev.11 :-( |
did you set a BSSID? https://emsesp.github.io/docs/Troubleshooting/#ems-esp-freezes |
Yes I set a BSSID to connect to a designated wifi AP. |
Can you trace with logs and show what is happening? What does the MQTT broker tell you in its logs? Also make sure you are using unique client IDs as they may be conflicting. |
I did some software updates on my home automation system, so the MQTT server was temporarily down (15 minutes).
After the restart, my other MQTT clients reconnected automatically but EMS-ESP32 did not. I had to reconnect manually via the web UI.
I tested it twice, but automatic reconnection does not work if the MQTT server was down.
I did some more testing. If I stop the MQTT broker for a couple of minutes, reconnection works.
When I disconnect the LAN cable from the server the MQTT broker is running on, EMS-ESP does not recognize a "disconnected" status. It stays "connected" (I tested for 3-4 minutes, IP address not reachable). When I just stop the service but the IP stays reachable, EMS-ESP recognizes the disconnected status immediately.
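The cable-pull case above is a half-open connection: no TCP close ever arrives, so the client can only notice the outage by timing out on silence. A minimal sketch of such a check follows. It is an assumption-laden illustration, not the EMS-ESP or asyncmqttclient implementation: `KeepAliveMonitor` is a hypothetical name, and the threshold borrows the one-and-a-half times keep-alive margin that the MQTT specification applies on the broker side.

```cpp
#include <cstdint>

// Hypothetical helper for illustration only; not part of any MQTT library.
class KeepAliveMonitor {
  public:
    // keepalive_ms: the keep-alive interval negotiated with the broker
    // now_ms: current monotonic uptime in milliseconds
    KeepAliveMonitor(uint32_t keepalive_ms, uint32_t now_ms)
        : keepalive_ms_(keepalive_ms), last_rx_(now_ms) {}

    // Call whenever any packet (PINGRESP, PUBLISH, ...) arrives from the broker.
    void on_packet_received(uint32_t now_ms) { last_rx_ = now_ms; }

    // True when nothing has been heard for more than 1.5x the keep-alive
    // interval, which suggests the connection is dead even though no
    // disconnect was ever signalled.
    bool is_stale(uint32_t now_ms) const {
        return now_ms - last_rx_ > keepalive_ms_ + keepalive_ms_ / 2;
    }

  private:
    uint32_t keepalive_ms_;
    uint32_t last_rx_;
};
```

With a 60 second keep-alive this would flag the link after roughly 90 seconds of silence, at which point the caller could force-close the socket and start a fresh connection attempt. This relies on the client sending PINGREQ while idle so that a healthy broker always has something to answer.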