Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

esp32 switch_example issues #155

Open
rgb-ryan opened this issue Aug 29, 2024 · 29 comments
Open

esp32 switch_example issues #155

rgb-ryan opened this issue Aug 29, 2024 · 29 comments

Comments

@rgb-ryan
Copy link

rgb-ryan commented Aug 29, 2024

Hello. Happy day

The esp32 device is located in the st-device-sdk-c-ref/apps/esp32/switch_example path
Testing switch_example code.

There were 2 problems during the test.
I would like to ask for your help with the reason and how to solve it.

  1. The first issue

  2. Test Environment

  • phone : samsung galaxy s23
  • android ver : 14
  • smartthings ver : 1.8.17.22
  • app automation conditions

[app image]
1

  1. Issue
  1. case 1
  • Auto iteration condition 7 hours or so in the app "Routine run but some actions don't run" message and wait for the app in a stopped state.

[app image]
image

[esp log]
image

  1. case 2
  • "mqtt idn't get PINGRESP" ------> app standby after "MQTT connect success" during 21 hours of automation condition repetition. (No message in app)

  • "mqtt idn't get PINGRESP" ----> MQTT connection is lost and reconnection is repeated multiple times.
    "MQTT connect success" message and app standby.

[esp log]
image

[esp log]
image

@rgb-ryan
Copy link
Author

Please save the log file image and look at it.

@rgb-ryan rgb-ryan reopened this Aug 29, 2024
@junyoun-kim
Copy link
Collaborator

Hello @rgb-ryan Sorry for late noticing. Is there still issue ocurring?

@junyoun-kim
Copy link
Collaborator

No response from reporter

@rgb-ryan
Copy link
Author

rgb-ryan commented Dec 2, 2024

@junyoun-kim
Thank you for your response.
The issue is still ongoing. Can you please open an issue and help me?

@junyoun-kim junyoun-kim reopened this Dec 3, 2024
@junyoun-kim
Copy link
Collaborator

Hello @rgb-ryan , it looks like there was network disconnection in esp32. Network disconnection can happen depends on network environment or other factors. So it can't be always guaranteed all command(turn off/trun on) operates successfully.

@rgb-ryan
Copy link
Author

rgb-ryan commented Dec 3, 2024

@junyoun-kim
hello.

Below is a list of common errors that occur during operation. Are they all due to instability of the wifi network?

After the error message, manually turning it on/off in the 'smartthing app' works fine, but I understand that it is because the network connection is stable when manually operated.

Let's assume that the network is frequently disconnected.

The following problem does not occur in Samsung Electronics home appliances that are currently connected to the same wifi network.
Is it common to supplement the sequence with the procedure below? If there is a better way, please help.

  1. Detect network instability or disconnection.
  2. Reconnect the network.
  3. Retry the failed operation.

------------- most frequent error list ----------------

E (100613346) [IoT]: _iot_net_tls_read(600) > mbedtls_ssl_read = -0x0000

E (100613346) [IoT]: _iot_net_tls_read(600) > mbedtls_ssl_read = -0x004C

E (109648276) [IoT]: _iot_net_tls_read(600) > mbedtls_ssl_read = -0x7880

E (114469726) [IoT]: _iot_net_tls_read(600) > mbedtls_ssl_read = -0x7880

W (117058126) [IoT]: _iot_mqtt_check_alive(877) > mqtt didn't get PINGRESP

E (123974306) [IoT]: _iot_net_tls_read(600) > mbedtls_ssl_read = -0x7880

E (125370226) [IoT]: _iot_net_tls_read(600) > mbedtls_ssl_read = -0x0000

E (125370226) [IoT]: _iot_net_tls_read(600) > mbedtls_ssl_read = -0x004C
W (535496) [IoT]: _iot_mqtt_process_received_publish(457) > duplicated PUB packet 26015

E (77158806) [IoT]: _iot_net_tls_connect(488) > mbedtls_ssl_handshake = -0x6800

E (77158806) [IoT]: _iot_mqtt_connect_net(798) > MQTT net connection failed

E (77158806) [IoT]: _iot_es_mqtt_connect(938) > _iot_es_mqtt_connect error(-1)

E (77158816) [IoT]: iot_es_connect(1102) > failed to connect

E (77158826) [IoT]: _do_iot_main_command(722) > failed to iot_es_connect for communication try count 1 next after 2829 ms

E (77183286) [IoT]: _iot_net_tls_read(600) > mbedtls_ssl_read = -0x7880

I (77183286) [IoT]: _iot_net_show_status(89) > [1073427092] network socket status: sockfd 1726124974 readable 0 writable 54 sock_err 0 errno 0

E (77183296) [IoT]: _iot_es_mqtt_connect(938) > _iot_es_mqtt_connect error(-9)

E (77183306) [IoT]: iot_es_connect(1102) > failed to connect

E (77183316) [IoT]: _do_iot_main_command(722) > failed to iot_es_connect for communication try count 2 next after 4451 ms

E (77203146) [IoT]: _iot_net_tls_connect(488) > mbedtls_ssl_handshake = -0x7280

E (77203146) [IoT]: _iot_mqtt_connect_net(798) > MQTT net connection failed

E (77203146) [IoT]: _iot_es_mqtt_connect(938) > _iot_es_mqtt_connect error(-1)

E (77203156) [IoT]: iot_es_connect(1102) > failed to connect

E (77203166) [IoT]: _do_iot_main_command(722) > failed to iot_es_connect for communication try count 3 next after 8899 ms

@junyoun-kim
Copy link
Collaborator

Hello @rgb-ryan , the device SDK try to reconnect when it detects disconnection like PING fail etc. So in most cases, it will reconnect successfully within 1 ~ 2 mins.

@rgb-ryan
Copy link
Author

rgb-ryan commented Dec 3, 2024

hello @junyoun-kim ,

I understand that stdk automatically reconnects.

The problem I need to solve now is that if the device encounters a network problem while in auto-routine, it stays in standby after reconnecting (auto-routine stops).

How can I solve this issue?

@junyoun-kim
Copy link
Collaborator

junyoun-kim commented Dec 3, 2024

@rgb-ryan , On your auto-routine example, it looks like routine condition trigger is trun-off. And due to previous routine fail(fail to trun on and trun off after 1 min), it didn't trigger routine condition next time. Auto routine service don't retry failed operation.

@rgb-ryan
Copy link
Author

rgb-ryan commented Dec 3, 2024

@junyoun-kim ,

I understand that stdk does not retry if the automatic routine fails due to network problems.

I think I need to add code to save the failed routine using stdk and retry after reconnection. Can you help me?

When the automatic reconnection is done, it seems like the state is the same as calling the connection_start(); function.

I need to develop a function that can recover the failed routine after network reconnection due to network connection failure or failed command.

I need help using the stdk library to implement the above function.

@junyoun-kim
Copy link
Collaborator

@rgb-ryan There is some misunderstanding. SmartThings App don't retry failed operation not stdk. There is no way from stdk to know if there is failed auto routine operation.

@rgb-ryan
Copy link
Author

rgb-ryan commented Dec 3, 2024

@junyoun-kim ,

  1. Auto-routine commands are given by the smartthing app.
  2. If the device has a problem with its network connection and the command cannot be executed, the routine will not be recovered or retried.
  3. The device cannot know which commands were missed due to network problems.

I understand the above.

Does the network have to be perfect to solve this problem?

@junyoun-kim
Copy link
Collaborator

The command isn't delivered to the device if there is network issue. And we don't queue the failed operation and re-deliver again sometime later. It's current policy. So best we are doing is to notify there is failed routine to user.

@rgb-ryan
Copy link
Author

rgb-ryan commented Dec 3, 2024

@junyoun-kim ,

"I (46270496) [IoT]: iot_cap_commands_cb(1106) > command :

---------- iot_mqtt error occurrence time : 156sec------------

"W (46426526) [IoT]: _iot_mqtt_check_alive(877) > mqtt didn't get PINGRESP"
"I (46426536) [IoT]: st_mqtt_disconnect(1434) > mqtt disconnect 0"
I (46426576) [IoT]: iot_es_connect(1086) > connect_type: log-in
I (46427016) [IoT]: _check_connection_response(79) > Connection response payload

--------- Network reconnection time : 0.5sec

The above content is about the log that occurred when there was a network problem.

It took 156 seconds to check iot_mqtt problem. Does this mean
PINGRESP timeout is 156 seconds?

If stdk's fixed PINGRESP timeout is 156 seconds, any auto routines that are less than 156 seconds long will be lost.

@junyoun-kim
Copy link
Collaborator

Yes. it can have high possibility to miss the commands if auto routine period is shorter.

@rgb-ryan
Copy link
Author

rgb-ryan commented Dec 3, 2024

@junyoun-kim ,

MQTT Keep-Alive Time Adjustment
PINGREQ/PINGRESP Timeout Adjustment

above two things adjustable using API on the device?
If possible, it seems like it would recover the state quickly even in the event of a network problem, so that auto-routines would not be missed.

The same symptom occurred even when the turn-off auto-routine was set to 5 minutes. What is the actual set PINGRESP Timeout?

@junyoun-kim
Copy link
Collaborator

@rgb-ryan You can change st_mqtt_default_alive_interval value in src/include/iot_mqtt.h file. But minimal interval value is 40 seconds.

@rgb-ryan
Copy link
Author

rgb-ryan commented Dec 3, 2024

@junyoun-kim ,

Is it not possible to modify the switch_example project in st-device-sdk-c-ref?

I would appreciate it if you could guide me through the process of changing the alive_interval at the beginner level using example code.

@junyoun-kim
Copy link
Collaborator

The file is located in iot-core/src/include path from st-device-sdk-c-ref perspective.

@rgb-ryan
Copy link
Author

rgb-ryan commented Dec 3, 2024

@junyoun-kim

Thank you for your reply.
Change to st_mqtt_default_alive_interval 40 and try again.

You said to notify the user of a failed routine.
What APIs can receive the failed event?

If you know the failed routine, what do you think about the idea of ​​forcing the device to run the failed state?

@rgb-ryan
Copy link
Author

rgb-ryan commented Dec 4, 2024

@junyoun-kim ,

Good morning.

I changed the original value of st_mqtt_default_alive_interval from '120' to '40' and tested it.
The existing connection delay time was reduced by the value set from 156sec to 76sec.

What is the minimum configurable value of st_mqtt_default_alive_interval?
Is the value below '40' set to '0'?

---------------------------- log ----------------------------------

Sequence number return : 19

I (1097236) [IoT]: iot_cap_commands_cb(1106) > command : {"commands":[{"id":"824e32ef-88f7-4b90-91e2-fe58852c065d","component":"main","capability":"switch","command":"on","arguments":[]}]}

Notification message received

W (1173266) [IoT]: _iot_mqtt_check_alive(877) > mqtt didn't get PINGRESP

I (1173276) [IoT]: st_mqtt_disconnect(1434) > mqtt disconnect 0

I (1173276) [IoT]: _iot_net_show_status(89) > [1073427092] network socket status: sockfd 1733278563 readable 0 writable 54 sock_err 0 errno 0

I (1173316) [IoT]: iot_es_connect(1086) > connect_type: log-in

I (1173326) [IoT]: _iot_es_mqtt_connect(925) > url: mqtt-regional-apnortheast2.api.smartthings.com, port: 8883

I (1173326) [IoT]: _iot_es_mqtt_connect(931) > mqtt connect,

id : 32cd4980-f7ca-4568-8ec4-caf2d746d1d8

username : 41548fc8-a51a-46ff-83b7-4f14b7cdbec2

password : eyJhbGciOiJFZERTQSIsImt0eSI6Ik9LUCIsImNydiI6IkVkMjU1MTkiLCJ0eXAiOiJKV1QiLCJ2ZXIiOiIwLjAuMSIsImtpZCI6IlNUREtTNEUxZTlUbFFpQngifQ==.eyJpYXQiOiIxNzMzMjc4NTYzIiwianRpIjoiOWE1YzIwYzQtMmQyOC00ODEwLWFiYjUtN2JhNTgyMjNjMDcyIiwibW5JZCI6ImZ3SzMifQ==.RTSUMoogS8yaKsKjjptNdXn9d9LLGoq4+qoyZlPVtSfvquKH2FEn3qdNX6xLfdejn1TEOscWbiP0Xu8/DImLDA==

I (1173376) [IoT]: _iot_net_tls_connect(416) > Loading the CA root certificate 1317@0x3ffcb3e4

I (1174036) [IoT]: iot_es_connect(1106) > MQTT connect success sucess/try : 2/3

I (1174086) [IoT]: _check_connection_response(79) > Connection response payload {"target":"41548fc8-a51a-46ff-83b7-4f14b7cdbec2","currentTime":1733278564,"localMigrated":false,"event":"connect.success"}

@junyoun-kim
Copy link
Collaborator

Hello @rgb-ryan 40 is minimal value for the interval. And what I mean about notifying user of failed routine is notification message showing on SmartThings app. As I said, there is no failed routine event for device to know.

@rgb-ryan
Copy link
Author

rgb-ryan commented Dec 4, 2024

@junyoun-kim ,

Below, there is a warning or error message with "Notification message received" in the log of esp32.

If I can receive the warning and error message, I think I can sort it and use it to recover the device operation.

Is there an API that can receive the warning and error message from the device?
(e.g. callback function)

---------------------- esp32 log -----------------------------

W (1173266) [IoT]: _iot_mqtt_check_alive(877) > mqtt didn't get PINGRESP

W (11516926) [IoT]: _iot_mqtt_process_received_publish(457) > duplicated PUB packet 5232

E (39696) [IoT]: _iot_net_tls_connect(488) > mbedtls_ssl_handshake = -0x6800

E (39696) [IoT]: _iot_mqtt_connect_net(798) > MQTT net connection failed

E (39696) [IoT]: _iot_es_mqtt_connect(938) > _iot_es_mqtt_connect error(-1)

E (39706) [IoT]: iot_es_connect(1102) > failed to connect

E (39706) [IoT]: _do_iot_main_command(722) > failed to iot_es_connect for communication try count 1 next after 2978 ms

@junyoun-kim
Copy link
Collaborator

@rgb-ryan You mean disconnection event? Do you want disconnection event from stdk?

@rgb-ryan
Copy link
Author

rgb-ryan commented Dec 4, 2024

@junyoun-kim ,

Yes, I want to receive disconnection events. And I also want to receive error and warning events.

@rgb-ryan
Copy link
Author

rgb-ryan commented Dec 5, 2024

@junyoun-kim ,

Good morning

I had a problem with my test today.
The auto-routine sequence is turn-on <-> turn-off every 2 minutes.

  1. Turn-on was successful at the 685th sequence.
  2. An error occurred and reconnection was successful.
  3. Turn-off was successful at the 686th sequence.

After the 686th sequence, I should receive a turn-on command from the server after 2 minutes.
But I did not receive any response from the server.

There is no notification in the smartthings app after the 686th sequence either.

Please help me troubleshoot.

----------------- log ---------------------
Sequence number return : 684

I (81376776) [IoT]: iot_cap_commands_cb(1106) > command : {"commands":[{"id":"62f9234c-090d-4ab5-ae08-11508b77ca02","component":"main","capability":"switch","command":"off","arguments":[]}]}

Notification message received

I (81497906) [IoT]: iot_cap_sub_cb(908) > command : {"commands":[{"id":"947e9c42-441d-482b-b78e-7b9681f227bb","component":"main","capability":"switch","command":"on","arguments":[]}]}

called [caps_switch_cmd_on_cb] func with num_args:0

I (81497916) [IoT]: st_cap_send_attr(613) > publish event, topic : /v1/deviceEvents/41548fc8-a51a-46ff-83b7-4f14b7cdbec2, payload :

{"deviceEvents":[{"component":"main","capability":"switch","attribute":"switch","value":"on","providerData":{"sequenceNumber":685,"timestamp":"1733358888083"}}]}

Sequence number return : 685

I (81497946) [IoT]: iot_cap_commands_cb(1106) > command : {"commands":[{"id":"947e9c42-441d-482b-b78e-7b9681f227bb","component":"main","capability":"switch","command":"on","arguments":[]}]}

Notification message received

W (81500666) [IoT]: _iot_mqtt_process_received_publish(457) > duplicated PUB packet 13752

W (81503666) [IoT]: _iot_mqtt_process_received_publish(457) > duplicated PUB packet 13752

E (81503666) [IoT]: _iot_net_tls_read(600) > mbedtls_ssl_read = -0x7880

I (81503676) [IoT]: st_mqtt_disconnect(1434) > mqtt disconnect 0

I (81503676) [IoT]: _iot_net_show_status(89) > [1073427092] network socket status: sockfd 1733358893 readable 0 writable 54 sock_err 0 errno 0

I (81503726) [IoT]: iot_es_connect(1086) > connect_type: log-in

I (81503726) [IoT]: _iot_es_mqtt_connect(925) > url: mqtt-regional-apnortheast2.api.smartthings.com, port: 8883

I (81503736) [IoT]: _iot_es_mqtt_connect(931) > mqtt connect,

I (81503776) [IoT]: _iot_net_tls_connect(416) > Loading the CA root certificate 1317@0x3ffb8228

I (81506546) [IoT]: iot_es_connect(1106) > MQTT connect success sucess/try : 3/4

I (81511096) [IoT]: _check_connection_response(79) > Connection response payload {"target":"41548fc8-a51a-46ff-83b7-4f14b7cdbec2","currentTime":1733358902,"localMigrated":false,"event":"connect.success"}

I (81617606) [IoT]: iot_cap_sub_cb(908) > command : {"commands":[{"id":"849fb8ca-afb4-4c0d-86ec-00f3a7086bb3","component":"main","capability":"switch","command":"off","arguments":[]}]}

called [caps_switch_cmd_off_cb] func with num_args:0

I (81617616) [IoT]: st_cap_send_attr(613) > publish event, topic : /v1/deviceEvents/41548fc8-a51a-46ff-83b7-4f14b7cdbec2, payload :

{"deviceEvents":[{"component":"main","capability":"switch","attribute":"switch","value":"off","providerData":{"sequenceNumber":686,"timestamp":"1733359007783"}}]}

Sequence number return : 686

I (81617646) [IoT]: iot_cap_commands_cb(1106) > command : {"commands":[{"id":"849fb8ca-afb4-4c0d-86ec-00f3a7086bb3","component":"main","capability":"switch","command":"off","arguments":[]}]}

Notification message received

image

@junyoun-kim
Copy link
Collaborator

Hello @rgb-ryan , you can get disconnection event from st_status_cb callback function you put when call st_conn_start.

typedef void (*st_status_cb)(iot_status_t iot_status, iot_stat_lv_t stat_lv, void *usr_data);

/**
* @brief    st-iot-core server connection function
* @details  This function tries to connect server
* This function can't be used in callback functions such as init_cb, noti_cb, cmd_cb, status_cb
* @param[in]    iot_ctx     iot_context handle generated by st_conn_init()
* @param[in]    status_cb   user callback function to receive status of st-iot-core
* @param[in]    maps        status of st-iot-core interested to receive through status_cb
* @param[in]    usr_data    user data(a pointer) to use in status_cb
* @param[in]    pin_num     if PIN ownership validation type used, valid 8 digit pin should be set. otherwise set null.
* @return       return `(0)` if it works successfully, non-zero for error case.
*/
int st_conn_start(IOT_CTX *iot_ctx, st_status_cb status_cb,
        iot_status_t maps, void *usr_data, iot_pin_t *pin_num);

When disconnected, status_cb is called with parameters(iot_status = IOT_STATUS_IDLE, stat_lv = IOT_STAT_LV_STAY)

And you don't notify any error and warning events.

@junyoun-kim
Copy link
Collaborator

@junyoun-kim ,

Good morning

I had a problem with my test today. The auto-routine sequence is turn-on <-> turn-off every 2 minutes.

  1. Turn-on was successful at the 685th sequence.
  2. An error occurred and reconnection was successful.
  3. Turn-off was successful at the 686th sequence.

After the 686th sequence, I should receive a turn-on command from the server after 2 minutes. But I did not receive any response from the server.

There is no notification in the smartthings app after the 686th sequence either.

Please help me troubleshoot.

----------------- log --------------------- Sequence number return : 684

I (81376776) [IoT]: iot_cap_commands_cb(1106) > command : {"commands":[{"id":"62f9234c-090d-4ab5-ae08-11508b77ca02","component":"main","capability":"switch","command":"off","arguments":[]}]}

Notification message received

I (81497906) [IoT]: iot_cap_sub_cb(908) > command : {"commands":[{"id":"947e9c42-441d-482b-b78e-7b9681f227bb","component":"main","capability":"switch","command":"on","arguments":[]}]}

called [caps_switch_cmd_on_cb] func with num_args:0

I (81497916) [IoT]: st_cap_send_attr(613) > publish event, topic : /v1/deviceEvents/41548fc8-a51a-46ff-83b7-4f14b7cdbec2, payload :

{"deviceEvents":[{"component":"main","capability":"switch","attribute":"switch","value":"on","providerData":{"sequenceNumber":685,"timestamp":"1733358888083"}}]}

Sequence number return : 685

I (81497946) [IoT]: iot_cap_commands_cb(1106) > command : {"commands":[{"id":"947e9c42-441d-482b-b78e-7b9681f227bb","component":"main","capability":"switch","command":"on","arguments":[]}]}

Notification message received

W (81500666) [IoT]: _iot_mqtt_process_received_publish(457) > duplicated PUB packet 13752

W (81503666) [IoT]: _iot_mqtt_process_received_publish(457) > duplicated PUB packet 13752

E (81503666) [IoT]: _iot_net_tls_read(600) > mbedtls_ssl_read = -0x7880

I (81503676) [IoT]: st_mqtt_disconnect(1434) > mqtt disconnect 0

I (81503676) [IoT]: _iot_net_show_status(89) > [1073427092] network socket status: sockfd 1733358893 readable 0 writable 54 sock_err 0 errno 0

I (81503726) [IoT]: iot_es_connect(1086) > connect_type: log-in

I (81503726) [IoT]: _iot_es_mqtt_connect(925) > url: mqtt-regional-apnortheast2.api.smartthings.com, port: 8883

I (81503736) [IoT]: _iot_es_mqtt_connect(931) > mqtt connect,

I (81503776) [IoT]: _iot_net_tls_connect(416) > Loading the CA root certificate 1317@0x3ffb8228

I (81506546) [IoT]: iot_es_connect(1106) > MQTT connect success sucess/try : 3/4

I (81511096) [IoT]: _check_connection_response(79) > Connection response payload {"target":"41548fc8-a51a-46ff-83b7-4f14b7cdbec2","currentTime":1733358902,"localMigrated":false,"event":"connect.success"}

I (81617606) [IoT]: iot_cap_sub_cb(908) > command : {"commands":[{"id":"849fb8ca-afb4-4c0d-86ec-00f3a7086bb3","component":"main","capability":"switch","command":"off","arguments":[]}]}

called [caps_switch_cmd_off_cb] func with num_args:0

I (81617616) [IoT]: st_cap_send_attr(613) > publish event, topic : /v1/deviceEvents/41548fc8-a51a-46ff-83b7-4f14b7cdbec2, payload :

{"deviceEvents":[{"component":"main","capability":"switch","attribute":"switch","value":"off","providerData":{"sequenceNumber":686,"timestamp":"1733359007783"}}]}

Sequence number return : 686

I (81617646) [IoT]: iot_cap_commands_cb(1106) > command : {"commands":[{"id":"849fb8ca-afb4-4c0d-86ec-00f3a7086bb3","component":"main","capability":"switch","command":"off","arguments":[]}]}

Notification message received

image

Could you share your auto-routine config?

@rgb-ryan
Copy link
Author

hello @junyoun-kim ,
It repeats turn-on and turn-off every 2 minutes.

image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants