ESP Broker CRASHES Always with the SECOND Connection Attempt
    Michael... you helped me way back when I tried to resolve this issue. Now that COVID-19 is allowing me so much time, I want to get rid of the CRASHES forever. I have been conversing with Martin Ger, the creator of uMQTTBroker, which runs the MQTT broker on the ESP. Between the two of you, I am hoping to get enough information to find a solution.

    Broker connects successfully after a “fresh” ESP boot, but ALWAYS crashes on the SECOND connection attempt from HS3... but will connect multiple times to other clients.

    Observation:
    • Start ESP Broker
    • Broker properly connects to its unique CLIENTS (SONOFF/Tasmota and Custom ESP)
    • HS3 properly Connects to Broker
    • All Is well...
    • Disable HS3 connection to Broker
    • Enable HS3 connection
    • Broker receives HS3 MQTT Client ID
    • CRASH... sometimes ESP auto-starts, sometimes it is dead
    I have this reduced to a very minimal test platform with NO external clients and ONLY HS3 communication... same result.

    So, although the crash dump shows the failure location in uMQTTBroker library, it only fails with connections to mcsMQTT. I will be trying to learn some more complex debugging skills today.

    Confusing to me. Your thoughts can only help... LZH

    Crash dump shows:
    Exception 29: StoreProhibited: A store referenced a page mapped with an attribute that does not permit stores
    PC: 0x4021ebee: append_string at C:\Users\r\Documents\Arduino\libraries\uMQTTBroker-master\src\mqtt_msg.c line 45
    EXCVADDR: 0x00000003
    which is in uMQTT code...

    static int ICACHE_FLASH_ATTR append_string(mqtt_connection_t * connection, const char *string, int len) {
        if (connection->message.length + len + 2 > connection->buffer_length)
            return -1;
        connection->buffer[connection->message.length++] = len >> 8;   /* THIS IS CRASH LINE 45 */
        connection->buffer[connection->message.length++] = len & 0xff;
        os_memcpy(connection->buffer + connection->message.length, string, len);
        connection->message.length += len;
        return len + 2;
    }

    #2
    The error message seems to be telling you that the code calling the append_string procedure is passing, in the connection parameter, a pointer to an area of memory that is outside the valid range. One possible cause is indexing into an array with a negative subscript because a data type is signed where the code assumed it was unsigned. Insert diagnostic code in the calling procedure(s).
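As a rough illustration of the kind of diagnostic code meant here, the sketch below wraps the same append logic in pointer and range checks. The type conn_sketch_t and the function append_string_checked are hypothetical stand-ins, not part of uMQTTBroker; only the three fields that append_string touches in the crash dump are modelled. One reading of the dump, which is an inference and not confirmed anywhere in this thread, is that EXCVADDR 0x00000003 is the address connection->buffer[connection->message.length] would resolve to if buffer were NULL and message.length were 3.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the library's connection type; only the
 * fields used by append_string in the crash dump are modelled. */
typedef struct {
    uint8_t *buffer;
    uint16_t buffer_length;
    struct { uint16_t length; } message;
} conn_sketch_t;

/* Appends a length-prefixed string (2-byte big-endian length, then the
 * bytes). Returns the number of bytes written (len + 2), or -1 if any
 * input is rejected. Nothing is written on failure. */
static int append_string_checked(conn_sketch_t *c, const char *s, int len)
{
    if (c == NULL || c->buffer == NULL) {
        /* Diagnostic: report the bad pointer instead of storing through it. */
        fprintf(stderr, "append_string: invalid connection %p\n", (void *)c);
        return -1;
    }
    if (s == NULL || len < 0 || len > 0xFFFF)
        return -1;
    if (c->message.length + len + 2 > c->buffer_length)
        return -1;

    c->buffer[c->message.length++] = (uint8_t)(len >> 8);
    c->buffer[c->message.length++] = (uint8_t)(len & 0xff);
    memcpy(c->buffer + c->message.length, s, (size_t)len);
    c->message.length += (uint16_t)len;

    /* Invariant guaranteed by the size check above. */
    assert(c->message.length <= c->buffer_length);
    return len + 2;
}
```

With an 8-byte buffer and message.length starting at 3, appending "ab" writes the bytes 00 02 61 62 at offsets 3 to 6. On the ESP the fprintf call would need to be replaced by whatever logging the firmware already uses.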

    mcsMQTT attempts a reconnect in the same manner as the initial attempt, so I would expect the support library M2MQTT to perform the same socket-level sequence. I have not looked at the library to try to understand this sequence. The reconnect is attempted at 10-second intervals.


      #3
      Michael,

      Have another unusual set of data and could use your opinions.

      I am also sending this off to Martin Ger, because it is his code that cannot recover; your skilled MQTT knowledge could help.

      New Information

      My ESP crash is NOT triggered by the Second mqtt Connect, but is triggered by an “offline” data communication.

      Process

      The program loaded on the ESP is simple: increment a counter, publish the value, and connect to HS3 when asked.

      Boot ESP;

      Get connection onData topic/data is xxxPCFEB2019/mcsMQTT/LWT Online

      HS3 is receiving the count

      mcsMQTT Broker Connection... Check the “Disconnect from MQTT Brokers” box AND KEEP IT CHECKED

      Should I be seeing the mqtt offline coming through???

      wait wait wait...lots of wait...

      ?? NO “offline” data communication from HS3... I am assuming this is OK but I ask the question?

      Now, tired of waiting...

      Unchecked the box, allowing a new connection attempt

      HS3 sends mqtt: onData topic/data is xxxPCFEB2019/mcsMQTT/LWT Offline

      CRASH


        #4
        The broker sends the offline status based upon the prior connection where mcsMQTT disclosed the LWT topic and payload. If you are not getting the offline status then you need to look at your broker.

        mcsMQTT will send the LWT Online status when a connection is successful.

        This tells me your broker does not recognize that mcsMQTT has disconnected (intentionally). It should know this when the connection is dropped or when the next ping fails to return a response.
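The broker-side decision described above can be sketched as follows. This is a minimal illustration, not uMQTTBroker code: client_sketch_t and both function names are hypothetical. It assumes MQTT 3.1.1 rules, under which a broker treats a client as gone if no control packet arrives within one and a half times the keep-alive period, and discards the will (so no "Offline" is published) if the client sent a DISCONNECT packet before closing. Whether mcsMQTT sends DISCONNECT when the box is checked is not stated in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-client record a broker might keep. */
typedef struct {
    uint16_t keep_alive_s;     /* from CONNECT; 0 disables the timeout */
    uint32_t last_rx_ms;       /* time the last control packet arrived */
    bool     has_will;         /* CONNECT carried an LWT topic/payload */
    bool     clean_disconnect; /* a DISCONNECT packet was received */
} client_sketch_t;

/* True when more than 1.5x the keep-alive period has passed since the
 * last packet. Unsigned subtraction keeps this correct across a
 * wrap-around of the millisecond counter. */
static bool keep_alive_expired(const client_sketch_t *c, uint32_t now_ms)
{
    assert(c != NULL);
    if (c->keep_alive_s == 0)
        return false;
    uint32_t limit_ms = (uint32_t)c->keep_alive_s * 1500u;
    return (uint32_t)(now_ms - c->last_rx_ms) > limit_ms;
}

/* True when the broker should publish the client's will message. */
static bool should_publish_will(const client_sketch_t *c,
                                bool socket_closed, uint32_t now_ms)
{
    assert(c != NULL);
    if (!c->has_will || c->clean_disconnect)
        return false;
    return socket_closed || keep_alive_expired(c, now_ms);
}
```

For a 60-second keep-alive, the timeout in this sketch fires once more than 90 seconds have passed without any packet from the client.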


          #5
          Michael... I have been working with Martin Ger, the creator of the ESP library uMQTTBroker. I provided him my Wireshark analysis. He asked about the sequence mcsMQTT uses to connect:
          "You can ask, why it tries to establish a second parallel session with the same ClientId (see the analysis from my last post). To my understanding this should not happen and I don't know what to do in this case."
          Let me know what you think, and I will pass it on.

          I am attaching my Wireshark analysis should you be interested. LZH


            #6
            The second connection is requested when the connection status of the first indicates "not connected". Once the connection has been dropped, there is no way for mcsMQTT to communicate with the MQTT broker, so a new connection is needed. The client ID does not change because it is the same client connecting as before. It is just as if the client had gone through a power cycle and was looking for its initial connection.

            The question to be answered is why the connection was dropped, or why it appeared to have dropped.

            Looking at the M2MQTT library, there are two code paths to "not connected". M2MQTT is available on GitHub. The first is a socket-level exception, as shown below. This is the most likely path, where the connection is reset by the broker.

            [Image: Capture.PNG]

            The second is an intentional closure of the socket. This could be initiated by the mcsMQTT client, but only after a connection status check has previously failed or during a normal shutdown.

            [Image: Capture1.PNG]

            There is no indication in the Wireshark capture of a connection being intentionally disconnected, yet it looks like the .NET socket library has raised an exception about port 1883 no longer being available.

            There should never be a case where the same MQTT client ID has multiple connections. When this is done with the Mosquitto broker, the results seen by the client are not as expected. The appropriate broker action when a new connection is requested by a client ID that already has a connection seems to be to close the prior connection. Alternatively, it could reject the connection because of the duplicate client ID.
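The first option, closing the prior connection, is also what the MQTT 3.1.1 specification requires of a server when a CONNECT arrives with a client ID that is already connected. A minimal sketch of that takeover logic is shown here; session_sketch_t, accept_connect, record_close, and the table sizes are all hypothetical and are not taken from uMQTTBroker or Mosquitto.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_SESSIONS 4   /* sketch-only table size */
#define MAX_ID_LEN   23  /* sketch-only limit on client ID length */

/* Hypothetical broker session slot. */
typedef struct {
    bool in_use;
    int  socket_id;                 /* stand-in for a TCP connection handle */
    char client_id[MAX_ID_LEN + 1];
} session_sketch_t;

typedef void (*close_fn)(int socket_id);

/* Stand-in for closing a TCP connection: it only records the handle so
 * the takeover can be observed. */
static int last_closed = -1;
static void record_close(int socket_id) { last_closed = socket_id; }

/* Registers a CONNECT. If client_id already has a session, the old
 * socket is closed and the slot is reused for the new socket.
 * Returns the slot index, or -1 if the ID is too long or the table is full. */
static int accept_connect(session_sketch_t table[], size_t n,
                          const char *client_id, int socket_id,
                          close_fn close_socket)
{
    assert(table != NULL && client_id != NULL && close_socket != NULL);
    size_t id_len = strlen(client_id);
    if (id_len > MAX_ID_LEN)
        return -1;

    int free_slot = -1;
    for (size_t i = 0; i < n; i++) {
        if (table[i].in_use && strcmp(table[i].client_id, client_id) == 0) {
            close_socket(table[i].socket_id);  /* drop the stale connection */
            table[i].socket_id = socket_id;    /* same client, new socket */
            return (int)i;
        }
        if (!table[i].in_use && free_slot < 0)
            free_slot = (int)i;
    }
    if (free_slot < 0)
        return -1;

    table[free_slot].in_use = true;
    table[free_slot].socket_id = socket_id;
    memcpy(table[free_slot].client_id, client_id, id_len + 1);
    return free_slot;
}
```

The point of the design is that a reconnecting client never ends up with two live sessions: any per-connection buffers tied to the old socket can be released at the single place where close_socket is called, before the new CONNECT is processed further.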
