HS3 crashing when entering mcsMQTT plugin config and flood of messages


    I'm wondering if anybody else is having issues with HS3 crashing when mcsMQTT is running and a large number of messages are coming in.

    I have an energy monitor that sends ~50 messages at a time every 6 seconds, plus a few other updates from other sensors.

    I have narrowed the crashing down to mcsMQTT and the inflow of messages, because when I disable my MQTT server everything works fine.
    HS3 crashes completely in Windows, along with some other plugins. It needs a complete restart of the HS3 application (but not the computer/OS). The crashing is somewhat random, but happens most of the time (more than 3/4 of the time) when I enter the mcsMQTT config. It happens with much higher certainty (90%+) when discover new topics (subscription to #) is selected. In either case, if it has happened once, it is quite likely to happen again if I enter the mcsMQTT config after a restart of HS3; the best way to avoid this is to leave it alone for a while before trying again, or to disable the MQTT server first.

    HS3 is running on 64-bit Windows 8.1 Pro with 8GB RAM on an Intel NUC with a Core i5 processor. CPU utilization is usually quite low (Avg 10-%, 3-4% typical with 30-40% bursts every few seconds). Disk is relatively good too: an SSD with no significant transaction queue.

    One thing to note is that the MQTT server runs on the same computer, in a virtual machine within VMWare Workstation, along with the client software that polls the energy monitor. So each poll of the energy monitor sends a bunch of updates to a database, plus all the updates to MQTT, which are then pulled into HS3 via mcsMQTT. There is another database that polls MQTT for the same data and stores everything; in other words, a small chain reaction occurs from each update.

    There are about 530 devices in the HS3 list, including virtual devices, status devices, MQTT devices, plugin devices, parent devices, etc.

    In mcsMQTT my queue is set to 30, and I have disabled discover new topics. If I don't touch the settings page, it will run fine for a long time. However, HS3 still feels a bit more twitchy than usual. If I do something like a Zwave Optimize or Full Optimize for a device, it may crash (maybe about 20% of the time).

    It feels like there is a timing/delay issue at hand here; that something is not happening when it's expected to (because of a delay from a flood of messages), and the computer just gives up.

    Any help or experience to share would be appreciated.

    Some stats attached below.

    Message Receive Queue
    Current Queue Depth: 0
    Max Queue Depth: 34
    Average Receive Processing Milliseconds: 39
    Average Receive Milliseconds: 221
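
As a rough sanity check on the figures above, the burst load can be worked out directly. This is only a sketch: it assumes that "Average Receive Processing Milliseconds" is a per-message cost, which the plugin statistics do not state, and the function name `burst_load` is made up for illustration.

```python
# Rough load estimate from the figures quoted in the post.
# Assumption (not confirmed by the plugin): the 39 ms average is per message.
MESSAGES_PER_BURST = 50   # "~50 messages at a time"
BURST_INTERVAL_S = 6.0    # "every 6 seconds"
AVG_PROCESSING_MS = 39    # "Average Receive Processing Milliseconds"


def burst_load(messages, interval_s, per_message_ms):
    """Return (messages per second, busy seconds per burst, fraction of interval busy)."""
    rate = messages / interval_s
    busy_s = messages * per_message_ms / 1000.0
    return rate, busy_s, busy_s / interval_s


rate, busy_s, utilisation = burst_load(
    MESSAGES_PER_BURST, BURST_INTERVAL_S, AVG_PROCESSING_MS
)
print(f"{rate:.1f} msg/s, {busy_s:.2f} s busy per burst, {utilisation:.1%} of the interval")
```

Under that assumption each burst needs about 1.95 seconds of processing out of every 6, so the steady-state rate alone would not be expected to overrun the receiver; it does not account for the other sensors or for what else HS3 is doing at the time.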

    #2
    Your description seems to indicate a startup initialization problem when everything is happening at once. There is not as much debug now as when timing was the focus, but there is still some. Can you enable debug so the startup information is captured and I can see if anything unusual is observed? If not, then some isolation builds will be the likely next step.
    The debug file is in \Data\mcsMQTT\mcsDebug.txt. It is enabled from the General tab.


      #3
      In the attached I changed the initialization sequence so that MQTT broker connection is not made until all internal initialization is completed. See if it helps. Unzip into the HS folder.
      Attached Files


        #4
        Hi Michael

        I've decided to upgrade to the latest version of your plugin and recreate all the required devices. I'm left with this. Is this something I need to resolve in a database editor? This entry is always there.

        Cheers..Pete
        Attached Files
        HS 2.2.0.11


          #5
          I'm guessing that the database had a device that no longer exists. The database editor is the easiest way to get rid of it.


            #6
            Hi Michael,

            I will be sending you a link to my log via PM. I've collected some data and it's a little too long to include here. My novice eyes can't see anything outstanding that might help explain the crash though.

            Will try the file you just posted and let you know. Thanks again.

            Originally posted by Michael McSharry
            Your description seems to indicate a startup initialization problem when everything is happening at once. There is not as much debug now as when timing was the focus, but there is still some. Can you enable debug so the startup information is captured and I can see if anything unusual is observed? If not, then some isolation builds will be the likely next step.
            The debug file is in \Data\mcsMQTT\mcsDebug.txt. It is enabled from the General tab.


              #7
              I looked at your file and see an issue with the ability of mcsMQTT and the broker to establish a connection. mcsMQTT tries at 10 second intervals. In the sequence below, the broker did not service the connection request in the time allowed. 10 seconds later mcsMQTT tried again, but now the broker refused to connect. The broker had likely connected sometime in the past 9 seconds, and now the port is tied to a mcsMQTT attempt that had timed out and is no longer being serviced. On the 4th attempt a connection was established.

              15 minutes later mcsMQTT saw the connection status as no longer connected and tried to reconnect.

              My recommendation is to put the MQTT broker and mcsMQTT on different computers. The IP layer protocol that you have right now between these two seems fragile, with excessive delays in responding and unstable status. I don't have much experience with VMs, but putting the broker on a real machine is what I suggest, at a minimum as a failure isolation step.
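
One way to check the link between the HS3 machine and the broker independently of mcsMQTT is a plain TCP probe of the broker port. The following is a minimal sketch, not anything the plugin provides: the names `probe_broker` and `probe_repeatedly` are hypothetical, and it only tests whether the port accepts a TCP connection, not whether MQTT itself is healthy.

```python
import socket
import time


def probe_broker(host, port, timeout_s=5.0):
    """Attempt one TCP connection; return (status, elapsed seconds)."""
    start = time.monotonic()
    try:
        # Open and immediately close a connection to the broker port.
        with socket.create_connection((host, port), timeout=timeout_s):
            status = "accepted"
    except ConnectionRefusedError:
        status = "refused"
    except socket.timeout:
        status = "timed out"
    except OSError as exc:
        status = f"error: {exc}"
    return status, time.monotonic() - start


def probe_repeatedly(host, port, attempts=5, pause_s=2.0):
    """Probe several times so intermittent refusals or slow accepts show up."""
    for attempt in range(1, attempts + 1):
        status, elapsed = probe_broker(host, port)
        print(f"attempt {attempt}: {status} after {elapsed:.2f} s")
        time.sleep(pause_s)
```

Calling `probe_repeatedly("10.19.0.81", 1883)` from the HS3 machine would use the broker address that appears in the debug log. A mix of "refused" and slow "accepted" results would point at the VM networking rather than at the plugin.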

              What is not obvious from the debug is why the inability to connect and keep a connection has a negative impact on HS. I see several breaks in the debug file with nothing pointing a finger. The debug file contains the current plugin execution sequence, and it also has remnants of other debug that had not yet been overwritten. Each startup is written at the start of the file, so no prior startups are preserved; the lines preceding a break contain what happened just before the new startup. What I can see is that HS was calling mcsMQTT Shutdown IO at the end, so at least for the cases in the debug, HS had sufficient control to execute the normal shutdown protocol of a plugin.

              Code:
              2018-06-24 4:29:30 PM	3304131	| Calling MQTTclient  
              2018-06-24 4:29:30 PM	3304131	| Calling MQTT Connect  
              2018-06-24 4:29:51 PM	3325131	| StartMQTT Connection attempt1 to Broker. Exception: uPLibrary.Networking.M2Mqtt.Exceptions.MqttConnectionException: Exception connecting to the broker ---> System.Net.Sockets.SocketException: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond 10.19.0.81:1883
              
              2018-06-24 4:30:01 PM	3335133	| Calling MQTTclient  
              2018-06-24 4:30:01 PM	3335133	| Calling MQTT Connect  
              2018-06-24 4:30:02 PM	3336126	| StartMQTT Connection attempt1 to Broker. Exception: uPLibrary.Networking.M2Mqtt.Exceptions.MqttConnectionException: Exception connecting to the broker ---> System.Net.Sockets.SocketException: No connection could be made because the target machine actively refused it 10.19.0.81:1883
              
              2018-06-24 4:30:12 PM	3346128	| Calling MQTTclient  
              2018-06-24 4:30:12 PM	3346128	| Calling MQTT Connect  
              2018-06-24 4:30:13 PM	3347131	| StartMQTT Connection attempt1 to Broker. Exception: uPLibrary.Networking.M2Mqtt.Exceptions.MqttConnectionException: Exception connecting to the broker ---> System.Net.Sockets.SocketException: No connection could be made because the target machine actively refused it 10.19.0.81:1883
              
              2018-06-24 4:30:34 PM	3368139	| MQTT Broker Connection Accepted, Connected=True  
              
              2018-06-24 4:44:44 PM	4218275	| Calling MQTTclient  
              2018-06-24 4:44:44 PM	4218275	| Calling MQTT Connect  
              2018-06-24 4:44:45 PM	4219282	| StartMQTT Connection attempt1 to Broker. Exception: uPLibrary.Networking.M2Mqtt.Exceptions.MqttConnectionException: Exception connecting to the broker ---> System.Net.Sockets.SocketException: No connection could be made because the target machine actively refused it 10.19.0.81:1883
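
The timing in the log above can be summarized by pairing each "Calling MQTT Connect" line with the outcome line that follows it. This is a sketch under two assumptions: that the second column is an elapsed-milliseconds counter (it advances by 21000 between 4:29:30 PM and 4:29:51 PM, which is consistent with that), and that the three message fragments matched below are enough to classify an outcome. The names `classify` and `summarize` are illustrative only.

```python
import re

# Excerpt of the debug log; exception text is abbreviated here for width.
SAMPLE = """\
2018-06-24 4:29:30 PM\t3304131\t| Calling MQTT Connect
2018-06-24 4:29:51 PM\t3325131\t| StartMQTT Connection attempt1 to Broker. Exception: ... did not properly respond after a period of time ...
2018-06-24 4:30:01 PM\t3335133\t| Calling MQTT Connect
2018-06-24 4:30:02 PM\t3336126\t| StartMQTT Connection attempt1 to Broker. Exception: ... the target machine actively refused it ...
2018-06-24 4:30:12 PM\t3346128\t| Calling MQTT Connect
2018-06-24 4:30:13 PM\t3347131\t| StartMQTT Connection attempt1 to Broker. Exception: ... the target machine actively refused it ...
2018-06-24 4:30:34 PM\t3368139\t| MQTT Broker Connection Accepted, Connected=True
"""

# timestamp, millisecond counter (assumed), then the message after the bar
LINE = re.compile(r"^(?P<stamp>.+?)\s+(?P<ms>\d+)\s+\|\s*(?P<msg>.*)$")


def classify(msg):
    """Map a log message to an outcome label, or None if it is not an outcome."""
    if "Connection Accepted" in msg:
        return "accepted"
    if "actively refused" in msg:
        return "refused"
    if "did not properly respond" in msg:
        return "timed out"
    return None


def summarize(log_text):
    """Return [(outcome, ms waited since the preceding connect call or None)]."""
    results, pending_ms = [], None
    for line in log_text.splitlines():
        m = LINE.match(line.strip())
        if not m:
            continue  # blank or unrecognized line
        ms, msg = int(m["ms"]), m["msg"]
        if "Calling MQTT Connect" in msg:
            pending_ms = ms
            continue
        outcome = classify(msg)
        if outcome:
            waited = None if pending_ms is None else ms - pending_ms
            results.append((outcome, waited))
            pending_ms = None
    return results


for outcome, waited in summarize(SAMPLE):
    print(outcome, "n/a" if waited is None else f"{waited} ms")
```

On this excerpt the first attempt waits about 21 seconds before timing out, while the two refusals come back in about one second each. The accepted line has no matching connect call in the excerpt, so no wait is reported for it.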


                #8
                I have updated to the version of the plugin you supplied earlier in the thread. In general use, it does feel more stable than the previous version. Before, I would get random crashes going into the plugin setup, and it would be stuck in a crashing streak for a while even after restarts.

                Now, I have only managed to get it to crash after enabling subscribe to all topics, leaving the plugin setup page for another page in HS3, then going back into the mcsMQTT setup page. It crashed both times I tried. I had enabled the debug the second time, but the debug only picked up the restart after the crash. The crash always happened while the plugin setup page was loading. However, once I was in the plugin, it was stable and did not crash even when I navigated between pages of the plugin.

                After the crash, restarting HS3 allowed me to get back into the plugin setup on the first try in both cases. So it's definitely more usable than before.


                  #9
                  This is good. Did you try anything to address the broker connection/timing?
