Future Fixes for HS4 Proposed

    I'm not sure if there is a better place to raise this, but I assume HomeSeer is working on an HS4 version of the Z-Wave plugin, and I wanted to raise a few issues with the current version that I hope HomeSeer can address in the HS4 version.

    The purpose here is not to call for new features, but to address a few issues I've run into in current operation: non-obvious "errata" that would be hard for new users to identify or diagnose. I'm not looking for scripts or other workarounds for the current plugin, as I already have some. The goal is to call out problems that have reasonable fixes, so that the end user does not have to detect that a problem exists and then figure out how to solve it. In other words, these are problems where the fixes should be "invisible" to the user.

    They relate to the following subjects (which are detailed below):

    I. Startup Device Synchronization
    II. Confirmation of Commands / "Lost" Commands


    I'm assuming that a number of the issues may exist due to limitations in the HS3 plugin API, but as HS4 is still unreleased, maybe there's an opportunity to influence it and the plugin API to improve the operations.

    Here are my thoughts:

    I. Startup Device Synchronization

    If HomeSeer is shut down or the plugin is turned off, and the user changes devices during that time (e.g., turns them on or off), HomeSeer's internal device status can get out of sync with the actual device state. The status stays wrong until the device is next activated, so event conditions that test it evaluate against a mistaken state. This is particularly bad if the devices aren't set to automatic polling, in which case the problem can persist.

    What I would like to see is that, each time the plugin is restarted, it does a full poll of all Z-Wave devices to re-establish the proper state.

    I currently address this with a script that runs at startup and polls all Z-Wave devices. However, relying on end users to implement scripting is a poor solution: "beginners" may not know the problem exists and should not have to resort to scripting. The HomeSeer application / plugin should be responsible for keeping device state synchronized.

    II. Confirmation of Commands / "Lost" Commands
    I've found that Z-Wave can fail to command devices, particularly when a large number of devices are controlled in a burst. For example, if I try to turn 50 dimmers on or off, some may not execute the command. I'm assuming packets are getting lost on the network and aren't being re-transmitted. Most network protocols have a "Send", "Receive Acknowledgement", "Resend If Not Acknowledged" sequence, which I assume may be lacking in the Z-Wave plugin. It seems that the Z-Wave plugin should implement its own re-transmission protocol so it is better protected against commands failing to execute. I propose something along the following lines be included in the protocol:
    1. The plugin should maintain a table of Device, Command Sent, Send Time, Number of Tries.
    2. When a command is sent, a row is added to the table identifying the device, the command, when it was sent, and the number of tries so far.
    3. I understand that most Z-Wave devices (or most Z-Wave Plus?) provide a response when a change of state occurs, so if a command is acknowledged - i.e., you receive back a confirmation of a new value - the row described in #2 is removed from the table. If the plugin receives back a different value than was sent, it should still remove the row, as some devices may respond to the command with a different result value - e.g., if you send the "last level" command 255 to a Z-Wave dimmer, you'll receive back the new level, e.g., "55".
    4. As I said in #3, I understand that "most" devices provide a response. I think some older Z-Wave devices may not. But the plugin should "know" whether a device does or not, and this is something the plugin could establish when the device is first added to the system - i.e., during the add/include procedure for a device, the plugin could send a command, check whether it gets acknowledged, and store the result in its device database. If a device doesn't acknowledge its commands, then shortly after a command is sent the plugin should poll the device to determine whether the value changed. Of course, for devices that already exist in a person's setup, this "determination of response" should be done for all devices during the first startup of the plugin version that has this feature.
    5. If an acknowledgement is not received after a period of time (e.g., 2 seconds), a re-transmission is attempted and the plugin increments the retry count for the device. If the number of retries exceeds some limit, an error is generated.
    6. Maybe have a check-mark on a device's Z-Wave page allowing this to be turned on/off for a device, and a box to set the transmission attempts allowed. (The default would be to provide the re-transmission service with 3 tries.)
    7. A further enhancement would be a new Event trigger, "Transmission failure", that sets global variables identifying the device (e.g., Reference, Name, Location1, Location2, Failed Command / Value, Current Value, Retries, Time). A user could then use these in an Event to act on a device failure (e.g., send a warning email if a critical device has failed to respond).
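
The table described in steps 1-5 could be sketched roughly as follows. This is only an illustration of the proposal in plain Python, not how the Z-Wave plugin works: the `RetryTable` and `PendingCommand` names, the `send` callback, and the injectable `clock` are all hypothetical, and the 2-second timeout and 3 tries are the defaults suggested above.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class PendingCommand:
    device: int      # device / node reference (hypothetical identifier)
    value: int       # command value that was sent
    sent_at: float   # time of the most recent transmission
    tries: int = 1   # number of transmissions so far


class RetryTable:
    """Tracks unacknowledged commands and re-sends them (illustrative only)."""

    def __init__(self, send: Callable[[int, int], None], timeout: float = 2.0,
                 max_tries: int = 3, clock: Callable[[], float] = time.monotonic):
        self.send = send              # assumed callback that transmits a command
        self.timeout = timeout        # seconds to wait for an acknowledgement
        self.max_tries = max_tries    # total transmissions allowed per command
        self.clock = clock
        self.pending: Dict[int, PendingCommand] = {}
        self.failures: List[PendingCommand] = []

    def command(self, device: int, value: int) -> None:
        # Step 2: add a row when the command is sent.
        self.pending[device] = PendingCommand(device, value, self.clock())
        self.send(device, value)

    def on_report(self, device: int, reported_value: int) -> None:
        # Step 3: any report from the device clears the row, even if the
        # reported value differs from the value sent (e.g. 255 -> 55).
        self.pending.pop(device, None)

    def tick(self) -> None:
        # Step 5: re-send rows that timed out; give up after max_tries.
        now = self.clock()
        for row in list(self.pending.values()):
            if now - row.sent_at < self.timeout:
                continue
            if row.tries >= self.max_tries:
                del self.pending[row.device]
                self.failures.append(row)   # would raise the error / trigger in step 7
                continue
            row.tries += 1
            row.sent_at = now
            self.send(row.device, row.value)
```

In this sketch `tick()` would be called periodically (for example once a second), and the `failures` list stands in for the error or "Transmission failure" trigger of steps 5 and 7.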

    I would appreciate someone from HS responding with thoughts on these proposals. Maybe HS is already hard at work on these issues and has better solutions.

    #2
    I think it is also a good idea to share this with HS support as the HS team is not reading all the posts on the forum.

    ---
    John



      #3
      Yes, if you want the dev team to see this, you should send it to support@homeseer.com


        #4
        FWIW, the Z-Wave protocol does have retries, and acknowledgements are part of the protocol. The timeout waiting for acknowledgements is configurable in the Z-Wave plugin. Not sure if the number of retries is configurable.
        HS 4.2.8.0: 2134 Devices 1252 Events
        Z-Wave 3.0.10.0: 133 Nodes on one Z-Net



          #5
          Originally posted by sparkman
          FWIW, the Z-Wave protocol does have retries, and acknowledgements are part of the protocol. The timeout waiting for acknowledgements is configurable in the Z-Wave plugin. Not sure if the number of retries is configurable.
          Thanks. I just went back and looked. The plugin does have a configuration setting for "poll retries limit", which seems to be a narrower concept, but one that's in the right arena. What I'm looking for is, essentially, a "command retries" setting: extending the idea of retrying polls to encompass other commands as well.



            #6
            Originally posted by jvm

            Thanks. I just went back and looked. The plugin does have a configuration setting for "poll retries limit", which seems to be a narrower concept, but one that's in the right arena. What I'm looking for is, essentially, a "command retries" setting: extending the idea of retrying polls to encompass other commands as well.
            I just tried sending a command to a switch on which I had pulled the air-gap, with debug logging turned on, and the logs show only one attempt: “attempt 0 sending to Node 55”. Not sure whether the retries could be set in the ini file for the Z-Wave plugin. I don’t see an entry for it.


              #7
              A command retry capability is something I've always wanted, as we don't live in a perfect world. The PI is able to queue commands, so why not remove them only once they are acknowledged? Allow a retry backoff period (maybe even per device), maybe issue a status poll if the device doesn't acknowledge the command after x tries, and allow an action such as "if failed, then do x".

              Even if that's not possible, at least give us some way to handle a node going unknown. The fact that it's not an explicit thing you can trap for, and can only be checked by scripting against some property, is not exactly conducive to preventing or monitoring for problems.

              I'd also appreciate a way to restore a network in the event of server explosion or Z-Wave controller death. Allow us to change the network ID, import the devices, and map the associations to what they used to be.
              If you've ever had to redo your controller, you still have to redo the associations. Why not allow us to change the ID of the controller?

              One more thing - if it's possible, allow configuration commands to be queued for battery-powered nodes. The PI really doesn't handle them well (especially parameters), and it would be great to have some interface that lets me say "set parameter x to y at next wake up" or "run an optimise 3 times".
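
The "set parameter x to y at next wake up" idea amounts to a per-node queue that is flushed when the node reports it is awake. A minimal sketch, assuming a hypothetical `send_config` callback and a hypothetical wake-up notification hook (neither is taken from the actual plugin):

```python
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, Tuple


class WakeUpQueue:
    """Holds configuration changes for sleeping nodes until they wake up."""

    def __init__(self, send_config: Callable[[int, int, int], None]):
        self.send_config = send_config  # assumed callback: (node, parameter, value)
        self.queued: Dict[int, Deque[Tuple[int, int]]] = defaultdict(deque)

    def set_parameter_at_wakeup(self, node: int, parameter: int, value: int) -> None:
        # Remember the change instead of sending it to a sleeping node.
        self.queued[node].append((parameter, value))

    def on_wakeup(self, node: int) -> int:
        # Called when the node announces it is awake: send everything queued
        # for it, in order, and return how many changes were sent.
        jobs = self.queued.pop(node, deque())
        sent = 0
        while jobs:
            parameter, value = jobs.popleft()
            self.send_config(node, parameter, value)
            sent += 1
        return sent
```

Repeated jobs such as "run an optimise 3 times" could be handled the same way by queueing the job three times.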



                #8
                Originally posted by Furious

                Even if that's not possible, at least give us some way to handle a node going unknown. The fact that it's not an explicit thing you can trap for, and can only be checked by scripting against some property, is not exactly conducive to preventing or monitoring for problems.
                You can check for it with an event trigger. No scripting is required...


                  #9
                  Originally posted by sparkman

                  You can check for it with an event trigger. No scripting is required...
                  I wasn't aware of that - which event trigger allows you to check for a node going unknown?

                  The advice that you can use an event trigger is helpful, but it does bring this full circle back to the original point: there's too much non-obvious knowledge that the user has to have in order to handle simple things that would be better handled in a standardized way. The user shouldn't have to know that they need to set up events just to find out that their system is faulting. As a top-of-head example, one easier solution would be a simple tab on the HomeSeer web interface that shows up when there is a critical problem (device not responding, battery below threshold, etc.).

                  I know HS has a lot on their plate as they try to get out a stable HS4 release, but the point of the new release should not be merely to replicate what we already have; I'd like to see it make things easier for new users. To me, that means more dummy-proofing; less searching the forums for answers to common questions (i.e., more information in the interface about how things work and common problems to avoid); and not having to look at logs for problems (instead, having critical information shown in so-easy-my-parent-would-understand form without a lot of user customization). I'm not really expecting this in the 4.0 release, as they have to get that working well enough to support prior use cases, but I hope they work toward it in a 4.1 release (or so).



                    #10
                    Side note... can you share the script you are using to poll at startup?

                    Seems useful



                      #11
                      Originally posted by mbg0333
                      Side note... can you share the script you are using to poll at startup?

                      Seems useful
                      Script is attached. It was originally written by others.

                      1. Place the script from the attached Zip file in the HomeSeer scripts directory (on Windows, that's C:/Program Files (x86)/HomeSeer HS3/scripts).

                      2. Then create an event that is triggered Manually and which, when triggered, runs this script. I have this organized in an event group that I call "Startup and Shutdown Events", and I call the event "Startup Event". I also include other things in this event - e.g., I check that locks are locked and that lights are in the proper state for night or day (basically, this startup event acts as a recovery event in case HomeSeer crashed and had to be rebooted).

                      3. Then modify the HomeSeer startup script, "Startup.vb", by adding hs.TriggerEvent("Startup Event") right before its last line (that is, just before the "End Sub"). Startup.vb is found in the scripts directory mentioned in #1.


                      I have this polling script set to do a "slow" poll of Z-Wave devices. It sends out one poll every 500 milliseconds. This is to make sure I don't "flood" the network with too many polls as other startup activities are taking place.
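
The pacing idea itself is simple. The following is not the attached script (which is not reproduced here), only a plain-Python sketch of a throttled poll loop; the `poll` callback and the device list are hypothetical stand-ins for whatever the real script calls:

```python
import time
from typing import Callable, Dict, Iterable


def poll_all(devices: Iterable[int], poll: Callable[[int], int],
             interval: float = 0.5,
             sleep: Callable[[float], None] = time.sleep) -> Dict[int, int]:
    """Poll each device in turn, pausing `interval` seconds between polls."""
    results = {}
    for index, device in enumerate(devices):
        if index > 0:
            sleep(interval)  # space the polls out so the network isn't flooded
        results[device] = poll(device)
    return results
```

At 500 milliseconds per poll, a network of about 130 nodes would take roughly a minute to sweep, which is the trade-off for not competing with other startup traffic.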

                      Attached Files



                        #12
                        Originally posted by sparkman

                        You can check for it with an event trigger. No scripting is required...
                        Wow, it really does work - invalid or error state triggers work...

                        Thanks for that - now, if the event engine could handle something like "if error state, then resend the command x times at y intervals and give up", that would mean a lot less worry about the odd misfire of comms



                          #13
                          Originally posted by Furious

                          Wow, it really does work - invalid or error state triggers work...

                          Thanks for that - now, if the event engine could handle something like "if error state, then resend the command x times at y intervals and give up", that would mean a lot less worry about the odd misfire of comms
                          I now see that I can set the trigger using INVALID/ERROR states, but writing events for every device would be unwieldy. There's also the option to set:
                          "If Any Device is set to an INVALID/ERROR State"

                          And I can then use this to do something like send an email, but I don't see a way to identify or act on the specific one of the "Any Device"-es that is in error. Is there some way to do that?
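
What is being asked for here is a way to go from "some device is in error" to "these specific devices are in error". As an illustration of the idea only (it does not use the HomeSeer API; the `DeviceStatus` record and its `in_error` flag are hypothetical stand-ins for however device state is exposed):

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class DeviceStatus:
    ref: int          # device reference (hypothetical field)
    name: str
    location: str
    in_error: bool    # True if the device is in an INVALID/ERROR state


def devices_in_error(devices: Iterable[DeviceStatus]) -> List[DeviceStatus]:
    """Return only the devices currently flagged as being in error."""
    return [d for d in devices if d.in_error]


def format_alert(failed: List[DeviceStatus]) -> str:
    """Build a message body naming each failed device."""
    if not failed:
        return "No devices in error."
    lines = [f"{len(failed)} device(s) in INVALID/ERROR state:"]
    lines += [f"- {d.location} {d.name} (ref {d.ref})" for d in failed]
    return "\n".join(lines)
```

The resulting text could then be used as the body of the email sent by the "Any Device" event.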



                            #14
                            jvm

                            Before that option existed (or before I knew it existed), I had set up an alert using UltraLog. It happens so rarely for me, and mainly due to unplugged modules, that I've never set anything up to automate re-polling, re-sending a command, etc. I just send a Pushover message to let me know something is amiss.
                            Attached Files