More Lost Event Actions


    I recently posted an issue (in the thread "Has anybody been losing delayed action event triggers?") where delayed event actions were occasionally not being processed. I.e., an event action (THEN) requests a device action to occur a little later. A little later comes around, and it's as if the delayed action had never been requested at all.

    So far, I have had no responses, so I decided to dig a little deeper myself. To start, I created a counter for test purposes only. Then I created a recurring event that does nothing but increment the counter 5 times:



    I expected the counter value to increase by 5 every 5 minutes. I was surprised to discover that the counter incremented by 1 every 5 minutes. Suspecting that there might be timing issues (i.e. a race condition) involved, I created another counter to be driven by a new event definition:

    [Image: 2019-05-11_19-07-57.jpg]
    Then I set both of these recurring events running. I found that the second event with the interspersed waits was consistently incrementing by 5 every 5 minutes, while the original event was incrementing its counter by only 1 when it ran every 5 minutes. The obvious conclusion: if you try to increment a counter too often (under at least some conditions), not every event action is going to get honored.

    So, what are the implications? Call me naive, but I think that if HS3 lets me build an event definition, then it ought to execute that event as written when triggered, subject to external hardware constraints. Alternatively, if that's asking too much, then I think it's critical for HS3 documentation to identify precisely what event action sequences are beyond its ability to perform. (Perhaps that has been done, and I have just not read the appropriate material. If so, I would appreciate a pointer to the information I need.) And if I can't have even that, then will somebody please tell me how I can create a secure system, knowing that HS3 sometimes won't perform event actions, though nobody can say what, or why?

    #2
    This is an interesting one. Perhaps rjh has some info.



      #3
      Can't see your first example, so I'm assuming it's the action "Increment counter by 5". When I make a test event to run every 5 mins, it increments correctly for me.

      Z



        #4
        I would need to see the first Event. Since you cannot use a delayed device action to increment a counter, I would need to see what the event is doing.
        HS4 Pro, 4.2.19.0 Windows 10 pro, Supermicro LP Xeon



          #5
          vasrc and , this is actually getting funny. Now I've lost an event from a posting! I live in an area where Internet service is often problematic, and last night was especially bad. I had to try many times to upload the pictures. I even had a pop-up box tell me I had "lost a security token", and that I should report it to the board administrator. But that didn't work, either.

          Anyway, this is the event that apparently didn't make it to the posting:

          [Image: 2019-05-11_18-53-14.jpg]



            #6
            Originally posted by ericg:
            vasrc and , this is actually getting funny. Now I've lost an event from a posting! I live in an area where Internet service is often problematic, and last night was especially bad. I had to try many times to upload the pictures. I even had a pop-up box tell me I had "lost a security token", and that I should report it to the board administrator. But that didn't work, either.

            Anyway, this is the event that apparently didn't make it to the posting:

            [Image: 2019-05-11_18-53-14.jpg]
            Any reason you're not using "Increment counter BY" instead, or is it programmatic? You're probably right that there's a timing issue, but using "Increment by 5" should work as well.

            Z



              #7
              Originally posted by ericg:
              vasrc and , this is actually getting funny. Now I've lost an event from a posting! I live in an area where Internet service is often problematic, and last night was especially bad. I had to try many times to upload the pictures. I even had a pop-up box tell me I had "lost a security token", and that I should report it to the board administrator. But that didn't work, either.

              Anyway, this is the event that apparently didn't make it to the posting:

              [Image: 2019-05-11_18-53-14.jpg]
              The problem with this event is that all actions are essentially launched simultaneously, not sequentially, so it will probably only increment the counter once.

              Adding a Wait as you did in your other example will cause the actions to be performed sequentially.
              HS4 Pro, 4.2.19.0 Windows 10 pro, Supermicro LP Xeon



                #8
                Originally posted by ericg:
                vasrc and , this is actually getting funny. Now I've lost an event from a posting! I live in an area where Internet service is often problematic, and last night was especially bad. I had to try many times to upload the pictures. I even had a pop-up box tell me I had "lost a security token", and that I should report it to the board administrator. But that didn't work, either.

                Anyway, this is the event that apparently didn't make it to the posting:

                [Image: 2019-05-11_18-53-14.jpg]
                The problem is that the "Then" statements are not necessarily run when encountered. Instead, they are "scheduled" to run, and they actually run when HomeSeer pauses from doing other things, including the processing of this event. I have seen this when changing values on a virtual switch: the Then statements that follow will still be using the previous switch value.

                The Wait statement allows HomeSeer to catch up (a single second is a long time to a computer). I always put a Wait action after a "Then" action if that action affects the actions that directly follow.

                With a counter, the multiple rapid "schedulings" are stepping on each other. This is not a big problem, since counters can have their values explicitly set in an action. However, no such value setting is available in scripts, which greatly reduces the usability of counters within scripts.
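
To make the "stepping on each other" idea concrete, here is a toy model in Python. It is only a guess at the mechanism, not HomeSeer's actual engine: it assumes each increment action reads the counter when it is scheduled and writes its result later. The names `Counter`, `run_batch` and `run_with_waits` are made up for illustration.

```python
# Toy model only: NOT HomeSeer's real event engine. It assumes each
# "increment" action reads the counter at scheduling time and writes later.

class Counter:
    def __init__(self):
        self.value = 0


def run_batch(counter, n_actions):
    """Launch n_actions increments at essentially the same moment.

    Every action takes its reading before any action has written, so all
    of them start from the same stale value.
    """
    snapshots = [counter.value for _ in range(n_actions)]  # all reads first
    for seen in snapshots:                                  # then all writes
        counter.value = seen + 1


def run_with_waits(counter, n_actions):
    """Same increments, but a Wait between them lets each write land
    before the next action takes its reading."""
    for _ in range(n_actions):
        seen = counter.value
        counter.value = seen + 1


batched = Counter()
run_batch(batched, 5)

spaced = Counter()
run_with_waits(spaced, 5)

print(batched.value, spaced.value)  # 1 5
```

Under that assumption the batched event ends at 1 and the event with Waits ends at 5, which matches what the two test events showed.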



                  #9
                  Originally posted by vasrc:

                  Any reason you're not using "Increment counter BY" instead, or is it programmatic? You're probably right that there's a timing issue, but using "Increment by 5" should work as well.

                  Z
                  I didn't "increment count BY" because I wanted to expose a possible timing issue that might arise when one request to increment a counter quickly follows another request to increment the same counter. It was just a test. In a real application, I would probably have implemented as you suggested.

                  Good examples of why one might need to bump a counter in quick succession are a little hard to come by, but try this: Suppose I have a water well which can supply only 3 gallons per minute. My home's intermittent water demand is often much higher, so I purchase a storage tank to buffer the water supply. Now, somehow I have to shut off the well pump when the tank gets full, and I have to start it again when the water level drops to, say, half full. There are any number of pump control solutions that measure the water level directly (ultrasonic sensors, for example), but direct measurement is not feasible for reasons of cost, reliability, maintainability, etc. So, I buy a couple of flow meters that issue a pulse every time a gallon of water passes through them. One meter is installed at the well head, and the other at the house water service entrance. I program an HS3 event to increment a counter every time the well produces a gallon, and another HS3 event to decrement the same counter every time the house uses a gallon. You can see that the water level in the storage tank will be proportional to the value of the HS3 counter. BUT, the whole scheme falls apart if flow pulses are getting lost because one flow meter reports a gallon added at almost the same time as the other meter reports a gallon dispensed.

                  It occurs to me that I could likely work around this problem by using two different counters, one for each flow meter. Pump control events would use the difference of the two counters to turn power on/off. But knowledge of a good model of the event engine would have helped avoid the problem in the first place.
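
A rough sketch of the two-counter idea, in Python rather than HS3 events. Everything here is hypothetical: the `TankModel` class, the 100-gallon capacity and the starting level are invented for the example, and the half-full restart threshold comes from the description above. The point is only that each counter has a single writer, so an increment and a decrement never compete for the same value.

```python
class TankModel:
    """Hypothetical model: one counter per flow meter, level from the difference."""

    def __init__(self, capacity_gal, start_gal):
        self.capacity_gal = capacity_gal
        self.start_gal = start_gal
        self.well_gal = 0    # incremented only by the well-head meter event
        self.house_gal = 0   # incremented only by the house meter event
        self.pump_on = False

    def well_pulse(self):
        self.well_gal += 1

    def house_pulse(self):
        self.house_gal += 1

    def level(self):
        return self.start_gal + self.well_gal - self.house_gal

    def update_pump(self):
        """Stop the pump when full, start it at half full, else leave it alone."""
        level = self.level()
        if level >= self.capacity_gal:
            self.pump_on = False
        elif level <= self.capacity_gal / 2:
            self.pump_on = True
        return self.pump_on


tank = TankModel(capacity_gal=100, start_gal=100)

for _ in range(50):
    tank.house_pulse()
after_draw = tank.update_pump()      # level 50: pump starts

for _ in range(10):
    tank.well_pulse()
while_filling = tank.update_pump()   # level 60: pump keeps running

for _ in range(40):
    tank.well_pulse()
when_full = tank.update_pump()       # level 100: pump stops

print(after_draw, while_filling, when_full)  # True True False
```

This would not help if two pulses from the same meter could arrive close enough together to collide, so it reduces the exposure rather than removing it.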



                    #10
                    Originally posted by rprade:

                    The problem with this event is that all actions are essentially launched simultaneously, not sequentially ...
                    That is a very interesting statement. I'm trying to get a decent model of the execution engine in my head to help me design events properly.

                    I already understood that, say, a 30 second WAIT in one active event will not prevent many other events from running from start to completion during those 30 seconds (though it does tie up some system resources).

                    What I think you have added is that the execution engine will initiate all of an event's specified actions, up to the next WAIT action (if there is one) for concurrent processing. Then, when the WAIT has expired, the next batch of actions is given to the execution engine, again for concurrent processing. Do I have that right?

                    Even if that is true, however, I am still left with the question: what happened to the other four actions in the test case that failed? My head model pictures 5 different THEN actions being tossed to the event execution engine at approximately the same time. The engine queues, then processes them, one at a time, as it finds the opportunity. After the processing overhead delay, my counter value should still be bigger by 5. So what happened to those actions?

                    I hope you don't think I am pushing on this merely for argument's sake. As indicated in other posts, in my real world environment, I am occasionally losing event triggers and event actions. And, so far, I have been unable to determine why. I think others have been documenting similar problems, though that may be my faulty recollection.

                    It would be a big help if someone published a good model of the event execution engine. The more I think about it, the more confused I become. (Solution: don't think about it!) Suppose the clock just now ticked to 10 seconds past noon, and two independent events become thereby triggered. My office light is presently Off. One of the two events asks the HS-WD200+ to turn on at 50% brightness, and the other asks the same dimmer to turn on at full brightness. So what does the dimmer actually receive, one request, or two?

                    My Schlage door locks take an estimated 2 seconds to lock or unlock. An event tells them to lock at a certain time every night. I want detection of a fire or smoke to unlock all locks automatically. Since this is an asynchronous event, the triggers might occur even during the same second. It seems right to trigger the lock event with: IF the time is <xx> AND IF <no fires detected> THEN <lock the doors>. But if fire detection triggers its event while the doors are locking for the evening, then what happens? The question has two parts: (1) if fire detection event triggers while the doors are locking, but not in the same second as the scheduled lock event, and (2) scheduled door lock and fire detection unlock events both trigger in the same second. Of course, I would like these events to be queued. If the fire detection event occurs when scheduled locking has already been committed, then I want the fire detection unlock actions to be queued for execution as soon as the locks finish locking. I'm not at all sure this is how the system works.

                    I suppose I could do a workaround: IF <fire detected> THEN <unlock all doors> THEN WAIT 5 seconds THEN <unlock all doors>. I don't like to do workarounds for problems that may not even exist.
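
For what it's worth, the queued behaviour I would like to see can be sketched in a few lines of Python. This is a picture of the wished-for model, not a claim about how HS3 works; the `Lock` class and its methods are invented for the example.

```python
from collections import deque


class Lock:
    """Hypothetical door lock whose commands are queued and run in order."""

    def __init__(self):
        self.state = "unlocked"
        self.pending = deque()
        self.log = []

    def request(self, command):
        """Queue a command instead of sending it while the lock may be busy."""
        if command not in ("lock", "unlock"):
            raise ValueError(f"unknown command: {command}")
        self.pending.append(command)

    def run_pending(self):
        """Carry out queued commands one at a time, oldest first."""
        while self.pending:
            command = self.pending.popleft()
            self.state = "locked" if command == "lock" else "unlocked"
            self.log.append(command)


front_door = Lock()
front_door.request("lock")     # scheduled nightly lock
front_door.request("unlock")   # fire detected a moment later
front_door.run_pending()

print(front_door.state, front_door.log)  # unlocked ['lock', 'unlock']
```

With per-device queuing like this, the fire unlock is never dropped; it simply runs as soon as the lock action finishes, and the door ends up unlocked.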



                      #11
                      Interesting thread. I'm genuinely interested in a better understanding of this as well.

                      Originally posted by ericg:
                      I suppose I could do a workaround: IF <fire detected> THEN <unlock all doors> THEN WAIT 5 seconds THEN <unlock all doors>. I don't like to do workarounds for problems that may not even exist.
                      I use this technique quite a bit, although I agree it's frustrating to do so when there may not be (or should not be?) a reason to do it. In my case it has involved z-wave actions that occasionally don't successfully complete--typically when, for example, many Off commands go out to lights at once at bedtime--so I've generally assumed it was a failing of the z-wave side and not within the event engine. I still think this is likely most of the problem, but this discussion would seem to bring it into question. (My "workaround" is typically EasyTrigger's feature allowing a fixed pause between commands to a device group, although I also have certain actions re-fire "just in case" exactly like your hypothetical workaround.)
                      -Wade



                        #12
                        Originally posted by ericg:

                        I didn't "increment count BY" because I wanted to expose a possible timing issue that might arise when one request to increment a counter quickly follows another request to increment the same counter. It was just a test. In a real application, I would probably have implemented as you suggested.

                        Good examples of why one might need to bump a counter in quick succession are a little hard to come by, but try this: Suppose I have a water well which can supply only 3 gallons per minute. My home's intermittent water demand is often much higher, so I purchase a storage tank to buffer the water supply. Now, somehow I have to shut off the well pump when the tank gets full, and I have to start it again when the water level drops to, say, half full. There are any number of pump control solutions that measure the water level directly (ultrasonic sensors, for example), but direct measurement is not feasible for reasons of cost, reliability, maintainability, etc. So, I buy a couple of flow meters that issue a pulse every time a gallon of water passes through them. One meter is installed at the well head, and the other at the house water service entrance. I program an HS3 event to increment a counter every time the well produces a gallon, and another HS3 event to decrement the same counter every time the house uses a gallon. You can see that the water level in the storage tank will be proportional to the value of the HS3 counter. BUT, the whole scheme falls apart if flow pulses are getting lost because one flow meter reports a gallon added at almost the same time as the other meter reports a gallon dispensed.

                        It occurs to me that I could likely work around this problem by using two different counters, one for each flow meter. Pump control events would use the difference of the two counters to turn power on/off. But knowledge of a good model of the event engine would have helped avoid the problem in the first place.
                        A counter should increment/decrement when requested. I suspect they added the "By x" option precisely because of this limitation. Rich completely redid the event engine so it's threaded now, so there shouldn't be a way for a "count" to be lost. This may just be an exception (i.e., multiple counter actions within the same event) that they haven't addressed yet. Drop an email to support@homeseer.com and see what they say.

                        As for your "example", a pressure tank and pressure switch are typically used to control a well pump; it's simple and reliable. Not sure I'd want HS3 controlling a well pump.

                        I have a well which has an Enowell sonic depth finder, and a water meter hooked to an RFXCOM pulse counter that does all of what you described with the exception of the pump control.

                        Z






                          #13
                          I think it is fair to say that HS events are not 100% reliable.

                          Although I'd rate HS as very reliable, I do experience some failures of events, including delayed events and event actions. I would not recommend using HS for critical control functions where failure could result in serious mechanical damage, serious injury, or any life threatening situation.

                          I also try to make it a practice to design events so that missed event actions are (eventually) corrected automatically. That way, the error often goes undetected, and even if it is noticed the annoyance is mitigated.
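
As a sketch of what "eventually corrected" can look like, here is the idea in Python instead of HS events. The `reconcile` function, the device names and the deliberately flaky `send` are all made up for the example; in HS this would be a recurring event that compares device status with what it should be and re-issues the command.

```python
def reconcile(desired, actual, send):
    """Re-send the command for every device whose state differs from desired.

    Returns the list of devices that needed a command on this pass.
    """
    corrected = []
    for device, want in desired.items():
        if actual.get(device) != want:
            send(device, want)
            corrected.append(device)
    return corrected


# Simulated devices: both lights should be off at bedtime.
actual = {"porch_light": "on", "hall_light": "on"}
desired = {"porch_light": "off", "hall_light": "off"}
dropped = {"hall_light"}  # pretend the first command to this device is lost


def send(device, state):
    """Flaky transport: silently loses the first command to a 'dropped' device."""
    if device in dropped:
        dropped.discard(device)
        return
    actual[device] = state


first = reconcile(desired, actual, send)   # hall_light command is lost
second = reconcile(desired, actual, send)  # retry succeeds
third = reconcile(desired, actual, send)   # nothing left to fix

print(first, second, third)
```

Running the check periodically means a single lost action only delays the result until the next pass.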
                          Mike
                          HS3 Pro Edition 3.0.0.548, NUC i3

                          HW: Stargate | NX8e | CAV6.6 | Squeezebox | PCS | WGL 800RF | RFXCOM | Vantage Pro | Green-Eye | Edgeport/8 | Way2Call | Ecobee3 | EtherRain | Ubiquiti

