Announcement

Collapse
No announcement yet.

Device Control Lockout Redux

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

    Device Control Lockout Redux

    There is a bug somewhere that is giving me fits.

    This may not be the best forum for this post. If it isn't, my apologies. I'm posting in the Event Clinic because device control event action statements don't always get timely completion by my HS3 Hometroller SEL. Before I go any further, I'd like to head off a few predictable responses:
    1. "Hey, sometimes the system gets busy." My Hometroller is lightly loaded. The problem I describe here occurred in the middle of the night when the the system was largely quiescent. I haven't proven that it's repeatable, but it probably is.
    2. "The system can get overwhelmed by noise from faulty devices (such as Z-Wave) or the general environment (such as Wi-Fi RF noise)." There is no faulty device or noise problem. I have been running a very simple, lightly loading test event every two minutes, around the clock, for several days now. There have been zero test failures, except under well defined conditions.
    3. "Well, you're running HS3. That's an obsolete product, so it doesn't really matter, anyway." Perhaps. But my regular reading of the forums suggests that there really wasn't much changed in the event execution engine between HS3 and HS4. Accordingly, I speculate it's highly likely my problem will be found in HS4 also. If I am wrong, that might be an additional reason to consider upgrading.
    I've already shown in this thread that a simple event action to turn on a light can be delayed by more that 20 seconds simply because Device History was building a chart at the time.

    In today's post, turning on the same light is getting delayed by three seconds because BLBackup starts a system backup. How do I know? Circumstantial evidence. I ran thousands of test iterations, around the clock, over several days. There was exactly one failure, and it occurred seconds after a scheduled backup via BLBackup began. And that was at a time when the total HS3 workload was lightest.

    Let me add some hard data. Here's an HS3 log excerpt covering the failure I describe:

    Code:
     [TABLE]
    [TR]
    [TD]2022-01-02 00:33:26[/TD]
     			[TD]Event[/TD]
     			[TD]Event Trigger "Shop Red Light Red Light On for 9 Seconds Every 1:59"[/TD]
     		[/TR]
    [TR]
    [TD]2022-01-02 00:31:40[/TD]
     			[TD]Event[/TD]
     			[TD]Event Trigger "RGB Bulb2 Set Bulb2 Off"[/TD]
     		[/TR]
    [TR]
    [TD]2022-01-02 00:31:39[/TD]
     			[TD]Event[/TD]
     			[TD]'Run Event' action execution skipped,​ the conditions applied were not met: Shop Red Light Failure to Turn Red Light Off - Copy[/TD]
     		[/TR]
    [TR]
    [TD]2022-01-02 00:31:36[/TD]
     			[TD]Event[/TD]
     			[TD]Event Trigger "MQTT Update Telemetry on C9F2A5 Change"[/TD]
     		[/TR]
    [TR]
    [TD]2022-01-02 00:31:36[/TD]
     			[TD]Device Control[/TD]
     			[TD]Device: tasmota_C9F2A5 cmnd Shop Red Light Control Ssec to Off (0)[/TD]
     		[/TR]
    [TR]
    [TD]2022-01-02 00:31:30[/TD]
     			[TD]Event[/TD]
     			[TD]Event Trigger "RGB Bulb2 Set Bulb2 On"[/TD]
     		[/TR]
    [TR]
    [TD]2022-01-02 00:31:30[/TD]
     			[TD]Event[/TD]
     			[TD]Event Trigger "MQTT Update Power on RGB Bulb2 Change"[/TD]
     		[/TR]
    [TR]
    [TD]2022-01-02 00:31:30[/TD]
     			[TD]Event[/TD]
     			[TD]Event Trigger "RGB Bulb2 Set Bulb2 Yellow"[/TD]
     		[/TR]
    [TR]
    [TD]2022-01-02 00:31:30[/TD]
     			[TD]Event[/TD]
     			[TD]Event Trigger "RGB Bulb2 Set Bulb2 Blink On"[/TD]
     		[/TR]
    [TR]
    [TD]2022-01-02 00:31:30[/TD]
     			[TD]Event[/TD]
     			[TD]Event Trigger "Shop Red Light Failure to Turn Red Light On - Copy"[/TD]
     		[/TR]
    [TR]
    [TD]2022-01-02 00:31:27[/TD]
     			[TD]Event[/TD]
     			[TD]Event Trigger "MQTT Update Telemetry on C9F2A5 Change"[/TD]
     		[/TR]
    [TR]
    [TD]2022-01-02 00:31:27[/TD]
     			[TD]Device Control[/TD]
     			[TD]Device: tasmota_C9F2A5 cmnd Shop Red Light Control Ssec to On (1)[/TD]
     		[/TR]
    [TR]
    [TD]2022-01-02 00:31:27[/TD]
     			[TD]Event[/TD]
     			[TD]Event Trigger "Shop Red Light Red Light On for 9 Seconds Every 1:59"[/TD]
     		[/TR]
    [TR]
    [TD]2022-01-02 00:30:01[/TD]
     			[TD]Device Control[/TD]
     			[TD]Device: Humidity Outside Humidity Outside Filtered to 86 percent (86) by/from: CAPI Control Handler[/TD]
     		[/TR]
    [TR]
    [TD]2022-01-02 00:30:00[/TD]
     			[TD]BLBackup Info[/TD]
     			[TD]The backup process (One Zip File - Full) has started at 1/2/2022 12:30:00 AM...[/TD]
     		[/TR]
    [TR]
    [TD]2022-01-02 00:30:00[/TD]
     			[TD]Device Control[/TD]
     			[TD]Device: Temperature Living Room Temperature Living Room Filtered Previous to 58 degrees (58) by/from: CAPI Control Handler[/TD]
     		[/TR]
    [TR]
    [TD]2022-01-02 00:30:00[/TD]
     			[TD]Event[/TD]
     			[TD]Event Trigger "HomeSeer BLBackup Weekly"[/TD]
     		[/TR]
    [TR]
    [TD]2022-01-02 00:29:40[/TD]
     			[TD]Event[/TD]
     			[TD]'Run Event' action execution skipped,​ the conditions applied were not met: Shop Red Light Failure to Turn Red Light Off - Copy[/TD]
     		[/TR]
    [TR]
    [TD]2022-01-02 00:29:37[/TD]
     			[TD]Event[/TD]
     			[TD]Event Trigger "MQTT Update Telemetry on C9F2A5 Change"[/TD]
     		[/TR]
    [TR]
    [TD]2022-01-02 00:29:37[/TD]
     			[TD]Device Control[/TD]
     			[TD]Device: tasmota_C9F2A5 cmnd Shop Red Light Control Ssec to Off (0)[/TD]
     		[/TR]
    [TR]
    [TD]2022-01-02 00:29:31[/TD]
     			[TD]Event[/TD]
     			[TD]'Run Event' action execution skipped,​ the conditions applied were not met: Shop Red Light Failure to Turn Red Light On - Copy[/TD]
     		[/TR]
    [TR]
    [TD]2022-01-02 00:29:28[/TD]
     			[TD]Event[/TD]
     			[TD]Event Trigger "MQTT Update Telemetry on C9F2A5 Change"[/TD]
     		[/TR]
    [TR]
    [TD]2022-01-02 00:29:28[/TD]
     			[TD]Device Control[/TD]
     			[TD]Device: tasmota_C9F2A5 cmnd Shop Red Light Control Ssec to On (1)[/TD]
     		[/TR]
    [TR]
    [TD]2022-01-02 00:29:28[/TD]
     			[TD]Event[/TD]
     			[TD]Event Trigger "Shop Red Light Red Light On for 9 Seconds Every 1:59"[/TD]
     		[/TR]
    [/TABLE]
    Entries from 00:29:28 through 00:29:40 illustrate the log sequence when tests are proceeding normally.

    At 00:30:00 BLBackup begins, as scheduled.

    The next test event iteration begins at 00:31:27, precisely as scheduled. The log confirms that the test event turns on the red light at 00:31:27 -- perfect. It also confirms that the test event turns off the light at 00:31:36 -- also, exactly as expected.

    BUT, I have been using Device History to track the red light device (tasmota_C9F2A5 cmnd Shop Red Light Control Ssec) changes. Here's what it says:

    Code:
     [TABLE]
    [TR]
    [TD="width: 150"]1/2/2022 12:33:27 AM[/TD]
     			[TD="width: 150"]1/2/2022 12:33:35 AM[/TD]
     			[TD="width: 108"]00:00:08[/TD]
     			[TD="width: 120"]On[/TD]
     			[TD="width: 130"]Off[/TD]
     			[TD="width: 100"]--[/TD]
     			[TD]Off[/TD]
     		[/TR]
    [TR]
    [TD="width: 150"]1/2/2022 12:31:39 AM[/TD]
     			[TD="width: 150"]1/2/2022 12:33:27 AM[/TD]
     			[TD="width: 108"]00:01:47[/TD]
     			[TD="width: 120"]Off[/TD]
     			[TD="width: 130"]On[/TD]
     			[TD="width: 100"]--[/TD]
     			[TD]On[/TD]
     		[/TR]
    [TR]
    [TD="width: 150"]1/2/2022 12:31:30 AM[/TD]
     			[TD="width: 150"]1/2/2022 12:31:39 AM[/TD]
     			[TD="width: 108"]00:00:08[/TD]
     			[TD="width: 120"]On[/TD]
     			[TD="width: 130"]Off[/TD]
     			[TD="width: 100"]--[/TD]
     			[TD]Off[/TD]
     		[/TR]
    [TR]
    [TD="width: 150"]1/2/2022 12:29:37 AM[/TD]
     			[TD="width: 150"]1/2/2022 12:31:30 AM[/TD]
     			[TD="width: 108"]00:01:53[/TD]
     			[TD="width: 120"]Off[/TD]
     			[TD="width: 130"]On[/TD]
     			[TD="width: 100"]--[/TD]
     			[TD]On[/TD]
     		[/TR]
    [TR]
    [TD="width: 150"]1/2/2022 12:29:28 AM[/TD]
     			[TD="width: 150"]1/2/2022 12:29:37 AM[/TD]
     			[TD="width: 108"]00:00:08[/TD]
     			[TD="width: 120"]On[/TD]
     			[TD="width: 130"]Off[/TD]
     			[TD="width: 100"]--[/TD]
     			[TD]Off[/TD]
     		[/TR]
    [TR]
    [TD="width: 150"]1/2/2022 12:27:38 AM[/TD]
     			[TD="width: 150"]1/2/2022 12:29:28 AM[/TD]
     			[TD="width: 108"]00:01:50[/TD]
     			[TD="width: 120"]Off[/TD]
     			[TD="width: 130"]On[/TD]
     			[TD="width: 100"]--[/TD]
     			[TD]On[/TD]
     		[/TR]
    [TR]
    [TD="width: 150"]1/2/2022 12:27:29 AM[/TD]
     			[TD="width: 150"]1/2/2022 12:27:38 AM[/TD]
     			[TD="width: 108"]00:00:08[/TD]
     			[TD="width: 120"]On[/TD]
     			[TD="width: 130"]Off[/TD]
     			[TD="width: 100"]--[/TD]
     			[TD]Off[/TD]
     		[/TR]
    [TR]
    [TD="width: 150"]1/2/2022 12:25:39 AM[/TD]
     			[TD="width: 150"]1/2/2022 12:27:29 AM[/TD]
     			[TD="width: 108"]00:01:49[/TD]
     			[TD="width: 120"]Off[/TD]
     			[TD="width: 130"]On[/TD]
     			[TD="width: 100"]--[/TD]
     			[TD]On[/TD]
     		[/TR]
    [TR]
    [TD="width: 150"]1/2/2022 12:25:30 AM[/TD]
     			[TD="width: 150"]1/2/2022 12:25:39 AM[/TD]
     			[TD="width: 108"]00:00:09[/TD]
     			[TD="width: 120"]On[/TD]
     			[TD="width: 130"]Off[/TD]
     			[TD="width: 100"]--[/TD]
     			[TD]Off[/TD]
     		[/TR]
    [/TABLE]
    So, the HS log says the light was turned on at 00:31:27. Device History says the light was turned on at 00:31:30. The HS log says the light was turned off at 00:31:36. Device History begs to differ, says the light was turned off at 00:31:39. This is a substantial disagreement.

    I am inclined to believe the Device History times, rather than the HS log times. The reason is simple. If the on/off times quoted in the HS log were accurate, then the test failure events would not execute. If the Device History data were accurate, then the test would fail, and the test failure events would execute. HS log confirms execution of the test failure event. I must therefore conclude that its reported on/off times are wrong.

    One way to rationalize the on/off time discrepancies in the HS log is to argue, "Well, what it logged was the times the event processing engine said the light should be turned on/off, not when it actually happened." However, that approach begs a few questions:
    1. If you are going to log when a light is turned on, isn't it better to log when it is actually turned on, rather than when it ought to be turned on? And
    2. Three seconds is a very long time for a modern computer which is essentially idle. What could possibly justify such time delays?
    3. Among thousands of successful tests conducted under all conditions of my running environment, why did HS have two 3 second delays (one turning the light on, and the other turning the light off) just seconds after BKBackup is activated?
    I think the evidence implies a problem with thread execution. Here's the test event, which is scheduled to run every (1:59) two minutes:

    Click image for larger version

Name:	2022-01-02_12-30-38.jpg
Views:	368
Size:	196.9 KB
ID:	1518014

    It may be coincidence that the light turn on commanded by the first action statement is delayed by 3 seconds, according to Device History, while the following WAIT statement just happens to be 3 seconds.

    It would have to be a further coincidence that the light turn off commanded by the fifth action statement is delayed by 3 seconds, according to Device History, while the following WAIT statement just happens to be 3 seconds.

    If anything is to be made of this, it should be remembered that 99.99% of the time, this event runs just fine. It has something to do with concurrent plug-in activity.

    I'm on the outside, thus blind to the internals, but this surely smells like a problem of queues, thread scheduling, and priorities. For example, my data could be explained by the hypothesis that a WAIT statement is blocking the processing of a command that is waiting in a device queue. But that's just a guess.

    There are other, perhaps indirect, interactions between this test event and the BLBackup plug-in. I've been running BLBackup weekly for a year or two. It always runs, and it always requires just over 4 minutes to complete a full backup. When my test failed this morning, BLBackup did run to completion, but it required 17 minutes 16 seconds to complete! Again, the system was otherwise mostly idle.
Working...
X