Announcement

Collapse
No announcement yet.

Plugin won't initialize at startup

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

    #16
    Originally posted by bsobel View Post
    I think this is another case of one of your uses (also named Bill) running into the bug; https://forums.homeseer.com/forum/li...lugin-restarts Notice with config changes, it went away, its part of the 'fun' transient nature of this one (like you, I much prefer 100% reproducible bugs!)

    Edit; I am more and more convinced that HS is doing some long running operation on its primary thread and is not pumping windows messages during those operations. Async callbacks rely on the message queue to get dispatched. I just sent Rich a long email on this and asked him to audit the startup code with this in mind, Ill let you know what he comes back with. If thats indeed the root cause, we can fix this by ensuring HS itself is pumping messages correctly, but will remain a legacy bug for anyone before whatever version includes that fix (meaning the retry, if you do agree to do it, won't be completely wasted work)
    Bill,

    Your edit makes me have a ohhhhhh!!!!! wave here!
    On Windows I had a growing irriration myself for a few years already. The startup of HomeSeer was more or less random in duration. At certain times I had to wait for 15 minutes before HS3 would finish initialising plugins. Sometimes plugins would timeout even before it would start. At other times it would take a minute or two, but HS would start at reasonable speed.

    So almost three weeks ago I got hit by the name change of Philips and my lights started to be not controllable(how conveniant at 10:15 PM of course.....). The issue that is now solved in the beta....
    But I was analysing what really happend and wanted to have a clean start of HS. But after half an hour, HS did not start, rebooted the machine, kicked it and shook it, but HS would just stick at initialising and time outing plugins. Pulled my hair there...

    So I moved all files to another computer and again tried to restart it, with the same results - it just would not start. On this second pc I checked for large files and found one above 1 GB... a log file, removed it and as expected without result. The second large file was my energy database. It holds 5 years of energy usage.I had to clean the database myself as HST leaves all 0 value rows in it, even with compacting. Did add some other compaction actions in it to make the database even smaller.
    Anyway, I removed the database and restarted HS. Within 10 seconds HS started!! This was on the second much more powerfull PC

    Of course did the same action on the production PC - removed the energy database and started Homeseer - fast, within 30 seconds all is buzzing again!
    I restored the energy database after creating a history table that is holding compacted data to one line per day/device from two months back and longer ago. The database is now reduced from 200Mb to 33 Mb and HS still starts fast. AND I still have my history

    I had contact with rjh in the past on this issue, but Rich could not copy the issue and to be honest I did let it rest as I saw not many complaints on the startup time, hoping that in a future release the issue would dissapear. Was planning to mail him, but waited as I wanted to be sure adjusting the Energy db was lasting.

    For you it might be very interesting to test removing the energy database? How large is your energy database?

    Wim
    -- Wim

    Plugins: JowiHue, RFXCOM, Sonos4, Jon00's Perfmon and Network monitor, EasyTrigger, Pushover 3P, rnbWeather, BLBackup, AK SmartDevice, Pushover, PHLocation, Zwave, GCalseer, SDJ-Health, Device History, BLGData

    1210 devices/features ---- 392 events ----- 40 scripts

    Comment


      #17
      Thanks for the hint but its at 35mb, I had reset it once before trying to narrow this down, but that version was 36.4mb. So looks like I haven't seen that unconstrained growth. Right now I suspect one startup culprit is the parent child relationship checking, I have about 2500 devices so it takes awhile. I get the events dropped log entry immediately after it runs and plugins starting around that time often run into the error Im chasing...

      Comment


        #18
        Ok, was hoping the energy database would hold up process as well for you, would have been a lucky shot ....

        I will update the plugin with a 10 time retry on the device enumurators, but I do hope there are no other points throwing an issue as I also see a error registering webpages.
        If the plugin has to loop, it will log how may it had to loop. If it exceeded 10, I guess the plugin better quits, so it will not create more issues ok?
        -- Wim

        Plugins: JowiHue, RFXCOM, Sonos4, Jon00's Perfmon and Network monitor, EasyTrigger, Pushover 3P, rnbWeather, BLBackup, AK SmartDevice, Pushover, PHLocation, Zwave, GCalseer, SDJ-Health, Device History, BLGData

        1210 devices/features ---- 392 events ----- 40 scripts

        Comment


          #19
          Originally posted by w.vuyk View Post
          Ok, was hoping the energy database would hold up process as well for you, would have been a lucky shot ....

          I will update the plugin with a 10 time retry on the device enumurators, but I do hope there are no other points throwing an issue as I also see a error registering webpages.
          If the plugin has to loop, it will log how may it had to loop. If it exceeded 10, I guess the plugin better quits, so it will not create more issues ok?
          Really any call that sets up remoting on a secondary thread will initially see this issue, so if the webpage registration happens on a different thread than the device enumerator you may need to do that there as well. One option (I haven't tested this, but from what we know this should work) would be to put a call to something like HS.ApiVersion or another side effect free call as the entry point of any background threads (along with a try/catch to eat the potential NPE). After that first failed call the rest should work.

          I went the retry route, and like you suggested, I log and exit if I can't get the enumerator in 10 attempts. On 4.5.2 I always get it by the second attempt, on when earlier built with 4.6+ it sometimes failed up to 10 times (but sometimes didn't!). Moving back to 4.5.2 fixed that issue, leaving the single retry remaining with 100% success.

          Comment


            #20
            Ok, had finished the check on the enumarator as you said earlier, but will also put the repeat in the HS init part, which adds the webages.

            Wim
            -- Wim

            Plugins: JowiHue, RFXCOM, Sonos4, Jon00's Perfmon and Network monitor, EasyTrigger, Pushover 3P, rnbWeather, BLBackup, AK SmartDevice, Pushover, PHLocation, Zwave, GCalseer, SDJ-Health, Device History, BLGData

            1210 devices/features ---- 392 events ----- 40 scripts

            Comment


              #21
              Originally posted by w.vuyk View Post
              Ok, had finished the check on the enumarator as you said earlier, but will also put the repeat in the HS init part, which adds the webages.

              Wim
              Great. Been talking with Rich, there definitely is some long running process on the main STA thread at startup, we are discussing some options to fix and I'll be testing it here to see if we can get things stable for all.

              Comment


                #22
                Bill,

                Beta is available now... be sure to install 2.0.4.7 as for some reason some users do see two beta's of the JowiHue plugin
                -- Wim

                Plugins: JowiHue, RFXCOM, Sonos4, Jon00's Perfmon and Network monitor, EasyTrigger, Pushover 3P, rnbWeather, BLBackup, AK SmartDevice, Pushover, PHLocation, Zwave, GCalseer, SDJ-Health, Device History, BLGData

                1210 devices/features ---- 392 events ----- 40 scripts

                Comment


                  #23
                  Originally posted by w.vuyk View Post
                  Bill,

                  Beta is available now... be sure to install 2.0.4.7 as for some reason some users do see two beta's of the JowiHue plugin
                  Thanks, will grab it tomorrow. Continuing to debug the issue so trying to not change my config at all.

                  Minor issue; as an FYI I notice that at every HS shutdown I see:

                  Unhandled Exception:
                  System.NullReferenceException: Object reference not set to an instance of an object
                  at HSPI_JowiHue.BaseCache.StopTimers () [0x0000f] in <2ad28a1e03c442209dd24cf9deef1f6f>:0
                  at HSPI_JowiHue.BaseCache.Dispose (System.Boolean disposing) [0x00015] in <2ad28a1e03c442209dd24cf9deef1f6f>:0
                  at HSPI_JowiHue.BaseCache.Finalize () [0x00001] in <2ad28a1e03c442209dd24cf9deef1f6f>:0
                  [ERROR] FATAL UNHANDLED EXCEPTION: System.NullReferenceException: Object reference not set to an instance of an object
                  at HSPI_JowiHue.BaseCache.StopTimers () [0x0000f] in <2ad28a1e03c442209dd24cf9deef1f6f>:0
                  at HSPI_JowiHue.BaseCache.Dispose (System.Boolean disposing) [0x00015] in <2ad28a1e03c442209dd24cf9deef1f6f>:0
                  at HSPI_JowiHue.BaseCache.Finalize () [0x00001] in <2ad28a1e03c442209dd24cf9deef1f6f>:0


                  Guessing no really side effects, but wanted you to know just in case the exception is stopping some additional shutdown activity you rely on...

                  Comment


                    #24
                    Just wanted to update you on what I've found, in further debugging:

                    HomeSeer has a thread called "Check Interfaces" which is responsible, among other things, for calling the InitIO routine of each plugin in series at startup. The call is a RPC, so the thread itself is held up while the remote call occurs. During this time calls from other plugins to HS fail with a null pointer exception in remoting, they start working again when that thread is not blocked. I suspect (but have not verified) this same thread is the one that calls HSEvent as well and likely has the same problem them (albeit HSEvent is usually a faster call than InitIO), so we have this condition where the load sequence and timing of plugins makes the issue appear 'random' but finally know when the issue is occurring. This feels 'fixable' (potentially with a secondary thread to do the actual calls while this thread continues to do whatever else it does that enables the remoting calls to not fail), but a backwards compatibility issue will remain until everyone gets updated to a fixed version .

                    Comment


                      #25
                      A further update, Rich was kind enough to get me the code I needed to debug. Internally there is a single threaded queue of requests from plugins for the server to process, it appears this queue stalls during InitIO and at least GetDeviceEnumerator() I suspect they share a lock object. If the total time spent waiting in the queue is 60 seconds or longer, the plugins abort the request and you get the NPE we are seeing.

                      My system has 3300 devices and 28 plugins, so just basic math means each plugin needs to complete startup in 2.14 seconds or less or the error will occur. WeatherXML as an example sometimes takes 250 seconds to start, guaranteeing this issue will effect your startup. I proposed some changes to Rich to deal with it which should solve it for the release he puts it in and later, but we will have to code defensively and retry if we want to support older versions.

                      Thank you again for your willingness to make those changes, and I am very very happy to finally have root cause on this darn thing; this one took awhile...

                      Comment


                        #26
                        Bill,

                        That indeed clarifies the issue, also for my understanding. Thanks for diving this deep to get to the roots!

                        Wim
                        -- Wim

                        Plugins: JowiHue, RFXCOM, Sonos4, Jon00's Perfmon and Network monitor, EasyTrigger, Pushover 3P, rnbWeather, BLBackup, AK SmartDevice, Pushover, PHLocation, Zwave, GCalseer, SDJ-Health, Device History, BLGData

                        1210 devices/features ---- 392 events ----- 40 scripts

                        Comment


                          #27
                          I am having a similar issue with the latest BETA 2.0.4.8.

                          Code:
                          Sep-04 23:09:23         Info    Plugin JowiHue has connected. IP:127.0.0.1:52615
                          Sep-04 23:09:23         Warning    I/O interface JowiHue is down, executable is not running, restarting ...
                          Sep-04 23:09:06         Info    Plugin JowiHue with instance: has disconnected
                          Sep-04 23:09:06         JowiHue    Initialisation bridgedevices did not complete correctly, exiting plugin
                          Sep-04 23:09:06         JowiHue    Error: (FindBridgesSelf)::Object reference not set to an instance of an object
                          Sep-04 23:08:54         JowiHue    Importing SensorModels
                          Sep-04 23:08:54         JowiHue    Importing SensorTypes
                          Sep-04 23:08:54         JowiHue    Importing lampTypes
                          Sep-04 23:08:53         Plug-In    Finished initializing plug-in JowiHue
                          Sep-04 23:08:53         Starting Plug-In    JowiHue loaded in 500 milliseconds
                          Sep-04 23:08:53         Starting Plug-In    Plugin JowiHue started successfully in 238 milliseconds
                          Sep-04 23:08:53         JowiHue    InitIO:Plugin basic initialisation done
                          Sep-04 23:08:53         JowiHue    Starting plugin JowiHue 2.0.4.8 on Linux
                          I also sometimes see a: "/usr/local/HomeSeer/html/JowiHue_Configuration" not found error.

                          I am on centos 7.1 x64. Mono v6


                          Comment


                            #28
                            Originally posted by accessdenied View Post
                            I am having a similar issue with the latest BETA 2.0.4.8.

                            Code:
                            Sep-04 23:09:23 Info Plugin JowiHue has connected. IP:127.0.0.1:52615
                            Sep-04 23:09:23 Warning I/O interface JowiHue is down, executable is not running, restarting ...
                            Sep-04 23:09:06 Info Plugin JowiHue with instance: has disconnected
                            Sep-04 23:09:06 JowiHue Initialisation bridgedevices did not complete correctly, exiting plugin
                            Sep-04 23:09:06 JowiHue Error: (FindBridgesSelf)::Object reference not set to an instance of an object
                            Sep-04 23:08:54 JowiHue Importing SensorModels
                            Sep-04 23:08:54 JowiHue Importing SensorTypes
                            Sep-04 23:08:54 JowiHue Importing lampTypes
                            Sep-04 23:08:53 Plug-In Finished initializing plug-in JowiHue
                            Sep-04 23:08:53 Starting Plug-In JowiHue loaded in 500 milliseconds
                            Sep-04 23:08:53 Starting Plug-In Plugin JowiHue started successfully in 238 milliseconds
                            Sep-04 23:08:53 JowiHue InitIO:Plugin basic initialisation done
                            Sep-04 23:08:53 JowiHue Starting plugin JowiHue 2.0.4.8 on Linux
                            I also sometimes see a: "/usr/local/HomeSeer/html/JowiHue_Configuration" not found error.

                            I am on centos 7.1 x64. Mono v6
                            Sounds like you are running into what I've been fighting. This is most likely not a JowiHue issue, its a HS core issue. I am literally in the code debugging it now working with Rich on a fix.

                            Comment


                              #29
                              Originally posted by w.vuyk View Post
                              Bill,

                              That indeed clarifies the issue, also for my understanding. Thanks for diving this deep to get to the roots!

                              Wim
                              An update, we are still fighting the root cause but are very close. We now know exactly where the timeout is occurring in the SCS code which results in the NULL internally that gets propagated as the NPE. However, it may not be a timeout as much as a deadlock Rich and I are testing this now, I thought InitIO in conjunction with GetDeviceEnumerator caused it but it may actually be HSEvent taking a lock as InitIO doesn't seem to have any shared locks around that explains it. That all said, not stopping until this is dead!

                              Had two requests for you; I see that when JowiHue starts even if its the only plugin running on the system some events are dropped. This usually means that HSEvent has done some long running thing, as 2000 events can happen very very quickly at startup a delay of just a few seconds can cause this. Can you look at your HSEvent logic and see if there are things you can defer to a queue instead of processing in line?

                              Startup:
                              Connected, waiting to be initialized...
                              11:56:08:1651:[Starting Plug-In]->Initializing plugin JowiHue ...
                              11:56:09:3469:[JowiHue]->Starting plugin JowiHue 2.0.4.6 on Linux
                              11:56:10:2997:[JowiHue]->InitIO:Plugin basic initialisation done
                              11:56:10:3318:[Starting Plug-In]->Plugin JowiHue started successfully in 2166 milliseconds
                              11:56:10:3656:[Starting Plug-In]->JowiHue loaded in 3405 milliseconds
                              11:56:10:3757:[Plug-In]->Finished initializing plug-in JowiHue
                              11:56:18:2096:[Warning]->Dropping event callbacks due to full queue (Type: CONFIG_CHANGE) (2000 entries), system may be too busy, plugins and HSTouch may not receive all device updates
                              Slow call: Process: HSPI_JowiHue Thread: 10:Thread Pool Worker Message: ScsRemoteInvokeMessage: IHSApplication.GetDeviceEnumerator(...) Timeout: 60000
                              Slow call: Process: HSPI_JowiHue Thread: 3: Message: ScsRemoteInvokeMessage: IHSApplication.GetDeviceByRef(...) Timeout: 60000
                              Slow call: Process: HSConsole Thread: 17: Message: ScsRemoteInvokeMessage: IAllRemoteAPI.HSEvent(...) Timeout: 480000
                              Slow call: Process: HSConsole Thread: 11:Thread Pool Worker Message: ScsRemoteInvokeMessage: IAllRemoteAPI.InterfaceStatus(...) Timeout: 480000
                              Slow call: Process: HSConsole Thread: 32:Thread Pool Worker Message: ScsRemoteInvokeMessage: IAllRemoteAPI.InterfaceStatus(...) Timeout: 480000
                              Slow call: Process: HSConsole Thread: 89:Check Interfaces Message: ScsRemoteInvokeMessage: IAllRemoteAPI.get_Name(...) Timeout: 480000
                              11:57:00:2431:[Event]->Event Trigger "Delayed Actions System Startup Notification (Delayed Event) - 1"
                              11:57:00:2705:[Event]->Event Trigger "Maintenance System Startup Notification"

                              Second issue is more maintenance, but you seem like the same type of coder I am so wanted you to know. Looks like when you receive the shutdown call you may still have a thread in a loop doing work that is not checking if the plugin is active. Now with additional exception logging in the communication framework this is what i see at shutdown (notice ShutdownIO is called, but thread 10 is calling GetDeviceByRef quite a bit post that). Since the server is disconnected the calls fail fairly quickly, but figured you might want to know in case you have a shared shutdown flag or cancellation token you could be checking in that thread.

                              Slow call: Process: HSPI_JowiHue Thread: 3: Message: ScsRemoteInvokeMessage: IHSApplication.GetDeviceByRef(...) Timeout: 60000
                              Slow call: Process: HSPI_JowiHue Thread: 10:Thread Pool Worker Message: ScsRemoteInvokeMessage: IHSApplication.GetDeviceEnumerator(...) Timeout: 60000
                              Slow call: Process: HSConsole Thread: 17: Message: ScsRemoteInvokeMessage: IAllRemoteAPI.HSEvent(...) Timeout: 480000
                              Slow call: Process: HSConsole Thread: 9:Thread Pool Worker Message: ScsRemoteInvokeMessage: IAllRemoteAPI.InterfaceStatus(...) Timeout: 480000
                              Slow call: Process: HSConsole Thread: 89:Check Interfaces Message: ScsRemoteInvokeMessage: IAllRemoteAPI.get_Name(...) Timeout: 480000
                              11:58:09:1449:[JowiHue]->Error: (ShutdownIO)::Error ending JowiHue Plug-In
                              Shutting down plugin
                              **** Exception Process: HSPI_JowiHue Thread: 10:Thread Pool Worker Message: ScsRemoteInvokeMessage: IHSApplication.GetDeviceByRef(...) Timeout: 60000 Exception: System.Exception: Client is not connected to the server communication state: Disconnected
                              at HSCF.Communication.Scs.Client.ScsClientBase.SendMessage (HSCF.Communication.Scs.Communication.Messages.IScsMessage message) [0x0001d] in <90ce3ee650ce43fd8757f2f759091405>:0
                              at HSCF.Communication.ScsServices.Communication.RequestReplyMes senger`1[T].SendMessageAndWaitForResponse (HSCF.Communication.Scs.Communication.Messages.IScsMessage message, System.Int32 timeoutMilliseconds) [0x000c0] in <90ce3ee650ce43fd8757f2f759091405>:0
                              at HSCF.Communication.ScsServices.Communication.RequestReplyMes senger`1[T].SendMessageAndWaitForResponse (HSCF.Communication.Scs.Communication.Messages.IScsMessage message) [0x00007] in <90ce3ee650ce43fd8757f2f759091405>:0
                              ScsServiceClient:Client_Disconnected Process: HSPI_JowiHue Thread: 7: Sender: HSCF.Communication.Scs.Client.Tcp.ScsTcpClient Args: System.EventArgs
                              Error SendInvokeResponse: Client is not connected to the server communication state: Disconnected
                              ScsServiceClient:Client_Disconnected Process: HSPI_JowiHue Thread: 5: Sender: HSCF.Communication.Scs.Client.Tcp.ScsTcpClient Args: System.EventArgs
                              Disconnected from server - client
                              **** Exception Process: HSPI_JowiHue Thread: 10:Thread Pool Worker Message: ScsRemoteInvokeMessage: IHSApplication.GetDeviceByRef(...) Timeout: 60000 Exception: System.Exception: Client is not connected to the server communication state: Disconnected
                              at HSCF.Communication.Scs.Client.ScsClientBase.SendMessage (HSCF.Communication.Scs.Communication.Messages.IScsMessage message) [0x0001d] in <90ce3ee650ce43fd8757f2f759091405>:0
                              at HSCF.Communication.ScsServices.Communication.RequestReplyMes senger`1[T].SendMessageAndWaitForResponse (HSCF.Communication.Scs.Communication.Messages.IScsMessage message, System.Int32 timeoutMilliseconds) [0x000c0] in <90ce3ee650ce43fd8757f2f759091405>:0
                              at HSCF.Communication.ScsServices.Communication.RequestReplyMes senger`1[T].SendMessageAndWaitForResponse (HSCF.Communication.Scs.Communication.Messages.IScsMessage message) [0x00007] in <90ce3ee650ce43fd8757f2f759091405>:0
                              **** Exception Process: HSPI_JowiHue Thread: 10:Thread Pool Worker Message: ScsRemoteInvokeMessage: IHSApplication.GetDeviceByRef(...) Timeout: 60000 Exception: System.Exception: Client is not connected to the server communication state: Disconnected
                              at HSCF.Communication.Scs.Client.ScsClientBase.SendMessage (HSCF.Communication.Scs.Communication.Messages.IScsMessage message) [0x0001d] in <90ce3ee650ce43fd8757f2f759091405>:0
                              at HSCF.Communication.ScsServices.Communication.RequestReplyMes senger`1[T].SendMessageAndWaitForResponse (HSCF.Communication.Scs.Communication.Messages.IScsMessage message, System.Int32 timeoutMilliseconds) [0x000c0] in <90ce3ee650ce43fd8757f2f759091405>:0
                              at HSCF.Communication.ScsServices.Communication.RequestReplyMes senger`1[T].SendMessageAndWaitForResponse (HSCF.Communication.Scs.Communication.Messages.IScsMessage message) [0x00007] in <90ce3ee650ce43fd8757f2f759091405>:0
                              **** Exception Process: HSPI_JowiHue Thread: 10:Thread Pool Worker Message: ScsRemoteInvokeMessage: IHSApplication.GetDeviceByRef(...) Timeout: 60000 Exception: System.Exception: Client is not connected to the server communication state: Disconnected
                              at HSCF.Communication.Scs.Client.ScsClientBase.SendMessage (HSCF.Communication.Scs.Communication.Messages.IScsMessage message) [0x0001d] in <90ce3ee650ce43fd8757f2f759091405>:0
                              at HSCF.Communication.ScsServices.Communication.RequestReplyMes senger`1[T].SendMessageAndWaitForResponse (HSCF.Communication.Scs.Communication.Messages.IScsMessage message, System.Int32 timeoutMilliseconds) [0x000c0] in <90ce3ee650ce43fd8757f2f759091405>:0
                              at HSCF.Communication.ScsServices.Communication.RequestReplyMes senger`1[T].SendMessageAndWaitForResponse (HSCF.Communication.Scs.Communication.Messages.IScsMessage message) [0x00007] in <90ce3ee650ce43fd8757f2f759091405>:0
                              **** Exception Process: HSPI_JowiHue Thread: 10:Thread Pool Worker Message: ScsRemoteInvokeMessage: IHSApplication.GetDeviceByRef(...) Timeout: 60000 Exception: System.Exception: Client is not connected to the server communication state: Disconnected
                              at HSCF.Communication.Scs.Client.ScsClientBase.SendMessage (HSCF.Communication.Scs.Communication.Messages.IScsMessage message) [0x0001d] in <90ce3ee650ce43fd8757f2f759091405>:0
                              at HSCF.Communication.ScsServices.Communication.RequestReplyMes senger`1[T].SendMessageAndWaitForResponse (HSCF.Communication.Scs.Communication.Messages.IScsMessage message, System.Int32 timeoutMilliseconds) [0x000c0] in <90ce3ee650ce43fd8757f2f759091405>:0
                              at HSCF.Communication.ScsServices.Communication.RequestReplyMes senger`1[T].SendMessageAndWaitForResponse (HSCF.Communication.Scs.Communication.Messages.IScsMessage message) [0x00007] in <90ce3ee650ce43fd8757f2f759091405>:0
                              **** Exception Process: HSPI_JowiHue Thread: 10:Thread Pool Worker Message: ScsRemoteInvokeMessage: IHSApplication.GetDeviceByRef(...) Timeout: 60000 Exception: System.Exception: Client is not connected to the server communication state: Disconnected
                              at HSCF.Communication.Scs.Client.ScsClientBase.SendMessage (HSCF.Communication.Scs.Communication.Messages.IScsMessage message) [0x0001d] in <90ce3ee650ce43fd8757f2f759091405>:0
                              at HSCF.Communication.ScsServices.Communication.RequestReplyMes senger`1[T].SendMessageAndWaitForResponse (HSCF.Communication.Scs.Communication.Messages.IScsMessage message, System.Int32 timeoutMilliseconds) [0x000c0] in <90ce3ee650ce43fd8757f2f759091405>:0
                              at HSCF.Communication.ScsServices.Communication.RequestReplyMes senger`1[T].SendMessageAndWaitForResponse (HSCF.Communication.Scs.Communication.Messages.IScsMessage message) [0x00007] in <90ce3ee650ce43fd8757f2f759091405>:0
                              **** Exception Process: HSPI_JowiHue Thread: 10:Thread Pool Worker Message: ScsRemoteInvokeMessage: IHSApplication.GetDeviceByRef(...) Timeout: 60000 Exception: System.Exception: Client is not connected to the server communication state: Disconnected
                              at HSCF.Communication.Scs.Client.ScsClientBase.SendMessage (HSCF.Communication.Scs.Communication.Messages.IScsMessage message) [0x0001d] in <90ce3ee650ce43fd8757f2f759091405>:0
                              at HSCF.Communication.ScsServices.Communication.RequestReplyMes senger`1[T].SendMessageAndWaitForResponse (HSCF.Communication.Scs.Communication.Messages.IScsMessage message, System.Int32 timeoutMilliseconds) [0x000c0] in <90ce3ee650ce43fd8757f2f759091405>:0
                              at HSCF.Communication.ScsServices.Communication.RequestReplyMes senger`1[T].SendMessageAndWaitForResponse (HSCF.Communication.Scs.Communication.Messages.IScsMessage message) [0x00007] in <90ce3ee650ce43fd8757f2f759091405>:0
                              **** Exception Process: HSPI_JowiHue Thread: 10:Thread Pool Worker Message: ScsRemoteInvokeMessage: IHSApplication.GetDeviceByRef(...) Timeout: 60000 Exception: System.Exception: Client is not connected to the server communication state: Disconnected
                              at HSCF.Communication.Scs.Client.ScsClientBase.SendMessage (HSCF.Communication.Scs.Communication.Messages.IScsMessage message) [0x0001d] in <90ce3ee650ce43fd8757f2f759091405>:0
                              at HSCF.Communication.ScsServices.Communication.RequestReplyMes senger`1[T].SendMessageAndWaitForResponse (HSCF.Communication.Scs.Communication.Messages.IScsMessage message, System.Int32 timeoutMilliseconds) [0x000c0] in <90ce3ee650ce43fd8757f2f759091405>:0
                              at HSCF.Communication.ScsServices.Communication.RequestReplyMes senger`1[T].SendMessageAndWaitForResponse (HSCF.Communication.Scs.Communication.Messages.IScsMessage message) [0x00007] in <90ce3ee650ce43fd8757f2f759091405>:0
                              **** Exception Process: HSPI_JowiHue Thread: 10:Thread Pool Worker Message: ScsRemoteInvokeMessage: IHSApplication.GetDeviceByRef(...) Timeout: 60000 Exception: System.Exception: Client is not connected to the server communication state: Disconnected
                              at HSCF.Communication.Scs.Client.ScsClientBase.SendMessage (HSCF.Communication.Scs.Communication.Messages.IScsMessage message) [0x0001d] in <90ce3ee650ce43fd8757f2f759091405>:0
                              at HSCF.Communication.ScsServices.Communication.RequestReplyMes senger`1[T].SendMessageAndWaitForResponse (HSCF.Communication.Scs.Communication.Messages.IScsMessage message, System.Int32 timeoutMilliseconds) [0x000c0] in <90ce3ee650ce43fd8757f2f759091405>:0
                              at HSCF.Communication.ScsServices.Communication.RequestReplyMes senger`1[T].SendMessageAndWaitForResponse (HSCF.Communication.Scs.Communication.Messages.IScsMessage message) [0x00007] in <90ce3ee650ce43fd8757f2f759091405>:0
                              **** Exception Process: HSPI_JowiHue Thread: 10:Thread Pool Worker Message: ScsRemoteInvokeMessage: IHSApplication.GetDeviceByRef(...) Timeout: 60000 Exception: System.Exception: Client is not connected to the server communication state: Disconnected
                              at HSCF.Communication.Scs.Client.ScsClientBase.SendMessage (HSCF.Communication.Scs.Communication.Messages.IScsMessage message) [0x0001d] in <90ce3ee650ce43fd8757f2f759091405>:0
                              at HSCF.Communication.ScsServices.Communication.RequestReplyMes senger`1[T].SendMessageAndWaitForResponse (HSCF.Communication.Scs.Communication.Messages.IScsMessage message, System.Int32 timeoutMilliseconds) [0x000c0] in <90ce3ee650ce43fd8757f2f759091405>:0
                              at HSCF.Communication.ScsServices.Communication.RequestReplyMes senger`1[T].SendMessageAndWaitForResponse (HSCF.Communication.Scs.Communication.Messages.IScsMessage message) [0x00007] in <90ce3ee650ce43fd8757f2f759091405>:0
                              **** Exception Process: HSPI_JowiHue Thread: 10:Thread Pool Worker Message: ScsRemoteInvokeMessage: IHSApplication.GetDeviceByRef(...) Timeout: 60000 Exception: System.Exception: Client is not connected to the server communication state: Disconnected
                              at HSCF.Communication.Scs.Client.ScsClientBase.SendMessage (HSCF.Communication.Scs.Communication.Messages.IScsMessage message) [0x0001d] in <90ce3ee650ce43fd8757f2f759091405>:0
                              at HSCF.Communication.ScsServices.Communication.RequestReplyMes senger`1[T].SendMessageAndWaitForResponse (HSCF.Communication.Scs.Communication.Messages.IScsMessage message, System.Int32 timeoutMilliseconds) [0x000c0] in <90ce3ee650ce43fd8757f2f759091405>:0
                              at HSCF.Communication.ScsServices.Communication.RequestReplyMes senger`1[T].SendMessageAndWaitForResponse (HSCF.Communication.Scs.Communication.Messages.IScsMessage message) [0x00007] in <90ce3ee650ce43fd8757f2f759091405>:0
                              **** Exception Process: HSPI_JowiHue Thread: 10:Thread Pool Worker Message: ScsRemoteInvokeMessage: IHSApplication.GetDeviceByRef(...) Timeout: 60000 Exception: System.Exception: Client is not connected to the server communication state: Disconnected
                              at HSCF.Communication.Scs.Client.ScsClientBase.SendMessage (HSCF.Communication.Scs.Communication.Messages.IScsMessage message) [0x0001d] in <90ce3ee650ce43fd8757f2f759091405>:0
                              at HSCF.Communication.ScsServices.Communication.RequestReplyMes senger`1[T].SendMessageAndWaitForResponse (HSCF.Communication.Scs.Communication.Messages.IScsMessage message, System.Int32 timeoutMilliseconds) [0x000c0] in <90ce3ee650ce43fd8757f2f759091405>:0
                              at HSCF.Communication.ScsServices.Communication.RequestReplyMes senger`1[T].SendMessageAndWaitForResponse (HSCF.Communication.Scs.Communication.Messages.IScsMessage message) [0x00007] in <90ce3ee650ce43fd8757f2f759091405>:0
                              **** Exception Process: HSPI_JowiHue Thread: 10:Thread Pool Worker Message: ScsRemoteInvokeMessage: IHSApplication.GetDeviceByRef(...) Timeout: 60000 Exception: System.Exception: Client is not connected to the server communication state: Disconnected
                              at HSCF.Communication.Scs.Client.ScsClientBase.SendMessage (HSCF.Communication.Scs.Communication.Messages.IScsMessage message) [0x0001d] in <90ce3ee650ce43fd8757f2f759091405>:0
                              at HSCF.Communication.ScsServices.Communication.RequestReplyMes senger`1[T].SendMessageAndWaitForResponse (HSCF.Communication.Scs.Communication.Messages.IScsMessage message, System.Int32 timeoutMilliseconds) [0x000c0] in <90ce3ee650ce43fd8757f2f759091405>:0
                              at HSCF.Communication.ScsServices.Communication.RequestReplyMes senger`1[T].SendMessageAndWaitForResponse (HSCF.Communication.Scs.Communication.Messages.IScsMessage message) [0x00007] in <90ce3ee650ce43fd8757f2f759091405>:0

                              (Repeats for awhile but the forum has a post length limit so reducing the list)


                              Comment


                                #30
                                Originally posted by bsobel View Post

                                An update, we are still fighting the root cause but are very close. We now know exactly where the timeout is occurring in the SCS code which results in the NULL internally that gets propagated as the NPE. However, it may not be a timeout as much as a deadlock Rich and I are testing this now, I thought InitIO in conjunction with GetDeviceEnumerator caused it but it may actually be HSEvent taking a lock as InitIO doesn't seem to have any shared locks around that explains it. That all said, not stopping until this is dead!

                                Had two requests for you; I see that when JowiHue starts even if its the only plugin running on the system some events are dropped. This usually means that HSEvent has done some long running thing, as 2000 events can happen very very quickly at startup a delay of just a few seconds can cause this. Can you look at your HSEvent logic and see if there are things you can defer to a queue instead of processing in line?

                                Startup:
                                Connected, waiting to be initialized...
                                11:56:08:1651:[Starting Plug-In]->Initializing plugin JowiHue ...
                                11:56:09:3469:[JowiHue]->Starting plugin JowiHue 2.0.4.6 on Linux
                                11:56:10:2997:[JowiHue]->InitIO:Plugin basic initialisation done
                                11:56:10:3318:[Starting Plug-In]->Plugin JowiHue started successfully in 2166 milliseconds
                                11:56:10:3656:[Starting Plug-In]->JowiHue loaded in 3405 milliseconds
                                11:56:10:3757:[Plug-In]->Finished initializing plug-in JowiHue
                                11:56:18:2096:[Warning]->Dropping event callbacks due to full queue (Type: CONFIG_CHANGE) (2000 entries), system may be too busy, plugins and HSTouch may not receive all device updates
                                Slow call: Process: HSPI_JowiHue Thread: 10:Thread Pool Worker Message: ScsRemoteInvokeMessage: IHSApplication.GetDeviceEnumerator(...) Timeout: 60000
                                Slow call: Process: HSPI_JowiHue Thread: 3: Message: ScsRemoteInvokeMessage: IHSApplication.GetDeviceByRef(...) Timeout: 60000
                                Slow call: Process: HSConsole Thread: 17: Message: ScsRemoteInvokeMessage: IAllRemoteAPI.HSEvent(...) Timeout: 480000
                                Slow call: Process: HSConsole Thread: 11:Thread Pool Worker Message: ScsRemoteInvokeMessage: IAllRemoteAPI.InterfaceStatus(...) Timeout: 480000
                                Slow call: Process: HSConsole Thread: 32:Thread Pool Worker Message: ScsRemoteInvokeMessage: IAllRemoteAPI.InterfaceStatus(...) Timeout: 480000
                                Slow call: Process: HSConsole Thread: 89:Check Interfaces Message: ScsRemoteInvokeMessage: IAllRemoteAPI.get_Name(...) Timeout: 480000
                                11:57:00:2431:[Event]->Event Trigger "Delayed Actions System Startup Notification (Delayed Event) - 1"
                                11:57:00:2705:[Event]->Event Trigger "Maintenance System Startup Notification"

                                Bill,

                                The only HSevent the plugin handles is indeed a CONFIG_Change, Which in the plugin is evaluated to see if the changed device is of the plugin. Sounds to me that HS is offering all devices as a change during startup? If it is a device owned by the plugin, it is handled in a seperate thread.
                                If the plugin indeed handles the device change, a logline is created on level debug.
                                I think the question is here why HS offers all devices (2000 of 3000?) to a plugin that won't do anything with it. I am a bit doubting if HS would offer those queued events all to the plugin, I think it just don't know where to deliver them if my plugin is the only one?

                                In the plugin I do limit the requested callback with

                                Code:
                                callback.RegisterEventCB(Enums.HSEvent.CONFIG_CHANGE, IFACE_NAME, "")
                                So would really only expect the plugins devices to be offered, and even then, nothing changed yet, so why offer it?

                                Guess this might need some viewing?

                                Will be back on the shutdown issue later

                                Wim
                                -- Wim

                                Plugins: JowiHue, RFXCOM, Sonos4, Jon00's Perfmon and Network monitor, EasyTrigger, Pushover 3P, rnbWeather, BLBackup, AK SmartDevice, Pushover, PHLocation, Zwave, GCalseer, SDJ-Health, Device History, BLGData

                                1210 devices/features ---- 392 events ----- 40 scripts

                                Comment

                                Working...
                                X