Announcement

Collapse
No announcement yet.

Plugin won't initialize at startup

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

    Plugin won't initialize at startup

    On 2.0.3.3, for the last few weeks JowiHue reports "Plugin is not in ready state yet, bridge unreachable?" after boot. If I stop the plugin and start it again, it works fine. Is there anything that could potentially fail at startup that could be periodically retried to resolve this? Anything I can do to help track it down?
    Thanks
    Bill

    #2
    Bill,

    This error is shown when you open the configuration page and the bridge is not found (yet). Is it possible that you open the configuration page too soon after the boot?
    Also there can be a challenge when you restart deCONZ at the same time as HS3, are they on the same machine being rebooted? Then make sure to start deCONZ first and HS3 after that. Am still trying to find a solution to that..

    Wim
    -- Wim

    Plugins: JowiHue, RFXCOM, Sonos4, Jon00's Perfmon and Network monitor, EasyTrigger, Pushover 3P, rnbWeather, BLBackup, AK SmartDevice, Pushover, PHLocation, Zwave, GCalseer, SDJ-Health, Device History, BLGData

    1210 devices/features ---- 392 events ----- 40 scripts

    Comment


      #3
      "Is it possible that you open the configuration page too soon after the boot?" No, it doesn't matter if I open the page in a minute or 24 hours later. Until I stop and restart the plugin, it never finds the bridge. Its like whatever you are using to search must have an error at startup so never tries again until manually restarted. Perhaps an uncaught exception is killing the search loop?

      Also not using deCONZ, just have one Hue square bridge currently configured (out of 4 the plugin sees on the network).

      Comment


        #4
        Logs from startup and then the restart of the plugin. At startup there are a couple errors, on restart I get the not found error but then its immediately found (the hub was up the whole time). I can reproduce the not finding the hub at HS startup on every boot, currently it is 100% reproducible and as of now a restart is 100% bandaid (e.g. I have not seen the error occur when I do the restart, only at system boot).

        Aug-24 2:41:41 PM JowiHue ============================================================ ==================================
        Aug-24 2:41:41 PM JowiHue Device of reporting bridge and light firmware updates is enabled
        Aug-24 2:41:41 PM JowiHue 0 Animations available for use in this plugin
        Aug-24 2:41:41 PM JowiHue 2 Local scenes available for use in the JowiHue plugin
        Aug-24 2:41:41 PM JowiHue 23 Presets available for use in the JowiHue plugin
        Aug-24 2:41:41 PM JowiHue -------------------------------------Plugin info ---------------------------------------------
        Aug-24 2:41:41 PM JowiHue Bridge Main House is at version of type on a
        Aug-24 2:41:41 PM JowiHue 21 Groups available on bridge 'Main House'
        Aug-24 2:41:41 PM JowiHue 49 lights and/or plugs connected to bridge 'Main House'
        Aug-24 2:41:41 PM JowiHue -------------------------------------Bridge info ---------------------------------------------
        Aug-24 2:41:41 PM JowiHue ============================================================ ==================================
        Aug-24 2:41:03 PM JowiHue * Bridge Main House at 10.1.0.182:80 not connected (not reachable)
        Aug-24 2:40:53 PM JowiHue Warning: Bridge Main House not reachable on 10.1.0.182 is bridge powered on or IP/Port address has changed?
        Aug-24 2:40:53 PM JowiHue Cannot find bridge Main House on ip adress 10.1.0.182:80 marking unreachable
        Aug-24 2:40:13 PM JowiHue InitIO:Plugin basic initialisation done
        Aug-24 2:40:12 PM JowiHue Starting plugin JowiHue 2.0.3.3 on Linux
        Aug-24 2:39:59 PM JowiHue Error: (ShutdownIO)::Error ending JowiHue Plug-In
        Aug-24 10:27:41 AM JowiHue InitIO:Plugin basic initialisation done
        Aug-24 10:27:41 AM JowiHue Warning: MainWatchThread: No bridge found, plugin will scan until found
        Aug-24 10:27:30 AM JowiHue Error: (RegisterWebPage)::Registering Web Links: Object reference not set to an instance of an object
        Aug-24 10:27:30 AM JowiHue Error: (InitPlugin)::Object reference not set to an instance of an object
        Aug-24 10:26:29 AM JowiHue Starting plugin JowiHue 2.0.3.3 on Linux

        Comment


          #5
          The initial error you get is one I never saw happening before:
          Aug-24 10:27:30 AM JowiHue Error: (RegisterWebPage)::Registering Web Links: Object reference not set to an instance of an object
          Aug-24 10:27:30 AM JowiHue Error: (InitPlugin)::Object reference not set to an instance of an object
          This is done at the moment the plugin starts, it is not even yet looking for bridges? Is there anything special in the installation that could cause this issue? Are there any HS releated messages around the same time? Is HS maybe starting the plugin to early while still starting itself?

          Can you send me a detailed trace of the initial start after the boot and the restart? To do this follow these steps:
          1. Set the detailed trace and log to filoe option on the configuration page
          2. Perform the boot and let the plugin run for 10 minutes or so
          3. Restart the plugin and let it again run for 10 minutes.
          4. Disable the detailed log and log to file options
          5. In the logs subdirectory you will find JowiHue.log and JowiHue-last.log, send me both files?
          Send the files to jowihue at ziggo dot nl?

          Thanks,

          Wim
          -- Wim

          Plugins: JowiHue, RFXCOM, Sonos4, Jon00's Perfmon and Network monitor, EasyTrigger, Pushover 3P, rnbWeather, BLBackup, AK SmartDevice, Pushover, PHLocation, Zwave, GCalseer, SDJ-Health, Device History, BLGData

          1210 devices/features ---- 392 events ----- 40 scripts

          Comment


            #6
            Will do. As to the error, i have been fighting startup errors on Mono the entire time I've been on HS. Rich knows, but hasn't been able to identify a root cause. I see random exceptions on HS calls (which often work when retried, when HS is busy during startup). Likely you are hitting one that if retried would work but isn't retried, at the second startup the plugin doesn't hit that (because HS isnt busy) and life is good.

            Off to do the detailed logging now, thank you.

            Comment


              #7
              Getting the logs now, on the initial startup you are getting an exception calling GetDeviceEnumerator. We've found on Mono when running on a secondary thread that if the plugin is built on .NET 4.5.2 or less, it often gets an exception on the first attempt (if HS is busy) and works fine on the second. It seems to have to do with the connection to HS for remoting trying to setup while HS doesn't respond promptly. On .NET >= 4.6 I have seen this error much more randomly (fails sometimes, not others, sometimes multiple retries are required). But for your case, I really really suspect if you retry those HS calls at least once after any null pointer error, you will see it start working consistently on Mono.

              I'd probably be much happier if I moved to Windows, but really am working towards root cause on this (I think its an HS/mono issue that plugin authors are running into, as having to retry the HS calls should in an ideal world not be required)...

              Here is the problematic call:

              10:28:31.2259190 -- (HueTools.InitPlugin) Error: :Object reference not set to an instance of an object
              10:28:33.1662960 -- (HueTools.InitPlugin) System.NullReferenceException(0x80004003): Object reference not set to an instance of an object
              at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke (System.Runtime.Remoting.Proxies.RealProxy rp, System.Runtime.Remoting.Messaging.IMessage msg, System.Exception& exc, System.Object[]& out_args) [0x00108] in <15c986724bdc480293909469513cfdb3>:0
              at (wrapper managed-to-native) System.Object.__icall_wrapper_mono_remoting_wrapper(intptr,i ntptr)
              at (wrapper remoting-invoke) HomeSeerAPI.IHSApplication.GetDeviceEnumerator()
              at HSPI_JowiHue.MyBridges.FindBridgesSelf () [0x000be] in <49591be7829f4c96b8b5a8559a6b5db2>:0
              at HSPI_JowiHue.MyBridges.Init () [0x0001b] in <49591be7829f4c96b8b5a8559a6b5db2>:0
              at HSPI_JowiHue.HueTools.InitPlugin () [0x000b3] in <49591be7829f4c96b8b5a8559a6b5db2>:0

              10:28:33.1665650 -- (HueTools.InitPlugin) Debug: Config Done InitOK=False

              I will send the two logs to you right now.

              Comment


                #8
                Well, that is one of those challences with Mono. Earlier I had a discussion about the plugin raising CPU and memory, a bit later irt was not the plugin anymore, but a bug in a unsupported (by HS) Mono version.... later it was the plugin taking more cpu usage, now it is HS3 not being ready to take calls from plugins. At the same time Windows is just buzzing around. I am getting more doubts if Linux is the road to go with Mono around as it keeps throwing strange bugs.

                Not sure here if it is wise to just 'ignore' errors and just bump it again. Does not sound as correct coding to me.....

                But even if I want to try this test, did you test the latest beta? I compiled it on a different .NET level for you to test, maybe the issue changes too on the beta?

                Thanks,

                Wim
                -- Wim

                Plugins: JowiHue, RFXCOM, Sonos4, Jon00's Perfmon and Network monitor, EasyTrigger, Pushover 3P, rnbWeather, BLBackup, AK SmartDevice, Pushover, PHLocation, Zwave, GCalseer, SDJ-Health, Device History, BLGData

                1210 devices/features ---- 392 events ----- 40 scripts

                Comment


                  #9
                  "Not sure here if it is wise to just 'ignore' errors and just bump it again. Does not sound as correct coding to me....."

                  I agree, but it's not really ignoring the issue, its a known bug causing the first attempt on a secondary thread to fail. Since we know what it is, at least there is a known way to resolve it. Its always a Null pointer exception in a call via remoting to HS and always from from the first call to remoting on a non-primary thread.

                  On the beta, I have not, would you like me to? The .net level change was effecting plugins >= .NET 4.6. This issue occurs regardless of the .NET level do to this remoting bug (the issues on later .NET happen much more often, the symptom is similar. In those versions its like the remoting is torn down more and has to reestablish or something else is apparently retriggering the exact or similar issue, but on 4.5.2 and above its always 'secondary thread, first call'...

                  "At the same time Windows is just buzzing around. I am getting more doubts if Linux is the road to go with Mono around as it keeps throwing strange bugs."

                  I have the same concerns, part of that is the HS itself is only doing development on Windows. They really need a few folks on Mono full time, IMHO if they are serious about non Windows platforms.

                  Comment


                    #10
                    So far I know you are now the only one reporting this bug on the plugin. There are several other users of the plugin using Mono that did not report (yet) this issue. For where I am standing it could be a specific issue in your installation?
                    I am a bit unsure on what you mean by 'remoting'? Is the plugin used on a remote machine for you or are you referring to the plugin being an external ap for HS3? If the plugin is used on a remote machine I might better understand the slowness - although then synchronous starting woult not be needed I guess?

                    Wim
                    -- Wim

                    Plugins: JowiHue, RFXCOM, Sonos4, Jon00's Perfmon and Network monitor, EasyTrigger, Pushover 3P, rnbWeather, BLBackup, AK SmartDevice, Pushover, PHLocation, Zwave, GCalseer, SDJ-Health, Device History, BLGData

                    1210 devices/features ---- 392 events ----- 40 scripts

                    Comment


                      #11
                      Originally posted by w.vuyk View Post
                      So far I know you are now the only one reporting this bug on the plugin. There are several other users of the plugin using Mono that did not report (yet) this issue. For where I am standing it could be a specific issue in your installation?
                      I am a bit unsure on what you mean by 'remoting'? Is the plugin used on a remote machine for you or are you referring to the plugin being an external ap for HS3? If the plugin is used on a remote machine I might better understand the slowness - although then synchronous starting woult not be needed I guess?
                      Wim
                      What plugins are effected by the bug varies as since it seems to manifest the most during startup, right now it seems to be hitting you. Im sure you have other users who have hit this without knowing, if you search the forum for that specific mono line at the top of the stack trace you'll see threads on it.

                      As for remoting, HomeSeer uses .NET remoting to talk to the plugins and for the plugins to talk back to HS. They used an open source package which is just a wrapper over the dot net framework called SCS (https://github.com/hikalkan/scs). The plugin is running on the same box as HS, but it still uses remoting since its a separate process. its this remoting setup that fails, and like I mentioned when it occurs its the first HS call on an secondary threads (the main thread already has remoting setup, so thats why calls on the main thread don't fail).

                      I've been dealing with this darn bug for quite some time, as have a few others users (you'll notice when you search for that mono line). Its frustrating because its nothing in your code, so I realize I'm asking you to work around a bug. My code catches the NPE and retries after a 1 second delay up to 10 times, and then if that fails it throws the original exception. This way if there is a real bug its not masked, but these transient ones are dealt with.

                      The 10 times came from when I was building on 4.7 not realizing it made the problem much worse, on 4.5.2 Im consistently seeing the one exception for each new thread (if the thread dies and has to connect again, it can happen again, but seems to depend on if HS is busy). I suspect there is timeout somewhere or a full message queue that needs to be dumped while HS is doing some startup tasks.

                      I am continuing to try and find the final root cause on this, as honestly HS doesn't fully work on Mono today IMHO until this is addressed. Even if you made retires optional, I'd really really appreciate a build with it, until root cause (and a fix from mono or HS) its kinda whack-a-mole at startup seeing which plugins failed and which didn't (and since its timing related, that changes as new plugins are added, etc)...

                      Comment


                        #12
                        If you search the forum for: "__icall_wrapper_mono_remoting_wrapper(intptr,intptr)" many of those cases are manifestations of this bug, its just taken a long time to get all the pieces together to figure it out...

                        Comment


                          #13
                          I think this is another case of one of your uses (also named Bill) running into the bug; https://forums.homeseer.com/forum/li...lugin-restarts Notice with config changes, it went away, its part of the 'fun' transient nature of this one (like you, I much prefer 100% reproducible bugs!)

                          Edit; I am more and more convinced that HS is doing some long running operation on its primary thread and is not pumping windows messages during those operations. Async callbacks rely on the message queue to get dispatched. I just sent Rich a long email on this and asked him to audit the startup code with this in mind, Ill let you know what he comes back with. If thats indeed the root cause, we can fix this by ensuring HS itself is pumping messages correctly, but will remain a legacy bug for anyone before whatever version includes that fix (meaning the retry, if you do agree to do it, won't be completely wasted work)

                          Comment


                            #14
                            I too am having this same issue. The bridge is pingable, I can see it's notification webpage yet HS reports it as unreachable. All of my devices then show unreachable yet I can control them via Alexa or the Hue app so I know the bridge is alive and functioning.

                            Comment


                              #15
                              Disregard, the issue seems resolved in the most recent Beta.

                              Comment

                              Working...
                              X