
MCSSprinklers silently dying

    MCSSprinklers silently dying

    Hi. I am having a problem since I moved HomeSeer Pro to a Windows 7 guest machine on an ESXi hypervisor.

    I have everything working, but after some amount of time, 2-5 days, mcsSprinklers (Professional) stops functioning. When I try to exit HomeSeer to restart it, HS complains about not being able to find some HTML content and seemingly waits forever. Here is the log file output:

    5/12/2013 10:32:28 AM ~!~Shutdown~!~Shutting down plug-in: mcsSprinklersP
    5/12/2013 10:32:45 AM ~!~Error~!~Error processing plug-in links for /mcsSprinklers/mcsSprinklers.asp?Page=sprinkler requested by default from 10.0.8.66, at:0, Index was outside the bounds of the array.
    5/12/2013 10:32:46 AM ~!~Error~!~Error processing plug-in links for /mcsSprinklers/StyleNoBody.css requested by from 10.0.8.66, at:0, Index was outside the bounds of the array.
    5/12/2013 10:32:46 AM ~!~Error~!~Error processing plug-in links for /mcsSprinklers/OverLib/overlib.js requested by default from 10.0.8.66, at:0, Index was outside the bounds of the array.
    5/12/2013 10:32:46 AM ~!~Error~!~Error processing plug-in links for /mcsSprinklers/clock.gif requested by from 10.0.8.66, at:0, Index was outside the bounds of the array.
    5/12/2013 10:32:46 AM ~!~Error~!~Error processing plug-in links for /mcsSprinklers/sunrise.gif requested by from 10.0.8.66, at:0, Index was outside the bounds of the array.
    5/12/2013 10:32:46 AM ~!~Error~!~Error processing plug-in links for /mcsSprinklers/sunset.gif requested by from 10.0.8.66, at:0, Index was outside the bounds of the array.
    5/12/2013 10:32:46 AM ~!~Error~!~Error processing plug-in links for /kesmall.jpg requested by from 10.0.8.66, at:0, Index was outside the bounds of the array.
    5/12/2013 10:32:46 AM ~!~Error~!~Error processing plug-in links for /mcsSprinklers/Images/Calendar.gif requested by from 10.0.8.66, at:0, Index was outside the bounds of the array.
    5/12/2013 10:32:46 AM ~!~Error~!~Error processing plug-in links for /mcsSprinklers/Images/led-yellow.gif requested by from 10.0.8.66, at:0, Index was outside the bounds of the array.
    5/12/2013 10:32:46 AM ~!~Error~!~Error processing plug-in links for /mcsSprinklers/Images/green.gif requested by from 10.0.8.66, at:0, Index was outside the bounds of the array.
    5/12/2013 10:32:46 AM ~!~Error~!~Error processing plug-in links for /mcsSprinklers/Images/Sensors/blank1.gif requested by from 10.0.8.66, at:0, Index was outside the bounds of the array.
    5/12/2013 10:32:46 AM ~!~Error~!~Error processing plug-in links for /mcsSprinklers/Images/led-ltblue.gif requested by from 10.0.8.66, at:0, Index was outside the bounds of the array.
    5/12/2013 10:33:46 AM ~!~Error~!~Error processing plug-in links for /mcsSprinklers/mcsSprinklers.asp?Page=sprinkler requested by default from 10.0.8.66, at:0, Index was outside the bounds of the array.
    5/12/2013 10:33:47 AM ~!~Error~!~Error processing plug-in links for /mcsSprinklers/StyleNoBody.css requested by from 10.0.8.66, at:0, Index was outside the bounds of the array.
    5/12/2013 10:33:47 AM ~!~Error~!~Error processing plug-in links for /mcsSprinklers/OverLib/overlib.js requested by default from 10.0.8.66, at:0, Index was outside the bounds of the array.
    5/12/2013 10:33:47 AM ~!~Error~!~Error processing plug-in links for /mcsSprinklers/clock.gif requested by from 10.0.8.66, at:0, Index was outside the bounds of the array.
    5/12/2013 10:33:47 AM ~!~Error~!~Error processing plug-in links for /mcsSprinklers/sunrise.gif requested by from 10.0.8.66, at:0, Index was outside the bounds of the array.
    5/12/2013 10:33:47 AM ~!~Error~!~Error processing plug-in links for /mcsSprinklers/sunset.gif requested by from 10.0.8.66, at:0, Index was outside the bounds of the array.
    5/12/2013 10:33:47 AM ~!~Error~!~Error processing plug-in links for /kesmall.jpg requested by from 10.0.8.66, at:0, Index was outside the bounds of the array.

    This keeps on going....

    If I kill HomeSeer through taskmgr and restart it, mcsSprinklers still doesn't work. However, a full reboot of the guest VM restores service.

    When I say mcsSprinklers dies, I mean that the run log shows all communications checking with the Rain8 modules stopping. Weather checking with the wunderground sites also stops in the logs. The plugin is still accessible via the browser and displays everything normally, but nothing "works". When I go to the zone status and trigger a manual run, the zone says "manual control", but the zone that is supposed to be running doesn't say it is watering now. Instead it says it is supposed to start at the current time, which keeps on increasing. There is no log activity of valve state changes, and no valve action occurs.

    I did do a clean install of HS into the guest VM (I didn't use a P2V tool), but did bring over all the config files from the old HS system, as the HS instructions for moving to different hardware direct.


    Any ideas as to what's going on? This is a real problem since the weather is hot. Rebooting the VM isn't that hard, but checking up on it is a pain.

    Thx
    Mike

    #2
    In a user's debug I saw a case where the schedule loop stopped running. The debug that was available did not have the information to hint at why. The latest beta posted has some additional debug to try to isolate the cause. Run with debug on if you update to the latest.

      #3
      Ok, I will upgrade to the latest beta and turn debug on.

      Can you tell me if there is some process that may be dying that I could look for?

      thx
      mike

        #4
        Debug file from last night

        Mike, I have attached the mcs debugging log from last night. I have had debugging on for some time, but never could spot any obvious cause. From the serial I/O log, mcsSprinklers seems to have stopped at around 8:20 PM. You can see the behavior in the log change around then, and not much happens afterwards.

        For the other user that was having the problem, how did he resolve it?

        Thx
        mike
          #5
          This is the same behavior. The latest beta has additional debug to help understand why the periodic cycle stopped. Nobody has reported back yet on this latest beta, 2.13.0.10. Using it should help get to the bottom of it as soon as possible.

          Comment


            #6
            Looking at your data, it appears the issue is associated with a sync lock between two threads: one sending Rain8Net data and one receiving it. I added additional debug in V2.13.0.12 to zero in on this.

              #7
              Originally posted by Michael McSharry:
              Looking at your data, it appears the issue is associated with a sync lock between two threads: one sending Rain8Net data and one receiving it. I added additional debug in V2.13.0.12 to zero in on this.
              Ok, I have loaded that version with debugging on, so I'll report back when it fails...

              Thx
              Mike

                #8
                Ok, it didn't take long to fail. I have attached the log file. According to the serial I/O log, things quit around 10:27 PM.

                Is there an older version of the code that is known not to have this problem that I can revert to?

                BTW, why is it that I have to reboot the system to clear the fault, and simply restarting HS doesn't fix it?

                Thx
                mike
                  #9
                  I'm guessing you have a setup where the Rain8 Setup page is set to poll Rain8 status every minute. A setup where status is polled only after a command will greatly reduce the communications that are causing the problem.

                  My guess is that it is an interlock in .NET, and apparently the reboot is the only mechanism to clear it. It is only a guess at this point.

                  It would take some research to identify where the additional thread protection logic was added. If you want to try an earlier version then perhaps an early V2.12. The links are located at the top of this forum.

                    #10
                    Originally posted by Michael McSharry:
                    I'm guessing you have a setup where the Rain8 Setup page is set to poll Rain8 status every minute. A setup where status is polled only after a command will greatly reduce the communications that are causing the problem.

                    My guess is that it is an interlock in .NET, and apparently the reboot is the only mechanism to clear it. It is only a guess at this point.

                    It would take some research to identify where the additional thread protection logic was added. If you want to try an earlier version then perhaps an early V2.12. The links are located at the top of this forum.
                    That's right! I just changed it to poll only after a request. I didn't know this would trigger a problem, but I can see why this is hard to debug. Should I be running a specific version of .NET? I am happy to set this back to 1 min polls to help you debug it. An odd race condition like this is unnerving to have in a function like this.

                    If you can detect it, can you at least generate an alert for it?

                    Thx
                    Mike
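
The alert asked for above could be approached with a heartbeat watchdog. What follows is only a minimal sketch in Python, not mcsSprinklers code: the names `Heartbeat`, `check_stalled`, and the `alert` callback are hypothetical, and the thread does not say how (or whether) the plug-in detects a stalled schedule loop. The idea is that the loop records a timestamp on every pass, and a separate checker raises an alert when that timestamp gets too old.

```python
import threading
import time


class Heartbeat:
    """Records the last time the monitored loop reported progress."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock              # injectable so the logic can be tested
        self._lock = threading.Lock()    # protects _last_beat across threads
        self._last_beat = clock()

    def beat(self):
        """Call once per pass of the monitored loop."""
        with self._lock:
            self._last_beat = self._clock()

    def seconds_since_beat(self):
        with self._lock:
            return self._clock() - self._last_beat


def check_stalled(heartbeat, max_silence_s, alert):
    """Call alert(message) and return True if the loop has been silent too long."""
    silence = heartbeat.seconds_since_beat()
    if silence > max_silence_s:
        alert(f"schedule loop silent for {silence:.0f}s (limit {max_silence_s}s)")
        return True
    return False
```

In use, the schedule loop would call `beat()` on each pass, and `check_stalled` would run from a different thread or timer, because a check that lives inside the hung thread can never fire. The limit should be comfortably longer than the polling interval (for example a few minutes with 1 min polls) to avoid false alarms.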

                      #11
                      I posted 2.13.0.13 based upon the data in the debug that showed a nested synclock that looks to have caused the issue.
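
For readers unfamiliar with how nested locks can hang two threads, here is a generic Python sketch of a lock-order inversion. It is an assumption that this resembles the plug-in's situation: the thread gives no detail on the actual VB.NET code, and `send_lock`, `recv_lock`, and `try_both` are hypothetical names. Each thread takes one lock and then reaches for the other. When the two threads nest the locks in opposite orders, each ends up waiting on the lock the other holds. A timeout stands in for the hang so the demonstration always finishes.

```python
import threading


def try_both(order_a, order_b, overlap):
    """Run two threads that each take two locks in the given order.

    Returns {thread_name: True if the second lock was obtained}.
    Use overlap=True only with opposite orders; it forces both threads
    to hold their first lock before either tries the second.
    """
    results = {}
    barrier = threading.Barrier(2)

    def worker(name, first, second):
        with first:
            if overlap:
                barrier.wait()   # both threads now hold their first lock
            # The timeout replaces an indefinite wait so the demo ends.
            got = second.acquire(timeout=0.2)
            if got:
                second.release()
            results[name] = got
            if overlap:
                barrier.wait()   # keep the first lock until both have tried

    threads = [
        threading.Thread(target=worker, args=("sender", *order_a)),
        threading.Thread(target=worker, args=("receiver", *order_b)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


send_lock = threading.Lock()   # hypothetical: guards the transmit path
recv_lock = threading.Lock()   # hypothetical: guards the receive path

# Opposite nesting order: neither thread can get its second lock.
inverted = try_both((send_lock, recv_lock), (recv_lock, send_lock), overlap=True)

# Same nesting order in both threads: each gets both locks in turn.
consistent = try_both((send_lock, recv_lock), (send_lock, recv_lock), overlap=False)
```

The usual remedies are to avoid nesting altogether or to make every thread acquire the locks in the same order. Which of these 2.13.0.13 uses is not stated in the thread.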

                        #12
                        Originally posted by Michael McSharry:
                        I posted 2.13.0.13 based upon the data in the debug that showed a nested synclock that looks to have caused the issue.
                        Ok, will try it out shortly with polling set to 1 min.


                        Thanks for the quick turnaround,
                        Mike

                          #13
                          Mike, so far so good, almost 24 hrs without a hang...

                          Good work.

                          Thx
                          Mike

                            #14
                            Still working fine. Looks like you can cross the bug off the buglist.

                            Thx
                            mike
