System crashing (out of memory) when plugin enabled

    So I finally set up my weather station this week and turned on the plugin (it had been sitting disabled for about 2 months). HS has crashed twice in the last 24 hours with out-of-memory errors. HS usually runs at about 15% memory usage on my machine, but with the plugin enabled I've seen it spike to 40% before crashing. I disabled it again last night and have now run 24 hours 'normally'.

    Linux SEL system
    Mono 5.10.0.220

    Any suggestions on where to start looking for this? Any other similar reports?

    Thanks
    Bill

    #2
    First report of this.

    How much memory does the SEL have?
    How many of the devices do you have enabled (Ambient Weather devices)?
    40% memory usage during updates is not a terrible amount. The plugin is generally more CPU intensive than memory intensive while it updates the HS3 devices, but only for a short period.

    Can you provide some memory stats during the run time of the plugin? Some top or htop output sorted by memory?
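As a rough alternative to pasting top or htop output, the same per-process memory figures can be read from /proc on Linux. The sketch below is only an illustration: the function names are made up for this example and are not part of HomeSeer or the plugin, and it assumes a Linux system where /proc/<pid>/status exposes the VmRSS field.

```python
import os
import re


def parse_vmrss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def parse_name(status_text):
    """Return the process name from /proc/<pid>/status text."""
    match = re.search(r"^Name:\s+(.+)$", status_text, re.MULTILINE)
    return match.group(1).strip() if match else "?"


def top_by_memory(limit=10, proc_root="/proc"):
    """List (rss_kb, pid, name) for the processes with the largest resident memory."""
    rows = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "status")) as handle:
                text = handle.read()
        except OSError:
            continue  # process exited or is not readable
        rss = parse_vmrss_kb(text)
        if rss is not None:  # kernel threads have no VmRSS line
            rows.append((rss, int(entry), parse_name(text)))
    return sorted(rows, reverse=True)[:limit]


if __name__ == "__main__":
    for rss_kb, pid, name in top_by_memory():
        print(f"{rss_kb / 1024:8.1f} MiB  pid={pid:<7} {name}")
```

Running it every minute or so while the plugin is enabled would show whether the growth is in HSConsole or in the plugin's own process.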



      #3
      I'll re-run it, but to be clearer, it's the HSConsole process that runs out of memory when the plugin is enabled. Yes, this shouldn't happen. In the past some plugins would create threads or other objects statically when loaded (e.g. their HSPI constructor would allocate objects instead of waiting to do it in InitIO). Since HSConsole loads each plugin twice (once to read its settings, which it loads in its own process, and a second time launched externally), those allocations and threads can crash HSConsole. The other possibility (which is just as likely) is a bug in one of the HS APIs that leaks memory, which your plugin happens to hit more often than the other plugins active on my system.

      I'll run it and get the data in the morning, since I will need to babysit it for a few hours. Thanks!

      p.s. Oh, and 4 GB of physical memory on the SEL boxes.
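The constructor-versus-InitIO point above can be illustrated with a small language-neutral sketch. This is not the HomeSeer plugin API (which is .NET); the class and method names here are hypothetical stand-ins for a plugin's constructor, InitIO, and shutdown hooks, and the sketch only shows the pattern of keeping construction cheap and deferring threads to an explicit init call.

```python
import threading


class PluginSketch:
    """Hypothetical plugin: nothing expensive happens at construction time."""

    def __init__(self):
        # The host may construct the plugin more than once, so the
        # constructor only records state; no threads, no large buffers.
        self._worker = None
        self._stop = threading.Event()

    def init_io(self):
        """Stand-in for InitIO: start background work only when asked to."""
        if self._worker is None:
            self._stop.clear()
            self._worker = threading.Thread(target=self._run, daemon=True)
            self._worker.start()

    def _run(self):
        # Placeholder loop; a real plugin would poll its device here.
        while not self._stop.wait(timeout=0.05):
            pass

    def shutdown_io(self):
        """Stop the worker and release it."""
        self._stop.set()
        if self._worker is not None:
            self._worker.join()
            self._worker = None

    @property
    def running(self):
        return self._worker is not None and self._worker.is_alive()
```

With this shape, an instance that is created only to read settings never starts a thread, so loading the plugin twice costs little.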



        #4
        What version of the plugin? If not the latest beta then update to that.

        What's your setup also? AmbientWeather.Net only (cloud) or also using an ObserverIP module?



          #5
          Originally posted by Simplex Technology:
          What version of the plugin? If not the latest beta then update to that.

          What's your setup also? AmbientWeather.Net only (cloud) or also using an ObserverIP module?

          I was on 2.1.0.34. I just installed the beta and will start monitoring it after breakfast (ok, brunch; late start today ;)
          I'm using the ObserverIP module on the local network.



            #6
            Ok, I enabled the plugin right after my post and headed to brunch, figuring it would take a while. Nope, HS restarted 45 minutes later. But the symptoms are different: the plugin itself now seems to be consuming the memory and not responding promptly to HS, so HS appears hung and HsSentry restarts it. The plugin had been running for about 10 minutes and had already consumed 34+% of memory and 40-100% CPU (screenshot from top sorted by memory attached).

            [Attached image: Screen Shot 2018-12-23 at 12.42.12 PM.png]



              #7
              Oh, and by the way, you forgot to tag your config link as a config link so that it's clickable from the Plugin page directly to the config page. Just FYI, it should be a one-line change.



                #8
                [Attached image: Screen Shot 2018-12-23 at 12.54.58 PM.png]

                One more data point. I restarted the plugin and tried to bring up the config page. I've been waiting for 5 minutes while the process has gone to 22% memory and 40ish-134% CPU (dual core). I was going to turn off the local server to see if it helped; I'll look for an .ini file since I can't get to the config page. Since you haven't seen this before, it could be a Mono issue. Any chance you can test on the latest Mono?



                  #9
                  Ok, big change. I went into the .ini file and removed the ObserverIP IP and MAC entries (Obs1IP, Obs1Mac). The plugin is now at 3.2% memory usage, spiking to about 54% CPU every minute during polling. It's running fine in this mode, so the issue seems tied to the direct connection to the ObserverIP. Luckily I can run by polling the cloud until we figure this out.

                  I'm also a developer, so happy to try whatever you need, just lmk...

                  I haven't dug into the HTTP library you are using, but FYI, see https://github.com/mono/mono/issues/11928 (I have no evidence at all beyond the fact that there is an issue connecting to a device, so this is just a guess).
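For anyone who wants to script the .ini workaround described above rather than edit the file by hand, here is a minimal sketch using Python's standard configparser. Only the key names Obs1IP and Obs1Mac come from this thread; the section name, the sample values, and the PollSeconds key in the usage example are invented for illustration, and the plugin's real .ini layout may differ. Note that configparser drops comments when it rewrites a file, so back up the original first.

```python
import configparser
import io


def remove_keys(ini_text, keys):
    """Return (new_ini_text, removed) with the given keys dropped from every section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case as written (the default lowercases keys)
    parser.read_string(ini_text)
    removed = []
    for section in parser.sections():
        for key in keys:
            if parser.remove_option(section, key):
                removed.append((section, key))
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue(), removed


if __name__ == "__main__":
    # Hypothetical file contents; the section name and values are made up.
    sample = "[Settings]\nObs1IP=192.168.1.50\nObs1Mac=AA-BB-CC-DD-EE-FF\nPollSeconds=60\n"
    new_text, removed = remove_keys(sample, ["Obs1IP", "Obs1Mac"])
    print(removed)
    print(new_text)
```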



                    #10
                    I'm curious if your ObserverIP module is hung and either slow or not responding at all.

                    1. Can you load the ObserverIP in a Web Browser?
                    Does it load quickly or slowly?

                    The ObserverIP module has been an issue from the start, as the little box has a memory leak/constraint and will eventually run out of memory and crash. In most other installations of the plugin this just throws connection errors. However, it may behave differently with Mono.



                      #11
                      Originally posted by Simplex Technology:
                      I'm curious if your ObserverIP module is hung and either slow or not responding at all.

                      1. Can you load the ObserverIP in a Web Browser?
                      Does it load quickly or slowly?

                      The ObserverIP module has been an issue from the start, as the little box has a memory leak/constraint and will eventually run out of memory and crash. In most other installations of the plugin this just throws connection errors. However, it may behave differently with Mono.

                      If I go with a browser it 'appears' to load fast, i.e. the page renders with data reasonably quickly. However, with Safari developer tools turned on, the live data page takes 8-40 seconds to load. It visually loads in about 4 seconds and most times is totally done in under 10, but I just saw one take 20 seconds, and another pass took 40. The holdup seems to be loading the axisj0.js file. I would normally assume your plugin doesn't load those components, but perhaps it does. I noticed the refresh time for the ObserverIP defaulted to 16 seconds, so possibly it is making another request before processing the first? (I haven't dug into the code enough to know whether the 16-second timer restarts AFTER the prior run or runs concurrently.)

                      [Attached image: Screen Shot 2018-12-23 at 1.47.50 PM.png]
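The question above, whether the timer restarts after the prior run or fires concurrently, matters because a device that sometimes takes 40 seconds to answer can end up with several requests in flight if polls fire on a fixed 16-second schedule. The sketch below shows the non-overlapping variant only; it is not how the plugin is known to work, the function names are hypothetical, and the request timeout of 10 seconds is an arbitrary example value.

```python
import time
import urllib.request


def fetch_live_data(url, timeout_s=10.0):
    """Fetch one page; the timeout bounds how long a slow device can stall a poll."""
    with urllib.request.urlopen(url, timeout=timeout_s) as response:
        return response.read()


def poll_sequentially(fetch, interval_s, cycles, sleep=time.sleep, clock=time.monotonic):
    """Call `fetch` `cycles` times without ever overlapping two requests.

    The wait is measured from the END of each fetch, so a slow response
    delays the next poll instead of stacking a second request on top of it.
    Returns the duration of each fetch in seconds.
    """
    durations = []
    for _ in range(cycles):
        started = clock()
        try:
            fetch()
        except OSError as error:  # URLError and socket timeouts derive from OSError
            print(f"poll failed: {error}")
        durations.append(clock() - started)
        sleep(interval_s)
    return durations
```

A call such as `poll_sequentially(lambda: fetch_live_data(url), interval_s=60, cycles=10)`, with `url` set to the ObserverIP live data page, would issue at most one request at a time regardless of how slowly the device responds.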



                        #12
                        FYI I re-added the IP but changed the ObserverIP polling to 60 seconds from 16. Letting it run now.



                          #13
                          Originally posted by bsobel:
                          FYI I re-added the IP but changed the ObserverIP polling to 60 seconds from 16. Letting it run now.

                          Early versions polled every 16 seconds because that is the interval at which data is available from the PWS to the OIP. However, we learned quickly that the OIP dies rapidly, and it was changed to a polling instead of an auto poll service.



                            #14
                            Originally posted by Simplex Technology:
                            Early versions polled every 16 seconds because that is the interval at which data is available from the PWS to the OIP. However, we learned quickly that the OIP dies rapidly, and it was changed to a polling instead of an auto poll service.

                            Sorry, what do you mean by auto poll vs. polling? What is the timer set to in the later versions? It's running, but the leak (I believe) remains. Memory has gone from 3.4% to about 10% in the last 40 minutes. That's slower, but it seems to reflect the lower polling rate. The high CPU usage is also back (it's the highest CPU usage of any of the plugins by a good multiple): 30 mins of total CPU time in the last 40 mins (58 mins for HSConsole, 15 mins for zwave, which is normally my highest).
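To put the figures above in perspective, the arithmetic can be spelled out. Going from 3.4% to about 10% in 40 minutes is roughly 10 percentage points per hour, and 30 minutes of CPU time over 40 minutes of wall time averages 75% of one core. The sketch below just encodes that arithmetic; the 4096 MiB total is an assumption based on the 4 GB figure mentioned earlier in the thread, and the result is only as accurate as the "about 10%" reading.

```python
def growth_per_hour(start_pct, end_pct, minutes):
    """Memory growth in percentage points per hour, assuming linear growth."""
    return (end_pct - start_pct) / minutes * 60.0


def average_core_utilization(cpu_minutes, wall_minutes):
    """Average fraction of one core used over the wall-clock window."""
    return cpu_minutes / wall_minutes


def pct_to_mib(pct, total_mib):
    """Convert a percentage of total memory to MiB."""
    return pct / 100.0 * total_mib


if __name__ == "__main__":
    rate = growth_per_hour(3.4, 10.0, 40)
    # 4096 MiB is an assumed total, based on the "4 GB" mentioned in the thread.
    print(f"growth: {rate:.1f} points/hour, about {pct_to_mib(rate, 4096):.0f} MiB/hour")
    print(f"average CPU: {average_core_utilization(30, 40):.0%} of one core")
```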



                              #15
                              Originally posted by bsobel:
                              Sorry, what do you mean by auto poll vs. polling? What is the timer set to in the later versions? It's running, but the leak (I believe) remains. Memory has gone from 3.4% to about 10% in the last 40 minutes. That's slower, but it seems to reflect the lower polling rate. The high CPU usage is also back (it's the highest CPU usage of any of the plugins by a good multiple): 30 mins of total CPU time in the last 40 mins (58 mins for HSConsole, 15 mins for zwave, which is normally my highest).

                              The memory leak I was referring to is in the OIP itself, not the plugin. On initial start there's nothing in memory until the first polling cycle, so a measurement from start to active is not a clear measure.

                              There is no internal polling interval in the latest versions; that was removed. The only polling is what you set in configuration.

                              CPU is known to spike during the refresh (polling cycle) and then go back down. For it to stay at a consistently high level would indicate a problem.
