Homeseer crashes...how to debug?

  • Homeseer crashes...how to debug?

    I am getting random crashes on a Homeseer SEL running 3.0.0.531 (though this has been happening on earlier builds as well)...there does not seem to be any pattern to them. It is happening roughly once or twice a week.

    I have HSsentry running yet it fails to restart HS every time this happens. I have been rebooting the box when this occurs. There is nothing in the HS logs that indicates a problem...the process just dies. Syslog does not show anything either.

    I am looking for some advice on whether there is any additional logging/monitoring I can install to try and track down what is going on here. This is really starting to wear on the family (and me).

    thanks,
    Howie





  • #2
    Try dmesg first and see if anything jumps out there.
    Also, try going to the /var/log folder and ls -lt to see if there are any logs there that will help.
    HS3Pro Running on a Raspberry Pi3
    64 Z-Wave Nodes, 168 Events, 280 Devices
    UPB modules via OMNI plugin/panel
    Plugins: Z-Wave, BLRF, OMNI, HSTouch, weatherXML, EasyTrigger
    HSTouch Clients: 3 Android, 1 Joggler



    • #3
      I have looked through all of the system logs and can't find anything. It is frustrating that HSSentry is not restarting the system like it is supposed to. I have a script to run an event that texts me on startup but since HSSentry is not working I have to find out about crashes by my lights or something else not working.




      • #4
        I have been having the same issue for about the past month. Once or twice a week I'll come home and HS will have shut down. I went to the Windows Event Viewer > Windows Logs > Application and there are usually three sequential errors (screen shot below): an ASP.NET error, then a .NET Runtime error, then an Application error. The details from the ASP.NET error are below. Any help would be appreciated.
        - System
        - Provider
        [ Name] ASP.NET 4.0.30319.0
        - EventID 1325
        [ Qualifiers] 49152
        Level 2
        Task 0
        Keywords 0x80000000000000
        - TimeCreated
        [ SystemTime] 2019-05-15T15:40:08.805534900Z
        EventRecordID 19349
        Channel Application
        Computer VM-893945
        Security
        - EventData
        An unhandled exception occurred and the process was terminated.
        Application ID: HS3.exe
        Process ID: 7548
        Exception: System.OutOfMemoryException
        Message: Exception of type 'System.OutOfMemoryException' was thrown.
        StackTrace:
          at System.Net.Sockets.OverlappedAsyncResult.PostCompletion(Int32 numBytes)
          at System.Net.Sockets.BaseOverlappedAsyncResult.CompletionPortCallback(UInt32 errorCode, UInt32 numBytes, NativeOverlapped* nativeOverlapped)
          at System.Threading._IOCompletionCallback.PerformIOCompletionCallback(UInt32 errorCode, UInt32 numBytes, NativeOverlapped* pOVERLAP)



        • #5
          I have similar issues, with high CPU usage after X amount of time. I have also tried different things: disabling plugins, HSTouch and HSPhone, but it is still happening. No apparent messages in the HS log or the Windows Event Log.
          Regards Bart
          ------------------------------------------
          Win7 64Bit on Intel NUCI7 with SSD
          HSPRO 3.
          Devices; 1370 Events; 691

          Jon00 Scripts, JowHue, HSTouch, Plugwise, Z-wave, Ultranetatmo, Ultracam, PHlocation, BLUSBUIRT, MeiHarmony, Buienradar, MEiUnifi Pushover 3P, Random, Nest HSPhone and Blueiris

          Visonic Powermax Alarm System (HS3) Interface: http://www.domoticaforum.eu/viewtopic.php?f=68&t=11129



          • #6
            Another HS crash today. I have no idea when or why. I find it hard to believe that nothing gets logged but that is what is happening.

            Does anyone have any idea how to track this down? Is there a more detailed level of logging I can enable for Homeseer?

            Anybody from Homeseer have any suggestions?



            • #7
              You can enable debug logging by setting DebugMode to the proper value.

              https://homeseer.com/support/homesee..._debugmode.htm



              • #8
                Originally posted by lveatch:
                You can enable debug logging by setting DebugMode to the proper value.

                https://homeseer.com/support/homesee..._debugmode.htm
                That’s an old version of the help file. The correct HS3 one is here: http://help.homeseer.com/help/HS3/st...gins_debugmode
                HS 3.0.0.532: 1963 Devices 1141 Events
                Z-Wave 3.0.1.261: 122 Nodes on one Z-Net



                • #9
                  I have enabled the debug logging...waiting for this to happen again.

                  I have been keeping a closer eye on the system...I noticed this morning that HS is using a ton of CPU. Not sure why it shows up as "(System)_17-May" in top, but going by the PID it is indeed the mono process for HS:

                  top - 07:28:01 up 1 day, 10:09, 2 users, load average: 1.43, 1.13, 0.93
                  Tasks: 138 total, 4 running, 134 sleeping, 0 stopped, 0 zombie
                  %Cpu(s): 66.4 us, 4.7 sy, 0.0 ni, 28.7 id, 0.2 wa, 0.0 hi, 0.0 si, 0.0 st
                  KiB Mem: 1959280 total, 1830492 used, 128788 free, 3884 buffers
                  KiB Swap: 0 total, 0 used, 0 free. 77040 cached Mem

                  PID  USER PR NI   VIRT   RES  SHR S  %CPU %MEM     TIME+ COMMAND
                  1108 root 20  0 222184 74416  708 S 125.4  3.8 472:14.57 (System)_17-May
                  1585 root 20  0 124296 56108 1164 S  10.0  2.9 105:56.15 mono
                  1499 root 20  0 159328 78144    0 S   4.0  4.0  70:36.52 mono
                  1598 root 20  0 101344 38372    0 R   0.7  2.0  25:46.08 mono
                  1637 root 20  0 111564 44096 2888 S   0.7  2.3   6:37.28 mono

                  root@hometrollerSEL:/home/homeseer# ps -eaf | grep 1108
                  root 1108 1106 23 May17 ? 07:55:00 mono HSConsole.exe
                  root 1437 1108 0 May17 ? 00:06:03 /usr/bin/mono /usr/local/HomeSeer/HSPI_drhsIpPlugIn.exe
                  root 1449 1108 0 May17 ? 00:17:22 /usr/bin/mono /usr/local/HomeSeer/HSPI_BLBackup.exe
                  root 1461 1108 0 May17 ? 00:14:38 /usr/bin/mono /usr/local/HomeSeer/HSPI_EasyTrigger.exe
                  root 1482 1108 0 May17 ? 00:14:56 /usr/bin/mono /usr/local/HomeSeer/HSPI_ImperiHome.exe
                  root 1499 1108 3 May17 ? 01:10:42 /usr/bin/mono /usr/local/HomeSeer/HSPI_ZWave.exe
                  root 1528 1108 0 May17 ? 00:05:59 /usr/bin/mono /usr/local/HomeSeer/HSPI_PUSHOVER.exe
                  root 1547 1108 0 May17 ? 00:12:24 /usr/bin/mono /usr/local/HomeSeer/HSPI_BLEditor.exe
                  root 1563 1108 0 May17 ? 00:06:09 /usr/bin/mono /usr/local/HomeSeer/HSPI_DOORBIRD3P.exe
                  root 1585 1108 5 May17 ? 01:46:08 /usr/bin/mono /usr/local/HomeSeer/HSPI_AmbientWeather.exe
                  root 1598 1108 1 May17 ? 00:25:47 /usr/bin/mono /usr/local/HomeSeer/HSPI_ULTRAM1G3.exe
                  root 1619 1108 0 May17 ? 00:03:09 /usr/bin/mono /usr/local/HomeSeer/HSPI_Meiku.exe
                  root 1637 1108 0 May17 ? 00:06:37 /usr/bin/mono /usr/local/HomeSeer/HSPI_MeiUnifi.exe
                  root 1672 1108 0 May17 ? 00:01:17 /usr/bin/mono /usr/local/HomeSeer/HS3Sentry.exe
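
Since the CPU time in the listing above is charged to PID 1108 itself (HSConsole.exe) rather than to one of the plugin processes, it may help to look at the threads inside that process. The odd "(System)_17-May" name could simply be a thread name, though that is a guess. A minimal sketch, assuming the procps version of ps that Ubuntu ships; the function name is made up for illustration:

```shell
#!/bin/sh
# List the threads of one process, busiest first.
# -L shows one row per thread; with -L the comm column holds the thread name.
# Note: pcpu here is CPU time averaged over the thread's lifetime, not the
# instantaneous figure that top displays.
threads_by_cpu() {
    ps -L -p "$1" -o tid,pcpu,comm --sort=-pcpu
}

# Example, using the HSConsole.exe PID from the ps output in this post:
# threads_by_cpu 1108
```

For a live view of the same thing, `top -H -p 1108` shows per-thread CPU for that PID.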






                  • #10
                    So I am still not having any luck here. It appears that the box is running out of memory?...all of the mono processes die when this happens and any ssh connections I have get closed. This explains why HSSentry fails to restart. It seems like HS should have HS sentry running outside of the mono process to deal with situations like this.


                    I have asked homeseer for instructions on how to upgrade Ubuntu to a more recent version as who knows what could be causing this given how out of date my original SEL is (14.04.1). I have never gotten a response. I wish I had known that I would be on my own when purchasing the SEL. I wrongly assumed that Homeseer would be supporting their product from an OS update perspective.




                    • #11
                      I have been running the date command under watch to try to catch when this happens. It looks like a memory issue for sure. Here is what my screen looked like on the last crash:

                      watch: unable to fork process: Cannot allocate memory
                      root@hometrollerSEL:/home/homeseer#
                      Broadcast message from root@hometrollerSEL
                      (unknown) at 5:51 ...

                      The system is going down for reboot NOW!


                      Here is what syslog shows at that time:

                      May 28 05:40:01 hometrollerSEL CRON[17492]: (root) CMD (/usr/local/HomeSeer/register_with_find.sh)
                      May 28 05:45:01 hometrollerSEL CRON[17696]: (root) CMD (/usr/local/HomeSeer/register_with_find.sh)
                      May 28 05:50:01 hometrollerSEL CRON[17888]: (root) CMD (/usr/local/HomeSeer/register_with_find.sh)
                      May 28 05:51:50 hometrollerSEL rsyslogd: [origin software="rsyslogd" swVersion="7.4.4" x-pid="451" x-info="http://www.rsyslog.com"] exiting on signal 15.
                      May 28 05:52:14 hometrollerSEL rsyslogd: [origin software="rsyslogd" swVersion="7.4.4" x-pid="451" x-info="http://www.rsyslog.com"] start
                      May 28 05:52:14 hometrollerSEL rsyslogd: rsyslogd's groupid changed to 104
                      May 28 05:52:14 hometrollerSEL rsyslogd: rsyslogd's userid changed to 101
                      May 28 05:52:14 hometrollerSEL kernel: [ 0.000000] Initializing cgroup subsys cpuset
                      May 28 05:52:14 hometrollerSEL kernel: [ 0.000000] Initializing cgroup subsys cpu
                      May 28 05:52:14 hometrollerSEL kernel: [ 0.000000] Initializing cgroup subsys cpuacct
                      May 28 05:52:14 hometrollerSEL kernel: [ 0.000000] Linux version 3.16.0-031600-generic (apw@gomeisa) (gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #201408031935 SMP Sun Aug 3 23:56:17 UTC 2014
                      May 28 05:52:14 hometrollerSEL kernel: [ 0.000000] KERNEL supported cpus:
                      May 28 05:52:14 hometrollerSEL kernel: [ 0.000000] Intel GenuineIntel


                      Does anyone have any suggestions on how I can figure out why my system is crashing? Hard to believe there are not more entries in the syslog but that is all that there is.
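
Given the "Cannot allocate memory" message above, one thing that may be worth ruling out is whether the kernel's OOM killer logged anything before the reboot. A small sketch; the search patterns are the usual kernel wording, and the log paths in the example are Ubuntu defaults that are assumed here, not confirmed for the SEL:

```shell
#!/bin/sh
# Print kernel OOM-killer lines found in the given log files.
# Exit status is non-zero when nothing matches (plain grep behaviour).
find_oom_events() {
    grep -hiE 'invoked oom-killer|out of memory|killed process' "$@" 2>/dev/null
}

# Example (assumed default log locations):
# find_oom_events /var/log/kern.log /var/log/syslog
```

If this prints nothing, the box may have gone down before the kernel could write anything to disk, which would fit with how little the syslog excerpt shows.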



                      • #12
                        Run top, then press Shift-M to change the sorting to %MEM. Let it run for a while and see whether the top entries climb to a large %.

                        You might try using htop instead of top; it's a bit more user-friendly. If you don't have it, try adding it via
                        Code:
                        sudo apt-get install htop
                        Also, run this command and post the results. Some of the HS Raspberry Pi builds mapped some of their file systems to RAM; they may be doing something similar on the SEL.
                        Code:
                        df -h
                        For any entries with tmpfs in the first column (meaning a file system living in RAM rather than on disk), monitor over time to see if the Use% is growing.

                        To run that (or any command) periodically,
                        Code:
                        watch -n 60 -d df -h
                        The number after -n is number of seconds between updates; -d makes it highlight differences.
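
To cut the df output down to just the RAM-backed entries, something like the following may help. It is a sketch that filters on the file system type column from `df -T` instead of the first column, because on some Ubuntu releases RAM-backed mounts may be listed as "none" there. The function name is made up for illustration:

```shell
#!/bin/sh
# Keep the header line plus every row whose Type column is tmpfs.
# Reads df output on stdin, so it also works on output saved earlier.
# Expects the layout of "df -hPT": Filesystem Type Size Used Avail Use% Mounted on
tmpfs_rows() {
    awk 'NR == 1 || $2 == "tmpfs"'
}

# Usage:
# df -hPT | tmpfs_rows
```

The -P flag keeps each file system on a single line, which makes the column test reliable.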

                        Maybe not an issue, but I see that they are logging cron jobs, which can add up over time. This can be muted by creating a file at /etc/rsyslog.d/cron.conf with these contents:
                        Code:
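                        # mute info messages generated by cron jobs
                        # (a rule needs an action; "stop" discards the matching messages)
                        # note: files in /etc/rsyslog.d/ load in sorted order, so this rule must
                        # load before the default rules (50-default.conf on Ubuntu) to take effect
                        cron,authpriv.info stop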
                        Then restart rsyslogd via
                        Code:
                        sudo service rsyslog restart
                        or just reboot.

                        I mention this because on some of the raspi builds, they mapped /var/log to a tmpfs.
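
Because top and watch die along with everything else when the box runs out of memory, it may be more useful to write periodic snapshots to a file and read them back after the next crash. A minimal sketch, assuming procps ps and a Linux /proc. The log path and function name are illustrative choices, and /var/tmp is used on the assumption that it lives on disk, not in a tmpfs:

```shell
#!/bin/sh
# Append a timestamped snapshot of the biggest memory users to a log file.
LOG="${LOG:-/var/tmp/memwatch.log}"   # illustrative default, change as needed

snapshot_mem() {
    n="${1:-5}"                       # how many processes to record
    {
        echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="
        # Overall free memory as the kernel reports it.
        grep -E '^(MemFree|MemAvailable):' /proc/meminfo 2>/dev/null
        # RSS is resident memory in KiB; header line plus the top $n rows.
        ps -eo pid,rss,pmem,args --sort=-rss | head -n "$((n + 1))"
    } >> "$LOG"
}

# Example: record the top 5 once a minute until interrupted.
# while true; do snapshot_mem 5; sleep 60; done
```

After a crash, the last few snapshots in the log should show which process was growing.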



                        • #13
                          Thanks for the suggestions. I am continuing to keep an eye on things. I did find that the BLbackup plugin seems to be using a lot of memory even after my daily backup is complete...not sure if this is my issue but I posted in the BLbackup forum to see if anyone else has run into this:

                          [Attached image: BLbackup2.png]



                          • #14
                            Very interesting, that does seem high. Will be interested if it grows again after subsequent backup runs.



                            • #15
                              I manually triggered a couple of backups...looks like it is indeed growing:
                              [Attached image: BLbackup3.png]

