ECC memory: worth it for HomeSeer, or not?

  • ECC memory: worth it for HomeSeer, or not?

    The memory itself isn't all that much more expensive, but the motherboards that run it do seem to be quite a bit more expensive. Or, am I wrong about that? If not, the decision to utilize ECC memory can start to get pricey for that reason.

    On the other hand, for most people, HomeSeer does run 24/7, and it's something you want to "just work" with high reliability. Therefore, you typically want to avoid crashes and hangs and other errors that might arise when bits get inadvertently flipped in memory.

    If you're able to run code that has a tiny memory footprint, there's less probability of being affected. Also, a watchdog should reboot a system that crashes or hangs. So, there might be ways to dance around the problem, if it even is a problem at all. Are there software techniques that would effectively obviate the need for ECC, perhaps at some mild performance penalty?
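One software technique of the kind asked about is redundancy with majority voting: keep several independent copies of a critical value and let a read return whichever value most copies agree on. The block below is only a minimal sketch under stated assumptions. `VotedValue` is a hypothetical name, not part of HomeSeer or any library, and it protects only the bytes explicitly stored in it. Code, the interpreter, and the operating system stay unprotected, so this narrows the exposure rather than replacing ECC.

```python
from collections import Counter


class VotedValue:
    """Hypothetical helper: hold three separate copies of a byte string.

    Each copy lives in its own bytearray buffer, so a flipped bit in one
    buffer does not touch the other two.
    """

    def __init__(self, data: bytes):
        self.set(data)

    def set(self, data: bytes) -> None:
        # bytearray(data) allocates a fresh buffer for every copy.
        self._copies = [bytearray(data) for _ in range(3)]

    def get(self) -> bytes:
        # Count identical copies and take the most common one.
        counts = Counter(bytes(c) for c in self._copies)
        winner, votes = counts.most_common(1)[0]
        if votes < 2:
            # All three copies differ, so there is no majority to trust.
            raise RuntimeError("no two copies agree")
        if votes < 3:
            # One copy disagreed: rewrite all copies from the majority.
            self._copies = [bytearray(winner) for _ in range(3)]
        return winner
```

The cost is three times the memory for the wrapped data plus a comparison on every read, which is the "mild performance penalty" side of the trade.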

    Really, though, maybe we should start with: is it even a problem worth addressing? How would you know whether it was or wasn't? Let's be clear: commercial grade servers do use it, and so this isn't a purely theoretical question. Those in the know seem to be voting in favor of it using hard cash.

    Assume HS3 either is or becomes stable enough that the question is worth considering. I can't predict what future releases may hold, but so far my current system seems solid. My "production" system (i.e. my in-use HomeSeer system) may be parked where it is for quite a while, and I'll probably use expendable systems to track the current releases and for deciding when to upgrade my production system.

    Bottom line: What's a reasoned argument for utilizing ECC or not on a production HomeSeer system?
    Last edited by NeverDie; January 21st, 2015, 11:52 AM.

  • #2
    My opinion is that it's certainly worth it. Check out this thread that was going on at XtremeSystems a few years ago: http://www.xtremesystems.org/forums/...red-Data/page2

    At the beginning of the year I threw together a fast script that I posted here for others to check their arrays for bit errors in files. Since then I have both increased the number of arrays here (eight 8-way 1TB drive RAID-6's on the test server) and constrained the array sizes to cut down on UBE percentages. With the larger sampling sizes I see an average of ~20 bit errors in files per month (corrupted/changed files). In none of the cases did the RAID controller discover any errors from the RAID level on down (all silent errors). In light of this I pretty much got fed up with just identifying the files that had issues and doing manual restores, and went to modify the old script to add in par2 support for entire mount points. PAR2 is a pretty bad implementation of Reed-Solomon (inefficient in compute cycles, and the program interface is very bad for handling multiple, i.e. thousands/millions, of files). It needs a major rewrite, or does someone know of a different program?

    Anyway, I managed to force it into a scripted state that passably works and it's attached. It should be able to run on most *nix (linux and bsd like systems (solaris, irix, et al)) as long as the programs are installed (sha1/md5/par2/et al). This would include cygwin under windows and similar. I've tested it here under linux with ~2,500,000 files of various sizes (32KiB up to ~120GiB). It's multi-threaded so the more cores you have (assuming you don't have an I/O bottleneck) the better.

    Anyway, I'm posting it in hopes that it can help someone else who's also pulling their hair out and may not have the time to throw their own utility together, especially since BTRFS/ZFS is at least several years away from a production standard (unless you're running Solaris, in which case ZFS is good). DIX/DIF support under T13 (SATA) is equally far off, so this is the only real option at this point if data integrity is a major concern.
    He's reporting around 20 files per month which were altered. Who knows how many flipped bits there were. And he was using top-flight hardware with ECC memory. Anything you can do to minimize this is a good thing. You never know if it's a flipped bit or a memory error which led to your HS database getting corrupted.
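For anyone who wants to try the detection half of that idea, here is a minimal Python sketch. It is not the script from the XtremeSystems thread: it records a SHA-256 digest for every file under a directory and later reports files whose contents no longer match. The function names are hypothetical, and unlike par2 it only detects changes and cannot repair them.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its current digest."""
    manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            manifest[os.path.relpath(path, root)] = file_digest(path)
    return manifest


def find_changed(root, manifest):
    """Return the sorted relative paths that are missing or differ."""
    changed = []
    for rel, expected in sorted(manifest.items()):
        path = os.path.join(root, rel)
        if not os.path.isfile(path) or file_digest(path) != expected:
            changed.append(rel)
    return changed
```

A digest mismatch cannot tell silent corruption apart from a legitimate edit, so a check like this only makes sense on files that are not supposed to change, such as backups or archived copies of the HS database.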
    Originally posted by rprade
    There is no rhyme or reason to the anarchy a defective Z-Wave device can cause

    • #3
      Not sure I would bother. I doubt that I or the vast majority of users on here use anything with ECC memory, and we get on fine. I used a standard netbook-type PC for four-odd years, and whilst I had my fair share of issues, I doubt ECC memory would have solved them. You really have no idea what you are running with HS, plugins, and various other applications: you could have as much ECC memory as you want, but if applications aren't writing to it correctly in the first place, it would be a wasteful exercise.
      My Plugins:

      Pushover 3P | DoorBird 3P | Current Cost 3P | Velleman K8055 3P | LAMetric 3P | Garadget 3P | Hive 3P |
      Yeelight 3P | Nanoleaf 3P

      • #4
        Originally posted by NeverDie
        Bottom line: What's a reasoned argument for utilizing ECC or not on a production HomeSeer system?
        I think the bigger benefit of going to an ECC-capable system is that ECC support exists only in server-class hardware, which is made to a higher standard, with better components, design, etc. (and priced accordingly).

        The actual rate of uncorrected errors (bitrot) on an average (non-broken) computer is very, very small...
        HW: HS3 w/ Win8.1 on ASRock C2550d4i. Digi AnywhereUSB, Hubport, Edgeport, UZB, Z-trollers, PLCBUS, SONOS, GC-100, iTach IP2SL, WF2IR, IP2IR, RFXtrx433, Harmony Hubs, Hue, Ademco Vista 128BP, NetAtmo, NetAtmo Welcome

        Google Search for HomeSeer Forum

        • #5
          I've been running HS for about 10 years. I've always used orphaned computers, either my own cast offs or refurbished off-lease hardware. I've certainly had problems, but if there were any caused by RAM they were WAY down the list in terms of frequency and impact. I would bet there are dozens of weaker links in the typical home brew HA system than the computer and its memory. Unless you have unlimited resources, there are almost certainly more cost effective places for hardware upgrade spending that should be addressed first.
          Mike______________________________________________________________________________
          HS3 Pro Edition 3.0.0.548

          HW: Stargate | NX8e | CAV6.6 | Squeezebox | PCS | WGL 800RF, Rain8Net+ | RFXCOM | QSE100D | Vantage Pro | Green-Eye | X10: XTB-232, -IIR | Edgeport/8 | Way2Call | Ecobee3
