
Using the ALIVE pin to reset an Arduino



    I noticed that if a board gets disconnected, in most cases a power cycle is enough to get it up and running.

    Anybody looked into a solution to connect the alive PIN to a (delayed) relay that power cycles the Arduino in case of lost connection?

    #2
    Originally posted by fvhemert View Post
    I noticed that if a board gets disconnected, in most cases a power cycle is enough to get it up and running.

    Anybody looked into a solution to connect the alive PIN to a (delayed) relay that power cycles the Arduino in case of lost connection?
    I use the Freetronics watchdog timer module on my boards. These boards will perform a hard reset once every 5 minutes if they are not told the Arduino is alive, so I have to toggle an output once every 4 minutes. They have completely eliminated boards staying disconnected for more than 5 minutes.

    I have combined my "pet the dog" pulses with an event-driven Arduino testing routine. In addition to letting the watchdog board know the board is alive, I also loop the output back to an input and use events to verify that the input is pulled high when the output goes high. I also use a timer to measure the response time of the Arduino and write it to the log. Failures of these events increment a counter. If I have more than 6 failures within 10 minutes (this has happened only once), a Pushover alert is sent and HomeSeer is restarted. I have 5 Arduinos in production; 2 are in critical areas and one is semi-critical. For these reasons, I have created a self-testing and self-healing system to monitor them.
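
    The loopback check and failure counting described above can be sketched roughly as follows. This is only an illustration in Python, not the HomeSeer events themselves: the `LoopbackMonitor` class and its field names are hypothetical, and only the thresholds (more than 6 failures within 10 minutes) come from the post.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class LoopbackMonitor:
    """Counts loopback test failures inside a sliding time window (hypothetical helper)."""
    max_failures: int = 6      # from the post: escalate on MORE than 6 failures...
    window_s: float = 600.0    # ...within 10 minutes
    failures: deque = field(default_factory=deque)

    def record(self, now_s: float, output_high: bool, input_high: bool) -> bool:
        """Record one test result; return True when escalation is warranted."""
        # Drop failures that have aged out of the window.
        while self.failures and now_s - self.failures[0] > self.window_s:
            self.failures.popleft()
        # The looped-back input is expected to follow the output.
        if input_high != output_high:
            self.failures.append(now_s)
        return len(self.failures) > self.max_failures


monitor = LoopbackMonitor()
# Output driven high but input never followed: one failure is logged.
escalate = monitor.record(now_s=0.0, output_high=True, input_high=False)
```

    In this sketch the caller decides what "escalation" means (for example a Pushover alert and a HomeSeer restart, as in the post).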

    I have been thinking about switching to SwitchDoc Labs Dual WatchDog Timer because of the longer (30-220 second) interval.

    I'm sure you could fashion a circuit using the alive pin, but it would require that the circuit provide a short power cycle for the boards. My Arduinos are powered by PoE Ethernet shields, so a power cycle is much more difficult to implement than a watchdog hard reset. I have also found that a hard reset while the board is powered is more reliable at reconnecting than a power cycle.
    HS4 Pro, 4.2.19.16 Windows 10 pro, Supermicro LP Xeon



      #3
      Randy

      I've kinda been toying with setting this up too... thanks for the link to the boards.

      Since I've set mine up using battery backup, I've had no issues with the Mega boards. Must say I'm very happy with the Arduino setup.

      Cheers..Pete
      HS 2.2.0.11



        #4
        Originally posted by petez69 View Post
        Randy

        I've kinda been toying with setting this up too... thanks for the link to the boards.

        Since I've set mine up using battery backup, I've had no issues with the Mega boards. Must say I'm very happy with the Arduino setup.

        Cheers..Pete
        The Freetronics boards are from your neck of the woods. They will always force a reset if the board fails to connect. Like yours, my boards are very reliable; this is just another layer.
        HS4 Pro, 4.2.19.16 Windows 10 pro, Supermicro LP Xeon



          #5
          Randy

          I've got the circuit up for the Freetronics boards and I think I might go grab some parts and build up a few. Agreed, they are another safeguard to nirvana :-)
          HS 2.2.0.11



            #6
            Thanks for the replies. I was thinking along the same lines but am still in doubt about one thing.

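            According to the manual:
            "The Alive pin is a pin on the board that when connected to Homeseer will be “HIGH” and has a watchdog on it that when the connection to Homeseer is lost it will go “LOW”. This can be used for backup system switchover."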

            My first thought would be to connect an NE555 timer to the alive pin: if ALIVE goes LOW because of a lost connection, the timer starts, and after 2 minutes a reset is triggered. If the connection is restored within the 2 minutes, the timer is reset and no reset is triggered. A simple and easy solution.
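
            As a rough sizing check for the 2-minute delay, assuming the NE555 is wired in its standard monostable (one-shot) configuration with timing resistor R and timing capacitor C (the component values below are illustrative assumptions, not from the post):

```latex
t \approx 1.1\,R\,C
\quad\Rightarrow\quad
R\,C \approx \frac{120\ \text{s}}{1.1} \approx 109\ \text{s}

\text{e.g. } R = 1.1\ \text{M}\Omega,\; C = 100\ \mu\text{F}
\;\Rightarrow\;
t \approx 1.1 \times \left(1.1 \times 10^{6}\right) \times \left(100 \times 10^{-6}\right)\ \text{s} \approx 121\ \text{s}
```

            Delays this long depend heavily on capacitor tolerance and leakage, so the real interval may differ noticeably from the calculated one.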

            This type of operation will cover the "lost connection" scenario but will not detect an Arduino board that is in a locked/hung state. If the Arduino locks up, it will be disconnected from HS but will never drop the alive pin, which remains high indefinitely.

            To address this, we need the alive pin to toggle at a regular interval (blink). The attached timer should look for alive pin changes and reset itself every time a change is detected. In this way we can cover both scenarios:
            • The board is running but has lost connectivity to HS --> detected by the sketch and stops the blinking
            • The board is hanging --> the software stops and so does the blinking
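
            The difference between the two approaches can be illustrated with a small simulation. This is a sketch only: the function names, the sampled-pin representation, and the 120-second timeout are assumptions for illustration, and it models the timer logic in Python rather than the actual 555 hardware.

```python
def level_watchdog_fires(samples, timeout_s):
    """Reset fires once the pin has been continuously LOW for timeout_s.

    samples: list of (time_s, pin_high) tuples in time order.
    """
    low_since = None
    for t, high in samples:
        if high:
            low_since = None          # pin HIGH: timer is held off
        else:
            if low_since is None:
                low_since = t         # pin just went LOW: start timing
            if t - low_since >= timeout_s:
                return True
    return False


def edge_watchdog_fires(samples, timeout_s):
    """Reset fires once the pin has not changed state for timeout_s."""
    last_level = None
    last_change = None
    for t, high in samples:
        if last_level is None or high != last_level:
            last_level, last_change = high, t   # any edge restarts the timer
        elif t - last_change >= timeout_s:
            return True
    return False


# A hung board whose alive pin is stuck HIGH, sampled every 10 s for 5 minutes.
hung_high = [(t, True) for t in range(0, 300, 10)]
missed = not level_watchdog_fires(hung_high, 120)   # level-based timer never starts
caught = edge_watchdog_fires(hung_high, 120)        # edge-based timer expires
```

            Under these assumptions the level-based timer only catches a pin that goes LOW, while the edge-based timer catches any pin that stops changing, whether it is stuck HIGH or LOW.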


            I do not want to start adding additional code to the sketch and therefore:

            GREIG: would it be possible to change the behavior of the alive pin to BLINK instead of ON ?? What do you think?



              #7
              Originally posted by fvhemert View Post
              Thanks for the replies. I was thinking along the same lines but am still in doubt about one thing.

              According to the manual:
              "The Alive pin is a pin on the board that when connected to Homeseer will be “HIGH” and has a watchdog on it that when the connection to Homeseer is lost it will go “LOW”. This can be used for backup system switchover. "

              My first thought would be to connect an NE555 timer to the alive pin: if ALIVE goes LOW because of a lost connection, the timer starts, and after 2 minutes a reset is triggered. If the connection is restored within the 2 minutes, the timer is reset and no reset is triggered. A simple and easy solution.

              This type of operation will cover the "lost connection" scenario but will not detect an Arduino board that is in a locked/hung state. If the Arduino locks up, it will be disconnected from HS but will never drop the alive pin, which remains high indefinitely.

              To address this, we need the alive pin to toggle at a regular interval (blink). The attached timer should look for alive pin changes and reset itself every time a change is detected. In this way we can cover both scenarios:
              • The board is running but has lost connectivity to HS --> detected by the sketch and stops the blinking
              • The board is hanging --> the software stops and so does the blinking


              I do not want to start adding additional code to the sketch and therefore:

              GREIG: would it be possible to change the behavior of the alive pin to BLINK instead of ON ?? What do you think?
              That is essentially what I do, but without leveraging the Alive pin. The watchdog boards I use are based on a 555 timer, and I use another pin controlled by HomeSeer to reset the timer. If the board becomes disconnected, the pin stops toggling; if the board hangs, the pin stops toggling. Alternatively, you could set the digital pin to blink. No extra code in the sketch, just a pin added in the configuration. How would a "blinking" alive pin be better? I still use the Alive pin to control failover relays that revert to redundant controls of my heating system in the event of a HomeSeer or Arduino failure, and I definitely don't want it to blink. This method also lets me know if there is a problem, as HomeSeer sees that the input pin monitoring the output stops changing state.
              HS4 Pro, 4.2.19.16 Windows 10 pro, Supermicro LP Xeon



                #8
                " How would a "blinking" alive pin be better? "

                The current sketch checks for connectivity to HS; if it is lost, it sets the alive pin to LOW. If the sketch itself hangs, it will never run the routine that sets the alive pin to LOW, and the HIGH status will be a false positive.

                An output that is toggled from a routine in the sketch will reveal both conditions.



                  #9
                  Originally posted by rprade View Post
                  That is essentially what I do, but without leveraging the Alive pin. The watchdog boards I use are based on a 555 timer, and I use another pin controlled by HomeSeer to reset the timer. If the board becomes disconnected, the pin stops toggling; if the board hangs, the pin stops toggling. Alternatively, you could set the digital pin to blink. No extra code in the sketch, just a pin added in the configuration. How would a "blinking" alive pin be better? I still use the Alive pin to control failover relays that revert to redundant controls of my heating system in the event of a HomeSeer or Arduino failure, and I definitely don't want it to blink. This method also lets me know if there is a problem, as HomeSeer sees that the input pin monitoring the output stops changing state.
                  May I ask whether it really happens that often that a board is disconnected but still alive?
                  I wonder why an external watchdog is needed when the board already has both a SW and a HW watchdog onboard.
                  If the board is alive and not hanging in a weird state, the internal watchdog clearly won't trigger.



                    #10
                    Originally posted by doppiaemme View Post
                    May I ask whether it really happens that often that a board is disconnected but still alive?
                    I wonder why an external watchdog is needed when the board already has both a SW and a HW watchdog onboard.
                    If the board is alive and not hanging in a weird state, the internal watchdog clearly won't trigger.
                    First of all, I have not had a situation where a disconnected board has the Alive pin remain high in almost a year. There was a problem at some point, but Greig has corrected it in the plug-in. I have had boards hang in the past, but it has been 5-6 months since this has happened. The methods I use are just a fail-safe.

                    While your idea of blinking the alive pin would work fine, the method I use of toggling an output and monitoring an input attached to it through HomeSeer gives me the ability to monitor whether the board is functioning in HomeSeer, regardless of its connected state. The toggling of the output also resets the 555 timer boards. If the 555 doesn't get reset once every 3 minutes, it will force a reset of the attached Arduino. I also increment a counter each time the input fails to reflect a change in the output. If this counter exceeds a threshold within a certain amount of time, I restart the plug-in. A higher threshold on the counter will cause a restart of HomeSeer and a Pushover message to me. These last two actions have not occurred in over a year because Greig has made the plug-in so stable.

                    While all of the above is designed to restore operation if a board or the plug-in hangs, I also use the Alive pin to switch my heating and DHW over to redundant control. It will run on redundant control if the critical boards do not reconnect, either due to plug-in, board, network or HomeSeer failures.

                    The bottom line (what an overused phrase) is that the boards and plug-in are very reliable in my installation now and most of this is unnecessary. Some of it is a legacy of growing pains in the plug-in during the first year of use, but the balance of the measures are just from an abundance of caution due to my relying on Arduino control for heat and DHW. It has been almost two years of using this system and we have yet to awaken to a cold house.

                    The added value of all of this redundancy is that I have been alerted to a failed zone valve and boiler control board. Due to all of the monitoring I am able to be notified if things that should be happening fail to do so.
                    HS4 Pro, 4.2.19.16 Windows 10 pro, Supermicro LP Xeon



                      #11
                      Randy, I've bought all the bits to make my own watchdog-timers. I've never had a board hang but thought I'd add this as it looks cool :-)
                      HS 2.2.0.11



                        #12
                        Originally posted by rprade View Post
                        First of all, I have not had a situation where a disconnected board has the Alive pin remain high in almost a year. There was a problem at some point, but Greig has corrected it in the plug-in. I have had boards hang in the past, but it has been 5-6 months since this has happened. The methods I use are just a fail-safe.
                        I am not using the alive pin at the moment but frequently, 1 to 2 times a month, see boards (esp the ones that have 1-wire temp sensors connected) that hang. The situation improved significantly since I implemented an event that resets the boards twice a day. But it is not enough to solve the issue completely.


                        Originally posted by rprade View Post
                        While your idea of blinking the alive pin would work fine, the method I use of toggling an output and monitoring an input attached to it through HomeSeer gives me the ability to monitor whether the board is functioning in HomeSeer, regardless of its connected state.
                        Yes, I like this idea and have considered it, but if it detects an error state there is no way to reset the board from HS. This is where the hardware solution has to come in.

                        Originally posted by rprade View Post
                        The toggling of the output also resets the 555 timer boards. If the 555 doesn't get reset once every 3 minutes it will force a reset of the attached Arduino.
                        It is smart to reset the timer based on the change of the output instead of starting it when the output goes high or low. The constant change of the output level is the only way to detect that there is connectivity to HS and the board is still running.

                        Have you built the NE555 based solution yourself? If so, do you mind sharing the logic on how to detect the output toggle?

                        Originally posted by rprade View Post
                        I also increment a counter with each failure of the input to reflect a change in the output. If this counter exceeds a threshold within a certain amount of time, I restart the plug-in. A higher threshold on the counter will cause a restart of HomeSeer and a Pushover message to me. These last two actions have not occurred in over a year because Greig has made the plug-in so stable.
                        I have also implemented a counter, but it triggers a reset of the board. I have not tried restarting the plugin, but I believe it would not solve the problem with the disconnected board. I agree the plugin is stable (all other boards keep running); the problem, IMHO, originates in the board.

                        Originally posted by rprade View Post
                        While all of the above is designed to restore operation if a board or the plug-in hangs, I also use the Alive pin to switch my heating and DHW over to redundant control. It will run on redundant control if the critical boards do not reconnect, either due to plug-in, board, network or HomeSeer failures.

                        The bottom line (what an overused phrase) is that the boards and plug-in are very reliable in my installation now and most of this is unnecessary. Some of it is a legacy of growing pains in the plug-in during the first year of use, but the balance of the measures are just from an abundance of caution due to my relying on Arduino control for heat and DHW. It has been almost two years of using this system and we have yet to awaken to a cold house.
                        Wish I could say the same, yes the plugin is very stable but some of the boards every now and then simply just stop. It might be the Ethernet shield, maybe the board or the oneWire connection, I do not know but for now I need a hardware solution to address this.

                        The added value of all of this redundancy is that I have been alerted to a failed zone valve and boiler control board. Due to all of the monitoring I am able to be notified if things that should be happening fail to do so.



                          #13
                          Just wondering if it's a "clone board" issue? I've had various issues over time with clone MEGA and W5100 boards. There was no pattern; I had to swap them out until I found a stable configuration.

                          Are you using UNO boards or MEGA ?
                          HS 2.2.0.11



                            #14
                            Using the clone MEGA boards.



                              #15
                              Originally posted by fvhemert View Post

                              Have you built the NE555 based solution yourself? If so, do you mind sharing the logic on how to detect the output toggle?
                              I used the Freetronics boards linked in post 2 above. The output pin going high resets the timer on the board. If the board doesn't see the output pin go high once in 4 minutes, it resets the Arduino. The 4 pads on the board match up perfectly with the Arduino pin header for ground, reset and 5V; I run the input pad to the pin I am using on the Arduino for testing. Not that it matters, but pin 18 is the output and pin 19 is the input. I connect these two pins together and connect them to the input on the Freetronics board.

                              Once every 3 minutes I run this event for each of the boards. I sequence them 30 seconds apart and there are 5 boards.
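
                              The staggered timing described above can be sketched as a simple schedule calculation. The helper name `staggered_start_times` is hypothetical; the 3-minute period, 30-second spacing and 5 boards are taken from the post.

```python
def staggered_start_times(n_boards, period_s, stagger_s, cycles):
    """Return {board_index: [test start times in seconds]} for staggered tests."""
    # The last board's offset must stay inside one period, otherwise its
    # test would collide with the first board's next cycle.
    if (n_boards - 1) * stagger_s >= period_s:
        raise ValueError("stagger too large for the chosen period")
    return {
        board: [cycle * period_s + board * stagger_s for cycle in range(cycles)]
        for board in range(n_boards)
    }


# 5 boards, tested every 180 s, sequenced 30 s apart, for two cycles.
schedule = staggered_start_times(n_boards=5, period_s=180, stagger_s=30, cycles=2)
```

                              With these numbers the last board starts 120 seconds into each cycle, leaving a 60-second gap before the first board's next test.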

                              [Image: Capture.PNG]

                              If the input toggles to on after the first event, this one runs.

                              [Image: Capture2.PNG]

                              If the input fails to toggle within 20 seconds, this event runs. If there is a failure, it starts the failure timer and increments the cumulative failures as well as the individual board failures. The Pushover message and board reset have been disabled, because I now look at the cumulative failures rather than an individual board count in a separate event. If the cumulative failures exceed 6 within a 25-minute span, I force a reset of the boards. If the count reaches 10, I restart HomeSeer.
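
                              The two escalation thresholds can be sketched as below. This is an illustration only, not the HomeSeer event logic: the `Action` enum and `escalation` function are hypothetical names, and it assumes that both thresholds are evaluated against the same 25-minute window, which the post does not state explicitly.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"
    RESET_BOARDS = "reset boards"
    RESTART_HOMESEER = "restart HomeSeer"


def escalation(failure_times_s, now_s, window_s=25 * 60,
               reset_above=6, restart_at=10):
    """Pick an action from the number of failures inside the time window."""
    # Only failures within the last window_s seconds count.
    recent = [t for t in failure_times_s if 0 <= now_s - t <= window_s]
    count = len(recent)
    if count >= restart_at:        # "if the count reaches 10"
        return Action.RESTART_HOMESEER
    if count > reset_above:        # "exceed 6 within a 25-minute span"
        return Action.RESET_BOARDS
    return Action.NONE


# Seven failures spread over the last ten minutes.
action = escalation([60 * m for m in range(7)], now_s=600)
```

                              Checking the higher threshold first means a burst of failures goes straight to the stronger action instead of stopping at the board reset.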

                              [Image: Capture3.PNG]

                              This event is a cleanup for the timers in case I disable testing mid-run.

                              [Image: Capture4.PNG]


                              The 555 timer boards will work autonomously as long as an output is toggled once every 3 minutes. If HomeSeer sees failure counts exceeded it can also force a reset of the boards or a restart of HomeSeer. Restarting the plug-in would usually suffice, but I prefer to do a complete system restart if the failures are excessive. This will reboot all 3 Z-Nets, shut down HomeSeer and reboot the HS server. I have not triggered a restart due to Arduino failures (except during testing) in over a year.
                              HS4 Pro, 4.2.19.16 Windows 10 pro, Supermicro LP Xeon

