Jon00 DataScraper/JSON Parser Script For Homeseer 3 and Homeseer 4

This topic is closed.
This is a sticky topic.

    Actually, it seems the problem isn't the tabs but an LF (hex 0A) in there that makes life hard for the sign. I got the replace to work with the tabs, but not with the 0A.

    Comment


      Try Version 1.0.7 on my site.

      I have added the following tags to the replace function:

      {CHRXX} where XX is the decimal number for the ASCII character.

      I have not tackled the spaces, however the following should work:

      DeviceText1=Veimelding RV159 [0] [Replace "{CHR13}",""] [Replace "{CHR10}",""] [Replace "{CHR9}",""]
      Jon
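To illustrate what the {CHRXX} tags are meant to achieve, here is a rough Python sketch. It is not the script's own code: `expand_chr_tags`, `apply_replacements` and the sample string are hypothetical names and data made up for this example. It only assumes, as described above, that XX is the decimal ASCII code (13 = CR, 10 = LF, 9 = tab).

```python
import re

def expand_chr_tags(text: str) -> str:
    """Turn each {CHRnn} tag into the character with decimal code nn."""
    return re.sub(r"\{CHR(\d+)\}", lambda m: chr(int(m.group(1))), text)

def apply_replacements(value: str, pairs) -> str:
    """Apply (find, replace) pairs in order, expanding {CHRnn} tags first."""
    for find, repl in pairs:
        value = value.replace(expand_chr_tags(find), expand_chr_tags(repl))
    return value

# Hypothetical scraped text containing a tab, a CR and an LF.
raw = "Veimelding\tRV159\r\nstengt"

# Same three replacements as in the DeviceText1 line above.
cleaned = apply_replacements(
    raw,
    [("{CHR13}", ""), ("{CHR10}", ""), ("{CHR9}", "")],
)
print(repr(cleaned))
```

Because each control character is replaced with an empty string, the words on either side are joined with nothing between them. If a separator is wanted, the replacement text could be a space instead.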

      Comment


        That's great, thanks! Will test and get back to you.

        Comment


          Formatted String

          Jon00,

          I get the text I want returned via the regex I've constructed, but the <br> HTML tag is being stripped and replaced with nothing, so I get GrilledCheese instead of Grilled Cheese. Other sections of the returned text have spaces embedded in them and those work fine. I'm sure I need to add a replacement string but don't know where to begin. Replacing <br> with a space would most likely fix my issue.

          Thanks,

          HittR

          Comment


            If the <br> tag is being stripped, there is nothing to replace.

            I would suggest you set StripHTML=0 in the ini file so that you get the <br> tag returned within the regex.

            You then would use something like:

            DeviceTextX=[0] [replace "<br>"," "]
            Jon
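The ordering problem can be sketched in Python. This is only an illustration: `strip_html` is a simple stand-in written for this example, and how the script actually strips HTML when StripHTML=1 is not shown in this thread.

```python
import re

def strip_html(text: str) -> str:
    """Naive tag stripper, used here only as a stand-in for StripHTML=1."""
    return re.sub(r"<[^>]+>", "", text)

fragment = "Grilled<br>Cheese"

# Tags stripped first: the <br> is already gone, so the replace finds nothing.
stripped_first = strip_html(fragment).replace("<br>", " ")

# Tags left in place (as with StripHTML=0): the replace can insert the space.
replaced_first = fragment.replace("<br>", " ")

print(stripped_first)
print(replaced_first)
```

The first result has the two words run together and the second has them separated by a space, which matches the behaviour described above.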

            Comment


              That did it!

              Sorry for being thick
              You are correct that I had StripHTML set to 1. Changing to 0 and using your replacement string worked perfectly.

              Thanks for the prompt assistance and great script btw.

              Comment


                Jon,

                Would your scraper be able to pull the Energy Produced and Energy Used information in the right most column from the screen that is attached?
                Attached Files

                Comment


                  If you view the source in a web browser and can find the metrics in the text, then yes. I suspect however with these energy pages, they are using inline elements which cannot be parsed easily.
                  Jon

                  Comment


                    Jon,

                    I'm no expert in coding but this is what I found.

                    These look like the statements I would need to scrape.

                    <div class="sp-energy-mix-label" translate="FROM_SOLAR">From Solar</div>
                    <div class="sp-energy-kws">5.3 kWh</div>

                    <div class="sp-energy-mix-used" translate="TOTAL_ENERGY_USAGE">Total Energy Used</div>
                    <div class="sp-energy-kws">10.7 kWh</div>

                    <div class="sp-energy-mix-label" translate="FROM_GRID">From Grid</div>
                    <div class="sp-energy-kws">5.4 kWh</div>

                    They were taken from this page. But I am not sure how to tell if I can scrape the data.
                    Attached Files

                    Comment


                      That should scrape well.

                      Try the following pattern:

                      Pattern1=(?s)"sp-energy-kws">(.*?)</div>

                      It should return all the results in the ini file under 0, 1, 2, etc. As they will always be in the same order, you just need to work out which metric is which and assign a device to each one.

                      If you just want the value, amend the pattern as:

                      Pattern1=(?s)"sp-energy-kws">(.*?)kWh</div>
                      Jon
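For anyone who wants to try the two patterns outside HomeSeer first, here is a small Python sketch run against the HTML posted earlier in the thread. Python's `re` module is only a stand-in for the regex engine the script uses, and the variable names are made up for this example. The constructs involved, `(?s)` and the lazy `(.*?)`, should behave the same way in both.

```python
import re

# The HTML fragment posted earlier in the thread.
html = """
<div class="sp-energy-mix-label" translate="FROM_SOLAR">From Solar</div>
<div class="sp-energy-kws">5.3 kWh</div>

<div class="sp-energy-mix-used" translate="TOTAL_ENERGY_USAGE">Total Energy Used</div>
<div class="sp-energy-kws">10.7 kWh</div>

<div class="sp-energy-mix-label" translate="FROM_GRID">From Grid</div>
<div class="sp-energy-kws">5.4 kWh</div>
"""

# First pattern: captures the value together with its unit.
with_units = re.findall(r'(?s)"sp-energy-kws">(.*?)</div>', html)

# Second pattern: stops before "kWh", so only the number is captured.
# The capture keeps the space before "kWh", hence the strip() here.
numbers = [m.strip() for m in re.findall(r'(?s)"sp-energy-kws">(.*?)kWh</div>', html)]

# Results come back in page order: solar, total used, grid.
for index, (text, number) in enumerate(zip(with_units, numbers)):
    print(index, text, number)
```

Note that the second pattern's raw capture ends with a space (for example "5.3 "), so it may need trimming before being used as a numeric device value.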

                      Comment


                        I seem to have a problem: the first screen I get when going to the site is a login screen, and once logged in I then need to get to the screen I posted above. It does not look like I can do that from your ini file as it only accepts one URL.

                        Comment



                          Jon, would you agree with the above statement or am I missing an option?

                          Judd

                          Comment


                            Yes, the fact that you need to navigate to a new page once logged in makes the process very difficult to automate.
                            Jon

                            Comment


                              Jon, Thanks for spending the time with me on this.

                              So once I log onto the website, the first page does have the information I'm looking to scrape, so I don't need to go to another screen. But when I execute the script, nothing seems to happen. Is a debug log produced?

                              This is the logon screen https://monitor.us.sunpower.com/#/login and once logged in it brings me to the screen with the data.

                              Judd

                              Comment


                                Have you added your username/password to the ini file config?
                                Jon

                                Comment
