Thanks Jon. I did post the wrong grab. But I was able to go back to the original grab and get everything working for pollen. I will post that shortly as a new message here in case it can help others.
Jon00 DataScraper/JSON Parser Script For Homeseer 3 and Homeseer 4
-
Using Jon00's Datascraper Script plug-in, I was able to get pollen values for my zip code. I'm posting it here in case it helps others. Many pollen sites are not set up to allow data scraping, but I found one that is. Setup is very quick.
First go to pollen.com and create an account with your email and zip code. Then sign up for the daily email with your local pollen information. At the top of that email is a link: "Having problems viewing this email? Click here to view the online version."
Use the link embedded in "Click here" for your Path. That site allows scraping. Regular pollen.com does not.
Here is my setup, which creates 5 devices (today's pollen level, today's high-med-low, today's pollen type, tomorrow's pollen level, tomorrow's high-med-low):
[Grab2]
Path=http://www.pollenapps.com/email/aa/default.asp?e=[NOTE: customized for your unique URL...see text above]
TextFile=1
Encoding=
Username=
Password=
Options=
UserAgent=
Devicemode=2
StripHTML=0
UseIE=0
Delay=10
Pattern1=(?s)TODAY.*?"http://www.pollenapps.com/email/aa/images/levels/gauge-(.*?).png"
Pattern2=(?s)TODAY.*?style="font-size:18px;font-family:Georgia !important;font-style:italic;margin:0;padding-top:10px;">(.*?)<
Pattern3=(?s)Today's Top Allergens.*?icons/type-Tree-sm.png" width="40" height="40" alt="(.*?)"
Pattern4=(?s)Today's Top Allergens.*?icons/type-Tree-sm.png" width="40" height="40" alt=".*?icons/type-Tree-sm.png" width="40" height="40" alt="(.*?)"
Pattern5=(?s)Today's Top Allergens.*?icons/type-Tree-sm.png" width="40" height="40" alt=".*?icons/type-Tree-sm.png" width="40" height="40" alt=".*?icons/type-Tree-sm.png" width="40" height="40" alt="(.*?)"
Pattern6=(?s)tomorrow.*?src="http://www.pollenapps.com/email/aa/images/levels/gauge-(.*?).png
Pattern7=(?s)tomorrow.*?style="font-size:18px;font-family:Georgia !important;font-style:italic;margin:0;padding-top:10px;">(.*?)<
DeviceName1=Today's Pollen Level
DeviceText1=[0]
DeviceValue1=[0]
DeviceImage1=pollen.png
Speakbutton1=0
TriggerString1=
SearchMode1=1
TriggerEvent1=
DeviceName2=Today's Pollen High-Med-Low
DeviceText2=[100]
DeviceValue2=[100]
DeviceImage2=pollen.png
Speakbutton2=0
TriggerString2=
SearchMode2=1
TriggerEvent2=
DeviceName3=Today's Pollen Type
DeviceText3=[200], [300], [400]
DeviceValue3=[200]
DeviceImage3=pollen.png
Speakbutton3=0
TriggerString3=
SearchMode3=1
TriggerEvent3=
DeviceName4=Tomorrow's Pollen Level
DeviceText4=[500]
DeviceValue4=[500]
DeviceImage4=pollen.png
Speakbutton4=0
TriggerString4=
SearchMode4=1
TriggerEvent4=
DeviceName5=Tomorrow's Pollen High-Med-Low
DeviceText5=[600]
DeviceValue5=[600]
DeviceImage5=pollen.png
Speakbutton5=0
TriggerString5=
SearchMode5=1
TriggerEvent5=
I run the script (Main, 2) once each day at 5:30am. It goes to the site and pulls the pollen data into HS3. Devices image below. Thank you Jon00 for enabling this!!
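For anyone adapting the patterns to their own page, here is a rough Python sketch of how Pattern1 and Pattern2 behave. The HTML fragment is made up for illustration (the real email page will differ), and the placeholder numbering ([0] for Pattern1, [100] for Pattern2, and so on) is only inferred from the config above, not taken from the script's documentation. Python's regex engine is also not identical to the .NET one the script uses, although these particular patterns should behave the same way in both.

```python
import re

# Hypothetical HTML fragment for illustration only; not copied from the real page.
SAMPLE = (
    '<td>TODAY</td>'
    '<img src="http://www.pollenapps.com/email/aa/images/levels/gauge-7.2.png">'
    '<p style="font-size:18px;font-family:Georgia !important;font-style:italic;'
    'margin:0;padding-top:10px;">Medium-High</p>'
)

# Pattern1 and Pattern2 as posted in the config above.
PATTERNS = [
    r'(?s)TODAY.*?"http://www.pollenapps.com/email/aa/images/levels/gauge-(.*?).png"',
    r'(?s)TODAY.*?style="font-size:18px;font-family:Georgia !important;'
    r'font-style:italic;margin:0;padding-top:10px;">(.*?)<',
]


def placeholders(patterns, text):
    """Map [0], [100], ... to the first capture of each pattern.

    The numbering scheme is an assumption inferred from the DeviceText lines.
    """
    result = {}
    for i, pattern in enumerate(patterns):
        match = re.search(pattern, text)
        result[f"[{i * 100}]"] = match.group(1) if match else ""
    return result


values = placeholders(PATTERNS, SAMPLE)
print(values)
```

The `(?s)` prefix lets `.` match line breaks, and the lazy `.*?` stops at the first gauge image or styled paragraph after the anchor word, which is why each pattern starts with TODAY or tomorrow.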
Comment
-
Hi! I'm trying to scrape a site, but I can't even get the grab text file to be created. Note that I have several other sites that I scrape without trouble, so I can't for the life of me understand why this one is a problem. A couple of other users have no problem with it, but I know they're using HS3 on Linux and I'm on Windows.
Using this in the config:
Path=https://pollenkontroll.no/api/pollen-count?country=no&location=126
TextFile=1
Opening the link in a browser yields the correct result. Any tips?
Comment
-
Originally posted by mk1 black limited:
Hi! I'm trying to scrape a site, but I can't even get the grab text file to be created. Note that I have several other sites that I scrape without trouble, so I can't for the life of me understand why this one is a problem. A couple of other users have no problem with it, but I know they're using HS3 on Linux and I'm on Windows.
Using this in the config:
Path=https://pollenkontroll.no/api/pollen-count?country=no&location=126
TextFile=1
Opening the link in a browser yields the correct result. Any tips?
grab1.txt
Comment
-
Hi Jon,
I'm trying to add a new scrape to my system but I'm running into a wall on the latest one.
I've got access to the RealTrainTimes API, which uses HTTP auth over HTTPS. The JSON output I receive is below (a portion of it, anyway).
Code:
"locationDetail":{
  "realtimeActivated":true,
  "tiploc":"BCKNHMJ",
  "crs":"BKJ",
  "description":"Beckenham Junction",
  "gbttBookedArrival":"1210",
  "gbttBookedDeparture":"1210",
  "origin":[
    {
      "tiploc":"ORPNGTN",
      "description":"Orpington",
      "workingTime":"115400",
      "publicTime":"1154"
    }
Code:
Path=https://api.rtt.io/api/v1/json/search/bkj/to/brx
TextFile=0
Encoding=
Username=****
Password=****
Options=
UserAgent=
Devicemode=0
StripHTML=1
Pattern1="gbttBookedArrival":"(.*?)","
Pattern2=
Pattern3=
Pattern4=
Pattern5=
DeviceName1=RTT - BKJ to VIC: 1
DeviceText1=[0]
DeviceValue1=[0]
DeviceImage1=
Speakbutton1=1
TriggerString1=
SearchMode1=1
TriggerEvent1=
I have no errors in my HS log, so I'm not sure where I've gone wrong, but there is no grab data for grab3 in my data file after running my scrape event with just ID 3.
Any ideas where I've messed up?
EDIT: I've just realised that every time I run the script, a login prompt pops up on my HS box. This may be because I've tried using the UseIE flag, but even adding credentials here does not get it working.
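Separately from the login problem, it may be worth checking the pattern against the raw response once one comes back. The Python sketch below is only an illustration under assumptions: both sample strings are hand-made from the JSON excerpt above, and the whitespace-tolerant variant is a hypothetical alternative, not something taken from the script's documentation. It shows that the posted Pattern1 matches compact JSON but would find nothing if the response had a space or line break after each comma.

```python
import re

# Pattern1 as posted (tested here with Python's regex engine, not .NET).
POSTED = r'"gbttBookedArrival":"(.*?)","'
# Hypothetical whitespace-tolerant variant, not from the original post.
TOLERANT = r'"gbttBookedArrival"\s*:\s*"(.*?)"'

# Hand-made samples based on the excerpt above; the real response layout is unknown.
compact = '"crs":"BKJ","gbttBookedArrival":"1210","gbttBookedDeparture":"1210",'
spaced = '"crs":"BKJ", "gbttBookedArrival":"1210", "gbttBookedDeparture":"1210", '


def first_match(pattern, text):
    """Return the first captured group, or None when the pattern does not match."""
    match = re.search(pattern, text)
    return match.group(1) if match else None


for name, text in (("compact", compact), ("spaced", spaced)):
    print(name, first_match(POSTED, text), first_match(TOLERANT, text))
```

Setting TextFile=1 for this grab, as in the other configs in this thread, should make it easier to see exactly what the response looks like before tuning the pattern.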
Comment
-
Unfortunately you won't get any response (grab data) if it cannot authenticate. Matters are getting worse now that sites are changing over to SSL. Check this thread regarding SSL when running scripts under .NET4: https://forums.homeseer.com/forum/de...turl-ssl-issue
Comment
-
Just a quick question before I try to update:
right now it still closes iexplore.exe even when UseIE= is not present in the ini file.
I also tried leaving out scrapes 6 and 7, but it still closes iexplore.
regards
Preferred -> Jon's Plugins, Pushover, Phlocation, Easy-trigger,
Rfxcom, Blade Plugins, Pushbullet, homekit, Malosa Scripts
HS3Pro 4.1.14.0 on windows 10 enterprise X64 on hp quadcore laptop 8 GB.
Comment
-
Hi Jon,
I see it works now. I got these errors at the beginning, but there are no errors anymore:
May-17 13:07:28 Error Authenticating SSL stream inner exception: An unknown error occurred while processing the certificate
May-17 13:07:28 Error Authenticating SSL stream: A call to SSPI failed, see inner exception.
May-17 13:07:28 Error Authenticating SSL stream inner exception: An unknown error occurred while processing the certificate
May-17 13:07:28 Error Authenticating SSL stream: A call to SSPI failed, see inner exception.
And thanks for the new update.
regards
Comment