Jon00 DataScraper/JSON Parser Script For Homeseer 3 and Homeseer 4


    Hi,

    I started using this tool two days ago and am still learning it. I have one issue: the images are not displayed on the devices that are created.

    Here is my DataScraper ini file:

    [Settings]
    DataLogging=1
    ExpressionLogging=1
    ErrorLogging=1
    IEClearDelay=3

    [Grab1]
    Path=http://XXXXXXX
    TextFile=1
    Encoding=UTF-8
    Username=
    Password=
    Options=
    UserAgent=
    Devicemode=2
    StripHTML=1
    UseIE=0
    Pattern1=(?s)"name".*?),
    Pattern2=(?s)field1["\\]":{["\\]"value["\\]".*?),(.*?):["\\]"(.*?)T(.*?)Z
    Pattern3=(?s)field2["\\]":{["\\]"value["\\]".*?),(.*?):["\\]"(.*?)T(.*?)Z
    Pattern4=(?s)field3["\\]":{["\\]"value["\\]".*?),(.*?):["\\]"(.*?)T(.*?)Z
    DeviceName1=Unibot_roulotte
    DeviceText1=[0]
    DeviceName2=temperature
    DeviceImage2=thermometer.png
    DeviceText2=[100]
    DeviceValue2=[100]
    DeviceName3=date
    DeviceText3=[102]
    DeviceName4=heure
    DeviceText4=[103]
    DeviceName5=humidite
    DeviceImage5=humidity.png
    DeviceText5=[200]
    DeviceValue5=[200]
    DeviceName6=luminosite
    DeviceImage6=luminance.png
    DeviceText6=[300]
    DeviceValue6=[300]

    Here is the content of my HS3/html/images/Devices/jon00/datascraper folder:

    [Attached image: datascrapper.JPG]

    Here are my created devices:

    [Attached image: devices.JPG]

    Would anyone have any idea what I am doing wrong?


    Comment


      Looks like you have set DeviceMode=2

      Change it to 0

      (see page 4 in the docs)
      Jon

      Comment


        Originally posted by jon00
        Looks like you have set DeviceMode=2

        Change it to 0

        (see page 4 in the docs)
        Thanks... I guess it is Christmas that is making me blind.

        Have a wonderful holiday!

        Comment


          I have had a data scrape running against the BBC news feed for some time now and just noticed that the image capture failed in July. I must admit I didn't fully understand how I set this up, as jon00 provided me with the starter ini and I tweaked it from there.

          Here is my ini:

          Code:
          [Grab1]
          Path=http://feeds.bbci.co.uk/news/rss.xml?edition=us
          TextFile=0
          Encoding=
          Username=
          Password=
          Options=
          UserAgent=
          Devicemode=2
          
          Pattern1=(?s)<item>.*?<title><!\[CDATA\[(.*?)\]\]></title>.*?</item>
          Pattern2=(?s)<item>.*?<description><!\[CDATA\[(.*?)\]\]></description>.*?</item>
          Pattern3=(?s)<item>.*?<link>(.*?)</link>.*?</item>
          Pattern4=(?s)<item>.*?<pubDate>(.*?)</pubDate>.*?</item>
          Pattern5=(?s)<item>.*?url="(.*?)"/>.*?</item>
          
          DeviceName1=RSS Feed 1 Image
          DeviceText1=[400]
          DeviceValue1=
          DeviceImage1=rss.png
          Speakbutton1=0
          TriggerString1=
          SearchMode1=1
          TriggerEvent1=
          
          DeviceName2=RSS Feed 2 Image
          DeviceText2=[401]
          DeviceValue2=
          DeviceImage2=
          Speakbutton2=0
          TriggerString2=
          SearchMode2=1
          TriggerEvent2=
          
          DeviceName3=RSS Feed 3 Image
          DeviceText3=[402]
          DeviceValue3=
          DeviceImage3=
          Speakbutton3=0
          TriggerString3=
          SearchMode3=1
          TriggerEvent3=
          
          DeviceName4=RSS Feed 4 Image
          DeviceText4=[403]
          DeviceValue4=
          DeviceImage4=
          Speakbutton4=0
          TriggerString4=
          SearchMode4=1
          TriggerEvent4=
          
          DeviceName5=RSS Feed 5 Image
          DeviceText5=[404]
          DeviceValue5=
          DeviceImage5=
          Speakbutton5=0
          TriggerString5=
          SearchMode5=1
          TriggerEvent5=
          
          DeviceName6=RSS Feed 6 Image
          DeviceText6=[405]
          DeviceValue6=
          DeviceImage6=
          Speakbutton6=0
          TriggerString6=
          SearchMode6=1
          TriggerEvent6=
          
          DeviceName7=RSS Feed 7 Image
          DeviceText7=[406]
          DeviceValue7=
          DeviceImage7=
          Speakbutton7=0
          TriggerString7=
          SearchMode7=1
          TriggerEvent7=
          
          DeviceName8=RSS Feed 8 Image
          DeviceText8=[407]
          DeviceValue8=
          DeviceImage8=
          Speakbutton8=0
          TriggerString8=
          SearchMode8=1
          TriggerEvent8=
          
          DeviceName9=RSS Feed 9 Image
          DeviceText9=[408]
          DeviceValue9=
          DeviceImage9=
          Speakbutton9=0
          TriggerString9=
          SearchMode9=1
          TriggerEvent9=
          
          DeviceName10=RSS Feed 10 Image
          DeviceText10=[409]
          DeviceValue10=
          DeviceImage10=rss.png
          Speakbutton10=0
          TriggerString10=
          SearchMode10=1
          TriggerEvent10=
          
          DeviceName11=RSS Feed 11 Image
          DeviceText11=[410]
          DeviceValue11=
          DeviceImage11=
          Speakbutton11=0
          TriggerString11=
          SearchMode11=1
          TriggerEvent11=
          
          DeviceName12=RSS Feed 12 Image
          DeviceText12=[411]
          DeviceValue12=
          DeviceImage12=
          Speakbutton12=0
          TriggerString12=
          SearchMode12=1
          TriggerEvent12=
          
          DeviceName13=RSS Feed 13 Image
          DeviceText13=[412]
          DeviceValue13=
          DeviceImage13=
          Speakbutton13=0
          TriggerString13=
          SearchMode13=1
          TriggerEvent13=
          
          DeviceName14=RSS Feed 14 Image
          DeviceText14=[413]
          DeviceValue14=
          DeviceImage14=
          Speakbutton14=0
          TriggerString14=
          SearchMode14=1
          TriggerEvent14=
          
          DeviceName15=RSS Feed 15 Image
          DeviceText15=[414]
          DeviceValue15=
          DeviceImage15=
          Speakbutton15=0
          TriggerString15=
          SearchMode15=1
          TriggerEvent15=
          
          DeviceName16=RSS Feed 16 Image
          DeviceText16=[415]
          DeviceValue16=
          DeviceImage16=
          Speakbutton16=0
          TriggerString16=
          SearchMode16=1
          TriggerEvent16=
          
          DeviceName17=RSS Feed 17 Image
          DeviceText17=[416]
          DeviceValue17=
          DeviceImage17=
          Speakbutton17=0
          TriggerString17=
          SearchMode17=1
          TriggerEvent17=
          
          DeviceName18=RSS Feed 18 Image
          DeviceText18=[417]
          DeviceValue18=
          DeviceImage18=
          Speakbutton18=0
          TriggerString18=
          SearchMode18=1
          TriggerEvent18=
          
          DeviceName19=RSS Feed 19 Image
          DeviceText19=[418]
          DeviceValue19=
          DeviceImage19=
          Speakbutton19=0
          TriggerString19=
          SearchMode19=1
          TriggerEvent19=
          
          DeviceName20=RSS Feed 20 Image
          DeviceText20=[419]
          DeviceValue20=
          DeviceImage20=rss.png
          Speakbutton20=0
          TriggerString20=
          SearchMode20=1
          TriggerEvent20=
          
          DeviceName21=RSS Feed 21 Image
          DeviceText21=[420]
          DeviceValue21=
          DeviceImage21=
          Speakbutton21=0
          TriggerString21=
          SearchMode21=1
          TriggerEvent21=
          
          DeviceName22=RSS Feed 22 Image
          DeviceText22=[421]
          DeviceValue22=
          DeviceImage22=
          Speakbutton22=0
          TriggerString22=
          SearchMode22=1
          TriggerEvent22=
          
          DeviceName23=RSS Feed 23 Image
          DeviceText23=[422]
          DeviceValue23=
          DeviceImage23=
          Speakbutton23=0
          TriggerString23=
          SearchMode23=1
          TriggerEvent23=
          
          DeviceName24=RSS Feed 24 Image
          DeviceText24=[423]
          DeviceValue24=
          DeviceImage24=
          Speakbutton24=0
          TriggerString24=
          SearchMode24=1
          TriggerEvent24=
          
          DeviceName25=RSS Feed 25 Image
          DeviceText25=[424]
          DeviceValue25=
          DeviceImage25=
          Speakbutton25=0
          TriggerString25=
          SearchMode25=1
          TriggerEvent25=
          
          DeviceName26=RSS Feed 26 Image
          DeviceText26=[425]
          DeviceValue26=
          DeviceImage26=
          Speakbutton26=0
          TriggerString26=
          SearchMode26=1
          TriggerEvent26=
          
          DeviceName27=RSS Feed 27 Image
          DeviceText27=[426]
          DeviceValue27=
          DeviceImage27=
          Speakbutton27=0
          TriggerString27=
          SearchMode27=1
          TriggerEvent27=
          
          DeviceName28=RSS Feed 28 Image
          DeviceText28=[427]
          DeviceValue28=
          DeviceImage28=
          Speakbutton28=0
          TriggerString28=
          SearchMode28=1
          TriggerEvent28=
          
          DeviceName29=RSS Feed 29 Image
          DeviceText29=[428]
          DeviceValue29=
          DeviceImage29=
          Speakbutton29=0
          TriggerString29=
          SearchMode29=1
          TriggerEvent29=
          
          DeviceName30=RSS Feed 30 Image
          DeviceText30=[429]
          DeviceValue30=
          DeviceImage30=rss.png
          Speakbutton30=0
          TriggerString30=
          SearchMode30=1
          TriggerEvent30=
          
          DeviceName31=RSS Feed 1 Link
          DeviceText31=[200]
          DeviceValue31=
          DeviceImage31=rss.png
          Speakbutton31=0
          TriggerString31=
          SearchMode31=1
          TriggerEvent31=
          
          DeviceName32=RSS Feed 2 Link
          DeviceText32=[201]
          DeviceValue32=
          DeviceImage32=rss.png
          Speakbutton32=0
          TriggerString32=
          SearchMode32=1
          TriggerEvent32=
          
          DeviceName33=RSS Feed 3 Link
          DeviceText33=[202]
          DeviceValue33=
          DeviceImage33=rss.png
          Speakbutton33=0
          TriggerString33=
          SearchMode33=1
          TriggerEvent33=
          
          DeviceName34=RSS Feed 4 Link
          DeviceText34=[203]
          DeviceValue34=
          DeviceImage34=rss.png
          Speakbutton34=0
          TriggerString34=
          SearchMode34=1
          TriggerEvent34=
          
          DeviceName35=RSS Feed 5 Link
          DeviceText35=[204]
          DeviceValue35=
          DeviceImage35=rss.png
          Speakbutton35=0
          TriggerString35=
          SearchMode35=1
          TriggerEvent35=
          
          DeviceName36=RSS Feed 6 Link
          DeviceText36=[205]
          DeviceValue36=
          DeviceImage36=rss.png
          Speakbutton36=0
          TriggerString36=
          SearchMode36=1
          TriggerEvent36=
          
          DeviceName37=RSS Feed 7 Link
          DeviceText37=[206]
          DeviceValue37=
          DeviceImage37=rss.png
          Speakbutton37=0
          TriggerString37=
          SearchMode37=1
          TriggerEvent37=
          
          DeviceName38=RSS Feed 8 Link
          DeviceText38=[207]
          DeviceValue38=
          DeviceImage38=rss.png
          Speakbutton38=0
          TriggerString38=
          SearchMode38=1
          TriggerEvent38=
          
          DeviceName39=RSS Feed 9 Link
          DeviceText39=[208]
          DeviceValue39=
          DeviceImage39=rss.png
          Speakbutton39=0
          TriggerString39=
          SearchMode39=1
          TriggerEvent39=
          
          DeviceName40=RSS Feed 10 Link
          DeviceText40=[209]
          DeviceValue40=
          DeviceImage40=rss.png
          Speakbutton40=0
          TriggerString40=
          SearchMode40=1
          TriggerEvent40=
          
          DeviceName41=RSS Feed 11 Link
          DeviceText41=[210]
          DeviceValue41=
          DeviceImage41=rss.png
          Speakbutton41=0
          TriggerString41=
          SearchMode41=1
          TriggerEvent41=
          
          DeviceName42=RSS Feed 12 Link
          DeviceText42=[211]
          DeviceValue42=
          DeviceImage42=rss.png
          Speakbutton42=0
          TriggerString42=
          SearchMode42=1
          TriggerEvent42=
          
          DeviceName43=RSS Feed 13 Link
          DeviceText43=[212]
          DeviceValue43=
          DeviceImage43=rss.png
          Speakbutton43=0
          TriggerString43=
          SearchMode43=1
          TriggerEvent43=
          
          DeviceName44=RSS Feed 14 Link
          DeviceText44=[213]
          DeviceValue44=
          DeviceImage44=rss.png
          Speakbutton44=0
          TriggerString44=
          SearchMode44=1
          TriggerEvent44=
          
          DeviceName45=RSS Feed 15 Link
          DeviceText45=[214]
          DeviceValue45=
          DeviceImage45=rss.png
          Speakbutton45=0
          TriggerString45=
          SearchMode45=1
          TriggerEvent45=
          
          DeviceName46=RSS Feed 16 Link
          DeviceText46=[215]
          DeviceValue46=
          DeviceImage46=rss.png
          Speakbutton46=0
          TriggerString46=
          SearchMode46=1
          TriggerEvent46=
          
          DeviceName47=RSS Feed 17 Link
          DeviceText47=[216]
          DeviceValue47=
          DeviceImage47=rss.png
          Speakbutton47=0
          TriggerString47=
          SearchMode47=1
          TriggerEvent47=
          
          [Grab2]
          Path=https://solaros.datareadings.com/Client/dashboard/
          TextFile=0
          Encoding=
          Username=****************
          Password=***********
          Options=
          UserAgent=
          Devicemode=0
          StripHTML=1
          
          Pattern1=
          Pattern2=
          Pattern3=
          Pattern4=
          Pattern5=
          
          DeviceName1=
          DeviceText1=
          DeviceValue1=
          DeviceImage1=
          Speakbutton1=1
          TriggerString1=
          SearchMode1=1
          TriggerEvent1=

          And here is what I am getting in the data scraper ini:

          Code:
          [Grab1Data]
          0=Australia fires worsen as every state hits 40C
          1=Texas church shooting: Two fatally shot before gunman killed by churchgoer
          2=Greta Thunberg's father: 'She is happy, but I worry'
          3=Briton guilty over Cyprus false rape claim
          4=Civil rights icon John Lewis diagnosed with pancreatic cancer
          5=Actress Sharon Stone blocked from dating app Bumble
          6=Putin thanks Trump for foiling New Year attacks
          7=Gay in Nigeria: 'Everybody sees me as an abomination'
          8=Monsey stabbing: NYC mayor vows action on anti-Semitism 'crisis'
          9=Moscow brings in artificial snow for New Year in mild winter
          10=US attacks Iran-backed militia bases in Iraq and Syria
          11=China jails 'gene-edited babies' scientist for three years
          12=South Korea to pardon 1,800 conscientious objectors
          13=Saudi court sentences man to death for stabbing Spanish theatre group
          14=Photo requests from solitary confinement
          15=When Greta Thunberg met Sir David Attenborough
          16=Medieval combat: A Chinese knight fights for his dream
          17=Hydrogen-powered drones could point way to future travel
          18=Striking photojournalism from around the world in 2019
          19=‘Gardening gives me a lot of peace’
          20=How puppetry can help with trauma
          21='We fell in love on the dance floor’
          22=Autism diagnosis: 'I want 40 years of my life back'
          23=The Syrian town with more cats than people
          24=The best space images of 2019
          25='We can give a lot of the power back to the fans'
          26=Hunting the missing millions from collapsed cryptocurrency
          27=How crowds toppled communism's house of cards in 1989
          28=Why Canada's cannabis bubble burst
          101=The gunman was also killed after a member of the congregation returned fire in the packed church.
          102=The activist's father says he thought her skipping school to fight climate change was a "bad idea".
          103=A British woman, 19, is found guilty over lying about being gang-raped in Ayia Napa, Cyprus, in July.
          104=Mr Lewis, 79, said he would continue his work as a congressman while receiving treatment.
          105=Bumble told her it had received several reports that her profile was fake.
          106=The Russian president says co-operation from US intelligence has prevented attacks within his country.
          107=Five years on from Nigeria's Same Sex Marriage Prohibition Act, discrimination seems to be worse.
          108=Bill de Blasio announces a series of measures following Saturday's mass stabbing at a Jewish event.
          109=Artificial snow is dumped in central Moscow so that snowboarders can still have their fun.
          110=The strikes, which reportedly killed 25 fighters, were in retaliation for an attack on an Iraqi base.
          111=He Jiankui said he altered the genes of a set of twins to try to give them protection against HIV.
          112=Men who refused mandatory military service were subjected to prison - and social stigma.
          113=At least three people were wounded in last month's attack on members of a Spanish theatre group.
          114=Artists and everyday people send images of life outside prison to inmates in solitary confinement.
          115=The teenage activist and veteran naturalist talk to each other for the first time (via Skype).
          116=A Chinese teacher with a passion for medieval combat hopes to take his hobby to the next level.
          117=Hydrogen-powered drones have several advantages to lithium ion-powered ones, says Dr Enass Abo-Hamed.
          118=A selection of the best news photographs from around the world in 2019.
          119=Joanna is an urban gardener trying to reconnect with nature in Singapore.
          120=After being trafficked for sex by her family as a child, puppeteer Raven wants to show people healing is possible through art.
          121=Taiwan has been declared polio free since 2000 and some of the last polio survivors have decided to band together to dance wheelchair ballroom dancing.
          122=People diagnosed with autism in adulthood describe growing up believing they were "bad" or "alien".
          123=The few remaining inhabitants of a bombed-out Syrian town take comfort from hundreds, perhaps thousands, of cats.
          124=With some blockbuster space missions underway, 2019 saw some amazing images beamed back to Earth.
          125=How two friends created an online storytelling platform with more than 80 million global users.
          126=On the trail of almost half a billion dollars lost when the Wex exchange collapsed in 2018.
          127=The BBC's John Simpson recalls witnessing the communist bloc's collapse in three revolutions.
          128=Justin Trudeau legalised cannabis in Canada. So why are people still breaking the law?
          201=https://www.bbc.co.uk/news/world-us-canada-50942664
          202=https://www.bbc.co.uk/news/uk-50901789
          203=https://www.bbc.co.uk/news/uk-50945206
          204=https://www.bbc.co.uk/news/world-us-canada-50944896
          205=https://www.bbc.co.uk/news/world-us-canada-50946431
          206=https://www.bbc.co.uk/news/world-europe-50941754
          207=https://www.bbc.co.uk/news/world-africa-50869022
          208=https://www.bbc.co.uk/news/world-us-canada-50938507
          209=https://www.bbc.co.uk/news/world-europe-50945383
          210=https://www.bbc.co.uk/news/world-middle-east-50941693
          211=https://www.bbc.co.uk/news/world-asia-china-50944461
          212=https://www.bbc.co.uk/news/world-asia-50943442
          213=https://www.bbc.co.uk/news/world-middle-east-50947324
          214=https://www.bbc.co.uk/news/world-us-canada-50832025
          215=https://www.bbc.co.uk/news/science-environment-50904881
          216=https://www.bbc.co.uk/news/world-asia-china-50819492
          217=https://www.bbc.co.uk/news/business-50839917
          218=https://www.bbc.co.uk/news/in-pictures-50728680
          219=https://www.bbc.co.uk/news/world-asia-50866760
          220=https://www.bbc.co.uk/news/entertainment-arts-50866764
          221=https://www.bbc.co.uk/news/world-asia-48956713
          222=https://www.bbc.co.uk/news/health-50380411
          223=https://www.bbc.co.uk/news/stories-50856274
          224=https://www.bbc.co.uk/news/science-environment-50765663
          225=https://www.bbc.co.uk/news/world-us-canada-50383329
          226=https://www.bbc.co.uk/news/world-europe-50821547
          227=https://www.bbc.co.uk/news/world-europe-50821545
          228=https://www.bbc.co.uk/news/world-us-canada-50664578
          301=Mon, 30 Dec 2019 07:05:49 GMT
          302=Mon, 30 Dec 2019 08:50:14 GMT
          303=Mon, 30 Dec 2019 11:35:02 GMT
          304=Mon, 30 Dec 2019 10:45:49 GMT
          305=Mon, 30 Dec 2019 11:53:30 GMT
          306=Sun, 29 Dec 2019 20:39:27 GMT
          307=Mon, 30 Dec 2019 01:57:06 GMT
          308=Mon, 30 Dec 2019 12:03:52 GMT
          309=Mon, 30 Dec 2019 12:46:18 GMT
          310=Mon, 30 Dec 2019 10:34:54 GMT
          311=Mon, 30 Dec 2019 11:17:20 GMT
          312=Mon, 30 Dec 2019 08:15:16 GMT
          313=Mon, 30 Dec 2019 12:23:34 GMT
          314=Mon, 30 Dec 2019 00:44:42 GMT
          315=Mon, 30 Dec 2019 08:06:09 GMT
          316=Mon, 30 Dec 2019 01:15:29 GMT
          317=Mon, 30 Dec 2019 00:49:23 GMT
          318=Sun, 29 Dec 2019 00:16:39 GMT
          319=Sun, 29 Dec 2019 00:21:58 GMT
          320=Sat, 28 Dec 2019 00:22:17 GMT
          321=Sat, 28 Dec 2019 00:22:36 GMT
          322=Mon, 30 Dec 2019 00:59:54 GMT
          323=Mon, 30 Dec 2019 01:04:33 GMT
          324=Mon, 30 Dec 2019 01:07:40 GMT
          325=Mon, 30 Dec 2019 01:29:17 GMT
          326=Mon, 30 Dec 2019 01:28:14 GMT
          327=Sun, 29 Dec 2019 00:58:45 GMT
          328=Sun, 29 Dec 2019 00:53:45 GMT
          401=http://c.files.bbci.co.uk/9DE8/production/_107542404_gettyimages-1151760791.jpg
          402=http://c.files.bbci.co.uk/393E/production/_107545641_mediaitem107545640.jpg
          403=http://c.files.bbci.co.uk/2D5E/production/_107541611_mediaitem107541610.jpg
          404=http://c.files.bbci.co.uk/84E8/production/_107542043_p07f6jnk.jpg
          405=http://c.files.bbci.co.uk/148F6/production/_107541248_miyazaki-spirited_away_1_-c1162-poster.jpg
          406=http://c.files.bbci.co.uk/4672/production/_107543081_jallow9762.png
          407=http://c.files.bbci.co.uk/1C69/production/_107537270_enb_sb50_21june19_kiaraworth-47.jpg
          408=http://c.files.bbci.co.uk/299B/production/_107515601_hunt-johnson_comp_rt-getty.jpg
          409=http://c.files.bbci.co.uk/42F8/production/_107544171_gettyimages-909467988.jpg
          410=http://c.files.bbci.co.uk/13C39/production/_107535908_gettyimages-160023947.jpg
          411=http://c.files.bbci.co.uk/5E50/production/_107544142_6c5b8b9b-7a0d-4974-82dc-2e6eecc41e3d.jpg
          412=http://c.files.bbci.co.uk/1096F/production/_107515976_bc6d8751-ef90-417f-8845-b6869f0eb900.jpg
          413=http://c.files.bbci.co.uk/30E2/production/_107541521_befunky-collage3.jpg
          414=http://c.files.bbci.co.uk/138A/production/_107520050_p07f18zw.jpg
          415=http://c.files.bbci.co.uk/11FA2/production/_107543637_p07f6n7n.jpg
          416=http://c.files.bbci.co.uk/7B7E/production/_107541613_p07f6175.jpg
          417=http://c.files.bbci.co.uk/7781/production/_107539503_p07f5vlm.jpg
          418=http://c.files.bbci.co.uk/7C95/production/_107539813_p07f5tdc.jpg
          419=http://c.files.bbci.co.uk/E8F9/production/_107514695_screenshot2019-06-19at20.47.11.jpg
          420=http://c.files.bbci.co.uk/90C6/production/_107526073_p07f317l.jpg
          421=http://c.files.bbci.co.uk/11474/production/_107527707_gettyimages-1074400644.jpg
          422=http://c.files.bbci.co.uk/EDEC/production/_107080906_gettyimages-654239286.jpg
          423=http://c.files.bbci.co.uk/1515B/production/_107536368_054865685-1.jpg
          424=http://c.files.bbci.co.uk/76B5/production/_106898303_photo-2019-05-08-15-53-032.jpg
          425=http://c.files.bbci.co.uk/5FE7/production/_107515542_isdalwomansequenceendnewmouth.jpg
          426=http://c.files.bbci.co.uk/5E62/production/_107526142_58462021_406552023468743_2926755006583406592_o.jpg
          427=http://c.files.bbci.co.uk/D201/production/_107516735_rexfeatures_10203848bh.jpg
          428=http://c.files.bbci.co.uk/6B16/production/_107541472_stephhoughtonmilliebrightreuters.jpg
          429=http://c.files.bbci.co.uk/9032/production/_107541963_gettyimages-1156898936.jpg
          430=http://c.files.bbci.co.uk/2368/production/_107546090_p07f6tfr.jpg
          431=http://c.files.bbci.co.uk/5521/production/_107539712_martens_rex.jpg
          432=http://c.files.bbci.co.uk/2DA6/production/_107468611_gettyimages-508467821.jpg
          433=http://c.files.bbci.co.uk/2FAC/production/_107540221_silva.jpg
          434=http://c.files.bbci.co.uk/106C/production/_107540240_p07f5yqs.jpg
          435=http://c.files.bbci.co.uk/9473/production/_107530083_p07f5ghx.jpg
          436=http://c.files.bbci.co.uk/7349/production/_107531592_gettyimages-1150938423.jpg
          437=http://c.files.bbci.co.uk/7349/production/_107531592_gettyimages-1150938423.jpg
          438=http://c.files.bbci.co.uk/7349/production/_107531592_gettyimages-1150938423.jpg
          439=http://c.files.bbci.co.uk/9473/production/_107530083_p07f5ghx.jpg
          440=http://c.files.bbci.co.uk/15877/production/_107538188_p07f5tbw.jpg
          441=http://c.files.bbci.co.uk/7349/production/_107531592_gettyimages-1150938423.jpg
          442=http://c.files.bbci.co.uk/E4D4/production/_107408585_p07dcr80.jpg
          443=http://c.files.bbci.co.uk/E4D4/production/_107408585_p07dcr80.jpg
          444=http://c.files.bbci.co.uk/E4D4/production/_107408585_p07dcr80.jpg
          445=http://c.files.bbci.co.uk/18072/production/_107481489_chile_getty.jpg
          446=http://c.files.bbci.co.uk/17815/production/_107477269_morgan.jpg
          447=http://c.files.bbci.co.uk/2540/production/_106763590_schedule_promo.png
          448=http://c.files.bbci.co.uk/8C12/production/_107485853_boris_johnson_getty.jpg
          449=http://c.files.bbci.co.uk/8C12/production/_107485853_boris_johnson_getty.jpg
          450=http://c.files.bbci.co.uk/3DF2/production/_107485851_jeremy_hunt_getty.jpg
          451=http://c.files.bbci.co.uk/C92D/production/_107310515_ipiccy-collage.jpg
          452=http://c.files.bbci.co.uk/9FA9/production/_107337804_gettyimages-1155227177.jpg
          453=http://c.files.bbci.co.uk/1382A/production/_107341997_morgan_rater.png
          454=http://c.files.bbci.co.uk/C92D/production/_107310515_ipiccy-collage.jpg
          455=http://c.files.bbci.co.uk/C92D/production/_107310515_ipiccy-collage.jpg
          456=http://c.files.bbci.co.uk/BF8F/production/_107293094_p07cjf8z.jpg
          457=http://c.files.bbci.co.uk/C68E/production/_107203805_trump_duchess_getty.jpg
          458=http://c.files.bbci.co.uk/1492/production/_106566250_p07782lk.jpg
          459=http://c.files.bbci.co.uk/179B6/production/_107149669_taslim_phone976.jpg
          460=http://c.files.bbci.co.uk/1287F/production/_106230957_saudi_promo_alamy.jpg
          461=http://c.files.bbci.co.uk/14447/production/_106351038_img-20190408-wa0020.jpg
          462=http://c.files.bbci.co.uk/17F0C/production/_106106089_mediaitem106106088.jpg
          29=China and Twitter: The year Chinese diplomacy went social
          129=This year saw a marked changed in tone from China, as more diplomats began using Twitter.
          229=https://www.bbc.co.uk/news/world-asia-china-50832915
          329=Sun, 29 Dec 2019 00:53:55 GMT
          30=Liverpool edge past Wolves to restore 13-point lead at top
          130=Liverpool take another step towards the Premier League title with a hard-fought win over Wolves in yet another match embroiled in VAR controversy.
          230=https://www.bbc.co.uk/sport/football/50882581
          330=Sun, 29 Dec 2019 19:25:14 GMT
          31=Fighter of the decade: Costello & Bunce choose from iconic boxing names
          131=Anthony Joshua, Tyson Fury and Deontay Wilder have done big things, but who gets chosen as fighter of the decade?
          231=https://www.bbc.co.uk/sport/boxing/50898494
          331=Mon, 30 Dec 2019 09:56:35 GMT
          32=A 13-point lead at the top - do even the most pessimistic Liverpool fans believe?
          132=It has been 30 years of hurt for Liverpool fans. But even with a 13-point lead in the Premier League , some still can't say "we are going to win it".
          232=https://www.bbc.co.uk/sport/football/50942464
          332=Sun, 29 Dec 2019 23:24:54 GMT
          33=Greatest sporting moments: England win the World Cup
          133=Stephan Shemilt looks back on the incredible day when England's men won the Cricket World Cup for the first time.
          233=https://www.bbc.co.uk/sport/cricket/50873059
          333=Mon, 30 Dec 2019 12:25:03 GMT
          34=Brady's Patriots in wild card play-offs for first time in 10 seasons
          134=The New England Patriots will have to play in the wild card play-offs for the first time since 2009 after losing 27-24 at home to the Miami Dolphins.
          234=https://www.bbc.co.uk/sport/american-football/50944508
          334=Mon, 30 Dec 2019 08:39:56 GMT
          100=As the bushfires continue, mammoth blazes in Victoria are generating their own lightning.
          200=https://www.bbc.co.uk/news/world-australia-50938504
          300=Mon, 30 Dec 2019 12:39:55 GMT

          The images (entries 401 onwards) do not seem to be updating, presumably because the BBC changed something on their website. Any pointers on this would be appreciated. I would like to figure out how this works so that I can diagnose it myself in future. Also, it would be good if the image field were left blank when no new image is found, as that would tell me that something is going wrong. I don't know if that is possible. Thanks.
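
          As a diagnostic aid, here is a minimal Python sketch of how the keys in [Grab1Data] appear to be numbered. It is not the script's own code: the rule "match k of PatternN is stored under (N-1)*100 + k" is an assumption inferred only from the listing above (titles from 0, descriptions from 100, links from 200, dates from 300, images from 400). SAMPLE_FEED and number_matches are hypothetical names, and Python's re module stands in for the .NET regex engine.

```python
import re

# Hypothetical two-item feed, shaped like the RSS the patterns target.
SAMPLE_FEED = """<rss><channel>
<item><title><![CDATA[First headline]]></title>
<link>https://example.com/a</link></item>
<item><title><![CDATA[Second headline]]></title>
<link>https://example.com/b</link></item>
</channel></rss>"""

# Pattern1 and Pattern3 copied from the [Grab1] section above.
PATTERNS = {
    1: r'(?s)<item>.*?<title><!\[CDATA\[(.*?)\]\]></title>.*?</item>',
    3: r'(?s)<item>.*?<link>(.*?)</link>.*?</item>',
}


def number_matches(text, patterns):
    """Assumed numbering: match k of PatternN is stored under (N-1)*100 + k."""
    data = {}
    for n, pattern in patterns.items():
        for k, value in enumerate(re.findall(pattern, text)):
            data[(n - 1) * 100 + k] = value
    return data


grab_data = number_matches(SAMPLE_FEED, PATTERNS)
for key in sorted(grab_data):
    print(f"{key}={grab_data[key]}")
```

          If that assumption holds, a DeviceText of [400] refers to the first match of Pattern5, [401] to the second, and so on.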

          Comment


            Originally posted by simonmason
            Here is my ini:
            They are basic regular expressions scraping the RSS feed.
            http://feeds.bbci.co.uk/news/rss.xml?edition=us contains no images though, so that is probably why it stopped working. You could add rss.png to all the DeviceImage entries for "DeviceNameXX", the same way it was set up for [400] and [430]+. The BBC probably had an icon/preview image as part of the description that would get shown, but as you can see from the raw feed, it is pure text now.

            The only image reference is an RSS feed one @ https://news.bbcimg.co.uk/nol/shared...ews_120x60.gif

            No article has an image in the primary feed, but each does have one at article level. I am not sure whether the scraper script's options are advanced enough to go one level deeper and fetch the contents of the "og:image" meta tag.
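
            To illustrate the point about the feed no longer carrying images, here is a small Python sketch that runs Pattern5 from the ini above against made-up items. The item contents and the media:thumbnail element name are hypothetical, and Python's re module stands in for the .NET regex engine the script uses. It suggests two things: a text-only feed gives Pattern5 nothing to match, which would be consistent with the old 401+ values staying in place, and in a mixed feed the lazy .*? can run past an item boundary.

```python
import re

# Pattern5 copied from the [Grab1] section quoted above.
PATTERN5 = r'(?s)<item>.*?url="(.*?)"/>.*?</item>'

# Hypothetical items: the first has no image attribute, the second does.
ITEM_TEXT_ONLY = "<item><title>Plain story</title></item>"
ITEM_WITH_IMAGE = (
    "<item><title>Picture story</title>"
    '<media:thumbnail url="http://example.com/p.jpg"/></item>'
)

# A feed made only of text-only items yields no matches at all.
no_images = re.findall(PATTERN5, ITEM_TEXT_ONLY * 3)

# In a mixed feed the lazy .*? runs past the first </item> to reach url=,
# so two items produce a single match and the image loses its position.
mixed = re.findall(PATTERN5, ITEM_TEXT_ONLY + ITEM_WITH_IMAGE)

print(no_images)
print(mixed)
```

            The second result matters if images ever return for only some articles: the image numbering would then drift out of step with the titles and links.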

            Comment


              Got it - it has been so long since I set this up. As you say, the pictures were probably all displayed on this page at one time and now are gone - I can't actually remember but this would appear to be the case based on what the data scraping was looking for.

              Not sure about going to the next page for each of these. Presumably I would need to feed each link dynamically and have it scrape the underlying page. I will look through the documentation to see if this is possible.

              Comment


                Ran into the same issue with a Reuters news feed about 5 years ago, gave up and used the default RSS image. The AnandTech feed still works with images, though: https://www.anandtech.com/rss/

                Comment


                  Hi Jon, I'm just getting started with your plugin and hitting problems with line breaks (at a guess).
                  Can you help with how I can grab these figures?

                  <tr><td>Number of Microinverters Online</td>
                  <td>10</td></tr>
                  <tr><td>Current Software Version</td>
                  <td>R3.12.49 (590f48)</td></tr>
                  <tr><td>Software Build Date</td>
                  <td>Thu Oct 29, 2015 02:56 PM PDT </td></tr>
                  <tr><td>Database Size</td>
                  <td>13 MB (3 % full)</td></tr>

                  Currently tried:
                  Pattern3=(?s)Last connection to website</td> <td><div class=good> (.*?) minutes</div>
                  Pattern4=(?s)Number of Microinverters Online</td> <td>(.*?)</td></tr>

                  I also tried the original with line breaks, but that gives this error:
                  [Grab1] Error at Tags Block 46 System.IndexOutOfRangeException: Index was outside the bounds of the array.
                  at scriptcode2.VBWrapper.Tags(String Num, String[] TagData, String Line, Int32 DecimalPlaces)

                  Cheers,
                  Dylan

                  Comment


                    It would be easier if you provided me with the actual data downloaded and also post all of your [Grab1] settings.

                    In the Jon00DataScraper.ini file, make sure TextFile=1 under [Grab1]

                    Then run the script. Then post grab1.txt here (found in your <homeseer root>\Data\Jon00\datascraper folder).
                    Jon

                    Comment


                      Hi Jon,

                      Thanks for the speedy response here are the requested files

                      Kind regards,

                      Dylan
                      Attached Files

                      Comment


                        Here are your patterns:

                        Pattern1=(?s)Lifetime generation.*?<td>(.*?)MWh
                        Pattern2=(?s)Currently generating.*?<td>(.*?)</td>
                        Pattern3=(?s)Last connection to website.*?good>(.*?) min
                        Pattern4=(?s)Number of Microinverters Online.*?<td>(.*?)</td>
                        Pattern5=(?s)Software Build Date.*?<td>(.*?)</td>
                        Pattern6=(?s)Database Size.*?<td>(.*?)</td>

                        And the results:

                        [Grab1Data]
                        0=13.1
                        100=731 W
                        200=5
                        300=10
                        400=Thu Oct 29, 2015 02:56 PM PDT
                        500=13 MB (3 % full)
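
To see why patterns of this shape cope with the line breaks, here is a small Python sketch. It runs outside DataScraper and uses Python's re module, which as far as I know treats (?s) and the lazy .*? the same way for these simple patterns. The sample text is the table fragment posted earlier in the thread; the dictionary keys are just illustrative names.

```python
import re

# Table fragment as posted earlier: each value sits on the line after its label.
SAMPLE = """<tr><td>Number of Microinverters Online</td>
<td>10</td></tr>
<tr><td>Current Software Version</td>
<td>R3.12.49 (590f48)</td></tr>
<tr><td>Software Build Date</td>
<td>Thu Oct 29, 2015 02:56 PM PDT </td></tr>
<tr><td>Database Size</td>
<td>13 MB (3 % full)</td></tr>
"""

# (?s) lets "." match newlines, and the lazy ".*?" skips whatever lies
# between the label and the next <td>, including the line break.
PATTERNS = {
    "microinverters": r"(?s)Number of Microinverters Online.*?<td>(.*?)</td>",
    "build_date": r"(?s)Software Build Date.*?<td>(.*?)</td>",
    "database_size": r"(?s)Database Size.*?<td>(.*?)</td>",
}

# The earlier attempt expects a single literal space where the page has a newline.
LITERAL_SPACE = r"(?s)Number of Microinverters Online</td> <td>(.*?)</td></tr>"


def scrape(text, patterns):
    """Return the first capture group of each pattern, or None if it fails."""
    results = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, text)
        results[name] = match.group(1).strip() if match else None
    return results


print(scrape(SAMPLE, PATTERNS))
print(re.search(LITERAL_SPACE, SAMPLE))  # None: the space never matches the newline
```

The point is that the text between the label and the value is never spelled out, so it does not matter whether the page uses a space, a newline or a carriage return there.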

                        For Pattern2, the captured value includes the unit, because the string appears to switch between W (below 1000 W) and Kw (above 999 W); a pattern that stopped at one fixed unit would break when the other appears. You can use the [Replace] tags to remove the W and/or Kw so that you are left with a bare number; however, I'm not sure how you would then differentiate between the two units.

                        Example below:

                        DeviceText2=[100]
                        DeviceValue2=[100] [Replace "Kw",""] [Replace "W",""]
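
One way to tell the two units apart, sketched in Python rather than with DataScraper tags, is to read the unit before throwing it away and convert everything to watts. This assumes the readings look like "731 W" or "1.2 Kw"; only the first form appears in this thread, so the Kw format is a guess, and to_watts is a hypothetical helper name.

```python
import re

# Number, optional spaces, then the unit. "kw" is listed before "w" so the
# longer unit is tried first.
READING = re.compile(r"\s*([0-9]+(?:\.[0-9]+)?)\s*(kw|w)\s*", re.IGNORECASE)


def to_watts(reading):
    """Convert a reading such as '731 W' or '1.5 Kw' to watts, or None if unparsable."""
    match = READING.fullmatch(reading)
    if match is None:
        return None
    value = float(match.group(1))
    # Scale kilowatt readings so both forms end up in the same unit.
    if match.group(2).lower() == "kw":
        return value * 1000.0
    return value


print(to_watts("731 W"))
print(to_watts("1.5 Kw"))
```

With both forms converted to watts, the device value stays comparable whichever unit the page happens to show.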


                        As for the errors, please download the latest version (1.0.19) which should cure the issue.


                        Jon

                        Comment


                          Hi Jon,

                          I hope you're well. I used the scraper for a number of things a while back - it worked like a charm !

                          I've got some 3D printers that use a cloud based webpage to show the video feed and also statistics - such as percentage of print completed, time remaining and printer status - such as print complete. I've tried to scrape the data but have a feeling the webpage format or type isn't working with the plugin. Is there any chance that you would be able to just look at the webpage and confirm if it's possible ? There are login details for the actual screen, but imagine the whole site will be the same format:

                          https://cloud.sz3dp.com/main.html

                          There's nothing confidential on the site so if you wanted a password I would be more than happy to PM it to you.

                          Many many thanks,

                          Jay

                          Comment


                            I'm currently fine thanks.

                            Have you confirmed you are accessing and downloading the page correctly, i.e. getting past the login? To me that would be the likely issue, and it may not be possible to use the script with web login pages.

                            Set TextFile=1 and then open the respective grabX.txt file to see what is returned.

                            Are you running the latest version of Datascraper?
                            Jon

                            Comment


                              Glad to hear !

                              I am running the very latest version, and I had already set TextFile=1. It loads the webpage into the text file; however, not all of the data that I see on the screen is included in the file. I wasn't sure whether the main page calls other elements outside of the page source. Does that make sense? Sorry, I know I'm not using the correct terminology.

                              Would it help to send a logon as a direct message?

                              Cheers,

                              Jay

                              Comment


                                Thanks for the details. Yes it logs on fine; however it will only take you to the main page. You cannot 'navigate' to other sections on that page unfortunately so unless the data is on the main page, you are stuck.
                                Jon

                                Comment
