Regex - The Dark Art - Still Struggling

    Hi All,

    I am experimenting with regex to match parts of strings whose formats vary a lot. Certain words and phrases appear in most of the strings, and I was hoping to use them as anchors to extract the address from the string itself. I found an online regex tester and have been slowly learning, but I am struggling with the OR (alternation) component. Normally the OR is fine, because only one of the alternatives appears in the string. Sometimes, however, the string contains two matches for the OR condition, and in that case I would like to extract the text starting from the second match.

    The Regex I have at the moment is:
    PHP Code:
    (?<=INCIC1 |INCIC3 |STRUC1 |STRUC3 |G&SC1 |G&SC3 |ALARC1 |ALARC3 |NOSTC1 |NOSTC3 |RESCC1 |RESCC3 |HIARC1 |HIARC3)(.*?)(?=\SVSE|SVC |SVSW |SVNE |SVNW )
    An example string:
    PHP Code:
    ALERT ABBR1 F000000964 STRUC1 THERMAL IMAGING CAMERA REQUIRED CODE 1 ABBRSV REQUIRED 20 EXAMPLE RD EXAMPLE CREEK /EXAMPLE 1 RD //EXAMPLE 2 TRK SVC 0000 J7 (00000) ABBRSV [ABBR] 
    Using the above Regex I can get the following:
    PHP Code:
    THERMAL IMAGING CAMERA REQUIRED CODE 1 ABBRSV REQUIRED 20 EXAMPLE RD EXAMPLE CREEK /EXAMPLE 1 RD //EXAMPLE 2 TRK 
    So my question is: how can I test whether a phrase is present in the string and then adjust the regex search accordingly? For example, if REQUIRED is present in the string above, I would like the extraction to start after the last REQUIRED (after the only one if there is one, after the second if there are two).

    In the above example, I would like to extract this:
    PHP Code:
    20 EXAMPLE RD EXAMPLE CREEK /EXAMPLE 1 RD //EXAMPLE 2 TRK 
    I'd like to do this with a conditional regex if possible, because the keyword will not always be REQUIRED; sometimes another word or phrase (e.g. CNR) should be matched instead.
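
    One way to start "after the last REQUIRED" without a conditional is to put a greedy `.*` in front of the keyword: it runs to the end of the string and backtracks, so the keyword it settles on is the last one that is still followed by a terminator. The VB.NET sketch below is only an illustration under assumptions: the module and variable names are made up, the keyword list is cut down to REQUIRED and CNR, and the terminator list is the one from the pattern above.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module LastKeywordDemo
    Sub Main()
        ' Example string from the post above.
        Dim message As String = "ALERT ABBR1 F000000964 STRUC1 THERMAL IMAGING CAMERA REQUIRED CODE 1 ABBRSV REQUIRED 20 EXAMPLE RD EXAMPLE CREEK /EXAMPLE 1 RD //EXAMPLE 2 TRK SVC 0000 J7 (00000) ABBRSV [ABBR] "

        ' "^.*" is greedy, so the engine backtracks from the end of the string
        ' and picks the LAST keyword that is still followed by a terminator.
        ' "(.*?)" is lazy, so the capture stops at the first terminator after it.
        Dim pattern As String = "^.*\b(?:REQUIRED|CNR) (.*?) (?:SVSE|SVC|SVSW|SVNE|SVNW) "

        Dim m As Match = Regex.Match(message, pattern)
        If m.Success Then
            ' Prints: 20 EXAMPLE RD EXAMPLE CREEK /EXAMPLE 1 RD //EXAMPLE 2 TRK
            Console.WriteLine(m.Groups(1).Value)
        Else
            Console.WriteLine("no keyword found")
        End If
    End Sub
End Module
```

    With both keywords in one alternation, whichever of them occurs last in the string wins. If one keyword should take priority over another regardless of position, separate patterns tried in order are probably easier to reason about.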
    HS3 PRO, Win10, WeatherXML, HSTouch, Pushover, UltraGCIR, Heaps of Jon00 Plugins, Just sold and about to move so very slim system.

    Facebook | Twitter | Flickr | Google+ | Website | YouTube

    #2
    I should ask: is there a better way of doing this? I am slowly starting to understand regex, so I figured that way might be better, but should I instead use something like InStr to check whether CNR is in the string and then do this, otherwise do that?

    If I have multiple phrases and want to match them in order of priority, what is the best way to do that? For example:

    If InStr(message, "CNR") > 0 Then
        ' do this regex
    ElseIf InStr(message, "REQUIRED") > 0 Then
        ' do that regex
    ElseIf InStr(message, "UNDEFINED FIRE") > 0 Then
        ' do something else regex
    Else
        ' global regex
    End If

    Is something like that the easiest way?
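
    The If/ElseIf chain can also be expressed as a list of patterns tried in priority order, which removes the separate InStr checks because a pattern that does not apply simply fails to match. This is a rough sketch, not a tested solution: `ExtractAddress` and the module name are hypothetical, the terminator list is taken from the patterns in this thread, keeping CNR inside the capture is a guess at what is wanted, and the last pattern (start at the first house number) is only a stand-in for the "global regex".

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module PriorityDemo
    ' Terminators taken from the patterns posted in this thread.
    Private Const Terminator As String = " (?:SVSE|SVC|SVSW|SVNE|SVNW|M) "

    ' Hypothetical helper: try each pattern in priority order and
    ' return capture group 1 of the first pattern that matches.
    Function ExtractAddress(message As String) As String
        Dim patterns As String() = {
            "\b(CNR .*?)" & Terminator,
            "^.*\bREQUIRED (.*?)" & Terminator,
            "^.*\bUNDEFINED FIRE (.*?)" & Terminator,
            "\b(\d+ .*?)" & Terminator
        }
        For Each p As String In patterns
            Dim m As Match = Regex.Match(message, p)
            If m.Success Then Return m.Groups(1).Value
        Next
        Return String.Empty
    End Function

    Sub Main()
        ' Prints: 20 EXAMPLE RD EXAMPLE CREEK /EXAMPLE 1 RD //EXAMPLE 2 TRK
        Console.WriteLine(ExtractAddress("ALERT ABBR1 F000000964 STRUC1 THERMAL IMAGING CAMERA REQUIRED CODE 1 ABBRSV REQUIRED 20 EXAMPLE RD EXAMPLE CREEK /EXAMPLE 1 RD //EXAMPLE 2 TRK SVC 0000 J7 (00000) ABBRSV [ABBR] "))

        ' Prints: 234 Example St EXAMPLE TOWN (falls through to the last pattern)
        Console.WriteLine(ExtractAddress("INCI1 CAR ACCIDENT POSSIBLE PERSONS TRAPPED 234 Example St EXAMPLE TOWN SVSW 6555 J1"))
    End Sub
End Module
```

    Changing the priority is then just a matter of reordering the array, and a new keyword is one more line rather than another ElseIf branch.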


      #3
      I always have trouble with regex since I use it so little. This helped me: http://www.ultrapico.com/expresso.htm


        #4
        Oh, how I hate regex!

        I have been trying to use regex to extract addresses from a highly variable string. If my fire department could put a fixed marker at the start of the address (e.g. LOC), I would be over the moon and would not have an issue in the world! As always, though, there is no start marker I can rely on, so I need to work out every possible permutation of an address to get this stupid thing to work!

        I am about 80% there, with a regex pattern so horrible it could make some people faint. However, I think I now have to move towards InStr checks to work out when a different regex pattern needs to be used. Here is an example:

        Regex Pattern:
        PHP Code:
                Dim regex As Regex = New Regex("( ([\d]+) | ([\d]+-[\d]+) | ([\d]+ - [\d]+) | CAR SMOULDERING | INPUT |INPUT| OFF | OPPOSITE |CNR|SPARKING |INCIC1 |INCIC3 |STRUC1 |STRUC3 |G&SC1 |G&SC3 |ALARC1 |ALARC3 |NOSTC1| NOSTC3 |RESCC1 |RESCC3 |HIARC1 |HIARC3|CAR ACCIDENT - POSS PERSON TRAPPED |EXPLOSIONS HEARD |WASHAWAY AS A RESULT OF ACCIDENT |ENTRANCE |ENT |LHS |RHS |POWER LINES ARCING AND SPARKING |SMOKE ISSUING FROM FAN |CAR FIRE |FIRE ALARM OPERATING |GAS LEAK |GAS PIPE |NOW OUT |ACCIDENT |SMOKING |ROOF |GAS |REQUIRED | FIRE |LOCKED IN CAR |SMOKE RISING |SINGLE CAR ACCIDENT |ACCIDENT|FIRE)(.*?)(?=\SVSE| M | SVC | SVSW | SVNE |SVNW )", RegexOptions.RightToLeft)
        The above works from right to left. I know the right-hand side will always contain SVSE, M, SVC, SVSW, SVNE, SVNW or something similar, so that gives me a reference point to start from. The regex then works to the left from there, looking for any of the listed phrases; when it finds one it stops, and the text in between is the string I want. For example, a call might be INCI1 CAR ACCIDENT POSSIBLE PERSONS TRAPPED 234 Example St EXAMPLE TOWN SVSW 6555 J1. It sees the CAR ACCIDENT POSSIBLE PERSONS TRAPPED phrase and works fine. It also looks for a number or CNR as the start of the street address, and that works fine too.
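
        To make the right-to-left behaviour concrete, here is a small comparison of the same pattern with and without RegexOptions.RightToLeft. It is a sketch under assumptions: the message is made up (it contains CNR twice on purpose), the pattern is cut down to a single keyword and a single terminator, and the module name is hypothetical.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module RightToLeftDemo
    Sub Main()
        ' Made-up message with two CNR keywords before the SVC terminator.
        Dim message As String = "G&SC3 CNR REPORTED CNR STREET A RD/EXAMPLE RD SUBURB SVC 000"
        Dim pattern As String = "(CNR )(.*?)(?= SVC )"

        Dim leftToRight As Match = Regex.Match(message, pattern)
        Dim rightToLeft As Match = Regex.Match(message, pattern, RegexOptions.RightToLeft)

        ' Default search: the match starts at the first CNR in the string.
        ' Prints: REPORTED CNR STREET A RD/EXAMPLE RD SUBURB
        Console.WriteLine(leftToRight.Groups(2).Value)

        ' Right-to-left search: starts at the terminator and stops at the
        ' nearest keyword to its left.
        ' Prints: STREET A RD/EXAMPLE RD SUBURB
        Console.WriteLine(rightToLeft.Groups(2).Value)
    End Sub
End Module
```

        One consequence, if I read the RightToLeft behaviour correctly, is that the order of the alternatives inside the big group does not set their priority: whichever keyword sits closest to the terminator ends the match.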

        What does not work well is a CNR with multiple streets separated by a /, for example CNR STREET A ST/STREET B ST EXAMPLE TOWN

        The issue I have is that my strings sometimes also contain cross streets, specified as /STREET 1 ST //STREET 2 ST. Is the best way to do an InStr to find a / with no spaces around it and, if one exists, use a different regex? If so, how can I identify a / with letters on either side of it, rather than a space or a symbol?
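
        A slash with a letter directly on each side can be detected with a small regex instead of InStr, since InStr can only find the slash itself and cannot check its neighbours. The following is a minimal sketch: `HasInlineSlash` and the module name are hypothetical, and the character class accepts both / and \ in case either one shows up in the real data.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module SlashDemo
    ' Hypothetical helper: True when a / or \ has a letter immediately
    ' before it and immediately after it (no space, digit or symbol).
    Function HasInlineSlash(message As String) As Boolean
        Return Regex.IsMatch(message, "(?<=[A-Za-z])[/\\](?=[A-Za-z])")
    End Function

    Sub Main()
        ' Prints True: "RD/EXAMPLE" has letters on both sides of the slash.
        Console.WriteLine(HasInlineSlash("HbXXXX3 G&SC3 SMOKE ISSUING CNR STREET A RD/EXAMPLE RD SUBURB M 000 J00"))

        ' Prints False: every slash here follows a space or another slash.
        Console.WriteLine(HasInlineSlash("20 EXAMPLE RD EXAMPLE CREEK /EXAMPLE 1 RD //EXAMPLE 2 TRK SVC 0000"))
    End Sub
End Module
```

        The result of that check can then decide which extraction pattern to run.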

        Below is an example containing RD/EXAMPLE, which I want to include. I would check whether there is a / with text on each side, and if so my regex might be as simple as the pattern shown after the example:

        PHP Code:
        HbXXXX3 G&SC3 SMOKE ISSUING CNR STREET A RD/EXAMPLE RD SUBURB M 000 J00 (000000F0000000 CXXXX CXXXX 
        Regex Pattern:
        PHP Code:
                Dim regex As Regex = New Regex("(CNR)(.*?)(?=\SVSE|SVC |SVSW |SVNE |SVNW )", RegexOptions.RightToLeft)

        Thanks for your help!