Re: finding links with RegEx?

To: REALbasic Network Users Group <realbasic-nug at lists dot realsoftware dot com>
Subject: Re: finding links with RegEx?
From: Jerry Hamlet <jerry at hamletzone dot com>
Date: Wed, 27 Feb 2002 17:36:02 -0800
On 2/27/02 5:20 PM, "Thomas Reed" <thomasareed at earthlink dot net> wrote:

> Okay, with all the suggestions, I've put together a regular expression
> that appears to work -- but as I don't feel 100% comfortable with
> building regular expressions, I'd like to run it by folks and see if
> anyone can find any problems with it.  Here's what I'm doing:
> 
> aRegEx.SearchPattern = "<[^>]*(src|background|href)[\s\n]*=[\s\n]*""?([^\s""]+)[\s""][^>]*>"

Don't need the double double-quotes, just one will do. \s implies \n. Using
[^>]* won't allow for ">" within a quoted attribute value (extremely rare,
but I use them in developing HTML templates to run through Perl.)
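
A quick sketch of those two points, in Python rather than REALbasic since
the pattern syntax carries over. The names ORIGINAL, SIMPLIFIED and
find_links are made up for illustration, and the sample HTML is invented:

```python
import re

# Thomas's pattern, with the REALbasic "" escapes written as plain quotes.
ORIGINAL = re.compile(
    r'<[^>]*(src|background|href)[\s\n]*=[\s\n]*"?([^\s"]+)[\s"][^>]*>')

# Same pattern with the redundant \n dropped: \s already matches newlines.
SIMPLIFIED = re.compile(
    r'<[^>]*(src|background|href)\s*=\s*"?([^\s"]+)[\s"][^>]*>')

def find_links(pattern, html):
    """Return (attribute, value) pairs for every match of pattern in html."""
    return [(m.group(1), m.group(2)) for m in pattern.finditer(html)]

# Newlines around "=" are handled by \s alone.
sample = '<a href\n=\n"page.html">x</a> <img src="pic.gif">'
print(find_links(ORIGINAL, sample))
print(find_links(SIMPLIFIED, sample))

# A ">" inside a quoted value ahead of the link attribute stops [^>]*
# before it ever reaches src, so this tag yields no match at all.
tricky = '<img alt="a > b" src="pic.gif">'
print(find_links(SIMPLIFIED, tricky))
```

Both patterns return the same pairs on the first sample, and both miss the
link in the second one.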

The [^>]* operators are EXTREMELY GREEDY and will probably bog down your
searches. You might want to try the two-problem method against this one for
speed, see which one wins :)
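
The other method isn't spelled out in this message, so the following is only
one reading of it: first isolate each tag, then pull the link attributes out
of that tag. It is sketched in Python, and the names TAG, ATTR and
find_links_two_step are hypothetical:

```python
import re

# Step 1 (assumed): grab whole tags, letting quoted values contain ">".
TAG = re.compile(r'<(?:"[^"]*"|\'[^\']*\'|[^>"\'])*>')

# Step 2 (assumed): pull link attributes out of a single tag.
# The value may be double-quoted, single-quoted, or unquoted.
ATTR = re.compile(
    r'\b(src|background|href)\s*=\s*(?:"([^"]*)"|\'([^\']*)\'|([^\s>"\']+))',
    re.IGNORECASE)

def find_links_two_step(html):
    """Return (attribute, value) pairs found inside each tag of html."""
    links = []
    for tag in TAG.finditer(html):
        for m in ATTR.finditer(tag.group(0)):
            # Exactly one of the three value groups is set per match.
            value = m.group(2) or m.group(3) or m.group(4) or ""
            links.append((m.group(1).lower(), value))
    return links

print(find_links_two_step('<img alt="a > b" src="pic.gif"> <a HREF=page.html>x</a>'))
```

Because each alternative in TAG starts with a different character, the
matcher has little to backtrack over, which is the reason to expect it to
hold up on large pages. Timing it against the single pattern on real input
is still the only way to know which one wins.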

-jerry

> 
> -Thomas
> 
> Personal web page:                 http://home.earthlink.net/~thomasareed/
> My shareware:            http://home.earthlink.net/~thomasareed/shareware/
> Pixel Pen web pub. guide: http://home.earthlink.net/~thomasareed/pixelpen/
> 
> Any closet is a walk-in closet if you try hard enough.
> 
> 
> ---
> Subscribe to the digest:
> <mailto:realbasic-nug-digest at lists dot realsoftware dot com>
> Unsubscribe:
> <mailto:realbasic-nug-off at lists dot realsoftware dot com>
> 


+---------------------------+
| Jerry Hamlet              |
| Web Designer/Programmer   |
| www.hamletzone.com        |
| jerry at hamletzone dot com      |
+---------------------------+




