
[Oops]Re: finding links with RegEx?

To: REALbasic Network Users Group <realbasic-nug at lists dot realsoftware dot com>
Subject: [Oops]Re: finding links with RegEx?
From: Jerry Hamlet <jerry at hamletzone dot com>
Date: Wed, 27 Feb 2002 17:16:36 -0800

On 2/27/02 5:09 PM, "Jerry Hamlet" <jerry at hamletzone dot com> wrote:

> On 2/27/02 3:40 PM, "Thomas Reed" <thomasareed at earthlink dot net> wrote:
> 
>>> Off the top of my head:
>>> 
>>> (src|background|href)="?([^\s"]+)[\s"]
>> 
>> This works, and seems to work well, except for a couple things.
>> 
>> First, it doesn't handle line breaks well.  For example, suppose there's
>> a break like this:
>> 
>> ............................... SRC =
>> "somelink.html" ...
>> 
>> This would be skipped, but I need to match it.  (Handling returns has
>> always been a problem for me in regular expressions before.)
>> 
>> Second, I want to make sure this only matches if the text in question is
>> somewhere inside a pair of angle brackets.
>> 
>> For example, I don't want to match the link in a case like this:
>> 
>> <P>To display a link on your web page, do the following:</P>
>> 
>> <P>&lt;A HREF="samplelink.html"&gt;</P>
>> 
>> How can I extend this to handle these two conditions?
>> 
> 
> Separate your problem into two:
> 
> 1) Find all opening tags and put them in an array:
>   Search Pattern = <(\w+)(\s[\w-]+(\s?=\s?"?([^\s"]+)[\s"]?)?)*>
>   \1 = tag name (use it to check whether you really want to add the tag
>   to your array.)
>   
> 
> 2) Loop through the array, applying the first search string to each tag
> (modified below to allow whitespace around the equals sign):
>   (src|background|href)\s?=\s?"?([^\s"]+)[\s">]

Correction: the step 2 pattern should be:

  (src|background|href)\s?=\s?"?([^\s"]+)[\s"]?

-jerry

>   \2 = file path
> 
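For anyone who wants to try the two-step idea outside a text editor, here is a rough sketch in Python's re module (not REALbasic's RegEx class, which may behave differently). The names find_links, TAG_RE, LINK_RE and SAMPLE are made up for illustration, and re.IGNORECASE is my own assumption so that uppercase SRC and HREF match. Only quoted attribute values are exercised here.

```python
import re

# Step 1: the opening-tag pattern from the message, unchanged.
TAG_RE = re.compile(r'<(\w+)(\s[\w-]+(\s?=\s?"?([^\s"]+)[\s"]?)?)*>')

# Step 2: the corrected attribute pattern from the message.
# IGNORECASE is an assumption added here, not part of the original pattern.
LINK_RE = re.compile(r'(src|background|href)\s?=\s?"?([^\s"]+)[\s"]?',
                     re.IGNORECASE)


def find_links(html):
    """Return file paths found in src/background/href attributes of real tags."""
    links = []
    # Step 1: collect every opening tag (the "array" in the message).
    for tag in TAG_RE.finditer(html):
        # Step 2: search inside each tag only; group 2 is the file path.
        for attr in LINK_RE.finditer(tag.group(0)):
            links.append(attr.group(2))
    return links


# The two cases from the question: an escaped sample tag that should be
# ignored, and an attribute broken across a line after the equals sign.
SAMPLE = (
    '<P>To display a link on your web page, do the following:</P>\n'
    '<P>&lt;A HREF="samplelink.html"&gt;</P>\n'
    '<IMG SRC =\n"somelink.html">\n'
    '<A HREF="page.html">\n'
)

if __name__ == "__main__":
    print(find_links(SAMPLE))
```

Because step 2 only ever sees text that step 1 accepted as a tag, the &lt;A HREF=...&gt; line is skipped, and since \s also matches a line break, the SRC split across two lines is still picked up.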
> 
> This works in BBEdit. I haven't tested it in RB yet; RB handles RegEx
> differently...
> 
> -jerry
> 
> +---------------------------+
> | Jerry Hamlet              |
> | Web Designer/Programmer   |
> | www.hamletzone.com        |
> | jerry at hamletzone dot com      |
> +---------------------------+






