On 2/27/02 5:09 PM, "Jerry Hamlet" <jerry at hamletzone dot com> wrote:
> On 2/27/02 3:40 PM, "Thomas Reed" <thomasareed at earthlink dot net> wrote:
>
>>> Off the top of my head:
>>>
>>> (src|background|href)="?([^\s"]+)[\s"]
>>
>> This works, and seems to work well, except for a couple things.
>>
>> First, it doesn't handle line breaks well. For example, suppose there's
>> a break like this:
>>
>> ............................... SRC =
>> "somelink.html" ...
>>
>> This would be skipped, but I need to match it. (Handling returns in
>> regular expressions has always been a problem for me.)
>>
>> Second, I want to make sure this only matches if the text in question is
>> somewhere inside a pair of angle brackets.
>>
>> For example, I don't want to match the link in a case like this:
>>
>> <P>To display a link on your web page, do the following:</P>
>>
>> <P><A HREF="samplelink.html"></P>
>>
>> How can I extend this to handle these two conditions?
>>
>
> Separate your problem into two:
>
> 1) Find all opening tags, put them in an array:
> Search Pattern = <(\w+)(\s[\w-]+(\s?=\s?"?([^\s"]+)[\s"]?)?)*>
> \1 = tag name (use it to check whether you really want to add the tag to
> your array)
>
>
> 2) Loop through the array looking for the first search string (modified
> below to allow whitespace around the equal sign):
> (src|background|href)\s?=\s?"?([^\s"]+)[\s">]
Correction, that last pattern should be: (src|background|href)\s?=\s?"?([^\s"]+)[\s"]?
-jerry
> \2 = file path
>
>
> This works in BBEdit. I haven't tested it in RB yet; RB handles RegEx
> differently...
>
> -jerry
>
> +---------------------------+
> | Jerry Hamlet |
> | Web Designer/Programmer |
> | www.hamletzone.com |
> | jerry at hamletzone dot com |
> +---------------------------+