On 02/28/2002 14:13, "Joel Rosenblum" <pyropostal at mac dot com> wrote:
>> This will enable you to catch also flash <param name=movie
>> value="xxx.swf">:
>
> But, the value parameter is common for tags that don't use URLs, so you
> may have a value=true.
That's why I said it would have to be treated separately. You actually have
to check for name=movie.
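A rough sketch of that separate check, in Python rather than the language used in this thread. The names PARAM_MOVIE and movie_urls are made up for the illustration, and the pattern assumes that name= comes before value= inside the tag:

```python
import re

# Hypothetical pattern: accept value= only inside a <param> tag whose
# name attribute is "movie". Assumes name= appears before value=.
PARAM_MOVIE = re.compile(
    r"""<\s*param\s+[^>]*?\bname\s*=\s*["']?movie["']?\s+[^>]*?\bvalue\s*=\s*["']?([^"'\s>]+)""",
    re.IGNORECASE,
)

def movie_urls(html):
    """Return the value= contents of every <param name=movie ...> tag."""
    return PARAM_MOVIE.findall(html)

print(movie_urls('<param name=movie value="xxx.swf">'))  # ['xxx.swf']
print(movie_urls('<param name=loop value=true>'))        # []
```

With this, value=true on an unrelated param is never picked up, because the tag fails the name=movie test first.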
> Also, what does this mean: "(?blah)" ? Does that
> mean you don't care if you find (blah) or not? Why not use "(blah)?" ?
It is not (?blah) but (?=blah), which is a lookahead: it tests whether blah
comes next without consuming it. See J. Friedl's book (pp. 228 ff.).
As I said before, it is optional; it was added to speed up matching.
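To make the difference between (?=blah) and (blah)? concrete, here is a small Python sketch. GUARDED, UNGUARDED and tag_start are names invented for the example, and case-insensitive matching is an assumption on my part:

```python
import re

# (?=[ABIP]) peeks at the next character without consuming it, so \w+
# still matches the whole tag name afterwards.
GUARDED = re.compile(r"<\s*(?=[ABIP])\w+", re.IGNORECASE)
UNGUARDED = re.compile(r"<\s*\w+", re.IGNORECASE)

def tag_start(pattern, text):
    """Return the text matched at the start of `text`, or None."""
    m = pattern.match(text)
    return m.group(0) if m else None

print(tag_start(GUARDED, "<img src=x.gif>"))  # <img
print(tag_start(GUARDED, "<div>"))            # None (rejected at the first letter)
print(tag_start(UNGUARDED, "<div>"))          # <div

# Lookahead versus optional group:
print(tag_start(re.compile(r"<(?=img)"), "<img>"))  # <   (img is tested, not consumed)
print(tag_start(re.compile(r"<(img)?"), "<img>"))   # <img (img is consumed if present)
print(tag_start(re.compile(r"<(img)?"), "<div>"))   # <   (img absent, still a match)
```

So (blah)? never makes a match fail, while (?=blah) does when blah is missing. That early failure on tags that cannot carry a URL is where any speed-up would come from.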
>> <\s*(?=[ABIP])\w+\s+
>> ([-=\w./:@&';\(\)%]+\s+)*(?=[VSBH])(src|background|href|
>> value)\s*=\s*[""']?([-=?\w./:@&';\(\)%]+)[^>]*>
>
> Btw, there are URLs with spaces in them which work in web browsers, so
> don't be so strict.
Spaces should be encoded as %20, hence the % in the pattern ;-)
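For anyone who wants to try it, here is the pattern above transcribed to Python as a sketch. The doubled quote is written as a single ", case-insensitive matching is assumed, and URL_ATTR and first_url are names chosen for the example:

```python
import re
from urllib.parse import quote

# The pattern from this thread, split where the mail wrapped it.
URL_ATTR = re.compile(
    r"<\s*(?=[ABIP])\w+\s+"
    r"([-=\w./:@&';\(\)%]+\s+)*(?=[VSBH])(src|background|href|"
    r"""value)\s*=\s*["']?([-=?\w./:@&';\(\)%]+)[^>]*>""",
    re.IGNORECASE,
)

def first_url(html):
    """Return the URL captured from the first matching tag, or None."""
    m = URL_ATTR.search(html)
    return m.group(3) if m else None

print(first_url('<img src="my%20file.gif">'))  # my%20file.gif
print(first_url('<img src="my file.gif">'))    # my (cut at the raw space)
print(first_url('<img src="' + quote("my file.gif") + '">'))  # my%20file.gif
```

A URL with a raw space still matches, but the capture stops at the space, so such URLs would have to be encoded first or the character class widened.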
--
Didier Barbas
Dilettante programmer and linguist
http://ww.sungnyemun.org