realbasic-nug

Re: finding links with RegEx?

To: REALbasic Network Users Group <realbasic-nug at lists dot realsoftware dot com>
Subject: Re: finding links with RegEx?
From: Didier BARBAS <lists at sungnyemun dot org>
Date: Thu, 28 Feb 2002 11:50:44 +0900
Okay, so I thought I would contribute with a little regEx of my own...
<\s*(?=[ABI])\w+\s+(?=[SBH])(src|background|href)\s*=\s*[""']?([-=?\w./:@&';\(\)%]+)[^>]*>
It may not catch everything, although it is trying hard...
The focus is put on the three main types of links:
A HREF
BODY BACKGROUND
IMG SRC
The positive lookaheads are not necessary, but they speed things up a little.
The second subexpression, ([-=?\w./:@&';\(\)%]+), is an attempt to catch every
character a link may contain... but not too much!
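
To make the two subexpressions concrete, here is a minimal sketch of what the
pattern should and should not pick up. It assumes a window with a listBox1, as
in the sample program; the test tags and the example.com URL are made up for
illustration only.

  dim rg as regex
  dim m as regexMatch
  dim tags(4) as string
  dim i as integer
  
  // made-up test tags (hypothetical, not from any real page)
  tags(0)="<A HREF=""http://www.example.com/index.html"">"
  tags(1)="<BODY BACKGROUND=""images/bg.gif"">"
  tags(2)="<img src=logo.gif width=32>"
  // attribute is not the first one after the tag name
  tags(3)="<IMG ALT=""logo"" SRC=""logo.gif"">"
  // single-quoted value
  tags(4)="<a href='page.html'>"
  
  rg=new regex
  rg.searchPattern="<\s*(?=[ABI])\w+\s+(?=[SBH])(src|background|href)\s*=\s*[""']?([-=?\w./:@&';\(\)%]+)[^>]*>"
  rg.options.caseSensitive=false
  
  listBox1.deleteAllRows
  for i=0 to 4
    m=rg.search(tags(i))
    if m<>nil then
      // subexpression 1 is the attribute name, 2 is the link
      listBox1.addRow m.subExpressionString(1)+": "+m.subExpressionString(2)
    else
      listBox1.addRow "no match: "+tags(i)
    end if
  next

The first three tags should give the attribute name and the bare link. The
fourth should not match at all, because the pattern expects src, background or
href to come right after the tag name. The fifth should match but keep the
closing apostrophe in the link (page.html'), since ' is one of the characters
allowed in the second subexpression.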

Sample program:

  dim rg as regex
  dim m as regexMatch
  dim s as string
  dim k as integer
  dim t1,t2 as double
  
  // the HTML source to scan
  s=editField1.text
  
  rg=new regex
  rg.searchPattern="<\s*(?=[ABI])\w+\s+(?=[SBH])(src|background|href)\s*=\s*[""']?([-=?\w./:@&';\(\)%]+)[^>]*>"
  rg.options.caseSensitive=false
  t1=microseconds
  m=rg.search(s)
  listBox1.deleteAllRows
  while m<>nil
    k=k+1
    // subexpression 2 is the link itself
    listBox1.addRow m.subExpressionString(2)
    // resume the search right after the end of the current match
    m=rg.search(s,len(m.subExpressionString(0))+m.SubExpressionStart(0))
  wend
  // elapsed time in microseconds, then the number of links found
  t2=microseconds
  t2=t2-t1
  staticText1.text=str(t2)
  staticText2.text=str(k)+" matches"

HTH

-- 
Didier Barbas
Dilettante programmer and linguist
http://ww.sungnyemun.org


On 02/28/2002 10:20, "Thomas Reed" <thomasareed at earthlink dot net> wrote:

> Okay, with all the suggestions, I've put together a regular expression
> that appears to work -- but as I don't feel 100% comfortable with
> building regular expressions, I'd like to run it by folks and see if
> anyone can find any problems with it.  Here's what I'm doing:
> 
> aRegEx.SearchPattern = "<[^>]*(src|background|href)[\s\n]*=[\s\n]*""?([^\s""]+)[\s""][^>]*>"
> 
> -Thomas
> 
> Personal web page:                 http://home.earthlink.net/~thomasareed/
> My shareware:            http://home.earthlink.net/~thomasareed/shareware/
> Pixel Pen web pub. guide: http://home.earthlink.net/~thomasareed/pixelpen/
> 
> Any closet is a walk-in closet if you try hard enough.


