Re: RegEx question

To: REALbasic NUG <realbasic-nug at lists dot realsoftware dot com>
Subject: Re: RegEx question
From: Charles Yeomans <charles at declareSub dot com>
Date: Fri, 31 Aug 2007 11:51:47 -0400
References: <46D74652 dot 4070705 at stny dot rr dot com> <D76587F6-C882-49DB-9122-A1606318C376 at declareSub dot com> <46D748C2 dot 7030105 at stny dot rr dot com> <46D74A06 dot 8080707 at stny dot rr dot com> <683B6611-9F70-4FD1-9AA6-85E45038C69B at declareSub dot com> <46D75154 dot 6060802 at stny dot rr dot com>
On Aug 30, 2007, at 7:23 PM, Tom Russell wrote:

> Charles Yeomans wrote:
>> On Aug 30, 2007, at 6:51 PM, Tom Russell wrote:
>>
>>
>>> Tom Russell wrote:
>>>
>>>> Charles Yeomans wrote:
>>>>
>>>>
>>>>> On Aug 30, 2007, at 6:36 PM, Tom Russell wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> I need to parse some info from a web page but I'm not sure how to
>>>>>> regEx it.
>>>>>>
>>>>>> Example would be:
>>>>>>
>>>>>> href="http://myworld.ebay.com/xxxxxxxxx/">
>>>>>>
>>>>>> But I only need the stuff in the quotes.
>>>>>>
>>>>>> I assume my search pattern would be something like:
>>>>>> rg.SearchPattern="href="http:(\D+)>"
>>>>>>
>>>>>> Would this be correct?
>>>>>>
>>>>>>
>>>>>>
>>>>> What happened when you tried it?
>>>>>
>>>>> Charles Yeomans
>>>>> _______________________________________________
>>>>>
>>>>>
>>>> I'm getting a syntax error. I think my quotes are screwy.
>>>>
>>> Fixed the quote issue. Found out I'm doing my regex before the
>>> content loaded into my socket.
>>>
>>> It would probably be better to load my content into a string array
>>> rather than use the string itself? When I do my regex it pulls the
>>> first one it finds, and it's not even the one I want.
>>>
>>> Plus I'm getting only part of what I want: //www.ebay.com"
>>>
>>
>> If you want to include quotes as part of the search string, use hex
>> -- \x22 is ".
>>
>> For what you want, I'd suggest starting with the search pattern
>>
>> \x22(.+)\x22
>>
>> and refine as needed.  match.SubexpressionString(1) should give the
>> part inside the quotes.
>>
>>
> I am not sure what you mean here. Instead of using quotes, use this?
> And wrap my stuff in that?
>
> \x22(.+)href="http:(\D+)>\x22


\x22(.+)\x22  is the same as "(.+)"

Using the hex representation of " makes it clear which " are part of
the regular expression and which delimit the REALbasic string
literal.

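This regular expression matches any text that starts with " and ends
with ". Because .+ is greedy by default, on a line with several
quoted strings the match runs from the first " to the last one, which
is one reason to refine the pattern.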
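
As a rough sketch of one possible refinement (not the only way to do
it): anchor the pattern on href= and replace .+ with [^\x22]+, so the
capture stops at the next quote. The names "source" and "urls" are
made up for illustration, and the sample string stands in for
whatever the socket actually returns.

```realbasic
// Hypothetical input; in practice this would be the page content,
// read only after the socket has finished loading it.
Dim source As String = "<a href=""http://myworld.ebay.com/xxxxxxxxx/"">x</a>"
Dim urls() As String

Dim rg As New RegEx
// \x22 stands for a double quote. [^\x22]+ matches up to, but not
// including, the next quote, so the capture cannot run past it.
rg.SearchPattern = "href=\x22([^\x22]+)\x22"

Dim match As RegExMatch = rg.Search(source)
While match <> Nil
  // Subexpression 1 is the text between the quotes.
  urls.Append match.SubexpressionString(1)
  // Calling Search with no arguments continues after the last match.
  match = rg.Search
Wend
```

Looping like this collects every href value in the string, so you can
pick out the one you want instead of taking whatever the first match
happens to be.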

Charles Yeomans

