
Working with bytes vs characters

To: "REALbasic NUG" <realbasic-nug@lists.realsoftware.com>
Subject: Working with bytes vs characters
From: "Paul Rehill" <paul.rehill@gmail.com>
Date: Sun, 28 Sep 2008 15:25:52 +1000
I am trying to write code that calculates the number of bytes, rather
than characters, to read from incoming socket data.

I want to read data up to the end of some closing tag, let's call it
Str1="</sometag>", which I locate in the data returned by the socket's
Lookahead function.  sLook is the full string obtained from Lookahead.

If I were working only with characters and not bytes, I could use the
following code to determine iTagEnd, the number of characters for the
socket's Read function to read:
    iFoundTag=Instr(sLook, Str1)
    iTagLength=Len(Str1)
    iTagEnd=iFoundTag+iTagLength-1

But since I am working with bytes, I have replaced Instr with InstrB,
Len with LenB, and the 1 in the above code with LenB(Left(Str1,1)), as
in the following code:

    iFoundTag=InstrB(sLook, Str1)
    iTagLength=LenB(Str1)
    iTagEnd=iFoundTag+iTagLength-LenB(Left(Str1,1))
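
As a cross-check on the byte arithmetic above, here is a minimal sketch
in Python (not REALbasic), assuming that InstrB returns a 1-based byte
position and LenB returns a byte count.  The function names and the
sample string are hypothetical.  Under those assumptions, subtracting 1
seems to give the right byte count in either encoding, while
subtracting the byte width of the tag's first character comes up one
byte short in UTF-16.

```python
def bytes_to_read(look: bytes, tag: bytes) -> int:
    """Bytes to read so the read ends on the tag's last byte (0 if absent)."""
    found = look.find(tag) + 1  # 1-based byte position, mirroring InstrB
    if found == 0:
        return 0
    return found + len(tag) - 1


def bytes_to_read_first_char_width(look_text: str, tag_text: str,
                                   encoding: str) -> int:
    """Same search, but subtracting the byte width of the tag's first character."""
    look, tag = look_text.encode(encoding), tag_text.encode(encoding)
    found = look.find(tag) + 1
    if found == 0:
        return 0
    # Mirrors LenB(Left(Str1,1)): 1 byte in UTF-8 for "<", 2 bytes in UTF-16.
    first_char_width = len(tag_text[:1].encode(encoding))
    return found + len(tag) - first_char_width


if __name__ == "__main__":
    # Hypothetical data; \u00e9 is a non-ASCII character (e with acute accent).
    sample, tag = "<a>h\u00e9llo</sometag>rest", "</sometag>"
    for enc in ("utf-8", "utf-16-le"):
        data = sample.encode(enc)
        minus_one = bytes_to_read(data, tag.encode(enc))
        minus_width = bytes_to_read_first_char_width(sample, tag, enc)
        print(enc, minus_one, minus_width, data[:minus_one].decode(enc))
```

The position and the tag length are both already measured in bytes, so
the "-1" only converts a 1-based start position into an inclusive end
position and does not depend on how wide a character is.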

Then I would use code like Me.Read(iTagEnd, UTF8Encod) to read in the
specified number of bytes.  I suspect (from a discussion in an earlier
thread) that since I'm using UTF8 encoding on the strings that are
written or read, the two code blocks above are equivalent (is 1
character = 1 byte always in UTF8 encoding?).  But I think the second
block is more robust, since it can cater for strings where a different
encoding is specified (say UTF16).
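
On the "1 character = 1 byte" question, a quick check sketched in
Python (not REALbasic, and the helper names are hypothetical) suggests
the assumption holds only for ASCII characters.  UTF-8 uses between 1
and 4 bytes per character, so the character-based and byte-based counts
agree only when the data is pure ASCII.

```python
def utf8_byte_width(ch: str) -> int:
    """Number of bytes a single character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


def same_length_in_utf8(text: str) -> bool:
    """True when the character count equals the UTF-8 byte count."""
    return len(text.encode("utf-8")) == len(text)


if __name__ == "__main__":
    # A, e-acute, euro sign, and an emoji, written as escapes for clarity.
    for ch in ("A", "\u00e9", "\u20ac", "\U0001F600"):
        print(repr(ch), utf8_byte_width(ch))
    print(same_length_in_utf8("</sometag>"))     # ASCII-only tag
    print(same_length_in_utf8("caf\u00e9"))      # contains a 2-byte character
```

So a tag such as "</sometag>" is the same length either way, but any
non-ASCII text before it in the buffer shifts its byte position away
from its character position.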

Does this look OK?

Thanks in advance

Paul
