
Re: Working with bytes vs characters

To: REALbasic NUG <realbasic-nug@lists.realsoftware.com>
Subject: Re: Working with bytes vs characters
From: Joe Strout <joe@inspiringapps.com>
Date: Sun, 28 Sep 2008 07:19:35 -0600
On Sep 27, 2008, at 11:25 PM, Paul Rehill wrote:

I am trying to write code that calculates the number of bytes, rather
than characters, to read from incoming socket data.

Simply use the -B versions of all the string functions (InStrB, MidB, etc.).

If I were working only with characters and not bytes, I could use the
following code to determine iTagEnd, which represents the number of
characters to be read by a socket's read function:
   iFoundTag = InStr(sLook, Str1)
   iTagLength = Len(Str1)
   iTagEnd = iFoundTag + iTagLength - 1

But since I am working with bytes, I have replaced InStr with InStrB,
Len with LenB, and the 1 in the above code with LenB(Left(Str1,1)), as
in the following code:

   iFoundTag = InStrB(sLook, Str1)
   iTagLength = LenB(Str1)
   iTagEnd = iFoundTag + iTagLength - LenB(Left(Str1,1))

Hmm, that's working too hard. Just subtract one as you did before. iTagEnd isn't a character position; it's the number of bytes you want to read, so the byte length of the first character is irrelevant. You need to subtract one only because the position returned by InStr (or InStrB) is 1-based rather than 0-based. If Str1 happened to start with a multi-byte character, subtracting LenB(Left(Str1,1)) would actually leave the count short.
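
A rough sketch of that arithmetic, written in Python rather than REALbasic so it can be run anywhere. The helper names instr_b and bytes_through_tag are hypothetical, instr_b only mimics the 1-based, 0-if-absent convention of InStrB, and the sample data and tag are made up for illustration.

```python
def instr_b(source: bytes, find: bytes) -> int:
    """1-based byte position of find in source, or 0 if absent
    (mimics the InStrB convention; hypothetical helper)."""
    return source.find(find) + 1  # bytes.find is 0-based and returns -1 if absent


def bytes_through_tag(source: bytes, tag: bytes) -> int:
    """Number of bytes from the start of source through the end of tag."""
    pos = instr_b(source, tag)
    if pos == 0:
        return 0  # tag not present yet, nothing to read
    return pos + len(tag) - 1  # subtract one only because pos is 1-based


# Made-up sample: both the data and the tag contain multi-byte characters.
look = "na\u00efve data\u00abEND\u00bbtrailing".encode("utf-8")
tag = "\u00abEND\u00bb".encode("utf-8")

count = bytes_through_tag(look, tag)

# The variant from the question subtracts the byte length of the tag's
# first character instead of 1, which comes up short here.
short_count = instr_b(look, tag) + len(tag) - len("\u00ab".encode("utf-8"))

print(count, look[:count].endswith(tag))
print(short_count, look[:short_count].endswith(tag))
```

With a tag that starts with a plain ASCII character the two formulas agree, which is why the difference is easy to miss in testing.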

Then I would use code like Me.Read(iTagEnd, UTF8Encod) to read in the
specified bytes.

Right.
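
As an illustration of reading a byte count and then interpreting the result as UTF-8, here is a small Python sketch. It is not the REALbasic socket API: io.BytesIO stands in for the socket's buffer, and read_bytes_as_text is a hypothetical helper name.

```python
import io


def read_bytes_as_text(stream: io.BufferedIOBase, byte_count: int) -> str:
    """Read exactly byte_count bytes (if available) and decode them as UTF-8."""
    raw = stream.read(byte_count)
    return raw.decode("utf-8")


# Stand-in for incoming socket data: "caf\u00e9" is 4 characters but 5 bytes,
# and "</tag>" is 6 bytes, so the tag ends at byte 11.
buffer = io.BytesIO("caf\u00e9</tag>rest".encode("utf-8"))

text = read_bytes_as_text(buffer, 11)
remaining = buffer.read()  # whatever follows the tag stays for the next read

print(len(text), remaining)
```

If the byte count landed in the middle of a multi-byte character, the decode step in this sketch would raise UnicodeDecodeError, which is one more reason to compute the count in bytes from start to finish.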

I suspect (from a discussion in an earlier thread) that since I'm
using UTF-8 encoding on strings that are written or read, the above
two code blocks are equivalent (is 1 character always 1 byte in
UTF-8?).

Not always. One character is one byte in UTF-8 only for ASCII characters, and there are only 128 of those; there are a lot more characters than that in the world, and each of the others takes two to four bytes.
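
A quick way to see the difference, sketched in Python for convenience. The helper name utf8_len and the sample characters are chosen purely for illustration.

```python
def utf8_len(text: str) -> int:
    """Number of bytes text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


# One character each: ASCII letter, accented letter, euro sign, emoji.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

for ch in samples:
    # Character count is always 1 here; the byte count varies.
    print(repr(ch), len(ch), utf8_len(ch))
```

Only the first sample, the ASCII letter, has matching character and byte counts.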

Best,
- Joe

--
Joe Strout
Inspiring Applications, Inc.
http://www.InspiringApps.com




