As was recently suggested, I've decided to start a FAQ on text
encoding in REALbasic 5. I'm going to let this be driven by actual
questions. Here's what I have so far, based only on posts I saw
today. If you have more questions, please feel free to ask and I'll
add the answers to the FAQ.
Best,
- Joe
Frequently Asked Questions
about Text Encoding in REALbasic 5
----------------------------------
1. What encoding are my string literals, constants, etc. in?
All strings in your REALbasic project are compiled as UTF-8.
This is a Unicode encoding that uses one byte for each ASCII character
and two to four bytes for each non-ASCII character. It has a number of
other handy properties too. For example, the byte for an ASCII
character never appears as part of a multi-byte character.
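To see those byte counts for yourself, here is a minimal sketch. It assumes Len counts characters and LenB counts bytes for a string whose encoding is defined, and the variable names are just for illustration:

```realbasic
  Dim a, c As String
  a = Encodings.UTF8.Chr(65)     // capital A, an ASCII character
  c = Encodings.UTF8.Chr(169)    // copyright symbol, not in ASCII

  // Len counts characters; LenB counts bytes.
  MsgBox "A: " + Str(Len(a)) + " char, " + Str(LenB(a)) + " byte(s)"
  MsgBox "(c): " + Str(Len(c)) + " char, " + Str(LenB(c)) + " byte(s)"

  // Look at the individual bytes of the copyright symbol.
  // In UTF-8 every byte of a multi-byte character is 128 or above,
  // so none of them can be mistaken for an ASCII character.
  MsgBox Str(AscB(MidB(c, 1, 1))) + ", " + Str(AscB(MidB(c, 2, 1)))
```

The capital A should come out as one byte and the copyright symbol as two, with both of its byte values above 127.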
2. Which is faster, ConvertEncoding or TextConverter.Convert?
In most cases, ConvertEncoding is much faster than using
TextConverter.Convert. ConvertEncoding has a number of optimizations
for common cases, such as converting the same string multiple times,
or converting from one superset of ASCII to another. (All
WorldScript encodings, most Windows encodings, and UTF-8 are
supersets of ASCII.)
So you should usually use ConvertEncoding. If speed really matters,
though, measure it both ways and see which performs better in your
particular situation.
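One rough way to measure it both ways is sketched below. The sample text, the target encoding (MacRoman), and the loop count are placeholders I picked for illustration; substitute data that looks like what your program really converts, since converting the same short string over and over may flatter one approach.

```realbasic
  Dim source, result As String
  Dim conv As TextConverter
  Dim i As Integer
  Dim startTime, viaFunction, viaConverter As Double

  source = "Some sample text to convert"   // placeholder test data (UTF-8 literal)

  // Time ConvertEncoding.
  startTime = Microseconds
  For i = 1 To 10000
    result = ConvertEncoding(source, Encodings.MacRoman)
  Next
  viaFunction = Microseconds - startTime

  // Time TextConverter.Convert, creating the converter once up front.
  conv = GetTextConverter(Encodings.UTF8, Encodings.MacRoman)
  startTime = Microseconds
  For i = 1 To 10000
    result = conv.Convert(source)
  Next
  viaConverter = Microseconds - startTime

  MsgBox "ConvertEncoding: " + Str(viaFunction) + " microseconds" _
    + Chr(13) + "TextConverter: " + Str(viaConverter) + " microseconds"
```

Treat the numbers as a rough guide only; run it a few times and with strings of realistic length before drawing conclusions.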
3. How do I get a specific byte into a string?
Use ChrB. ChrB takes a byte value (0-255) and returns a string with
undefined encoding, containing exactly that byte. You can build a
string containing multiple bytes by concatenating these with +.
Of course, don't expect such a string to display as text in any
sensible way. If you want to make text, see the next question.
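As a small sketch of building a byte string this way (the byte values here are arbitrary, chosen only for illustration):

```realbasic
  Dim s As String

  // Build a three-byte string one byte at a time.
  s = ChrB(1) + ChrB(255) + ChrB(128)

  // LenB reports the number of bytes in the string.
  MsgBox "Bytes: " + Str(LenB(s))

  // AscB reads back the value of the first byte.
  MsgBox "First byte: " + Str(AscB(s))
```

This is handy for binary data such as file headers or protocol packets, where you care about the exact bytes rather than how they display.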
4. How do I get a specific character by its code point (or "ASCII value")?
Use TextEncoding.Chr. This returns a one-character string with the
character you specified by its code point within that encoding. For
example, a capital A in the ASCII character set would be:
    s = Encodings.ASCII.Chr(65)
A copyright symbol represented in UTF-8 would be:
    s = Encodings.UTF8.Chr(169)
--
,------------------------------------------------------------------.
| Joseph J. Strout REAL Software, Inc. |
| joe at realsoftware dot com http://www.realsoftware.com |
`------------------------------------------------------------------'