As far as my (very limited) TextEncoding knowledge goes, when I build a
string using +, the newly created string is encoded as UTF-8.
s = "" + "Hi world !" + EndOfLine + "Hòla" // I hope the spelling is right
The TextEncoding of s is UTF-8.
If the above assumption is right, why should I care about the encoding of my
text?
Emile
PS: a better example (?) might be:
s = EditField1.Text + EndOfLine + "This text is encoded using UTF-8."
The TextEncoding of s is UTF-8.
Knowing the internal workings of something can be a real advantage, but then
we fall into the 'undocumented' trap ('undocumented' = no guarantee that the
behavior stays the same in future versions).
REALbasic Network Users Group wrote:
REALbasic-NUG Digest #10154 - Friday, February 27, 2004
Subject: Re: TextEncoding question
From: "Charles Yeomans" <yeomans at desuetude dot com>
Date: Fri, 27 Feb 2004 17:27:45 -0500
On Feb 27, 2004, at 2:15 PM, Joseph J. Strout wrote:
> At 1:07 PM -0500 2/27/04, Charles Yeomans wrote:
>> The encoding property of a string literal in Rb is UTF-8 -- except
>> for the empty string "", which is ASCII. Why does this special case
>> exist?
>
> As Brady points out, we haven't documented any such behavior. But
> FYI, the reason is simple: the empty string could be considered any
> encoding, so we can pick one without even looking at the data (since
> of course there is no data). We pick ASCII since that is a subset of
> almost all other encodings, and so does nice things when, for example,
> it's combined with a string of some other encoding.

I consider "" a string literal just like "foo", and it is documented
that string literals are UTF-8. So I'd think that "" should be UTF-8
like all other string literals.
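
The remark above that ASCII is a subset of almost all other encodings can be
checked with a small sketch. It is written in Python rather than REALbasic,
purely as an illustration; the helper names encoded_forms and
same_bytes_everywhere and the choice of encodings are assumptions made for
this example and say nothing about how REALbasic works internally.

```python
# Illustration only: Python stands in for REALbasic here, and the list of
# encodings is an arbitrary sample chosen for this sketch.
ENCODINGS = ["ascii", "utf-8", "latin-1", "mac_roman"]


def encoded_forms(text, encodings=ENCODINGS):
    """Map each encoding name to the bytes it yields for text.

    The value is None when the encoding cannot represent the text.
    """
    forms = {}
    for name in encodings:
        try:
            forms[name] = text.encode(name)
        except UnicodeEncodeError:
            forms[name] = None
    return forms


def same_bytes_everywhere(text, encodings=ENCODINGS):
    """Return True when every encoding yields identical bytes for text."""
    forms = set(encoded_forms(text, encodings).values())
    return len(forms) == 1 and None not in forms


if __name__ == "__main__":
    # The strings are taken from the examples earlier in this thread.
    for sample in ["", "Hi world !", "Hòla"]:
        print(repr(sample), same_bytes_everywhere(sample))
```

The empty string and pure-ASCII text produce the same bytes under every
encoding in the sample, so whichever label they carry cannot change the bytes
when they are joined to another string. A string such as "Hòla" does not, and
that is the case where knowing the encoding of the text matters.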