At 1:07 PM -0500 2/27/04, Charles Yeomans wrote:
> The encoding property of a string literal in Rb is UTF-8 -- except
> for the empty string "", which is ASCII. Why does this special case
> exist?
As Brady points out, we haven't documented any such behavior. But
FYI, the reason is simple: the empty string is valid in any
encoding, so we can pick one without even looking at the data (since
of course there is no data). We pick ASCII because it is a subset of
almost all other encodings, so it behaves nicely when, for
example, it's combined with a string of some other encoding.
Granted, many nonempty UTF-8 strings are ASCII strings too. But to
establish that you'd have to actually scan the bytes, which just
isn't worth it.
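
To make the trade-off concrete, here is a rough Python sketch of the idea. It is not REALbasic code and not our actual implementation; the function names (is_ascii, literal_encoding, combined_encoding) and the encoding labels are hypothetical, chosen only for illustration. It shows that tagging the empty literal costs nothing, while proving a nonempty string is pure ASCII means visiting every byte.

```python
def is_ascii(data: bytes) -> bool:
    """True if every byte is below 0x80. This has to look at each byte."""
    return all(b < 0x80 for b in data)


def literal_encoding(data: bytes) -> str:
    """Hypothetical tagging rule: only the length is checked, never the bytes."""
    return "ASCII" if len(data) == 0 else "UTF-8"


def combined_encoding(enc_a: str, enc_b: str) -> str:
    """Hypothetical concatenation rule: an ASCII operand defers to the other side."""
    if enc_a == enc_b:
        return enc_a
    if enc_a == "ASCII":
        return enc_b
    if enc_b == "ASCII":
        return enc_a
    # Two different non-ASCII encodings would need a real conversion step.
    raise ValueError("operands need conversion to a common encoding")
```

With these assumptions, combined_encoding(literal_encoding(b""), "MacRoman") yields "MacRoman", so appending to an empty literal never forces a conversion. A nonempty literal such as "hello" stays tagged UTF-8 even though is_ascii would return True for it, because finding that out requires the scan.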
Cheers,
- Joe
--
REAL World 2004, The REALbasic User Conference, March 24th-26th
<http://www.realsoftware.com/realworld/index.html>