
Re: TextEncoding question

To: "REALbasic Network Users Group" <realbasic-nug at lists dot realsoftware dot com>
Subject: Re: TextEncoding question
From: "Joseph J. Strout" <joe at realsoftware dot com>
Date: Fri, 27 Feb 2004 13:15:55 -0600
References: <D8A1C5D1-694F-11D8-885F-003065BB0634 at desuetude dot com>
At 1:07 PM -0500 2/27/04, Charles Yeomans wrote:

The encoding property of a string literal in Rb is UTF-8 -- except for the empty string "", which is ASCII. Why does this special case exist?

As Brady points out, we haven't documented any such behavior. But FYI, the reason is simple: the empty string is valid in any encoding, so we can pick one without even looking at the data (since of course there is no data). We pick ASCII because it is a subset of almost all other encodings, so it behaves well when, for example, it's combined with a string of some other encoding.

Granted, many nonempty UTF-8 strings are ASCII strings too. But to establish that you'd have to actually scan the bytes, which just isn't worth it.
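
To make the cost argument concrete, here is a minimal sketch in Python rather than REALbasic; it is an illustration only, not how REALbasic is implemented. The helper name is_ascii is hypothetical. It shows that the empty string encodes to the same zero bytes in several encodings, while deciding whether a nonempty UTF-8 string is also ASCII means inspecting its bytes.

```python
def is_ascii(data: bytes) -> bool:
    """Hypothetical helper: True if every byte is in the 7-bit range 0x00-0x7F."""
    # Linear scan: in the worst case every byte must be inspected.
    return all(b < 0x80 for b in data)


# The empty string has no bytes, so these encodings all agree on it
# and one can be chosen without looking at any data.
empty_encodings_agree = (
    "".encode("ascii") == "".encode("utf-8") == "".encode("latin-1") == b""
)

# Pure ASCII text has identical bytes in ASCII and UTF-8.
ascii_text_is_ascii = is_ascii("hello".encode("utf-8"))

# Non-ASCII text produces bytes >= 0x80 in UTF-8, so the scan fails.
accented_is_ascii = is_ascii("h\u00e9llo".encode("utf-8"))
```

The scan is cheap for short literals, but it is still work proportional to the string length, which is the tradeoff described above.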

Cheers,
- Joe
