StrComp weirdness

To: REALbasic NUG Network Users Group <realbasic-nug@lists.realsoftware.com>
Subject: StrComp weirdness
From: Thomas Tempelmann <tt@tempel.org>
Date: Sat, 29 Nov 2008 10:55:17 +0100
Yesterday I discovered an odd bug with StrComp, and it amazes me that it
hasn't been discovered and fixed yet:

Imagine you want to _order_ strings case-insensitively, i.e.
lexicographically.

You have two choices to do that: Compare with "<" which doesn't obey
international ordering rules well, or use StrComp which should do this
right.

We all know what to expect here:

  dim s1, s2 as String
  s1 = "a"
  s2 = "B"

  if s1 < s2 then
    MsgBox "yes"
  else
    MsgBox "no"
  end

This gives us an unambiguous "yes". Good.

Now try this:

  if StrComp( s1, s2, 1 ) < 0 then
    MsgBox "yes"
  else
    MsgBox "no"
  end

Shouldn't we expect a "yes" here, too?

Well, it's getting tricky:

If you try the above with REALbasic 4.5, it'll say "yes". With 2008r4 it
says "no". Isn't that weird?

Now, here's the solution to get a "yes" response:

  s1 = s1.ConvertEncoding (Encodings.MacRoman) // ASCII works as well
  s2 = s2.ConvertEncoding (Encodings.MacRoman)

  if StrComp( s1, s2, 1 ) < 0 then
    MsgBox "yes"
  else
    MsgBox "no"
  end

This reveals the source of the problem: when StrComp operates on Unicode
strings, it orders them incorrectly, whereas it gives the expected result
with pure single-byte encodings.

The whole idea of encodings is, however, that they allow the _same_ text to
be represented in different encodings. And the _same_ text should _behave_
the same. StrComp, however, does not observe this obvious rule. Thus it's a
bug which needs fixing.
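
Until that happens, the workaround above could be wrapped in a small helper
so the conversion is not repeated at every call site. This is only a sketch:
the name CompareLexicographic is made up for illustration, and it assumes
that losing any characters MacRoman cannot represent is acceptable.

  Function CompareLexicographic(a As String, b As String) As Integer
    // Hypothetical helper, not part of REALbasic.
    // Convert both strings to a single-byte encoding first, because
    // StrComp in mode 1 gave the expected order only in that case.
    // Caveat: characters outside MacRoman are lost in the conversion.
    Dim x As String = a.ConvertEncoding(Encodings.MacRoman)
    Dim y As String = b.ConvertEncoding(Encodings.MacRoman)
    Return StrComp(x, y, 1)
  End Function

It would be used like the earlier test:

  if CompareLexicographic( s1, s2 ) < 0 then
    MsgBox "yes"
  else
    MsgBox "no"
  end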

Thomas


