  • Status: Solved
  • Priority: Medium

What string comparison does Microsoft allow in VB?

Where in Microsoft's documentation is it stated, and guaranteed for the future, that the syntax:

  if "aaa" < "aba" then msgbox "I guess"

is allowed and will output "I guess"?

Perhaps it is hidden in some early documentation for VB3 and
enforced in later versions by backward compatibility.
Perhaps there are some official examples from Microsoft.

Please don't give answers like "this works". For example, it is known that the
following program works. That is not the question. The question is to
find word from the vendor, which is the most reliable guarantee.

Option Compare Binary
Private Sub Form_Load()
    Dim t As Boolean, f As Boolean

    ' Comparisons expected to be True
    t = "aaaa" < "aaba"
    t = t And " aa" < "aaa"
    t = t And "aa" > "aA"
    t = t And Chr(0) & "aa" < "aaa"
    t = t And "aa" < "aaa"
    t = t And "" < "a"

    ' Comparisons expected to be False
    f = "aaaa" > "aaba"
    f = f Or "aaaa" >= "aaba"
    f = f Or "aaa" < "aa" & Chr(0)

    If t And Not f Then MsgBox "Works"
    ' The real output is "Works"
End Sub

Thank you very much.
beaverstone (Author) commented:
Thank you Javin007.

This is very close. But I don't see it stated that VB compares the second and n-th characters
when the first through (n-1)-th are equal. All the examples I see are restricted to a first-character
comparison. (Like: printer comes before scanner because p is less than s.
But will aaa be less than aab?)

Thank you.
Actually, it's not even comparing the characters, Beaverstone.  What it's doing is taking the bit values of the whole string:

abc = 011000010110001001100011
aaa = 011000010110000101100001

And it orders the string according to the bits, left to right.
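
A quick way to check this picture for equal-length strings is the sketch below. It is Python rather than VB, it assumes one byte per character (plain ASCII), and `bits` is a made-up helper name; it only models the idea and says nothing about how VB is actually implemented.

```python
def bits(s: str) -> str:
    """Hypothetical helper: the ASCII bytes of s written out as 0s and 1s."""
    return "".join(format(byte, "08b") for byte in s.encode("ascii"))

abc = bits("abc")
aaa = bits("aaa")
print("abc =", abc)
print("aaa =", aaa)

# For equal-length strings, the first differing bit (read left to right)
# decides the order, which agrees with comparing the characters in order.
print(aaa < abc, "aaa" < "abc")
```

Strings of different lengths are where this picture needs an extra rule, which the sketch deliberately leaves out.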

Side note:

It also considers the lack of a bit to be a null string (chr$(0), or 00000000)

So if
aaa=011000010110000101100001 and
  aa=011000010110000100000000, then

aa < aaa

God this thing needs an edit function...

Rereading what I wrote, I noticed that this could be confusing:  "And it orders the string according to the bits, left to right."

I don't mean it orders the string according to the physical bits, but to the total value of the bits.  This is why capitalized letters are considered "smaller" than their counterparts.  Thus a > AA

Strangely enough, Microsoft's own listboxes and controls don't use this method to alphabetize.

:/  I'm not explaining this very well.  It made more sense to me before I tried to explain it.

"Option Compare Binary results in string comparisons based on a sort order derived from the internal binary representations of the characters. In Microsoft Windows, sort order is determined by the code page. A typical binary sort order is shown in the following example:"

I think that's pretty much all you're going to find Microsoft saying on the subject.  But suffice it to say that comparisons like the ones you were doing are binary comparisons by default.  Maybe someone else can explain it better.  The concept itself is very simple (and basic for most languages, this isn't just a Microsoft thing), but the result is that aab will always be greater than aaa.

beaverstone (Author) commented:
Thank you Javin, you are doing more work than was asked.

The question is not about how Microsoft implements string comparison.
The question is not about how to describe the string comparison algorithm in an
equivalent language of "bytes" or some other representation.

The question is about finding word from Microsoft that it will work as stated in the question.
Why doesn't Microsoft offer an example where the second or n-th characters are compared?
Why does it seem that you cannot find such an example in the MSDN, VB3, VB4, or VB6 documentation?
All the examples I see are restricted to a first-character
comparison. (Like: printer comes before scanner because p is less than s.
Perhaps aaa is less than aab only accidentally, because some Microsoft programmer
did extra work which is not documented and not intended to be supported?)

Although the particular implementation of the algorithm is not the subject of the question,
for accuracy let me note that padding with Chr(0) at the end seems incorrect, as this shows:

If "aa" < "aa" & Chr(0) Then MsgBox "Not At the end."
If "aa" > Chr(0) & "aa" Then MsgBox "Not At the beginning"
If Chr(0) > "" Then MsgBox "Not chr(0)"

Perhaps the rule is that any character at the end is greater than an unfilled position at the end.
But again, does MS guarantee these details along with the entire algorithm?
The only reason we can rely on this is that "common sense" suggests this algorithm,
and MS may be forced to follow common practice in order to be "in the crowd".

Thank you.

I don't believe you're going to find ANY documentation on this, though, because it's not a Microsoft implementation.  It's something that comes with any language you use: C++, C#, Delphi, Basica, etc.  When a language looks at the string, it sees an array of bytes, not a string.  More accurately, it sees a very long string of bits.  Thus, the language DOESN'T see "abc" when you compare it.  It doesn't SEE abc.  What it sees is the bit-value equivalent of abc, which is 6382179.  That's why I was trying to explain how it works, and why you won't find the "guarantee" you're wanting.  Microsoft would spend no more time explaining that function than they would spend explaining the low-level details of what AND does.

beaverstone (Author) commented:
Thank you Javin for your comments.

The idea in your explanation,
"It's something that comes with any language you use", is perhaps right.
I understand the idea that there is a certain "unspoken agreement" or
"programming culture" of generalizing the string comparison algorithm to include all the characters,
not only the first. You are probably trying to point out that Microsoft implicitly follows
this culture. But if so, this culture must have left traces in magazines, journals,
documentation or examples.
Part of a programmer's real job is exactly this: reading documentation.

Nothing can force Microsoft to follow this culture.
Even if there were "low-level requirements", nothing can prevent a programmer
from using the C function getchar, or the assembler instruction "sub", to
implement a character-by-character or word-by-word string comparison.

Your explanation tried to support that algorithm with the idea of a string as an "undivided entity".
In particular, it tries to model a string as a number
and states that the programming language compares numbers, not characters.
That particular model is incorrect. It is a myth that programmers sometimes have
to create in their minds to picture the back end of a system whose low level they don't
know and don't have to know.

Indeed: following this model, "b" = x42, "ab" = x4142 and "ab" > "b".
(xNN is hexadecimal notation.) If one tries
to fix this flaw in the model by appending Chr(0) at the end, the model still does
not work: compare s1 = "b" and s2 = "b" & Chr(0). If Chr(0) is appended to s1 to
make the lengths equal, the model gives s1 >= s2, which is not the case in VB.
(I pointed this out in my previous comment, but it seems to have been ignored.)

Finally, there are absolutely no low-level requirements to follow this algorithm.
Indeed: if the string in the memory of a 16-bit Intel-based computer is

  A B   C D   E

then one way to submit this string to the CPU for comparison is to read it
from memory over the computer bus and use the CPU's subtract instruction.
In this case, the string is read in the sequence:

  B A   D C   E

Not only is the string split into two-byte fragments, each fragment is flipped.
A C or assembler programmer must also loop over the words; moreover,
the handling inside the loop differs for a whole word and a half word.

I am returning to my question:
what I expect from an Expert is to find evidence of the algorithm in the literature, or
to disassemble the piece of VB's string comparison code to
reveal the truth (which would still be incidental; word from the vendor
is much more reliable).

Thank you.
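
The objection to the "string as one number" model can be made concrete with a short sketch. It is Python, not VB: Python's `<` on strings compares code points left to right and stands in here for a binary comparison, which is an assumption about VB and not a statement from Microsoft. `as_number` and `pad` are illustrative names.

```python
def as_number(s: str) -> int:
    """The 'string as a number' model: ASCII bytes read as one big-endian integer."""
    return int.from_bytes(s.encode("ascii"), "big")

def pad(s: str, width: int) -> str:
    """Variant of the model: right-pad the shorter string with Chr(0)."""
    return s.ljust(width, "\0")

# Without padding, the model orders "ab" after "b" ...
print(as_number("ab") > as_number("b"))    # 24930 > 98
# ... while a character-by-character comparison orders "ab" before "b".
print("ab" < "b")

# With padding, "b" and "b" & Chr(0) become the same number ...
s1, s2 = "b", "b\0"
print(as_number(pad(s1, 2)) == as_number(s2))
# ... while a character-by-character comparison treats the shorter one as less.
print(s1 < s2)
```

In both variants the number model and the character-by-character comparison disagree on at least one pair, which is the point being argued.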

Well, then I bow out of this one.  Because as I've said, I don't think it exists.

By the way, I don't see your logic behind the argument that my explanation of the bits comparison "doesn't work" with the added Chr$(0)

All of the statements are the exact same, with the exact same values, and all are true:

"a" & chr$(0) > a
0110000100000000 > 01100001
 24832 > 97

I have absolutely NO clue what you were talking about when you got into your argument about hexadecimals and flipping string values, but it made no sense to me whatsoever.  Not even logically.

As you can tell, this question has been bugging me.  :)  I hate not getting a satisfactory answer, and it won't leave me alone.

Maybe this is what you're looking for:
Well, I've exhausted the search engines on both Microsoft and Google, and have determined that you won't find an authoritative answer on WHY binary string comparison does what it does.  This would be the equivalent of asking someone to explain why 2+2=4 on a binary level.  People just seem to assume nobody's going to ask that question.  So in answer to your question:

"Where in Microsoft documentation is stated and guaranteed for the future that syntax:"

The answer is simply, nowhere.  The syntax you're asking about is basic binary string comparison.  I've searched high and low through Microsoft, and there's no explanation as to HOW or WHY binary string comparison works.

The closest you are going to find is the following, where Microsoft quotes from a book ("Faster Smarter Beginning Programming" by Jim Buyens):

>Comparing Strings
>When comparing two strings, Visual Basic .NET starts by comparing the first character of each operand, then the next character of each operand, and so forth, until it finds two unequal characters or until one string runs out of characters.
>If it finds two unequal characters, the result of comparing them becomes the result of the entire operation. For example, the string "abcDEF" is less than "abcXA" because D (Unicode 0044) comes before X (Unicode 0058).
>If one string runs out of characters before the other, the longer string is greater. Thus, "abcd" is greater than "abc". The string "abc " (which includes a trailing space) is also greater than "abc".
>If both strings run out of characters at the same time, then they are equal.
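
The quoted rules translate almost line for line into code. The following is a Python sketch of that description only, not Microsoft's implementation; `compare_strings` is a hypothetical name and the -1/0/1 return convention is chosen here for illustration.

```python
def compare_strings(a: str, b: str) -> int:
    """Return -1, 0 or 1 for a < b, a == b, a > b, following the quoted rules."""
    for ca, cb in zip(a, b):
        # The first pair of unequal characters decides the whole comparison.
        if ca != cb:
            return -1 if ord(ca) < ord(cb) else 1
    # One string (or both) ran out of characters: the longer one is greater.
    if len(a) == len(b):
        return 0
    return -1 if len(a) < len(b) else 1

print(compare_strings("abcDEF", "abcXA"))  # D (U+0044) comes before X (U+0058)
print(compare_strings("abcd", "abc"))
print(compare_strings("abc ", "abc"))
print(compare_strings("aaa", "aab"))
```

Under these rules the second and later characters are compared whenever the earlier ones are equal, which is the case the question asks about.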

beaverstone (Author) commented:
Yes, I made a mistake.
I wrote:
"b" = x42, "ab" = x4142 and "ab" > "b".
But I meant:
"B" = x42, "AB" = x4142 and "AB" > "B".

In your comment you wrote:

"a" & chr$(0) > a
0110000100000000 > 01100001
 24832 > 97

The string on the left side of the comparison, "a" & chr$(0) = "a\0x00",
is one character longer than
the string on the right side.
According to your algorithm, chr(0) = "\0x00" must be appended
to the shorter string "a" before the comparison is made.
For example, in an earlier comment you wrote
"So if
aaa=011000010110000101100001 and
  aa=011000010110000100000000, then ...",
where you appended chr(0) to "aa".
Thus, before you compare "a" & chr(0) > "a",
chr(0) must be appended to "a" on the right side, which
makes both strings equal; the algorithm will therefore give "false" for this comparison.

Without appending chr(0), your algorithm does not work either:

  "B" = x42 = 66 = 1000010, "AB" = x4142 = 100000101000010 and "AB" > "B", which is not the case in VB.


>I have absolutely NO clue what you were talking
>about when you got into your argument about
>hexidecimals and flipping string values, but it
>made no sense to me what so ever.  Not even logically.

Flipping the values of string fragments is basic to how the Intel platform operates.

In your previous comment, there is a string:

 "abc" = 011000010110001001100011

which is stored in computer memory, as you correctly wrote, as the bytes
x61 x62 x63 in hexadecimal notation.

When the CPU takes this string for comparison (by the method described in my comment),
it reads a word from memory. A word on a 16-bit platform is a two-byte chunk
of data. The CPU cannot take the chunk "abc"; there is no room in a CPU register to hold it.
So the CPU places the chunk "ab" in one of its registers. But the FACT is that
it flips it. So, in computer memory this word is "ab" = x6162 = 0110000101100010 = 24930.
In a register, say ax, this chunk is stored as "ba".
The content of register ax is the number ax = x6261 = 0110001001100001 = 25185.

Then the CPU takes the first word of the other string, the one on the right-hand side of the comparison
expression, flips it and puts it in a register, say bx. Then the program looks like:

      sub ax,bx
      js label

This means that the CPU subtracts the numbers ax and bx and jumps to "label" depending
on the sign of the result.

You may say that we now use a 32-bit or 64-bit platform.
That is unlikely to change this example in principle; rather, the
CPU will take 32-bit words, and the string "abcd" = 1633837924 will be flipped to
"dcba" = 1684234849 and then compared.

Thank you.
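
The numbers in this comment can be reproduced with a short sketch. It is Python, used only to do the arithmetic: `int.from_bytes` with "big" reads the bytes in memory order, and with "little" reads them the way a little-endian load would place them in a register. Whether VB's runtime compares strings this way is not something the sketch can show.

```python
mem_word = int.from_bytes(b"ab", "big")       # x6162, the bytes in memory order
reg_word = int.from_bytes(b"ab", "little")    # x6261, as loaded into a 16-bit register
print(hex(mem_word), mem_word)
print(hex(reg_word), reg_word)

mem_dword = int.from_bytes(b"abcd", "big")
reg_dword = int.from_bytes(b"abcd", "little")
print(mem_dword, reg_dword)

# Comparing the flipped words directly can disagree with character order:
# "ab" comes before "ba", but the little-endian word for "ab" is the larger number.
print("ab" < "ba", int.from_bytes(b"ab", "little") > int.from_bytes(b"ba", "little"))
```

The last line shows why a word-at-a-time comparison on a little-endian machine needs extra care if it is to agree with a character-by-character ordering.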

Well, in all honesty, I'm a paying member, so I couldn't care less about the points.  But if you don't accept my last answer as the "correct" answer, then you're simply a troll in my book.

beaverstone (Author) commented:
Thank you for all your effort Javin.

In addition to this beautiful fragment on Visual Basic .NET which you discovered,
I looked closely at the section "Comparison Operators" in the VB4 and Visual Studio 6.0 documentation and found:

   MSDN VS6.0 ... Visual Basic Documentation\ Reference\ Language Reference\ Operators
        scroll to the line:
        Both expressions are String    Perform a string comparison.
        open up:
 string comparison
 A comparison of two sequences of characters. Use Option Compare to specify binary or text comparison. In English-U.S., binary comparisons are case sensitive; text comparisons are not.

It says "sequences of characters." This is good enough.

The same reference is:

All of your answers (plus my comments) form the entire answer.
The closest comment is chosen as accepted.

Thank you very much.
Question has a verified solution.
