Solved

"bsearch" function crashing.

Posted on 1998-09-02
12
279 Views
Last Modified: 2010-04-02
Hello experts,

This program was written using Microsoft Visual C++ 5.0.

I have a list of words that I'd like to search through using the bsearch function.  Yes, the words have been sorted, using qsort (which works fine).  Here is the code now:

char temps3[128], *temps2;
strcpy (temps3, (char *) (m_Search2.operator LPCTSTR ()));
temps2 = (char *) bsearch (temps3, m_Words, (DWORD) m_NumWords, sizeof (char *), CompareFunc);

if (temps2 != NULL)
{ strcpy (temps3, temps2);
  m_Results = temps3;
}
else
{ m_Results = "Nope."; }

The compare function is:
int CompareFunc (const void *arg1, const void *arg2)
{ // Compare all of both strings:
  return _stricmp (*(char **) arg1, *(char **) arg2);
}

Description of variables:
m_Search2 is an MFC CString class.  It contains a perfectly valid string.  No matter what it contains, it crashes.

m_Words is an array defined as:
 BYTE *m_Words[2000000]; // equiv to unsigned char

Yes, yes, there is immense overkill etc... but there are A LOT of words (> 150000 and always fluctuating).  I have increaed the stack size to 8mb to accomodate it.  The rest of the program works fine, so I don't think it has to do with that.

m_NumWords is an unsigned long containing the number of words in the array.

m_Result is another MFC CString, but that doesn't matter.

After running the search (searching on any string), it will crash at the same point with the following:
WORDS caused an invalid page fault in
module MSVCRTD.DLL at 024f:10216e63.
Registers:
EAX=1025a46c CS=024f EIP=10216e63 EFLGS=00010206
EBX=0059e790 SS=0257 ESP=0059e1cc EBP=0059e1d8
ECX=0117e1e0 DS=0257 ESI=0117e1e1 FS=5ed7
EDX=0059e26c ES=0257 EDI=2a2c0061 GS=0000
Bytes at CS:EIP:
8a 27 47 38 c4 74 f2 2c 41 3c 1a 1a c9 80 e1 20
Stack dump:
0059e790 005101c0 0059e6d0 0059e1e8 00402b0a 2a2c0061 0117e1e0 0059e214 102316e1 0059e26c 005da01c 0059e204 0000ed38 00615500 0059eb3c 005da01c

The debugger called the error an access violation.  For those who like assembler, here is what it shows:

10216E52   cmp         dword ptr [eax+8],0
10216E56   jne         notclocale(0x10216e93)
10216E58   mov         al,0FFh
10216E5A   mov         eax,eax
chk_null:
10216E5C   or          al,al
10216E5E   je          done(0x10216e8e)
10216E60   mov         al,byte ptr [esi]
10216E62   inc         esi
10216E63   mov         ah,byte ptr [edi]  << ERROR OCURRED HERE
10216E65   inc         edi
10216E66   cmp         ah,al
10216E68   je          chk_null(0x10216e5c)
10216E6A   sub         al,41h
10216E6C   cmp         al,1Ah
10216E6E   sbb         cl,cl
10216E70   and         cl,20h
10216E73   add         al,cl

Now what am I doing wrong?
0
Comment
Question by:january
  • 4
  • 3
  • 2
  • +2
12 Comments
 
LVL 2

Expert Comment

by:duneram
ID: 1171944
EDI has a pretty big number in it
0
 
LVL 1

Expert Comment

by:Bonev
ID: 1171945
Do you have this problem with a small number of items (like 10, 15) ?
0
 

Author Comment

by:january
ID: 1171946
duneram: Could that affect anything?

Bonev: I just tried it, no it does not work.  This would lead me to believe there is something wrong with my code, but what?
0
 
LVL 1

Expert Comment

by:Bonev
ID: 1171947
The compare function should be:
return _stricmp ((char *) arg1, *(char **) arg2);

And, the result is (char**).
0
 

Author Comment

by:january
ID: 1171948
For now, I am going to reject your answer because now the bsearch function returns garbage.  It seems like it is a step in the right direction though.

What could cause the function to return garbage?  I know the data in m_Words is valid.
0
 
LVL 6

Expert Comment

by:thresher_shark
ID: 1171949
Perhaps adding some more explicit casts?
0
Top 6 Sources for Identifying Threat Actor TTPs

Understanding your enemy is essential. These six sources will help you identify the most popular threat actor tactics, techniques, and procedures (TTPs).

 
LVL 23

Accepted Solution

by:
chensu earned 100 total points
ID: 1171950
The first parameter and the return value of bsearch are wrong. It should be like the following.

char temps3[128], **temps2;
char *temps = temps3;
strcpy (temps3, (char *) (m_Search2.operator LPCTSTR ()));
temps2 = (char **) bsearch ((const void *)&temps, m_Words, (DWORD) m_NumWords, sizeof (char *), CompareFunc);

if (temps2 != NULL)
{ strcpy (temps3, *temps2);
  m_Results = temps3;
}
else
{ m_Results = "Nope."; }

The compare function is correct.

int CompareFunc (const void *arg1, const void *arg2)
{ // Compare all of both strings:
  return _stricmp (*(char **) arg1, *(char **) arg2);
}

By the way, it is better to use dynamical allocation (eg. new and delete) than BYTE *m_Words[2000000].
0
 
LVL 2

Expert Comment

by:duneram
ID: 1171951
well in the original code it was something like 700 million.... probably an offset in an array. I would expect an out of bounds issue.
0
 
LVL 1

Expert Comment

by:Bonev
ID: 1171952
Shame on me. Yes, it is in the right direction, but not exactly.
The result of the compare function is (char**) - to access the string, you have to make two steps.

Chensu did it right.

0
 

Author Comment

by:january
ID: 1171953
Thank you chensu.  You have solved my problem.  I should have seen that...  When dealing with all these addresses and indirections, you occasionally lose track!

Also, chensu, about the *m_Words[2000000]... Right now the way it works is during startup, it will read in all the words.  As it's doing that, it will initialize every array element needed using new.  Then, when it exits, it goes through the same process with delete.  Do you know of a better / faster way of doing it (other than, say, allocating on monster array of BYTEs and then separating each word with a NULL or something)?
0
 
LVL 23

Expert Comment

by:chensu
ID: 1171954
>Do you know of a better / faster way of doing it (other than, say, allocating on monster array of BYTEs and then separating each word with a NULL or something)?

I think your current method is right. I can't think of a better way.
0
 

Author Comment

by:january
ID: 1171955
Okay, thanks for the help!
0

Featured Post

Highfive Gives IT Their Time Back

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

Join & Write a Comment

Written by John Humphreys C++ Threading and the POSIX Library This article will cover the basic information that you need to know in order to make use of the POSIX threading library available for C and C++ on UNIX and most Linux systems.   [s…
This article shows you how to optimize memory allocations in C++ using placement new. Applicable especially to usecases dealing with creation of large number of objects. A brief on problem: Lets take example problem for simplicity: - I have a G…
The goal of the video will be to teach the user the concept of local variables and scope. An example of a locally defined variable will be given as well as an explanation of what scope is in C++. The local variable and concept of scope will be relat…
The viewer will be introduced to the technique of using vectors in C++. The video will cover how to define a vector, store values in the vector and retrieve data from the values stored in the vector.

707 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

13 Experts available now in Live!

Get 1:1 Help Now