Solved

passing structure by value, bug?

Posted on 1998-10-17
21
186 Views
Last Modified: 2010-04-02
First let me say I do *not* want to get into a discussion about the inefficiency etc. of passing structures by value. I am well aware of these issues. I am looking for a detailed description of why the code below compiles but does not always run correctly.

typedef struct {
   int   foo;
   int   bar;
} BLOB;

BLOB func1(BLOB blob, int var2) {
   // do something here
   return(blob);
}
This is called by:-

blob = func1(blob, something);


I can call func1 and all is well in most cases. Sometimes however when the function returns it appears as though the stack has been corrupted and it returns to an odd location. It *might* be related to what system calls are made within the function but I can't be sure.

A solution that works is as follows.

BLOB func2(BLOB b, int var2) {
   BLOB blob = b;
   // do something here
   return(blob);
}

Now this is even more in-efficient than the first. The structure is being copied an extra time and the stack usage is greater, but it always works.

I would like a detailed description of why func1 sometimes fails by corrupting the stack.

I am using MSVC 4.1
0
Comment
Question by:icd
  • 8
  • 7
  • 4
  • +1
21 Comments
 
LVL 22

Expert Comment

by:nietod
ID: 1175339
The code is 100% legal and should work.  I suspect that you have a bug somewhere.  There are two potential places for problems.  either you are passing in a bad  (already corrupted) BLOB or you are passing in a good BLOB, but corrupting it in the function.

If you post more code I might be able to see the problem.
0
 
LVL 6

Expert Comment

by:thresher_shark
ID: 1175340
I agree with nietod, there is probably just some small glitch somewhere else along the line.  There remains the potential for a compiler bug though, but that is doubtful.  Actually though, come to think of it, I was inches away from posting a question here a month ago about why I am getting the dread error:

Fatal Error C1001
INTERNAL COMPILER ERROR

and then it suggests I contact tech support... but we need not get into that :-)
0
 
LVL 5

Author Comment

by:icd
ID: 1175341
I really don't think there was anything at all wrong with the code, the code was like func1 but failed on return. I simply changed it to the func2 form and it worked.

0
 

Expert Comment

by:jbalagop
ID: 1175342
The only thing that might be taking place besides possible bugs in the complier, etc is that the funtion is called using a PASCAL calling convetion. Funtions using a PASCAL type calling convention remove the parameters from the stack before they return to the caller. Usually by default Windows uses the Pascal calling convention, expect when there are variable parameters in the function. So maybe you have the VC++ set to call all funtions using the pascal calling convention.
Btw, the pascal calling convention looks like this:

BLOB __stdcall func1 (BLOB b1, int var2); Even thought you didn;t explictly define it as a Pascal type calling convention, maybe by default it's using a pascal type calling convention.

you might want to try to use the C Calling convention by explicitly defining the above function as:
BLOB __cdecl func1(BLOB b1, int var2)
In a C calling convention the parameters are discarded my the caller of the function.

Hope this helps.
0
 

Expert Comment

by:jbalagop
ID: 1175343
Also as nietod suggested, if you are corrupting the BLOB structure, then that could be a potential reason why the function is not returning correctly.

0
 
LVL 22

Expert Comment

by:nietod
ID: 1175344
jbalagop, there must be 1,000's of possible reasons for this.  Why would you lock an answer with complete guess?  I would bet your answer is wrong simly because the odds are against it.  But in the meantime icd will get little help.
0
 
LVL 22

Expert Comment

by:nietod
ID: 1175345
icd, why don't you post more of your code.  If it is too long to post, you could e-mail it to me at nietod@theshop.net.  I assure you, it is something you have done (or maybe a compiler bug) but what you;ve shown us is perfectly legal.
0
 

Expert Comment

by:jbalagop
ID: 1175346
geez... im new to this thing. i thought if i know the answer, i can answer it. i dont know, im supposed to wait, etc. if you want the points take it. i could care less.
0
 

Expert Comment

by:jbalagop
ID: 1175347
the answer isn;t a complete guess. what makes you think that - nietod. if the parameters are getting corrupted, there is a high chance, that before the function returns it has trashed it's parameters.
0
 
LVL 6

Expert Comment

by:thresher_shark
ID: 1175348
We already covered the data corruption thing at the very beginning, nietod said that in his first comment, and like he also says, there are millions of things that it could be:

1) corrupted data; bug in his code.  Could be caused my a MULTITUDE of things.
2) memory getting stepped on; one off bug.
3) compiler bug
4) if statement like
   if (i = 2)
   { ... }
   that was supposed to be if (i == 2)
5) ... use your imagination.

In other words, if you aren't 90%-100% of the answer, don't answer, comment!  A good rule of thumb for whether or not you should answer is: If I end my answer with "Hope this helps" then it probably should be a comment.  Just thought you ought to know that.
0
Threat Intelligence Starter Resources

Integrating threat intelligence can be challenging, and not all companies are ready. These resources can help you build awareness and prepare for defense.

 
LVL 5

Author Comment

by:icd
ID: 1175349
jbalagop.

Thanks for that suggestion but it is not the case. I am using the C calling convention. Also don't worry that you have been critised for 'answering' this question. You will soon learn the conventions here on EE. (After a while you find there is less urge to 'get points' and more urge just to help people).

All.
Thanks for your suggestions, just by my stopping to put down in words what my problem is has helped me. I feel now that my problem is one that I am not going to be able to re-create. I changed all my functions from type 'func1' to type 'func2' and the problem went away. Putting them back as they were and the code still works!

The reason for my initial question was my lack of understanding of the C calling convention and how structures are passed on the stack. I understood that the called procedure returned a pointer to the structure in the eax register. I did not (and in writing this I realise I still don't) know where this structure sat. Presumably the returned structure sits in the stack frame of func1?

Perhaps someone can explain the following machine code generated by my compiler and describe where the returned structure sits?

The return address from func1 is as follows:-

65:       return(blob);
004083cd   lea       esi,dword ptr [blob]
004083d0   mov       edi,dword ptr [ebp+08]
004083d3   mov       ecx,00000026
004083d8   rep movsd
004083da   mov       eax,dword ptr [ebp+08]
004083dd   jmp       func1+000000b2 (004083e2)
66:   }
004083e2   pop       edi
004083e3   pop       esi
004083e4   pop       ebx
004083e5   leave
004083e6   ret

This code is copying the blob structure (which is 4 * 0x26 bytes long) to ebp+8 then returning a pointer to this structure in eax ??

The call and tidy up of the calling program is:-

184:      blob = func1(blob, var);
0040904e   mov       eax,dword ptr [var]
00409051   push      eax
00409052   sub       esp,00000098
00409058   lea       esi,dword ptr [blob]
0040905e   mov       edi,esp
00409060   mov       ecx,00000026
00409065   rep movsd
00409067   lea       eax,dword ptr [ebp-00000204]
0040906d   push      eax
0040906e   call      func1 (00411127)
00409073   add       esp,000000a0
00409079   lea       edi,dword ptr [ebp-0000016c]
0040907f   mov       esi,eax
00409081   mov       ecx,00000026
00409086   rep movsd
00409088   lea       esi,dword ptr [ebp-0000016c]
0040908e   lea       edi,dword ptr [blob]
00409094   mov       ecx,00000026
00409099   rep movsd

It seems to me that after the call the esp is tidied up (but does this not therfore allow the structure on the stack frame of func1 to be overwritten by asynchronous system calls?). Then the structure is copied to the current functions stack frame at ebp-16c then copied from there to the local blob.

If someone can explain how the structure pointed to by eax after func1 returns avoids getting corrupted by asynchronous system calls then they can have the points.

(I hope this makes sense to you!)

0
 
LVL 5

Author Comment

by:icd
ID: 1175350
If anyone wants the full code to the original (faulty) code it is:-
BLOB      SendASPICmd(BLOB blob, SRB_ExecSCSICmd *cmd) {
       blob.s.m.pfnSendASPI32Command(cmd);                        if (cmd->SRB_Status == SS_PENDING) {                              blob.s.m.pfnWaitForSingleObject(blob.s.m.eventHandle, INFINITE);
      }                                                return(blob);
}

As you can see there is little scope here for corruption of the blob structure (apart from the call to the function pointer within the structure, which I think is unlikely to have caused the original problem).

0
 
LVL 22

Expert Comment

by:nietod
ID: 1175351
    004083cd   lea       esi,dword ptr [blob]  
This gets a pointer to the BLOB withig your procedure.
     004083d0   mov       edi,dword ptr [ebp+08]
This gets a pointer to the BLOB that the caller wants tol receive.  This is probably on the caller's stack frame.
     004083d3   mov       ecx,00000026
Load the size of a blob.
     004083d8   rep movsd
Copy function's blob to caller's BLOB.  i.e. return the BLOB.
     004083da   mov       eax,dword ptr [ebp+08]
Load EAX with a pointer to the caller's BLOB.  This is due to a convention that EAX contains the return value.  It can't contain a BLOB, is too small.  so instead the caller must make space for the BLOB and EAX just points to the space the caller provided.
     004083dd   jmp       func1+000000b2 (004083e2)
     66:   }
That code jumps nowhere.  I would need to see more to know why.
     004083e2   pop       edi
     004083e3   pop       esi
Restores EDI ESI, Microsoft's rule is that must be preserved by functions.
     004083e4   pop       ebx
Restore the caller's stack frame  (pointer to local variables)
     004083e5   leave
Destroy this function's local variables.
     004083e6   ret
Bye!
0
 
LVL 22

Expert Comment

by:nietod
ID: 1175352
184:      blob = func1(blob, var);
     0040904e   mov       eax,dword ptr [var]
     00409051   push      eax
This pushes the last parameter, VAR, onto the stack.  Parameters go in reverse order under MS.
     00409052   sub       esp,00000098
This makes room ont he stack for a BLOB to be passed.  (Stack grows down.)
     00409058   lea       esi,dword ptr [blob]
Get pointer to blob to be passed
     0040905e   mov       edi,esp
/Get pointer to place on stack where  BLOB to be passed is located
     00409060   mov       ecx,00000026
     00409065   rep movsd
Copy function's blob onto the stack.  
     00409067   lea       eax,dword ptr [ebp-00000204]
     0040906d   push      eax
Pass pointer to location to receive the BLOB returned.  This is proabably a local BLOB that the compiler allocated.  Not the BLOB you declared.  That would be too efficient.
     0040906e   call      func1 (00411127)
Call the function.
     00409073   add       esp,000000a0
Remove the 3 parameters from the stack.  No risk of corruption here, they are not being used.
     00409079   lea       edi,dword ptr [ebp-0000016c]
Get  pointer to  a temporary  BLOB,
     0040907f   mov       esi,eax
Get pointer to the BLOB that is holding the return value.
     00409081   mov       ecx,00000026
     00409086   rep movsd
Copy returned BLOB to temporary BLOB.
     00409088   lea       esi,dword ptr [ebp-0000016c]
Get pointer to temporary BLOB.
     0040908e   lea       edi,dword ptr [blob]
Get pointer to the BLOB you declared in your function.
     00409094   mov       ecx,00000026
     00409099   rep movsd
Copy temporary blob with return value to the BLOB you declared in the function.

0
 
LVL 22

Accepted Solution

by:
nietod earned 200 total points
ID: 1175353
>> t seems to me that after the call the esp is tidied up (but does this not
>> therfore allow the structure on the stack frame of func1 to be overwritten by asynchronous system calls?).
when ESB is cleaned up it is to remove parameters that were passed to the function.  since the function has returned there is no problem with this.  

>>  If someone can explain how the structure pointed to by eax after func1
>> returns avoids getting corrupted by asynchronous system calls then they
>> can have the points.
It is because the structure is passed back lower on the stack.   The caller has made room for the returned structure at [EBP-204] and passes a "hidden" pointer to this location to the function as a new first parameter.  (i.e. a prameter you didn't declare).  If you remember, the function copied the return value to this location at the end of the function and then set EAX to point to this location.

Let me know if you have questions.
0
 
LVL 22

Expert Comment

by:nietod
ID: 1175354
to jabagop,
>> geez... im new to this thing. i thought if i know the answer, i can answer it.
>> i dont know, im supposed to wait, etc. if you want the points take it. i
>> could care less.
Sorry if my comment was a bit harsh.  Thresher_shark explained the situation well.  If you make a guess, submit it as a comment.  If it is right, that is, if it is the answer, the client will ussually let you know and then you can submit a dummy "answer".  

Good luck.
0
 
LVL 5

Author Comment

by:icd
ID: 1175355
Thanks nietod, I missed the point that there was an address passed *to* the function giving a location for the returned structure.

0
 
LVL 5

Author Comment

by:icd
ID: 1175356
Thanks nietod, I missed the point that there was an address passed *to* the function giving a location for the returned structure.

0
 
LVL 5

Author Comment

by:icd
ID: 1175357
The original problem that prompted this question 'went away' only to come back recently to bite me!

On re-investigation the problem was that I was calling a function in a DLL from within my procedure. I had defined the function as requiring the pascal calling convention (called program tidies up the stack) whereas it was in fact using the C calling convention this meant that on return the stack pointer was out by 4 bytes. For some reason this bug was not causing me any problem until I added some more code in a totally different area.


0
 
LVL 22

Expert Comment

by:nietod
ID: 1175358
C++ procedures tend to use stack frames.  This technique saves the stack pointer on the stack when a call is made and restores it when done.  Although it is not the purpose of the technique one side effect is that functions that screw up the stack pointer tend to do less damage.  If they don't pop off everything, not problem.  If they pop off too much, they may screw up some local variables but unless they go too far they will still be able to return to the caller and the caller will still have a valid stack frame.
0
 
LVL 5

Author Comment

by:icd
ID: 1175359
Yes, that is why I think I got away with it for so long. The version I had the problem with however had no local variables and it seemed the compiler 'optimised' the procedure end code so that it did not do a 'mov esp,ebp, pop ebp' but instead did just a 'pop ebp'.

I think this was just at the time I moved from MSVC 4.1 to MSVC 5 and it is possible that it was this change of generated code in MSVC 5 that exposed the bug.

Thanks for all your support and suggestions.
0

Featured Post

What Is Threat Intelligence?

Threat intelligence is often discussed, but rarely understood. Starting with a precise definition, along with clear business goals, is essential.

Join & Write a Comment

Unlike C#, C++ doesn't have native support for sealing classes (so they cannot be sub-classed). At the cost of a virtual base class pointer it is possible to implement a pseudo sealing mechanism The trick is to virtually inherit from a base class…
Templates For Beginners Or How To Encourage The Compiler To Work For You Introduction This tutorial is targeted at the reader who is, perhaps, familiar with the basics of C++ but would prefer a little slower introduction to the more ad…
The viewer will learn how to use the return statement in functions in C++. The video will also teach the user how to pass data to a function and have the function return data back for further processing.
The viewer will learn how to user default arguments when defining functions. This method of defining functions will be contrasted with the non-default-argument of defining functions.

757 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

19 Experts available now in Live!

Get 1:1 Help Now