• Status: Solved
  • Priority: Medium
  • Security: Public

looks like malloc is failing in hpux

Hello
One of our apps on hpux dumped core with the stack trace below.
How do I find the reason for the failure?

#  dbx /opt/CA/SharedComponents/ccs/cci/bin/ccijimd core.ccijimd.16136
(dbx) where
=>[1] _nsw_getoneconfig_v1(0x405b8, 0x40, 0x40000, 0xfefd7178, 0xff0b03c0, 0x6dd88), at 0xfefd73d4
  [2] check_format(0x405b8, 0x43, 0xd96ac, 0x57238, 0x0, 0x7dda8), at 0xfefd6db8
  [3] _nsw_getoneconfig_v1(0x0, 0xd, 0xd8dfc, 0x1f264, 0xff0b03c0, 0xff0b92e4), at 0xfefd7614
  [4] extract_format(0xa0, 0x20, 0x7afb0, 0x7afb8, 0xff0b3958, 0x1fec4), at 0xfefd676c
  [5] extract_format(0xa0, 0x1, 0xd9da8, 0x20390, 0xff0b03c0, 0xff0ba558), at 0xfefd665c
  [6] CCI_UTIL_WaitForMultipleObjects(0x1, 0xfd21fe28, 0x0, 0xffffffff, 0x0, 0x0), at 0x2069c
  [7] CCI_UTIL_WaitForSingleObject(0x83940, 0xffffffff, 0xff0b5840, 0x0, 0x83940, 0xffffffff), at 0x20514
  [8] CCI_UTIL_SuspendThread(0x99a60, 0x1, 0xe80cc, 0x13eec, 0x99a60, 0xffffffff), at 0x21a44
  [9] 0x238e0(0x8a1b0, 0xffffffff, 0x0, 0x3cc90, 0x83940, 0x8a1b0), at 0x238e0
  [10] 0x21e74(0x99a60, 0xfd220000, 0x0, 0x0, 0x21da0, 0x99a60), at 0x21e74
(dbx) up
0xfefd6db8: check_format+0x019c:        sll      %i4, 1, %o2
(dbx) up
0xfefd7614: _nsw_getoneconfig_v1+0x054c:        call     spaceskip      ! 0xfefd8140
(dbx) up
0xfefd676c: extract_format+0x01a4:      sll      %o5, 1, %o4
(dbx) up
0xfefd665c: extract_format+0x0094:      sll      %l6, 1, %l1
(dbx) up
0x0002069c: CCI_UTIL_WaitForMultipleObjects+0x00fc:     call     malloc [PLT]   ! 0x38184
(dbx) up
0x00020514: CCI_UTIL_WaitForSingleObject+0x009c:        call     CCI_UTIL_WaitForMultipleObjects        ! 0x205a0
(dbx) up
0x00021a44: CCI_UTIL_SuspendThread+0x0094:      call     CCI_UTIL_WaitForSingleObject   ! 0x20478
(dbx) up
0x000238e0:     call     CCI_UTIL_SuspendThread ! 0x219b0
(dbx) up
0x00021e74:     call     %l1
(dbx) up
dbx: Already at the top call level




Sham

mohet01 (Author) commented:
The corresponding source code around the malloc call is:
#define _MALLOC malloc
typedef struct _WaitQueue
{
  pthread_cond_t cv;
  psIpcContainer objs[ WAITQUEUE_MAX_OBJS ];
  DWORD obj_count;
  struct _WaitQueue * prev;
  struct _WaitQueue * next;
} sWaitQueue, * psWaitQueue;

if ( (wq = (psWaitQueue) _MALLOC( sizeof(sWaitQueue) )) == NULL ) // this is the one
{
....
}


Sham
 
sarabande commented:
it looks as if check_format crashes when accessing a variable argument.

you may check whether check_format takes printf-like arguments, for example "%s %d %s", and whether an appropriate argument was passed for each of the placeholders.
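
To make the placeholder check concrete, here is a minimal sketch in C that counts how many arguments a printf-style format string expects, so the count can be compared with what the caller actually passes. The names count_format_args and format_matches are hypothetical helpers, not part of libc or of the application, and the parser is deliberately simplified (it does not handle positional arguments such as %1$s).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count the arguments a printf-style format string consumes.
 * Simplified sketch: understands "%%", flags, width, precision,
 * length modifiers, and '*' (each '*' consumes one extra int). */
size_t count_format_args(const char *fmt)
{
    size_t n = 0;

    assert(fmt != NULL);
    while (*fmt) {
        if (*fmt++ != '%')
            continue;               /* ordinary character */
        if (*fmt == '%') {          /* "%%" is a literal percent sign */
            fmt++;
            continue;
        }
        /* Skip flags, width, precision and length modifiers. */
        while (*fmt && strchr("-+ #0123456789.*hlLqjzt", *fmt)) {
            if (*fmt == '*')
                n++;                /* width or precision taken from an argument */
            fmt++;
        }
        if (*fmt) {                 /* the conversion character itself */
            n++;
            fmt++;
        }
    }
    return n;
}

/* Return 1 if the caller supplies exactly as many arguments
 * as the format string expects, otherwise 0. */
int format_matches(const char *fmt, size_t supplied)
{
    return count_format_args(fmt) == supplied;
}
```

For the example string above, count_format_args("%s %d %s") gives 3, so a call that passes only two arguments would be flagged by format_matches. Most compilers can do a similar check at build time for their own printf family, which is usually the better option when the format is a literal.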

Sara
 
mohet01 (Author) commented:
The calls below are not from our code; I think they are system library calls.



0xfefd6db8: check_format+0x019c:        sll      %i4, 1, %o2
(dbx) up
0xfefd7614: _nsw_getoneconfig_v1+0x054c:        call     spaceskip      ! 0xfefd8140
(dbx) up
0xfefd676c: extract_format+0x01a4:      sll      %o5, 1, %o4
(dbx) up
0xfefd665c: extract_format+0x0094:      sll      %l6, 1, %l1


We only call _MALLOC(), which in turn calls malloc().
 
Kent Olsen (Data Warehouse Architect / DBA) commented:

Hi Sham,

I suspect that malloc is failing because the heap has been corrupted, probably as a result of writing past the end of a dynamically allocated buffer.

Can you check your program for code that may have read or written a dynamic buffer shortly before this call to malloc?
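
One way to narrow down this kind of overrun without an external tool is to wrap allocations with guard bytes and check them later. The sketch below is only an illustration under assumptions: guarded_malloc, guarded_check and guarded_free are hypothetical names, not functions from the application or from libc, and the layout (a 16-byte header and 8 trailing guard bytes) is an arbitrary choice.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HEADER_SIZE 16    /* holds the size plus a front guard; keeps alignment */
#define GUARD_SIZE  8     /* rear guard bytes placed after the user data */
#define GUARD_BYTE  0xA5  /* pattern that an overrun is likely to destroy */

/* Layout: [size | front guard][user data][rear guard] */
void *guarded_malloc(size_t size)
{
    unsigned char *raw;

    if (size > SIZE_MAX - HEADER_SIZE - GUARD_SIZE)
        return NULL;                       /* size computation would overflow */
    raw = malloc(HEADER_SIZE + size + GUARD_SIZE);
    if (raw == NULL)
        return NULL;
    memcpy(raw, &size, sizeof size);       /* remember the user size */
    memset(raw + sizeof size, GUARD_BYTE, HEADER_SIZE - sizeof size);
    memset(raw + HEADER_SIZE + size, GUARD_BYTE, GUARD_SIZE);
    return raw + HEADER_SIZE;
}

/* Return 1 if both guards are intact, 0 if something wrote outside the block. */
int guarded_check(const void *ptr)
{
    const unsigned char *raw;
    size_t size, i;

    assert(ptr != NULL);
    raw = (const unsigned char *)ptr - HEADER_SIZE;
    memcpy(&size, raw, sizeof size);
    for (i = sizeof size; i < HEADER_SIZE; i++)
        if (raw[i] != GUARD_BYTE)
            return 0;                      /* underrun hit the front guard */
    for (i = 0; i < GUARD_SIZE; i++)
        if (raw[HEADER_SIZE + size + i] != GUARD_BYTE)
            return 0;                      /* overrun hit the rear guard */
    return 1;
}

/* Release a block obtained from guarded_malloc. */
void guarded_free(void *ptr)
{
    if (ptr != NULL)
        free((unsigned char *)ptr - HEADER_SIZE);
}
```

Calling guarded_check on the suspect buffers just before the failing allocation would show whether one of them was overrun. The check only catches writes that land in the guard bytes, and in a multithreaded program it would need the same locking as the buffer itself.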



Kent
 
mohet01 (Author) commented:
Hello kdo,
Maybe you are correct, but do you think it is easy to find out which earlier instruction read or wrote the heap?
This app is multithreaded.
Is there a tool for checking this easily?
Sham
 
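Kent Olsen (Data Warehouse Architect / DBA) commented:
Hi Sham,

Check out Valgrind (http://valgrind.com).

It can help with a lot of memory debugging just by running your existing application under it; no source changes are needed, although building with debug information (-g) makes its reports easier to read.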


Kent
 
sarabande commented:
there is no malloc involved in the core dump. the crash is in a thread that calls extract_format and crashes shortly after that. the thread was suspended while waiting for an event or timer.

Sara

 
mohet01 (Author) commented:
0x0002069c: CCI_UTIL_WaitForMultipleObjects+0x00fc:     call     malloc [PLT]   ! 0x38184

Check this frame: it shows the call to malloc.
We make the same call in the source code.
 
mohet01 (Author) commented:
Hello kdo,
I was able to find the source code for _nsw_getoneconfig_v1(), the function being called, through Google.

Please find it attached.
Sham
 nscd-nswcfgst.c
 
mohet01 (Author) commented:
Maybe this file gives you the information you need?
 
mohet01 (Author) commented:
nparse.c contains the definition of _nsw_getoneconfig_v1().

These two files are from libc, I guess.
 nparse.c
 
sarabande commented:
the file you posted is from solaris. didn't you say hpux? as far as i know, dbx is a solaris tool as well?

the call in the source is

switchcfg = _nsw_getoneconfig_v1(nswdb->name, buf, &err);

that means the first two arguments are probably pointers. when you look at the core dump, you see that the earlier call to _nsw_getoneconfig_v1 (frame [3]) has the values 0x0 and 0xd for its first two arguments. the 0x0 could be a null pointer, which would explain the crash. but you also see that _nsw_getoneconfig_v1 is shown with 6 arguments and not 3 as in the call. so the functions may not be the same and we can't be sure why it crashes.

Sara
 
mohet01 (Author) commented:
Very sorry, yes, this is a Solaris issue, not hpux.
Very sorry.
 
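mohet01 (Author) commented:
CCI_UTIL_WaitForMultipleObjects(0x1, 0xfd21fe28, 0x0, 0xffffffff, 0x0, 0x0), at 0x2069c
does not mean the function has 6 arguments, because
DWORD WaitForMultipleObjects( DWORD nCount,
                              CONST HANDLE *lpHandles,
                              BOOL bWaitAll,
                              DWORD dwMilliseconds )
has only 4 args. Every frame in the trace is shown with six values, so dbx is presumably printing a fixed number of argument slots rather than the real argument list.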
Sham
 
sarabande commented:
you are comparing a windows api with a unix api. that is not valid.

CCI_UTIL_WaitForMultipleObjects does not need to have the same interface as WaitForMultipleObjects.

is it solaris or hpux?

Sara
 
sarabande commented:
oh, i see you already confirmed that it is solaris.

Sara
 
sarabande commented:
did you open the core file with sun studio? then you can comfortably inspect all arguments.

Sara
 
mohet01 (Author) commented:
Hello,
Could you please provide a link for Sun Studio?
Sham
 
sarabande commented:
the newer name is "oracle solaris studio" and you can download it at

http://www.oracle.com/technetwork/server-storage/solarisstudio/downloads/index.html

Sara
 
mohet01 (Author) commented:
Thanks