• Status: Solved

looks like malloc is failing in hpux

Hello,
One of our apps on hpux dumped core; the stack trace is below.
How do I find the reason for the failure?

#  dbx /opt/CA/SharedComponents/ccs/cci/bin/ccijimd core.ccijimd.16136
(dbx) where
=>[1] _nsw_getoneconfig_v1(0x405b8, 0x40, 0x40000, 0xfefd7178, 0xff0b03c0, 0x6dd88), at 0xfefd73d4
  [2] check_format(0x405b8, 0x43, 0xd96ac, 0x57238, 0x0, 0x7dda8), at 0xfefd6db8
  [3] _nsw_getoneconfig_v1(0x0, 0xd, 0xd8dfc, 0x1f264, 0xff0b03c0, 0xff0b92e4), at 0xfefd7614
  [4] extract_format(0xa0, 0x20, 0x7afb0, 0x7afb8, 0xff0b3958, 0x1fec4), at 0xfefd676c
  [5] extract_format(0xa0, 0x1, 0xd9da8, 0x20390, 0xff0b03c0, 0xff0ba558), at 0xfefd665c
  [6] CCI_UTIL_WaitForMultipleObjects(0x1, 0xfd21fe28, 0x0, 0xffffffff, 0x0, 0x0), at 0x2069c
  [7] CCI_UTIL_WaitForSingleObject(0x83940, 0xffffffff, 0xff0b5840, 0x0, 0x83940, 0xffffffff), at 0x20514
  [8] CCI_UTIL_SuspendThread(0x99a60, 0x1, 0xe80cc, 0x13eec, 0x99a60, 0xffffffff), at 0x21a44
  [9] 0x238e0(0x8a1b0, 0xffffffff, 0x0, 0x3cc90, 0x83940, 0x8a1b0), at 0x238e0
  [10] 0x21e74(0x99a60, 0xfd220000, 0x0, 0x0, 0x21da0, 0x99a60), at 0x21e74
(dbx) up
0xfefd6db8: check_format+0x019c:        sll      %i4, 1, %o2
(dbx) up
0xfefd7614: _nsw_getoneconfig_v1+0x054c:        call     spaceskip      ! 0xfefd8140
(dbx) up
0xfefd676c: extract_format+0x01a4:      sll      %o5, 1, %o4
(dbx) up
0xfefd665c: extract_format+0x0094:      sll      %l6, 1, %l1
(dbx) up
0x0002069c: CCI_UTIL_WaitForMultipleObjects+0x00fc:     call     malloc [PLT]   ! 0x38184
(dbx) up
0x00020514: CCI_UTIL_WaitForSingleObject+0x009c:        call     CCI_UTIL_WaitForMultipleObjects        ! 0x205a0
(dbx) up
0x00021a44: CCI_UTIL_SuspendThread+0x0094:      call     CCI_UTIL_WaitForSingleObject   ! 0x20478
(dbx) up
0x000238e0:     call     CCI_UTIL_SuspendThread ! 0x219b0
(dbx) up
0x00021e74:     call     %l1
(dbx) up
dbx: Already at the top call level




Sham

mohet01
Asked:
mohet01
mohet01Author Commented:
The corresponding source code around the malloc call is:
#define _MALLOC malloc
typedef struct _WaitQueue
{
  pthread_cond_t cv;
  psIpcContainer objs[ WAITQUEUE_MAX_OBJS ];
  DWORD obj_count;
  struct _WaitQueue * prev;
  struct _WaitQueue * next;
} sWaitQueue, * psWaitQueue;

if ( (wq = (psWaitQueue) _MALLOC( sizeof(sWaitQueue) )) == NULL ) // this is the one
{
....
}
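
For readers who want to compile the fragment above, here is a minimal sketch of the allocation as a self-contained function. WAITQUEUE_MAX_OBJS, DWORD and psIpcContainer are not defined anywhere in this thread, so the definitions below are placeholder assumptions, and waitqueue_new() / waitqueue_free() are hypothetical helper names. Note that the NULL check only covers the out-of-memory case; it cannot catch a crash that happens inside malloc itself.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Assumed stand-ins: the thread does not show the real definitions. */
#define WAITQUEUE_MAX_OBJS 64
typedef unsigned long DWORD;
typedef void *psIpcContainer;

#define _MALLOC malloc

typedef struct _WaitQueue
{
  pthread_cond_t cv;
  psIpcContainer objs[ WAITQUEUE_MAX_OBJS ];
  DWORD obj_count;
  struct _WaitQueue * prev;
  struct _WaitQueue * next;
} sWaitQueue, * psWaitQueue;

/* Hypothetical helper: allocate and initialise one wait-queue node.
   Returns NULL if malloc or the condition-variable init fails. */
static psWaitQueue waitqueue_new(void)
{
  psWaitQueue wq = (psWaitQueue) _MALLOC( sizeof(sWaitQueue) );
  if ( wq == NULL )
    return NULL;

  memset( wq, 0, sizeof(*wq) );           /* empty object list, no links */
  if ( pthread_cond_init( &wq->cv, NULL ) != 0 )
  {
    free( wq );
    return NULL;
  }
  return wq;
}

/* Hypothetical helper: release a node created by waitqueue_new(). */
static void waitqueue_free(psWaitQueue wq)
{
  if ( wq == NULL )
    return;
  pthread_cond_destroy( &wq->cv );
  free( wq );
}
```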


Sham
 
sarabandeCommented:
it looks as if check_format crashes when accessing a variable argument.

you may check whether check_format takes printf-like arguments, for example "%s %d %s", and whether an appropriate argument was passed for each of the placeholders.
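
to illustrate the kind of check meant here, below is a rough sketch that counts how many arguments a printf-style format string expects, so the number can be compared with what the caller passes. count_format_args() is a hypothetical helper, not part of any library, and it is simplified: it treats "%%" as a literal percent sign but does not account for "*" widths or precisions, which consume an extra argument each.

```c
#include <assert.h>
#include <stddef.h>

/* Count the conversions in a printf-style format that consume an argument.
   "%%" is a literal percent sign and consumes nothing. */
static size_t count_format_args(const char *fmt)
{
    size_t n = 0;

    for (; *fmt != '\0'; ++fmt) {
        if (*fmt != '%')
            continue;
        ++fmt;                 /* look at the character after '%' */
        if (*fmt == '\0')
            break;             /* dangling '%' at the end of the string */
        if (*fmt != '%')
            ++n;               /* a real conversion such as %s or %d */
    }
    return n;
}
```

for "%s %d %s" the function returns 3, so a call that passes only two values after the format would read a third argument that was never supplied. when the function is declared in your own code, gcc can do this check at compile time via -Wformat together with __attribute__((format(printf, ...))).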

Sara
 
mohet01Author Commented:
The calls below are not from our code; I think they are system library calls.



0xfefd6db8: check_format+0x019c:        sll      %i4, 1, %o2
(dbx) up
0xfefd7614: _nsw_getoneconfig_v1+0x054c:        call     spaceskip      ! 0xfefd8140
(dbx) up
0xfefd676c: extract_format+0x01a4:      sll      %o5, 1, %o4
(dbx) up
0xfefd665c: extract_format+0x0094:      sll      %l6, 1, %l1


We only call _MALLOC(), which in turn calls malloc().
Kent OlsenData Warehouse Architect / DBACommented:

Hi Sham,

I suspect that malloc is failing because the heap has been corrupted, probably as a result of over-indexing a dynamically allocated buffer.

Can you check your program for the items that may have read/written a dynamic buffer shortly before this call to malloc?
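
One way to narrow this down is to wrap suspect allocations with guard bytes and check them before the failing call. The following is only a sketch under my own assumptions: guarded_malloc(), guarded_check() and guarded_free() are hypothetical names, the guard sizes are arbitrary, and it only detects writes that land in the few bytes immediately before or after the buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define HEADER_SIZE 16    /* size field + front guard; keeps alignment */
#define GUARD_SIZE  8     /* rear guard length */
#define GUARD_BYTE  0xA5  /* pattern written into both guards */

/* Block layout: [size][front guard][user bytes][rear guard] */
static void *guarded_malloc(size_t size)
{
    unsigned char *raw;

    if (size > (size_t)-1 - HEADER_SIZE - GUARD_SIZE)
        return NULL;                        /* request would overflow */
    raw = malloc(HEADER_SIZE + size + GUARD_SIZE);
    if (raw == NULL)
        return NULL;

    memset(raw, GUARD_BYTE, HEADER_SIZE);   /* front guard */
    memcpy(raw, &size, sizeof size);        /* remember the user size */
    memset(raw + HEADER_SIZE + size, GUARD_BYTE, GUARD_SIZE); /* rear guard */
    return raw + HEADER_SIZE;
}

/* Returns 1 if both guards are intact, 0 if something overwrote them. */
static int guarded_check(const void *ptr)
{
    const unsigned char *user = ptr;
    const unsigned char *raw = user - HEADER_SIZE;
    size_t size, i;

    memcpy(&size, raw, sizeof size);
    for (i = sizeof size; i < HEADER_SIZE; ++i)
        if (raw[i] != GUARD_BYTE)
            return 0;                       /* write before the buffer */
    for (i = 0; i < GUARD_SIZE; ++i)
        if (user[size + i] != GUARD_BYTE)
            return 0;                       /* write past the buffer */
    return 1;
}

static void guarded_free(void *ptr)
{
    if (ptr != NULL)
        free((unsigned char *)ptr - HEADER_SIZE);
}
```

Calling guarded_check() on the suspect buffers just before the malloc that crashes can show which buffer was overrun. In a multi-threaded program the checks would need the same locking as the buffers they inspect.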



Kent
 
mohet01Author Commented:
Hello kdo,
Maybe you are correct, but
do you think it is easy to find out which previous instruction read or wrote the heap?
This app is multi-threaded.
Is there a tool for checking this easily?
Sham
 
Kent OlsenData Warehouse Architect / DBACommented:
Hi Sham,

Check out Valgrind (http://valgrind.com).

It can help with a lot of memory debugging; you run your existing application under it rather than building it into the program.


Kent
 
sarabandeCommented:
in the core dump there is no malloc involved. it is a call from a thread which calls extract_format and crashes shortly after that. the thread was suspended while waiting for an event or timer.

Sara

 
mohet01Author Commented:
0x0002069c: CCI_UTIL_WaitForMultipleObjects+0x00fc:     call     malloc [PLT]   ! 0x38184

Check this line.
Our source code makes the same call.
 
mohet01Author Commented:
Hello kdo
Using Google, I was able to find the source code for _nsw_getoneconfig_v1(), the function that is being called.

Please find it attached.
Sham
 nscd-nswcfgst.c
 
mohet01Author Commented:
Maybe this file gives you the information you need?
 
mohet01Author Commented:
nparse.c contains the definition of _nsw_getoneconfig_v1().

I guess these two files are part of libc.
 nparse.c
 
sarabandeCommented:
the file you posted is from solaris. didn't you say hpux? in my opinion dbx is also a solaris tool.

the call in the source is

switchcfg = _nsw_getoneconfig_v1(nswdb->name, buf, &err);

that means the first two arguments are probably pointers. in the core dump, frame [3] (_nsw_getoneconfig_v1) shows the values 0x0 and 0xd for its first two arguments. the 0x0 could be a null pointer, which would explain the crash. but the dump shows _nsw_getoneconfig_v1 with 6 arguments and not 3 as in the call, so the functions may not be the same and we can't be sure why it crashes.

Sara
 
mohet01Author Commented:
Very sorry.
Yes, this is a Solaris issue, not hpux.
 
mohet01Author Commented:
CCI_UTIL_WaitForMultipleObjects(0x1, 0xfd21fe28, 0x0, 0xffffffff, 0x0, 0x0), at 0x2069c
does not mean it has 6 arguments, because the Windows function it mirrors,
DWORD WaitForMultipleObjects( DWORD nCount,
                              CONST HANDLE *lpHandles,
                              BOOL bWaitAll,
                              DWORD dwMilliseconds )
has only 4 arguments.
Sham
 
sarabandeCommented:
you are comparing a windows api with a unix api. that is not valid.

CCI_UTIL_WaitForMultipleObjects does not need to have the same interface as WaitForMultipleObjects.

is it solaris or hpux?

Sara
 
sarabandeCommented:
oh i see you already confirmed that it is solaris.

Sara
 
sarabandeCommented:
did you open the core file with sunstudio? then you can comfortably inspect all the arguments.

Sara
 
mohet01Author Commented:
Hello,
Could you please provide a download link for Sun Studio?
Sham
 
sarabandeCommented:
the newer name is "oracle solaris studio" and you can download it at

http://www.oracle.com/technetwork/server-storage/solarisstudio/downloads/index.html

Sara
 
mohet01Author Commented:
Thanks