Solved

segfault in MPI code

Posted on 2004-08-22
Last Modified: 2008-01-09
Hi,

When I try to run this code it segfaults, and I do not understand why. The segfault happens near the memory allocation.

All I am trying to do is allocate memory so that I can read values into the matrices that will be sent to other computers.


regards

#include <mpi.h>
#include <unistd.h>
#define  NUMS 10

int main (int argc, char *argv[])

{

  int myid, numprocs;
  char hostname[30];
  int  j, len=30;
  short int **a;
  short int **b;
  short int **c;


  for (j=0; j<NUMS; j++)
    {

      a[j]= malloc(NUMS, sizeof(short int));
      b[j]= malloc(NUMS*sizeof(short int));
      c[j]= malloc(NUMS*sizeof(short int));
    }

  MPI_Init(&argc, &argv);
  MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
  MPI_Comm_rank(MPI_COMM_WORLD, &myid);
  gethostname(hostname, len);

  printf("\n %s is the hostname %d is my id ", hostname, myid);


  MPI_Finalize();

-----------

gdb session
--------
 gdb broad core.17740
GNU gdb Red Hat Linux (5.3post-0.20021129.18rh)
Copyright 2003 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...
Core was generated by `./broad'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /usr/lib/libefence.so.0...done.
Loaded symbols for /usr/lib/libefence.so.0
Reading symbols from /lib/libutil.so.1...done.
Loaded symbols for /lib/libutil.so.1
Reading symbols from /lib/tls/libpthread.so.0...done.
Loaded symbols for /lib/tls/libpthread.so.0
Reading symbols from /lib/tls/libc.so.6...done.
Loaded symbols for /lib/tls/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
#0  0x08049d54 in main (argc=1, argv=0xbfffe384) at broad.c:20
20            a[j]= (short int *) calloc(NUMS, sizeof(short int));
(gdb) run
Starting program: /home/karan/mpi/matrix/broad
[New Thread 1073971872 (LWP 17752)]

  Electric Fence 2.2.0 Copyright (C) 1987-1999 Bruce Perens <bruce@perens.com>

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1073971872 (LWP 17752)]
0x40011d2a in strcmp () from /lib/ld-linux.so.2
(gdb) bt full
#0  0x40011d2a in strcmp () from /lib/ld-linux.so.2
No symbol table info available.
#1  0x40009941 in _dl_name_match_p () from /lib/ld-linux.so.2
No symbol table info available.
#2  0x40009565 in do_lookup_versioned () from /lib/ld-linux.so.2
No symbol table info available.
#3  0x40008676 in _dl_lookup_versioned_symbol_internal () from /lib/ld-linux.so.2
No symbol table info available.
#4  0x4000c403 in fixup () from /lib/ld-linux.so.2
No symbol table info available.
#5  0x4000c2c0 in _dl_runtime_resolve () from /lib/ld-linux.so.2
No symbol table info available.
#6  0x08057154 in lam_setfunc ()
No symbol table info available.
#7  0x0804a503 in MPI_Init ()
No symbol table info available.
#8  0x08049da8 in main (argc=1073991660, argv=0x40042fec) at broad.c:25
        myid = 134752202
        numprocs = -1073747304
        hostname = "\001\000\000\000\000\000\000\000ZT\002@\216\234\004\b\030.\023BÔ(\023B\210êÿ¿\221\225"
        j = 10
        len = 30
        a = (short int **) 0x4000c403
        b = (short int **) 0xbfffeaa0
        c = (short int **) 0x400156f8
#9  0x42015704 in __libc_start_main () from /lib/tls/libc.so.6
No symbol table info available.
Question by:team

Author Comment

by:team
I will soon need to allocate memory for a 10000 x 10000 matrix with 8-byte elements, so I would appreciate an approach that lets me allocate the memory without problems. I am not quite sure how a 2D linked list would help with memory allocation. How would I address an element? With arrays it would be [][], but what would it be with linked lists?

ulimit shows: unlimited. I am compiling this code on a Linux machine and using MPICH.
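For a matrix of that size, one common approach (a sketch only, not something proposed in this thread) is to allocate all elements as a single contiguous block and build a table of row pointers into it, so that elements are still addressed as m[i][j]. The sketch below assumes the 8-byte elements are double; the names alloc_matrix and free_matrix are hypothetical. A 10000 x 10000 matrix of 8-byte elements needs 800,000,000 bytes (roughly 763 MiB) for the data alone, so checking the return value matters.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate a rows x cols matrix of double as ONE
   contiguous, zero-filled block plus a table of row pointers.
   Assumption: the 8-byte element type is double. */
double **alloc_matrix(size_t rows, size_t cols)
{
    double **m;
    double *data;
    size_t i;

    /* Reject empty shapes and element counts that overflow size_t. */
    if (rows == 0 || cols == 0 || cols > SIZE_MAX / rows)
        return NULL;

    m = malloc(rows * sizeof *m);             /* table of row pointers */
    if (m == NULL)
        return NULL;

    data = calloc(rows * cols, sizeof *data); /* all elements, zeroed */
    if (data == NULL) {
        free(m);
        return NULL;
    }

    for (i = 0; i < rows; i++)
        m[i] = data + i * cols;               /* row i starts here */

    return m;
}

/* Release a matrix obtained from alloc_matrix. */
void free_matrix(double **m)
{
    if (m != NULL) {
        free(m[0]);   /* the contiguous data block */
        free(m);      /* the row pointer table */
    }
}
```

Because the elements are contiguous, the whole matrix can be described to MPI as one buffer of rows * cols elements starting at m[0], which is not possible when every row is a separate allocation.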


Accepted Solution

by:
brettmjohnson earned 50 total points
>int  j, len=30;
>  short int **a;
>  short int **b;
>  short int **c;
>
>  for (j=0; j<NUMS; j++)
>    {
>      a[j]= malloc(NUMS, sizeof(short int));
>      b[j]= malloc(NUMS*sizeof(short int));
>      c[j]= malloc(NUMS*sizeof(short int));
>    }

Your problem is here: a, b, and c are meant to be arrays of pointers to 10-element arrays of shorts.
Although your loop allocates the arrays of shorts, you never allocate memory for the
arrays of pointers, so a[j], b[j], and c[j] are written through uninitialized pointers.
You also have an error in the allocation of a[j]: malloc takes a single argument.

int  j, len=30;
 short int **a = (short **) malloc(NUMS*sizeof(short *));
 short int **b = (short **) malloc(NUMS*sizeof(short *));
 short int **c = (short **) malloc(NUMS*sizeof(short *));

 for (j=0; j<NUMS; j++)
   {
     a[j]= (short *)malloc(NUMS*sizeof(short int));
     b[j]= (short *)malloc(NUMS*sizeof(short int));
     c[j]= (short *)malloc(NUMS*sizeof(short int));
   }
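A slightly fuller version of the same idea, as a sketch: it checks every allocation, zero-fills each row with calloc, and releases whatever was already allocated if a later row fails. The helper names alloc_rows and free_rows are hypothetical and not part of the original code.

```c
#include <stdlib.h>

/* Hypothetical helper: build an n x n matrix of short as an array of
   row pointers, each row allocated separately and zero-filled. */
short **alloc_rows(size_t n)
{
    short **m = malloc(n * sizeof *m);   /* the array of pointers */
    size_t i;

    if (m == NULL)
        return NULL;

    for (i = 0; i < n; i++) {
        m[i] = calloc(n, sizeof **m);    /* one row of n shorts */
        if (m[i] == NULL) {
            /* Undo the rows allocated so far, then the pointer array. */
            while (i > 0)
                free(m[--i]);
            free(m);
            return NULL;
        }
    }
    return m;
}

/* Release a matrix obtained from alloc_rows(n). */
void free_rows(short **m, size_t n)
{
    size_t i;

    if (m == NULL)
        return;
    for (i = 0; i < n; i++)
        free(m[i]);
    free(m);
}
```

With these helpers, the three declarations would become calls such as `short **a = alloc_rows(NUMS);`, each followed by a NULL check before the matrix is used.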


Author Comment

by:team
Hi Brett,

Thanks for your comments. I didn't realize that I was making a silly mistake. The memory allocation for a[j] was meant to be calloc, not malloc; I did not notice the typo. I figured it is a lot easier to have the memory initialized to 0 up front than to initialize it later for the matrix operations.


regards
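For reference, the difference between the two calls can be sketched as follows: calloc takes the element count and the element size as two separate arguments and returns zero-filled memory, while malloc takes one total byte count and leaves the contents uninitialized, so it needs an explicit memset to get the same result. The function names below are hypothetical and exist only for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: n zeroed shorts via calloc.
   Two arguments: element count, then element size. */
short *zeroed_with_calloc(size_t n)
{
    return calloc(n, sizeof(short));
}

/* Hypothetical helper: n zeroed shorts via malloc plus memset.
   malloc takes one argument, the total number of bytes. */
short *zeroed_with_malloc(size_t n)
{
    short *p = malloc(n * sizeof *p);

    if (p != NULL)
        memset(p, 0, n * sizeof *p);
    return p;
}
```

Both return NULL on failure, and in both cases the caller releases the memory with free.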
