Solved

I can't solve the "sigsuspend" problem.

Posted on 2004-04-26
Medium Priority
1,005 Views
Last Modified: 2008-02-01
My threaded program is running on Solaris 2.8.
It uses the msgrcv and mutex_lock functions.
It hangs in sigsuspend,
and I can't find the reason.

When the problem occurs, the process state, as monitored by truss, is as follows:

sigsuspend(0xFFAE6AC8)          (sleeping...)
signotifywait()                 (sleeping...)
lwp_cond_wait(0xFF275548, 0xFF275558, 0xFF26EDB0) (sleeping...)
door_return(0x00000000, 0, 0x00000000, 0) (sleeping...)

Please help me!


Question by:hypark7
8 Comments
 
LVL 45

Expert Comment

by:sunnycoder
ID: 10916343
Hi hypark7,

Are you sure you do not have any sigsuspend call in your code?

If you are sure, then maybe mutex_lock, or some other function that needs to sleep while waiting for an event, is calling sigsuspend.

Sunnycoder
 
LVL 3

Expert Comment

by:norsethomas
ID: 10918608

hypark7,

please have a look at the output of the pstack command, i.e.

/usr/proc/bin/pstack pid

where pid is the process id of your process.  Which of the threads
executes the sigsuspend?

Usually the signotifywait() mentioned above is executed by the ASLWP
(the Asynchronous Signals LWP).

Thomas
 

Author Comment

by:hypark7
ID: 10924052
Thanks to Sunnycoder and Thomas.

Here is some additional information.

I'm sure that I don't have any sigsuspend call in my code, and the "sigsuspend" hang happens only rarely.
I can't reproduce it on a test system.

In my case, the deadlock state caused by mutex_lock looked as follows, and I have already solved that problem:

lwp_mutex_lock(0xFF0D0000)      (sleeping...)
signotifywait()                 (sleeping...)
lwp_cond_wait(0xFF385548, 0xFF385558, 0xFF37EDB0) (sleeping...)
door_return(0x00000000, 0, 0x00000000, 0) (sleeping...)

or

lwp_sema_wait(0x00072168)       (sleeping...)
signotifywait()                 (sleeping...)
lwp_cond_wait(0xFF2734E8, 0xFF2734F8, 0xFF26CD80) (sleeping...)
door_return(0x00000000, 0, 0x00000000, 0) (sleeping...)

Which kinds of functions can cause the "sigsuspend" situation?

Moreover, the system is in production.

So when it happens, I must kill the process and restart it immediately.

Please help me!
 
LVL 3

Accepted Solution

by:
norsethomas earned 1500 total points
ID: 10925726

> so, when it happened, I must kill the process and restart it instantly.

Well, do a pstack before you kill it ;-)

sigsuspend might be called from any library function, even from within
the ASLWP; therefore we need the correct thread stack.

The sigsuspend might also be called from the ASLWP when it calls
one of your signal handlers ... Do you have signal handlers in your
code? If yes, how did you set them up (signal() or sigaction() ?)
What do they do? Do you use any kind of sync primitives inside
the handlers?

Thomas
 

Author Comment

by:hypark7
ID: 10935025
Thanks, Thomas.

If the "sigsuspend" hang happens again, I will run pstack before I kill the process.
But it happens only about once every 2 to 3 days across 10 identical systems.
I hope pstack will help me solve the problem.

I don't have any signal handlers in my code.

I will give you 50% of the points right now, and the other 50% after I solve the problem.

Thanks!!!

 
LVL 3

Expert Comment

by:norsethomas
ID: 10978111

I'm sure hypark7 hasn't forgotten to link with -lthread
or -lpthread. Am I right?

Thomas
 

Author Comment

by:hypark7
ID: 10983180
Thanks, Thomas!

Yes, I did link with -lthread.

By the way, I deleted a "usleep" call that was located after several mutex function calls (mutex_lock and mutex_unlock).

The "sigsuspend" problem has not happened for about 6 days since I deleted it.

I hope the "sigsuspend" problem does not happen again.

I will give all the points to Thomas. Thanks.
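
The thread does not establish why removing usleep helped. One possible explanation, stated here only as an assumption, is that usleep on some older systems is built on an interval timer and SIGALRM; the standard describing usleep leaves its interaction with SIGALRM and other timer functions unspecified. If a short delay is still needed, a nanosleep-based helper is one option, since nanosleep suspends only the calling thread. The sketch below is illustrative: sleep_usec is a hypothetical name, and on older Solaris releases nanosleep may need an extra library at link time (check the man page).

```c
#include <errno.h>
#include <time.h>

/* Hypothetical helper: sleep for `usec` microseconds using nanosleep().
 * Returns 0 on success, -1 on error (errno is set by nanosleep). */
int sleep_usec(unsigned long usec)
{
    struct timespec req, rem;

    req.tv_sec  = (time_t)(usec / 1000000UL);
    req.tv_nsec = (long)(usec % 1000000UL) * 1000L;

    /* If a signal interrupts the sleep, continue with the time left. */
    while (nanosleep(&req, &rem) == -1) {
        if (errno != EINTR)
            return -1;
        req = rem;
    }
    return 0;
}
```

For example, sleep_usec(1000) pauses the calling thread for about one millisecond. Whether such a replacement avoids the hang seen here would still need to be confirmed with pstack output from an affected process.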




