Solved

I can't solve the "sigsuspend" problem.

Posted on 2004-04-26
Last Modified: 2008-02-01
My threaded program is running on Solaris 2.8.
It uses the msgrcv and mutex_lock functions.
However, it hangs in sigsuspend.
I can't find the reason.

When this happens, the process state, as monitored by truss, is as follows.

sigsuspend(0xFFAE6AC8)          (sleeping...)
signotifywait()                 (sleeping...)
lwp_cond_wait(0xFF275548, 0xFF275558, 0xFF26EDB0) (sleeping...)
door_return(0x00000000, 0, 0x00000000, 0) (sleeping...)

please help me!!!


Question by:hypark7
8 Comments
 
LVL 45

Expert Comment

by:sunnycoder
ID: 10916343
Hi hypark7,

Are you sure you do not have any sigsuspend call in your code?

If you are sure, then maybe mutex_lock, or some other function that needs to sleep while waiting for an event, is calling sigsuspend.

Sunnycoder
 
LVL 3

Expert Comment

by:norsethomas
ID: 10918608

hypark7,

Please have a look at the output of the pstack command, i.e.

/usr/proc/bin/pstack pid

where pid is the process id of your process.  Which of the threads
executes the sigsuspend?

Usually the signotifywait() mentioned above is executed by the ASLWP
(the Asynchronous Signals LWP).

Thomas
 

Author Comment

by:hypark7
ID: 10924052
Thanks to Sunnycoder and Thomas.

Here is some additional information.

I'm sure that I don't have any sigsuspend call in my code, and the "sigsuspend" hang happens only rarely.
I can't reproduce the situation on a test system.

In my case, a deadlock caused by mutex_lock looks as follows, and I have already solved that problem.

lwp_mutex_lock(0xFF0D0000)      (sleeping...)
signotifywait()                 (sleeping...)
lwp_cond_wait(0xFF385548, 0xFF385558, 0xFF37EDB0) (sleeping...)
door_return(0x00000000, 0, 0x00000000, 0) (sleeping...)

or

lwp_sema_wait(0x00072168)       (sleeping...)
signotifywait()                 (sleeping...)
lwp_cond_wait(0xFF2734E8, 0xFF2734F8, 0xFF26CD80) (sleeping...)
door_return(0x00000000, 0, 0x00000000, 0) (sleeping...)

Which kinds of functions may cause the "sigsuspend" situation?

Moreover, the system is in production.

So when it happens, I must kill the process and restart it immediately.

Please help me!!
 
LVL 3

Accepted Solution

by:
norsethomas earned 500 total points
ID: 10925726

> so, when it happened, I must kill the process and restart it instantly.

well, do a pstack before you kill it ;-)

sigsuspend might be called from any library function, even from within
the ASLWP; therefore we need the correct thread stack.

The sigsuspend might also be called from the ASLWP when it calls
one of your signal handlers ... Do you have signal handlers in your
code? If yes, how did you set them up (signal() or sigaction() ?)
What do they do? Do you use any kind of sync primitives inside
the handlers?

Thomas

 

Author Comment

by:hypark7
ID: 10935025
Thanks, Thomas.

If the "sigsuspend" event happens again, I will do a pstack before I kill it.
But it has happened only about once every 2~3 days across 10 identical systems.
I hope to solve the problem with pstack.

I don't have any signal handlers in my code.

I will give you 50% of the points right now, and the other 50% after I solve the problem.

Thanks!!!

 
LVL 9

Expert Comment

by:ankuratvb
ID: 10967478
 
LVL 3

Expert Comment

by:norsethomas
ID: 10978111

I'm sure hypark7 hasn't forgotten to link with -lthread
or -lpthread. Am I right?

Thomas
 

Author Comment

by:hypark7
ID: 10983180
Thanks, Thomas!

I did link with -lthread.

By the way, I deleted a "usleep" call that was located after several mutex function calls (mutex_lock and mutex_unlock).

The "sigsuspend" problem has not happened in the roughly 6 days since I deleted it.

I hope the "sigsuspend" problem does not happen again.

I will give all the points to Thomas. Thanks.
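
The thread does not confirm why removing usleep helped. One plausible reading, stated here as an assumption, is that usleep on this release relied on the process interval timer and a signal wait, which would explain the sigsuspend seen in truss. If a short delay is still needed, a hedged alternative is nanosleep, which POSIX specifies as suspending only the calling thread. The helper name sleep_us below is hypothetical.

```c
#include <errno.h>
#include <time.h>

/* Hypothetical helper: sleep for usec microseconds without usleep().
 * Returns 0 on success, -1 on error with errno set. */
int sleep_us(long usec)
{
    struct timespec req, rem;

    if (usec < 0) {
        errno = EINVAL;
        return -1;
    }
    req.tv_sec  = usec / 1000000L;
    req.tv_nsec = (usec % 1000000L) * 1000L;

    /* If a signal interrupts the sleep, continue with the remaining time. */
    while (nanosleep(&req, &rem) == -1) {
        if (errno != EINTR)
            return -1;
        req = rem;
    }
    return 0;
}
```

A call such as sleep_us(1000) would stand in for usleep(1000). On Solaris 8, nanosleep is documented under the realtime library, so linking with -lrt may be required in addition to -lthread.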






Question has a verified solution.
