Solved

I can't solve the "sigsuspend" problem.

Posted on 2004-04-26
Last Modified: 2008-02-01
My threaded program is running on Solaris 2.8.
It uses the msgrcv and mutex_lock functions.
Occasionally it hangs in sigsuspend.
I can't find the reason.

When this happens, the process state shown by truss is as follows.

sigsuspend(0xFFAE6AC8)          (sleeping...)
signotifywait()                 (sleeping...)
lwp_cond_wait(0xFF275548, 0xFF275558, 0xFF26EDB0) (sleeping...)
door_return(0x00000000, 0, 0x00000000, 0) (sleeping...)

please help me!!!


Question by:hypark7
8 Comments
 
LVL 45

Expert Comment

by:sunnycoder
ID: 10916343
Hi hypark7,

Are you sure you do not have any sigsuspend call in your code?

If you are sure, then maybe mutex_lock, or some other function that needs to sleep while waiting for an event, is calling sigsuspend.

Sunnycoder
 
LVL 3

Expert Comment

by:norsethomas
ID: 10918608

hypark7,

please have a look at the output of the pstack command, i.e.

/usr/proc/bin/pstack pid

where pid is the process id of your process.  Which of the threads
executes the sigsuspend?

Usually the signotifywait() mentioned above is executed by the ASLWP
(the Asynchronous Signals LWP).

Thomas
 

Author Comment

by:hypark7
ID: 10924052
Thanks to Sunnycoder and Thomas.

Here is some additional information.

I'm sure that I don't have any sigsuspend call in my code, and the "sigsuspend" hang happens only rarely.
I can't reproduce it on a test system.

In my case, a deadlock caused by mutex_lock looks as follows in truss, and I have already solved that problem.

lwp_mutex_lock(0xFF0D0000)      (sleeping...)
signotifywait()                 (sleeping...)
lwp_cond_wait(0xFF385548, 0xFF385558, 0xFF37EDB0) (sleeping...)
door_return(0x00000000, 0, 0x00000000, 0) (sleeping...)

or

lwp_sema_wait(0x00072168)       (sleeping...)
signotifywait()                 (sleeping...)
lwp_cond_wait(0xFF2734E8, 0xFF2734F8, 0xFF26CD80) (sleeping...)
door_return(0x00000000, 0, 0x00000000, 0) (sleeping...)

Which kinds of functions can cause the "sigsuspend" situation?

Moreover, the system is in production.

So when it happens, I must kill the process and restart it immediately.

Please help me!
 
LVL 3

Accepted Solution

by:
norsethomas earned 500 total points
ID: 10925726

> so, when it happened, I must kill the process and restart it instantly.

well, do a pstack before you kill it ;-)

sigsuspend might be called from any library function, even from within
the ASLWP; therefore we need the correct thread stack.

The sigsuspend might also be called from the ASLWP when it calls
one of your signal handlers ... Do you have signal handlers in your
code? If yes, how did you set them up (signal() or sigaction() ?)
What do they do? Do you use any kind of sync primitives inside
the handlers?

Thomas
 

Author Comment

by:hypark7
ID: 10935025
Thanks, Thomas.

If the "sigsuspend" hang happens again, I will run pstack before I kill the process.
But it happens only about once every 2~3 days across 10 identical systems.
I hope pstack will help me solve the problem.

I don't have any signal handlers in my code.

I will give you 50% of the points right now, and the other 50% after I solve the problem.

Thanks!!!

 
LVL 9

Expert Comment

by:ankuratvb
ID: 10967478
 
LVL 3

Expert Comment

by:norsethomas
ID: 10978111

I'm sure hypark7 hasn't forgotten to link with -lthread
or -lpthread. Am I right?

Thomas
 

Author Comment

by:hypark7
ID: 10983180
Thanks, Thomas!!

Yes, I did link with -lthread.

By the way, I deleted a "usleep" call that came after several mutex function calls (mutex_lock and mutex_unlock).

The "sigsuspend" problem has not happened for about 6 days since I deleted it.

I hope the "sigsuspend" problem does not come back.

I will give all the points to Thomas. Thanks.




