
Abnormal termination while running parallel jobs on a Linux cluster

I am running a parallelised solver on an HP AMD Opteron cluster with 16 processors.

The first problem is that when I run a case on 2, 4 or 8 processors, the CPU time taken does not differ much.
The job even seems to take more time on 4 processors than on 2 processors.

The second problem is that when I submit a job through the batch manager, for one case which I run on
processor, I get the following error:

TERM_RUNLIMIT: job killed after reaching LSF run time limit.
Exited with exit code 1.

Can you please explain what LSF means, and whether this is a problem with the server?

I am also using HPMPICH to run the parallel application.
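To make the scaling comparison above concrete, here is a minimal sketch of how speedup and parallel efficiency could be worked out from run times. It assumes wall-clock (elapsed) times are compared rather than CPU time, since CPU time summed over all processes generally does not drop as processors are added. The timings in `wall_clock` are hypothetical placeholders, not measurements from this cluster, and the function names are made up for illustration.

```python
def speedup(t_ref, t_n):
    """Speedup of a run taking t_n seconds relative to a reference run taking t_ref seconds."""
    return t_ref / t_n


def efficiency(t_ref, n_ref, t_n, n):
    """Parallel efficiency: achieved speedup divided by the ideal speedup n / n_ref."""
    return speedup(t_ref, t_n) / (n / n_ref)


def scaling_table(times):
    """Return (processors, speedup, efficiency) tuples relative to the smallest run.

    `times` maps processor count -> wall-clock seconds.
    """
    base_n = min(times)
    base_t = times[base_n]
    return [
        (n, speedup(base_t, t), efficiency(base_t, base_n, t, n))
        for n, t in sorted(times.items())
    ]


# Hypothetical wall-clock times in seconds (NOT real measurements).
wall_clock = {2: 1000.0, 4: 800.0, 8: 900.0}

for n, s, e in scaling_table(wall_clock):
    print(f"{n:2d} processors: speedup {s:.2f}, efficiency {e:.2f}")
```

An efficiency well below 1.0 that keeps falling as processors are added would suggest that communication or load imbalance dominates for this problem size, though that is only one possible reading of such numbers.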


2 Solutions
Hanno P.S., IT Consultant and Infrastructure Architect, commented:
No comment has been added to this question in more than 21 days, so it is now classified as abandoned.

I will leave the following recommendation for this question in the Cleanup Zone:
SPLIT POINTS - between giltjr {http:#18777218} and ssvl {http:#18778068}

Any objections should be posted here in the next 4 days. After that time, the question will be closed.
JustUNIX, Experts Exchange Cleanup Volunteer

Forced accept.

EE Admin
Question has a verified solution.
