Abnormal termination while running parallel jobs on Linux cluster
Posted on 2007-03-22
I am running a parallelised solver on a cluster of HP AMD Opteron machines with 16 processors.
The first problem is that when I run a case on 2, 4 or 8 processors, the CPU time taken does not differ much.
The job seems to take more time for 4 processors than for 2 processors.
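
To make the scaling comparison concrete, here is a minimal sketch of how speedup and parallel efficiency could be computed from wall-clock times, relative to the smallest run. The function name and the timings are hypothetical placeholders for illustration only, not measurements from this cluster.

```python
def speedup_and_efficiency(timings):
    """Return {procs: (speedup, efficiency)} relative to the smallest run.

    timings: dict mapping processor count -> wall-clock time in seconds.
    """
    base_procs = min(timings)
    base_time = timings[base_procs]
    results = {}
    for procs in sorted(timings):
        # Speedup relative to the run with the fewest processors.
        speedup = base_time / timings[procs]
        # Ideal speedup would be procs / base_procs, so efficiency is the ratio.
        efficiency = speedup * base_procs / procs
        results[procs] = (speedup, efficiency)
    return results


if __name__ == "__main__":
    # Hypothetical placeholder timings (seconds), NOT real measurements.
    example = {2: 100.0, 4: 110.0, 8: 105.0}
    for procs, (s, e) in speedup_and_efficiency(example).items():
        print(f"{procs} procs: speedup {s:.2f}, efficiency {e:.2f}")
```

An efficiency well below 1.0 as the processor count grows would suggest that communication or load imbalance dominates, though that is only one possible reading.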
The second problem is that when I submit a job through the batch manager, for one case which I run on
processor, I get the following error:
TERM_RUNLIMIT: job killed after reaching LSF run time limit.
Exited with exit code 1.
Can you please explain what is meant by LSF? And is this a problem with the server?
I am also using HPMPICH to run the parallel application.