Solved

Job-specific file passing

Posted on 2012-03-21
264 Views
Last Modified: 2012-06-15
Hi.

I have a shell script: pipeline.sh
I'd like to run the shell script 1000 times, by submitting each run as a job to a Linux server queue.
A TMP folder is set up for each job's files:

TMP=~/permanalysis/tmp.$$
mkdir $TMP

When multiple runs are taking place at the same time, there seems to be a problem with $queryfile being passed to fastacutter.pl.

I'm wondering if it is at all possible to give the created $queryfile a unique (job specific) name and then for this unique $queryfile to be used by fastacutter.pl?

I think it's currently failing because multiple scripts are calling fastacutter.pl on the same $queryfile.
pipeline.sh
fastacutter.txt
Question by:StephenMcGowan
7 Comments
 
LVL 2

Expert Comment

by:n4th4nr1ch
ID: 37749324
Hi, you have an old style of scripting; that is, you are scripting more for SH than for BASH.
You can remove all of these things:
if [ -z "$PHOBIUSBIN" ]
then
    PHOBIUSBIN=~/permanalysis/phobius/phobius.pl
fi



and simply use bash's default-value expansion:
echo ${PHOBIUSBIN:-~/permanalysis/phobius/phobius.pl} # PHOBIUSBIN unset: prints the default path
PHOBIUSBIN=/somewhere/else
echo ${PHOBIUSBIN:-~/permanalysis/phobius/phobius.pl} # now prints /somewhere/else
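If the variable should keep that default for the rest of the script, the assigning form := may read better. A small sketch using the same path (note that tilde does not expand inside quotes, so $HOME is used instead):

```shell
#!/bin/bash
# Assign a default only when PHOBIUSBIN is unset or empty.
# The leading ':' is the shell no-op; it exists just to force the expansion.
: "${PHOBIUSBIN:=$HOME/permanalysis/phobius/phobius.pl}"
echo "$PHOBIUSBIN"
```

After this one line, every later use of $PHOBIUSBIN sees either the caller's value or the default.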



Also, by convention ALL_CAPS names are reserved for environment variables; variables used only within a single script are usually lower-case. That is only a convention, though; there is no functional difference.


As for your question, it's a little hard to understand what you're asking. Let me tell you what it appears you are asking, and hopefully my answer will help:
It appears you are running something concurrently in which every instance tries to open and read the same file, and you want to generate filenames automatically so that each instance only affects its own file.

If that is correct you can do this:

filename=/my/normal/filename-$$

In bash, $$ expands to the PID of the process, and is therefore different for each instance.
Perl, I believe, uses the same syntax for the PID: $$
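Applied to the script in question, that could look like the following sketch. The query.$$.fasta name and the fastacutter.pl invocation are assumptions for illustration, not lines taken from pipeline.sh:

```shell
#!/bin/bash
# Give each concurrent instance its own query file, keyed on this shell's PID.
TMP=${TMP:-/tmp}                 # per-job directory, as in the question's script
queryfile="$TMP/query.$$.fasta"
echo "$queryfile"
# Each instance then works only on its own file, e.g.:
# perl fastacutter.pl "$queryfile"
```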
 

Author Comment

by:StephenMcGowan
ID: 37751197
Hi n4th4nr1ch,

Thanks for getting back to me.
"you have an old style of scripting; that is, you are scripting more for SH than for BASH."

This script isn't mine. I've been asked to modify it so that I can run multiple jobs/instances at the same time by submitting the jobs to a queue.
So far, I've tried running 10 instances of the script at the same time by submitting the jobs from the command line:

qsub -b y -cwd -N test1 sh pipeline.sh
job submitted
qsub -b y -cwd -N test2 sh pipeline.sh
job submitted
qsub -b y -cwd -N test3 sh pipeline.sh
job submitted
qsub -b y -cwd -N test4 sh pipeline.sh
job submitted
etc. up to job 10

For three of the jobs, the process worked fine and the job ran to completion. But for the other seven there seemed to be a problem passing the file to fastacutter.pl

I'm wondering if these jobs are conflicting by trying to use the same files.
I'd like everything belonging to one job to be kept separate from every other job, i.e. give the created $queryfile a unique (job-specific) name and have fastacutter.pl use that unique $queryfile.

"It appears you are running something concurrently which is all trying to open and read/use  the same file and you want to somehow automatically generate filenames so each instance is only affecting its own file."
Exactly. I have a script which was written to run on its own, but I've been asked to modify it so that it can be run 1000 times at the same time without the jobs conflicting.


If this is achievable, what modifications would I need to make to this script?
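For reference, one sketch of such a modification, assuming the queue is Grid Engine (which exports JOB_ID to every queued job) and falling back to the PID for interactive runs; the query-file name is an assumption:

```shell
#!/bin/bash
# Derive one job-specific ID and use it for every per-job file.
jobid="${JOB_ID:-$$}"                 # SGE job number, or PID outside the queue
TMP="$HOME/permanalysis/tmp.$jobid"
queryfile="$TMP/query.$jobid.fasta"
mkdir -p "$TMP"
echo "$queryfile"
```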

Thanks,

Stephen.
 
LVL 34

Accepted Solution

by:
Duncan Roe earned 500 total points
ID: 37765039
There is a command called tempfile which is guaranteed to create a unique temporary file for you. You can read about it by entering man 1 tempfile in a command window. You may be able to use this as the basis for a solution, if you are correct in assuming that your problem really is the same file being used by multiple jobs. I.e. replace
TMP=~/permanalysis/tmp.$$
mkdir $TMP


with
TMP=$(tempfile -d ~/permanalysis)
rm $TMP
mkdir $TMP


You could do further debugging by following the above with these lines
set -x
exec >>$TMP.out 2>&1


The created .out files will show you all subsequently executed shell commands. Using >> rather than > in the exec command is just being ultra-safe: $TMP.out should never exist because of the uniqueness of names created by tempfile. Better safe than sorry though.
You could use the -p or -s options of tempfile to generate filenames that you can easily delete afterwards.
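As an aside, tempfile comes from Debian's debianutils and is not available everywhere, and it creates a file rather than a directory. On systems that have mktemp, mktemp -d does the create-unique-directory step in one atomic call, avoiding the rm/mkdir window above. A sketch using the same parent directory:

```shell
#!/bin/bash
# mktemp -d atomically creates a unique directory, so no rm/mkdir pair is needed.
mkdir -p "$HOME/permanalysis"                    # make sure the parent exists
TMP=$(mktemp -d "$HOME/permanalysis/tmp.XXXXXX")
echo "$TMP"
rmdir "$TMP"                                     # clean up when the job is done
```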
 

Author Comment

by:StephenMcGowan
ID: 37921960
I've requested that this question be deleted for the following reason:

Abandoned
 
LVL 34

Expert Comment

by:Duncan Roe
ID: 37921961
This is bad etiquette. StephenMcGowan has not responded to my last post, which I intended as a complete solution to his problem as asked.
