Solved

Job specific file passing

Posted on 2012-03-21
262 Views
Last Modified: 2012-06-15
Hi.

I have a shell script: pipeline.sh
I'd like to run the shell script 1000 times. This will be done by submitting each job to a Linux server queue.
A TMP folder is set up for each job's files:

TMP=~/permanalysis/tmp.$$
mkdir $TMP

When multiple runs are taking place at the same time, there seems to be a problem with $queryfile being passed to fastacutter.pl.

I'm wondering if it is at all possible to give the created $queryfile a unique (job specific) name and then for this unique $queryfile to be used by fastacutter.pl?

I think it's currently failing because multiple scripts have fastacutter.pl reading the same $queryfile.
Attachments: pipeline.sh, fastacutter.txt
Question by:StephenMcGowan
7 Comments
 
LVL 2

Expert Comment

by:n4th4nr1ch
ID: 37749324
Hi, you have an old style of scripting; that is, you are writing more for sh than for bash.
You can remove all of these things:
if [ -z "$PHOBIUSBIN" ]
then
    PHOBIUSBIN=~/permanalysis/phobius/phobius.pl
fi



and simply use the default value option of bash:
echo ${PHOBIUSBIN:-~/permanalysis/phobius/phobius.pl} # echoes the default path (with ~ expanded), since PHOBIUSBIN is unset
PHOBIUSBIN=/somewhere/else
echo ${PHOBIUSBIN:-~/permanalysis/phobius/phobius.pl} # echoes /somewhere/else
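Note that `:-` only substitutes the default at expansion time; it doesn't set the variable. To actually assign the default (closest to what the original `if [ -z ... ]` block does), bash also has the `:=` form. A minimal sketch:

```shell
#!/bin/bash
# ${VAR:=default} assigns the default when VAR is unset or empty;
# the leading ":" is the shell no-op command, used only to trigger the expansion.
unset PHOBIUSBIN
: "${PHOBIUSBIN:=$HOME/permanalysis/phobius/phobius.pl}"
echo "$PHOBIUSBIN"    # the default, since PHOBIUSBIN was unset

PHOBIUSBIN=/somewhere/else
: "${PHOBIUSBIN:=$HOME/permanalysis/phobius/phobius.pl}"
echo "$PHOBIUSBIN"    # /somewhere/else - already set, so the default is ignored
```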



Also, you generally don't use ALL_CAPS_VARIABLES for variables that are only used within a single script; by convention those names are reserved for environment variables. There is no functional difference, though.


As for your question, it's a little hard to understand what you're asking. Let me tell you what it appears you are asking, and hopefully my answer will help:
It appears you are running something concurrently where every instance is trying to open and use the same file, and you want to automatically generate filenames so that each instance only affects its own file.

If that is correct you can do this:

filename=/my/normal/filename-$$

In bash, $$ is the PID of the process, and therefore would be different for each instance.
Perl, I believe, uses the same syntax for the PID: $$
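Applied to this pipeline, it might look like the following sketch (query.$$.fa is a hypothetical name — use whatever pipeline.sh actually calls the file):

```shell
#!/bin/bash
# $$ expands to the PID of this shell, so concurrent jobs get distinct names.
TMP=~/permanalysis/tmp.$$
mkdir -p "$TMP"
queryfile="$TMP/query.$$.fa"   # job-specific query file
touch "$queryfile"
# hand the unique name on to the Perl stage, e.g.:
# perl fastacutter.pl "$queryfile"
echo "$queryfile"
```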
 

Author Comment

by:StephenMcGowan
ID: 37751197
Hi n4th4nr1ch,

Thanks for getting back to me.
"you have an old style of scripting; that is, you are scripting more for SH than for BASH."

This script isn't mine. I've been asked to modify it so that I can run multiple jobs/instances at the same time by submitting the jobs to a queue.
So far, I've tried running 10 instances of the script at the same time by submitting the jobs from the command line:

qsub -b y -cwd -N test1 sh pipeline.sh
job submitted
qsub -b y -cwd -N test2 sh pipeline.sh
job submitted
qsub -b y -cwd -N test3 sh pipeline.sh
job submitted
qsub -b y -cwd -N test4 sh pipeline.sh
job submitted
etc. up to job10

For three of the jobs, the process worked fine and the job ran to completion, but for the other seven there seemed to be a problem passing the file to fastacutter.pl.

I'm wondering if these jobs are conflicting and trying to use the same files.
I'd like everything used by one job to be kept separate from every other job (i.e. "I'm wondering if it is at all possible to give the created $queryfile a unique (job specific) name, and then for this unique $queryfile to be used by fastacutter.pl?").

"It appears you are running something concurrently which is all trying to open and read/use  the same file and you want to somehow automatically generate filenames so each instance is only affecting its own file."
Exactly. I have a script which was written to run on its own, but I've been asked to modify it so that it can be run 1000 times at the same time, without the jobs conflicting.


If this is achievable, what modifications would I need to make to this script?

Thanks,

Stephen.
 
LVL 34

Accepted Solution

by:Duncan Roe (earned 500 total points)
ID: 37765039
There is a command called tempfile which is guaranteed to create a unique temporary file. You can read about it by entering man 1 tempfile in a command window. You may be able to use this as the basis for a solution, if you are correct in assuming that your problem really is the same files being used by multiple jobs. I.e. replace
TMP=~/permanalysis/tmp.$$
mkdir $TMP


with
TMP=$(tempfile -d ~/permanalysis)
rm $TMP
mkdir $TMP


You could do further debugging by following the above with these lines
set -x
exec >>$TMP.out 2>&1


The created .out files will show you all subsequently executed shell commands. Using >> rather than > in the exec command is just being ultra-safe: $TMP.out should never already exist, because of the uniqueness of the names created by tempfile. Better safe than sorry, though.
You could use the -p or -s options of tempfile to generate filenames that you can easily delete afterwards.
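Where tempfile is unavailable (it is a Debian-specific utility), mktemp -d is a widely available alternative that creates the unique directory in a single atomic step, so there is no window between the rm and the mkdir above. A sketch, assuming the same ~/permanalysis layout:

```shell
#!/bin/bash
mkdir -p ~/permanalysis
# mktemp -d both names and creates the directory atomically;
# the XXXXXX in the template is replaced by random characters.
TMP=$(mktemp -d ~/permanalysis/tmp.XXXXXX)
queryfile="$TMP/queryfile.fa"   # per-job file handed to fastacutter.pl
echo "$TMP"
```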
 

Author Comment

by:StephenMcGowan
ID: 37921960
I've requested that this question be deleted for the following reason:

Abandoned
 
LVL 34

Expert Comment

by:Duncan Roe
ID: 37921961
This is bad etiquette. StephenMcGowan has not responded to my last post, which I intended as a complete solution to his problem as asked.
