  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 298

Bash script help please with dates?

Hi,
   
    I'm working with this code in cygwin

#!/bin/bash -x
UTIL=/cygdrive/c/apps/dcm4che-2.0.25-bin/bin
INPUT=/cygdrive/c/projects/a4fill/A2_SUID
BASEDIR=/cygdrive/c/projects/a4fill
start="20140101"
end="20140102"
DATES=""
for i in $(seq 0 9999)
 do
  new=$(date -d "$start + $i days" "+%Y%m%d")
  if [[ $new -le $end ]]; then 
   DATES="$DATES $new"; 
   else break; 
  fi
done
for D in $DATES
 do
  $UTIL/dcmqr RADARCH2@10.10.50.51:104 -rStudyInstanceUID -qStudyDate=$D | grep "(0020,000D) UI #[0-9]*" | sed s/'(0020,000D) UI #[0-9]* \['// | sed s/'\] Study Instance UID'// | sed s/'] Study Instance'// | sed s/'UI'// | sed s/'U'// > $INPUT/A2_SUID
 cnt=0
 count=0
 exec 3<&0 # Save stdin to file descriptor 3.
 exec 0<$INPUT # Redirect standard input.
  while read input1 rest  # Let read split the line instead of running awk
   do
    echo "Row $count"
    ((cnt++))
    echo "Moving :" ${input1}
    $UTIL/dcmqr RADARCH2@10.10.50.51:104 -q0020000D=${input1} -cmove RADARCH4
    ((count++))
      if (( cnt == 50 )); then
       echo "Sleeping for 3 Minutes on $(date)"
       sleep 180
       cnt=0
      fi
exec 0<&3 # Restore old stdin.
done
echo "Counter:" $count # Show moved items

and it's not getting past the 'date' function at the beginning of the script.  What I want is to enter a date range and have each date in that range placed into '$D', but it's having issues.

Also, maybe there is a better way to write this entire script to make it more efficient?  I'm sure it looks very messy to everyone as I don't have very much experience.

thank you
0
doc_jay
Asked:
3 Solutions
 
woolmilkporcCommented:
First, the definition and use of $INPUT are inconsistent.

Line 3: INPUT=/cygdrive/c/projects/a4fill/A2_SUID

Line 18: ... ... ... > $INPUT/A2_SUID

Line 22: exec 0<$INPUT

If INPUT is a directory then line 22 is wrong, and if INPUT is a file then line 18 is wrong.

Next, you're missing a "done" statement. I assume it should go between lines 18 and 19.
If this is true, in order to fill A2_SUID  correctly with more than one line you must use ">>" instead of ">" and take care that the file is initially empty.

Besides that the script doesn't look that messy, yet (at least today) I don't have the time to verify those "sed" commands.  The redirection of stdin could be achieved with a simpler method (" done < $INPUT"), but your way isn't wrong at all.
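
The simpler redirection mentioned above could look roughly like this. It is only a sketch: the temporary file and the UIDs in it are made up for the demonstration, and the echo stands in for the real dcmqr call.

```shell
#!/bin/bash
# Hypothetical stand-in for the A2_SUID file; the UIDs below are made up.
INPUT=$(mktemp)
printf '%s\n' '1.2.3.1 extra' '1.2.3.2' '1.2.3.3' > "$INPUT"

count=0
uids=""
# stdin is redirected for this loop only, so nothing has to be saved or restored.
while read -r input1 rest
do
  echo "Moving: $input1"        # the real dcmqr call would go here
  uids="$uids $input1"
  count=$((count + 1))
done < "$INPUT"

echo "Counter: $count"
rm -f "$INPUT"
```

Because the redirection is attached to "done", the loop runs in the current shell and $count is still available afterwards.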

The "date" loop in lines 8-15 is quite OK and should not give any trouble!
Add "echo $DATES" between lines 15 and 16 to verify!
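
For comparison, here is one possible way to build the date list without the "seq 0 9999" upper bound. This is a sketch that assumes GNU date (the one Cygwin ships); the -u flag is my addition so that daylight-saving changes cannot affect the day arithmetic, and the end date is just an example value.

```shell
#!/bin/bash
# Assumes GNU date, which understands -d "YYYYMMDD + 1 day".
start="20140101"
end="20140105"     # example value, not from the original script

DATES=""
d=$start
# The dates are plain YYYYMMDD numbers, so an integer comparison is enough.
while [ "$d" -le "$end" ]
do
  DATES="$DATES $d"
  d=$(date -u -d "$d + 1 day" "+%Y%m%d")
done

echo "Dates:$DATES"
```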
0
 
woolmilkporcCommented:
PS: My day is over now, sorry! CU tomorrow!

wmp
0
 
doc_jayAuthor Commented:
Wmp,

    thanks - good catch on line 18.  'A2_SUID' is a file with multiple rows that I want to feed to 'dcmqr' on line 28.  If I change the end of line 18 to simply '> $INPUT', it should create a file named 'A2_SUID', correct?

I'll add the missing 'done' between lines 18 & 19 and add the 'echo $DATES' between 15 & 16.  

I'll post back with results.

thanks!
0
 
doc_jayAuthor Commented:
wmp,

   So, it runs now, thanks!  On row 28, when it is feeding the 'A2_SUID' to the 'dcmqr' tool, it does so one at a time.  This is fine, but the way the script is written now, it is wanting every row that is in the 'A2_SUID' first before it performs the rest of its function which is '-cmove RADARCH4' after the {input1}.  Is there a way to give it one row of the 'A2_SUID' and then make the 'dcmqr' tool think it is done to force it to finish the '-cmove' command & then just loop through the rest of the input file with one row at a time?

  The way the 'dcmqr' tool works is that it runs a query against the IP address 10.10.50.51 and then performs its '-cmove' function.  If there are 2000 or even 10,000 rows, it is going to take a long time for anything to start happening (it will ultimately do a DICOM move of x-ray studies with the -cmove function).

-hope this is clear as mud.

ps. the 'A2_SUID' is created if it doesn't already exist, but it is appended instead of being overwritten.   Can we make it be overwritten or removed each time?

thanks
0
 
Carlos IjalbaIT Systems CoordinatorCommented:
if you want to create it each script run, then just do a

echo "whatever" > $INPUT

at the beginning of the script; that will initialize it. Then, to append lines to it:

echo "Another whatever" >> $INPUT
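
If the file should start out empty rather than with a first line in it, a redirection with the no-op command ":" can be used for the initialization. A small sketch, with a temporary file standing in for $INPUT and made-up UIDs:

```shell
#!/bin/bash
# Hypothetical path, used only for this illustration.
INPUT=$(mktemp)
echo "stale line from a previous run" > "$INPUT"

: > "$INPUT"                 # truncate: the file still exists but is now empty
echo "1.2.3.1" >> "$INPUT"   # append the first line
echo "1.2.3.2" >> "$INPUT"   # append the second line

lines=$(wc -l < "$INPUT")
echo "Lines in file: $lines"
```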
0
 
woolmilkporcCommented:
Let's begin with your "PS" question:

Remove ">> $INPUT" from line 18 ("$UTIL/dcmqr RADARCH2@10.10.50.51:104 ... >> $INPUT") and add "> $INPUT"  to the new "done" statement between lines 18 and 19 ("done > $INPUT").
This way a new, complete file will be written after termination of the whole loop.
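
As a rough illustration of the "done > $INPUT" idea: fake_query below is a made-up placeholder for the dcmqr/sed pipeline, and the temporary file stands in for A2_SUID.

```shell
#!/bin/bash
# Hypothetical output file and a stand-in for the dcmqr | grep | sed pipeline.
INPUT=$(mktemp)
fake_query() { echo "uid-for-$1"; }   # placeholder, not a real dcm4che call

for D in 20140101 20140102 20140103
do
  fake_query "$D"
done > "$INPUT"    # the file is opened (and truncated) once, before the first iteration

cat "$INPUT"
```

Each run starts with a fresh file, and every iteration adds its lines to it, so neither ">>" nor a separate cleanup step is needed.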

Next, "clear as mud" should have good chances to win the "Understatement of the Year" contest.

Let's see what we have now.

- In a first step the script creates the file "A2_SUID" containing (possibly) several lines of data.
- In a further step a loop reads this file  line by line.  The first whitespace-delimited word of each line is then fed as "$input1" to "dcmqr" in line 28, one at each iteration of the loop.  "$input1" will contain just one word, and dcmqr will just "see" this single word and nothing else.

This is exactly the same as your requirement: " a way to give it one row of the 'A2_SUID' and then make the 'dcmqr' tool think it is done to force it to finish the '-cmove' command & then just loop through the rest of the input file with one row at a time".

I assume that I didn't really understand what you're after. I fear you will have to do a bit more mud wrestling to better explain what's it really all about.
0
 
doc_jayAuthor Commented:
Thank you very much!  I appreciate all of your comments as they moved me along to sort out my issues.  :)

I ended up running this code:

#!/bin/bash
#created by
UTIL=/cygdrive/c/apps/dcm4che-3.3.3-bin/bin
INPUT=/cygdrive/c/projects/a4fill/2014/A2_SUID
BASEDIR=/cygdrive/c/projects/a4fill
cnt=0
count=0
exec 3<$0 #Save stdin to file descriptor 3.
exec 0<$INPUT # Redirect standard input.
while read input1 rest #Let read split the line instead of running awk
do
 echo "Row $count"
 ((cnt++))
 echo "Moving :" ${input1}
 $UTIL/movescu -c RADARCH2@10.10.50.51:104 -m 0020000D=${input1} --dest RADARCH4
  ((count++))
   if (( cnt == 50 )); then
    echo "Sleeping for 5 Seconds on $(date), I have moved $count exams"
    sleep 5
    cnt=0
   fi
exec 0<$3 #Restore old stdin.
done
echo "Counter:" $count # Show moved items

I moved the operation to 'movescu' because it only does a move instead of a query + move.  It is exactly what I needed.

One question about the code above I posted.  My input file that I'm running this against has over 240,000 rows.  If for some reason I needed to stop the script, I would like to pick up where I left off.  How can I change the code to tell it 'start on row 4009' for example?  Or maybe there is a way for it to keep count and start where it leaves off?  I can always scroll up in the output of the console to see what row last echoed back.

thanks
0
 
woolmilkporcCommented:
Since you didn't post the part which creates the input file A2_suid - can I assume that this runs OK now?

As for the starting row:

You're using "read" to process the input file, and "read" always starts from the top of stdin.

I fear we must revert to "awk" (or a similar tool) which can be instructed to skip lines.

Remove lines 8, 9 and 22 (the stdin redirections) and change (the old) line 10 to

awk -v START=4009 'NR>=START {print $1}' $INPUT | while read input1
....

Change the "4009" in "-v START=4009" to whatever value is desired.

Alternatively add between (the old) lines 7 and  10 or another convenient place where variables get defined:

START=4009

and change (the old) line 10 to

awk -v START=$START 'NR>=START {print $1}' $INPUT | while read input1
....

Change the "4009" in the newly added "START=4009" statement to your desired value.

Making the script continue where it left off would involve creating an external file to store the current row number. Using external files is generally error-prone, because you could forget to update it before the next (regular) run, or someone could accidentally delete it. Nevertheless, I could help you with that, if desired.
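
To give an idea of what the external-file approach might look like, here is a minimal sketch. The state file name, the sample UIDs and the echo that replaces the real move command are all assumptions made for the demonstration.

```shell
#!/bin/bash
# Hypothetical file names; the UIDs are made up for the demonstration.
INPUT=$(mktemp)
STATE=$(mktemp)          # holds the number of the last row that completed
printf '%s\n' uid-1 uid-2 uid-3 uid-4 uid-5 > "$INPUT"
echo 2 > "$STATE"        # pretend an earlier run finished rows 1 and 2

last=$(cat "$STATE")
START=$((last + 1))

# awk prints the row number next to the UID, so the loop knows where it is.
awk -v S="$START" 'NR>=S {print NR, $1}' "$INPUT" | while read -r row input1
do
  echo "Row $row: moving $input1"   # the real move command would go here
  echo "$row" > "$STATE"            # checkpoint after each completed row
done

echo "Last completed row: $(cat "$STATE")"
```

For a regular run from the top, the state file would have to be reset to 0 first, which is exactly the step that is easy to forget.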
0
 
doc_jayAuthor Commented:
wmp,

   I decided to just get the input file from a MSSQL query and output it to a file.  Thanks for your input; I'll try it out soon because it's just inevitable that I'll have to interrupt the script.  :)
0
 
Carlos IjalbaIT Systems CoordinatorCommented:
doc_jay,

What you will have to do is implement traps, so that you can intercept signals sent by the OS and act accordingly. That way you always keep control when an interruption occurs (apart from a power outage).

If you want to see some good examples, this one from Aaron Maxwell is a winner:


And in one of the traps you save your row counters to an external file.

Also you will have to implement a locking file with your process PID to detect multiple runs of the same script, and only let one run at the same time.
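
A rough sketch of how the trap and the lock file could fit together. The file names are hypothetical, the loop body is a placeholder for the real move, and the noclobber trick for creating the lock is one common idiom rather than the only way to do it.

```shell
#!/bin/bash
# Hypothetical file names; the loop body is a placeholder for the real move.
LOCK="${TMPDIR:-/tmp}/move_demo.lock"
STATE=$(mktemp)                      # where the current row number is saved

# Create the lock atomically: with noclobber (set -C) ">" fails if the file exists.
if ! ( set -C; echo "$$" > "$LOCK" ) 2>/dev/null; then
  echo "Already running as PID $(cat "$LOCK")" >&2
  exit 1
fi

row=0
save_state() { echo "$row" > "$STATE"; rm -f "$LOCK"; }
trap save_state EXIT                 # runs whenever the script exits
trap 'exit 1' INT TERM               # turn Ctrl-C / kill into an exit so the EXIT trap fires

for row in 1 2 3
do
  echo "Row $row"                    # the real move command would go here
done
```

On exit, whether normal or caused by Ctrl-C, the current row number ends up in the state file and the lock is removed.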
0
 
doc_jayAuthor Commented:
wmp,

    In your answer above about using awk to start at a certain row, is there a way to have a 'start' and an 'end'?  The reason is that I was hoping to have only one input file with around 240,000 rows and multiple scripts reading that input file.

For example:
One script that would read from 1 to 20000
2nd script that would read from 20001 to 50000
0
 
woolmilkporcCommented:
START=1
END=20000
awk -v START=$START -v END=$END 'NR>=START&&NR<=END {print $1}' $INPUT

START=20001
END=50000
awk -v START=$START -v END=$END 'NR>=START&&NR<=END {print $1}' $INPUT
0
 
doc_jayAuthor Commented:
wmp,

   thanks for the help on this again.  It came back with an error.

awk:  fatal:  cannot use gawk builtin 'END' as variable name

EDIT:  I changed the variable to 'ENDING' and it now works.

Also, is there a way to make the row count echo back the current row it is working on?  As of right now, it starts on say row 13677 and its first echo is 'Row 0'

here is what I am using as of now:

#!/bin/bash
UTIL=/cygdrive/c/apps/dcm4che-3.3.3-bin/bin
INPUT=/cygdrive/c/projects/a4fill/2014/A2_SUID
BASEDIR=/cygdrive/c/projects/a4fill
cnt=0
count=0
START=13677
ENDING=50000
awk -v START=$START -v ENDING=$ENDING 'NR>=START&&NR<=ENDING {print $1}' $INPUT  | while read input1  #starting at a certain row
do
 echo "Row $count"
 ((cnt++))
 echo "Moving :" ${input1}
 $UTIL/movescu -c RADARCH2@10.10.50.51:104 -m 0020000D=${input1} --dest RADARCH4
  ((count++))
   if (( cnt == 50 )); then
    echo "Sleeping for 5 Seconds on $(date), I have moved $count exams"
    sleep 5
    cnt=0
   fi
done
echo "Counter:" $count # Show moved items

Open in new window

0
 
woolmilkporcCommented:
Replace
 
count=0
START=13677

with

START=13677
count=$START

i. e. initialize the counter correctly.

At the end you should replace
echo "Counter: " $count # Show moved items
with
echo "Counter: " $((count-START)) # Show moved items
i. e. subtract the initial value from the final value (which is $ENDING+1) to get an exact count.

Sorry for the "END" thing, didn't think of that. In fact, I always use shorter variable names in awk, the long names were just for illustration. The version I tested was like this:

START=20
END=50
awk -v S=$START -v E=$END 'NR>=S&&NR<=E {print $1}' $INPUT
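
One caveat with the "awk ... | while read" form: by default bash runs a loop that is fed by a pipe in a subshell, so changes made to $count inside the loop are lost after "done", and the final counter line would show the initial value. A possible way around it is process substitution, sketched here with a made-up input file and an echo in place of the move command:

```shell
#!/bin/bash
# Hypothetical input file with made-up UIDs.
INPUT=$(mktemp)
printf 'uid-%s\n' 1 2 3 4 5 6 7 8 9 10 > "$INPUT"

START=4
ENDING=8
count=$START

# Process substitution keeps the loop in the current shell, so count survives.
while read -r input1
do
  echo "Row $count: moving $input1"   # the real move command would go here
  count=$((count + 1))
done < <(awk -v S="$START" -v E="$ENDING" 'NR>=S && NR<=E {print $1}' "$INPUT")

moved=$((count - START))
echo "Counter: $moved"
rm -f "$INPUT"
```

This needs bash; a plain POSIX sh does not understand the "< <(...)" syntax.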
0
 
doc_jayAuthor Commented:
wmp,

   thanks for the explanation. I've decided to do it the way you have in your last example, and it works out great.  Also, I can now run more than one script against a single large input file.  In doing so, I've started to use the 'screen' utility, but I can't scroll up in the buffer to read the last row that it echoed any more.

Question:
How could I write the current row it's working on into an output file, so that I could just open it if I need to know what row the script is working on?

EDIT:  nevermind, looks like there are some commands for screen to use to scroll through the output.  -thanks again
0
 
woolmilkporcCommented:
Log the row count and keep displaying it on the terminal:

echo "Row $count" | tee -a outputfile

Log the entire script's output and keep displaying it on the terminal:

scriptname | tee outputfile
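
A small sketch of the first variant inside a loop, with a temporary file as a hypothetical log and made-up row numbers:

```shell
#!/bin/bash
LOG=$(mktemp)    # hypothetical log file

for count in 101 102 103
do
  echo "Row $count" | tee -a "$LOG"   # shown on the terminal and appended to the log
done

# The last line of the log tells you which row to resume from.
last=$(tail -n 1 "$LOG")
echo "Last logged: $last"
```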
0
 
doc_jayAuthor Commented:
wmp,

   thanks!
0
