Solved

Bash script help please with dates?

Posted on 2014-09-23
Last Modified: 2014-09-30
Hi,
   
    I'm working with this code in cygwin

#!/bin/bash -x
UTIL=/cygdrive/c/apps/dcm4che-2.0.25-bin/bin
INPUT=/cygdrive/c/projects/a4fill/A2_SUID
BASEDIR=/cygdrive/c/projects/a4fill
start="20140101"
end="20140102"
DATES=""
for i in $(seq 0 9999)
 do
  new=$(date -d "$start + $i days" "+%Y%m%d")
  if [[ $new -le $end ]]; then 
   DATES="$DATES $new"; 
   else break; 
  fi
done
for D in $DATES
 do
  $UTIL/dcmqr RADARCH2@10.10.50.51:104 -rStudyInstanceUID -qStudyDate=$D | grep "(0020,000D) UI #[0-9]*" | sed s/'(0020,000D) UI #[0-9]* \['// | sed s/'\] Study Instance UID'// | sed s/'] Study Instance'// | sed s/'UI'// | sed s/'U'// > $INPUT/A2_SUID
 cnt=0
 count=0
 exec 3<&0 # Save stdin to file descriptor 3.
 exec 0<$INPUT # Redirect standard input.
  while read input1 rest  # Let read split the line instead of running awk
   do
    echo "Row $count"
    ((cnt++))
    echo "Moving :" ${input1}
    $UTIL/dcmqr RADARCH2@10.10.50.51:104 -q0020000D=${input1} -cmove RADARCH4
    ((count++))
      if (( cnt == 50 )); then
       echo "Sleeping for 3 Minutes on $(date)"
       sleep 180
       cnt=0
      fi
exec 0<&3 # Restore old stdin.
done
echo "Counter:" $count # Show moved items



and it's not getting past the 'date' function at the beginning of the script.  What I want is to enter a date range and have each date in that range placed into '$D', but it's having issues.

Also, maybe there is a better way to write this entire script to make it more efficient?  I'm sure it looks very messy to everyone as I don't have very much experience.

thank you
Question by:doc_jay
17 Comments
 
LVL 68

Assisted Solution

by:woolmilkporc
woolmilkporc earned 500 total points
ID: 40340502
First, there is an inconsistency between the definition and the use of $INPUT.

Line 3: INPUT=/cygdrive/c/projects/a4fill/A2_SUID

Line 18: ... ... ... > $INPUT/A2_SUID

Line 22: exec 0<$INPUT

If INPUT is a directory then line 22 is wrong, and if INPUT is a file then line 18 is wrong.

Next, you're missing a "done" statement. I assume it should go between lines 18 and 19.
If this is true, in order to fill A2_SUID  correctly with more than one line you must use ">>" instead of ">" and take care that the file is initially empty.

Besides that, the script doesn't look that messy, although (at least today) I don't have the time to verify those "sed" commands.  The redirection of stdin could be achieved with a simpler method ("done < $INPUT"), but your way isn't wrong at all.

The "date" loop in lines 8-15 is quite OK and should not give any trouble!
Add "echo $DATES" between lines 15 and 16 to verify!
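The date loop and the simpler "done < file" redirection can be sketched as follows. This is only a minimal sketch, assuming GNU date (its -d option); date_range is a hypothetical helper name, and the temporary file is a stand-in for A2_SUID.

```shell
#!/bin/bash
# Sketch only: date_range is a hypothetical helper, not part of the original script.
# Assumes GNU date (-d), as shipped with Cygwin and most Linux systems.
date_range() {
  local start=$1 end=$2 i=0 day
  while :; do
    day=$(date -d "$start + $i days" +%Y%m%d) || return 1
    if [ "$day" -gt "$end" ]; then break; fi
    echo "$day"
    i=$((i + 1))
  done
}

DATES=$(date_range 20140101 20140103)
echo $DATES                      # verify the list before using it

# Simpler stdin redirection: attach the file to the loop itself
# instead of juggling "exec 3<&0" and "exec 0<&3".
list=$(mktemp)                   # stand-in for the A2_SUID file
printf '%s\n' $DATES > "$list"   # one date per line
rows=0
while read -r d rest; do
  rows=$((rows + 1))
done < "$list"
rm -f "$list"
echo "read $rows rows"
```

Because the redirection belongs to the loop, stdin is back to normal as soon as "done" is reached, and no file descriptor has to be saved or restored.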
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 40340536
PS: My day is over now, sorry! CU tomorrow!

wmp
 

Author Comment

by:doc_jay
ID: 40340604
Wmp,

    thanks - good catch on line 18.  'A2_SUID' is a file with multiple rows that I want to feed 'dcmqr' on line 28.  If I change the end of line 18 to simply '> $INPUT', it should create a file named 'A2_SUID', correct?

I'll add the missing 'done' between lines 18 & 19 and add the 'echo $DATES' between 15 & 16.  

I'll post back with results.

thanks!
 

Author Comment

by:doc_jay
ID: 40340645
wmp,

   So, it runs now, thanks!  On row 28, when it is feeding the 'A2_SUID' to the 'dcmqr' tool, it does so one at a time.  This is fine, but the way the script is written now, it is wanting every row that is in the 'A2_SUID' first before it performs the rest of its function which is '-cmove RADARCH4' after the {input1}.  Is there a way to give it one row of the 'A2_SUID' and then make the 'dcmqr' tool think it is done to force it to finish the '-cmove' command & then just loop through the rest of the input file with one row at a time?

  The way the 'dcmqr' tool works is that it does a query against the IP address 10.10.50.51, then it does its '-cmove' function.  If there are 2000 or even 10,000 rows, it is going to take a long time for anything to start happening (it will ultimately do a DICOM move of x-ray studies with the -cmove function).

-hope this is clear as mud.

ps. the 'A2_SUID' is created if it doesn't already exist, but it is appended instead of being overwritten.   Can we make it be overwritten or removed each time?

thanks
 
LVL 9

Expert Comment

by:Carlos Ijalba
ID: 40341028
If you want to create it afresh on each script run, just do

echo "whatever" > $INPUT

at the beginning of the script; that will initialize it. Then, to append lines to it:

echo "Another whatever" >> $INPUT
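A minimal sketch of that initialize-then-append pattern. The file here is a temporary stand-in for $INPUT and the values are made up; ":" (the shell's no-op command) is used instead of echo so that the file starts out empty rather than containing a dummy first line.

```shell
#!/bin/bash
INPUT=$(mktemp)                  # stand-in for the real $INPUT path
echo "left over from a previous run" > "$INPUT"

: > "$INPUT"                     # truncate: the file now exists and is empty
for uid in 1.2.3 4.5.6; do       # made-up values for illustration
  echo "$uid" >> "$INPUT"        # append one line per iteration
done
lines=$(wc -l < "$INPUT")
first=$(head -n 1 "$INPUT")
rm -f "$INPUT"
echo "file had $lines lines, first was $first"
```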
 
LVL 68

Accepted Solution

by:woolmilkporc
woolmilkporc earned 500 total points
ID: 40341213
Let's begin with your "PS" question:

Remove ">> $INPUT" from line 18 ("$UTIL/dcmqr RADARCH2@10.10.50.51:104 ... >> $INPUT") and add "> $INPUT"  to the new "done" statement between lines 18 and 19 ("done > $INPUT").
This way a new, complete file will be written after termination of the whole loop.

Next, "clear as mud" should have good chances to win the "Understatement of the Year" contest.

Let's see what we have now.

- In a first step the script creates the file "A2_SUID" containing (possibly) several lines of data.
- In a further step a loop reads this file  line by line.  The first whitespace-delimited word of each line is then fed as "$input1" to "dcmqr" in line 28, one at each iteration of the loop.  "$input1" will contain just one word, and dcmqr will just "see" this single word and nothing else.

This is exactly the same as your requirement: " a way to give it one row of the 'A2_SUID' and then make the 'dcmqr' tool think it is done to force it to finish the '-cmove' command & then just loop through the rest of the input file with one row at a time".

I assume that I didn't really understand what you're after. I fear you will have to do a bit more mud wrestling to better explain what's it really all about.
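The two steps described above, sketched with stand-ins so that it runs anywhere: "query" is a hypothetical function that replaces the dcmqr | grep | sed pipeline, and the UIDs it prints are made up.

```shell
#!/bin/bash
INPUT=$(mktemp)                  # stand-in for A2_SUID

# Hypothetical stand-in for the dcmqr | grep | sed pipeline.
query() { echo "1.2.840.$1 trailing text"; }

# Step 1: redirect the whole loop, so the file is rewritten once per run.
for D in 20140101 20140102; do
  query "$D"
done > "$INPUT"

# Step 2: read line by line; input1 receives only the first word of each line.
moved=""
n=0
while read -r input1 rest; do
  moved="$moved $input1"         # the real script would call dcmqr ... -cmove here
  n=$((n + 1))
done < "$INPUT"
rm -f "$INPUT"
echo "would move $n studies:$moved"
```

Each iteration hands exactly one UID to the command inside the loop, and that command has to finish before the next line is read.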
 

Author Comment

by:doc_jay
ID: 40344269
Thank you very much!  I appreciate all of your comments as they moved me along to sort out my issues.  :)

I ended up running this code:

#!/bin/bash
#created by
UTIL=/cygdrive/c/apps/dcm4che-3.3.3-bin/bin
INPUT=/cygdrive/c/projects/a4fill/2014/A2_SUID
BASEDIR=/cygdrive/c/projects/a4fill
cnt=0
count=0
exec 3<$0 #Save stdin to file descriptor 3.
exec 0<$INPUT # Redirect standard input.
while read input1 rest #Let read split the line instead of running awk
do
 echo "Row $count"
 ((cnt++))
 echo "Moving :" ${input1}
 $UTIL/movescu -c RADARCH2@10.10.50.51:104 -m 0020000D=${input1} --dest RADARCH4
  ((count++))
   if (( cnt == 50 )); then
    echo "Sleeping for 5 Seconds on $(date), I have moved $count exams"
    sleep 5
    cnt=0
   fi
exec 0<$3 #Restore old stdin.
done
echo "Counter:" $count # Show moved items



I moved the operation to 'movescu' because it only does a move instead of query + move.  It is exactly what I needed.

One question about the code above I posted.  My input file that I'm running this against has over 240,000 rows.  If for some reason I needed to stop the script, I would like to pick up where I left off.  How can I change the code to tell it 'start on row 4009' for example?  Or maybe there is a way for it to keep count and start where it leaves off?  I can always scroll up in the output of the console to see what row last echoed back.

thanks
 
LVL 68

Assisted Solution

by:woolmilkporc
woolmilkporc earned 500 total points
ID: 40344436
Since you didn't post the part which creates the input file A2_SUID - can I assume that this runs OK now?

As for the starting row:

You're using "read" to process the input file, and "read" always starts from the top of stdin.

I fear we must revert to "awk" (or a similar tool) which can be instructed to skip lines.

Remove lines 8, 9 and 22 (the stdin redirections) and change (the old) line 10 to

awk -v START=4009 'NR>=START {print $1}' $INPUT | while read input1
....

Change the "4009" in "-v START=4009" to whatever value is desired.

Alternatively add between (the old) lines 7 and  10 or another convenient place where variables get defined:

START=4009

and change (the old) line 10 to

awk -v START=$START 'NR>=START {print $1}' $INPUT | while read input1
....

Change the "4009" in the newly added "START=4009" statement to your desired value.

Making the script continue where it left off would involve creating an external file to store the current row number. Using external files is generally error-prone, because you could forget to update it before the next (regular) run, or someone could accidentally delete it. Nevertheless, I could help you with that, if desired.
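For what it's worth, one possible shape for the "continue where it left off" variant is sketched below. The progress file, the move_one function and the sample data are assumptions for illustration; the real script would call movescu instead. The loop reads from process substitution rather than from a pipe, because a "while" loop on the right-hand side of a pipe runs in a subshell and its counters are lost once the loop ends.

```shell
#!/bin/bash
INPUT=$(mktemp)                  # stand-in for the 240,000-row file
PROGRESS=$(mktemp)               # hypothetical progress file
printf 'uid%s\n' 1 2 3 4 5 > "$INPUT"
echo 3 > "$PROGRESS"             # pretend rows 1-3 were moved in an earlier run

move_one() { :; }                # hypothetical stand-in for the movescu call

last=$(cat "$PROGRESS")
START=$(( ${last:-0} + 1 ))      # first row still to do
row=$(( START - 1 ))
moved=0
while read -r input1; do
  row=$((row + 1))
  echo "Row $row: moving $input1"
  move_one "$input1" || break    # stop without recording a failed row
  echo "$row" > "$PROGRESS"      # record the last completed row
  moved=$((moved + 1))
done < <(awk -v S="$START" 'NR>=S {print $1}' "$INPUT")
final=$(cat "$PROGRESS")
rm -f "$INPUT" "$PROGRESS"
echo "Moved $moved rows this run; last completed row: $final"
```

For a regular run from the top, the progress file has to be emptied or removed first, which is exactly the kind of step that is easy to forget.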

 

Author Comment

by:doc_jay
ID: 40344662
wmp,

   I decided to just get the input file from a MSSQL query and output it to a file.  Thanks for your input and I'll try it out soon because it's just inevitable that I'll have to interrupt the script.  :)
 
LVL 9

Expert Comment

by:Carlos Ijalba
ID: 40344812
doc_jay,

What you will have to do is implement TRAPS, so that you can intercept signals that come from the OS and act accordingly. That way you will always have control if any interruption occurs (apart from a power outage).

If you want to see some good examples, this one from Aaron Maxwell is a winner:


And in one of the traps you handle your row counters in an external file.

Also you will have to implement a locking file with your process PID to detect multiple runs of the same script, and only let one run at the same time.
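A rough sketch of both ideas, not a hardened implementation. The lock and state file names are made up for illustration, and the interruption is simulated by the job sending itself SIGTERM. A lock directory is used here instead of a PID file because mkdir either succeeds or fails atomically; a PID file as described above is another option. $BASHPID needs bash 4 or later.

```shell
#!/bin/bash
# Sketch only: file names are hypothetical; needs bash 4+ for $BASHPID.
WORK=$(mktemp -d)
LOCK="$WORK/a4fill.lock"         # lock directory: mkdir is atomic
STATE="$WORK/last_row"           # where the row counter is saved

run_job() (
  # Body runs in a subshell, so the traps below stay local to one job.
  mkdir "$LOCK" 2>/dev/null || { echo "already running" >&2; exit 1; }
  count=0
  trap 'echo "$count" > "$STATE"; rmdir "$LOCK"' EXIT   # save counter, release lock
  trap 'exit 130' INT TERM       # turn the signal into an exit so the EXIT trap fires
  for row in 1 2 3 4 5; do
    count=$row
    if [ "$row" -eq 3 ]; then kill -TERM "$BASHPID"; fi # simulate an interruption at row 3
  done
)

status=0
run_job || status=$?
saved=$(cat "$STATE")

# A second instance is refused while the lock is held.
mkdir "$LOCK"
refused=0
run_job 2>/dev/null || refused=$?
rmdir "$LOCK"
rm -rf "$WORK"
echo "interrupted at row $saved (exit $status); second run exit $refused"
```

A trap cannot help after "kill -9" or a power outage, so the lock may be left behind in those cases and has to be removed by hand.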
 

Author Comment

by:doc_jay
ID: 40350068
wmp,

    In your answer above about using awk to start at a certain row, is there a way to have a 'start' and 'end'?  The reason is that I was hoping to have only one input file with around 240,000 rows and multiple scripts reading that input file.

For example:
One script that would read from 1 to 20000
2nd script that would read from 20001 to 50000
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 40350276
START=1
END=20000
awk -v START=$START -v END=$END 'NR>=START&&NR<=END {print $1}' $INPUT

START=20001
END=50000
awk -v START=$START -v END=$END 'NR>=START&&NR<=END {print $1}' $INPUT
 

Author Comment

by:doc_jay
ID: 40350368
wmp,

   thanks for the help on this again.  It came back with an error.

'awk:  fatal:  cannot use gawk builtin 'END' as variable name'

EDIT:  I changed the variable to 'ENDING' and it now works.

Also, is there a way to make the row count echo back the current row it is working on?  As of right now, it starts on say row 13677 and its first echo is 'Row 0'

here is what I am using as of now:

#!/bin/bash
UTIL=/cygdrive/c/apps/dcm4che-3.3.3-bin/bin
INPUT=/cygdrive/c/projects/a4fill/2014/A2_SUID
BASEDIR=/cygdrive/c/projects/a4fill
cnt=0
count=0
START=13677
ENDING=50000
awk -v START=$START -v ENDING=$ENDING 'NR>=START&&NR<=ENDING {print $1}' $INPUT  | while read input1  #starting at a certain row
do
 echo "Row $count"
 ((cnt++))
 echo "Moving :" ${input1}
 $UTIL/movescu -c RADARCH2@10.10.50.51:104 -m 0020000D=${input1} --dest RADARCH4
  ((count++))
   if (( cnt == 50 )); then
    echo "Sleeping for 5 Seconds on $(date), I have moved $count exams"
    sleep 5
    cnt=0
   fi
exec 0<$3 #Restore old stdin.
done
echo "Counter:" $count # Show moved items

 
LVL 68

Expert Comment

by:woolmilkporc
ID: 40350448
Replace
 
count=0
START=13677

with

START=13677
count=$START

i. e. initialize the counter correctly.

At the end you should replace
echo "Counter: " $count # Show moved items
with
echo "Counter: " $((count-START)) # Show moved items
i. e. subtract the initial value from the final value (which is $ENDING+1) to get an exact count.

Sorry for the "END" thing, didn't think of that. In fact, I always use shorter variable names in awk, the long names were just for illustration. The version I tested was like this:

START=20
END=50
awk -v S=$START -v E=$END 'NR>=S&&NR<=E {print $1}' $INPUT
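Putting the range and the counter fix together might look like the sketch below. The sample file and the move_one function are stand-ins (the real script would call movescu), and two details go beyond what was posted: "NR>E {exit}" lets awk stop reading once the range is finished, and the loop is fed through process substitution instead of a pipe, because the right-hand side of a pipe runs in a subshell and $count would otherwise revert to its starting value after "done".

```shell
#!/bin/bash
INPUT=$(mktemp)                  # stand-in for the large input file
seq -f 'uid%g' 1 100 > "$INPUT"  # 100 made-up rows

move_one() { :; }                # hypothetical stand-in for movescu

START=20
ENDING=50
count=$START                     # so that the first echo shows the real row number
last=""
while read -r input1; do
  echo "Row $count"
  move_one "$input1"
  last=$input1
  count=$((count + 1))
done < <(awk -v S="$START" -v E="$ENDING" 'NR>E {exit} NR>=S {print $1}' "$INPUT")
moved=$((count - START))
rm -f "$INPUT"
echo "Counter: $moved"           # number of rows handled in this run
```

With several scripts working on the same file, each one only needs its own START and ENDING values; the input file itself is only read, never modified.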
 

Author Comment

by:doc_jay
ID: 40350579
wmp,

   thanks for the explanation, I've decided to do it the way you have in your last example, works out great.  Also, I can now run more than one script and read only one large input file.  In doing so, I've started to use the 'screen' utility but I can't scroll up in the buffer to read the last row that it echoed any more.

Question:
How could I write the current row it's working on into an output file so that I could just open it up to refer to if I need to know what row the script is working on?

EDIT:  nevermind, looks like there are some commands for screen to use to scroll through the output.  -thanks again
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 40351137
Log the row count and keep displaying it on the terminal:

echo "Row $count" | tee -a outputfile

Log the entire script's output and keep displaying it on the terminal:

scriptname | tee outputfile
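If only the current row is of interest, a variation is to let tee overwrite a small status file on every iteration instead of appending to a log. The file name and row numbers below are made-up examples.

```shell
#!/bin/bash
STATUS=$(mktemp)                 # hypothetical status file
for count in 101 102 103; do
  # Without -a, tee overwrites: the file always holds only the latest row,
  # and the line is still shown on the terminal.
  echo "Row $count" | tee "$STATUS"
done
current=$(cat "$STATUS")
rm -f "$STATUS"
```

While the script is running, "cat" on the status file from another terminal (or another screen window) shows the row currently being processed.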
 

Author Comment

by:doc_jay
ID: 40352573
wmp,

   thanks!
