Solved

Extract using grep - shell script

Posted on 2013-01-27
Last Modified: 2013-02-04
Hi All,

   I have a webpage from which I want to extract a particular value and store it in a variable.

For example, take the webpage below:

URL="http://www.tcmb.gov.tr/kurlar/`date +"%Y%m"`/`date +"%d%m%Y"`.html"
echo $URL

it will return:

http://www.tcmb.gov.tr/kurlar/201301/27012013.html

From that webpage, I want to use grep to extract 2.7882 from the line below into a variable.

GBP/TRY  1 INGILIZ STERLINI               2.7882       2.8028          2.7862       2.8070

Any ideas on how to grep for GBP/TRY and get the value 2.7882 into a variable?

Thanks,
Aman
Question by:amankhan2005
12 Comments
 
LVL 84

Expert Comment

by:ozo
I get
ERROR 404: Not Found.
when I try to connect to that URL, so I'm not sure how the data is supposed to be presented on the site, but assuming it literally looks like
GBP/TRY  1 INGILIZ STERLINI               2.7882       2.8028          2.7862       2.8070
you might try something like
VARIABLE=`wget -q -O - "$URL" | awk '$1=="GBP/TRY"{print $5}'`
 
LVL 68

Expert Comment

by:woolmilkporc
VARIABLE=$(wget -O - $URL |awk '/^GBP\/TRY/ {print $5}')
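
To try the extraction without hitting the site, the same awk filters can be run against the sample line from the question. This is only a sketch: sample_page is a hypothetical stand-in for the output of wget -O - $URL, and it assumes the page really contains the line in that plain-text form.

```shell
# sample_page stands in for `wget -O - "$URL"` (assumption: the page
# contains this line exactly as quoted in the question).
sample_page() {
  printf '%s\n' 'GBP/TRY  1 INGILIZ STERLINI               2.7882       2.8028          2.7862       2.8070'
}

# Variant 1: exact comparison on the first whitespace-separated field.
RATE_EXACT=$(sample_page | awk '$1 == "GBP/TRY" {print $5}')

# Variant 2: regular expression anchored at the start of the line.
RATE_REGEX=$(sample_page | awk '/^GBP\/TRY/ {print $5}')

echo "$RATE_EXACT $RATE_REGEX"
```

Both variants rely on the rate being the fifth field, which only holds while the currency name has exactly two words.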
 

Author Comment

by:amankhan2005
 
LVL 68

Expert Comment

by:woolmilkporc
So ...

URL="http://www.tcmb.gov.tr/kurlar/201301/25012013.html"

VARIABLE=$(wget -O - $URL 2>/dev/null |awk '/^GBP\/TRY/ {print $5}')

echo $VARIABLE

2.7732
 
LVL 68

Expert Comment

by:woolmilkporc
The below version will also work for "EUR/TRY" (where the currency name has only one word instead of two) and for "SAR/TRY" (where the currency name has three words).
In fact, it extracts the first field containing a dot "." surrounded by digits, so that the format of the currency name doesn't matter at all:

URL="http://www.tcmb.gov.tr/kurlar/201301/25012013.html"
VARIABLE=$(wget -O - $URL 2>/dev/null |awk '/^GBP\/TRY/ {for(F=1;F<=NF;F++) if($F~"[0-9]\\.[0-9]") {print $F;exit}}')
echo $VARIABLE

=> 2.7732

URL="http://www.tcmb.gov.tr/kurlar/201301/25012013.html"
VARIABLE=$(wget -O - $URL 2>/dev/null |awk '/^EUR\/TRY/ {for(F=1;F<=NF;F++) if($F~"[0-9]\\.[0-9]") {print $F;exit}}')
echo $VARIABLE

=> 2.3616

URL="http://www.tcmb.gov.tr/kurlar/201301/25012013.html"
VARIABLE=$(wget -O - $URL 2>/dev/null |awk '/^SAR\/TRY/ {for(F=1;F<=NF;F++) if($F~"[0-9]\\.[0-9]") {print $F;exit}}')
echo $VARIABLE

=> 0.46993
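
The three near-identical commands above could also be folded into one helper that takes the currency code as an argument. A rough sketch under stated assumptions: first_rate and sample_page are hypothetical names, and the sample lines are made up for illustration (only the first rate on each line is taken from the outputs above; the currency names are placeholders with one, two and three words).

```shell
# first_rate CODE: read page text on stdin and print the first field that
# contains digit-dot-digit on the line starting with CODE/TRY.
first_rate() {
  awk -v code="$1" '
    index($0, code "/TRY") == 1 {
      for (f = 1; f <= NF; f++)
        if ($f ~ /[0-9]\.[0-9]/) { print $f; exit }
    }'
}

# Illustrative input only; in real use, pipe `wget -O - "$URL"` instead.
sample_page() {
  cat <<'EOF'
EUR/TRY  1 EURO                           2.3616
GBP/TRY  1 INGILIZ STERLINI               2.7732
SAR/TRY  1 SUUDI ARABISTAN RIYALI         0.46993
EOF
}

for code in EUR GBP SAR; do
  echo "$code $(sample_page | first_rate "$code")"
done
```

Passing the code with awk -v avoids editing the awk program text for every currency.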

 

Author Comment

by:amankhan2005
Hi Wool,

    There is a catch here...

    No rates are generated on Saturdays and Sundays, so the pages for those days contain no data. When the current day is a Saturday or Sunday, we need to take Friday's rates instead.

For example:

http://www.tcmb.gov.tr/kurlar/201301/26012013.html
http://www.tcmb.gov.tr/kurlar/201301/27012013.html

will not have data.

On Saturday and Sunday, the value should be taken from Friday's link:

http://www.tcmb.gov.tr/kurlar/201301/25012013.html

We might need an if statement that checks whether the day is Saturday or Sunday and then builds the link for Friday:

day - 1 for Saturday
day - 2 for Sunday

Any ideas?

Thanks

 
LVL 68

Expert Comment

by:woolmilkporc
Yes, we need an if condition. The date manipulation below is "portable" in the sense that it doesn't rely on GNU date, which has other mechanisms for this but is not available on all platforms.

export OTZ=$TZ
if [[ $(date "+%u") -eq 6 ]] ; then
  export TZ=GMT+25
elif [[ $(date "+%u") -eq 7 ]] ; then
  export TZ=GMT+49
fi
URL="http://www.tcmb.gov.tr/kurlar/$(date +"%Y%m/%d%m%Y").html"
echo $URL
export TZ=$OTZ
VARIABLE=$(wget -O - "$URL" 2>/dev/null |awk '/^GBP\/TRY/ {for(F=1;F<=NF;F++) if($F~"[0-9]\\.[0-9]") {print $F;exit}}')
echo $VARIABLE
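
A possible variation that avoids saving and restoring TZ: a variable assignment placed directly in front of a command is put into that command's environment only, so the calling shell's TZ stays untouched. This is a minimal sketch; date_in_tz is a hypothetical helper name and the zone strings are arbitrary POSIX-style examples. Whether offsets larger than 24 hours are honoured depends on the platform.

```shell
# date_in_tz ZONE [date arguments...]: run date with TZ set for that
# single command; the caller's own TZ is not modified.
date_in_tz() {
  zone=$1
  shift
  TZ=$zone date "$@"
}

# Zone abbreviation as seen by date inside the call.
date_in_tz UTC0 "+%Z"

# Date path components as they would look 12 hours behind GMT.
date_in_tz GMT+12 "+%Y%m/%d%m%Y"
```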


Testing shows that there is no web page for 30/01/2013 yet. Is this correct/expected?

It seems that the page isn't generated before 15:30. To take this into account (pull the previous day if it's before that time) change line 2 like this:

...
if [[ $(date "+%u") -eq 6 || 10#$(date "+%H%M") -lt 1530 ]] ; then
...

 

Author Comment

by:amankhan2005
Hi WoolMilk,

  On Saturday and Sunday the URL still uses the current date instead of Friday's.

  Today is Sunday, and when I echo $URL it still shows 02022013 when it should show 01022013.

Should the TZ variable you are using also appear somewhere in the URL to substitute the older date?


Thanks
 

Author Comment

by:amankhan2005
Today (Sunday), when I typed:

$(date "+%u")
the output was: 7

TZ=GMT+49

date
[root@martian ~]# date
Sat Feb  2 05:53:15 GMT 2013

whereas it should show Fri Feb  1 05:53:15 GMT 2013

Any ideas?
Thanks
 
LVL 68

Expert Comment

by:woolmilkporc
Strange.

Here is what I get:

# date
Sun Feb  3 10:50:11 GMT+01:00 2013

# export TZ=GMT+49
# date
Fri Feb  1 08:50:21 GMT 2013

# export TZ=GMT+25
# date
Sat Feb  2 08:51:39 GMT 2013

This is the same with GNU date and AIX date.

Please remember that you must "export" the TZ variable (which my script does) to make "date" recognize the change!

Which OS are you running?

I have heard that some Linux systems don't support GMT offsets greater than 24 hours!
If that is the case on your OS, we will have to use the "--date" feature of GNU date.

Which "date" implementation are you using?
 
LVL 68

Accepted Solution

by:
woolmilkporc earned 500 total points
Here is the "GNU date" variant:

if [[ $(date "+%u") -eq 6 ]] ; then
  AGO=1
elif [[ $(date "+%u") -eq 7 ]] ; then
  AGO=2
else
  AGO=0
fi
URL="http://www.tcmb.gov.tr/kurlar/$(date -d "$AGO days ago" +"%Y%m/%d%m%Y").html"
echo $URL
VARIABLE=$(wget -O - "$URL" 2>/dev/null |awk '/^GBP\/TRY/ {for(F=1;F<=NF;F++) if($F~"[0-9]\\.[0-9]") {print $F;exit}}')
echo $VARIABLE
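
To check the weekend fallback on any day of the week, the weekday-to-offset mapping can be pulled into a small function and fed a fixed reference date. This is only a sketch: days_back and rates_url are hypothetical helper names, and rates_url assumes GNU date (for -d).

```shell
# days_back WEEKDAY: WEEKDAY is the output of date +%u
# (1 = Monday ... 7 = Sunday). Prints how many days to go back.
days_back() {
  case $1 in
    6) echo 1 ;;   # Saturday -> Friday
    7) echo 2 ;;   # Sunday   -> Friday
    *) echo 0 ;;   # weekdays use their own page
  esac
}

# rates_url REFDATE: REFDATE is YYYY-MM-DD. Requires GNU date.
rates_url() {
  wd=$(date -d "$1" +%u)
  ago=$(days_back "$wd")
  echo "http://www.tcmb.gov.tr/kurlar/$(date -d "$1 -$ago days" +%Y%m/%d%m%Y).html"
}

days_back 7
```

Calling rates_url with a Sunday such as 2013-02-03 should then produce the link for the preceding Friday.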


If you want to implement the "15:30" thing please use the below code.
My first suggestion in this regard didn't cover all possibilities.

D=$(date "+%u")
HM=$(date "+%H%M")
# 10# forces base 10; without it, values such as 0830 are read as octal inside [[ ]]
if [[ $D -eq 6 || ( 10#$HM -lt 1530 && ( $D -ne 1 && $D -ne 7 ) ) ]] ; then
  AGO=1
elif [[ $D -eq 7 ]] ; then
  AGO=2
elif [[ $D -eq 1 && 10#$HM -lt 1530 ]] ; then
  AGO=3
else
  AGO=0
fi
URL="http://www.tcmb.gov.tr/kurlar/$(date -d "$AGO days ago" +"%Y%m/%d%m%Y").html"
VARIABLE=$(wget -O - "$URL" 2>/dev/null |awk '/^GBP\/TRY/ {for(F=1;F<=NF;F++) if($F~"[0-9]\\.[0-9]") {print $F;exit}}')
echo $VARIABLE
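
The 15:30 rule is easier to verify when the decision is separated from the clock. The sketch below restates the same branches as a function of weekday and time; days_back_cutoff is a hypothetical name, and the 1530 cutoff is the publication time assumed above. It uses single-bracket tests, which compare zero-padded values such as 0830 as decimal numbers.

```shell
# days_back_cutoff WEEKDAY HHMM
#   WEEKDAY: date +%u (1 = Monday ... 7 = Sunday)
#   HHMM:    date +%H%M (zero-padded, e.g. 0830)
# Prints how many days to go back to reach the latest published page.
days_back_cutoff() {
  d=$1
  hm=$2
  if [ "$d" -eq 7 ]; then
    echo 2                      # Sunday -> Friday
  elif [ "$d" -eq 6 ]; then
    echo 1                      # Saturday -> Friday
  elif [ "$hm" -lt 1530 ]; then
    if [ "$d" -eq 1 ]; then
      echo 3                    # Monday morning -> Friday
    else
      echo 1                    # Tue-Fri before cutoff -> previous day
    fi
  else
    echo 0                      # weekday after cutoff -> today
  fi
}

# A few spot checks: weekday, time, resulting offset.
for args in "1 0900" "1 1600" "3 0830" "6 1000" "7 1600"; do
  set -- $args
  echo "$1 $2 $(days_back_cutoff "$1" "$2")"
done
```

In the script above, the result would replace the if/elif chain via AGO=$(days_back_cutoff "$D" "$HM").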




wmp
 

Author Closing Comment

by:amankhan2005
YOU ROCK ...... WOOLSILK....... HATS OFF......GR8.....AWESUM......