Solved

Extract using grep - shell script

Posted on 2013-01-27
Last Modified: 2013-02-04
Hi All,

   I have a web page from which I want to extract a particular value into a variable.

For example, the page is built from the URL below:

URL="http://www.tcmb.gov.tr/kurlar/`date +"%Y%m"`/`date +"%d%m%Y"`.html"
echo $URL

it will return:

http://www.tcmb.gov.tr/kurlar/201301/27012013.html

From that page I want to use grep to extract 2.7882 from the line below into a variable:

GBP/TRY  1 INGILIZ STERLINI               2.7882       2.8028          2.7862       2.8070

Any ideas?

I want to grep for GBP/TRY and get the value 2.7882 into a variable.

Thanks,
Aman
Question by:amankhan2005
12 Comments
 

Expert Comment

by:ozo
ID: 38824287
I get
ERROR 404: Not Found.
when I try to connect to that URL, so I'm not sure how the data is supposed to be presented on the site, but assuming it literally looks like
GBP/TRY  1 INGILIZ STERLINI               2.7882       2.8028          2.7862       2.8070
you might try something like
VARIABLE=`wget -q -O - "$URL" | awk '$1=="GBP/TRY"{print $5}'`

Expert Comment

by:woolmilkporc
ID: 38824290
VARIABLE=$(wget -O - $URL |awk '/^GBP\/TRY/ {print $5}')
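To see why field 5 is the one to print, the awk part can be tried offline against the sample line from the question. This is only a sketch: the function name extract_gbp_rate is made up for illustration, and it assumes the page text really contains the row exactly as quoted.

```shell
# extract_gbp_rate (hypothetical helper): read page text on stdin and
# print the fifth whitespace-separated field of the GBP/TRY row.
extract_gbp_rate() {
  awk '$1 == "GBP/TRY" { print $5; exit }'
}

# Sample row copied from the question. The fields are:
# $1=GBP/TRY  $2=1  $3=INGILIZ  $4=STERLINI  $5=2.7882
sample_line='GBP/TRY  1 INGILIZ STERLINI               2.7882       2.8028          2.7862       2.8070'

RATE=$(printf '%s\n' "$sample_line" | extract_gbp_rate)
echo "$RATE"
```

Against the live page, the printf would be replaced by the wget -O - call shown above.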
 

Author Comment

by:amankhan2005
ID: 38824343

Expert Comment

by:woolmilkporc
ID: 38824371
So ...

URL="http://www.tcmb.gov.tr/kurlar/201301/25012013.html"

VARIABLE=$(wget -O - $URL 2>/dev/null |awk '/^GBP\/TRY/ {print $5}')

echo $VARIABLE

2.7732

Expert Comment

by:woolmilkporc
ID: 38824662
The version below also works for "EUR/TRY" (where the currency name is a single word instead of two) and for "SAR/TRY" (where the currency name has three words).
It prints the first field on the matching row that contains a dot "." between two digits, so the format of the currency name does not matter at all:

URL="http://www.tcmb.gov.tr/kurlar/201301/25012013.html"
VARIABLE=$(wget -O - $URL 2>/dev/null |awk '/^GBP\/TRY/ {for(F=1;F<=NF;F++) if($F~"[0-9]\\.[0-9]") {print $F;exit}}')
echo $VARIABLE

=> 2.7732

URL="http://www.tcmb.gov.tr/kurlar/201301/25012013.html"
VARIABLE=$(wget -O - $URL 2>/dev/null |awk '/^EUR\/TRY/ {for(F=1;F<=NF;F++) if($F~"[0-9]\\.[0-9]") {print $F;exit}}')
echo $VARIABLE

=> 2.3616

URL="http://www.tcmb.gov.tr/kurlar/201301/25012013.html"
VARIABLE=$(wget -O - $URL 2>/dev/null |awk '/^SAR\/TRY/ {for(F=1;F<=NF;F++) if($F~"[0-9]\\.[0-9]") {print $F;exit}}')
echo $VARIABLE

=> 0.46993
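Since the three commands above differ only in the currency code, the same idea could be wrapped in a small function. The sketch below is an assumption-laden illustration, not part of the original answer: the name first_rate is invented, the sample rows use placeholder currency names, and it matches on `$1 == pair` rather than the `/^GBP\/TRY/` pattern used above. Only the three rates come from the outputs shown in this thread.

```shell
# first_rate CODE (hypothetical helper): read page text on stdin and print
# the first field of the CODE/TRY row that has a dot between two digits.
first_rate() {
  awk -v pair="$1/TRY" '
    $1 == pair {
      for (f = 1; f <= NF; f++)
        if ($f ~ /[0-9]\.[0-9]/) { print $f; exit }
    }'
}

# Made-up rows for illustration: the currency names are placeholders with
# two, one and three words; the rates are the ones quoted in the thread.
sample_page='GBP/TRY  1 TWO WORDS          2.7732
EUR/TRY  1 ONEWORD            2.3616
SAR/TRY  1 THREE WORD NAME    0.46993'

for code in GBP EUR SAR; do
  printf '%s => %s\n' "$code" "$(printf '%s\n' "$sample_page" | first_rate "$code")"
done
```

With the real page, something like `VARIABLE=$(wget -O - $URL 2>/dev/null | first_rate GBP)` would take the place of the sample text.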

 

Author Comment

by:amankhan2005
ID: 38833840
Hi Wool,

    There is a catch here.

    No rates are published on Saturdays and Sundays, so the pages for those days contain no data. If the current day is a Saturday or Sunday, we need to use Friday's rates instead.

For example:

http://www.tcmb.gov.tr/kurlar/201301/26012013.html
http://www.tcmb.gov.tr/kurlar/201301/27012013.html

will not have data.

On a Saturday or Sunday, take Friday's value from this link instead:

http://www.tcmb.gov.tr/kurlar/201301/25012013.html

We probably need an if statement that checks whether the day is a Saturday or Sunday and then builds the link for the preceding Friday:

day - 1 for Saturday
day - 2 for Sunday

Any ideas?

Thanks

Expert Comment

by:woolmilkporc
ID: 38834705
Yes, we need an if condition. The date manipulation below is reasonably portable because it does not rely on GNU date, which has its own mechanisms for this but is not available on all platforms.

export OTZ=$TZ
if [[ $(date "+%u") -eq 6 ]] ; then
  export TZ=GMT+25
elif [[ $(date "+%u") -eq 7 ]] ; then
  export TZ=GMT+49
fi
URL="http://www.tcmb.gov.tr/kurlar/$(date +"%Y%m/%d%m%Y").html"
echo $URL
export TZ=$OTZ
VARIABLE=$(wget -O - $URL 2>/dev/null |awk '/^GBP\/TRY/ {for(F=1;F<=NF;F++) if($F~"[0-9]\\.[0-9]") {print $F;exit}}')
echo $VARIABLE


Testing shows that there is no web page for 30/01/2013 yet. Is this correct/expected?

It seems that the page isn't generated before 15:30. To take this into account (pull the previous day if it's before that time) change line 2 like this:

...
if [[ $(date "+%u") -eq 6 || 10#$(date "+%H%M") -lt 1530 ]] ; then
...

 

Author Comment

by:amankhan2005
ID: 38848012
Hi WoolMilk,

  On Saturday and Sunday the URL still points to the same day instead of Friday.

  Today is Sunday, and when I echo $URL it still shows 02022013 when it should show 01022013.

Should the TZ variable you are using appear somewhere in the URL so that it is replaced with the earlier date?


Thanks
 

Author Comment

by:amankhan2005
ID: 38848021
Today (Sunday), when I typed:

$(date "+%u")
it gave the output: 7

TZ=GMT+49

date
[root@martian ~]# date
Sat Feb  2 05:53:15 GMT 2013

Whereas it should show Fri Feb  1 05:53:15 GMT 2013

Any ideas...
Thanks

Expert Comment

by:woolmilkporc
ID: 38848229
Strange.

Here is what I get:

# date
Sun Feb  3 10:50:11 GMT+01:00 2013

# export TZ=GMT+49
# date
Fri Feb  1 08:50:21 GMT 2013

# export TZ=GMT+25
# date
Sat Feb  2 08:51:39 GMT 2013

This is the same with GNU date and AIX date.

Please remember that you must "export" the TZ variable (which my script does) to make "date" recognize the change!

Which is your OS?

I heard that some Linuxes don't support GMT differences greater than 24!
If this is the case in your OS we will have to use the "--date" feature of GNU date.

Which "date" implementation are you using?

Accepted Solution

by:
woolmilkporc earned 500 total points
ID: 38848376
Here is the "GNU date" variant:

if [[ $(date "+%u") -eq 6 ]] ; then
  AGO=1
elif [[ $(date "+%u") -eq 7 ]] ; then
  AGO=2
else
  AGO=0
fi
URL="http://www.tcmb.gov.tr/kurlar/$(date -d "$AGO days ago" +"%Y%m/%d%m%Y").html"
echo $URL
VARIABLE=$(wget -O - $URL 2>/dev/null |awk '/^GBP\/TRY/ {for(F=1;F<=NF;F++) if($F~"[0-9]\\.[0-9]") {print $F;exit}}')
echo $VARIABLE
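The weekday check can also be kept in a small function so it can be tried with any weekday number instead of waiting for the weekend. This is a sketch only: the name weekend_offset is made up, and its argument is assumed to be the ISO weekday number that `date +%u` prints (1 = Monday ... 7 = Sunday).

```shell
# weekend_offset DOW (hypothetical helper): print how many days to go back
# so that a Saturday (6) or Sunday (7) lands on the preceding Friday.
weekend_offset() {
  case "$1" in
    6) echo 1 ;;   # Saturday -> Friday
    7) echo 2 ;;   # Sunday   -> Friday
    *) echo 0 ;;   # Monday to Friday: use the day itself
  esac
}

AGO=$(weekend_offset "$(date "+%u")")
echo "days to go back today: $AGO"
```

The result would be used exactly like AGO above, in `date -d "$AGO days ago"`, which still requires GNU date.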


If you want to implement the "15:30" thing please use the below code.
My first suggestion in this regard didn't cover all possibilities.

D=$(date "+%u")
HM=$(date "+%H%M")
HM=$((10#$HM))   # force base 10; [[ ]] would otherwise reject times such as 0830 as invalid octal
if [[ $D -eq 6 || ( $HM -lt 1530 && ( $D -ne 1 && $D -ne 7 ) ) ]] ; then
  AGO=1
elif [[ $D -eq 7 ]] ; then
  AGO=2
elif [[ $D -eq 1 && $HM -lt 1530 ]] ; then
  AGO=3
else
  AGO=0
fi
URL="http://www.tcmb.gov.tr/kurlar/$(date -d "$AGO days ago" +"%Y%m/%d%m%Y").html"
VARIABLE=$(wget -O - $URL 2>/dev/null |awk '/^GBP\/TRY/ {for(F=1;F<=NF;F++) if($F~"[0-9]\\.[0-9]") {print $F;exit}}')
echo $VARIABLE
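The "15:30" decision can be checked for every weekday and time without touching the clock if it is written as a function of the weekday and the HHMM string. The sketch below is an illustration under assumptions: days_back is an invented name, the 15:30 cutoff is the one observed in this thread, and the comparison uses `[ ... ]` rather than `[[ ... ]]` because `[` compares the operands as decimal numbers, so a value like 0830 is not treated as octal.

```shell
# days_back DOW HHMM (hypothetical helper): print how many days to go back
# to reach the most recent day whose page should already exist.
# DOW is the output of `date +%u`, HHMM the output of `date +%H%M`.
days_back() {
  dow=$1
  hhmm=$2
  case "$dow" in
    7) echo 2; return ;;   # Sunday   -> Friday
    6) echo 1; return ;;   # Saturday -> Friday
  esac
  if [ "$hhmm" -lt 1530 ]; then
    if [ "$dow" -eq 1 ]; then
      echo 3               # Monday before 15:30 -> previous Friday
    else
      echo 1               # Tuesday to Friday before 15:30 -> previous day
    fi
  else
    echo 0                 # weekday after the cutoff -> today
  fi
}

AGO=$(days_back "$(date "+%u")" "$(date "+%H%M")")
echo "days to go back right now: $AGO"
```

For example, `days_back 1 0900` prints 3 and `days_back 3 1530` prints 0, which matches the branches of the if statement above.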




wmp
 

Author Closing Comment

by:amankhan2005
ID: 38853885
YOU ROCK ...... WOOLSILK....... HATS OFF......GR8.....AWESUM......