WebDvlp
Find string between two tags
Given the following file:
(...)blablbla start first string stop blabla start_second string.stop blabla(...)
How can I extract the string between the SECOND occurrence of the words "start" and "stop" (should return "_second string.")?
GREP seems to return only lines, so I guess the solution lies in SED...?
Also, a second option is to use perl. For example:
echo $string | perl -ne '/.*start(.*)stop/; print "$1\n"'
That gives the string between the last occurrence of the words "start" and "stop" on each line.
If you want the second on each line:
echo $string | perl -ne 'print ((/start(.*)stop/g)[1]'
Sorry, I meant to type (with a non-greedy match):
echo $string | perl -ne 'print ((/start(.*?)stop/g)[1]'
Sorry, I meant to type (with the closing parenthesis):
echo $string | perl -ne 'print ((/start(.*?)stop/g)[1])'
ASKER
Sorry guys, no PERL.
The first solution someone came up with is:
string=`grep $searchpattern $file`
echo $string | sed -e 's/.*start//; s/stop.*//'
But what should go in $searchpattern? Also, the file cannot be assumed to contain any line breaks...
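One way to sidestep the $searchpattern question is to skip grep and feed the whole file to sed as a single line. The following is only a sketch: `last_between_sed` is a made-up helper name, and `tr` turns every newline into a space, so the extracted text will differ from the file wherever a line break fell inside it. Note also that the greedy `.*start` strips everything up to the LAST "start", so this returns the second string only when there are exactly two pairs.

```shell
# Hypothetical helper: print the text between the last "start" and the
# first "stop" that follows it, treating the file as one long line.
last_between_sed() {
    file=$1
    # Join all lines with spaces, then add a single newline back so sed
    # receives a properly terminated line.
    { tr '\n' ' ' < "$file"; echo; } |
        sed -e 's/.*start//; s/stop.*//'
}
```

Usage would be `string=$(last_between_sed "$file")`.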
Does that mean we can assume there are no line breaks?
That there are no line breaks in between the first and second occurrences of start stop pairs?
That there are exactly two start stop pairs?
That start and stop only occur as part of a pair?
If not perl, would awk be acceptable?
Would we be allowed to insert line breaks as a step toward a solution?
awk 'BEGIN{ORS=" "}{gsub(/stop/,"\n")}1' file | grep start | head -2 | tail -n +2 | sed 's/.*start//'
ASKER
This did the trick, thanks.
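For anyone who needs the same thing without turning line breaks into spaces, here is a hedged awk-only sketch. It is not the pipeline posted above but a different technique: slurp the input into one buffer and walk it with `index()` and `substr()`. The function name `nth_between` and the awk variables `s`, `e`, and `n` are made up for illustration, and it assumes the file fits comfortably in memory.

```shell
# Hypothetical helper: print the text between the n-th "start" and the
# next "stop" after it. Reads the file given as $2, or stdin if omitted.
nth_between() {
    n=$1
    shift
    awk -v s="start" -v e="stop" -v n="$n" '
        { buf = buf $0 "\n" }                   # collect the whole input
        END {
            for (i = 1; i <= n; i++) {
                p = index(buf, s)               # next "start"
                if (!p) exit 1
                buf = substr(buf, p + length(s))
                q = index(buf, e)               # first "stop" after it
                if (!q) exit 1
                found = substr(buf, 1, q - 1)   # text between the pair
                buf = substr(buf, q + length(e))
            }
            printf "%s\n", found
        }' "$@"
}
```

Usage would be `nth_between 2 file`. Line breaks between the tags are kept as they are, and the exit status is non-zero when there are fewer than n pairs.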