Solved

Sed: -e expression #1, char 16: unterminated address regex

Posted on 2016-09-29
Last Modified: 2016-11-01
I am trying to grep for a particular text (Do action on cell BL330) in a compressed log file (sample.gz) on a remote machine. The search should be limited to the lines between two date+timestamps (2016-09-14 01:09:56,796 to 2016-09-15 04:10:29,719), and the output should finally be written to an output file on the local machine.

Details of the variables passed as parameters:
server_id=hostname
first_line_log_file=sample.gz
first_line_date_time=2016-09-14 01:09:56,796
first_log_first_line=2016-09-15 04:10:29,719
do_action_on_cell_2=Do action on cell BL330

ssh -q -o "StrictHostKeyChecking no" $server_id "cd /intucell/data/logs/app; zcat $first_line_log_file | sed -rne '/$first_line_date_time/,/$first_log_first_line/p'| zgrep -A10 -i '$do_action_on_cell_2';" >> ./output.log

However, on executing the above I get the following error:

sed: -e expression #1, char 62: unterminated address regex

I think the problem is with the sed expression. Please suggest a way forward.
Question by:rbadveti R

Expert Comment

by:woolmilkporc
I assume that all variables are defined in the local shell (the one where "ssh" is issued) and thus are not known to the remote machine "$server_id".

If I'm right, those variables must not be enclosed in single quotes, because the local shell will then not expand them.

So use double quotes.

Next, since you're already using zcat to decompress the data, you don't need "zgrep"; "grep" alone will suffice.

ssh -q -o "StrictHostKeyChecking no" $server_id "cd /intucell/data/logs/app; zcat $first_line_log_file | sed -rne "/$first_line_date_time/,/$first_log_first_line/p"| grep -A10 -i "$do_action_on_cell_2'"" >> ./output.log

Author Comment

by:rbadveti R
Hi woolmilkporc,

Thanks for the reply.

Yes, all the variables used are defined in the local shell and are not known to the remote machine.

A double quote is already present after "$server_id", so I believe it takes care of variable expansion.

I tried the command you mentioned, but it did not work. Please see the result below, with a -vx trace:

+ ssh -q -o 'StrictHostKeyChecking no' deb011 'cd /intucell/data/logs/app; grep -i '\''Cell BLA330 is down'\'' sample.gz; zgrep -i '\''Cell BLA330 is down'\'' sample.gz;'
+ ssh -q -o 'StrictHostKeyChecking no' deb011 'cd /intucell/data/logs/app; zcat sample.gz | sed -rne /2016-09-14' '01:09:56,796/,/2016-09-14' '01:46:56,438/p | grep -A10 -i Do' action on cell 'BLA330;'
grep: action: No such file or directory
grep: on: No such file or directory
grep: cell: No such file or directory
grep: BLA330: No such file or directory
sed: -e expression #1, char 16: unterminated address regex

Appreciate further checking on this.
Thanks.

Expert Comment

by:woolmilkporc
>> Double quote is already present after the "$server_id"  <<
Yes, but later you enclosed the sed command and the grep string in single quotes.
The shell does not expand variables between single quotes.

Did you enter my command exactly as posted?
It contains a tiny but important error, sorry.
There is an extra single quote near the end which I forgot to remove:

... | grep -A10 -i "$do_action_on_cell_2"" >> ./output.log


You could also try this:

ssh -q -o "StrictHostKeyChecking no" $server_id 'cd /intucell/data/logs/app; zcat '$first_line_log_file' | sed -ne "/'$first_line_date_time'/,/'$first_log_first_line'/p"| grep -A10 -i "'$do_action_on_cell_2'"' >> ./output.log

I used single quotes around the whole ssh command and took the variables out of those quotes.
It's a bit hard to read, but this time all quotes should match (I hope).

You should also omit the "-r" flag of "sed". There is no extended regex syntax in the expression, as far as I can see.
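To check what the remote side would actually receive, the command string can be printed locally before ssh is involved at all. The following is only a sketch using sample values from this thread; remote_cmd is a hypothetical variable name, not part of the original script.

```shell
#!/bin/sh
# Sample values taken from the thread; remote_cmd is a hypothetical name.
first_line_date_time='2016-09-14 01:09:56,796'
first_log_first_line='2016-09-14 01:46:56,438'

# Outer double quotes: the LOCAL shell expands the variables, and the inner
# single quotes are kept literally for the remote shell to interpret.
remote_cmd="sed -ne '/$first_line_date_time/,/$first_log_first_line/p'"

# Print exactly the string that would be handed to ssh as one argument.
printf '%s\n' "$remote_cmd"
```

If the printed line already looks wrong (missing quotes, unexpanded variable names, stray characters), the problem is on the local side and not in sed.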

Author Comment

by:rbadveti R
Hi,

Tried the command and below is the output:

ssh -q -o "StrictHostKeyChecking no" $server_id 'cd /intucell/data/logs/app; zcat '$first_line_log_file' | sed -rne "/'$first_line_date_time'/,/'$first_log_first_line'/p"| grep -A10 -i "'$do_action_on_cell_2'"'

+ ssh -q -o 'StrictHostKeyChecking no' deb011 'cd /intucell/data/logs/app; zcat sample.gz | sed -rne "/2016-09-14' '01:09:56,796/,/2016-09-14' '01:46:56,438/p"| grep -A10 -i "Do' action on cell 'BLA330"'
sed: -e expression #1, char 62: unterminated address regex


I have used -vx option for debug trace after the server_id and below is the command and output for the same:

+ ssh -q -o 'StrictHostKeyChecking no' deb011 -vx 'cd /intucell/data/logs/app; zcat sample.gz | sed -rne "/2016-09-14' '01:09:56,796/,/2016-09-14' '01:46:56,438/p"| grep -A10 -i "Do' action on cell 'BLA330"'
OpenSSH_6.0p1 Debian-4+deb7u2, OpenSSL 1.0.1e 11 Feb 2013
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 19: Applying options for *
debug1: Connecting to deb011 [x.x.x.x] port 22.
...
debug1: Sending command: cd /intucell/data/logs/app; zcat sample.gz | sed -rne "/\033[36m2016-09-14 01:09:56,796/,/\033[32m2016-09-14 01:46:56,438/p"| grep -A10 -i "Do action on cell BLA330";
sed: -e expression #1, char 62: unterminated address regex
debug1: client_input_channel_req: channel 0 rtype exit-status reply 0
debug1: client_input_channel_req: channel 0 rtype eow@openssh.com reply 0
debug1: channel 0: free: client-session, nchannels 1
debug1: fd 1 clearing O_NONBLOCK
Transferred: sent 2888, received 2000 bytes, in 0.0 seconds
Bytes per second: sent 536288.6, received 371391.0
debug1: Exit status 1

From the output above, I suspect the following part of the sed expression:
'/\033[36m2016-09-14 01:09:56,796/,/\033[32m2016-09-14 01:46:56,438/p'

Any idea how to get rid of these special characters, \033[36m and \033[32m?

Expert Comment

by:woolmilkporc
Those are ANSI color escape sequences for "cyan" and "green".
These sequences are definitely the cause of the error - I just tested it.

Do your variables contain highlighting??

If so, rewrite the strings "by hand" and retry.
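The effect can be reproduced locally without ssh. The sketch below builds a timestamp with a leading "cyan" sequence by hand (the sample line and variable names are made up for illustration) and feeds it to sed as an address. The "[" of the color code starts a bracket expression that is never closed, which is a plausible reason why sed cannot find the end of the address.

```shell
#!/bin/sh
# Build a timestamp that carries a leading ANSI "cyan" sequence (ESC [ 3 6 m),
# like the values shown in the ssh debug output above.
esc=$(printf '\033')
colored="${esc}[36m2016-09-14 01:09:56,796"
clean='2016-09-14 01:09:56,796'

# Made-up sample log line for the demonstration.
line='2016-09-14 01:09:56,796 INFO sample log line'

# With the color code in the address, sed rejects the script.
if printf '%s\n' "$line" | sed -n "/$colored/p" 2>/dev/null; then
    colored_status=ok
else
    colored_status=failed
fi

# The same address without the color code is accepted and matches the line.
clean_match=$(printf '%s\n' "$line" | sed -n "/$clean/p")

printf 'colored: %s\nclean:   %s\n' "$colored_status" "$clean_match"
```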

Author Comment

by:rbadveti R
I don't know how to confirm that, but yes, the .gz files I use are displayed in a different, highlighted font from normal files when listed (ls -lrt).

I derived one of the sed variables (first_log_first_line) as below:

ssh -q -o "StrictHostKeyChecking no" $server_id "cd /intucell/data/logs/app; zcat sample.gz | tail -1| cut -d' ' -f1-2;" > ./first_line_output.txt

first_log_first_line=$(head -1 first_line_output.txt)

As you can see above, I take the last line of the sample.gz file and keep only its date and time using cut. Possibly the color escape sequences came from there.

Any idea how these could be checked and removed?
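One way to check a file for hidden color codes is to make the ESC byte visible. The sketch below assumes a "cat" that supports the "-v" option (GNU and BSD do; it is not required by POSIX) and uses a temporary file as a hypothetical stand-in for first_line_output.txt.

```shell
#!/bin/sh
# Hypothetical stand-in for first_line_output.txt: one colored timestamp.
tmpfile=$(mktemp)
printf '\033[32m2016-09-14 01:46:56,438\n' > "$tmpfile"

# cat -v renders the invisible ESC byte as "^[", so color codes become visible.
visible=$(cat -v "$tmpfile")
printf '%s\n' "$visible"

# Count the lines that contain an ESC byte; anything above 0 means the
# file carries escape sequences.
esc=$(printf '\033')
colored_lines=$(grep -c "$esc" "$tmpfile")
printf 'lines with escape sequences: %s\n' "$colored_lines"

rm -f "$tmpfile"
```

Running "od -c" on the file is an alternative; there the ESC byte shows up as 033.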

Expert Comment

by:woolmilkporc
When creating the variables from an "ls" listing you must switch off color highlighting:

ls -lrt --color=never ... ...

But ... it rather seems that it's the file "sample.gz" which contains this highlighting. What do you see with "zcat sample.gz"?
How did you (or somebody else) create that .gz file?

Accepted Solution

by:
woolmilkporc earned 500 total points (awarded by participants)
If there is no other way we can strip the color codes using a small Perl one-liner:

X=$(echo "$X" | perl -pe 's/\e\[[\d;]*m//g;')

so

first_line_date_time=$(echo "$first_line_date_time" | perl -pe 's/\e\[[\d;]*m//g;')
and
first_log_first_line=$(echo "$first_log_first_line" | perl -pe 's/\e\[[\d;]*m//g;')

Acknowledgement: I got the Perl regex from the source of the module Term::ANSIColor, see here ("colorstrip"):
https://metacpan.org/source/RRA/Term-ANSIColor-4.05/lib/Term/ANSIColor.pm
Special thanks to Kurt Starsinic!
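Where Perl is not available, the same pattern can be applied with sed. This is only a sketch: strip_ansi is a hypothetical helper name, and the colored value is simulated with printf rather than read from the real log.

```shell
#!/bin/sh
# strip_ansi is a hypothetical helper: it removes ANSI color sequences
# (ESC [ ... m) from its standard input using sed instead of Perl.
strip_ansi() {
    esc=$(printf '\033')
    sed "s/${esc}\[[0-9;]*m//g"
}

# Simulate a value captured with a leading color code and a trailing reset.
first_log_first_line=$(printf '\033[32m2016-09-14 01:46:56,438\033[0m')
first_log_first_line=$(printf '%s\n' "$first_log_first_line" | strip_ansi)

printf '%s\n' "$first_log_first_line"
```

The ESC byte is produced with printf because the "\e" and "\x1b" escapes are not understood by every sed.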

Author Comment

by:rbadveti R
Thanks a ton to you- woolmilkporc and Kurt Starsinic - It really helped.

I had no clue for the last 4 days what was going wrong until this day - All thanks to the color magic :)

It was nice learning though. Thank you again - Have a good day!

Expert Comment

by:woolmilkporc
Glad I (we) could help!
Kurt is the author of Term::ANSIColor and wasn't (directly) involved in answering this question!

wmp

Author Comment

by:rbadveti R
However, I have a new requirement, explained below:

I have to sort only the lines containing the date+time.

Sample text file as below:

========== server_1 ==========
sample1.gz: 2016-09-14 02:55:30,799
sample2.gz: 2016-09-14 01:09:56,796
sample2.gz: 2016-09-14 01:09:57,927
========== server_2 ==========
========== server_3 ==========
========== server_1 ==========
sample1.gz: 2016-09-14 02:55:31,250
sample1.gz: 2016-09-14 02:55:31,777


Desired output:

========== server_1 ==========
sample2.gz: 2016-09-14 01:09:56,796
sample2.gz: 2016-09-14 01:09:57,927
sample1.gz: 2016-09-14 02:55:30,799
sample1.gz: 2016-09-14 02:55:31,250
sample1.gz: 2016-09-14 02:55:31,777
========== server_2 ==========
========== server_3 ==========


Please suggest.

Expert Comment

by:woolmilkporc
This is a whole new problem!
It's not as simple as it might look at first sight,
so I'm sure it would have been worth asking a new question!
Since you're new here on EE I'll provide a solution nonetheless,
but please keep in mind for the future to ask a new question for any new problem!

The following solution needs to create a bunch of work files in /tmp
whose names should be fairly unique and which will be removed after completion.

The only thing you have to change is the assignment of the "IN" variable
which designates your input file name. The script creates a sorted output file
named "input file name.sorted" in the same directory.

Please note that the intermediate header lines must start with
at least three equal signs ("===") as in your sample, otherwise the code will not work!

Have fun!

#!/bin/bash
IN=myfile.txt   # <-- Adjust here!
# no customization required below this line
OUT=${IN}.sorted
SUF=$$
awk -v SUF=$SUF '{if($0~"^===")
                  {A=$0;if(S[A]!~"") S[A]=" "}
                  if($0!~"^===") S[A]=S[A]"\n"$0 }
             END {l=asorti(S,T);
                  for(n=1;n<=l;n++)
                  print T[n] "\n" S[T[n]] | "grep -v \"^$\" > /tmp/"n"-"SUF".out"}' $IN
for tempfile in $(ls /tmp/*${SUF}*)
    do
     head -1 $tempfile; sed 1d $tempfile | sort -k2,3
    done > $OUT
rm /tmp/*${SUF}*


Author Comment

by:rbadveti R
Hi,

Yes, I agree a new question should have been raised - will follow next time. Thanks.

I ran the script you shared, but I see the following errors:

awk: line 7: function asorti never defined
ls: cannot access /tmp/*4134*: No such file or directory
rm: cannot remove `/tmp/*4134*': No such file or directory


with -vx option:

IN=datetime.txt   # <-- Adjust here!
+ IN=datetime.txt
# no customization required below this line
OUT=${IN}.sorted
+ OUT=datetime.txt.sorted
SUF=$$
+ SUF=10985
awk -v SUF=$SUF '{if($0~"^===")
                  {A=$0;if(S[A]!~"") S[A]=" "}
                  if($0!~"^===") S[A]=S[A]"\n"$0 }
             END {l=asorti(S,T);
                  for(n=1;n<=l;n++)
                  print T[n] "\n" S[T[n]] | "grep -v \"^$\" > /tmp/"n"-"SUF".out"}' $IN
+ awk -v SUF=10985 '{if($0~"^===")
                  {A=$0;if(S[A]!~"") S[A]=" "}
                  if($0!~"^===") S[A]=S[A]"\n"$0 }
             END {l=asorti(S,T);
                  for(n=1;n<=l;n++)
                  print T[n] "\n" S[T[n]] | "grep -v \"^$\" > /tmp/"n"-"SUF".out"}' datetime.txt
awk: line 7: function asorti never defined
for tempfile in $(ls /tmp/*${SUF}*)
do
    head -1 $tempfile; sed 1d $tempfile | sort -k2,3
done > $OUT
ls /tmp/*${SUF}*)
ls /tmp/*${SUF}*)
ls /tmp/*${SUF}*
++ ls '/tmp/*10985*'
ls: cannot access /tmp/*10985*: No such file or directory
rm /tmp/*${SUF}*
+ rm '/tmp/*10985*'
rm: cannot remove `/tmp/*10985*': No such file or directory


Expert Comment

by:woolmilkporc
Which is your OS? Linux has GNU awk which supports "asorti"!

Are you allowed to install additional software on your machine?

We could omit asorti, but only if you don't need the result sorted by the server names contained in the header lines.

Author Comment

by:rbadveti R
It's Debian: Debian 3.2.63-2 x86_64 GNU/Linux.
I am afraid we may not be allowed to install additional software.

I would need the server_n output as well.
A text file will only contain logs for a single server, and after sorting I also need the information about which server these logs belong to.

Expert Comment

by:woolmilkporc
Do you have "gawk"?  Check with "which gawk".

If it's there replace "awk" with "gawk" and retry.

Author Comment

by:rbadveti R
Hi,

I was able to get help.

awk '/^====/ {PRF = $2 = "0"$2} {print PRF, $0}' file | sort -uk1,1 -k3 | awk '{sub ($1 FS, _); sub (/^0/, _, $2)}1'

This command accomplishes what I need.
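To see how the pipeline behaves, it can be run on the sample input posted earlier in this thread. The sketch below writes that sample to a temporary file; the LC_ALL=C setting is an added assumption to keep the sort order independent of the locale, and the step comments are one reading of the command, not the original author's explanation.

```shell
#!/bin/sh
# Recreate the sample input from this thread in a temporary file.
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
========== server_1 ==========
sample1.gz: 2016-09-14 02:55:30,799
sample2.gz: 2016-09-14 01:09:56,796
sample2.gz: 2016-09-14 01:09:57,927
========== server_2 ==========
========== server_3 ==========
========== server_1 ==========
sample1.gz: 2016-09-14 02:55:31,250
sample1.gz: 2016-09-14 02:55:31,777
EOF

# Step 1: prefix every line with "0" plus the name from the last header seen.
# Step 2: sort by that prefix, then by everything from field 3 onward; the
#         "0" makes a header sort ahead of the timestamps, and -u drops the
#         repeated header.
# Step 3: strip the helper prefix and the extra "0" from the header again.
sorted=$(awk '/^====/ {PRF = $2 = "0"$2} {print PRF, $0}' "$tmpfile" |
    LC_ALL=C sort -uk1,1 -k3 |
    awk '{sub ($1 FS, _); sub (/^0/, _, $2)}1')

printf '%s\n' "$sorted"
rm -f "$tmpfile"
```

On this input the result should match the desired output shown in the question: one server_1 header followed by its five lines in time order, then the server_2 and server_3 headers.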

Expert Comment

by:skullnobrains
ssh messes up the quoting, and you need to run the sed on the remote machine.

You might like this approach, which tends to be much easier to debug:

ssh -q -o 'StrictHostKeyChecking no' deb011 sh -s <<'EOF' >/path/to/local/output/file
cd /intucell/data/logs/app || exit 1
zcat sample.gz | sed -ne '/2016-09-14 01:09:56,796/,/2016-09-14 01:46:56,438/p' | grep -A10 -i 'Do action on cell BLA330'
EOF
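A variant of the here-document approach keeps the values in local variables. With an unquoted EOF the local shell expands the variables before the script is sent, while the single quotes travel along untouched. The sketch below is an illustration only: run_remote is a hypothetical wrapper that uses a local "sh -s" as a stand-in for the ssh call, and the two input lines are made up.

```shell
#!/bin/sh
# Sample value from the thread; run_remote is a hypothetical wrapper that
# uses a local "sh -s" as a stand-in for: ssh -q "$server_id" sh -s
pattern='Do action on cell BLA330'
run_remote() { sh -s; }

# Unquoted EOF: $pattern is expanded locally before the script is sent.
# The single quotes arrive intact and keep the spaces together on the far side.
result=$(run_remote <<EOF
printf '%s\n' 'unrelated line' 'Do action on cell BLA330 now' | grep -i '$pattern'
EOF
)

printf '%s\n' "$result"
```

This only works as long as the values contain no single quotes themselves; a value with a single quote in it would end the quoting early on the receiving side.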

Expert Comment

by:skullnobrains
Note that without the quoting mess, the script would not produce the desired output, but it would not produce the original error either, which is caused by the spaces. The escape sequences would merely prevent the string from matching.

+ ssh -q -o 'StrictHostKeyChecking no' deb011 'cd /intucell/data/logs/app; zcat sample.gz | sed -rne "/2016-09-14' '01:09:56,796/,/2016-09-14' '01:46:56,438/p"| grep -A10 -i "Do' action on cell 'BLA330"'
sed: -e expression #1, char 62: unterminated address regex

The string is passed to ssh as the following arguments:
ARG : cd /intucell/data/logs/app; zcat sample.gz | sed -rne "/2016-09-14
ARG : 01:09:56,796/,/2016-09-14
ARG : 01:46:56,438/p"| grep -A10 -i "Do
ARG : ...

This string ends up broken up as follows, including the double quotes
(actually, I'm unsure about the double-quote handling in that case):
ARG : ...
ARG : "/2016-09-14
ARG : 01:09:56,796/,/2016-09-14
ARG : 01:46:56,438/p"

best regards