Sed: -e expression #1, char 16: unterminated address regex

I am trying to grep for a particular text (Do action on cell BL330) in a compressed text file (sample.gz) on a remote machine. The search should cover only the lines between two date+timestamps (2016-09-14 01:09:56,796 to 2016-09-15 04:10:29,719), and the result should be written to an output file on the local machine.

A few details of the variables passed as parameters:
server_id=hostname
first_line_log_file=sample.gz
first_line_date_time=2016-09-14 01:09:56,796
first_log_first_line=2016-09-15 04:10:29,719
do_action_on_cell_2=Do action on cell BL330

ssh -q -o "StrictHostKeyChecking no" $server_id "cd /intucell/data/logs/app; zcat $first_line_log_file | sed -rne '/$first_line_date_time/,/$first_log_first_line/p'| zgrep -A10 -i '$do_action_on_cell_2';" >> ./output.log

However, on executing the above I get the error below:

sed: -e expression #1, char 62: unterminated address regex

I think the problem is with the sed expression. Please suggest a way forward.

woolmilkporc

I assume that all variables are defined to the local shell (the one where "ssh" is issued) and thus are not known to the remote machine "$server_id".

If I'm right, those variables must not be enclosed in single quotes, because the local shell will then not resolve them.

So use double quotes.

Next, since you're already using zcat to decompress the data, you don't need "zgrep"; plain "grep" will suffice.

ssh -q -o "StrictHostKeyChecking no" $server_id "cd /intucell/data/logs/app; zcat $first_line_log_file | sed -rne "/$first_line_date_time/,/$first_log_first_line/p"| grep -A10 -i "$do_action_on_cell_2'"" >> ./output.log

rbadveti R

ASKER

Hi woolmilkporc,

Thanks for the reply.

Yes, all the variables used are defined local to the shell and not known to the remote machine.

A double quote is already present after "$server_id", so I believe it takes care of variable expansion.
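The point about the outer double quotes can be checked locally without ssh. The following is a minimal sketch, assuming a hypothetical timestamp value: single quotes that sit inside a double-quoted string are ordinary characters, so the local shell still expands the variable between them.

```shell
#!/bin/bash
# Hypothetical value standing in for the variable from the question.
first_line_date_time='2016-09-14 01:09:56,796'

# Outer double quotes: the local shell expands the variable. The inner
# single quotes are plain characters that travel on to the remote shell.
remote_cmd="sed -ne '/$first_line_date_time/p'"

# Print the string exactly as it would be handed to ssh.
printf '%s\n' "$remote_cmd"
```

The printed string contains the expanded timestamp wrapped in single quotes, which is what the remote shell would then parse.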

I tried the command you mentioned, but it did not work. Please see the result below, with a -vx trace:

+ ssh -q -o 'StrictHostKeyChecking no' deb011 'cd /intucell/data/logs/app; grep -i '\''Cell BLA330 is down'\'' sample.gz; zgrep -i '\''Cell BLA330 is down'\'' sample.gz;'
+ ssh -q -o 'StrictHostKeyChecking no' deb011 'cd /intucell/data/logs/app; zcat sample.gz | sed -rne /2016-09-14' '01:09:56,796/,/2016-09-14' '01:46:56,438/p | grep -A10 -i Do' action on cell 'BLA330;'
grep: action: No such file or directory
grep: on: No such file or directory
grep: cell: No such file or directory
grep: BLA330: No such file or directory
sed: -e expression #1, char 16: unterminated address regex

Appreciate further checking on this.
Thanks.

woolmilkporc

>> Double quote is already present after the "$server_id" <<
Yes, but later you enclosed the sed command and the grep string in single quotes.
The shell does not expand variables between single quotes.

Did you enter my command exactly as posted?
It contains a tiny but important error, sorry.
There is an extra single quote near the end which I forgot to remove:

... | grep -A10 -i "$do_action_on_cell_2"" >> ./output.log

You could also try this:

ssh -q -o "StrictHostKeyChecking no" $server_id 'cd /intucell/data/logs/app; zcat '$first_line_log_file' | sed -ne "/'$first_line_date_time'/,/'$first_log_first_line'/p"| grep -A10 -i "'$do_action_on_cell_2'"' >> ./output.log

I used single quotes around the whole ssh command and took the variables out of those quotes.
It's a bit hard to read, but this time all quotes should match (I hope).

You should also omit the "-r" flag of "sed". There are no extended regular expressions present, as far as I can see.

rbadveti R

ASKER

Hi,

I tried the command, and below is the output:

ssh -q -o "StrictHostKeyChecking no" $server_id 'cd /intucell/data/logs/app; zcat '$first_line_log_file' | sed -rne "/'$first_line_date_time'/,/'$first_log_first_line'/p"| grep -A10 -i "'$do_action_on_cell_2'"'

+ ssh -q -o 'StrictHostKeyChecking no' deb011 'cd /intucell/data/logs/app; zcat sample.gz | sed -rne "/2016-09-14' '01:09:56,796/,/2016-09-14' '01:46:56,438/p"| grep -A10 -i "Do' action on cell 'BLA330"'
sed: -e expression #1, char 62: unterminated address regex

I also added the -vx option after the server_id to get a debug trace. The command and its output are below:

+ ssh -q -o 'StrictHostKeyChecking no' deb011 -vx 'cd /intucell/data/logs/app; zcat sample.gz | sed -rne "/2016-09-14' '01:09:56,796/,/2016-09-14' '01:46:56,438/p"| grep -A10 -i "Do' action on cell 'BLA330"'
OpenSSH_6.0p1 Debian-4+deb7u2, OpenSSL 1.0.1e 11 Feb 2013
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 19: Applying options for *
debug1: Connecting to deb011 [x.x.x.x] port 22.
...
debug1: Sending command: cd /intucell/data/logs/app; zcat sample.gz | sed -rne "/\033[36m2016-09-14 01:09:56,796/,/\033[32m2016-09-14 01:46:56,438/p"| grep -A10 -i "Do action on cell BLA330";
sed: -e expression #1, char 62: unterminated address regex
debug1: client_input_channel_req: channel 0 rtype exit-status reply 0
debug1: client_input_channel_req: channel 0 rtype eow@openssh.com reply 0
debug1: channel 0: free: client-session, nchannels 1
debug1: fd 1 clearing O_NONBLOCK
Transferred: sent 2888, received 2000 bytes, in 0.0 seconds
Bytes per second: sent 536288.6, received 371391.0
debug1: Exit status 1

From the output above, I feel the following part of the sed expression is suspicious:
'/\033[36m2016-09-14 01:09:56,796/,/\033[32m2016-09-14 01:46:56,438/p'

Any idea how to get rid of these special characters, \033[36m and \033[32m?

woolmilkporc

Those are ANSI color escape sequences for "cyan" and "green".
These sequences are definitely the cause of the error - I just tested it.

Do your variables contain highlighting??

If so, rewrite the strings "by hand" and retry.
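One way to check whether a variable carries such sequences is to make the invisible escape byte visible. This is only a sketch: the value of `raw` is a hypothetical stand-in that mimics the cyan sequence from the trace.

```shell
#!/bin/bash
# Hypothetical value with an embedded ANSI colour sequence (ESC [ 3 6 m).
raw=$'\033[36m2016-09-14 01:09:56,796'

# cat -v renders the ESC byte as ^[ so it shows up on the terminal.
visible=$(printf '%s\n' "$raw" | cat -v)
printf '%s\n' "$visible"

# Pattern match on the raw ESC byte to get a yes/no answer.
case $raw in
    *$'\033'*) has_escape=yes ;;
    *)         has_escape=no ;;
esac
echo "contains escape sequences: $has_escape"
```

A clean timestamp would print unchanged through `cat -v` and report "no".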

rbadveti R

ASKER

I don't know how to confirm that, but yes, the .gz files I use are displayed in a different, highlighted color from other normal files when listed (ls -lrt).

I derived one of the sed variables (first_log_first_line) as below:

ssh -q -o "StrictHostKeyChecking no" $server_id "cd /intucell/data/logs/app; zcat sample.gz | tail -1| cut -d' ' -f1-2;" > ./first_line_output.txt

first_log_first_line=$(head -1 first_line_output.txt)

As you can see above, I take the last line of the sample.gz file and keep only its date and time using cut. Possibly the color escape sequences came from there.

Any idea how these could be checked and removed?

woolmilkporc

When creating the variables from an "ls" listing you must switch off color highlighting:

ls -lrt --color=never ... ...

But ... it rather seems that it's the file "sample.gz" which contains this highlighting. What do you see with "zcat sample.gz"?
How did you (or somebody else) create that .gz file?
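If the log lines themselves carry the color codes, one option is to strip them while deriving the timestamp. The sketch below is not the accepted solution (which is not shown in this thread); it builds a hypothetical two-line sample.gz in a scratch directory and removes SGR sequences of the form ESC [ ... m with sed before cut runs.

```shell
#!/bin/bash
# Hypothetical stand-in for the remote log: two coloured lines, gzipped.
tmpdir=$(mktemp -d)
printf '\033[36m2016-09-14 01:09:56,796\033[0m INFO first\n\033[32m2016-09-14 01:46:56,438\033[0m INFO last\n' |
    gzip > "$tmpdir/sample.gz"

# Keep a literal ESC byte in a variable so the sed pattern stays portable.
esc=$(printf '\033')

# Strip colour sequences first, then cut out the date and time fields.
last_stamp=$(zcat "$tmpdir/sample.gz" | tail -1 |
    sed "s/${esc}\[[0-9;]*m//g" | cut -d' ' -f1-2)
printf '%s\n' "$last_stamp"

rm -rf "$tmpdir"
```

The same sed filter could be placed in the remote pipeline before `cut`, so the variable never receives the escape bytes in the first place.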

ASKER CERTIFIED SOLUTION

woolmilkporc

rbadveti R

ASKER

Thanks a ton to you, woolmilkporc and Kurt Starsinic. It really helped.

I had no clue for the last 4 days what was going wrong, until today. All thanks to the color magic :)

It was nice learning though. Thank you again - Have a good day!

woolmilkporc

Glad I (we) could help!
Kurt is the author of TERM::ANSIColor and wasn't (directly) involved in answering this question!

wmp

rbadveti R

ASKER

However, I have a new requirement, explained below:

I have to sort only the lines containing the date+time.

Sample text file:

========== server_1 ==========
sample1.gz: 2016-09-14 02:55:30,799
sample2.gz: 2016-09-14 01:09:56,796
sample2.gz: 2016-09-14 01:09:57,927
========== server_2 ==========
========== server_3 ==========
========== server_1 ==========
sample1.gz: 2016-09-14 02:55:31,250
sample1.gz: 2016-09-14 02:55:31,777

Desired output:

========== server_1 ==========
sample2.gz: 2016-09-14 01:09:56,796
sample2.gz: 2016-09-14 01:09:57,927
sample1.gz: 2016-09-14 02:55:30,799
sample1.gz: 2016-09-14 02:55:31,250
sample1.gz: 2016-09-14 02:55:31,777
========== server_2 ==========
========== server_3 ==========

Please suggest.

woolmilkporc

This is a whole new problem!
It's not as simple as it might look at first sight,
so I'm sure it would have been worth asking a new question!
Since you're new here on EE I'll provide a solution nonetheless,
but please keep in mind for the future to ask a new question for any new problem!

The following solution needs to create a bunch of work files in /tmp
whose names should be fairly unique and which will be removed after completion.

The only thing you have to change is the assignment of the "IN" variable
which designates your input file name. The script creates a sorted output file
named "input file name.sorted" in the same directory.

Please note that the intermediate header lines must start with
at least three equal signs ("===") as in your sample, otherwise the code will not work!

Have fun!

#!/bin/bash
IN=myfile.txt   # <-- Adjust here!
# no customization required below this line
OUT=${IN}.sorted
SUF=$$
awk -v SUF=$SUF '{if($0~"^===")
                  {A=$0;if(S[A]!~"") S[A]=" "}
                  if($0!~"^===") S[A]=S[A]"\n"$0 }
             END {l=asorti(S,T);
                  for(n=1;n<=l;n++)
                  print T[n] "\n" S[T[n]] | "grep -v \"^$\" > /tmp/"n"-"SUF".out"}' $IN
for tempfile in $(ls /tmp/*${SUF}*)
    do
     head -1 $tempfile; sed 1d $tempfile | sort -k2,3
    done > $OUT
rm /tmp/*${SUF}*

rbadveti R

ASKER

Hi,

Yes, I agree a new question should have been raised - will follow next time. Thanks.

I ran the script you shared; however, I see the error below:

awk: line 7: function asorti never defined
ls: cannot access /tmp/*4134*: No such file or directory
rm: cannot remove `/tmp/*4134*': No such file or directory

with -vx option:

IN=datetime.txt   # <-- Adjust here!
+ IN=datetime.txt
# no customization required below this line
OUT=${IN}.sorted
+ OUT=datetime.txt.sorted
SUF=$$
+ SUF=10985
awk -v SUF=$SUF '{if($0~"^===")
                  {A=$0;if(S[A]!~"") S[A]=" "}
                  if($0!~"^===") S[A]=S[A]"\n"$0 }
             END {l=asorti(S,T);
                  for(n=1;n<=l;n++)
                  print T[n] "\n" S[T[n]] | "grep -v \"^$\" > /tmp/"n"-"SUF".out"}' $IN
+ awk -v SUF=10985 '{if($0~"^===")
                  {A=$0;if(S[A]!~"") S[A]=" "}
                  if($0!~"^===") S[A]=S[A]"\n"$0 }
             END {l=asorti(S,T);
                  for(n=1;n<=l;n++)
                  print T[n] "\n" S[T[n]] | "grep -v \"^$\" > /tmp/"n"-"SUF".out"}' datetime.txt
awk: line 7: function asorti never defined
for tempfile in $(ls /tmp/*${SUF}*)
do
    head -1 $tempfile; sed 1d $tempfile | sort -k2,3
done > $OUT
ls /tmp/*${SUF}*)
ls /tmp/*${SUF}*)
ls /tmp/*${SUF}*
++ ls '/tmp/*10985*'
ls: cannot access /tmp/*10985*: No such file or directory
rm /tmp/*${SUF}*
+ rm '/tmp/*10985*'
rm: cannot remove `/tmp/*10985*': No such file or directory

woolmilkporc

Which OS are you using? Most Linux distributions ship GNU awk (gawk), which supports "asorti"!

Are you allowed to install additional software on your machine?

We could omit asorti, but only if you don't need the result sorted by the server names contained in the header lines.

rbadveti R

ASKER

It's Debian: Debian 3.2.63-2 x86_64 GNU/Linux.
I am afraid we may not be allowed to install additional software.

I would need the server_n output as well.
Each text file will contain logs for a single server only, and after sorting I still need to know which server the logs belong to.

woolmilkporc

Do you have "gawk"? Check with "which gawk".

If it's there replace "awk" with "gawk" and retry.
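A quick way to check this from a script is sketched below. It picks gawk when it is on the PATH and then probes whether the chosen interpreter knows `asorti`. The variable names `AWK` and `has_asorti` are made up for this illustration.

```shell
#!/bin/bash
# Prefer gawk when it is installed; otherwise fall back to plain awk.
if command -v gawk >/dev/null 2>&1; then
    AWK=gawk
else
    AWK=awk
fi

# Probe: an awk without asorti fails to compile this program, so the
# exit status tells us whether the function is available.
if "$AWK" 'BEGIN { a["x"]; n = asorti(a, b); exit (n == 1 ? 0 : 1) }' 2>/dev/null; then
    has_asorti=yes
else
    has_asorti=no
fi
echo "using $AWK, asorti available: $has_asorti"
```

If the probe reports "no" and gawk cannot be installed, the sorting has to be done without `asorti`, for example by piping through `sort`.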

rbadveti R

ASKER

Hi,

I was able to get help.

awk '/^====/ {PRF = $2 = "0"$2} {print PRF, $0}' file | sort -uk1,1 -k3 | awk '{sub ($1 FS, _); sub (/^0/, _, $2)}1'

This command accomplishes it.
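For reference, here is the same pipeline run against the sample from earlier in the thread, written to a hypothetical scratch file. `LC_ALL=C` is an addition that is not in the original command; it is assumed here so that sort compares bytes and the result does not depend on the locale. The first awk prefixes every line with its server name (with a leading "0" so headers sort before their data lines), sort orders by server and then timestamp while dropping duplicate headers, and the second awk removes the helper prefix again.

```shell
#!/bin/bash
# Hypothetical scratch file holding the sample input from the post.
file=$(mktemp)
cat > "$file" <<'EOF'
========== server_1 ==========
sample1.gz: 2016-09-14 02:55:30,799
sample2.gz: 2016-09-14 01:09:56,796
sample2.gz: 2016-09-14 01:09:57,927
========== server_2 ==========
========== server_3 ==========
========== server_1 ==========
sample1.gz: 2016-09-14 02:55:31,250
sample1.gz: 2016-09-14 02:55:31,777
EOF

# Tag each line with its server, sort, then strip the tag again.
sorted=$(awk '/^====/ {PRF = $2 = "0"$2} {print PRF, $0}' "$file" |
    LC_ALL=C sort -uk1,1 -k3 |
    awk '{sub ($1 FS, _); sub (/^0/, _, $2)}1')
printf '%s\n' "$sorted"

rm -f "$file"
```

One caveat: because of `-u`, two data lines for the same server with identical timestamps but different file names would be collapsed into one.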

skullnobrains

ssh messes up the quoting, and you need to run the sed on the remote machine

you might like this approach, which tends to be much easier to debug

ssh -q -o 'StrictHostKeyChecking no' deb011 sh -s <<'EOF' >/path/to/local/output/file
cd /intucell/data/logs/app || exit 1
zcat sample.gz | sed -ne '/2016-09-14 01:09:56,796/,/2016-09-14 01:46:56,438/p' | grep -A10 -i 'Do action on cell BLA330'
EOF
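If the values have to come from local variables, a variant of the here-document idea is to leave the delimiter unquoted so the local shell fills them in, while the remote side only ever sees plain single-quoted strings. This is a sketch under assumptions: the variable values are hypothetical, and none of them may contain a single quote or a slash. The ssh call is shown as a comment only and is not executed.

```shell
#!/bin/bash
# Hypothetical values standing in for the variables from the question.
first_line_log_file='sample.gz'
first_line_date_time='2016-09-14 01:09:56,796'
first_log_first_line='2016-09-14 01:46:56,438'
do_action_on_cell_2='Do action on cell BLA330'

# Unquoted EOF: the local shell expands the variables inside the text.
remote_script=$(cat <<EOF
cd /intucell/data/logs/app || exit 1
zcat '$first_line_log_file' | sed -ne '/$first_line_date_time/,/$first_log_first_line/p' | grep -A10 -i '$do_action_on_cell_2'
EOF
)

# Inspect the script before sending it anywhere.
printf '%s\n' "$remote_script"

# To run it remotely, feed it to a shell on the other side, e.g.:
# printf '%s\n' "$remote_script" | ssh -q -o "StrictHostKeyChecking no" "$server_id" sh -s >> ./output.log
```

Printing the script first makes it easy to spot stray characters before anything runs on the remote machine.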

skullnobrains

note that without the quoting mess, the script would still not produce the desired output, but it would not produce the original error either, which is caused by the spaces. the escape sequences would merely prevent the string from matching.

+ ssh -q -o 'StrictHostKeyChecking no' deb011 'cd /intucell/data/logs/app; zcat sample.gz | sed -rne "/2016-09-14' '01:09:56,796/,/2016-09-14' '01:46:56,438/p"| grep -A10 -i "Do' action on cell 'BLA330"'
sed: -e expression #1, char 62: unterminated address regex

the string is passed to ssh as the following arguments
ARG : cd /intucell/data/logs/app; zcat sample.gz | sed -rne "/2016-09-14
ARG : 01:09:56,796/,/2016-09-14
ARG : 01:46:56,438/p"| grep -A10 -i "Do
ARG : ...

this string ends up broken up as follows, including the double quotes
( actually i'm unsure about the double-quote handling in that case )
ARG : ...
ARG : "/2016-09-14
ARG : 01:09:56,796/,/2016-09-14
ARG : 01:46:56,438/p"
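The local word splitting described above can be reproduced without ssh. The sketch below uses two made-up helper functions, `show_args` and `count_args`, and a hypothetical timestamp. It shows how an unquoted expansion between two single-quoted pieces is split at the space, and how wrapping the expansion in double quotes keeps it in one argument. As far as I know, ssh then joins its command arguments with single spaces before handing them to the remote shell, which would explain why the "debug1: Sending command:" line in the earlier trace shows the timestamps intact.

```shell
#!/bin/bash
# Hypothetical helpers: print each argument on its own line / count them.
show_args()  { for a in "$@"; do printf 'ARG : %s\n' "$a"; done; }
count_args() { echo "$#"; }

# Hypothetical value with an embedded space, as in the question.
first_line_date_time='2016-09-14 01:09:56,796'

# Unquoted expansion between single-quoted pieces: split at the space.
show_args 'sed -ne "/'$first_line_date_time'/p"'
split=$(count_args 'sed -ne "/'$first_line_date_time'/p"')

# Double-quoted expansion: the whole command stays one argument.
show_args 'sed -ne "/'"$first_line_date_time"'/p"'
kept=$(count_args 'sed -ne "/'"$first_line_date_time"'/p"')

echo "unquoted: $split argument(s), quoted: $kept argument(s)"
```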

best regards