joaotelles
asked on
Shell - syntax to sort lines
Hi,
I have the following situation and I'm not able to build a shell command line for it. I hope it's not too complicated to explain.
Using this command I get these lines from files that are generated daily, going back 7 days:
find /var/opt/smarttreee/dpa/log/event/output/TrafficErrorEvents* -type f -mtime -7 -print0 | xargs -0 grep MDMD3
For this I get lines like this as output:
/var/opt/smarttreee/dpa/log/event/output/TrafficErrorEvents20130917.0340:20130917054940#20130917054940#6388920287#MDMD3#mdm1a#1#6388920288#3#ConfigurationName=SP_ADD#ConfigurationVersion=2.2#DestinationAddress=59175959807#IMEI=35801604028935#MSISDN=59175959807#TerminalId=35801604028935#
/var/opt/smarttrust/dpa/log/event/output/TrafficErrorEvents20130917.0340:20130917054940#20130917054940#6388920282#MDMD3#mdm1a#1#6388920283#3#ConfigurationName=SP_ADD#ConfigurationVersion=2.2#DestinationAddress=59178927867#IMEI=35210177000649#MSISDN=59178927867#TerminalId=35210177000649#
====
Some of these lines have this particular part duplicated:
IMEI=35210177
Not necessarily this number. For example, I can have something like this:
/var/opt/smarttreee/dpa/log/event/output/TrafficErrorEvents20130917.0340:20130917054936#20130917054936#6388925052#MDMD3#mdm2a#1#6388925053#3#ConfigurationName=SP_ADD#ConfigurationVersion=2.2#DestinationAddress=59178470041#IMEI=35828141810883#MSISDN=59178470041#TerminalId=35828141810883#
/var/opt/smarttreee/dpa/log/event/output/TrafficErrorEvents20130917.0340:20130917054940#20130917054940#6388920276#MDMD3#mdm1a#1#6388920277#3#ConfigurationName=SP_ADD#ConfigurationVersion=2.2#DestinationAddress=59178961090#IMEI=35828141490515#MSISDN=59178961090#TerminalId=35763676490515#
Note the duplicated IMEI=35828141.
=====
So what I need is the count of lines (wc -l) after leaving out the lines that duplicate the IMEI part mentioned above.
For example, for the four lines I posted above, I would get a count of 3, since two of them share the same IMEI=XXXXXXXX.
Is this possible?
Tks,
Joao
perl -le ' /#IMEI=(\w{8})/ && ++$c{$1} for </var/opt/smarttreee/dpa/log/event/output/TrafficErrorEvents*MDMD3*>;print scalar keys %c'
How did you get
/var/opt/smarttrust/dpa/log/event/output/TrafficErrorEvents20130917.0340:20130917054940#20130917054940#6388920282#MDMD3#mdm1a#1#6388920283#3#ConfigurationName=SP_ADD#ConfigurationVersion=2.2#DestinationAddress=59178927867#IMEI=35210177000649#MSISDN=59178927867#TerminalId=35210177000649#
as output from
find /var/opt/smarttreee/dpa/log/event/output/TrafficErrorEvents*
?
perl -le '-f && 7 > -M && /#IMEI=(\w{8})/ && ++$c{$1} for </var/opt/smarttreee/dpa/log/event/output/TrafficErrorEvents*MDMD3*>;print scalar keys %c'
ASKER
I'm sorry, I mixed up the outputs. The right one is:
/var/opt/smarttreee/dpa/log/event/output/TrafficErrorEvents20130917.0340:20130917054940#20130917054940#6388920282#MDMD3#mdm1a#1#6388920283#3#ConfigurationName=SP_ADD#ConfigurationVersion=2.2#DestinationAddress=59178927867#IMEI=35210177000649#MSISDN=59178927867#TerminalId=35210177000649#
ASKER
Sorry for the newbie question, but do I have to include anything to put this command in a script?
perl -le '-f && 7 > -M && /#IMEI=(\w{8})/ && ++$c{$1} for </var/opt/smarttreee/dpa/log/event/output/TrafficErrorEvents*MDMD3*>;print scalar keys %c'
Something like this?
#!/bin/perl
Or is this enough?
#!/bin/sh
You should be able to include whatever you had included when you put your
find /var/opt/smarttreee/dpa/log/event/output/TrafficErrorEvents* -type f -mtime -7 -print0 | xargs -0 grep MDMD3 | wc -l
command in a script.
ASKER
It didn't work:
> perl -le '-f && 7 > -M && /#IMEI=(\w{8})/ && ++$c{$1} for </var/opt/smarttreee/dpa/log/event/output/TrafficErrorEvents*MDMD3*>;print scalar keys %c'
0
> pwd
/var/opt/smarttreee/dpa/log/event/output
And it has lines with MDMD3:
> grep MDMD3 TrafficErrorEvents20130924.0695 | more
20130924084946#20130924084946#6408537165#MDMD3#mdm2a#1#6408537166#3#ConfigurationName=SP_ADD#ConfigurationVersion=2.2#DestinationAddress=59176159059#IMEI=38401303344524
#MSISDN=59176159059#TerminalId=38401303344524#
20130924084951#20130924084951#6408537183#MDMD3#mdm2a#1#6408537184#3#ConfigurationName=SP_ADD#ConfigurationVersion=2.2#DestinationAddress=59177254495#IMEI=35598682364489
#MSISDN=59177254495#TerminalId=35598682364489#
I'm not sure if this is a problem, but I have more than one file per day, and I need the last 7 days to be analyzed (so this could mean more than 7 files). For example:
TrafficErrorEvents20130923.0689
TrafficErrorEvents20130923.0690
TrafficErrorEvents20130923.0691
TrafficErrorEvents20130924.0692
TrafficErrorEvents20130924.0693
TrafficErrorEvents20130924.0694
TrafficErrorEvents20130924.0695
Tks,
Joao
Sorry, I thought
/var/opt/smarttreee/dpa/log/event/output/TrafficErrorEvents20130917.0340:20130917054940#20130917054940#6388920287#MDMD3#mdm1a#1#6388920288#3#ConfigurationName=SP_ADD#ConfigurationVersion=2.2#DestinationAddress=59175959807#IMEI=35801604028935#MSISDN=59175959807#TerminalId=35801604028935#
was the name of the file.
If it is a line in the file, then the command should be
perl -lne '/MDMD3/ && /#IMEI=(\w{8})/ && ++$c{$1}; END{print scalar keys %c}' TrafficErrorEvents20130924.0695
ASKER
Tks!
> perl -lne '/MDMD3/ && /#IMEI=(\w{8})/ && ++$c{$1}; END{print scalar keys %c}' TrafficErrorEvents20130924.0695
606
But this analyzes only one file. I need something that would analyze the files from the last 7 days
(note that I can have more than one file per day, as I highlighted in my last post).
For example:
TrafficErrorEvents20130923.0689
TrafficErrorEvents20130923.0690
TrafficErrorEvents20130923.0691
TrafficErrorEvents20130924.0692
TrafficErrorEvents20130924.0693
TrafficErrorEvents20130924.0694
TrafficErrorEvents20130924.0695
perl -lne '/MDMD3/ && /#IMEI=(\w{8})/ && ++$c{$1}; END{print scalar keys %c}' TrafficErrorEvents2013092[34]*
or
perl -lne 'BEGIN{@ARGV=grep-f && 7 > -M,<*> unless @ARGV}/MDMD3/ && /#IMEI=(\w{8})/ && ++$c{$1}; END{print scalar keys %c}'
ASKER
I'm getting different results for them:
> perl -lne 'BEGIN{@ARGV=grep-f && 7 > -M,<*> unless @ARGV}/MDMD3/ && /#IMEI=(\w{8})/ && ++$c{$1}; END{print scalar keys %c}'
31462
> perl -lne '/MDMD3/ && /#IMEI=(\w{8})/ && ++$c{$1}; END{print scalar keys %c}' TrafficErrorEvents2013092[34]*
5673
Is there a way to check which files each command is looking into? Like printing the lines instead of the number of lines? Just to check whether the count is using the right files.
For the first one I have to be in the files' directory, right? (That's OK.)
Tks,
Joao
perl -lne 'BEGIN{@ARGV=grep-f && 7 > -M,<*> unless @ARGV}/MDMD3/ && /#IMEI=(\w{8})/ && !$c{$1}++ && print'
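The two counts most likely differ because the commands read different sets of files: the glob `TrafficErrorEvents2013092[34]*` names only the files dated 23 and 24 September, while the `BEGIN` version takes every regular file in the current directory whose modification time is less than 7 days old. To check the file set directly, a small helper like the one below could list the candidates before counting. This is only a sketch, not something posted in the thread: the function names are made up, and it uses `find -mtime` as a rough stand-in for the `7 > -M` test.

```shell
# list_recent_files DIR DAYS   (hypothetical helper name)
# Print regular files under DIR named TrafficErrorEvents* that were
# modified less than DAYS days ago. Note that find also descends into
# subdirectories, which the perl <*> glob does not.
list_recent_files() {
    find "$1" -type f -name 'TrafficErrorEvents*' -mtime "-$2" | sort
}

# show_mdmd3_counts DIR DAYS   (hypothetical helper name)
# For each selected file, print the number of lines containing MDMD3
# followed by the file name, so the input set can be checked by eye.
show_mdmd3_counts() {
    list_recent_files "$1" "$2" | while IFS= read -r f; do
        printf '%s %s\n' "$(grep -c MDMD3 "$f")" "$f"
    done
}
```

Running `show_mdmd3_counts . 7` from the output directory would print one line per file, which can then be compared with what `ls TrafficErrorEvents2013092[34]*` shows.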
ASKER
Tks! I will test it.
ASKER
Tks.. it is working perfectly!
I will use this one:
> perl -lne 'BEGIN{@ARGV=grep-f && 7 > -M,<*> unless @ARGV}/MDMD3/ && /#IMEI=(\w{8})/ && ++$c{$1}; END{print scalar keys %c}'
===
I have one last question: is there a way for it not to take the files that have the timestamp of the current day?
For example, let's say I have files like this:
TrafficErrorEvents20130917.0689
.
.
TrafficErrorEvents20130921.0689
TrafficErrorEvents20130922.0690
TrafficErrorEvents20130923.0691
TrafficErrorEvents20130923.0692
TrafficErrorEvents20130924.0693
TrafficErrorEvents20130924.0694
Using your command I'm reading the files from 09/24 back to 09/17 (considering today as 09/24).
So, is there a way for it to read the files from 09/23 back to 09/17, NOT reading the files from 09/24?
Tks,
Joao Telles
Does it have to be by the timestamp in file name, or can it be by the modification time like the 7 day cutoff?
Does it have to be the beginning of the current day, or can it be 24 hours ago?
Would you also want to modify the 7 day cutoff to be based on the name of the file rather than the modification time, and should 7 days mean something like 144 hours before the beginning of the current day (or 143 or 145 if a daylight saving switchover occurred) instead of 168 hours ago like it is now?
ASKER
It has to be from the beginning of the current day back to 7 days ago, so not 24 hours, because 24 hours might eliminate a file from yesterday and this can't happen.
If you do it by the hour, I would have to run it at a specific time of day to get the 7 days as described above; otherwise it would eliminate a file from yesterday.
So I think it has to be by the timestamp in the file.
By "timestamp" do you mean in the name of the file, or the modification (or creation) time in the file status information?
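If the date embedded in the file name is what counts, the selection can be made on the name alone, independent of when the script runs. The following is a minimal sketch under that assumption. The function name and its two arguments are hypothetical: the caller passes the first day to include and today's date, both as YYYYMMDD, and names dated today or later are dropped.

```shell
# filter_by_name_date FIRST TODAY   (hypothetical helper name)
# Read file names on stdin and print those whose embedded date D
# (the 8 digits right after "TrafficErrorEvents") satisfies
# FIRST <= D < TODAY. Fixed-width YYYYMMDD values order correctly
# whether compared as strings or as numbers.
filter_by_name_date() {
    awk -v first="$1" -v today="$2" '
        match($0, /TrafficErrorEvents[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]/) {
            d = substr($0, RSTART + 18, 8)   # 18 = length("TrafficErrorEvents")
            if (d >= first && d < today) print
        }'
}

# Example: keep 09/17 through 09/23 when today is 09/24.
# ls | filter_by_name_date 20130917 20130924
```

The resulting list could then be handed to the counting command, for example through `xargs`. On systems with GNU date the two bounds could be computed with `date -d '7 days ago' +%Y%m%d` and `date +%Y%m%d`; the `-d` option is a GNU extension and may not be available elsewhere.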
ASKER
It's fine like this, but I'm getting the lines as output:
20130918000021#20130918000021#6394535421#MDMD3#mdm2a#1#6394535422#3#ConfigurationName=SP_ADD#ConfigurationVersion=2.2#DestinationAddress=59176807441#IMEI=35662200145207
#MSISDN=59176807441#TerminalId=35662200145207#
Can you make it output the number of lines?
Tks,
Joao
sed -ne 's/.*MDMD.*IMEI=\([0-9]*\).*/\1/p' $(find /var/opt/smarttreee/dpa/log/event/output/TrafficErrorEvents* -type f -mtime -7) | uniq | wc -l
The above assumes duplicates follow one another. If not, you'd have to put a sort before the uniq. awk would be more efficient.
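An awk version along those lines might look like the sketch below. It records each value in an array, so duplicates need not be adjacent and no sort is needed. Two assumptions to flag: it keys on the first 8 characters after `#IMEI=`, as the perl commands earlier in this thread do (the sed command above compares the whole IMEI instead), and the function name is made up for illustration.

```shell
# count_distinct_imei_prefix [FILE...]   (hypothetical helper name)
# Count distinct 8-character IMEI prefixes among lines containing MDMD3.
# Reads the named files, or stdin when no file is given.
count_distinct_imei_prefix() {
    awk '
        /MDMD3/ && match($0, /#IMEI=[0-9A-Za-z_]+/) {
            imei = substr($0, RSTART + 6, RLENGTH - 6)   # text after "#IMEI="
            if (length(imei) >= 8)
                seen[substr(imei, 1, 8)] = 1             # remember the prefix once
        }
        END {
            n = 0
            for (k in seen) n++
            print n
        }' "$@"
}
```

It could be combined with the original file selection, for example `find ... -type f -mtime -7 -print0 | xargs -0 cat | count_distinct_imei_prefix`, so that the count spans all selected files rather than one file at a time.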
ASKER
Tks.