Shell - syntax to sort lines

Hi,

I have the following situation and I'm not able to build a shell command line for it. I hope it's not too complicated to explain.

Using this command, I get the matching lines from files that are generated daily, going back 7 days:

find /var/opt/smarttreee/dpa/log/event/output/TrafficErrorEvents* -type f -mtime -7 -print0 | xargs -0 grep MDMD3

For this I get lines like this as output:

/var/opt/smarttreee/dpa/log/event/output/TrafficErrorEvents20130917.0340:20130917054940#20130917054940#6388920287#MDMD3#mdm1a#1#6388920288#3#ConfigurationName=SP_ADD#ConfigurationVersion=2.2#DestinationAddress=59175959807#IMEI=35801604028935#MSISDN=59175959807#TerminalId=35801604028935#
/var/opt/smarttrust/dpa/log/event/output/TrafficErrorEvents20130917.0340:20130917054940#20130917054940#6388920282#MDMD3#mdm1a#1#6388920283#3#ConfigurationName=SP_ADD#ConfigurationVersion=2.2#DestinationAddress=59178927867#IMEI=35210177000649#MSISDN=59178927867#TerminalId=35210177000649#

====

Some of these lines have this particular part (the first 8 digits of the IMEI) duplicated:

IMEI=35210177

Not necessarily this number. For example, I can have something like this:

 /var/opt/smarttreee/dpa/log/event/output/TrafficErrorEvents20130917.0340:20130917054936#20130917054936#6388925052#MDMD3#mdm2a#1#6388925053#3#ConfigurationName=SP_ADD#ConfigurationVersion=2.2#DestinationAddress=59178470041#IMEI=35828141810883#MSISDN=59178470041#TerminalId=35828141810883#
/var/opt/smarttreee/dpa/log/event/output/TrafficErrorEvents20130917.0340:20130917054940#20130917054940#6388920276#MDMD3#mdm1a#1#6388920277#3#ConfigurationName=SP_ADD#ConfigurationVersion=2.2#DestinationAddress=59178961090#IMEI=35828141490515#MSISDN=59178961090#TerminalId=35763676490515#

Note that IMEI=35828141 is duplicated.

=====

So what I need is the count of lines (wc -l) that don't have the IMEI part mentioned above duplicated.

For example, for the four lines I posted above, I would get a count of 3, since two of them have the IMEI=XXXXXXXX part duplicated.

Is this possible?

Tks,
Joao
joaotellesAsked:

ozoCommented:
perl -le ' /#IMEI=(\w{8})/ && ++$c{$1} for </var/opt/smarttreee/dpa/log/event/output/TrafficErrorEvents*MDMD3*>;print scalar keys %c'
0
ozoCommented:
How did you get
/var/opt/smarttrust/dpa/log/event/output/TrafficErrorEvents20130917.0340:20130917054940#20130917054940#6388920282#MDMD3#mdm1a#1#6388920283#3#ConfigurationName=SP_ADD#ConfigurationVersion=2.2#DestinationAddress=59178927867#IMEI=35210177000649#MSISDN=59178927867#TerminalId=35210177000649#
as output from
find /var/opt/smarttreee/dpa/log/event/output/TrafficErrorEvents*
?
0
ozoCommented:
perl -le '-f && 7 > -M && /#IMEI=(\w{8})/ && ++$c{$1} for </var/opt/smarttreee/dpa/log/event/output/TrafficErrorEvents*MDMD3*>;print scalar keys %c'
0
joaotellesAuthor Commented:
I'm sorry, I mixed up the outputs. The right one is:

/var/opt/smarttreee/dpa/log/event/output/TrafficErrorEvents20130917.0340:20130917054940#20130917054940#6388920282#MDMD3#mdm1a#1#6388920283#3#ConfigurationName=SP_ADD#ConfigurationVersion=2.2#DestinationAddress=59178927867#IMEI=35210177000649#MSISDN=59178927867#TerminalId=35210177000649#
0
joaotellesAuthor Commented:
Sorry for the newbie question, but do I have to include anything to put this command in a script?

perl -le '-f && 7 > -M && /#IMEI=(\w{8})/ && ++$c{$1} for </var/opt/smarttreee/dpa/log/event/output/TrafficErrorEvents*MDMD3*>;print scalar keys %c'

Something like this?

#!/bin/perl

Or is this enough?

#!/bin/sh
0
ozoCommented:
You should be able to include whatever you had included when you put your
find /var/opt/smarttreee/dpa/log/event/output/TrafficErrorEvents* -type f -mtime -7 -print0 | xargs -0 grep MDMD3 | wc -l
command in a script
0
joaotellesAuthor Commented:
It didn't work.

> perl -le '-f && 7 > -M && /#IMEI=(\w{8})/ && ++$c{$1} for </var/opt/smarttreee/dpa/log/event/output/TrafficErrorEvents*MDMD3*>;print scalar keys %c'
0

> pwd
/var/opt/smarttreee/dpa/log/event/output

And the file has lines with MDMD3:

> grep MDMD3 TrafficErrorEvents20130924.0695 | more
20130924084946#20130924084946#6408537165#MDMD3#mdm2a#1#6408537166#3#ConfigurationName=SP_ADD#ConfigurationVersion=2.2#DestinationAddress=59176159059#IMEI=38401303344524#MSISDN=59176159059#TerminalId=38401303344524#
20130924084951#20130924084951#6408537183#MDMD3#mdm2a#1#6408537184#3#ConfigurationName=SP_ADD#ConfigurationVersion=2.2#DestinationAddress=59177254495#IMEI=35598682364489#MSISDN=59177254495#TerminalId=35598682364489#

Not sure if this is a problem, but I have more than one file per day. I need the last 7 days to be analyzed, so this could mean more than 7 files. For example:

TrafficErrorEvents20130923.0689
TrafficErrorEvents20130923.0690
TrafficErrorEvents20130923.0691
TrafficErrorEvents20130924.0692
TrafficErrorEvents20130924.0693
TrafficErrorEvents20130924.0694
TrafficErrorEvents20130924.0695

Tks,
Joao
0
ozoCommented:
Sorry, I thought
/var/opt/smarttreee/dpa/log/event/output/TrafficErrorEvents20130917.0340:20130917054940#20130917054940#6388920287#MDMD3#mdm1a#1#6388920288#3#ConfigurationName=SP_ADD#ConfigurationVersion=2.2#DestinationAddress=59175959807#IMEI=35801604028935#MSISDN=59175959807#TerminalId=35801604028935#
was the name of the file.
If it is a line in the file, then the command should be:
perl -lne '/MDMD3/ && /#IMEI=(\w{8})/ && ++$c{$1}; END{print scalar keys %c}' TrafficErrorEvents20130924.0695
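For readers who prefer a plain shell pipeline, the same count can be sketched with grep, sed and sort. This is only an approximation of the Perl one-liner above: it assumes the IMEI prefix is always 8 digits (the Perl version accepts any 8 word characters), and the function name count_imei_prefixes is made up for illustration.

```shell
# Count distinct 8-digit IMEI prefixes on lines containing MDMD3.
# Reads the files given as arguments, or standard input if none are given.
count_imei_prefixes() {
    grep -h 'MDMD3' "$@" |
        # keep only the first 8 digits that follow "#IMEI="
        sed -n 's/.*#IMEI=\([0-9]\{8\}\).*/\1/p' |
        # collapse duplicates, wherever they appear in the input
        sort -u |
        # count what is left; tr strips the padding some wc versions add
        wc -l | tr -d ' '
}
```

It could be called as count_imei_prefixes TrafficErrorEvents20130924.0695, or with several file names at once.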
0
joaotellesAuthor Commented:
Tks!

> perl -lne '/MDMD3/ && /#IMEI=(\w{8})/ && ++$c{$1}; END{print scalar keys %c}' TrafficErrorEvents20130924.0695
606

But this analyzes only one file. I need something that would analyze the files from the last 7 days

(note that I can have more than one file per day, as I highlighted in my last post).

For example:

TrafficErrorEvents20130923.0689
TrafficErrorEvents20130923.0690
TrafficErrorEvents20130923.0691
TrafficErrorEvents20130924.0692
TrafficErrorEvents20130924.0693
TrafficErrorEvents20130924.0694
TrafficErrorEvents20130924.0695
0
ozoCommented:
perl -lne '/MDMD3/ && /#IMEI=(\w{8})/ && ++$c{$1}; END{print scalar keys %c}'  TrafficErrorEvents2013092[34]*
or
perl -lne 'BEGIN{@ARGV=grep-f && 7 > -M,<*> unless @ARGV}/MDMD3/ && /#IMEI=(\w{8})/ && ++$c{$1}; END{print scalar keys %c}'
0
joaotellesAuthor Commented:
I'm getting different results for them:

> perl -lne 'BEGIN{@ARGV=grep-f && 7 > -M,<*> unless @ARGV}/MDMD3/ && /#IMEI=(\w{8})/ && ++$c{$1}; END{print scalar keys %c}'
31462

> perl -lne '/MDMD3/ && /#IMEI=(\w{8})/ && ++$c{$1}; END{print scalar keys %c}'  TrafficErrorEvents2013092[34]*
5673

Is there a way to check which files each command is looking into? Like print the lines instead of the number of lines? Just to check whether the count is using the right files.

For the first one, I have to be in the files' directory, right? (That's OK.)

Tks,
Joao
0
ozoCommented:
perl -lne 'BEGIN{@ARGV=grep-f && 7 > -M,<*> unless @ARGV}/MDMD3/ && /#IMEI=(\w{8})/ && !$c{$1}++ && print'
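To see which files a modification-time filter would pick up, rather than which lines, something like the sketch below may help. It is not part of the command above: it uses find with -mtime -7, which is close to but not exactly the same test as Perl's 7 > -M, and it relies on -maxdepth, which GNU and BSD find support but POSIX does not require. The function name list_recent_files and its optional pattern argument are hypothetical.

```shell
# List regular files directly inside a directory that were modified
# less than 7 days ago. Arguments: directory (default "."),
# then an optional file name pattern (default "*").
list_recent_files() {
    dir=${1:-.}
    pattern=${2:-*}
    # -maxdepth 1 keeps find from descending into subdirectories
    find "$dir" -maxdepth 1 -type f -name "$pattern" -mtime -7 | sort
}
```

Running list_recent_files . 'TrafficErrorEvents*' in the log directory and comparing the result with ls TrafficErrorEvents2013092[34]* should show why the two counts differ.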
0
joaotellesAuthor Commented:
Tks! I will test it.
0
joaotellesAuthor Commented:
Tks.. it is working perfectly!

I will use this one:
> perl -lne 'BEGIN{@ARGV=grep-f && 7 > -M,<*> unless @ARGV}/MDMD3/ && /#IMEI=(\w{8})/ && ++$c{$1}; END{print scalar keys %c}'

===

I have one last question: is there a way for it not to take the files that have the timestamp of the current day?

For example, let's say I have files like this:

TrafficErrorEvents20130917.0689
.
.
TrafficErrorEvents20130921.0689
TrafficErrorEvents20130922.0690
TrafficErrorEvents20130923.0691
TrafficErrorEvents20130923.0692
TrafficErrorEvents20130924.0693
TrafficErrorEvents20130924.0694

Using your command, I'm reading the files from 09/24 back to 09/17 (considering today as 09/24).

So, is there a way for it to read the files from 09/23 back to 09/17, NOT reading the files from 09/24?

Tks,
Joao Telles
0
ozoCommented:
Does it have to be by the timestamp in file name, or can it be by the modification time like the 7 day cutoff?
Does it have to be the beginning of the current day, or can it be 24 hours ago?
Would you also want the 7 day cutoff to be based on the name of the file rather than the modification time? And should 7 days mean something like 144 hours before the beginning of the current day (or 143 or 145 if a daylight saving switchover occurred), instead of 168 hours ago as it is now?
0
joaotellesAuthor Commented:
It has to be from the beginning of the current day back to 7 days ago. So not 24 hours, because 24 hours might eliminate a file from yesterday, and this can't happen.

If you do it by the hour, I would have to run it at a specific time of day to get the 7 days as described above. Otherwise it would eliminate a file from yesterday.

So I think it has to be by the timestamp in the file.
0
ozoCommented:
By "timestamp" do you mean the date in the name of the file, or the modification (or creation) time in the file status information?
0
ozoCommented:
Using the date in the name of the file:
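perl -MPOSIX -lne 'BEGIN{
# today as YYYYMMDD
$day=strftime"%Y%m%d",localtime time;
# about noon seven days ago: back to the start of today, then 12 hours plus 6 days more
$week=strftime"%Y%m%d",localtime time-60*60*((localtime)[2]+12+24*6);
# keep the files whose name contains a date with $week <= date < $day
@ARGV=grep{/(\d{8})/&& $week <= $1 && $1 < $day}<TrafficErrorEvents*>;
}
# print the first line seen for each distinct 8-character IMEI prefix
/MDMD3/ && /#IMEI=(\w{8})/ && !$c{$1}++ && print;
END{print scalar keys %c}'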

Or would it make more sense to use the date at the beginning of the lines in the file?
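The file-name date filter can also be sketched in plain shell, which makes the "from 7 days ago up to, but not including, today" window easy to check by hand. This is an illustration under assumptions, not the command above: the function name files_in_date_range is hypothetical, the two bounds are passed in as YYYYMMDD strings rather than computed, and names are assumed to look like TrafficErrorEventsYYYYMMDD.NNNN.

```shell
# Print the file names (given as arguments after the two bounds) whose
# embedded YYYYMMDD date satisfies start <= date < end.
# Names that do not fit the assumed pattern are skipped.
files_in_date_range() {
    start=$1
    end=$2
    shift 2
    for f in "$@"; do
        base=${f##*/}                  # strip any directory part
        d=${base#TrafficErrorEvents}   # drop the fixed prefix
        d=${d%%.*}                     # drop the ".0689" style suffix
        case $d in
            [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
            *) continue ;;             # not an 8-digit date, ignore
        esac
        if [ "$d" -ge "$start" ] && [ "$d" -lt "$end" ]; then
            printf '%s\n' "$f"
        fi
    done
}
```

With GNU date the bounds could be computed as $(date -d '7 days ago' +%Y%m%d) and $(date +%Y%m%d), for example files_in_date_range "$(date -d '7 days ago' +%Y%m%d)" "$(date +%Y%m%d)" TrafficErrorEvents*. The -d option is a GNU extension, so this part is an assumption about the system in use.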
0

joaotellesAuthor Commented:
It's fine like this, but I'm getting the lines as output:

20130918000021#20130918000021#6394535421#MDMD3#mdm2a#1#6394535422#3#ConfigurationName=SP_ADD#ConfigurationVersion=2.2#DestinationAddress=59176807441#IMEI=35662200145207#MSISDN=59176807441#TerminalId=35662200145207#

Can you make it output only the number of lines?

Tks,
Joao
0
ozoCommented:
Just remove the && print.
0
skullnobrainsCommented:
sed -ne 's/.*MDMD3.*IMEI=\([0-9]\{8\}\).*/\1/p' $(find /var/opt/smarttreee/dpa/log/event/output/TrafficErrorEvents* -type f -mtime -7) | uniq | wc -l

The above assumes duplicates follow one another. If not, you'd have to put a sort before the uniq. awk would be more efficient.
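As a rough sketch of the awk idea, the version below counts distinct prefixes in a single pass without needing sort or uniq, so duplicates do not have to be adjacent. It is an assumption-laden illustration: the function name count_imei_prefixes_awk is made up, and the prefix is taken to be exactly 8 digits following "#IMEI=".

```shell
# Count distinct 8-digit IMEI prefixes on lines containing MDMD3,
# remembering each prefix in an awk array instead of sorting.
# Reads the files given as arguments, or standard input if none are given.
count_imei_prefixes_awk() {
    awk '
        /MDMD3/ {
            i = index($0, "#IMEI=")
            if (i == 0) next                 # no IMEI field on this line
            key = substr($0, i + 6, 8)       # the 8 characters after "#IMEI="
            if (length(key) == 8 && key ~ /^[0-9]+$/ && !(key in seen)) {
                seen[key] = 1
                n++
            }
        }
        END { print n + 0 }                  # n + 0 prints 0 when nothing matched
    ' "$@"
}
```

It can be fed from find in the same way as the sed command, for example find ... -type f -mtime -7 -print0 | xargs -0 cat | count_imei_prefixes_awk, with the find arguments left as in the original question.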
0
joaotellesAuthor Commented:
Tks.
0