bt707 (United States of America) asked:
AWK and Sort

I have an awk command I need to modify. I need to:

1) find all lines that contain "refused" in them (usually in column 18, but not always);
2) find all lines that have the same value in column 7;
3) print to screen one copy of each distinct value it finds in column 7 and the number of times it recurred in the file,
   and print out columns 6, 7, and 18.


Command I need to modify:

awk '/refused/ {print $6, $7, $18}' log_file | sort | uniq -c | sort -nr
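For what it's worth, the grouping and counting can also be done entirely inside awk with an associative array, which keeps fields 6, 7 and 18 exactly as asked for (a sketch only; adjust the field numbers if your log layout differs):

```shell
# Count how many times each distinct column-7 value appears on "refused"
# lines, remembering one sample of fields 6, 7 and 18 for each value.
awk '/refused/ {
         count[$7]++                     # tally per column-7 value
         line[$7] = $6 " " $7 " " $18    # keep one sample per value
     }
     END { for (k in count) print count[k], line[k] }' log_file | sort -rn
```

Note that a later line overwrites the earlier sample for the same key; guard the assignment with `if (!($7 in line))` if you want to keep the first occurrence instead.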


Example of what I want it to look like:



22-Nov-2004 01:35:54.44 tcp_in              Q 5 jcfaxhgibblyu@yyhmail.com rfc822;find@com find@com /mail/06/queue/tcp_in/002/ZR0I7Z00AF6W3UDF.00 <FYCQIMPNYZQGGRYOZKLQHMEM@wonet.com>  TCP active open: Failed connect()    Error: Connection refused


22-Nov-2004 01:35:54.44 tcp_in              Q 5 jcfaxhgibblyu@yyhmail.com rfc822;newmail.com newmail.com /mail/06/queue/tcp_in/002/ZR0I7Z00AF6W3UDF.00 <FYCQIMPNYZQGGRYOZKLQHMEM@wonet.com>  TCP active open: Failed connect()    Error: Connection refused


22-Nov-2004 01:35:54.44 tcp_in              Q 5 mymail.com rfc822;find@com find@com /mail/06/queue/tcp_in/002/IUYGHTREAF6W3UDF.00 <OIUYGMPNYZQGGRYOIUYHKJ@wonet.com>  TCP active open: Failed connect()    Error: Connection refused


22-Nov-2004 01:35:54.44 tcp_in              Q 5 jcfaxhgibblyu@yyhmail.com rfc822;byrd@villahermosa.com byrd@villahermosa.com /mail/06/queue/tcp_in/002/ZR0I7Z00AF6W3UDF.00 <FYCQIMPNYZQGGRYOZKLQHMEM@wonet.com>  TCP active open: Failed connect()    Error: Connection refused




2 jcfaxhgibblyu@yyhmail.com rfc822;find@com find@com
1 jcfaxhgibblyu@yyhmail.com rfc822;newmail.com
1 jcfaxhgibblyu@yyhmail.com rfc822;byrd@villahermosa.com


Thanks,
ASKER CERTIFIED SOLUTION
tfewster (United Kingdom of Great Britain and Northern Ireland)
bt707 (ASKER)
That's what I was looking at, but I can't seem to get it working.

What I need to do is look at the 7th field and find all lines that have the 7th field duplicated, then print out how many times that same value appeared
in the 7th field.


Thanks,
Now I'm not sure what the problem is; my amendment to your command does the following:
- Sorts the output on the second field, i.e. rfc822;address@domain, to bring the duplicates together (this IS the seventh field of the log_file).
- Does a "uniq" on the second (and subsequent) fields. If two ADJACENT lines contain the same second (and subsequent) fields, only one line is printed, plus a total. Note that because the first field is ignored in the "uniq", some of the "from" addresses (field 6 in the log, i.e. field 1 in the extract) are lost.

In your example output, you seem to be extracting fields 6, 7 & 8 (not 18), as the email address is repeated (or is it a domain? I'm not sure). But the principle still stands... I tested it on your input and got the sort/summary you posted in your example. Or am I missing something?
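Since the accepted solution text itself is not shown here, the amended pipeline as described above would look roughly like this (an assumed reconstruction, not the verbatim accepted answer): sort on the second extracted field, then uniq while skipping the first field.

```shell
# Extract fields 6, 7 and 8 from "refused" lines, bring duplicates together
# by sorting on the second extracted field, then count adjacent duplicates
# while ignoring the first field (the "from" address).
awk '/refused/ {print $6, $7, $8}' log_file |
    sort -k2 |        # group identical rfc822;... values together
    uniq -c -f 1 |    # count duplicates, skipping the first field
    sort -rn          # most frequent first
```

As noted above, uniq keeps only one of the "from" addresses per group of duplicates.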
bt707 (ASKER)
No, I'm just not explaining it very well, I guess.

What you posted worked great for what I was looking for. I'm now trying to modify the results from that; I was printing out the other lines, but they are not really necessary.

I'll close this one and open up a new question for the next part of what I am trying to do.

This part is working great; your command was just right. I was trying to make it too complicated and missed it.


Thanks again,