bt707
asked on
AWK and Sort
I have an awk command I need to modify. I need to:
1) Find all lines that contain "refused" (usually in column 18, but not always).
2) Find all lines that have the same value in column 7.
3) Print to screen one copy of each different value found in column 7, the number of times it occurred in the file, and columns 6, 7, and 18.
Command I need to modify:
awk '/refused/ {print $6, $7, $18}' log_file | sort | uniq -c | sort -nr
Example of what I want it to look like:
22-Nov-2004 01:35:54.44 tcp_in Q 5 jcfaxhgibblyu@yyhmail.com rfc822;find@com find@com /mail/06/queue/tcp_in/002/ ZR0I7Z00AF 6W3UDF.00 <FYCQIMPNYZQGGRYOZKLQHMEM@wonet.com> TCP active open: Failed connect() Error: Connection refused
22-Nov-2004 01:35:54.44 tcp_in Q 5 jcfaxhgibblyu@yyhmail.com rfc822;newmail.com newmail.com /mail/06/queue/tcp_in/002/ ZR0I7Z00AF 6W3UDF.00 <FYCQIMPNYZQGGRYOZKLQHMEM@wonet.com> TCP active open: Failed connect() Error: Connection refused
22-Nov-2004 01:35:54.44 tcp_in Q 5 mymail.com rfc822;find@com find@com /mail/06/queue/tcp_in/002/ IUYGHTREAF 6W3UDF.00 <OIUYGMPNYZQGGRYOIUYHKJ@wonet.com> TCP active open: Failed connect() Error: Connection refused
22-Nov-2004 01:35:54.44 tcp_in Q 5 jcfaxhgibblyu@yyhmail.com rfc822;byrd@villahermosa.com byrd@villahermosa.com /mail/06/queue/tcp_in/002/ ZR0I7Z00AF 6W3UDF.00 <FYCQIMPNYZQGGRYOZKLQHMEM@wonet.com> TCP active open: Failed connect() Error: Connection refused
2 jcfaxhgibblyu@yyhmail.com rfc822;find@com find@com
1 jcfaxhgibblyu@yyhmail.com rfc822;newmail.com
1 jcfaxhgibblyu@yyhmail.com rfc822;byrd@villahermosa.com
Thanks,
ASKER CERTIFIED SOLUTION
Now I'm not sure what the problem is; my amendment to your command does the following:
- Sorts the output on the second field (i.e. rfc822;address@domain) to bring matching lines together. This IS the seventh field of the log_file.
- Does a "uniq" on the second (and subsequent) fields: if two ADJACENT lines contain the same second (and subsequent) fields, only one line is printed, plus a total. Note that because the first field is ignored by the "uniq", some of the "from" addresses (field 6 in the log, i.e. field 1 in the extract) are lost.
In your example output you seem to be extracting fields 6, 7 & 8 (not 18), as the email address is repeated (or is it a domain? I'm not sure). But the principle still stands... I tested it on your input and got the sort/summary you posted in your example. Or am I missing something?
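A pipeline matching that description (extract fields 6, 7 and 8, sort on the second field of the extract, then let uniq skip the first field when counting) might look like the sketch below. This is not necessarily the exact accepted command, and the sample log_file lines are invented stand-ins whose fields 6–8 mimic the asker's example:

```shell
# Build a tiny sample log; fields 1-5 are placeholders for the date,
# channel, etc., and fields 6-8 mimic the lines in the question.
cat > log_file <<'EOF'
a b c d e jcfaxhgibblyu@yyhmail.com rfc822;find@com find@com TCP Connection refused
a b c d e jcfaxhgibblyu@yyhmail.com rfc822;newmail.com newmail.com TCP Connection refused
a b c d e mymail.com rfc822;find@com find@com TCP Connection refused
EOF

# Extract fields 6, 7 and 8 from "refused" lines; sort on field 2 of the
# extract (the rfc822;... address) so duplicates become adjacent; then
# uniq -c -f 1 skips the first field when comparing adjacent lines and
# prefixes each surviving line with a count; finally sort counts descending.
awk '/refused/ {print $6, $7, $8}' log_file \
  | sort -k2 \
  | uniq -c -f 1 \
  | sort -nr
```

On this sample the two find@com lines collapse into one line with a count of 2, followed by the newmail.com line with a count of 1, matching the shape of the summary in the question.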
ASKER
No, I'm just not explaining it very well, I guess.
What you posted worked great for what I was looking for. I'm now trying to modify the results from that; I was printing out the other lines, but they are not really necessary.
I'll close this one and open a new question for the next part of what I am trying to do.
This part is working great. Your command was just right; I was trying to make it too complicated and missed it.
Thanks again,
ASKER
What I need to do is look at the 7th field, find all lines where the 7th field is duplicated, and then print out how many times that same value appeared in the 7th field.
Thanks,
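This follow-up can be done in awk alone with an associative array keyed on the 7th field; a minimal sketch, assuming the same whitespace-separated log format as the examples above (the sample log_file content is again invented):

```shell
# Sample lines whose 7th field repeats, mimicking the log format above.
cat > log_file <<'EOF'
a b c d e jcfaxhgibblyu@yyhmail.com rfc822;find@com find@com TCP Connection refused
a b c d e jcfaxhgibblyu@yyhmail.com rfc822;newmail.com newmail.com TCP Connection refused
a b c d e mymail.com rfc822;find@com find@com TCP Connection refused
EOF

# Count each distinct 7th-field value on "refused" lines, then print
# "count value", most frequent first.
awk '/refused/ {count[$7]++} END {for (v in count) print count[v], v}' log_file \
  | sort -nr
```

For this sample it prints `2 rfc822;find@com` followed by `1 rfc822;newmail.com`. Unlike the uniq-based pipeline, this counts duplicates of field 7 even when the other fields differ.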