Bash sort / cut / uniq script to analyze domains

Posted on 2008-09-29
Last Modified: 2013-12-26
I have a list of domains that I'd like to trim down a little bit. I only want the domains listed and not the subdomains of each domain. For example, the list below:

ff.search.yahoo.com
files.opensuse.org
files.widgetbox.com
yahoo.com
forums.dpreview.com
forums.opensuse.org
images.dpreview.com
geo.yahoo.com
go.microsoft.com
updates.vista.microsoft.com

I want to reduce to:

yahoo.com
opensuse.org
widgetbox.com
dpreview.com
microsoft.com

It would just be the cat's pajamas if I could use the cut command, specify '.' as the field separator, and return the RIGHT two fields instead of numbering from the left. But I don't think that's possible, at least from what I've read and searched so far.

To say it another way, I only want to deal with two fields (as separated by '.'), not 3, 4, or 6. And the number of fields varies, depending on how many subdomains need to be deleted. I can't use awk, but sed is available on my system if I have to use something other than cut or sort.
Question by:thinkwelldesigns

Accepted Solution

by:
ozo earned 2000 total points
ID: 22602258
sed 's/.*\.\(.*\.\)/\1/' | sort -u

Expert Comment

by:Tintin
ID: 22602373
Why the restriction on awk?

awk -F. '{printf"%s.%s\n", $(NF-1),$NF}' file  |sort -u

otherwise I was going to suggest the same sed option as ozo.

Expert Comment

by:ozo
ID: 22602407
awk -F. '{d[$(NF-1) "." $NF]++}END{for( u in d) print u}'

Author Comment

by:thinkwelldesigns
ID: 22605731
Thanks, ozo. The sed command works perfectly. I'm not quite sure exactly HOW it works, and if you'd give me an explanation, I'd be grateful. But it does exactly what I need, so points to you!

Thanks a million.

BTW, this script will be used on a number of gateway machines that don't have awk installed, but do have sed. I dowanna hafta install awk on every machine to make the script work.

But is awk "better" than sed?

Expert Comment

by:Tintin
ID: 22609027
awk isn't better than sed, it's just another tool that is better at some tasks than sed.  In this instance, there's not much difference between using sed or awk.  I'm surprised you don't have awk on your gateway servers, as it is a very standard utility.

Anyway, breaking the sed statement down:

.*  zero or more matches of any character
\.  a literal dot
\(  start of pattern capture
.*\.  zero or more matches of any character terminated by a literal dot
\)  end of pattern capture
\1  replace everything that was matched with the captured text
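
To see the breakdown above in action, here is a small sketch that runs ozo's sed expression against the sample list from the question. The function name strip_subdomains is made up for illustration; only the sed expression and sort -u come from this thread. Two things make it work: the first .* is greedy, so it swallows everything up to the second-to-last dot, and the final label (com, org) is never part of the match, so it is left in place after the substitution. A line with only one dot does not match at all and passes through unchanged.

```shell
#!/bin/sh
# strip_subdomains is a hypothetical helper name, not something from the thread.
# It reads hostnames on stdin and keeps only the last two dot-separated labels.
strip_subdomains() {
    # .*\.        greedy: everything up to and including the second-to-last dot
    # \(.*\.\)    captured: the second-to-last label plus its trailing dot
    # The last label is outside the match, so it survives the substitution.
    sed 's/.*\.\(.*\.\)/\1/' | sort -u
}

# Sample list from the question.
result=$(printf '%s\n' \
    ff.search.yahoo.com \
    files.opensuse.org \
    files.widgetbox.com \
    yahoo.com \
    forums.dpreview.com \
    forums.opensuse.org \
    images.dpreview.com \
    geo.yahoo.com \
    go.microsoft.com \
    updates.vista.microsoft.com | strip_subdomains)

printf '%s\n' "$result"
```

On a real file you would drop the printf and feed the list in directly, e.g. strip_subdomains < domains.txt (the file name is just a placeholder).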