Solved

Bash sort / cut / uniq script to analyze domains

Posted on 2008-09-29
Last Modified: 2013-12-26
I have a list of domains that I'd like to trim down a little bit. I only want the domains listed and not the subdomains of each domain. For example, the list below:

ff.search.yahoo.com
files.opensuse.org
files.widgetbox.com
yahoo.com
forums.dpreview.com
forums.opensuse.org
images.dpreview.com
geo.yahoo.com
go.microsoft.com
updates.vista.microsoft.com

I want to reduce to:

yahoo.com
opensuse.org
widgetbox.com
dpreview.com
microsoft.com

It would just be the cat's pajamas if I could use the cut command, specify '.' as the field separator, and return the RIGHT two fields instead of numbering from the left. But I don't think that's possible, at least from what I've read and searched so far.

To say it another way, I only want to deal with two fields (as separated by '.'), not three, four, or six. The number of fields varies, depending on how many subdomains need to be deleted. I can't use awk, but sed is available on my system if I have to use something other than cut or sort.
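For reference, cut only numbers fields from the left, but one workaround is to reverse each line, take the first two fields, and reverse the result back. A minimal sketch, assuming the rev utility is installed (it is not part of POSIX, so it may be missing on a stripped-down gateway); the variable names are only for illustration:

```shell
#!/bin/sh
# Sample list from the question.
domains='ff.search.yahoo.com
files.opensuse.org
files.widgetbox.com
yahoo.com
forums.dpreview.com
forums.opensuse.org
images.dpreview.com
geo.yahoo.com
go.microsoft.com
updates.vista.microsoft.com'

# Reverse each line so the two rightmost labels become fields 1 and 2,
# cut those two fields, reverse back, then de-duplicate.
rev_result=$(printf '%s\n' "$domains" | rev | cut -d. -f1,2 | rev | sort -u)
printf '%s\n' "$rev_result"
```

Lines that contain only one dot pass through unchanged, because cut prints whatever fields exist.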
Question by:thinkwelldesigns
LVL 84

Accepted Solution

by:
ozo earned 500 total points
ID: 22602258
sed 's/.*\.\(.*\.\)/\1/' | sort -u
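A quick way to check this pipeline is to run it against the sample list from the question. This is only a sketch: the variable names domains and result are made up for the illustration, and in practice the list would come from a file or another command.

```shell
#!/bin/sh
# Sample list from the question.
domains='ff.search.yahoo.com
files.opensuse.org
files.widgetbox.com
yahoo.com
forums.dpreview.com
forums.opensuse.org
images.dpreview.com
geo.yahoo.com
go.microsoft.com
updates.vista.microsoft.com'

# Strip everything before the last two labels, then sort and de-duplicate.
result=$(printf '%s\n' "$domains" | sed 's/.*\.\(.*\.\)/\1/' | sort -u)
printf '%s\n' "$result"
# Prints:
# dpreview.com
# microsoft.com
# opensuse.org
# widgetbox.com
# yahoo.com
```

Note that the command always keeps exactly the last two labels, so a name such as www.example.co.uk would come out as co.uk.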
LVL 48

Expert Comment

by:Tintin
ID: 22602373
Why the restriction on awk?

awk -F. '{printf"%s.%s\n", $(NF-1),$NF}' file  |sort -u

otherwise I was going to suggest the same sed option as ozo.
LVL 84

Expert Comment

by:ozo
ID: 22602407
awk -F. '{d[$(NF-1) "." $NF]++} END{for (u in d) print u}'
LVL 9

Author Comment

by:thinkwelldesigns
ID: 22605731
Thanks, ozo. The sed command works perfectly. I'm not quite sure exactly HOW it works, and if you'd give me an explanation, I'd be grateful. But it does exactly what I need, so points to you!

Thanks a million.

BTW, this script will be used on a number of gateway machines that don't have awk installed, but do have sed. I dowanna hafta install awk on every machine to make the script work.

But is awk "better" than sed?
LVL 48

Expert Comment

by:Tintin
ID: 22609027
awk isn't better than sed, it's just another tool that is better at some tasks than sed.  In this instance, there's not much difference between using sed or awk.  I'm surprised you don't have awk on your gateway servers, as it is a very standard utility.

Anyway, breaking the sed statement down:

.*  zero or more matches of any character
\.  a literal dot
\(  start of pattern capture
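.*\.  zero or more matches of any character terminated by a literal dot
\)  end of pattern capture
\1  replace everything matched with the captured text

The first .* is greedy, so it swallows as many leading subdomains as it can while still leaving one label and its dot for the capture. The final label (e.g. com) is never matched, so it stays in place after the captured text.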
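To see which part of a line each piece of the pattern grabs, here is a small sketch that captures both halves and prints them in square brackets. The extra capture group and the variable names marked and unchanged are additions for illustration only, not part of the original command.

```shell
#!/bin/sh
# Group 1 is the greedy leading part, group 2 is the label that is kept.
marked=$(printf '%s\n' 'updates.vista.microsoft.com' |
  sed 's/\(.*\.\)\(.*\.\)/[\1][\2]/')
printf '%s\n' "$marked"
# Prints: [updates.vista.][microsoft.]com

# A name with only one dot cannot satisfy the pattern (it needs two dots),
# so sed makes no substitution and the line passes through as-is.
unchanged=$(printf '%s\n' 'yahoo.com' | sed 's/.*\.\(.*\.\)/\1/')
printf '%s\n' "$unchanged"
# Prints: yahoo.com
```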