Solved

Bash sort / cut / uniq script to analyze domains

Posted on 2008-09-29
Last Modified: 2013-12-26
I have a list of domains that I'd like to trim down a little bit. I only want the domains listed and not the subdomains of each domain. For example, the list below:

ff.search.yahoo.com
files.opensuse.org
files.widgetbox.com
yahoo.com
forums.dpreview.com
forums.opensuse.org
images.dpreview.com
geo.yahoo.com
go.microsoft.com
updates.vista.microsoft.com

I want to reduce to:

yahoo.com
opensuse.org
widgetbox.com
dpreview.com
microsoft.com

It would just be the cat's pajamas if I could use the cut command, specify '.' as the field separator, and return the RIGHT two fields instead of numbering from the left. But I don't think that's possible, at least from what I've read and searched so far.

To say it another way, I only want to deal with two fields (as separated by '.'), not three, four, or six. And the number of fields varies, depending on how many subdomains need to be deleted. I can't use awk, but sed is available on my system, if I have to use something other than cut or sort.
Question by:thinkwelldesigns
5 Comments

Accepted Solution

by:ozo (earned 500 total points)
ID: 22602258
sed 's/.*\.\(.*\.\)/\1/' | sort -u
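As a quick check, the command above can be run against the sample list from the question. This is only a sketch: the variable name `domains` and the function name `base_domains` are made up for illustration and are not part of the original answer.

```shell
# Sample list from the question, held in a variable for illustration.
domains='ff.search.yahoo.com
files.opensuse.org
files.widgetbox.com
yahoo.com
forums.dpreview.com
forums.opensuse.org
images.dpreview.com
geo.yahoo.com
go.microsoft.com
updates.vista.microsoft.com'

# Hypothetical wrapper: keep the last two dot-separated labels of each
# line read from stdin, then sort and drop duplicates.
base_domains() {
    sed 's/.*\.\(.*\.\)/\1/' | sort -u
}

printf '%s\n' "$domains" | base_domains
```

On this input it should print dpreview.com, microsoft.com, opensuse.org, widgetbox.com and yahoo.com, one per line. Note that it always keeps exactly two labels, so a name such as example.co.uk would be reduced to co.uk.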

Expert Comment

by:Tintin
ID: 22602373
Why the restriction on awk?

awk -F. '{printf"%s.%s\n", $(NF-1),$NF}' file  |sort -u

otherwise I was going to suggest the same sed option as ozo.
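For what the question originally asked for (cut counting fields from the right), one possible workaround is to reverse each line, cut the first two fields, and reverse the result back. This sketch assumes the `rev` utility (part of util-linux) is installed, which may not be true on every machine; the function name `last_two_fields` is hypothetical.

```shell
# Reverse each line so the rightmost fields become the leftmost,
# keep the first two '.'-separated fields, then reverse back.
last_two_fields() {
    rev | cut -d. -f1,2 | rev
}

printf '%s\n' ff.search.yahoo.com yahoo.com go.microsoft.com |
    last_two_fields | sort -u
```

For these three names the result should be microsoft.com and yahoo.com.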

Expert Comment

by:ozo
ID: 22602407
awk -F. '{d[$(NF-1) "." $NF]++} END {for (u in d) print u}'

Author Comment

by:thinkwelldesigns
ID: 22605731
Thanks, ozo. The sed command works perfectly. I'm not quite sure exactly HOW it works, and if you'd give me an explanation, I'd be grateful. But it does exactly what I need, so points to you!

Thanks a million.

BTW, this script will be used on a number of gateway machines that don't have awk installed, but do have sed. I dowanna hafta install awk on every machine to make the script work.

But is awk "better" than sed?

Expert Comment

by:Tintin
ID: 22609027
awk isn't better than sed; it's just another tool that is better suited to some tasks.  In this instance, there's not much difference between using sed or awk.  I'm surprised you don't have awk on your gateway servers, as it is a very standard utility.
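If the same script has to run on machines with and without awk, one option is to test for awk at run time and fall back to sed. This is a rough sketch only: the function name `base_domains` is made up, and the `NF < 2` guard is an addition here so that lines without a dot pass through unchanged in both branches.

```shell
# Hypothetical helper: use awk when it is installed, otherwise sed.
# Reads domain names on stdin, one per line.
base_domains() {
    if command -v awk >/dev/null 2>&1; then
        # Lines with fewer than two fields are printed as-is.
        awk -F. 'NF < 2 { print; next } { print $(NF-1) "." $NF }'
    else
        sed 's/.*\.\(.*\.\)/\1/'
    fi | sort -u
}

printf '%s\n' go.microsoft.com updates.vista.microsoft.com yahoo.com |
    base_domains
```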

Anyway, breaking the sed statement down:

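.*  zero or more of any character (greedy, so it consumes as much as it can)
\.  a literal dot
\(  start of pattern capture
.*\.  zero or more of any character, terminated by a literal dot
\)  end of pattern capture
\1  replace everything that matched with the captured text

The text after the last dot (the top-level domain) is not part of the match, so it stays in place after the substitution.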
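To see what the capture group actually holds, the replacement can be changed to wrap \1 in brackets. This is only an illustrative variation on the command above, not something from the original answer.

```shell
# The greedy leading .* consumes "updates.vista", the next \. matches the
# dot after it, and the group captures "microsoft." including its dot.
echo 'updates.vista.microsoft.com' | sed 's/.*\.\(.*\.\)/[\1]/'
# prints: [microsoft.]com

# The pattern needs at least two dots, so a name with only one dot
# does not match and sed leaves the line unchanged.
echo 'yahoo.com' | sed 's/.*\.\(.*\.\)/[\1]/'
# prints: yahoo.com
```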