Solved

Extract IP addresses from any file

Posted on 2008-10-23
Last Modified: 2012-05-05
I need to extract IP addresses from an unformatted text file. The addresses do not have brackets or any other delimiters around them. I need to write the result to another file.
Question by:jeffsmall
8 Comments
 
LVL 1

Expert Comment

by:LosBear
ID: 22787424
Is the file at least tab-delimited? Can you provide a sample of the file you want to parse?

 
LVL 1

Expert Comment

by:WANM
ID: 22787698
Assuming there is at least a space on each side of the address, and one address somewhere on each line:

cat file | sed -r 's/(.*?) ([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}) (.*)/\2/'

If you could provide an example of the file, it would make this easier.
 

Author Comment

by:jeffsmall
ID: 22788244
Sorry, I want this to work on just about any text file and there may or may not be spaces around the address.  The addresses might be interspersed and there could be more than one on a line.

Here is some sample text

; generated by /sbin/dhclient-script
nameserver 192.168.1.147
nameserver 192.168.1.1
ip address=192.168.1.148
 

Author Comment

by:jeffsmall
ID: 22788450
search aus.us.siteprotect.com
nameserver 216.139.253.2
nameserver 216.139.253.3

The regex provided above outputs the first line as well. I just need to pull out the IP addresses and write them to a temp file.

>cat /etc/resolv.conf | sed -r 's/(.*?) ([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}) (.*)/\2/'
search aus.us.siteprotect.com
nameserver 216.139.253.2
nameserver 216.139.253.3

 
LVL 5

Accepted Solution

by:
zmo earned 125 total points
ID: 22788998
Well, to remove those lines you can do:

% cat /etc/resolv.conf | grep -E '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}'
nameserver 216.139.253.2
nameserver 216.139.253.3

But if you join both you'll still get:
% cat /etc/resolv.conf | grep -E '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' | sed -r 's/(.*?) ([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}) (.*)/\2/'
nameserver 216.139.253.2
nameserver 216.139.253.3

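That's because the pattern requires a ' ' between the address and the rest of the line, and on these lines nothing follows the address. The substitution therefore never matches, and sed prints the lines unchanged.

A fix would be: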
% cat /etc/resolv.conf | sed -r 's/^.* ([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).*$/\1/' | grep -E '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}'
216.139.253.2
216.139.253.3

But this can't pull several IPs out of the same line, because of how the substitution works (funny, I said the same thing this morning in another thread about regexps). The s/// above replaces the whole line with a fixed number of captured groups, so it can't emit the same pattern n times. You have to do it programmatically.

A solution would be to get the first IP of every line that contains one. Then use sed to remove those IPs and pipe the result to sed again, so it gets the next IP of every line, and so on until there are no more IPs.
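
As an alternative to looping with sed, here is a minimal sketch that swaps in a different technique: grep's -o option, which prints each match on its own line, so several addresses on one line are handled directly. It assumes a grep that supports -o (GNU and BSD grep do); the function name extract_ips is made up for illustration and does not come from this thread.

```shell
# Hypothetical helper: print every dotted-quad found in the given file(s),
# or on stdin when no file is given, one address per line.
# -o prints only the matched text, -h suppresses file name prefixes.
extract_ips() {
    grep -ohE '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' "$@"
}

# Two addresses on one line, with no spaces around them.
printf 'dns=192.168.1.147,192.168.1.1\nsearch aus.us.siteprotect.com\n' | extract_ips
# prints:
# 192.168.1.147
# 192.168.1.1
```

To write the result to another file, redirect the output, for example extract_ips /etc/resolv.conf > /tmp/ips.txt (the output path is only an example). Like the simple patterns above, this does not check that each number is at most 255.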
 
LVL 5

Expert Comment

by:zmo
ID: 22789028
% cat /etc/resolv.conf | sed -r 's/^[^0-9]*([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).*$/\1/' | grep -E '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}'
216.139.253.2
216.139.253.3

is what you want (i.e. no leading whitespace is needed).
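
The trailing grep in that pipeline is only there to drop the lines sed left unchanged. A possible variation, sketched here under the assumption that your sed accepts -E (GNU and BSD sed do; GNU sed also accepts -r), is to let sed do the filtering itself with -n and the p flag. The function name first_ip_per_line is hypothetical.

```shell
# Hypothetical helper: print the first dotted-quad of each line and skip
# lines that contain none. -n turns off automatic printing; the trailing
# p prints a line only when the substitution succeeded.
first_ip_per_line() {
    sed -nE 's/^[^0-9]*([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).*$/\1/p' "$@"
}

printf 'search aus.us.siteprotect.com\nnameserver 216.139.253.2\nip address=192.168.1.148\n' | first_ip_per_line
# prints:
# 216.139.253.2
# 192.168.1.148
```

Note that ^[^0-9]* does not allow any digit before the address, so a line such as "eth0 10.0.0.1" is skipped entirely.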
 

Author Closing Comment

by:jeffsmall
ID: 31509269
Thank you!  That does what I need very nicely. I appreciate your help and the knowledge I have gained from your work.
 
LVL 5

Expert Comment

by:zmo
ID: 22789190
To be more picky, and to correct all the previous mistakes, you could use:

% cat /etc/resolv.conf | sed -r 's/^(.*[^0-9])?(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)([^0-9].*)?$/\2.\3.\4.\5/' | grep -E "(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
216.139.253.2
216.139.253.3

The differences are:
1/ '' ^(.*[^0-9])? '' : optionally matches any leading text that ends in a character that is not a digit. The non-digit keeps the match from starting in the middle of a longer number, and making the group optional lets an address at the very start of the line match too
2/ '' (25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?) '' : matches an IP address that is valid, with every octet between 0 and 255 (i.e. 265.24.256.3 is not valid)
3/ '' ([^0-9].*)?$ '' : same as 1/ for the end of the line, so an address that ends the line (as in resolv.conf) matches too

\2.\3.\4.\5 reconstructs the address from the four parenthesized octets in pattern 2/ (group 1 is the leading text and group 6 the trailing text)
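
The octet sub-pattern is repeated four times, which makes the command hard to read. A small sketch of how it could be built once in a shell variable and used to validate a single candidate address is shown below; the names octet, ip_re and is_valid_ip are illustrative assumptions, not part of the original answer.

```shell
# One octet in the range 0-255, same alternation as in the pattern above.
octet='(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'
# Four octets separated by literal dots.
ip_re="$octet\\.$octet\\.$octet\\.$octet"

# Hypothetical helper: succeed only if the whole argument is a valid address.
# -x makes grep match the entire line, -q suppresses output.
is_valid_ip() {
    printf '%s\n' "$1" | grep -qxE "$ip_re"
}

is_valid_ip 216.139.253.2 && echo "216.139.253.2: valid"
is_valid_ip 265.24.256.3 || echo "265.24.256.3: invalid"
```

Matching the whole string with -x matters here: without an anchor or a non-digit boundary, the octet pattern can match part of a longer number.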