Solved

Extract IP addresses from any file

Posted on 2008-10-23
Last Modified: 2012-05-05
I need to extract IP addresses from an unformatted text file. The addresses do not have brackets or any other delimiters around them. I need to write the extracted addresses to another file.
Question by:jeffsmall
8 Comments
 
LVL 1

Expert Comment

by:LosBear
ID: 22787424
Is the file at least tab-delimited? Can you provide a sample of the file you want to parse?

 
LVL 1

Expert Comment

by:WANM
ID: 22787698
Assuming the addresses have at least a space on each side, and there is one address somewhere on each line:

cat file | sed -r 's/(.*?) ([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}) (.*)/\2/'

If you could provide an example of the file, it would make this easier.
 

Author Comment

by:jeffsmall
ID: 22788244
Sorry, I want this to work on just about any text file, and there may or may not be spaces around the address. The addresses might be interspersed with other text, and there could be more than one on a line.

Here is some sample text

; generated by /sbin/dhclient-script
nameserver 192.168.1.147
nameserver 192.168.1.1
ip address=192.168.1.148
 

Author Comment

by:jeffsmall
ID: 22788450
search aus.us.siteprotect.com
nameserver 216.139.253.2
nameserver 216.139.253.3

The regex provided above also matches the first line. I just need to pull out the IP addresses and write them to a temp file.

>cat /etc/resolv.conf | sed -r 's/(.*?) ([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}) (.*)/\2/'
search aus.us.siteprotect.com
nameserver 216.139.253.2
nameserver 216.139.253.3
 
LVL 5

Accepted Solution

by:
zmo earned 125 total points
ID: 22788998
Well, to remove the lines that contain no address, you can do:

% cat /etc/resolv.conf | grep -E '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}'
nameserver 216.139.253.2
nameserver 216.139.253.3

But if you combine both, you'll still get:
% cat /etc/resolv.conf | grep -E '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' | sed -r 's/(.*?) ([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}) (.*)/\2/'
nameserver 216.139.253.2
nameserver 216.139.253.3

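That's because the sed pattern requires a ' ' between the address and the rest of the line. These lines end right after the address, so the substitution never applies and the lines pass through unchanged.

A fix would be: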
% cat /etc/resolv.conf | sed -r 's/^.* ([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).*$/\1/' | grep -E '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}'
216.139.253.2
216.139.253.3

But you can't match several IPs on the same line this way (funny, I said the same thing this morning in another topic about regexps). The s/// here replaces the whole line with a single captured group, so it can't emit the pattern n times. You have to do that programmatically.

One solution would be to extract the first IP from every line that contains one, then use sed to remove those IPs and pipe the result to sed again to extract the next IP from every line, and so on until there are no IPs left.
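As an alternative to looping with sed, here is a minimal sketch that assumes your grep supports the -o option (GNU and BSD grep do, but POSIX does not require it). The function name extract_ips is made up for illustration and does not come from this thread. It prints every dotted-quad-looking token on its own line, including several from the same input line, without checking that each part is 255 or less.

```shell
# extract_ips: hypothetical helper name, not from the thread.
# -o prints only the matched text, one match per output line.
# -h suppresses file name prefixes when several files are given.
# -E enables extended regular expressions ({1,3} repetition).
# With no arguments, grep reads standard input.
extract_ips() {
    grep -ohE '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' "$@"
}
```

Something like "extract_ips /etc/resolv.conf > /tmp/ips.txt" would then write the addresses to a temp file.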
 
LVL 5

Expert Comment

by:zmo
ID: 22789028
% cat /etc/resolv.conf | sed -r 's/^[^0-9]*([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).*$/\1/' | grep -E '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}'
216.139.253.2
216.139.253.3

is what you want (i.e. no leading white space is needed before the address).
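The same result can probably be had without the trailing grep. This is a sketch that assumes a sed accepting -E (recent GNU and BSD sed do; older GNU sed only has -r). The name first_ip_per_line is hypothetical. With -n and the p flag, sed prints only the lines where the substitution succeeded.

```shell
# first_ip_per_line: hypothetical helper name, not from the thread.
# -n suppresses the default printing of every line.
# The trailing "p" prints a line only when the substitution applied,
# so lines without an address are dropped without a separate grep.
# Limitation kept from the original pattern: "^[^0-9]*" fails on lines
# that have any other digit before the address (e.g. "eth0 10.0.0.1").
first_ip_per_line() {
    sed -nE 's/^[^0-9]*([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).*$/\1/p' "$@"
}
```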
 

Author Closing Comment

by:jeffsmall
ID: 31509269
Thank you!  That does what I need very nicely. I appreciate your help and the knowledge I have gained from your work.
 
LVL 5

Expert Comment

by:zmo
ID: 22789190
To be pickier, and to correct the previous mistakes, you could use:

% cat /etc/resolv.conf | sed -r 's/^.*?[^0-9](25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)[^0-9].*$/\1.\2.\3.\4/' | grep -E "(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
216.139.253.2
216.139.253.3

The differences are:
1/ '' ^.*? '' : matches any characters, any number of times, from the beginning of the line (in sed's extended regexps the '?' does not make '.*' lazy, so this behaves like '' ^.* '')
2/ '' [^0-9] '' : matches exactly one character that is not a digit
3/ '' (25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?) '' : matches an IP address whose four parts are each in the range 0-255 (i.e. 265.24.256.3 is not valid)
4/ '' [^0-9] '' : same as 2/
5/ '' .*$ '' : same as 1/, for the end of the line

\1.\2.\3.\4 reconstructs the address from the four parenthesized groups in pattern 3/
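Note that the pattern above needs one non-digit character on each side of the address, so an address at the very start or very end of a line is not rewritten by the sed step. A different way to get the same range check, sketched here under the assumption that grep supports -o and with the made-up name valid_ips, is to grab whole digit runs first and then test each part numerically in awk. This is a swapped-in technique, not the method used in the thread.

```shell
# valid_ips: hypothetical helper name, not from the thread.
# Step 1: grep -o pulls out every "digits.digits.digits.digits" token.
#         Using [0-9]+ keeps over-long parts whole (265 stays 265)
#         so they can be rejected instead of partially matched.
# Step 2: awk splits on "." and keeps a token only if every part
#         has at most 3 digits and a value of 255 or less.
valid_ips() {
    grep -ohE '[0-9]+(\.[0-9]+){3}' "$@" |
    awk -F. 'NF == 4 {
        ok = 1
        for (i = 1; i <= 4; i++)
            if (length($i) > 3 || ($i + 0) > 255) ok = 0
        if (ok) print
    }'
}
```

This trades one long regular expression for a short numeric test, which may be easier to read and adjust.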