Solved

Python: Parsing TXT for IP Address

Posted on 2014-11-03
9
140 Views
Last Modified: 2016-06-22
Hello All,

I am using the below code to download a text file to a directory specified, then open the text file and search for all ip addresses contained in that text file. Below you will find my current code and current output. The issue is that I am not certain how to accomplish my final three and remaining tasks with this code and need some guidance. I have made it this far on my own and I do not want anyone to do it for me just point me in the right direction.

Questions:

1. How can I modify the code below in a way to find all addresses ending in .255 and remove them? This would leave only addresses with .0 remaining. I suppose I could modify the regex pattern to find all ip addresses with .0 at the end. Not sure of the regex pattern to accomplish that.
2. I would like to append a /24 to each of the ip addresses below.
3. Formatting; Each IP address should have its own row such as:
    1.1.1.0/24
    2.2.2.0/24
    3.3.3.0/24

''' MY CURRENT CODE '''

from urllib.request import urlretrieve
import re

files_www = 'http://feeds.dshield.org/block.txt'
files_dlloc = 'c:\dshield.txt'
urlretrieve(files_www, files_dlloc)

f = open(files_dlloc, 'r')
raw_text = str(f.readlines())
f.close()
ip_address = r'(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})'
foundip = re.findall( ip_address, raw_text )

print (foundip)

Open in new window


''' MY CURRENT OUTPUT '''

C:\>dshield.py
['184.105.139.0', '184.105.139.255', '61.221.83.0', '61.221.83.255', '212.83.148
.0', '212.83.148.255', '123.8.126.0', '123.8.126.255', '122.225.109.0', '122.225
.109.255', '218.77.79.0', '218.77.79.255', '74.82.47.0', '74.82.47.255', '162.21
2.181.0', '162.212.181.255', '176.31.101.0', '176.31.101.255', '192.3.205.0', '1
92.3.205.255', '60.173.11.0', '60.173.11.255', '104.194.25.0', '104.194.25.255',
 '222.43.119.0', '222.43.119.255', '115.28.239.0', '115.28.239.255', '197.243.16
.0', '197.243.16.255', '190.143.107.0', '190.143.107.255', '184.105.247.0', '184
.105.247.255', '93.178.3.0', '93.178.3.255', '61.174.51.0', '61.174.51.255', '71
.6.216.0', '71.6.216.255']

Open in new window

0
Comment
Question by:BERITM
  • 4
  • 4
9 Comments
 
LVL 35

Expert Comment

by:Terry Woods
ID: 40420994
Not sure about the code, but I should be able to help with the regular expression. The following expressions will match IP addresses ending with 255 and 0 respectively:
r'(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})\.255'

Open in new window

r'(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})\.0'

Open in new window

0
 

Author Comment

by:BERITM
ID: 40420997
I updated the regex:

ip_address = r'(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})\.0'

Open in new window


Here is my new output:

C:\>dshield.py
['184.105.139.0', '122.225.97.0', '222.186.34.0', '74.82.47.0', '176.31.101.0',
'162.212.181.0', '218.77.79.0', '184.105.247.0', '173.201.39.0', '71.6.216.0', '
192.3.205.0', '176.31.251.0', '5.196.77.0', '92.222.181.0', '93.174.93.0', '124.
232.142.0', '62.210.90.0', '85.25.242.0', '89.248.162.0', '204.42.253.0']

Open in new window


When I change my print to:

print (foundip) + "/24"

Open in new window


I get the following error:

C:\>dshield.py
['184.105.139.0', '122.225.97.0', '222.186.34.0', '74.82.47.0', '176.31.101.0',
'162.212.181.0', '218.77.79.0', '184.105.247.0', '173.201.39.0', '71.6.216.0', '
192.3.205.0', '176.31.251.0', '5.196.77.0', '92.222.181.0', '93.174.93.0', '124.
232.142.0', '62.210.90.0', '85.25.242.0', '89.248.162.0', '204.42.253.0']
Traceback (most recent call last):
  File "C:\dshield.py", line 19, in <module>
    print (foundip) + "/24"
TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'

Open in new window

0
 
LVL 35

Expert Comment

by:Terry Woods
ID: 40421000
To add the /24 on the end, perhaps something like this?
ip_address = r'(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})\.0'
foundip = re.findall( ip_address, raw_text )
for ip in foundip:
  print ip + "/24"

Open in new window

0
Free Tool: Path Explorer

An intuitive utility to help find the CSS path to UI elements on a webpage. These paths are used frequently in a variety of front-end development and QA automation tasks.

One of a set of tools we're offering as a way of saying thank you for being a part of the community.

 
LVL 35

Expert Comment

by:Terry Woods
ID: 40421002
Not sure if that's the correct way to concat strings in python, sorry. If my latest suggestion doesn't work, then perhaps try a comma instead of a plus:
  print ip, "/24"

Edit: looks like + is the correct one to use, so ignore this...
0
 

Author Comment

by:BERITM
ID: 40423082
Okay, So my problem is this. The output is a single list of ip addresses: so if ip append the string "/24" to the end it looks like below. I think the next phase is to learn how to break each of these ip addresses out of the list. Once that is done then I can print any string on the end or in from of them i want. Any idea how to break each of these addresses out of the list?

['117.27.158.0', '62.210.83.0', '195.154.252.0', '61.174.51.0', '218.77.79.0',
162.212.181.0', '176.31.101.0', '192.126.120.0', '219.129.237.0', '60.173.10.0
 '222.186.51.0', '104.194.15.0', '171.111.153.0', '1.182.119.0', '197.243.16.0
 '80.82.64.0', '78.93.247.0', '222.186.15.0', '115.238.247.0', '61.160.247.0']/24
0
 

Author Comment

by:BERITM
ID: 40423084
Okay, so I just changed my print line to this:

print (foundip[0])
print (foundip[1])
print (foundip[2])

and my output is

C:\>C:\dshield.py
117.27.158.0
62.210.83.0
195.154.252.0

I know for sure now that the list is good to go so I need to learn to print the items in an arrary or something.
0
 

Accepted Solution

by:
BERITM earned 0 total points
ID: 40423215
Got IT! I needed to iterate the [list]

Here is my final code for this matter

''' -- Imports -- '''
import urllib.request
import re

''' --- URL To Read -- '''
dshield_url = "http://feeds.dshield.org/block.txt"

''' -- Open URL, Read IT, String IT, Find IP, & Sort -- '''
dshield_raw_text = urllib.request.urlopen(dshield_url)
dshield_string = str(dshield_raw_text.read())
ipv4_pattern = r'\d{1,3}\.\d{1,3}\.\d{1,3}\.0'
iplist = re.findall( ipv4_pattern, dshield_string )
iplist.sort()

''' -- Create Final RSC, Populate Dangerous Class C -- '''
dshield_rsc = "C:\DShield.rsc"
dshield_clean_text = open(dshield_rsc, "w")
dshield_clean_text.write("/ip firewall address-list\n")
for ip in iplist:
    dshield_clean_text.write("add list=DShield address=" + (ip) + "/24 comment=DShield\n")
dshield_clean_text.close()

Open in new window


Here is the output in an RSC file:

/ip firewall address-list
add list=DShield address=111.73.46.0/24 comment=DShield
add list=DShield address=114.43.16.0/24 comment=DShield
add list=DShield address=117.21.173.0/24 comment=DShield
add list=DShield address=119.5.155.0/24 comment=DShield
add list=DShield address=122.141.234.0/24 comment=DShield
add list=DShield address=122.225.109.0/24 comment=DShield
add list=DShield address=122.225.97.0/24 comment=DShield
add list=DShield address=158.255.1.0/24 comment=DShield
add list=DShield address=162.212.181.0/24 comment=DShield
add list=DShield address=176.31.101.0/24 comment=DShield
add list=DShield address=194.73.113.0/24 comment=DShield
add list=DShield address=210.14.152.0/24 comment=DShield
add list=DShield address=218.77.79.0/24 comment=DShield
add list=DShield address=222.186.21.0/24 comment=DShield
add list=DShield address=222.186.56.0/24 comment=DShield
add list=DShield address=222.187.32.0/24 comment=DShield
add list=DShield address=60.173.11.0/24 comment=DShield
add list=DShield address=61.174.51.0/24 comment=DShield
add list=DShield address=74.118.193.0/24 comment=DShield
add list=DShield address=93.174.93.0/24 comment=DShield

Open in new window


Thank You,
0
 
LVL 35

Expert Comment

by:Terry Woods
ID: 40434506
Sorry I didn't get back to this earlier; glad you got it solved! Could you please accept an answer to the question?

You can accept your own solution as the answer, but you can assign some points to my comments if they were helpful...
0

Featured Post

Free Tool: IP Lookup

Get more info about an IP address or domain name, such as organization, abuse contacts and geolocation.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
numbers ascending pyramid 101 213
Apps blocked by Java 9 88
Developing database that gets updated from Excel. Looking for best approach 5 42
python - user input - set 4 28
This article will show, step by step, how to integrate R code into a R Sweave document
When we want to run, execute or repeat a statement multiple times, a loop is necessary. This article covers the two types of loops in Python: the while loop and the for loop.
The goal of the video will be to teach the user the difference and consequence of passing data by value vs passing data by reference in C++. An example of passing data by value as well as an example of passing data by reference will be be given. Bot…
The viewer will learn additional member functions of the vector class. Specifically, the capacity and swap member functions will be introduced.

828 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question