Solved

Python: Parsing TXT for IP Address

Posted on 2014-11-03
Last Modified: 2016-06-22
Hello All,

I am using the code below to download a text file to a specified directory, then open that file and search it for every IP address it contains. My current code and output are shown below. I am not sure how to accomplish my three remaining tasks and need some guidance. I have made it this far on my own, and I do not want anyone to do it for me; just point me in the right direction.

Questions:

1. How can I modify the code below to find all addresses ending in .255 and remove them, leaving only the addresses ending in .0? I suppose I could change the regex pattern to match only IP addresses ending in .0, but I am not sure what that pattern would be.
2. I would like to append /24 to each of the remaining IP addresses.
3. Formatting: each IP address should be on its own row, such as:
    1.1.1.0/24
    2.2.2.0/24
    3.3.3.0/24

''' MY CURRENT CODE '''

from urllib.request import urlretrieve
import re

files_www = 'http://feeds.dshield.org/block.txt'
files_dlloc = r'c:\dshield.txt'
urlretrieve(files_www, files_dlloc)

f = open(files_dlloc, 'r')
raw_text = str(f.readlines())
f.close()
ip_address = r'(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})'
foundip = re.findall( ip_address, raw_text )

print (foundip)



''' MY CURRENT OUTPUT '''

C:\>dshield.py
['184.105.139.0', '184.105.139.255', '61.221.83.0', '61.221.83.255', '212.83.148
.0', '212.83.148.255', '123.8.126.0', '123.8.126.255', '122.225.109.0', '122.225
.109.255', '218.77.79.0', '218.77.79.255', '74.82.47.0', '74.82.47.255', '162.21
2.181.0', '162.212.181.255', '176.31.101.0', '176.31.101.255', '192.3.205.0', '1
92.3.205.255', '60.173.11.0', '60.173.11.255', '104.194.25.0', '104.194.25.255',
 '222.43.119.0', '222.43.119.255', '115.28.239.0', '115.28.239.255', '197.243.16
.0', '197.243.16.255', '190.143.107.0', '190.143.107.255', '184.105.247.0', '184
.105.247.255', '93.178.3.0', '93.178.3.255', '61.174.51.0', '61.174.51.255', '71
.6.216.0', '71.6.216.255']

Question by:BERITM
 
LVL 35

Expert Comment

by:Terry Woods
ID: 40420994
Not sure about the code, but I should be able to help with the regular expression. The following expressions will match IP addresses ending with 255 and 0 respectively:
r'(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})\.255'


r'(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})\.0'
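One caveat worth noting about the patterns above: without boundaries, a trailing `\.0` can also match the front of a longer final octet (for example the `10.0.0.0` inside `10.0.0.05`). A minimal sketch of a stricter variant, assuming word boundaries are acceptable for this kind of feed. The sample text and the name `NETWORK_RE` are made up for illustration and are not part of the original thread.

```python
import re

# Hypothetical sample text; the real feed has more columns per row.
sample = (
    "1.1.1.0\t1.1.1.255\t24\n"
    "2.2.2.0\t2.2.2.255\t24\n"
    "10.0.0.05 is not a .0 address\n"
)

# \b stops the match from starting or ending in the middle of a run of digits;
# the lookahead (?!\.\d) rejects a fifth dotted group.
NETWORK_RE = re.compile(r'\b\d{1,3}\.\d{1,3}\.\d{1,3}\.0\b(?!\.\d)')

networks = NETWORK_RE.findall(sample)
print(networks)
```

Note that `\d{1,3}` still accepts out-of-range octets such as 999, so this only tightens the match edges; it does not validate the address.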

 

Author Comment

by:BERITM
ID: 40420997
I updated the regex:

ip_address = r'(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})\.0'



Here is my new output:

C:\>dshield.py
['184.105.139.0', '122.225.97.0', '222.186.34.0', '74.82.47.0', '176.31.101.0',
'162.212.181.0', '218.77.79.0', '184.105.247.0', '173.201.39.0', '71.6.216.0', '
192.3.205.0', '176.31.251.0', '5.196.77.0', '92.222.181.0', '93.174.93.0', '124.
232.142.0', '62.210.90.0', '85.25.242.0', '89.248.162.0', '204.42.253.0']



When I change my print to:

print (foundip) + "/24"



I get the following error:

C:\>dshield.py
['184.105.139.0', '122.225.97.0', '222.186.34.0', '74.82.47.0', '176.31.101.0',
'162.212.181.0', '218.77.79.0', '184.105.247.0', '173.201.39.0', '71.6.216.0', '
192.3.205.0', '176.31.251.0', '5.196.77.0', '92.222.181.0', '93.174.93.0', '124.
232.142.0', '62.210.90.0', '85.25.242.0', '89.248.162.0', '204.42.253.0']
Traceback (most recent call last):
  File "C:\dshield.py", line 19, in <module>
    print (foundip) + "/24"
TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'
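
The traceback above arises because, in Python 3, `print` is a function that returns `None`: the list is printed first, and then Python tries to evaluate `None + "/24"`. Moving the concatenation inside the call would not help either, since a list and a string cannot be added. A minimal sketch of one way around this, with a shortened sample list standing in for the real `foundip` (the name `with_suffix` is only illustrative):

```python
# Sample values taken from the output above; the real list comes from re.findall.
foundip = ['184.105.139.0', '122.225.97.0', '222.186.34.0']

# Build a new list with the suffix appended to each element.
with_suffix = [ip + "/24" for ip in foundip]

print(with_suffix)
```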

 
LVL 35

Expert Comment

by:Terry Woods
ID: 40421000
To add the /24 on the end, perhaps something like this?
ip_address = r'(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})\.0'
foundip = re.findall( ip_address, raw_text )
for ip in foundip:
  print(ip + "/24")

 
LVL 35

Expert Comment

by:Terry Woods
ID: 40421002
Not sure if that's the correct way to concatenate strings in Python, sorry. If my latest suggestion doesn't work, then perhaps try a comma instead of a plus:
  print(ip, "/24")

Edit: looks like + is the correct one to use, so ignore this...

Author Comment

by:BERITM
ID: 40423082
Okay, so my problem is this: the output is a single list of IP addresses, so if I append the string "/24" to the end, it looks like the output below. I think the next step is to learn how to break each of these IP addresses out of the list. Once that is done, I can print any string I want at the end or in front of them. Any idea how to break each of these addresses out of the list?

['117.27.158.0', '62.210.83.0', '195.154.252.0', '61.174.51.0', '218.77.79.0',
 '162.212.181.0', '176.31.101.0', '192.126.120.0', '219.129.237.0', '60.173.10.0',
 '222.186.51.0', '104.194.15.0', '171.111.153.0', '1.182.119.0', '197.243.16.0',
 '80.82.64.0', '78.93.247.0', '222.186.15.0', '115.238.247.0', '61.160.247.0']/24
 

Author Comment

by:BERITM
ID: 40423084
Okay, so I just changed my print line to this:

print (foundip[0])
print (foundip[1])
print (foundip[2])

and my output is

C:\>C:\dshield.py
117.27.158.0
62.210.83.0
195.154.252.0

I now know for sure that the list is good to go, so I need to learn how to print the items in an array or something.
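
Indexing each element by hand works, but it does not scale past a handful of entries. Besides a plain `for` loop, one option is `str.join`, which builds the whole block of lines as a single string. A rough sketch, again with a shortened sample list standing in for `foundip` (the name `block` is only illustrative):

```python
# Sample values from the output above; the real list comes from re.findall.
foundip = ['117.27.158.0', '62.210.83.0', '195.154.252.0']

# Join the entries with newlines so each CIDR block lands on its own row.
block = "\n".join(ip + "/24" for ip in foundip)

print(block)
```

Having the text in one string can be convenient when the result is written to a file rather than printed.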
 

Accepted Solution

by:
BERITM earned 0 total points
ID: 40423215
Got it! I needed to iterate over the list.

Here is my final code for this matter

# -- Imports --
import urllib.request
import re

# -- URL to read --
dshield_url = "http://feeds.dshield.org/block.txt"

# -- Open URL, read it, decode it, find IPs, and sort --
with urllib.request.urlopen(dshield_url) as dshield_response:
    # decode() yields the actual text; str() on bytes would give the "b'...'" repr
    dshield_string = dshield_response.read().decode("utf-8", errors="replace")
ipv4_pattern = r'\d{1,3}\.\d{1,3}\.\d{1,3}\.0'
iplist = re.findall(ipv4_pattern, dshield_string)
iplist.sort()  # string sort, not numeric

# -- Create final RSC, populate dangerous Class C networks --
dshield_rsc = r"C:\DShield.rsc"  # raw string keeps the backslash literal
with open(dshield_rsc, "w") as dshield_clean_text:
    dshield_clean_text.write("/ip firewall address-list\n")
    for ip in iplist:
        dshield_clean_text.write("add list=DShield address=" + ip + "/24 comment=DShield\n")



Here is the output in an RSC file:

/ip firewall address-list
add list=DShield address=111.73.46.0/24 comment=DShield
add list=DShield address=114.43.16.0/24 comment=DShield
add list=DShield address=117.21.173.0/24 comment=DShield
add list=DShield address=119.5.155.0/24 comment=DShield
add list=DShield address=122.141.234.0/24 comment=DShield
add list=DShield address=122.225.109.0/24 comment=DShield
add list=DShield address=122.225.97.0/24 comment=DShield
add list=DShield address=158.255.1.0/24 comment=DShield
add list=DShield address=162.212.181.0/24 comment=DShield
add list=DShield address=176.31.101.0/24 comment=DShield
add list=DShield address=194.73.113.0/24 comment=DShield
add list=DShield address=210.14.152.0/24 comment=DShield
add list=DShield address=218.77.79.0/24 comment=DShield
add list=DShield address=222.186.21.0/24 comment=DShield
add list=DShield address=222.186.56.0/24 comment=DShield
add list=DShield address=222.187.32.0/24 comment=DShield
add list=DShield address=60.173.11.0/24 comment=DShield
add list=DShield address=61.174.51.0/24 comment=DShield
add list=DShield address=74.118.193.0/24 comment=DShield
add list=DShield address=93.174.93.0/24 comment=DShield
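
The listing above is in string order, which is why `60.173.11.0` comes after `222.187.32.0`. If numeric order or validation of each octet matters, the standard-library `ipaddress` module may help. A sketch under those assumptions; `to_networks` is a hypothetical helper name, and the sample list (including the deliberately invalid `999.1.1.0`) is made up for illustration:

```python
import ipaddress

def to_networks(addresses):
    """Return sorted, de-duplicated /24 networks, skipping invalid addresses."""
    networks = set()
    for addr in addresses:
        try:
            networks.add(ipaddress.ip_network(addr + "/24"))
        except ValueError:
            # Raised for out-of-range octets (e.g. 999.1.1.0) or set host bits.
            continue
    return sorted(networks)

sample = ['93.174.93.0', '222.186.21.0', '60.173.11.0', '999.1.1.0', '60.173.11.0']
for net in to_networks(sample):
    print("add list=DShield address=%s comment=DShield" % net)
```

Network objects compare numerically, so the sorted result starts with the 60.x network instead of ending with it.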



Thank You,
 
LVL 35

Expert Comment

by:Terry Woods
ID: 40434506
Sorry I didn't get back to this earlier; glad you got it solved! Could you please accept an answer to the question?

You can accept your own solution as the answer, but you can assign some points to my comments if they were helpful...