Solved

Go over a ltf file with python

Posted on 2011-02-20
Last Modified: 2012-05-11
Dear experts,
I am a newbie in the Python world. I would like to build a small script that goes over a specific file (UTF-8) and does the following:
For each line that starts with XXXX, copy the line, remove the XXXX, and write the result to a different file.
So eventually I will have a new file with all the lines from the original file that started with XXXX (without the XXXX).
Can you please provide an example similar to what I would like to build?
Best regards,
Boaz.
Question by:WAS_Infra
LVL 6

Expert Comment

by:Bxoz
ID: 34936232
# -*- coding: iso-8859-1 -*-
import re
obFile = open('fileToRead.txt','r')
obFileW = open('fileToWrite.txt','w')

lignes = obFile.readlines()
# Pattern anchored at the start of the line
reg1 = re.compile(r'^XXX')
for i in lignes:
    # Copy only the lines that begin with XXX
    if reg1.match(i):
        obFileW.write(i)
obFile.close()
obFileW.close()

 
LVL 6

Accepted Solution

by:
Bxoz earned 500 total points
ID: 34936264
Same code but with XXX removed

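# -*- coding: iso-8859-1 -*-
import re
obFile = open('fileToRead.txt','r')
obFileW = open('fileToWrite.txt','w')

lignes = obFile.readlines()
# Pattern anchored at the start of the line
reg1 = re.compile(r'^XXX')
for i in lignes:
    if reg1.match(i):
        # Remove only the leading XXX; i.replace('XXX','') would
        # also delete any XXX that appears later in the line
        obFileW.write(reg1.sub('', i))
obFile.close()
obFileW.close()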
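
A note on removing the prefix: calling str.replace('XXX', '') on a line removes every occurrence of XXX, not only the leading one, whereas a pattern anchored with ^ touches only the start of the line. A minimal sketch of the difference; the sample line is made up for illustration and is not taken from the original file:

```python
import re

# Hypothetical sample line: the prefix also appears later in the text.
line = 'XXXkeep XXX here\n'

# str.replace removes every occurrence of the substring.
replaced_everywhere = line.replace('XXX', '')

# A pattern anchored with ^ can only match at the start of the string.
prefix = re.compile(r'^XXX')
replaced_prefix_only = prefix.sub('', line)

print(repr(replaced_everywhere))
print(repr(replaced_prefix_only))
```

The second form keeps the XXX in the middle of the line, which is probably what is wanted when only the marker at the beginning should go.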

 
LVL 28

Expert Comment

by:pepr
ID: 34940704
The solution with regular expressions is overkill if you do not really need them.  When the line starts with a known prefix, use the .startswith() method of the built-in string type.  Also, there is no need to read the lines into a list first.  It is better to process the file on the fly (the data.txt file is attached):

fin = open('data.txt')
fout = open('out.txt', 'w')

# Iterating over the file object reads one line at a time
for line in fin:
    if line.startswith('XXXX'):
        fout.write(line)

fin.close()
fout.close()
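
The loop above copies the matching lines as they are. To also drop the prefix, as the question asks, the line can be sliced after the startswith() check. A minimal sketch, assuming Python 2.6 or later (for the with statement); the function name copy_without_prefix and its parameters are made up for illustration:

```python
def copy_without_prefix(src_path, dst_path, prefix='XXXX'):
    """Copy lines that start with prefix to dst_path, minus the prefix.

    Returns the number of lines written.
    """
    count = 0
    # The with statement closes both files even if an error occurs.
    with open(src_path) as fin:
        with open(dst_path, 'w') as fout:
            for line in fin:
                if line.startswith(prefix):
                    # Slice off only the leading prefix.
                    fout.write(line[len(prefix):])
                    count += 1
    return count
```

It could be called as copy_without_prefix('data.txt', 'out.txt'). Slicing by len(prefix) leaves any later XXXX in the line untouched.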


data.txt
 
LVL 28

Expert Comment

by:pepr
ID: 34940768
UTF-8 is a separate story. It also depends on whether you use Python 2.x or Python 3.  If you use Python 2.x, it depends on whether you want to treat the strings as sequences of bytes or as unicode strings.  For the latter, use the codecs module (http://docs.python.org/library/codecs.html).  The difference in how the files are opened seems only a minor one:

import codecs

fin = codecs.open('data.txt', encoding='utf-8')
fout = codecs.open('out.txt', 'w', encoding='utf-8')

for line in fin:
    if line.startswith('XXXX'):
        fout.write(line)
    
fin.close()
fout.close()



However, the line variable now contains a unicode string.  You can even convert to a different encoding for the output.
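
To illustrate the last point, here is a minimal sketch that reads UTF-8 and writes the selected lines in a different encoding. It assumes Python 3, where the built-in open() accepts an encoding argument; the function name, its parameters, and the choice of UTF-16 as the target encoding are assumptions made only for illustration:

```python
def filter_and_reencode(src_path, dst_path, prefix='XXXX',
                        src_encoding='utf-8', dst_encoding='utf-16'):
    """Copy lines that start with prefix, re-encoding them on output."""
    # Decoding happens while reading and encoding while writing;
    # inside the loop, line is an ordinary (unicode) str.
    with open(src_path, encoding=src_encoding) as fin, \
         open(dst_path, 'w', encoding=dst_encoding) as fout:
        for line in fin:
            if line.startswith(prefix):
                fout.write(line)
```

The filtering logic is the same as before; only the two encoding arguments decide how the bytes on disk are interpreted and produced.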

I recommend reading "Dive into Python 3" by Mark Pilgrim, Chapter 4, Strings: http://diveintopython3.org/strings.html