Solved

Go over a UTF-8 file with Python

Posted on 2011-02-20
Last Modified: 2012-05-11
Dear experts,
I am a newbie in the Python world. I would like to build a small script that goes over a specific file (UTF-8) and does the following:
For each line that starts with XXXX, copy the line, remove the XXXX, and write the result to a different file.
So eventually I will have a new file with all the lines from the original file that started with XXXX (without the XXXX).
Can you please provide an example similar to what I would like to build?
Best regards,
Boaz.
Question by:WAS_Infra
LVL 6

Expert Comment

by:Bxoz
ID: 34936232
# -*- coding: iso-8859-1 -*-
import re
obFile = open('fileToRead.txt','r')
obFileW = open('fileToWrite.txt','w')

# Read all lines, then copy the ones that start with XXX (unchanged)
lignes = obFile.readlines()
reg1=re.compile('^XXX')
for i in lignes:
    if reg1.match(i):
        obFileW.write(i)
obFile.close()
obFileW.close()

LVL 6

Accepted Solution

by:
Bxoz earned 2000 total points
ID: 34936264
Same code, but with the leading XXX removed:

# -*- coding: iso-8859-1 -*-
import re
obFile = open('fileToRead.txt','r')
obFileW = open('fileToWrite.txt','w')

lignes = obFile.readlines()
reg1=re.compile('^XXX')
for i in lignes:
    if reg1.findall(i):
        # The count of 1 removes only the leading XXX, not later occurrences
        obFileW.write(i.replace('XXX', '', 1))
obFile.close()
obFileW.close()

LVL 29

Expert Comment

by:pepr
ID: 34940704
The solution with regular expressions is overkill if you do not really need them.  When the line starts with a known prefix, use the .startswith() method of the built-in string type.  Also, there is no need to read all the lines into a list first.  It is better to process the file on the fly (the data.txt file is attached):

fin = open('data.txt')
fout = open('out.txt', 'w')

for line in fin:
    if line.startswith('XXXX'):
        fout.write(line)
    
fin.close()
fout.close()
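
Note that the loop above copies the matching lines unchanged. To also drop the prefix, as the question asks, one option is to slice it off after the .startswith() test. The following is only a minimal sketch: the names strip_prefixed_lines and copy_stripped are made up for illustration, and the 'XXXX' prefix is taken from the question.

```python
PREFIX = 'XXXX'  # assumed marker, taken from the question


def strip_prefixed_lines(lines, prefix=PREFIX):
    """Yield every line that starts with prefix, with the prefix cut off."""
    for line in lines:
        if line.startswith(prefix):
            # Slicing removes only the leading prefix; any later
            # occurrence of the same text in the line is left alone.
            yield line[len(prefix):]


def copy_stripped(src, dst, prefix=PREFIX):
    """Write the stripped matching lines of src to dst (hypothetical helper)."""
    with open(src) as fin:
        with open(dst, 'w') as fout:
            # The input file is still processed line by line, on the fly.
            fout.writelines(strip_prefixed_lines(fin, prefix))
```

Calling copy_stripped('data.txt', 'out.txt') would then produce the output file. The with statements close both files even if an error occurs, so the explicit .close() calls are not needed.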

data.txt
LVL 29

Expert Comment

by:pepr
ID: 34940768
UTF-8 is a separate story. It also depends on whether you use Python 2.x or Python 3.  With Python 2.x, it depends on whether you want to treat the strings as sequences of bytes or as unicode strings.  Use the codecs module http://docs.python.org/library/codecs.html for the latter.  The only difference, in how the files are opened, seems a minor one:

import codecs

fin = codecs.open('data.txt', encoding='utf-8')
fout = codecs.open('out.txt', 'w', encoding='utf-8')

for line in fin:
    if line.startswith('XXXX'):
        fout.write(line)
    
fin.close()
fout.close()


However, the line variable now holds a unicode string rather than a byte string.  You can even convert the encoding for the output by opening the output file with a different encoding.
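
As a rough illustration of converting the encoding on output, the sketch below uses io.open, which accepts an encoding argument in Python 2.6+ and is the same function as the built-in open in Python 3. The function name convert_prefixed and the choice of UTF-16 for the output are assumptions made only for this example.

```python
import io


def convert_prefixed(src, dst, prefix=u'XXXX',
                     src_encoding='utf-8', dst_encoding='utf-16'):
    """Copy the lines of src that start with prefix into dst, re-encoded.

    The prefix is cut off each copied line. Returns the number of
    lines written. Hypothetical helper, not from the thread.
    """
    written = 0
    with io.open(src, encoding=src_encoding) as fin:
        with io.open(dst, 'w', encoding=dst_encoding) as fout:
            for line in fin:
                # line is a unicode string here, decoded from src_encoding
                if line.startswith(prefix):
                    # Encoded to dst_encoding when written
                    fout.write(line[len(prefix):])
                    written += 1
    return written
```

For example, convert_prefixed('data.txt', 'out.txt') would read data.txt as UTF-8 and write the stripped lines as UTF-16. If the input file begins with a byte order mark, reading it with the 'utf-8-sig' codec instead keeps the mark from hiding the prefix on the first line.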

I recommend reading "Dive into Python 3" by Mark Pilgrim, Chapter 4, Strings: http://diveintopython3.org/strings.html