Solved

Python - how to delete an item from a nested list, file input

Posted on 2008-06-20
Last Modified: 2011-10-03
Python's readlines() method gives me a list, output, read from a file, and I want to delete any line from output that has fewer than the required number of comma-separated fields (14).  I can print the index values of the lines concerned, but I can't delete the lines themselves: the objects are either unsubscriptable or don't support item deletion. Help!

output = src.readlines()

index = 0
for index, lines in enumerate(output):
   lines = lines.split(',', 13)
   if (len(lines) < 14):
       print index  # prints correct index value
       del index     # or output2[index] etc, fails
   else:
       index +=1

Question by:sara_bellum
13 Comments
 
LVL 9

Expert Comment

by:ghostdog74
ID: 21836357
Assign to another list. Here's a list comprehension:
a = [ l for n,l in enumerate(open("file")) if len(l.split(",")) == 14 ]
print a
 

Author Comment

by:sara_bellum
ID: 21836670
I have added your code to my script but still can't delete the lines that are missing list elements.  I tried several options but my attempts may confuse you, so I simply added some comments to demonstrate what I am trying to do.
file = open('badger_start.dat')
list_all = file.readlines()

# check to see if I'm picking up the right errors
list_errors = [ l for n, l in enumerate(file) if len(l.split(",")) < 14 ]
print list_errors # this is hard to read but I know how to fix that

for index, line in enumerate(list_all):
#   if the index value matches a line with an error
#   delete the line
   else:
       index += 1 #the index should only increment for valid lines

# check your result
print list_all # hard to read but I know how to fix that
file.close()
 
LVL 9

Expert Comment

by:ghostdog74
ID: 21836790
Can you post a sample of your badger_start.dat file, and then describe what you actually want to see as output? It's much easier this way.
 
LVL 15

Expert Comment

by:efn
ID: 21836925
It's not going to work well to delete from a list while you are in the middle of iterating through it.
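To illustrate why deleting during iteration goes wrong, here is a minimal sketch. The sample rows and the field count of 3 are made up for readability (the thread uses 14), and it is written with Python 3's print() rather than the Python 2 print statement used elsewhere on this page.

```python
# Hypothetical sample data: two adjacent short lines followed by a good one.
rows = ["a,b\n", "c,d\n", "1,2,3\n"]
REQUIRED = 3  # stand-in for the 14 fields required in the thread

for row in rows:
    if len(row.split(",")) < REQUIRED:
        rows.remove(row)  # shrinks the very list the loop is walking

# After the first removal, "c,d\n" slid into position 0, which the loop
# had already passed, so that short line was never tested.
print(rows)
```

The loop finishes without an error, which is what makes this mistake easy to miss: one of the short lines is simply still there afterwards.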

It is possible to fix your example to work with the technique ghostdog74 suggested.  The idea is that instead of trying to delete the elements you don't want from the list, you construct a new list with only the elements you do want.  In your code, you have all the input lines in the list_all list.  In this case, you don't care about the indexes, so there is no need to use the enumerate function.  You can just write an expression to select the lines you want from the list_all list:

wanted = [ line for line in list_all if len(line.split(",")) >= 14 ]

If you want to construct a list of the error lines as in your example, you can just change the condition tested.
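As a sketch of both variations, assuming list_all holds the lines from readlines(): the sample data below is invented, and the errors list keeps the index next to each bad line purely for reporting. Python 3 syntax is used here.

```python
REQUIRED_FIELDS = 14

# Hypothetical stand-in for the list returned by readlines().
list_all = [
    ",".join(str(n) for n in range(14)) + "\n",  # 14 fields: keep
    "1,2,3\n",                                   # 3 fields: error
    ",".join("x" * 14) + "\n",                   # 14 fields: keep
]

# Lines with enough fields.
wanted = [line for line in list_all
          if len(line.split(",")) >= REQUIRED_FIELDS]

# Same test with the condition flipped; enumerate is only here so the
# position of each bad line can be reported.
errors = [(index, line) for index, line in enumerate(list_all)
          if len(line.split(",")) < REQUIRED_FIELDS]

print(len(wanted), errors)
```

If the original name has to keep pointing at the filtered data, `list_all[:] = wanted` replaces the contents of the existing list in place.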

Another way to do it is to iterate over a copy of the list, removing elements from the original list.  list_all[:] makes a copy of the list and you can delete from the original list by value.  This is longer, but perhaps easier to read.

for line in list_all[:] :
    if len(line.split(",")) < 14:
        list_all.remove(line)
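
The same copy-and-remove idea can be wrapped in a small function. This is only a sketch: drop_short_lines and its required parameter are names invented here, not part of the original code. Note that list.remove() deletes the first element equal to the given value, which still gives the right result when the same short line appears more than once.

```python
def drop_short_lines(lines, required=14):
    """Remove, in place, every line with fewer than `required` fields."""
    for line in lines[:]:          # iterate over a copy...
        if len(line.split(",")) < required:
            lines.remove(line)     # ...while deleting from the original
    return lines


# Hypothetical sample with a duplicated short line and 3 required fields.
sample = ["a,b\n", "a,b\n", "1,2,3\n"]
drop_short_lines(sample, required=3)
print(sample)
```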
 
LVL 15

Expert Comment

by:mish33
ID: 21837742
A) As was said, do NOT modify a list while you iterate over it
B) enumerate does the +=1 for you
C) be careful with variable names

output = src.readlines()
valid = []
for index, line in enumerate(output):
   fields = line.split(',')
   if len(fields) < 14:
       print index  # prints correct index value
   else:
       valid.append(line)
# use valid lines
 
LVL 15

Expert Comment

by:efn
ID: 21838051
> B) enumerate does +=1 for you

Nitpick:  actually, "for ... in ..." does the +=1 for you.  But mish33 did show another way that will work.

 
LVL 15

Expert Comment

by:mish33
ID: 21838404
efn: I was referring to index += 1 in the OP code
 
LVL 15

Expert Comment

by:efn
ID: 21838542
So was I.  The idea of the "index += 1" statement was apparently to iterate through the list;  I think we agree that this statement is unnecessary.  The "for ... in ..." statement is what iterates through the various lists in all of the code on this page.  The calls to enumerate are really not needed at all, except for debugging displays of index values.  The enumerate function just returns an iterator of (index, item) pairs; it doesn't add anything to anything.  But this is not really an important point.
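
A short sketch of the behaviour under discussion, using made-up sample lines and Python 3 syntax: enumerate yields (index, item) pairs, the for statement rebinds index on every pass, and so changing index inside the loop body has no lasting effect.

```python
lines = ["first\n", "second\n", "third\n"]  # hypothetical sample data

seen = []
for index, line in enumerate(lines):
    seen.append(index)
    index += 100  # discarded: the next pass rebinds index from enumerate

print(seen)  # the indexes the loop actually saw

# enumerate itself only produces (index, item) pairs.
pairs = list(enumerate(lines))
print(pairs)
```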
 
LVL 15

Expert Comment

by:mish33
ID: 21838615
enumerate is needed to print the indexes of non-compliant lines,
but having both enumerate and manual index counting is, um... unnecessary
 

Author Comment

by:sara_bellum
ID: 21838933
Thanks very much!  efn and mish33's solutions do work for me, but I'm trying to start at an index value of 4 to exclude the header information from the sample data (the header has a different format).  I tried inserting index = 4 before the for loop(s) but that fails. I had assumed, apparently incorrectly, that I could start a for loop at any index value in the list.   Let me know if there's a simple answer to this, or if I need to start a new question, thanks.
 
LVL 15

Accepted Solution

by:
efn earned 500 total points
ID: 21838999
If you use the second approach I suggested, where you make a copy of the list, you can make a copy of everything in the list starting from index 4 if you use list_all[4:] instead of list_all[:].  In this approach, you are removing list elements by value, so it won't matter that the indexes in the list being checked and the list being changed are not the same.

There are, of course, other solutions.
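
One such alternative, offered only as a sketch: slice the header off, filter the rest into a new list, and put the header back. The function name filter_body, its parameters, and the sample data are all invented for illustration; the second argument to enumerate is the starting index (supported since Python 2.6), so the reported positions match the original file. Python 3 syntax is used.

```python
def filter_body(lines, header_lines=4, required=14):
    """Return (header + valid body lines, indexes of the short lines)."""
    header = lines[:header_lines]
    body = lines[header_lines:]
    kept = []
    bad_indexes = []
    # Start counting at header_lines so indexes refer to the full list.
    for index, line in enumerate(body, header_lines):
        if len(line.split(",")) < required:
            bad_indexes.append(index)
        else:
            kept.append(line)
    return header + kept, bad_indexes


# Hypothetical data: four header lines, then good, short, good.
good = ",".join(["0"] * 14) + "\n"
sample = ["h1\n", "h2\n", "h3\n", "h4\n", good, "1,2\n", good]
cleaned, bad = filter_body(sample)
print(bad, len(cleaned))
```

Unlike the remove-by-value approach, this leaves the original list untouched and never compares header lines against body lines.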
 
LVL 15

Expert Comment

by:mish33
ID: 21839166
This approach keeps the index (line number) of the printed lines correct:

output = src.readlines()
valid = []
for index, line in enumerate(output):
   if index < 4: continue  # skip first 4 lines
   fields = line.split(',')
   if len(fields) < 14:
       print index  # prints correct index value
   else:
       valid.append(line)
# use valid lines
 

Author Closing Comment

by:sara_bellum
ID: 31469370
For reasons which pass my understanding, mish33's solution didn't work the second and later times I tried it... so I gave all the points to efn. Thanks very much!
