bsumariwalla
asked on
Python - Deduplicating and Capturing Duplicates
Hello,
I'm using Python 2.7 and I have a list of lists. I'm trying to deduplicate it and also return every duplicate row. How can I do this? I have code that deduplicates correctly, but I can't figure out the second part. Consider the following list:
[["apple", "red"],
 ["apple", "red"],
 ["apple", "red"],
 ["apple", "green"],
 ["banana", "yellow"]]
I'm trying to return:
Uniques:
[["apple", "red"],
 ["apple", "green"],
 ["banana", "yellow"]]
Duplicates:
[["apple", "red"],
 ["apple", "red"]]
def csv_deduplicate2(csv_list):
    sorted_by_date = sorted(csv_list, key=lambda row: row[2], reverse=True)
    unique_csv_list = []
    duplicate_csv_list = []
    for sorted_row in sorted_by_date:
        if sorted_row not in unique_csv_list:
            unique_csv_list.append(sorted_row)  # Produces a unique list
    for sorted_row in sorted_by_date:
        if sorted_row in unique_csv_list:
            duplicate_csv_list.append(sorted_row)  # Produces a list of everything, not just duplicates.
    return unique_csv_list, duplicate_csv_list
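The second loop in the code above appends every row because, once the first loop has finished, every row is already present in unique_csv_list. One possible fix is to make the decision in a single pass: the first time a row is seen it goes to the uniques, and every later occurrence goes to the duplicates. The following is only a sketch under a few assumptions: the function name split_uniques_and_duplicates is hypothetical, each row is assumed to be a list of hashable values such as strings, and the sort by date from the original function is left out for brevity. It runs on both Python 2.7 and Python 3.

```python
def split_uniques_and_duplicates(rows):
    """Return (uniques, duplicates), preserving the input order.

    Hypothetical helper: assumes every row is a list of hashable values.
    """
    seen = set()
    uniques = []
    duplicates = []
    for row in rows:
        key = tuple(row)  # lists are unhashable, so use a tuple as the set key
        if key in seen:
            duplicates.append(row)  # second or later occurrence
        else:
            seen.add(key)
            uniques.append(row)  # first occurrence
    return uniques, duplicates


rows = [
    ["apple", "red"],
    ["apple", "red"],
    ["apple", "red"],
    ["apple", "green"],
    ["banana", "yellow"],
]
uniques, duplicates = split_uniques_and_duplicates(rows)
print(uniques)
print(duplicates)
```

If the most recent row should win, as the reverse date sort in the original suggests, the sorted list could be passed in instead of the raw one.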
I don't see the need to sort the list unless it is very large. In that case, consider the bisect module, which performs binary searches and requires a sorted list.
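To illustrate the bisect idea, here is a rough sketch that keeps the rows seen so far in a sorted list and uses bisect.bisect_left to test membership. The function name split_with_bisect is hypothetical, and rows are assumed to be lists of mutually comparable values. Note that the lookup is a binary search, but inserting into a Python list still shifts elements, so this is not necessarily faster than a set-based approach.

```python
import bisect


def split_with_bisect(rows):
    """Return (uniques, duplicates) using binary search on a sorted list.

    Hypothetical helper: assumes row values can be compared with each other.
    """
    seen_sorted = []  # tuples of rows seen so far, kept in sorted order
    uniques = []
    duplicates = []
    for row in rows:
        key = tuple(row)
        i = bisect.bisect_left(seen_sorted, key)  # leftmost position for key
        if i < len(seen_sorted) and seen_sorted[i] == key:
            duplicates.append(row)  # key already present
        else:
            seen_sorted.insert(i, key)  # insert at i to keep the list sorted
            uniques.append(row)
    return uniques, duplicates


example = [
    ["apple", "red"],
    ["apple", "red"],
    ["apple", "red"],
    ["apple", "green"],
    ["banana", "yellow"],
]
bisect_uniques, bisect_duplicates = split_with_bisect(example)
```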
@bsumariwalla
Have you tried the posted solutions? It is time for you to close this question.
@bsumariwalla
Where do you stand with this question?
Recommend points split between these two comments:
https://www.experts-exchange.com/questions/29015112/Python-Deduplicating-and-Capturing-Duplicates.html?anchorAnswerId=42086977#a42086977
https://www.experts-exchange.com/questions/29015112/Python-Deduplicating-and-Capturing-Duplicates.html?anchorAnswerId=42093279#a42093279
Split points between two correct solutions.