Solved

Read two csv files to two dictionaries and compare

Posted on 2009-07-04
Last Modified: 2012-08-13
Hi Experts,
I have the following three files.
------------------------------------file01.txt---------------------------
00650260048_c,,,The Pink Sheet
05040390072,,,The Tan Sheet
020108d4,,,Health News Daily
02260160016_b,,,The Rose Sheet
00630360023,relatedDocs,00630220023,The Pink Sheet
--------------------------------------------------------------------------

------------------------------------file02.txt----------------------------
000105d2,,,Health News Daily
00650260048_c,,,The Pink Sheet
000105d5,,,Health News Daily
05040390072,,,The Tan Sheet
000106d1,,,Health News Daily
000106d3,,,Health News Daily
000106d4,,,Health News Daily
000106d6,,,Health News Daily
--------------------------------------------------------------------------

-------------------------both.txt----------------------------------------
00650260048_c
05040390072
020108d4
02260160016_b
00630360023
000105d2
00650260048_c
000105d5
05040390072
000106d1
000106d3
000106d4
000106d6
--------------------------------------------------------------------------
1.) Load file01.txt into a dictionary, with the key being the id (the first value before the comma) and the value being the entire line of the file.
2.) Load file02.txt into a dictionary, with the key being the id (the first value before the comma) and the value being the entire line of the file.
3.) For all of the ids in both.txt, determine which ids (keys) have different values between file01.txt and file02.txt.

Could you kindly help me write this Python script? The final script alone would be enough. I will go through it and ask questions if I have any.  :)
BR Dushan.
Question by:Dushan De Silva

Expert Comment

by:ghostdog74
ID: 24778923
Since you know what you want to do, why don't you start writing your code? That is, you should put in some effort. Read the docs on how to use dictionaries and related features. Write code, and if you get stuck, come here to ask.
Example 4 here is a small snippet to get you started; for the rest, try to have a go on your own.

Author Comment

by:Dushan De Silva
ID: 24778953
I tried the following two scripts, but I couldn't get the values into a dictionary.
1.) code1.py with two dictionaries
2.) code2.py with Sets
#------------code1.py-----------------------------
#usage : python code1.py file01.txt file02.txt
import sys
h={}
for line in open(sys.argv[1]):
    line=line.strip().split()
    print line
    h[line[0]]=line[1]
for line in open(sys.argv[2]):
    line=line.strip()
    l=line.split()
    print line,h[l[0]]
----------------------------------------------------
 
#------------code2.py-------------------------------
#usage : python code2.py file01.txt file02.txt
#! /usr/bin/env python
import sys
import sets
from sets import Set
#Open the list1 and read it into the set1
f=open(sys.argv[1], 'r')
set1 = Set(f.readlines())
f.close() 
 
dic1 = dict([(k, v) for v, k in enumerate(set1)])
print dic1
 
 
#Open the list2 and read it into the set2
f=open(sys.argv[2], 'r')
set2 = Set(f.readlines())
f.close() 
 
dic2 = dict([(l, w) for w, l in enumerate(set2)])
print dic2
 
#Find Delta
diff1 = set1 - set2
diff2 = set2 - set1
 
#set1-=set2 
 
#Dump delta
 
f=open(sys.argv[1] + '_NOTIN_'+ sys.argv[2] + '_.txt', 'w')
f.writelines(diff1)
f.close()
 
f=open(sys.argv[2] + '_NOTIN_'+ sys.argv[1] + '_.txt', 'w')
f.writelines(diff2)
f.close()
----------------------------------------------------


Author Comment

by:Dushan De Silva
ID: 24778959
I'm also not sure how to get these CSV values into a dictionary using the following script with the csv module.
import csv
filename = "file"
reader = csv.reader(open(filename),delimiter=',')
writer = csv.writer( open("newfile.csv","wb") )
for row in reader: 
    print row #stdout
    writer.writerow(row) #write to newfile.csv


Assisted Solution

by:ghostdog74
ghostdog74 earned 500 total points
ID: 24778963
If both files use a comma as the delimiter, then split on the comma, e.g. line.split(",").
If you want to use the first column as the key and the whole line as the value:

h[line[0]] = ','.join(line)
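
The suggestion above can be sketched as a small helper. This is only a minimal sketch written for Python 3, not the poster's exact code; the function name `load_by_id` and the in-memory sample lines are assumptions made for illustration.

```python
def load_by_id(lines):
    """Map the first comma-separated field of each line to the whole line."""
    table = {}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split(",")
        # Joining the fields again reproduces the stripped line.
        table[fields[0]] = ",".join(fields)
    return table


# Sample lines taken from file01.txt in the question.
sample = [
    "00650260048_c,,,The Pink Sheet\n",
    "00630360023,relatedDocs,00630220023,The Pink Sheet\n",
]
h = load_by_id(sample)
print(h["00630360023"])
```

An open file can be passed in place of the list, for example `load_by_id(open("file01.txt"))`, because a file object iterates over its lines.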

Author Comment

by:Dushan De Silva
ID: 24778988
Thanks! I have already tried it by changing code1.py.

Accepted Solution

by:ghostdog74
ghostdog74 earned 500 total points
ID: 24779166
>> and I'm not sure how to get these csv values to dictionary using following script with csv module

I am not sure why you are having so much difficulty.



import csv
filename = "file"  # path of the CSV file to load
reader = csv.reader(open(filename),delimiter=',')
h={}  # maps the id (first column) to the remaining columns
for row in reader:
    print row[0]
    print ','.join(row[1:])
    h[row[0]]=','.join(row[1:])
print h
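
For readers on Python 3, the same idea might look like the sketch below. It is an adaptation, not the accepted answer verbatim: `print` is a function, the function name `load_csv_by_id` is hypothetical, and an in-memory `io.StringIO` stands in for the file so the example is self-contained. Like the answer above, it stores the remaining columns as the value; `",".join(row)` would keep the whole line instead.

```python
import csv
import io


def load_csv_by_id(fileobj):
    """Return {first column: remaining columns joined by commas}."""
    table = {}
    for row in csv.reader(fileobj, delimiter=","):
        if not row:
            continue  # csv.reader yields an empty list for blank lines
        table[row[0]] = ",".join(row[1:])
    return table


# In-memory stand-in for a file. With a real file, the csv documentation
# recommends opening it as open("file01.txt", newline="").
data = io.StringIO(
    "020108d4,,,Health News Daily\n"
    "05040390072,,,The Tan Sheet\n"
)
h = load_csv_by_id(data)
print(h)
```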


Author Comment

by:Dushan De Silva
ID: 24780963

import csv
import sys
 
f1 = open(sys.argv[1], 'rt')
f2 = open(sys.argv[2], 'rt')
f3 = open(sys.argv[3], 'rt')
#f1 = csv.reader(open(sys.argv[1]),delimiter=',')
h={}
try:
    reader1 = csv.reader(f1, delimiter=',')
    reader2 = csv.reader(f2, delimiter=',')
    reader3 = csv.reader(f3, delimiter=',')
    for row1 in reader1:
#       print row1[0]
#       print ','.join(row1[1:])
#       h[row1[0]]=','.join(row1[1:])
#       print h
            print row1[1:]
#       if row1[0]!= "":
            for row2 in reader2:
#               if row2[0]!= "":
                    if row2[0] in row1[0]:
#                       if row2[1:] in row1[1:]:
                           print row2[0]
#           print row2[1:]
 
finally:
    f1.close()
    f2.close()
    f3.close()


Author Comment

by:Dushan De Silva
ID: 24811240
Thanks for your help! I came up with the following solution. :)
http://www.experts-exchange.com/Programming/Languages/Scripting/Python/Q_24545046.html

BR Dushan
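
The linked solution is not reproduced in this thread. As a rough illustration of the three steps from the original question, here is a minimal Python 3 sketch under stated assumptions: the names `load_by_id` and `compare` are hypothetical, ids that are absent from either file are reported as "missing" (the question does not say how to treat them), and the second sample list changes one line on purpose so that one id actually differs.

```python
def load_by_id(lines):
    """Steps 1 and 2: map each id (text before the first comma) to its line."""
    table = {}
    for raw in lines:
        line = raw.strip()
        if line:
            table[line.split(",")[0]] = line
    return table


def compare(ids, first, second):
    """Step 3: classify each id as 'same', 'different', or 'missing'."""
    result = {}
    for key in ids:
        if key not in first or key not in second:
            result[key] = "missing"  # assumption: absent from at least one file
        elif first[key] == second[key]:
            result[key] = "same"
        else:
            result[key] = "different"
    return result


# Illustrative data; the "Rose Sheet" line in file02 is made up for the demo.
file01 = [
    "00650260048_c,,,The Pink Sheet",
    "05040390072,,,The Tan Sheet",
    "020108d4,,,Health News Daily",
]
file02 = [
    "00650260048_c,,,The Pink Sheet",
    "05040390072,,,The Rose Sheet",
    "000105d2,,,Health News Daily",
]
both = ["00650260048_c", "05040390072", "020108d4", "000105d2"]

report = compare(both, load_by_id(file01), load_by_id(file02))
for key, status in report.items():
    print(key, status)
```

Because the result is a dictionary, ids that appear more than once in both.txt are reported only once.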