Read two csv files to two dictionaries and compare

Hi Experts,
I have the following three files.
------------------------------------file01.txt---------------------------
00650260048_c,,,The Pink Sheet
05040390072,,,The Tan Sheet
020108d4,,,Health News Daily
02260160016_b,,,The Rose Sheet
00630360023,relatedDocs,00630220023,The Pink Sheet
--------------------------------------------------------------------------

------------------------------------file02.txt----------------------------
000105d2,,,Health News Daily
00650260048_c,,,The Pink Sheet
000105d5,,,Health News Daily
05040390072,,,The Tan Sheet
000106d1,,,Health News Daily
000106d3,,,Health News Daily
000106d4,,,Health News Daily
000106d6,,,Health News Daily
--------------------------------------------------------------------------

-------------------------both.txt----------------------------------------
00650260048_c
05040390072
020108d4
02260160016_b
00630360023
000105d2
00650260048_c
000105d5
05040390072
000106d1
000106d3
000106d4
000106d6
--------------------------------------------------------------------------
1.) Load file01.txt into a dictionary, with the key being id (the first value before the comma) and the value being the entire line of the file.  
2.) Load file02.txt into a dictionary, with the key being the id (the first value before the comma) and the value being the entire line of the file.
3.) For all of the ids in both.txt, determine which ids (keys) have different values between file01.txt and file02.txt.

Could you kindly help me write this Python script? The final script would be enough. I will go through it and ask questions if I have any.  :)
BR Dushan.
Dushan De Silva (Technology Architect) asked:

ghostdog74 commented:
Since you know what you want to do, why don't you start writing the code yourself? You should put in some effort: read the docs on how to use dictionaries and the other pieces you need, write some code, and come back here to ask if you get stuck.
Example 4 here is a small snippet to get you started; try the rest on your own.
Dushan De Silva (Technology Architect, Author) commented:
I tried the following two scripts, but I couldn't get the values into a dictionary.
1.) code1.py, with two dictionaries
2.) code2.py, with sets
#------------code1.py-----------------------------
#usage : python code1.py file01.txt file02.txt
import sys
h={}
for line in open(sys.argv[1]):
    line=line.strip().split()
    print line
    h[line[0]]=line[1]
for line in open(sys.argv[2]):
    line=line.strip()
    l=line.split()
    print line,h[l[0]]
----------------------------------------------------
 
#------------code2.py-------------------------------
#usage : python code2.py file01.txt file02.txt
#! /usr/bin/env python 
import sys
import sets
from sets import Set
#Open the list1 and read it into the set1
f=open(sys.argv[1], 'r')
set1 = Set(f.readlines())
f.close() 
 
dic1 = dict([(k, v) for v, k in enumerate(set1)])
print dic1
 
 
#Open the list2 and read it into the set2
f=open(sys.argv[2], 'r')
set2 = Set(f.readlines())
f.close() 
 
dic2 = dict([(l, w) for w, l in enumerate(set2)])
print dic2
 
#Find Delta
diff1 = set1 - set2
diff2 = set2 - set1
 
#set1-=set2 
 
#Dump delta
 
f=open(sys.argv[1] + '_NOTIN_'+ sys.argv[2] + '_.txt', 'w')
f.writelines(diff1)
f.close()
 
f=open(sys.argv[2] + '_NOTIN_'+ sys.argv[1] + '_.txt', 'w')
f.writelines(diff2)
f.close()
----------------------------------------------------

Dushan De Silva (Technology Architect, Author) commented:
I'm also not sure how to get these CSV values into a dictionary using the following script with the csv module.
import csv
filename = "file"
reader = csv.reader(open(filename),delimiter=',')
writer = csv.writer( open("newfile.csv","wb") )
for row in reader: 
    print row #stdout
    writer.writerow(row) #write to newfile.csv

ghostdog74 commented:
If both files use a comma as the delimiter, then split each line on the comma, e.g. line.split(",").
If you want to use the first column as the key and the whole line as the value:

h[line[0]] = ','.join(line)
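
The suggestion above (split on the comma, use the first column as the key and the whole line as the value) could be sketched roughly as follows. This is a minimal illustration written for Python 3, not the code from the thread; the helper name load_lines_by_id and the inline sample lines are assumptions made for the example.

```python
def load_lines_by_id(lines):
    """Map the first comma-separated field of each line to the whole line.

    `lines` can be any iterable of strings, such as an open text file.
    The function name is hypothetical, chosen for this sketch.
    """
    table = {}
    for raw in lines:
        line = raw.rstrip("\r\n")      # drop the trailing newline
        if not line:
            continue                   # skip blank lines
        key = line.split(",", 1)[0]    # id is everything before the first comma
        table[key] = line              # value is the entire line
    return table


# Two sample lines taken from file01.txt in the question.
sample = [
    "00650260048_c,,,The Pink Sheet\n",
    "00630360023,relatedDocs,00630220023,The Pink Sheet\n",
]
table = load_lines_by_id(sample)
print(table["00630360023"])
```

With a real file, the same function could be called as load_lines_by_id(open("file01.txt")), since a text file object yields one line per iteration.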
Dushan De Silva (Technology Architect, Author) commented:
Thanks! I already tried that by changing code1.py.
ghostdog74 commented:
>> and I'm not sure how to get these csv values to dictionary using following script with csv module

I am not sure why you are having so much difficulty.

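import csv
filename = "file"  # path to the comma-delimited input file
reader = csv.reader(open(filename),delimiter=',')
h={}
for row in reader:
    print row[0]               # the id (first column)
    print ','.join(row[1:])    # the remaining columns, re-joined
    # key: the id; value: the rest of the line after the id
    h[row[0]]=','.join(row[1:])
print h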
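
Note that the snippet above stores only the columns after the id as the value. If the whole line is wanted as the value, as the question asks, a variant along these lines might work. It is a sketch for Python 3; the function name load_csv_by_id is hypothetical, and io.StringIO stands in for a real file so the example is self-contained. Re-joining with ',' does not restore any quoting that csv.reader removed, which is fine for the unquoted sample data but is an assumption about the input.

```python
import csv
import io


def load_csv_by_id(handle):
    """Read comma-delimited rows and map row[0] to the re-joined row."""
    table = {}
    for row in csv.reader(handle, delimiter=","):
        if not row:
            continue                    # csv.reader yields [] for blank lines
        table[row[0]] = ",".join(row)   # keep the id inside the stored value
    return table


# Stand-in for open("file01.txt"); the two rows come from the question.
file01 = io.StringIO(
    "020108d4,,,Health News Daily\n"
    "02260160016_b,,,The Rose Sheet\n"
)
h = load_csv_by_id(file01)
print(h)
```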

Dushan De Silva (Technology Architect, Author) commented:

import csv
import sys

f1 = open(sys.argv[1], 'rt')
f2 = open(sys.argv[2], 'rt')
f3 = open(sys.argv[3], 'rt')
#f1 = csv.reader(open(sys.argv[1]),delimiter=',')
h={}
try:
    reader1 = csv.reader(f1, delimiter=',')
    reader2 = csv.reader(f2, delimiter=',')
    reader3 = csv.reader(f3, delimiter=',')
    for row1 in reader1:
        #print row1[0]
        #print ','.join(row1[1:])
        #h[row1[0]]=','.join(row1[1:])
        #print h
        print row1[1:]
        #if row1[0]!= "":
        for row2 in reader2:
            #if row2[0]!= "":
            if row2[0] in row1[0]:
                #if row2[1:] in row1[1:]:
                print row2[0]
        #print row2[1:]

finally:
    f1.close()
    f2.close()
    f3.close()
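
One thing to watch in the nested loops above: a csv.reader is an iterator, so reader2 is used up after the first pass of the outer loop and later rows of the first file are compared against nothing. Loading each file into a dictionary first avoids that. The comparison step from the question might then look roughly like this sketch for Python 3. The function name differing_ids is hypothetical, and treating an id that is missing from one file as "different" is an assumption, since the question does not say how that case should be handled.

```python
def differing_ids(ids, first, second):
    """Return the ids whose stored lines differ between two dictionaries.

    Assumption: an id present in only one dictionary counts as different,
    because dict.get returns None for the missing side.
    """
    seen = set()
    result = []
    for key in ids:
        if key in seen:
            continue                   # both.txt lists some ids twice
        seen.add(key)
        if first.get(key) != second.get(key):
            result.append(key)
    return result


# Small excerpts of the data in the question, already keyed by id.
first = {
    "00650260048_c": "00650260048_c,,,The Pink Sheet",
    "020108d4": "020108d4,,,Health News Daily",
}
second = {
    "00650260048_c": "00650260048_c,,,The Pink Sheet",
    "000105d2": "000105d2,,,Health News Daily",
}
ids = ["00650260048_c", "020108d4", "000105d2", "00650260048_c"]
changed = differing_ids(ids, first, second)
print(changed)
```

Each lookup is a dictionary access, so the work grows with the number of ids in both.txt and not with the product of the two file sizes.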

Dushan De Silva (Technology Architect, Author) commented:
Thanks for your help! I came up with the following solution. :)
http://www.experts-exchange.com/Programming/Languages/Scripting/Python/Q_24545046.html

BR Dushan