  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 754

Read two csv files to two dictionaries and compare

Hi Experts,
I have the following three files.
------------------------------------file01.txt---------------------------
00650260048_c,,,The Pink Sheet
05040390072,,,The Tan Sheet
020108d4,,,Health News Daily
02260160016_b,,,The Rose Sheet
00630360023,relatedDocs,00630220023,The Pink Sheet
--------------------------------------------------------------------------

------------------------------------file02.txt----------------------------
000105d2,,,Health News Daily
00650260048_c,,,The Pink Sheet
000105d5,,,Health News Daily
05040390072,,,The Tan Sheet
000106d1,,,Health News Daily
000106d3,,,Health News Daily
000106d4,,,Health News Daily
000106d6,,,Health News Daily
--------------------------------------------------------------------------

-------------------------both.txt----------------------------------------
00650260048_c
05040390072
020108d4
02260160016_b
00630360023
000105d2
00650260048_c
000105d5
05040390072
000106d1
000106d3
000106d4
000106d6
--------------------------------------------------------------------------
1.) Load file01.txt into a dictionary, with the key being id (the first value before the comma) and the value being the entire line of the file.  
2.) Load file02.txt into a dictionary, with the key being the id (the first value before the comma) and the value being the entire line of the file.
3.) For all of the ids in both.txt, determine which ids (keys) have different values between file01.txt and file02.txt.

Could you kindly help me write this Python script? The final script alone should be enough. I will go through it and ask questions if I have any.  :)
BR Dushan.
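
For reference, the three steps above can be sketched roughly as follows. This is a minimal Python 3 sketch, not the solution the thread eventually settled on. The function names (load_by_id, differing_ids, compare_files) are made up for illustration, and it assumes that an id present in only one of the two files counts as "different".

```python
import sys


def load_by_id(lines):
    """Map each id (the text before the first comma) to its entire line."""
    table = {}
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        key = line.split(",", 1)[0].strip()
        table[key] = line
    return table


def differing_ids(ids, first, second):
    """Return the ids, in input order and without repeats, whose lines differ.

    Assumption: an id missing from one table is reported as different,
    because dict.get() returns None for it.
    """
    seen = set()
    result = []
    for raw in ids:
        key = raw.strip()
        if not key or key in seen:
            continue
        seen.add(key)
        if first.get(key) != second.get(key):
            result.append(key)
    return result


def compare_files(path1, path2, ids_path):
    """Load both data files and check every id listed in ids_path."""
    with open(path1) as f1, open(path2) as f2, open(ids_path) as f3:
        return differing_ids(f3, load_by_id(f1), load_by_id(f2))


# usage: python compare.py file01.txt file02.txt both.txt
if __name__ == "__main__" and len(sys.argv) == 4:
    for key in compare_files(*sys.argv[1:4]):
        print(key)
```

Splitting with split(",", 1) keeps the rest of the line untouched, so the stored value is the original line rather than a re-joined copy.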
Asked by: Dushan De Silva

ghostdog74 Commented:
Since you know what you want to do, why don't you start writing your code? That is, you should put in some effort. Read the docs on how to use dictionaries and the other pieces you need. Write some code, and if you get stuck, come back here and ask.
Example 4 here is a small snippet to get you started; for the rest, try to have a go on your own.
 
Dushan De Silva (Author) Commented:
I tried the following two scripts, but I couldn't get the values into a dictionary.
1.) code1.py with two dictionaries
2.) code2.py with sets
#------------code1.py-----------------------------
#usage : python code1.py file01.txt file02.txt
import sys
h={}
for line in open(sys.argv[1]):
    line=line.strip().split()
    print line
    h[line[0]]=line[1]
for line in open(sys.argv[2]):
    line=line.strip()
    l=line.split()
    print line,h[l[0]]
----------------------------------------------------
 
#------------code2.py-------------------------------
#usage : python code2.py file01.txt file02.txt
#! /usr/bin/env python 
import sys
import sets
from sets import Set
#Open the list1 and read it into the set1
f=open(sys.argv[1], 'r')
set1 = Set(f.readlines())
f.close() 
 
dic1 = dict([(k, v) for v, k in enumerate(set1)])
print dic1
 
 
#Open the list2 and read it into the set2
f=open(sys.argv[2], 'r')
set2 = Set(f.readlines())
f.close() 
 
dic2 = dict([(l, w) for w, l in enumerate(set2)])
print dic2
 
#Find Delta
diff1 = set1 - set2
diff2 = set2 - set1
 
#set1-=set2 
 
#Dump delta
 
f=open(sys.argv[1] + '_NOTIN_'+ sys.argv[2] + '_.txt', 'w')
f.writelines(diff1)
f.close()
 
f=open(sys.argv[2] + '_NOTIN_'+ sys.argv[1] + '_.txt', 'w')
f.writelines(diff2)
f.close()
----------------------------------------------------
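
A likely reason code1.py does not produce the expected dictionary is that split() with no argument splits on whitespace rather than on commas. The short Python 3 sketch below uses the first line of file01.txt from the question to show the difference; the variable names are only illustrative.

```python
line = "00650260048_c,,,The Pink Sheet"

# split() with no argument splits on runs of whitespace, as code1.py does
by_whitespace = line.split()
print(by_whitespace)   # ['00650260048_c,,,The', 'Pink', 'Sheet']

# split(",") splits on the actual field delimiter
by_comma = line.split(",")
print(by_comma)        # ['00650260048_c', '', '', 'The Pink Sheet']

# With the comma split, the first field is the id and the
# untouched line can be stored as the value.
h = {by_comma[0]: line}
print(h["00650260048_c"])
```

With the whitespace split, the key becomes '00650260048_c,,,The' and the value 'Pink', so the lookup h[l[0]] in the second loop raises KeyError for any id that was not stored under exactly that key.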

 
Dushan De Silva (Author) Commented:
I'm also not sure how to get these CSV values into a dictionary using the following script with the csv module.
import csv
filename = "file"
reader = csv.reader(open(filename),delimiter=',')
writer = csv.writer( open("newfile.csv","wb") )
for row in reader: 
    print row #stdout
    writer.writerow(row) #write to newfile.csv

 
ghostdog74 Commented:
If both files use a comma as the delimiter, then split on the comma, e.g. line.split(",").
If you want to use the first column as the key and the whole line as the value:

h[line[0]] = ','.join(line)
 
Dushan De Silva (Author) Commented:
Thanks! I already tried that by changing code1.py.
 
ghostdog74 Commented:
>> and I'm not sure how to get these csv values to dictionary using following script with csv module

I am not sure why you are having so much difficulty.



import csv
filename = "file"
reader = csv.reader(open(filename),delimiter=',')
h={}
for row in reader:
    print row[0]
    print ','.join(row[1:])
    h[row[0]]=','.join(row[1:])  
print h
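
Note that the snippet above stores only the fields after the id as the value. If the whole line is wanted, as in the original question, a Python 3 variant might look like the sketch below. The function name load_csv_by_id is hypothetical, and re-joining with "," reproduces the original line only when no field needs CSV quoting, which holds for the sample data shown in the question.

```python
import csv
import io


def load_csv_by_id(handle):
    """Map the first column of each CSV row to the whole row, re-joined with commas."""
    table = {}
    for row in csv.reader(handle, delimiter=","):
        if not row:
            continue  # csv.reader yields an empty list for a blank line
        table[row[0]] = ",".join(row)
    return table


# In-memory stand-in for file01.txt; with a real file, pass
# open(path, newline="") instead.
sample = io.StringIO(
    "00650260048_c,,,The Pink Sheet\n"
    "\n"
    "00630360023,relatedDocs,00630220023,The Pink Sheet\n"
)
h = load_csv_by_id(sample)
print(h["00630360023"])
```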

 
Dushan De Silva (Author) Commented:

import csv
import sys
 
f1 = open(sys.argv[1], 'rt')
f2 = open(sys.argv[2], 'rt')
f3 = open(sys.argv[3], 'rt')
#f1 = csv.reader(open(sys.argv[1]),delimiter=',')
h={}
try:
    reader1 = csv.reader(f1, delimiter=',')
    reader2 = csv.reader(f2, delimiter=',')
    reader3 = csv.reader(f3, delimiter=',')
    for row1 in reader1:
#       print row1[0]
#       print ','.join(row1[1:])
#       h[row1[0]]=','.join(row1[1:])
#       print h
            print row1[1:]
#       if row1[0]!= "":
            for row2 in reader2:
#               if row2[0]!= "":
                    if row2[0] in row1[0]:
#                       if row2[1:] in row1[1:]:
                           print row2[0]
#           print row2[1:]
 
finally:
    f1.close()
    f2.close()
    f3.close()
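
Two things are worth checking in an attempt like the one above. A csv.reader is a one-pass iterator, so the inner loop over reader2 only yields rows during the first pass of the outer loop. Also, `row2[0] in row1[0]` is a substring test on two strings, not an equality test. The Python 3 sketch below illustrates both points with made-up in-memory data (the ids "a" and "b" are placeholders, not values from the thread).

```python
import csv
import io

outer = io.StringIO("a,1\nb,2\n")
inner = io.StringIO("a,1\nb,3\n")

# Nested iteration over a reader: the inner reader is consumed
# completely during the first outer pass.
reader2 = csv.reader(inner)
passes = []
for row1 in csv.reader(outer):
    matched = [row2 for row2 in reader2 if row2[0] == row1[0]]
    passes.append((row1[0], matched))
print(passes)  # [('a', [['a', '1']]), ('b', [])]

# One way around it: read the second file into a dict once,
# then look each id up directly.
outer.seek(0)
inner.seek(0)
second = {row[0]: row for row in csv.reader(inner) if row}
fixed = [(row1[0], second.get(row1[0])) for row1 in csv.reader(outer) if row1]
print(fixed)   # [('a', ['a', '1']), ('b', ['b', '3'])]

# Substring test versus equality on ids
print("0001" in "000105d2")   # True, although the ids differ
print("0001" == "000105d2")   # False
```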

 
Dushan De Silva (Author) Commented:
Thanks for your help! I came up with the following solution. :)
http://www.experts-exchange.com/Programming/Languages/Scripting/Python/Q_24545046.html

BR Dushan