Solved

Read two csv files to two dictionaries and compare

Posted on 2009-07-04
Last Modified: 2012-08-13
Hi Experts,
I have the following three files.
------------------------------------file01.txt---------------------------
00650260048_c,,,The Pink Sheet
05040390072,,,The Tan Sheet
020108d4,,,Health News Daily
02260160016_b,,,The Rose Sheet
00630360023,relatedDocs,00630220023,The Pink Sheet
--------------------------------------------------------------------------

------------------------------------file02.txt----------------------------
000105d2,,,Health News Daily
00650260048_c,,,The Pink Sheet
000105d5,,,Health News Daily
05040390072,,,The Tan Sheet
000106d1,,,Health News Daily
000106d3,,,Health News Daily
000106d4,,,Health News Daily
000106d6,,,Health News Daily
--------------------------------------------------------------------------

-------------------------both.txt----------------------------------------
00650260048_c
05040390072
020108d4
02260160016_b
00630360023
000105d2
00650260048_c
000105d5
05040390072
000106d1
000106d3
000106d4
000106d6
--------------------------------------------------------------------------
1.) Load file01.txt into a dictionary, with the key being id (the first value before the comma) and the value being the entire line of the file.  
2.) Load file02.txt into a dictionary, with the key being the id (the first value before the comma) and the value being the entire line of the file.
3.) For all of the ids in both.txt, determine which ids (keys) have different values in file01.txt and file02.txt.

Could you kindly help me write this Python script? The final script might be enough. I will go through it and ask questions if I have any.  :)
BR Dushan.
Question by:Dushan911
 
LVL 9

Expert Comment

by:ghostdog74
ID: 24778923
Since you know what you want to do, why don't you start writing your code? That is, you should put in some effort. Read the docs on how to use dictionaries and so on. Write code, and if you get stuck, come here and ask.
Example 4 here is a small snippet to get you started; for the rest, try to have a go on your own.
 
LVL 17

Author Comment

by:Dushan911
ID: 24778953
I tried the following two scripts, but I couldn't get the values into a dictionary.
1.) code1.py with two dictionaries
2.) code2.py with Sets
#------------code1.py-----------------------------
#usage : python code1.py file01.txt file02.txt
import sys

h={}
for line in open(sys.argv[1]):
    line=line.strip().split()
    print line
    h[line[0]]=line[1]

for line in open(sys.argv[2]):
    line=line.strip()
    l=line.split()
    print line,h[l[0]]
----------------------------------------------------
 

#------------code2.py-------------------------------
#usage : python code2.py file01.txt file02.txt
#! /usr/bin/env python
import sys
import sets
from sets import Set

#Open the list1 and read it into the set1
f=open(sys.argv[1], 'r')
set1 = Set(f.readlines())
f.close()

dic1 = dict([(k, v) for v, k in enumerate(set1)])
print dic1

#Open the list2 and read it into the set2
f=open(sys.argv[2], 'r')
set2 = Set(f.readlines())
f.close()

dic2 = dict([(l, w) for w, l in enumerate(set2)])
print dic2

#Find Delta
diff1 = set1 - set2
diff2 = set2 - set1

#set1-=set2

#Dump delta
f=open(sys.argv[1] + '_NOTIN_'+ sys.argv[2] + '_.txt', 'w')
f.writelines(diff1)
f.close()

f=open(sys.argv[2] + '_NOTIN_'+ sys.argv[1] + '_.txt', 'w')
f.writelines(diff2)
f.close()
----------------------------------------------------

 
LVL 17

Author Comment

by:Dushan911
ID: 24778959
And I'm not sure how to get these CSV values into a dictionary using the following script with the csv module.
import csv

filename = "file"
reader = csv.reader(open(filename),delimiter=',')
writer = csv.writer( open("newfile.csv","wb") )
for row in reader:
    print row #stdout
    writer.writerow(row) #write to newfile.csv

 
LVL 9

Assisted Solution

by:ghostdog74
ghostdog74 earned 500 total points
ID: 24778963
If both files use a comma as the delimiter, then split on the comma, e.g. line.split(",").
If you want to use the first column as the key and the whole line as the value:

h[line[0]] = ','.join(line)
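
The advice above can be sketched as follows. This is a minimal illustration rather than the expert's own script: the two sample lines are copied from file01.txt in the question, and the names `lines`, `raw`, and `fields` are made up for the example. It is written for Python 3, unlike the Python 2 code elsewhere in the thread.

```python
# Two sample lines copied from file01.txt in the question.
lines = [
    "00650260048_c,,,The Pink Sheet\n",
    "00630360023,relatedDocs,00630220023,The Pink Sheet\n",
]

h = {}
for raw in lines:
    line = raw.strip()         # drop the trailing newline
    fields = line.split(",")   # split on commas, not on whitespace
    h[fields[0]] = line        # key: the id, value: the entire line

print(h["00650260048_c"])
```

Calling split() with no argument splits on whitespace instead, which is why code1.py ended up with keys such as "00650260048_c,,,The" rather than the bare id.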
 
LVL 17

Author Comment

by:Dushan911
ID: 24778988
Thanks! I already tried that by changing code1.py.
 
LVL 9

Accepted Solution

by:ghostdog74
ghostdog74 earned 500 total points
ID: 24779166
>> and I'm not sure how to get these csv values to dictionary using following script with csv module

I am not sure why you are having so much difficulty.



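import csv

filename = "file"
reader = csv.reader(open(filename),delimiter=',')
h={}
for row in reader:
    print row[0]                 # the id (first column)
    print ','.join(row[1:])      # the remaining columns, rejoined with commas
    h[row[0]]=','.join(row[1:])  # key: id, value: the rest of the line
print h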
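
For readers on Python 3, a rough equivalent might look like the sketch below. It is not the expert's code: the function name `load_by_id` is hypothetical, the value stored is the whole line (as the question asked) rather than only the trailing columns, and an in-memory `io.StringIO` stands in for a real file so the example runs on its own.

```python
import csv
import io


def load_by_id(fileobj):
    """Return {id: full line} for a comma-separated file object."""
    table = {}
    for row in csv.reader(fileobj, delimiter=","):
        if not row:                    # csv.reader yields [] for blank lines
            continue
        table[row[0]] = ",".join(row)  # rebuild the line from its fields
    return table


# Sample data copied from the question; a real script would instead use
# open(filename, newline="") as recommended by the csv documentation.
sample = io.StringIO("020108d4,,,Health News Daily\n05040390072,,,The Tan Sheet\n")
h = load_by_id(sample)
print(h)
```

Rejoining with commas reproduces the original line only when no field is quoted or contains a comma, which holds for the sample files shown in the question.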

 
LVL 17

Author Comment

by:Dushan911
ID: 24780963

import csv
import sys

f1 = open(sys.argv[1], 'rt')
f2 = open(sys.argv[2], 'rt')
f3 = open(sys.argv[3], 'rt')
#f1 = csv.reader(open(sys.argv[1]),delimiter=',')
h={}
try:
    reader1 = csv.reader(f1, delimiter=',')
    reader2 = csv.reader(f2, delimiter=',')
    reader3 = csv.reader(f3, delimiter=',')
    for row1 in reader1:
#       print row1[0]
#       print ','.join(row1[1:])
#       h[row1[0]]=','.join(row1[1:])
#       print h
        print row1[1:]
#       if row1[0]!= "":
        for row2 in reader2:
#           if row2[0]!= "":
            if row2[0] in row1[0]:
#               if row2[1:] in row1[1:]:
                print row2[0]
#           print row2[1:]

finally:
    f1.close()
    f2.close()
    f3.close()
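
One likely reason the nested loops above print less than expected is that a csv.reader is a one-pass iterator: after the first pass of the outer loop, the inner reader has nothing left to yield. The Python 3 sketch below illustrates this with two lines that appear in both sample files; the variable names are made up for the example, and the dict lookup at the end is one possible alternative, not the asker's final solution.

```python
import csv
import io

# Two lines that appear in both file01.txt and file02.txt in the question.
text1 = "00650260048_c,,,The Pink Sheet\n05040390072,,,The Tan Sheet\n"
text2 = "00650260048_c,,,The Pink Sheet\n05040390072,,,The Tan Sheet\n"

reader1 = csv.reader(io.StringIO(text1))
reader2 = csv.reader(io.StringIO(text2))

# Count how many rows the inner loop sees on each pass of the outer loop.
inner_counts = []
for row1 in reader1:
    inner_counts.append(sum(1 for row2 in reader2))
print(inner_counts)  # [2, 0]: the inner reader is used up after the first pass

# "in" between two strings is a substring test, not an equality test.
print("000105" in "000105d2")  # True

# Loading the second file into a dict first allows one lookup per id.
second = {row[0]: row for row in csv.reader(io.StringIO(text2))}
matches = [row[0] for row in csv.reader(io.StringIO(text1)) if row[0] in second]
print(matches)
```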

 
LVL 17

Author Comment

by:Dushan911
ID: 24811240
Thanks for your help! I came up with the following solution. :)
http://www.experts-exchange.com/Programming/Languages/Scripting/Python/Q_24545046.html

BR Dushan
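
The linked solution is not reproduced in this thread. As a rough illustration only, the three steps from the question could be sketched in Python 3 as below. The helper names (`load_by_id`, `read_ids`, `differing_ids`) are hypothetical, the file contents are pasted in as strings so the example runs on its own, and an id that is missing from one of the files is treated as "different", which is an assumption the question does not spell out.

```python
import csv
import io

# Contents of the three files, copied from the question.
FILE01 = """\
00650260048_c,,,The Pink Sheet
05040390072,,,The Tan Sheet
020108d4,,,Health News Daily
02260160016_b,,,The Rose Sheet
00630360023,relatedDocs,00630220023,The Pink Sheet
"""
FILE02 = """\
000105d2,,,Health News Daily
00650260048_c,,,The Pink Sheet
000105d5,,,Health News Daily
05040390072,,,The Tan Sheet
000106d1,,,Health News Daily
000106d3,,,Health News Daily
000106d4,,,Health News Daily
000106d6,,,Health News Daily
"""
BOTH = """\
00650260048_c
05040390072
020108d4
02260160016_b
00630360023
000105d2
00650260048_c
000105d5
05040390072
000106d1
000106d3
000106d4
000106d6
"""


def load_by_id(fileobj):
    """Steps 1 and 2: map each id (first column) to its full line."""
    table = {}
    for row in csv.reader(fileobj, delimiter=","):
        if row:                        # skip blank lines
            table[row[0]] = ",".join(row)
    return table


def read_ids(fileobj):
    """Read one id per line, ignoring blank lines."""
    return [line.strip() for line in fileobj if line.strip()]


def differing_ids(first, second, ids):
    """Step 3: ids whose lines differ; a missing id counts as different."""
    seen = set()
    result = []
    for key in ids:
        if key in seen:                # both.txt lists some ids twice
            continue
        seen.add(key)
        if first.get(key) != second.get(key):
            result.append(key)
    return result


first = load_by_id(io.StringIO(FILE01))
second = load_by_id(io.StringIO(FILE02))
changed = differing_ids(first, second, read_ids(io.StringIO(BOTH)))
for key in changed:
    print(key, "|", first.get(key), "|", second.get(key))
```

With real files, each `io.StringIO(...)` would be replaced by an opened file, for example `open(sys.argv[1], newline="")`. Using `dict.get` returns None for an absent id, so the comparison does not raise KeyError.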