Hi,
I have two files, File1 and File2, as shown below:
File1
A | APPLE
B | ORANGE
File2
A | 10
B | 15
D | 20
A | 10
I need the following output:
Output 1
A | APPLE | 10
B | ORANGE | 15
But I am getting the output below:
A | APPLE | 10
B | ORANGE | 15
A | APPLE | 10
How can I remove the duplicate rows from the output and write only the duplicate rows to a new file?
My code is as follows:
import pandas as pd
df1 = pd.read_csv('file1.txt', sep='|')
df2 = pd.read_csv('file2.txt', sep='|')
Merge12 = pd.merge(df1, df2, how='left', on='A')
Merge12.to_csv('output.txt')
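You can remove the duplicate rows in one of two ways: either call drop_duplicates() on df1 and df2 before merging them, or call it on the resulting DataFrame after the merge. Here the repeated row comes from File2, where "A | 10" appears twice, so the left merge matches A twice.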
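As a rough sketch of the second approach (deduplicating after the merge), something like the following should work. Note the assumptions: the sample data is inlined so the snippet runs on its own, the column names key, name and qty are made up because the files have no header row, and duplicates.txt is a hypothetical name for the new file.

```python
import io
import pandas as pd

# Sample data copied from the question, inlined so the sketch is self-contained.
FILE1 = "A | APPLE\nB | ORANGE\n"
FILE2 = "A | 10\nB | 15\nD | 20\nA | 10\n"

def read_pipe(text, names):
    # header=None because the files have no header row; the regex separator
    # also swallows the spaces around each '|'. Column names are assumptions.
    return pd.read_csv(io.StringIO(text), sep=r"\s*\|\s*", header=None,
                       names=names, engine="python")

df1 = read_pipe(FILE1, ["key", "name"])
df2 = read_pipe(FILE2, ["key", "qty"])

merged = pd.merge(df1, df2, how="left", on="key")

# duplicated() flags every repeat of a row after its first occurrence.
is_dup = merged.duplicated(keep="first")
unique_rows = merged[~is_dup]      # same result as merged.drop_duplicates()
duplicate_rows = merged[is_dup]    # only the repeated rows

# 'duplicates.txt' is a hypothetical file name for the duplicate rows.
unique_rows.to_csv("output.txt", sep="|", index=False, header=False)
duplicate_rows.to_csv("duplicates.txt", sep="|", index=False, header=False)
```

With real files, replace io.StringIO(text) with the file path. If you would rather have every copy of a repeated row in the duplicates file (not just the second and later ones), duplicated(keep=False) flags them all.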