Expiring Today—Celebrate National IT Professionals Day with 3 months of free Premium Membership. Use Code ITDAY17

x
?
Solved

Cygwin / Bash - Join command not working as expected

Posted on 2010-09-10
4
Medium Priority
?
897 Views
Last Modified: 2013-11-15
I have two CSV files I need to combine.
Both of them share a key column (ID), but each file has unique columns otherwise.
Not all of the IDs in the first file have matches in the second file.  

I would like the output to include ALL rows from the first file whether they match or not.  The rows that did have a match need the additional columns appended.

This seems to be a fairly common question floating around on the internet. The most common solution I found was to use 'join'.  However, it isn't behaving as I was expecting.



$ ls
test1.csv  test2.csv

$ cat test1.csv
22,Charles,Stross,388
33,William,Shakespeare,389
44,Michael,Jackson,385
55,Tom,Green,103
58,Edgar,Poe,22
66,John,Doe,938
99,Jane,Doe,2388

$ cat test2.csv
22,Stories,10/27/1983
33,Plays,12/31/1980
44,Music,7/23/1937

$ join -a 1 -t , test1.csv test2.csv
,Stories,10/27/198388
,Plays,12/31/1980peare,389
,Music,7/23/1937on,385
55,Tom,Green,103
58,Edgar,Poe,22
66,John,Doe,938
99,Jane,Doe,2388


The output seems to be dropping the "key" value from any rows that match, and I have no idea why the text is overlapping (for example, the Date on output Line 3 is overwriting the name).

Any suggestions?  
0
Comment
Question by:Proots
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 2
  • 2
4 Comments
 
LVL 84

Expert Comment

by:ozo
ID: 33651437
join -a 1 -t , test1.csv test2.csv
22,Charles,Stross,388,Stories,10/27/1983
33,William,Shakespeare,389,Plays,12/31/1980
44,Michael,Jackson,385,Music,7/23/1937
55,Tom,Green,103
58,Edgar,Poe,22
66,John,Doe,938
99,Jane,Doe,2388

is it possible you have an extra \r character at the end of the lines in test1.csv?
what do you see with
join -a 1 -t , test1.csv test2.csv | cat -v
0
 
LVL 2

Author Comment

by:Proots
ID: 33652073
$ join -a 1 -t , test1.csv test2.csv
,Stories,10/27/198388
,Plays,12/31/1980peare,389
,Music,7/23/1937on,385
55,Tom,Green,103
58,Edgar,Poe,22
66,John,Doe,938
99,Jane,Doe,2388


$ join -a 1 -t , test1.csv test2.csv | cat -v
22,Charles,Stross,388^M,Stories,10/27/1983^M
33,William,Shakespeare,389^M,Plays,12/31/1980^M
44,Michael,Jackson,385^M,Music,7/23/1937
55,Tom,Green,103^M
58,Edgar,Poe,22^M
66,John,Doe,938^M
99,Jane,Doe,2388
0
 
LVL 84

Accepted Solution

by:
ozo earned 200 total points
ID: 33652098
So that's why the text is overlapping.
you may be able to remove it with
tr -d '\r'
0
 
LVL 2

Author Closing Comment

by:Proots
ID: 33653046
Ozo, you are a saint!
It works now; here is the entirety of what I had to do:

cat test1.csv | tr -d '\r' > testa.csv
cat test2.csv | tr -d '\r' > testb.csv
join -a 1 -t , testa.csv testb.csv
0

Featured Post

Free Tool: ZipGrep

ZipGrep is a utility that can list and search zip (.war, .ear, .jar, etc) archives for text patterns, without the need to extract the archive's contents.

One of a set of tools we're offering as a way to say thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

If you're a modern-day technology professional, you may be wondering if certifications are really necessary. They are. Here's why.
Are you an Exchange administrator employed with an organization? And, have you encountered a corrupt Exchange database due to which you are not able to open its EDB file. This article will explain all the steps to repair corrupt Exchange database.
The viewer will learn how to successfully download and install the SARDU utility on Windows 7, without downloading adware.
This video Micro Tutorial shows how to password-protect PDF files with free software. Many software products can do this, such as Adobe Acrobat (but not Adobe Reader), Nuance PaperPort, and Nuance Power PDF, but they are not free products. This vide…

719 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question