Link to home
Start Free TrialLog in
Avatar of Member_2_5990268
Member_2_5990268

asked on

Cygwin / Bash - Join command not working as expected

I have two CSV files I need to combine.
Both of them share a key column (ID), but each file has unique columns otherwise.
Not all of the IDs in the first file have matches in the second file.  

I would like the output to include ALL rows from the first file whether they match or not.  The rows that did have a match need the additional columns appended.

This seems to be a fairly common question floating around on the internet. The most common solution I found was to use 'join'.  However, it isn't behaving as I was expecting.



$ ls
test1.csv  test2.csv

$ cat test1.csv
22,Charles,Stross,388
33,William,Shakespeare,389
44,Michael,Jackson,385
55,Tom,Green,103
58,Edgar,Poe,22
66,John,Doe,938
99,Jane,Doe,2388

$ cat test2.csv
22,Stories,10/27/1983
33,Plays,12/31/1980
44,Music,7/23/1937

$ join -a 1 -t , test1.csv test2.csv
,Stories,10/27/198388
,Plays,12/31/1980peare,389
,Music,7/23/1937on,385
55,Tom,Green,103
58,Edgar,Poe,22
66,John,Doe,938
99,Jane,Doe,2388


The output seems to be dropping the "key" value from any rows that match, and I have no idea why the text is overlapping (for example, the Date on output Line 3 is overwriting the name).

Any suggestions?  
Avatar of ozo
ozo
Flag of United States of America image

join -a 1 -t , test1.csv test2.csv
22,Charles,Stross,388,Stories,10/27/1983
33,William,Shakespeare,389,Plays,12/31/1980
44,Michael,Jackson,385,Music,7/23/1937
55,Tom,Green,103
58,Edgar,Poe,22
66,John,Doe,938
99,Jane,Doe,2388

is it possible you have an extra \r character at the end of the lines in test1.csv?
what do you see with
join -a 1 -t , test1.csv test2.csv | cat -v
Avatar of Member_2_5990268
Member_2_5990268

ASKER

$ join -a 1 -t , test1.csv test2.csv
,Stories,10/27/198388
,Plays,12/31/1980peare,389
,Music,7/23/1937on,385
55,Tom,Green,103
58,Edgar,Poe,22
66,John,Doe,938
99,Jane,Doe,2388


$ join -a 1 -t , test1.csv test2.csv | cat -v
22,Charles,Stross,388^M,Stories,10/27/1983^M
33,William,Shakespeare,389^M,Plays,12/31/1980^M
44,Michael,Jackson,385^M,Music,7/23/1937
55,Tom,Green,103^M
58,Edgar,Poe,22^M
66,John,Doe,938^M
99,Jane,Doe,2388
ASKER CERTIFIED SOLUTION
Avatar of ozo
ozo
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Ozo, you are a saint!
It works now; here is the entirety of what I had to do:

cat test1.csv | tr -d '\r' > testa.csv
cat test2.csv | tr -d '\r' > testb.csv
join -a 1 -t , testa.csv testb.csv