zhshqzyc
asked on
Merge files with ToDictionary method
I have three files with many rows and columns. The following is a simple example extracted from the files.
File 1
rs1107199
LY-P1_A01 G
LY-P1_A02 G
LY-P1_A03 G
File 2
rs10078294
LY-P1_A01 AG
LY-P1_A02 G
LY-P1_A03 AG
File 3
rs1117330
LY-P1_A01 C
LY-P1_A02 C
LY-P1_A03 C
Please note that the field delimiter is a tab. The three files have the same number of rows. The first-row, first-column item in each file is empty. I want to combine the files so that the final file looks like this:
rs1107199 rs10078294 rs1117330
LY-P1_A01 G AG C
LY-P1_A02 G G C
LY-P1_A03 G AG C
So I used the following code:
string[] lines1 = File.ReadAllLines(fname1);
string[] lines2 = File.ReadAllLines(fname2);
string[] lines3 = File.ReadAllLines(fname3);
Dictionary<string, string> data =
    (from line1 in lines1
     let fields1 = line1.Split('\t')
     from line2 in lines2
     let fields2 = line2.Split('\t')
     from line3 in lines3
     let fields3 = line3.Split('\t')
     where (fields1[0] == fields2[0] && fields1[0] == fields3[0])
     select new
     {
         Key = fields1[0],
         Value = line1 + '\t' + line2 + '\t' + line3
     }).ToDictionary(p => p.Key, p => p.Value);
However, I am not confident in it because there is an empty field in each file. Thanks for your help.
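On the empty-field concern: `Split('\t')` on a line that starts with a tab returns an empty string as its first element, and an empty string is a legal dictionary key (only `null` is rejected). So `ToDictionary` should accept the header row as long as no other row also has an empty first field. A minimal sketch, assuming every line has at least one field; the class name `FirstColumnIndex` and the method `IndexByFirstColumn` are illustrative and not from the thread:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class FirstColumnIndex
{
    // Maps each line's first field (possibly empty) to the rest of the line.
    // ToDictionary throws ArgumentException if two lines share a first field.
    public static Dictionary<string, string> IndexByFirstColumn(IEnumerable<string> lines)
    {
        return lines
            .Select(line => line.Split('\t'))
            .ToDictionary(
                fields => fields[0],
                fields => string.Join("\t", fields.Skip(1).ToArray()));
    }
}
```

For example, `FirstColumnIndex.IndexByFirstColumn(File.ReadAllLines(fname1))` would store the header row under the key `""`.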
This is my take on what you would like to do. Please let me know if I misinterpreted the requirement = )
Dictionary<string, string> data = (from line1 in lines1
                                   let fields1 = line1.Split('\t')
                                   from line2 in lines2
                                   let fields2 = line2.Split('\t')
                                   from line3 in lines3
                                   let fields3 = line3.Split('\t')
                                   where (fields1[0] == fields2[0] && fields1[0] == fields3[0])
                                   select new
                                   {
                                       Key = fields1[0],
                                       // Drop the key and the tab after it from each line, so the
                                       // values are separated by exactly one tab (header row included).
                                       Value = line1.Substring(fields1[0].Length + 1) + '\t' +
                                               line2.Substring(fields2[0].Length + 1) + '\t' +
                                               line3.Substring(fields3[0].Length + 1)
                                   }).ToDictionary(p => p.Key, p => p.Value);
File.WriteAllLines("output.txt", data.Select(item => item.Key + '\t' + item.Value).ToArray());
ASKER
Okay. Actually, my files have many columns, not just the two in the example.
How should I modify the code?
Please provide some sample data with a couple other columns. Also, provide the expected output.
Thanks.
ASKER
I mean that there are many columns. The example used just two columns to demonstrate.
col1 col2 col3 col4 col5 col6 col7 col8 col9
LY-P1_A01 AG G GT C G T G GA GA
LY-P1_A02 G GA GT T G A A
So you can't enumerate them all as fields[1], fields[2], etc., because you don't know how many columns there are.
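When the column count is unknown, individual fields do not need to be indexed at all: everything after the first tab can be kept as a single string. A small sketch of that idea; `FieldHelper.DropFirstField` is a hypothetical helper name, not something from the thread:

```csharp
public static class FieldHelper
{
    // Returns the part of a tab-delimited line that follows the first field,
    // however many columns that is. A line without any tab yields "".
    public static string DropFirstField(string line)
    {
        int firstTab = line.IndexOf('\t');
        return firstTab < 0 ? string.Empty : line.Substring(firstTab + 1);
    }
}
```

This also covers the header row, where the first field is empty and the line starts with the tab itself.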
ASKER
I guess
var query = from line1 in lines1
            let fields1 = line1.Split('\t')
            from line2 in lines2
            let fields2 = line2.Split('\t')
            from line3 in lines3
            let fields3 = line3.Split('\t')
            where (fields1[0] == fields2[0] && fields1[0] == fields3[0])
            select new
            {
                Content = line1 + '\t' + fields2.Skip(1) + '\t' + fields3.Skip(1)
            };
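Two things are worth checking in a query of this shape. First, concatenating a string with the result of `Skip` calls `ToString()` on the sequence object, which produces a type name rather than the fields; `string.Join` is the usual way to turn the remaining fields back into tab-separated text. Second, three nested `from` clauses compare every line of each file with every line of the other two, whereas a `join` clause matches on the key directly. A sketch under those assumptions (`TabMerge.Merge` is an illustrative name, and each key is assumed to appear once per file):

```csharp
using System.Collections.Generic;
using System.Linq;

public static class TabMerge
{
    // Joins three sets of tab-delimited lines on their first field and
    // appends the remaining fields of files 2 and 3 to each line of file 1.
    public static IEnumerable<string> Merge(string[] lines1, string[] lines2, string[] lines3)
    {
        return from line1 in lines1
               let fields1 = line1.Split('\t')
               join line2 in lines2 on fields1[0] equals line2.Split('\t')[0]
               join line3 in lines3 on fields1[0] equals line3.Split('\t')[0]
               select line1 + "\t"
                   + string.Join("\t", line2.Split('\t').Skip(1).ToArray()) + "\t"
                   + string.Join("\t", line3.Split('\t').Skip(1).ToArray());
    }
}
```

The result could then be written with `File.WriteAllLines("output.txt", TabMerge.Merge(lines1, lines2, lines3).ToArray());`. Rows come out in the order of the first file, and the header row is matched like any other row because its key is the empty string in all three files.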
ASKER CERTIFIED SOLUTION
This solution is only available to members.
SOLUTION
This solution is only available to members.
ASKER
Thank you. Tomorrow I will post a similar but harder question. Hopefully I will meet you there.