ToDictionary method in C#

Dear all,
I have two files with the same rows. The second column in each file is the same, and some other columns are the same too. I want to create a dictionary that combines the information from these two files. Please see the sample data.
File 1
1	bs1_533	0	533	C	G
1	bs1_41342	0	41342	A	T
1	bs1_41791	0	41791	A	G
1	bs1_44449	0	44449	C	T

File 2
   1         bs1_533    C    G         0.05      120
   1       bs1_41342    A    T       0.2417      120
   1       bs1_41791    A    G      0.04167      120
   1       bs1_44449    C    T      0.01667      120

For example, I want to get a Dictionary<string, string>:
dict["bs1_533"] = "bs1_533*1*533*C*G*0.05";

Thanks.
zhshqzyc Asked:
zhshqzyc (Author) Commented:
I want to use the ToDictionary method.
Mike Tomlinson (High School Computer Science, Computer Applications, and Mathematics Teachers) Commented:
The files appear to be fixed-width...can you provide a specification?

What happened to the "0" and "120" values?  If you only want certain columns, then which ones?

"The second column in each file is the same."

But the File1 column appears to be left justified and the File2 column appears to be right justified.

If you want help then you need to provide DETAILED information about the layout of the files and exactly what you want to extract along with the order of those items in the output.
zhshqzyc (Author) Commented:
The columns are separated by tabs. Don't worry about justification. Please ignore the 0 and 120 columns. I just want to extract useful columns.
Mike Tomlinson Commented:
"Don't worry about justification."

We have to know the format of the data so we can parse it and make matches...it's not magic.  =)

"I just want to extract useful columns."

Only YOU know what is "useful".  Therefore you have to tell us exactly which columns those are...
zhshqzyc (Author) Commented:
The format is like this:
1\tbs1_533\t0\t533\tC\tG

And
1\tbs1_533\tC\tG\t0.05\t120

So for the first data set, ignore column 2 (0-based index).
For the second data set, ignore the last column. Then combine all the remaining items.
The key is column 1 in data set 1. The value is
column 0, column 1, column 3, column 4 and column 5 in file 1, plus column 4 in file 2.
Mike Tomlinson Commented:
*untested*

Try something like this:

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;

namespace WindowsFormsApplication1
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        private void button1_Click(object sender, EventArgs e)
        {
            string FileName1 = @"c:\some path\file1.ext";
            string FileName2 = @"c:\some path\file2.ext";

            // these will hold the raw values (as a string array split on \t) from the files using column #1 (zero based) as the key
            Dictionary<string, string[]> data1 = new Dictionary<string,string[]>();
            Dictionary<string, string[]> data2 = new Dictionary<string,string[]>();

            string[] values;
            foreach (string line in System.IO.File.ReadAllLines(FileName1))
            {
                if (line.Trim().Length > 0)
                {
                    values = line.Split('\t');
                    if (values.Length >= 6)
                    {
                        data1[values[1].Trim()] = values;
                    }
                }
            }
            foreach (string line in System.IO.File.ReadAllLines(FileName2))
            {
                if (line.Trim().Length > 0)
                {
                    values = line.Split('\t');
                    if (values.Length >= 6)
                    {
                        data2[values[1].Trim()] = values;
                    }
                    else
                    {
                        // line had invalid number of values in it!
                    }
                }
            }

            string data;
            Dictionary<string, string> combined = new Dictionary<string, string>();
            foreach (KeyValuePair<string, string[]> kvp in data1)
            {
                if (data2.ContainsKey(kvp.Key))
                {
                    // The key is the column1 in data set 1. The value is
                    // column 0, column 1, column 3, column 4 and 5 in file 1 plus column 4 in file 2.
                    data = kvp.Value[0] + "*" + kvp.Value[1] + "*" + kvp.Value[3] + "*" + kvp.Value[4] + "*" + kvp.Value[5] + "*" + data2[kvp.Key][4];
                    combined[kvp.Key] = data;
                }
                else
                {
                    // key in file1 did not have a matching key entry in file2!
                }
            }
        }

    }
}

zhshqzyc (Author) Commented:
Thanks. I have used your method before. What I want is to use the ToDictionary method in LINQ. That is the purpose of my question.
Mike Tomlinson Commented:
Gotcha...this was definitely a learning experience for me.

I combined these two MSDN examples:
http://msdn.microsoft.com/en-us/library/bb882647.aspx
http://msdn.microsoft.com/en-us/library/bb549277.aspx

To come up with:
private void button1_Click(object sender, EventArgs e)
{
    string FileName1 = @"C:\Users\Mike\Documents\Test\file1.txt";
    string FileName2 = @"C:\Users\Mike\Documents\Test\file2.txt";

    Dictionary<string, string> data =
        (from line1 in System.IO.File.ReadAllLines(FileName1)
         let fields1 = line1.Split('\t')
         from line2 in System.IO.File.ReadAllLines(FileName2)
         let fields2 = line2.Split('\t')
         where fields1[1] == fields2[1]
         select new
         {
             Key = fields1[1],
             Value = fields1[0] + "*" + fields1[1] + "*" + fields1[3]
                 + "*" + fields1[4] + "*" + fields1[5] + "*" + fields2[4]
         }).ToDictionary(p => p.Key, p => p.Value);

    foreach (KeyValuePair<string, string> kvp in data)
    {
        Console.WriteLine(kvp.Key + ", " + kvp.Value);
    }
}
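
A possible refinement, offered as a sketch rather than a tested drop-in: the nested "from" above evaluates ReadAllLines(FileName2) once for every line of file 1, and a blank line would make fields1[1] throw. A LINQ join reads each source once and matches on the key column. The class name SnpMerge and the method Combine are hypothetical names chosen for illustration, and trimming every field is an assumption based on the padded sample data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SnpMerge
{
    // Hypothetical helper: joins the lines of file 1 and file 2 on column 1 (zero-based)
    // and builds "col0*col1*col3*col4*col5" from file 1 plus column 4 from file 2.
    public static Dictionary<string, string> Combine(
        IEnumerable<string> lines1, IEnumerable<string> lines2)
    {
        // Split file 2 once, ignoring blank lines.
        var rows2 = lines2
            .Where(line => line.Trim().Length > 0)
            .Select(line => line.Split('\t'));

        return (from line1 in lines1
                where line1.Trim().Length > 0          // ignore blank lines in file 1
                let f1 = line1.Split('\t')
                join f2 in rows2 on f1[1].Trim() equals f2[1].Trim()
                select new
                {
                    Key = f1[1].Trim(),
                    Value = string.Join("*",
                        f1[0].Trim(), f1[1].Trim(), f1[3].Trim(),
                        f1[4].Trim(), f1[5].Trim(), f2[4].Trim())
                }).ToDictionary(p => p.Key, p => p.Value);
    }
}
```

It could be called as SnpMerge.Combine(System.IO.File.ReadAllLines(FileName1), System.IO.File.ReadAllLines(FileName2)). Note that ToDictionary throws an ArgumentException if the same key is produced twice, so this assumes each SNP name appears only once per file.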

Mike Tomlinson Commented:
Also used this Anonymous Types page:
http://msdn.microsoft.com/en-us/library/bb397696.aspx
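
For a simpler, single-file case the anonymous type is optional, because ToDictionary accepts a key selector and an element selector directly. The sketch below illustrates that overload on in-memory lines; the class name DirectSelectors and the method ByColumn1 are hypothetical, and the value format is shortened purely for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DirectSelectors
{
    // Hypothetical helper: keys each tab-separated line by column 1 (zero-based)
    // and stores "col0*col3" as the value, without an intermediate anonymous type.
    public static Dictionary<string, string> ByColumn1(IEnumerable<string> lines)
    {
        return lines
            .Select(line => line.Split('\t'))
            .ToDictionary(f => f[1], f => f[0] + "*" + f[3]);
    }
}
```

The anonymous type remains convenient in the two-file query, where the key and value come from different range variables.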
zhshqzyc (Author) Commented:
Thank you very much. One more question: if the second file has a header in the first line, I want to skip it. What is the code for that?
 CHR             SNP   A1   A2          MAF  NCHROBS
   1         bs1_533    C    G         0.05      120
   1       bs1_41342    A    T       0.2417      120
   1       bs1_41791    A    G      0.04167      120
   1       bs1_44449    C    T      0.01667      120

Mike Tomlinson Commented:
Use the Skip() method after you read the lines from the second file:
http://msdn.microsoft.com/en-us/library/bb357513.aspx

So change this:

    from line2 in System.IO.File.ReadAllLines(FileName2)

To:

    from line2 in (System.IO.File.ReadAllLines(FileName2)).Skip(1)
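
To see the effect of Skip(1) in isolation, here is a small sketch that works on in-memory lines instead of a file. The class name HeaderSkipDemo and the method SnpNames are hypothetical, and the blank-line filter is an extra assumption that the original query does not make.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class HeaderSkipDemo
{
    // Hypothetical helper: returns column 1 (zero-based) of every data row,
    // ignoring the first line, which is assumed to be the header.
    public static List<string> SnpNames(IEnumerable<string> lines)
    {
        return lines
            .Skip(1)                                  // drop the header line
            .Where(line => line.Trim().Length > 0)    // ignore blank lines
            .Select(line => line.Split('\t')[1].Trim())
            .ToList();
    }
}
```

Skip(1) on an empty sequence simply yields an empty sequence, so an empty file does not need special handling.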