asked on

need help with algorithm comparing two datasets

heya,

having a hard time comparing two datasets and producing the difference between the two.

For example, let's say I have two dataSets called DS1 and DS2.

DS1 contains the following rows:
Buckeyes
Wolverines <---not in DS2
Hoosiers
Hawkeyes
Boilermakers

DS2 contains the following rows:
Buckeyes
Hoosiers
Hawkeyes
Boilermakers
WildCats <----not in DS1

how can I create a dataSet called dsResult that will contain the following rows:
Wolverines
Wildcats

thanks you very much

brdrok

ASKER

below is my first attemp but wasn't successful...

DataSet dsAccess = new DataSet();
DataSet dsSql = new DataSet();
DataSet dsResult = new DataSet();

DataTable tblResult = new DataTable("Result");
DataColumn colResult = new DataColumn("colResultNumber", System.Type.GetType("System.String"));
DataColumn colResultName = new DataColumn("colResultName", System.Type.GetType("System.String"));
tblResult.Columns.Add(colResult);
tblResult.Columns.Add(colResultName);
dsResult.Tables.Add(tblResult);

OleDbConnection accessConn = new OleDbConnection(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\Documents and Settings\toll\Desktop\SWOL_ProjDataExtract_2.mdb;Persist Security Info=False");
SqlConnection sqlConn = new SqlConnection(@"Server=xxx.xx.x.xxx;Database=SWOL2000;User ID=xxxxx;Password=xxxxx;Trusted_Connection=False");

OleDbDataAdapter daAccess = new OleDbDataAdapter("SELECT * FROM TblActiveSurveyProjsWithProfCtrAndNameFinal", accessConn);
SqlDataAdapter daSql = new SqlDataAdapter("SELECT * FROM Project", sqlConn);

daAccess.Fill(dsAccess, "Access");
daSql.Fill(dsSql, "SQL");

dgAccess.DataSource = dsAccess;
dgSql.DataSource = dsSql;

foreach(DataRow sqlDr in dsSql.Tables[0].Rows)
{
      foreach(DataRow accessDr in dsAccess.Tables[0].Rows)
      {
            if(sqlDr[2].ToString() == accessDr[0].ToString())
            {
                  DataRow newRow = dsResult.Tables[0].NewRow();
                  newRow[0] = accessDr[0].ToString();
                  newRow[1] = accessDr[2].ToString();
                  dsResult.Tables[0].Rows.Add(newRow);
            }
      }
}

dgResult.DataSource = dsResult;

ASKER CERTIFIED SOLUTION

Diego Pazos

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

Diego Pazos

Forgot to say: this code uses the first table in each dataset (by indexing them), and queries the field named 'Text'. Change this to suit your case.

brdrok

ASKER

thanks very much for your response....

I get an error that says:
Additional information: Cannot find column [team1] at the follwoing line:
DRA = DTAux.Select("team1='" + DR["team1"] + "'");

perhaps I am not building my dataColumsn, dataTable not correctly? here is what I got so far:

//this assume i have a dataset called ds1
//create a new data table
DataTable ds1Table = new DataTable("table1");
//create a data column and name it "team1"
DataColumn colTeam1 = new DataColumn("team1", System.Type.GetType("System.String"));
//add column to ds1Table table
ds1Table.Columns.Add(colTeam1);
//add table to my dataset ds1
ds1.Tables.Add(ds1Table);

//create a new row based on table[0] of ds1
DataRow dr = ds1.Tables[0].NewRow();
//assign some vallue
dr[0] = "Buckeyes";
//add the new row to the table
ds1.Tables[0].Rows.Add(dr);

//do that for other teams...etc.

brdrok

ASKER

nevermind...got it to run w/o errors...will keep you posted ont he outcome...

brdrok

ASKER

hopefully didn't jinx myself....but it seems to work GREAT.

thanks a bunch

btw...posted the same question in the asp.net section for 50 points at the following link...not sure how to assign those points to you as well w/o you adding a comment...

https://www.experts-exchange.com/questions/21422037/need-help-with-algorithm-comparing-two-datasets-link.html

NipNFriar_Tuck

One thing you may want to try is to convert the two datasets to XML strings then compare the strings. This works very well and depending on the size of the dataset may be faster. The one caviot that I ran into deals with an empty string and null.

gregoryyoung

just out of curiosity how many rows are we talking here in real life ?

also can their be duplicates i.e.

table 1
wolverines
wolverines
wolverines
wolverines
tanks
tanks

table 2
wolverines
flying fish

this should produce Tanks Flying Fish as differences ?

or should it produce
Wolverines,Wolverines,Wolverines,Tanks,Tanks,Flying Fish as differences ?!

if you are leaning towards the first and have significant data I would make a single pass through the table and build a hash table as opposed to doing n select operations which are all linear (should be SIGNIFICANTLY faster).

Greg

brdrok

ASKER

thanks for the input...feel terribly replying although i already assigned points....but to answer some of the questions...I really have not worked extensively with XML files and/or the System.XML namesapce bit.

in real life...we are probably talking about 600-700 records inside the sql server and ms access database. not terribly much as far as databases are concerned...

regarding duplicates...I don't think there are any duplicates...i won't have to worry about this issue...(caught a break there =)

gregoryyoung

ok so you will have 600-700 records inside of each dataset ?!

if you have 700 ....

thats 700^2 comparisons that will be performed using hte above algorithm O(n^2)

by moving to a hash table implementation you should be looking closer to O(N log N)

motorlife

So, can you please explain, how to work with the xml data sets into a hash table?