brdrok
asked on
need help with algorithm comparing two datasets
heya,
having a hard time comparing two datasets and producing the difference between the two.
For example, let's say I have two dataSets called DS1 and DS2.
DS1 contains the following rows:
Buckeyes
Wolverines <---not in DS2
Hoosiers
Hawkeyes
Boilermakers
DS2 contains the following rows:
Buckeyes
Hoosiers
Hawkeyes
Boilermakers
WildCats <----not in DS1
how can I create a dataSet called dsResult that will contain the following rows:
Wolverines
Wildcats
thanks you very much
having a hard time comparing two datasets and producing the difference between the two.
For example, let's say I have two dataSets called DS1 and DS2.
DS1 contains the following rows:
Buckeyes
Wolverines <---not in DS2
Hoosiers
Hawkeyes
Boilermakers
DS2 contains the following rows:
Buckeyes
Hoosiers
Hawkeyes
Boilermakers
WildCats <----not in DS1
how can I create a dataSet called dsResult that will contain the following rows:
Wolverines
Wildcats
thanks you very much
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Forgot to say: this code uses the first table in each dataset (by indexing them), and queries the field named 'Text'. Change this to suit your case.
ASKER
thanks very much for your response....
I get an error that says:
Additional information: Cannot find column [team1] at the follwoing line:
DRA = DTAux.Select("team1='" + DR["team1"] + "'");
perhaps I am not building my dataColumsn, dataTable not correctly? here is what I got so far:
//this assume i have a dataset called ds1
//create a new data table
DataTable ds1Table = new DataTable("table1");
//create a data column and name it "team1"
DataColumn colTeam1 = new DataColumn("team1", System.Type.GetType("Syste m.String") );
//add column to ds1Table table
ds1Table.Columns.Add(colTe am1);
//add table to my dataset ds1
ds1.Tables.Add(ds1Table);
//create a new row based on table[0] of ds1
DataRow dr = ds1.Tables[0].NewRow();
//assign some vallue
dr[0] = "Buckeyes";
//add the new row to the table
ds1.Tables[0].Rows.Add(dr) ;
//do that for other teams...etc.
I get an error that says:
Additional information: Cannot find column [team1] at the follwoing line:
DRA = DTAux.Select("team1='" + DR["team1"] + "'");
perhaps I am not building my dataColumsn, dataTable not correctly? here is what I got so far:
//this assume i have a dataset called ds1
//create a new data table
DataTable ds1Table = new DataTable("table1");
//create a data column and name it "team1"
DataColumn colTeam1 = new DataColumn("team1", System.Type.GetType("Syste
//add column to ds1Table table
ds1Table.Columns.Add(colTe
//add table to my dataset ds1
ds1.Tables.Add(ds1Table);
//create a new row based on table[0] of ds1
DataRow dr = ds1.Tables[0].NewRow();
//assign some vallue
dr[0] = "Buckeyes";
//add the new row to the table
ds1.Tables[0].Rows.Add(dr)
//do that for other teams...etc.
ASKER
nevermind...got it to run w/o errors...will keep you posted ont he outcome...
ASKER
hopefully didn't jinx myself....but it seems to work GREAT.
thanks a bunch
btw...posted the same question in the asp.net section for 50 points at the following link...not sure how to assign those points to you as well w/o you adding a comment...
https://www.experts-exchange.com/questions/21422037/need-help-with-algorithm-comparing-two-datasets-link.html
thanks a bunch
btw...posted the same question in the asp.net section for 50 points at the following link...not sure how to assign those points to you as well w/o you adding a comment...
https://www.experts-exchange.com/questions/21422037/need-help-with-algorithm-comparing-two-datasets-link.html
One thing you may want to try is to convert the two datasets to XML strings then compare the strings. This works very well and depending on the size of the dataset may be faster. The one caviot that I ran into deals with an empty string and null.
just out of curiosity how many rows are we talking here in real life ?
also can their be duplicates i.e.
table 1
wolverines
wolverines
wolverines
wolverines
tanks
tanks
table 2
wolverines
flying fish
this should produce Tanks Flying Fish as differences ?
or should it produce
Wolverines,Wolverines,Wolv erines,Tan ks,Tanks,F lying Fish as differences ?!
if you are leaning towards the first and have significant data I would make a single pass through the table and build a hash table as opposed to doing n select operations which are all linear (should be SIGNIFICANTLY faster).
Greg
also can their be duplicates i.e.
table 1
wolverines
wolverines
wolverines
wolverines
tanks
tanks
table 2
wolverines
flying fish
this should produce Tanks Flying Fish as differences ?
or should it produce
Wolverines,Wolverines,Wolv
if you are leaning towards the first and have significant data I would make a single pass through the table and build a hash table as opposed to doing n select operations which are all linear (should be SIGNIFICANTLY faster).
Greg
ASKER
thanks for the input...feel terribly replying although i already assigned points....but to answer some of the questions...I really have not worked extensively with XML files and/or the System.XML namesapce bit.
in real life...we are probably talking about 600-700 records inside the sql server and ms access database. not terribly much as far as databases are concerned...
regarding duplicates...I don't think there are any duplicates...i won't have to worry about this issue...(caught a break there =)
in real life...we are probably talking about 600-700 records inside the sql server and ms access database. not terribly much as far as databases are concerned...
regarding duplicates...I don't think there are any duplicates...i won't have to worry about this issue...(caught a break there =)
ok so you will have 600-700 records inside of each dataset ?!
if you have 700 ....
thats 700^2 comparisons that will be performed using hte above algorithm O(n^2)
by moving to a hash table implementation you should be looking closer to O(N log N)
if you have 700 ....
thats 700^2 comparisons that will be performed using hte above algorithm O(n^2)
by moving to a hash table implementation you should be looking closer to O(N log N)
So, can you please explain, how to work with the xml data sets into a hash table?
ASKER
DataSet dsAccess = new DataSet();
DataSet dsSql = new DataSet();
DataSet dsResult = new DataSet();
DataTable tblResult = new DataTable("Result");
DataColumn colResult = new DataColumn("colResultNumbe
DataColumn colResultName = new DataColumn("colResultName"
tblResult.Columns.Add(colR
tblResult.Columns.Add(colR
dsResult.Tables.Add(tblRes
OleDbConnection accessConn = new OleDbConnection(@"Provider
SqlConnection sqlConn = new SqlConnection(@"Server=xxx
OleDbDataAdapter daAccess = new OleDbDataAdapter("SELECT * FROM TblActiveSurveyProjsWithPr
SqlDataAdapter daSql = new SqlDataAdapter("SELECT * FROM Project", sqlConn);
daAccess.Fill(dsAccess, "Access");
daSql.Fill(dsSql, "SQL");
dgAccess.DataSource = dsAccess;
dgSql.DataSource = dsSql;
foreach(DataRow sqlDr in dsSql.Tables[0].Rows)
{
foreach(DataRow accessDr in dsAccess.Tables[0].Rows)
{
if(sqlDr[2].ToString() == accessDr[0].ToString())
{
DataRow newRow = dsResult.Tables[0].NewRow(
newRow[0] = accessDr[0].ToString();
newRow[1] = accessDr[2].ToString();
dsResult.Tables[0].Rows.Ad
}
}
}
dgResult.DataSource = dsResult;