SolvedPrivate

Unmatched Query in VB.Net using LINQ , Matching on Multiple Fields

Posted on 2014-04-21
17
60 Views
Last Modified: 2016-02-15
I am working with VB.Net in Visual Studio 2010, and I need to perform an unmatched query in LINQ, when two fields are included in the linking.  What makes it tricky, is I only want to return records where one of the field matches, and the other does not.

I think it will make more sense if I lay out an example.

So this is an example of my Main Data Table:
GroupID    SubGroupID
ABC           RRR
ABC           SSS
LMN           TTT
XYZ           BBB

Open in new window

And here is a sample of my Reference Table:

GroupID    SubGroupID
ABC           RRR
XYZ           BBB
XYZ           CCC

Open in new window

Basically, what I am looking for is new instances of records where the GroupID already exists in my Reference table, but the SubGroupID does not.

So, I would only expect to return one record, specifically:
GroupID    SubGroupID
ABC           SSS

Open in new window

because that is the only record where the GroupID already exists in my Reference Table, but the SubGroupID (of "SSS") does not.

Can someone help me come up with the LINQ code for this?  BTW, this is not an Entity Framework, so I need to qualify the field types in my LINQ query code (they are both "Of String").

Thanks!
0
Comment
Question by:JoeMiskey
  • 9
  • 8
17 Comments
 
LVL 63

Expert Comment

by:Fernando Soto
ID: 40012985
Hi JoeMiskey;

Let see if this gives you what you need.
var results = from dt in Context.DataTable
              from rt in Context.ReferenceTable
              where dt.GroupID == rt.GroupID && dt.SubGroupID != rt.SubGroupID
              select rt;

Open in new window

0
 

Author Comment

by:JoeMiskey
ID: 40013007
That is not quite working out for me.  Maybe I made a mistake in translating it to work for my structure.  Here is the code, exactly as I have written it:
        Dim results = From s In EnhancedSubGroupCheckTable.AsEnumerable
                        From r In EnhancedACTCompanyRules
                        Where s.Field(Of String)("GroupID") == r.Field(Of String)("GroupID") _
                        && s.Field(Of String)("SubGroupID") != r.Field(Of String)("SubGroupID")
                        Select s

Open in new window

It highlight the double equal sign ("==") and says "Expression Expected").

Do you see anything wrong?  
This code is for VB.Net, and not for C+, right?
0
 
LVL 63

Expert Comment

by:Fernando Soto
ID: 40013169
Hi JoeMiskey;

Sorry about that I should have taken a better look at the question headder where it did state VB. I also made an assumption that you were accessing a database and not a DataTable Object. So lets try it this way.

Dim results = From s In EnhancedSubGroupCheckTable.AsEnumerable _
              From r In EnhancedACTCompanyRules.AsEnumerable _
              Where s.Field(Of String)("GroupID") = r.Field(Of String)("GroupID") _
                    AndAlso s.Field(Of String)("SubGroupID") <> r.Field(Of String)("SubGroupID") _
              Select s

Open in new window

0
Master Your Team's Linux and Cloud Stack!

The average business loses $13.5M per year to ineffective training (per 1,000 employees). Keep ahead of the competition and combine in-person quality with online cost and flexibility by training with Linux Academy.

 

Author Comment

by:JoeMiskey
ID: 40013277
We are getting close, but it is not quite right!  I think the issue is that when it matches on the GroupID, it should be comparing the SubGroupID to ALL the SubGroupIDs be returned with that particular GroupID, instead of trying to match up each one, record-by-record (more of a "IN", "NOT IN" type relationship).

I think I can explain it better with an example.

If my Data Table has the following records:
GroupID    SubGroupID
ABC           0001
XYZ           0011
XYZ           C001

Open in new window

and the Reference Table has the following records
GroupID    SubGroupID
ABC           0001
XYZ           0001
XYZ           C001

Open in new window

It should only return one record:
GroupID    SubGroupID
XYZ           0011

Open in new window

but instead it is returning both records for the XYZ group:
GroupID    SubGroupID
XYZ           0011
XYZ           C001

Open in new window

Out of curiosity, to see if I could at least come up with the logic for this, I wanted to see if I could write the SQL code to do this in Access, and I was able to.  Here is what that code looks like:
SELECT s.GroupID, s.SubGroupID
FROM MainDataTable s,ReferenceTable r 
WHERE s.GroupID = r.GroupID
AND ((s.GroupID & s.SubGroupID)
NOT IN 
(SELECT GroupID & SubGroupID
FROM ReferenceTable));

Open in new window

So, the approach I took was to check to see if the following was true:
1.  Does the GroupID of the Data table match the GroupID of the Reference Table?
2.  Does the GroupID/SubGroupID combination from that record in the Data Table NOT exist in any of the GroupID/SubGroupID combinations in the Reference Table?

If both those conditions are met, return the record.

So it looks like it may require a nested query.  Any idea on how we might be able tweak the code you came up with to do that?

Thanks
0
 

Author Comment

by:JoeMiskey
ID: 40013446
I have been researching any playing around with it, and here is my feeble attempt at it which does not work:
        Dim results = From s In EnhancedSubGroupCheckTable.AsEnumerable _
                      From r In EnhancedACTCompanyRules.AsEnumerable _
                      Where s.Field(Of String)("GroupID") = r.Field(Of String)("GroupID") _
                      AndAlso Not _ 
                      (From x In EnhancedACTCompanyRules _
                       Select x.Field(Of String)("GroupID") + x.Field(Of String)("ExcellusSubGroupID")) _
                      Contains(s.Field(Of String)("GroupID") + s.Field(Of String)("SubGroupID")) _
                      Select s

Open in new window

So, in order to do the subquery with the "Not In" logic, I tried nesting the query within the "Contains" statement (and negated it with a "Not").  However, it does not work.  At the word "Contains", it says "End of Statement Expected".

Any ideas?
0
 
LVL 63

Expert Comment

by:Fernando Soto
ID: 40014049
Hi JoeMiskey;

See if this does what you need. Please add the two supporting classes at the end of the query they are use to compare two objects.

Dim r2 = (From rr In rt.AsEnumerable() _
          Select New TempData With _
          { _
              .GroupID = rr.Field(Of String)("GroupID"), _
              .SubGroupID = rr.Field(Of String)("SubGroupID") _
          }).ToList()

Dim resultS = From s In mt.AsEnumerable() _
              Let s2 = New TempData With _
                       { _
                       	.GroupID = s.Field(Of String)("GroupID"), _
                       	.SubGroupID = s.Field(Of String)("SubGroupID") _
                       } _
              Where Not r2.Contains(s2, New TempDataEqualityComparer) _
              Select s
              
'Supporting classes for the above code
' Used to compare two objects in the query
Public Class TempData

    Public GroupID As String
    Public SubGroupID As String

End Class

' Used to compare two classes of TempData
Public Class TempDataEqualityComparer
    Implements IEqualityComparer(Of TempData)

    Public Function Equals1(x As TempData, y As TempData) As Boolean Implements IEqualityComparer(Of TempData).Equals
        If x.GroupID = y.GroupID And x.SubGroupID = y.SubGroupID Then
            Return True
        Else
            Return False
        End If
    End Function

    Public Function GetHashCode1(obj As TempData) As Integer Implements IEqualityComparer(Of TempData).GetHashCode
        Dim hCode As Integer = obj.GroupID Xor obj.SubGroupID
        Return hCode.GetHashCode()
    End Function
End Class              

Open in new window

0
 

Author Comment

by:JoeMiskey
ID: 40014950
Fernando,

Still doesn't appear to be working properly.  It is picking up GroupIDs which do not exist in the Rules table.  So if you look back at the data sample I posted in post #1, the last code returns the "LMN" group, even though they are not in the Rules table anywhere.

To reiterate, I only want want to return records where both of these conditions are met:
1.  The GroupID from the Data table appears somewhere in the Rules table.
2.  The GroupID/SubGroupID combination in the Data table does NOT appear anywhere in the Rules table.

Also, with your code above, am I right in assuming that "rt" should be replaced with the rules table name, and "mt" should be replaced with the data table name?

Thanks
0
 
LVL 63

Expert Comment

by:Fernando Soto
ID: 40015216
To your statement, "Also, with your code above, am I right in assuming that "rt" should be replaced with the rules table name, and "mt" should be replaced with the data table name?", yes I was running some test and called it that way in the code, but you are correct.

Let me look at your last post and get back to you.
0
 

Author Comment

by:JoeMiskey
ID: 40015244
I have a thought on maybe how to do this, in two passes (queries):

Query 1
Do a matched query between the tables matching on the GroupID field only.

Query 2
Do a unmatched query from the result set in Query 1 to the Rules tables, joining on both the GroupID and SubGroupID fields.

I will play around with that also, and see if I can come up with code that does that.

Thanks


EDIT: Not quite as simple as I thought.  To avoid a many-to-many relationship in Query 1, I would need to weed the duplicates GroupIDs out of the Rules table (since we do not care about SubGroupIDs at that point).
0
 
LVL 63

Expert Comment

by:Fernando Soto
ID: 40015438
Does both table have the same number of columns? If so does each column in one table have the same column in the other table, Name and data type?
0
 

Author Comment

by:JoeMiskey
ID: 40015524
The "data table" being used here only contains the GroupID and SubGroupID fields, nothing else (the original data files being imported actually contain much more data, but for his purpose I am importing it into a Temp Table that has a Primary Key constraint on these two fields to make sure that there are no duplicates of the GroupID/SubGroupID combinations in the table).

The "rules" table has the GroupID and SubGroupID fields (under slightly different names, but of the same data type), and a handful of other fields which are not important to this part of the process I am working on.

Let me explain what I am try to do, in a "high level" overview.

We are processing client claims.  We get large claims file from insurance companies that contain many clients data on it, but we only have to process claims for clients (GroupIDs) that are found in our "rules" tables (we don't care about claims for GroupIDs not found in our "rules" table).

However, a common situation we have is that a client will add a new division (SubGroupID), and no one will have updated the "rules" table (different SubGroupIDs may have different processing rules).  For every client ("GroupID") we process, we must have rules for all of their different divisions ("SubGroupIDs").  So, if a new SubGroupID starts showing up on the Claims files, we need a new rule for that SubGroupID added to our "rules" table.

So, what I am trying to do here is simply look through the claims files for new SubGroupIDs that need rules added to our "rules" table.  So, prior to processing these claims, I am comparing all the different GroupID/SubGroupID combinations found on the Claims file to all the GroupID/SubGroupID records found in the "rules" table, and trying to extract just the records where the GroupID is found in the "rules" table, but that particular SubGroupID for that GroupID is not, and needs to be added.  I wish to export that list out to an Excel file, so our processors can review it and add the new rules that are needed prior to actually processing the claims file.

I don't know how much that explanation helps, but at least you can see what I am trying to accomplish here.

Thanks
0
 
LVL 63

Accepted Solution

by:
Fernando Soto earned 500 total points
ID: 40015576
Hi;

I just saw your last post as I was about to post this, I will read it now mean while take a look at this.

I think I may have it this time if I understand it by now.

Dim relTblList = (From r In EnhancedACTCompanyRules.AsEnumerable() _
                  Select New TempData With _
                  { _
                      .GroupID = r.Field(Of String)("GroupID"), _
                      .SubGroupID = r.Field(Of String)("SubGroupID") _
                  }).ToList()

Dim result1 = From mainTable In (From s In EnhancedSubGroupCheckTable.AsEnumerable() _
                                 Join r In EnhancedACTCompanyRules.AsEnumerable() On s.Field(Of String)("GroupID") Equals r.Field(Of String)("GroupID") _
                                 Select s).Distinct() _
              Let mainTRec = New TempData With {.GroupID = mainTable.Field(Of String)("GroupID"), .SubGroupID = mainTable.Field(Of String)("SubGroupID")} _
              Where Not relTblList.Contains(mainTRec, New TempDataEqualityComparer) _
              Select mainTable

              
Public Class TempData                                                                                                
                                                                                                       
    Public GroupID As String                                                                                         
    Public SubGroupID As String                                                                                      
                                                                                                       
End Class                                                                                                            
                                                                                                       
Public Class TempDataEqualityComparer                                                                                
    Implements IEqualityComparer(Of TempData)                                                                        
                                                                                                       
    Public Function Equals1(x As TempData, y As TempData) As Boolean Implements IEqualityComparer(Of TempData).Equals
        If x.GroupID = y.GroupID And x.SubGroupID = y.SubGroupID Then                                                
            Return True                                                                                              
        Else                                                                                                         
            Return False                                                                                             
        End If                                                                                                       
    End Function                                                                                                     
                                                                                                       
    Public Function GetHashCode1(obj As TempData) As Integer Implements IEqualityComparer(Of TempData).GetHashCode   
        Dim hCode As Integer = obj.GroupID Xor obj.SubGroupID                                                        
        Return hCode.GetHashCode()                                                                                   
    End Function                                                                                                     
End Class   

Open in new window

0
 

Author Comment

by:JoeMiskey
ID: 40015631
Fernando,

On my initial testing, it does appear to work!
I am trying to go through the code to understand what it is doing.  I am getting lost around this point...
                      Let mainTRec = New TempData With {.GroupID = mainTable.Field(Of String)("GroupID"), .SubGroupID = mainTable.Field(Of String)("SubGroupID")} _
                      Where Not relTblList.Contains(mainTRec, New TempDataEqualityComparer) _

Open in new window

Could explain to me how this is working?

Thanks!
0
 

Author Comment

by:JoeMiskey
ID: 40015877
Another question I have.

Regarding your code, is there a reason why I should have two additional separate classes, instead of including that all in the same class with the procedure that is running this?
And what does "GetHashCode1" do?

Thanks
0
 
LVL 63

Expert Comment

by:Fernando Soto
ID: 40016638
Hi Joe;

This line of code creates a local variable within the query. It takes the two values GroupID and SubGroupID and creates a concrete object of type TempData so that the values can be used in the where clause in the Contains method so that it can be compaired with the list of values in the relTblList to see which does not have a rule for it.
             
Let mainTRec = New TempData With {.GroupID = mainTable.Field(Of String)("GroupID"), .SubGroupID = mainTable.Field(Of String)("SubGroupID")} _

Open in new window


The Where clause checks to see if the current value in the main table has a rule in the rules table if it does not it sends the value to the Select clause which returns it as a value needing to be looked at.

Where Not relTblList.Contains(mainTRec, New TempDataEqualityComparer) _  

Open in new window

 

To your question, "Regarding your code, is there a reason why I should have two additional separate classes, instead of including that all in the same class with the procedure that is running this? And what does "GetHashCode1" do?", well because we are dealing with object we are creating in a query and are Anonymous type and therefor have no names until compile time and therefore can not pass the a IEqualityComparer of this type we need to do this before the query starts so that we do not need to use Anonymous type and we can create an IEqualityComparer to compare values. GetHashCode1 is the implementation of the IEqualityComparer GetHashCode. It may or may not be use but must be implemented to forfill the contract of IEqualityComparer.

The class TempDataEqualityComparer does not have to be a class of its own it can be implemented in the class TempData. The Where clause would then need to change as follows :

Where Not relTblList.Contains(mainTRec, New TempData) _

Open in new window


But that to me does not look logical and therefore I rather create A new class who's only function is to do the IEqualityComparer functionality and give it a name of what it does. This also leaves the TempData class simple and clean.
0
 

Author Comment

by:JoeMiskey
ID: 40017386
Thanks for the help and explanation!
0
 
LVL 63

Expert Comment

by:Fernando Soto
ID: 40017397
Not a problem Joe, glad I was able to help.
0

Featured Post

3 Use Cases for Connected Systems

Our Dev teams are like yours. They’re continually cranking out code for new features/bugs fixes, testing, deploying, testing some more, responding to production monitoring events and more. It’s complex. So, we thought you’d like to see what’s working for us.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
Capture logon name 13 73
What namespace do I need to import? 2 29
Creating a route in asp.net webforms 2 27
Import a txt file into a DataGridView and TextBox 20 38
This document covers how to connect to SQL Server and browse its contents.  It is meant for those new to Visual Studio and/or working with Microsoft SQL Server.  It is not a guide to building SQL Server database connections in your code.  This is mo…
A long time ago (May 2011), I have written an article showing you how to create a DLL using Visual Studio 2005 to be hosted in SQL Server 2005. That was valid at that time and it is still valid if you are still using these versions. You can still re…
Microsoft Active Directory, the widely used IT infrastructure, is known for its high risk of credential theft. The best way to test your Active Directory’s vulnerabilities to pass-the-ticket, pass-the-hash, privilege escalation, and malware attacks …
The Email Laundry PDF encryption service allows companies to send confidential encrypted  emails to anybody. The PDF document can also contain attachments that are embedded in the encrypted PDF. The password is randomly generated by The Email Laundr…

777 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question