Find matching set of records

rationalJay
rationalJay used Ask the Experts™
on
I have the following tables on SQL Server 2008R2

MessageTable

ContrlNo|    LineNo   |    Msg
-------|--------------|--------------
1        |        1        |    Tiger1 Text
1        |        2        |    Tiger1 Text
1        |        3        |    Tiger1 Text
1        |        4        |    Tiger1 Text
2        |        1        |    Tiger1 Text1
2        |        2        |    Tiger1 Text2
2        |        3        |    Tiger1 Text3
2        |        4        |    Tiger1 Text4
3        |        1        |    Horse 1
3        |        2        |    Horse 2
3        |        3        |    Horse 3
3        |        4        |    Horse 4

RuleTable

RuleNo|    MsgLineNo  | RuleStartingPos |    RuleMsg
-------|--------------|-----|---------
1        |        1        |  1 |  Tiger1 Text
2        |        1        |  1 |  Tiger1 Text
2        |        3        |  1 |  Tiger1 Text3

For each set of ControlNo records in the MESSAGETABLE I would like to apply the rule from the RULETABLE and list the RULENo if any mataches.

For the above example, I would like to have the following output,

ContrlNo | RuleNo
-----------------
1   |   1 <br>
2   |   2 <br>
3   |   NULL

Thanks,
Jay
Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®
Vitor MontalvãoIT Engineer
Distinguished Expert 2017

Commented:
SELECT DISTINCT m.ContrlNo, r.RuleNo
FROM MessageTable m
LEFT OUTER JOIN RuleTable r ON m.ContrlNo = r.RuleNo

Open in new window

Scott PletcherSenior DBA
Most Valuable Expert 2018
Top Expert 2014

Commented:
I think this code is right, or at least very close, but I haven't tested it as I don't have directly useable test data available:


SELECT message_Contrlno.ContrlNo, rule_matched.RuleNo
FROM (
    SELECT DISTINCT ContrlNo
    FROM MessageTable
) AS message_Contrlno
CROSS APPLY (
    SELECT TOP (1) rt.RuleNo, mt.ContrlNo
    FROM dbo.RuleTable rt
    LEFT OUTER JOIN MessageTable mt ON
        rt.MsgLineNo = mt.[LineNo] AND
        rt.RuleMsg = mt.Msg
    WHERE mt.ContrlNo = message_Contrlno.ContrlNo
    GROUP BY rt.RuleNo
    HAVING COUNT(rt.MsgLineNo) = COUNT(mt.Msg)
    ORDER BY COUNT(rt.MsgLineNo) DESC
) AS rule_matched
ORDER BY ContrlNo

Author

Commented:
Thanks for you quick response.  I tried your query

SELECT distinct message_Contrlno.ContrlNo, rule_matched.RuleNo
FROM (
    SELECT ContrlNo
    FROM MessageTable
) AS message_Contrlno
CROSS APPLY (
    SELECT rt.RuleNo
    FROM dbo.RuleTable rt
    LEFT OUTER JOIN MessageTable mt ON
        rt.MsgLineNo = mt.[LineNo] AND
        rt.RuleMsg = SUBSTRING(mt.Msg,rt.RuleStartingPos,LEN(rt.RuleMsg))
    WHERE mt.ContrlNo = message_Contrlno.ContrlNo
    GROUP BY rt.RuleNo
    HAVING COUNT(rt.MsgLineNo) = COUNT(mt.Msg)
       
) AS rule_matched
ORDER BY ContrlNo

I have attached the the result  as an image. I do not get the expected result as shown below,

ContrlNo | RuleNo
-----------------
1   |   1 <br>
2   |   2 <br>
3   |   NULL

Thanks,
Jay
1.PNG
Ensure you’re charging the right price for your IT

Do you wonder if your IT business is truly profitable or if you should raise your prices? Learn how to calculate your overhead burden using our free interactive tool and use it to determine the right price for your IT services. Start calculating Now!

Scott PletcherSenior DBA
Most Valuable Expert 2018
Top Expert 2014

Commented:
No, you re-wrote my query.  My query was:

SELECT message_Contrlno.ContrlNo, rule_matched.RuleNo
FROM (
     SELECT DISTINCT ContrlNo
...

My query will not give duplicate results for ContrlNo, yours will.

Author

Commented:
That's right, Sorry I messed it up little bit. Your query yields following result,

ContrlNo|      RuleNo
-------------|------------------
1              |            2
2              |            2

But, that is not the intended result.

If you see the RuleTable, the Rule 2 overlaps rule 1. And the requirement is to get the most matched RuleNo for each Control number. the expected result is,

ContrlNo|      RuleNo
-------------|------------------
1              |            1
2              |            2

-Thanks...
Scott PletcherSenior DBA
Most Valuable Expert 2018
Top Expert 2014

Commented:
Ok, I'll accept that.

Again, without testable data, I won't be able to fix it.  That takes running the query to test.
Scott PletcherSenior DBA
Most Valuable Expert 2018
Top Expert 2014

Commented:
Sorry, that was off, please try code below.  I believe it will work, although we'll have to see about performance.  Sorry about the issue with INSERT statements, but it usually takes me longer to convert plain text into INSERT statements than it does to write the query!  [And I simply don't have time to do that.]


SELECT message_Contrlno.ContrlNo, rule_matched.RuleNo
FROM (
    SELECT DISTINCT ContrlNo
    FROM MessageTable
) AS message_Contrlno
LEFT OUTER JOIN (
    SELECT rt.RuleNo, mt.ContrlNo,
        ROW_NUMBER() OVER(PARTITION BY rt.RuleNo ORDER BY COUNT(rt.MsgLineNo), mt.ContrlNo DESC) AS row_num
    FROM dbo.RuleTable rt
    LEFT OUTER JOIN MessageTable mt ON
        rt.MsgLineNo = mt.[LineNo] AND
        rt.RuleMsg = mt.Msg
    GROUP BY rt.RuleNo, mt.ContrlNo
) AS rule_matched ON
    rule_matched.ContrlNo = message_Contrlno.ContrlNo AND
    rule_matched.row_num = 1
ORDER BY ContrlNo

Author

Commented:
Thanks....

Author

Commented:
Each control Number should have just one rule mapped to it. When I added one more rule your latest query failed....

http://sqlfiddle.com/#!3/bda20/1

Author

Commented:
And the following lines just compare MessageTable record against the last row of the RuleNo data set. The approach is right but It should compare all the rows in decending fashion.

 rt.MsgLineNo = mt.[LineNo] AND
 rt.RuleMsg = mt.Msg

Open in new window


Thanks,
Jay
Scott PletcherSenior DBA
Most Valuable Expert 2018
Top Expert 2014
Commented:
SELECT message_Contrlno.ContrlNo, rule_matched.RuleNo
FROM (
    SELECT DISTINCT ContrlNo
    FROM MessageTable
) AS message_Contrlno
LEFT OUTER JOIN (
    SELECT rt.RuleNo, mt.ContrlNo, --COUNT(rt.MsgLineNo) AS rules_matched,
        ROW_NUMBER() OVER(PARTITION BY mt.ContrlNo ORDER BY COUNT(rt.MsgLineNo) DESC, rt.RuleNo) AS row_num
    FROM dbo.RuleTable rt
    LEFT OUTER JOIN MessageTable mt ON
        rt.MsgLineNo = mt.[LineNo] AND
        rt.RuleMsg = mt.Msg
    GROUP BY rt.RuleNo, mt.ContrlNo
) AS rule_matched ON
    rule_matched.ContrlNo = message_Contrlno.ContrlNo AND
    rule_matched.row_num = 1
ORDER BY ContrlNo

Author

Commented:
@Scott, I believe you have not seen my last comment,

http://www.experts-exchange.com/Database/MS-SQL-Server/Q_28549689.html#a40421879
Scott PletcherSenior DBA
Most Valuable Expert 2018
Top Expert 2014

Commented:
That is just to join every row that needs considered.  The ROW_NUMBER() and other things take care of the other requirements, as I understand them.

Keep in  mind, you know more in depth how these rules work, I don't.  I can go only by what you post, including the examples you gave, which were extremely limited.

If you can give a sample data set and the correct/expected result, which my query isn't giving, then I can debug it.

Author

Commented:
First, Thanks for helping. I totally understand your difficulty.  I hope the following example explains it better,

http://sqlfiddle.com/#!3/20c5d/1


When we run the query we have, we will get from the example that ContrlNo 5 matches RuleNo 2. But actually it should not. it should match RuleNo 6. But other ContrlNos have perfect match.
Scott PletcherSenior DBA
Most Valuable Expert 2018
Top Expert 2014

Commented:
That link's not working.

But run the inner subquery stand-alone, and check the results that way, including a column for # of rules matched (which, btw, we could pass through to the final result if you wanted to):

    SELECT rt.RuleNo, mt.ContrlNo, COUNT(rt.MsgLineNo) AS rules_matched,
         ROW_NUMBER() OVER(PARTITION BY mt.ContrlNo ORDER BY COUNT(rt.MsgLineNo) DESC, rt.RuleNo) AS row_num
     FROM dbo.RuleTable rt
     LEFT OUTER JOIN MessageTable mt ON
         rt.MsgLineNo = mt.[LineNo] AND
         rt.RuleMsg = mt.Msg
     GROUP BY rt.RuleNo, mt.ContrlNo
Scott, Thank you so much for your patience and support. What we had was very close... The following  produces exactly what I wanted....

SELECT MT.ContrlNo, r.RuleNo, r.MatchPercent
FROM
  MessageTable MT
   LEFT JOIN
    (
      SELECT
        ContrlNo,
        RuleNo,
        MatchedRules / AvailableRules AS MatchPercent,
        ROW_NUMBER() OVER (PARTITION BY ContrlNo ORDER BY MatchedRules DESC, MatchedRules / AvailableRules DESC) AS rn
      FROM
        (
          SELECT
            ContrlNo,
            R.RuleNo,
            COUNT(*) as MatchedRules,
            (SELECT COUNT(*) FROM RuleTable WHERE RuleTable.RuleNo = R.RuleNo) + 0.0 AS AvailableRules
          FROM
              MessageTable M
               INNER JOIN
              RuleTable R ON
                  M.[LineNo] = R.MsgLineNo AND
                  SUBSTRING(M.Msg,R.RuleStartingPos,LEN(R.RuleMsg)) LIKE '%' + R.RuleMsg + '%'
          GROUP BY M.ContrlNo, R.RuleNo
        ) q
     ) r ON
    MT.ContrlNo = r.ContrlNo AND
    r.rn = 1
GROUP BY MT.ContrlNo, r.RuleNo, r.MatchPercent
Scott PletcherSenior DBA
Most Valuable Expert 2018
Top Expert 2014

Commented:
Oh, OK, not the most rules matched (by count) but the highest % rules matched -- that makes sense.

Author

Commented:
help from scottPletcher got me close to the solution. But the last post from me has the exact solution I wanted.

Do more with

Expert Office
Submit tech questions to Ask the Experts™ at any time to receive solutions, advice, and new ideas from leading industry professionals.

Start 7-Day Free Trial