Valid combination of records

Mike Littlewood
Mike Littlewood used Ask the Experts™
on
Hi,

I am sure there are some nice mathematical methods/equations/patterns out there that might be able to help me, but I thought I would ask the question here.

I have a table containing records which I would like to build another dataset from of valid combinations.

Below is a sample of the records and what I think the outcome of the results should be (columns can be null for all records). What I would like is a nice piece of SQL or a procedure to do this.
Originally I thought I could just cross join each of the columns against each other, but that produces far more combinations than are valid.
Records and column index
1   2   3   4   5   6
1A                  6A
1A                  6B
        3A  4A
        3B  4B
    2A          5A
    2A  3A

Valid combination records
1A  2A  3A  4A  5A  6A
1B  2A  3A  4A  5A  6B

Records and column index
Removed last record
1   2   3   4   5   6
1A                  6A
1A                  6B
        3A  4A
        3B  4B
    2A          5A

Valid combination records
1A  2A  3A  4A  5A  6A
1B  2A  3A  4A  5A  6B
1A  2A  3B  4B  5A  6A
1B  2A  3B  4B  5A  6B

Open in new window

Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®

Commented:
What is the rationale. What exactly are you trying to do. Remove null value and replace it by the next non null value in the column or what

Author

Commented:
Yes, if there is a value in a column, then the valid combination record must contain a value from that column, otherwise it can be null. The valid record though must use the whole of as many records combined as it can.

Commented:
Looks like something that can be done by a looping through the recordset and generating a temporary table as  you go through

Something like:



For Each field in MainTable

Do Until EOF

If not isnull(field) then
Insert Into TempTable(FieldName) Values(recordset(field))
End If

recordset.movenext

loop

Next Field

The exact syntax will depend on the program that you are using to design the application but that should give you a general idea
Build an E-Commerce Site with Angular 5

Learn how to build an E-Commerce site with Angular 5, a JavaScript framework used by developers to build web, desktop, and mobile applications.

Author

Commented:
If only it was that simple :o)
The way you describe is pretty much the same as cross joining each column against each other without taking into account the individual records as a whole.
This is the method I did first time round and it didn't work.
I am an experienced coder as well as SQL, but this issue is a bit more savvy.
Hamed NasrRetired IT Professional

Commented:
Tell me the process to get the Valid combination records by inspecting the table records.
Why Removed last record, and not the 2 before last as well?
Most Valuable Expert 2012
Distinguished Expert 2018

Commented:
Your samples do not seem to match up with your requirements.

I do not see a '1B' in either example.  If the first set of expected output, I don't see where the '3B' and '4B' records are.

Please clarify.

Author

Commented:
Hi,

I only removed the last record to show you how that record affected the resulting valid combinations.
The example I have given above is very basic, but it is only there to highlight how a valid combination record is created.
If you want a more complex example, then I can provide one.

Author

Commented:
DOH! I don't know how I put that 1B in there sorry
Records and column index
1   2   3   4   5   6
1A                  6A
1A                  6B
        3A  4A
        3B  4B
    2A          5A
    2A  3A

Valid combination records
1A  2A  3A  4A  5A  6A
1A  2A  3A  4A  5A  6B

Records and column index
Removed last record
1   2   3   4   5   6
1A                  6A
1A                  6B
        3A  4A
        3B  4B
    2A          5A

Valid combination records
1A  2A  3A  4A  5A  6A
1A  2A  3A  4A  5A  6B
1A  2A  3B  4B  5A  6A
1A  2A  3B  4B  5A  6B

Open in new window

Most Valuable Expert 2011
Top Expert 2012

Commented:
there must be other rules you're not mentioning, or your examples still aren't correct

in your first example (including the revised first example)  you have this row....

   3B  4B

however you have no results including those values.  Why not?
Most Valuable Expert 2012
Distinguished Expert 2018

Commented:
>>If you want a more complex example, then I can provide one.

Just need to understand the requirements.  Maybe a better example will help?  The 'picture' you are posting isn't getting us there.

By chance, are you wanting ALL possible combinations of ALL non-null values?

Author

Commented:
No that is correct, this is the sort of thing to watch out for.

2A can go with 5A
2A can go with 3A
3A can go with 4A

Therefore
2A 3A 4A 5A

2A can only go with 3A therefore it cannot go with 3B

You MUST use a value from each column if a value exists.
You MUST use the whole record in a valid combination.

Author

Commented:
>> By chance, are you wanting ALL possible combinations of ALL non-null values?

All combinations of all non-null value records as long as the column has a non null value.
If a column, say 2, has no value in any record, it must return NULL for all other valid combinations.
Most Valuable Expert 2011
Top Expert 2012

Commented:
>>> You MUST use a value from each column if a value exists.
>>> You MUST use the whole record in a valid combination.


using those two rules means your first example doesn't work because...

1A                  6A
1A                  6B


if I use 1A, 6A then I violate the "whole record" rule for 1A,6B and vice versa

Author

Commented:
And hey presto, you solved the missing 1B from earlier. Man I'm tired today it seems.
Records and column index
1   2   3   4   5   6
1A                  6A
1B                  6B
        3A  4A
        3B  4B
    2A          5A
    2A  3A

Valid combination records
1A  2A  3A  4A  5A  6A
1B  2A  3A  4A  5A  6B

Records and column index
Removed last record
1   2   3   4   5   6
1A                  6A
1B                  6B
        3A  4A
        3B  4B
    2A          5A

Valid combination records
1A  2A  3A  4A  5A  6A
1B  2A  3A  4A  5A  6B
1A  2A  3B  4B  5A  6A
1B  2A  3B  4B  5A  6B

Open in new window

Hamed NasrRetired IT Professional

Commented:
I can't follow! Still confusing rules.

I expect something like this:
Example 1:
Look at record 1
    col1 is 1A
    col2 is null, next not null value in col2 is 2A
    col3 is null, next not null value in col3 is 3A
    and so on
    record1 valid combination is : 1A  2A  3A  4A  5A  6A
    and for record 2
    record2 valid combination is : 1B  2A  3A  4A  5A  6B

Exampe2, following above logic leaves us with:
1A  2A  3A  4A  5A  6A
1B  2A  3A  4A  5A  6B

What rules will allow to list the other 2 records?

Author

Commented:
I think you are assuming you can only use the next record to fill in the null values, this is not correct.
The idea is to AND ANY records together you can to produce a distinct list of valid combination records.

Records 1 and 2 cannot obviously exist together, and neither can records 3 and 4.
We can therefore cross join these to get the 4 valid results, each of which can join to record 5.

I'm really not sure how I can explain this anymore than I am.

Author

Commented:
Does this more real world example help rather than A's and B's?
1     2     3     4     5     6
DAVE				ENGLISH
JOHN				SPANISH
	    TV	  HORROR
	    BOOKS SCI-FI
      HOLIDAY		SPAIN


DAVE speaks ENGLISH, Goes on HOLIDAY to SPAIN, and likes HORROR TV
DAVE speaks ENGLISH, Goes on HOLIDAY to SPAIN, and likes SCI-FI books
JOHN speaks SPANISH, Goes on HOLIDAY to SPAIN, and likes HORROR TV
JOHN speaks SPANISH, Goes on HOLIDAY to SPAIN, and likes SCI-FI BOOKS
ie
DAVE/HOLIDAY/TV/HORROR/SPAIN/ENGLISH
DAVE/HOLIDAY/BOOKS/SCI-FI/SPAIN/ENGLISH
JOHN/HOLIDAY/TV/HORROR/SPAIN/SPANISH
JOHN/HOLIDAY/BOOKS/SCI-FI/SPAIN/SPANISH


Putting record 6 back in, you can only go on HOLIDAY and watch TV
Therefore Reading BOOKS/SCI-FI cannot go with HOLIDAY/SPAIN

1     2     3     4     5     6
DAVE			      ENGLISH
JOHN			      SPANISH 
            TV	  HORROR
	    BOOKS SCI-FI
      HOLIDAY		SPAIN
      HOLIDAY TV


DAVE speaks ENGLISH, Goes on HOLIDAY to SPAIN, and likes HORROR TV
JOHN speaks SPANISH, Goes on HOLIDAY to SPAIN, and likes HORROR TV
ie
DAVE/NULL/TV/HORROR/SPAIN/ENGLISH
JOHN/NULL/TV/HORROR/SPAIN/SPANISH


If on the other hand I now took out HOLIDAY from all records,
I now have no conditions for column 2, therefore I can ignore this column

1     2     3     4     5     6
DAVE			      ENGLISH
JOHN                          SPANISH
            TV	  HORROR
	    BOOKS SCI-FI
	                SPAIN
            TV


DAVE speaks ENGLISH, Goes to SPAIN, and likes HORROR TV
DAVE speaks ENGLISH, Goes to SPAIN, and likes SCI-FI books
JOHN speaks SPANISH, Goes to SPAIN, and likes HORROR TV
JOHN speaks SPANISH, Goes to SPAIN, and likes SCI-FI BOOKS
ie
DAVE/NULL/TV/HORROR/SPAIN/ENGLISH
DAVE/NULL/BOOKS/SCI-FI/SPAIN/ENGLISH
JOHN/NULL/TV/HORROR/SPAIN/SPANISH
JOHN/NULL/BOOKS/SCI-FI/SPAIN/SPANISH


If I made a very simple version with only 3 columns

1       2       3
DAVE						ENGLISH
JOHN						SPANISH
        TV      ENGLISH
        BOOKS   SPANISH
DAVE    TV
JOHN    BOOKS

The valid combinations would be
DAVE/TV/ENGLISH
JOHN/BOOKS/SPANISH


If though I altered records 3 and 4, the resulting data set would give me nothing

1       2       3
DAVE						ENGLISH
JOHN						SPANISH
        BOOKS   ENGLISH
        TV      SPANISH
DAVE    TV
JOHN    BOOKS

Dave can only watch tv
Dave can only speak english
But TV is only in Spanish

Therefore Dave cant watch TV in spanish.
And Visa Versa, John can't read books in English

Open in new window

Author

Commented:
I think I'm going to have to delete this question if I can't get anyone to understand the problem.

Author

Commented:
I've requested that this question be deleted for the following reason:

Lack of understanding the question
Hamed NasrRetired IT Professional

Commented:
mikelittlewood,
I don't have oracle installed on my machine. But I contributed to this thread because of your experience so I may help in the logic. This is only if we analyse the problem correctly.

Can you use the 3 column example and explain in the way I did how to get each combination.
Are the words ENGLISH and SPANISH leftovers from editing after copy and paste? Do they have any effect on selecting the combinations.

Example:
record1 col1 is DAVE col2 is null, col 3 is null
   we look for a record that is ..... and this is record no ....
Most Valuable Expert 2012
Distinguished Expert 2018

Commented:
Sorry you are deleting the question.

Thanks for the new data.  

The good news:  I think I now understand the requirement.
The Bad, I'm not sure how to pull this off.  It will likely require some detailed coding that is a little above my head.

Let me make sure I understand it so maybe other Experts will know how to do achieve it.

The way I understand it, you are looking to develop sort of an AI inference engine:
http://en.wikipedia.org/wiki/Expert_system#Inference_rules

If a pair exists in a row, that is a 'rule'.  Once you get all the rules, create the result set.
Most Valuable Expert 2011
Top Expert 2012

Commented:
how about this?


SELECT DISTINCT x1.*,
                x2.*,
                x3.*,
                x4.*,
                x5.*,
                x6.*
  FROM (SELECT DISTINCT col1
          FROM yourdata
         WHERE col1 IS NOT NULL) x1,
       (SELECT DISTINCT col2
          FROM yourdata
         WHERE col2 IS NOT NULL) x2,
       (SELECT DISTINCT col3
          FROM yourdata
         WHERE col3 IS NOT NULL) x3,
       (SELECT DISTINCT col4
          FROM yourdata
         WHERE col4 IS NOT NULL) x4,
       (SELECT DISTINCT col5
          FROM yourdata
         WHERE col5 IS NOT NULL) x5,
       (SELECT DISTINCT col6
          FROM yourdata
         WHERE col6 IS NOT NULL) x6,
       yourdata d2
 WHERE (d2.col1 IS NULL OR d2.col1 = x1.col1)
   AND (d2.col2 IS NULL OR d2.col2 = x2.col2)
   AND (d2.col3 IS NULL OR d2.col3 = x3.col3)
   AND (d2.col4 IS NULL OR d2.col4 = x4.col4)
   AND (d2.col5 IS NULL OR d2.col5 = x5.col5)
   AND (d2.col6 IS NULL OR d2.col6 = x6.col6)
   AND 1 =
           ALL (SELECT CASE
                           WHEN ((d.col1 IS NULL OR d.col1 = x1.col1)
                             AND (d.col2 IS NULL OR d.col2 = x2.col2)
                             AND (d.col3 IS NULL OR d.col3 = x3.col3)
                             AND (d.col4 IS NULL OR d.col4 = x4.col4)
                             AND (d.col5 IS NULL OR d.col5 = x5.col5)
                             AND (d.col6 IS NULL OR d.col6 = x6.col6))
                             OR  NOT (NVL(d.col1, x1.col1) = x1.col1
                                   OR  NVL(d.col2, '-') = x2.col2
                                   OR  NVL(d.col3, '-') = x3.col3
                                   OR  NVL(d.col4, '-') = x4.col4
                                   OR  NVL(d.col5, '-') = x5.col5
                                   OR  NVL(d.col6, '-') = x6.col6)
                           THEN
                               1
                           ELSE
                               0
                       END
                  FROM yourdata d
                 WHERE (d.col1 = x1.col1
                     OR  d.col2 = x2.col2
                     OR  d.col3 = x3.col3
                     OR  d.col4 = x4.col4
                     OR  d.col5 = x5.col5
                     OR  d.col6 = x6.col6))    
Most Valuable Expert 2011
Top Expert 2012

Commented:
I don't expect that to be very fast with large data samples, but it works for what you posted
Most Valuable Expert 2011
Top Expert 2012
Commented:
here's a slightly more compact version, there were some redundant conditions in my previous post


SELECT DISTINCT x1.col1,
                x2.col2,
                x3.col3,
                x4.col4,
                x5.col5,
                x6.col6
  FROM (SELECT DISTINCT col1
          FROM yourdata
         WHERE col1 IS NOT NULL) x1,
       (SELECT DISTINCT col2
          FROM yourdata
         WHERE col2 IS NOT NULL) x2,
       (SELECT DISTINCT col3
          FROM yourdata
         WHERE col3 IS NOT NULL) x3,
       (SELECT DISTINCT col4
          FROM yourdata
         WHERE col4 IS NOT NULL) x4,
       (SELECT DISTINCT col5
          FROM yourdata
         WHERE col5 IS NOT NULL) x5,
       (SELECT DISTINCT col6
          FROM yourdata
         WHERE col6 IS NOT NULL) x6,
       yourdata d2
 WHERE 1 =
           ALL (SELECT CASE
                           WHEN (d.col1 IS NULL OR d.col1 = x1.col1)
                             AND (d.col2 IS NULL OR d.col2 = x2.col2)
                             AND (d.col3 IS NULL OR d.col3 = x3.col3)
                             AND (d.col4 IS NULL OR d.col4 = x4.col4)
                             AND (d.col5 IS NULL OR d.col5 = x5.col5)
                             AND (d.col6 IS NULL OR d.col6 = x6.col6)
                           THEN
                               1                          
                       END
                  FROM yourdata d
                 WHERE (d.col1 = x1.col1
                     OR  d.col2 = x2.col2
                     OR  d.col3 = x3.col3
                     OR  d.col4 = x4.col4
                     OR  d.col5 = x5.col5
                     OR  d.col6 = x6.col6));
Most Valuable Expert 2011
Top Expert 2012

Commented:
I recommend accept of http:#36361130   

It answers the question and is reasonably efficient for something that inherently suffers from combinatoric explosion
Most Valuable Expert 2012
Distinguished Expert 2018

Commented:
I have no problem with sdstuber's suggestion.

Do more with

Expert Office
Submit tech questions to Ask the Experts™ at any time to receive solutions, advice, and new ideas from leading industry professionals.

Start 7-Day Free Trial