Valid combination of records

Mike Littlewood used Ask the Experts™
on
Hi,

I am sure there are some nice mathematical methods/equations/patterns out there that might be able to help me, but I thought I would ask the question here.

I have a table containing records which I would like to build another dataset from of valid combinations.

Below is a sample of the records and what I think the outcome of the results should be (columns can be null for all records). What I would like is a nice piece of SQL or a procedure to do this.
Originally I thought I could just cross join each of the columns against each other, but that produces far more combinations than are valid.
``````Records and column index
1   2   3   4   5   6
1A                  6A
1A                  6B
3A  4A
3B  4B
2A          5A
2A  3A

Valid combination records
1A  2A  3A  4A  5A  6A
1B  2A  3A  4A  5A  6B

Records and column index
Removed last record
1   2   3   4   5   6
1A                  6A
1A                  6B
3A  4A
3B  4B
2A          5A

Valid combination records
1A  2A  3A  4A  5A  6A
1B  2A  3A  4A  5A  6B
1A  2A  3B  4B  5A  6A
1B  2A  3B  4B  5A  6B
``````
Comment
Watch Question

Do more with

EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®

Commented:
What is the rationale. What exactly are you trying to do. Remove null value and replace it by the next non null value in the column or what
Engineer

Commented:
Yes, if there is a value in a column, then the valid combination record must contain a value from that column, otherwise it can be null. The valid record though must use the whole of as many records combined as it can.

Commented:
Looks like something that can be done by a looping through the recordset and generating a temporary table as  you go through

Something like:

For Each field in MainTable

Do Until EOF

If not isnull(field) then
Insert Into TempTable(FieldName) Values(recordset(field))
End If

recordset.movenext

loop

Next Field

The exact syntax will depend on the program that you are using to design the application but that should give you a general idea
Engineer

Commented:
If only it was that simple :o)
The way you describe is pretty much the same as cross joining each column against each other without taking into account the individual records as a whole.
This is the method I did first time round and it didn't work.
I am an experienced coder as well as SQL, but this issue is a bit more savvy.
Retired IT Professional

Commented:
Tell me the process to get the Valid combination records by inspecting the table records.
Why Removed last record, and not the 2 before last as well?
Most Valuable Expert 2012
Distinguished Expert 2018

Commented:

I do not see a '1B' in either example.  If the first set of expected output, I don't see where the '3B' and '4B' records are.

Engineer

Commented:
Hi,

I only removed the last record to show you how that record affected the resulting valid combinations.
The example I have given above is very basic, but it is only there to highlight how a valid combination record is created.
If you want a more complex example, then I can provide one.
Engineer

Commented:
DOH! I don't know how I put that 1B in there sorry
``````Records and column index
1   2   3   4   5   6
1A                  6A
1A                  6B
3A  4A
3B  4B
2A          5A
2A  3A

Valid combination records
1A  2A  3A  4A  5A  6A
1A  2A  3A  4A  5A  6B

Records and column index
Removed last record
1   2   3   4   5   6
1A                  6A
1A                  6B
3A  4A
3B  4B
2A          5A

Valid combination records
1A  2A  3A  4A  5A  6A
1A  2A  3A  4A  5A  6B
1A  2A  3B  4B  5A  6A
1A  2A  3B  4B  5A  6B
``````
Most Valuable Expert 2011
Top Expert 2012

Commented:
there must be other rules you're not mentioning, or your examples still aren't correct

in your first example (including the revised first example)  you have this row....

3B  4B

Most Valuable Expert 2012
Distinguished Expert 2018

Commented:
>>If you want a more complex example, then I can provide one.

Just need to understand the requirements.  Maybe a better example will help?  The 'picture' you are posting isn't getting us there.

By chance, are you wanting ALL possible combinations of ALL non-null values?
Engineer

Commented:
No that is correct, this is the sort of thing to watch out for.

2A can go with 5A
2A can go with 3A
3A can go with 4A

Therefore
2A 3A 4A 5A

2A can only go with 3A therefore it cannot go with 3B

You MUST use a value from each column if a value exists.
You MUST use the whole record in a valid combination.
Engineer

Commented:
>> By chance, are you wanting ALL possible combinations of ALL non-null values?

All combinations of all non-null value records as long as the column has a non null value.
If a column, say 2, has no value in any record, it must return NULL for all other valid combinations.
Most Valuable Expert 2011
Top Expert 2012

Commented:
>>> You MUST use a value from each column if a value exists.
>>> You MUST use the whole record in a valid combination.

using those two rules means your first example doesn't work because...

1A                  6A
1A                  6B

if I use 1A, 6A then I violate the "whole record" rule for 1A,6B and vice versa
Engineer

Commented:
And hey presto, you solved the missing 1B from earlier. Man I'm tired today it seems.
``````Records and column index
1   2   3   4   5   6
1A                  6A
1B                  6B
3A  4A
3B  4B
2A          5A
2A  3A

Valid combination records
1A  2A  3A  4A  5A  6A
1B  2A  3A  4A  5A  6B

Records and column index
Removed last record
1   2   3   4   5   6
1A                  6A
1B                  6B
3A  4A
3B  4B
2A          5A

Valid combination records
1A  2A  3A  4A  5A  6A
1B  2A  3A  4A  5A  6B
1A  2A  3B  4B  5A  6A
1B  2A  3B  4B  5A  6B
``````
Retired IT Professional

Commented:
I can't follow! Still confusing rules.

I expect something like this:
Example 1:
Look at record 1
col1 is 1A
col2 is null, next not null value in col2 is 2A
col3 is null, next not null value in col3 is 3A
and so on
record1 valid combination is : 1A  2A  3A  4A  5A  6A
and for record 2
record2 valid combination is : 1B  2A  3A  4A  5A  6B

Exampe2, following above logic leaves us with:
1A  2A  3A  4A  5A  6A
1B  2A  3A  4A  5A  6B

What rules will allow to list the other 2 records?
Engineer

Commented:
I think you are assuming you can only use the next record to fill in the null values, this is not correct.
The idea is to AND ANY records together you can to produce a distinct list of valid combination records.

Records 1 and 2 cannot obviously exist together, and neither can records 3 and 4.
We can therefore cross join these to get the 4 valid results, each of which can join to record 5.

I'm really not sure how I can explain this anymore than I am.
Engineer

Commented:
Does this more real world example help rather than A's and B's?
``````1     2     3     4     5     6
DAVE				ENGLISH
JOHN				SPANISH
TV	  HORROR
BOOKS SCI-FI
HOLIDAY		SPAIN

DAVE speaks ENGLISH, Goes on HOLIDAY to SPAIN, and likes HORROR TV
DAVE speaks ENGLISH, Goes on HOLIDAY to SPAIN, and likes SCI-FI books
JOHN speaks SPANISH, Goes on HOLIDAY to SPAIN, and likes HORROR TV
JOHN speaks SPANISH, Goes on HOLIDAY to SPAIN, and likes SCI-FI BOOKS
ie
DAVE/HOLIDAY/TV/HORROR/SPAIN/ENGLISH
DAVE/HOLIDAY/BOOKS/SCI-FI/SPAIN/ENGLISH
JOHN/HOLIDAY/TV/HORROR/SPAIN/SPANISH
JOHN/HOLIDAY/BOOKS/SCI-FI/SPAIN/SPANISH

Putting record 6 back in, you can only go on HOLIDAY and watch TV
Therefore Reading BOOKS/SCI-FI cannot go with HOLIDAY/SPAIN

1     2     3     4     5     6
DAVE			      ENGLISH
JOHN			      SPANISH
TV	  HORROR
BOOKS SCI-FI
HOLIDAY		SPAIN
HOLIDAY TV

DAVE speaks ENGLISH, Goes on HOLIDAY to SPAIN, and likes HORROR TV
JOHN speaks SPANISH, Goes on HOLIDAY to SPAIN, and likes HORROR TV
ie
DAVE/NULL/TV/HORROR/SPAIN/ENGLISH
JOHN/NULL/TV/HORROR/SPAIN/SPANISH

If on the other hand I now took out HOLIDAY from all records,
I now have no conditions for column 2, therefore I can ignore this column

1     2     3     4     5     6
DAVE			      ENGLISH
JOHN                          SPANISH
TV	  HORROR
BOOKS SCI-FI
SPAIN
TV

DAVE speaks ENGLISH, Goes to SPAIN, and likes HORROR TV
DAVE speaks ENGLISH, Goes to SPAIN, and likes SCI-FI books
JOHN speaks SPANISH, Goes to SPAIN, and likes HORROR TV
JOHN speaks SPANISH, Goes to SPAIN, and likes SCI-FI BOOKS
ie
DAVE/NULL/TV/HORROR/SPAIN/ENGLISH
DAVE/NULL/BOOKS/SCI-FI/SPAIN/ENGLISH
JOHN/NULL/TV/HORROR/SPAIN/SPANISH
JOHN/NULL/BOOKS/SCI-FI/SPAIN/SPANISH

If I made a very simple version with only 3 columns

1       2       3
DAVE						ENGLISH
JOHN						SPANISH
TV      ENGLISH
BOOKS   SPANISH
DAVE    TV
JOHN    BOOKS

The valid combinations would be
DAVE/TV/ENGLISH
JOHN/BOOKS/SPANISH

If though I altered records 3 and 4, the resulting data set would give me nothing

1       2       3
DAVE						ENGLISH
JOHN						SPANISH
BOOKS   ENGLISH
TV      SPANISH
DAVE    TV
JOHN    BOOKS

Dave can only watch tv
Dave can only speak english
But TV is only in Spanish

Therefore Dave cant watch TV in spanish.
And Visa Versa, John can't read books in English
``````
Engineer

Commented:
I think I'm going to have to delete this question if I can't get anyone to understand the problem.
Engineer

Commented:
I've requested that this question be deleted for the following reason:

Lack of understanding the question
Retired IT Professional

Commented:
mikelittlewood,
I don't have oracle installed on my machine. But I contributed to this thread because of your experience so I may help in the logic. This is only if we analyse the problem correctly.

Can you use the 3 column example and explain in the way I did how to get each combination.
Are the words ENGLISH and SPANISH leftovers from editing after copy and paste? Do they have any effect on selecting the combinations.

Example:
record1 col1 is DAVE col2 is null, col 3 is null
we look for a record that is ..... and this is record no ....
Most Valuable Expert 2012
Distinguished Expert 2018

Commented:
Sorry you are deleting the question.

Thanks for the new data.

The good news:  I think I now understand the requirement.
The Bad, I'm not sure how to pull this off.  It will likely require some detailed coding that is a little above my head.

Let me make sure I understand it so maybe other Experts will know how to do achieve it.

The way I understand it, you are looking to develop sort of an AI inference engine:
http://en.wikipedia.org/wiki/Expert_system#Inference_rules

If a pair exists in a row, that is a 'rule'.  Once you get all the rules, create the result set.
Most Valuable Expert 2011
Top Expert 2012

Commented:

SELECT DISTINCT x1.*,
x2.*,
x3.*,
x4.*,
x5.*,
x6.*
FROM (SELECT DISTINCT col1
FROM yourdata
WHERE col1 IS NOT NULL) x1,
(SELECT DISTINCT col2
FROM yourdata
WHERE col2 IS NOT NULL) x2,
(SELECT DISTINCT col3
FROM yourdata
WHERE col3 IS NOT NULL) x3,
(SELECT DISTINCT col4
FROM yourdata
WHERE col4 IS NOT NULL) x4,
(SELECT DISTINCT col5
FROM yourdata
WHERE col5 IS NOT NULL) x5,
(SELECT DISTINCT col6
FROM yourdata
WHERE col6 IS NOT NULL) x6,
yourdata d2
WHERE (d2.col1 IS NULL OR d2.col1 = x1.col1)
AND (d2.col2 IS NULL OR d2.col2 = x2.col2)
AND (d2.col3 IS NULL OR d2.col3 = x3.col3)
AND (d2.col4 IS NULL OR d2.col4 = x4.col4)
AND (d2.col5 IS NULL OR d2.col5 = x5.col5)
AND (d2.col6 IS NULL OR d2.col6 = x6.col6)
AND 1 =
ALL (SELECT CASE
WHEN ((d.col1 IS NULL OR d.col1 = x1.col1)
AND (d.col2 IS NULL OR d.col2 = x2.col2)
AND (d.col3 IS NULL OR d.col3 = x3.col3)
AND (d.col4 IS NULL OR d.col4 = x4.col4)
AND (d.col5 IS NULL OR d.col5 = x5.col5)
AND (d.col6 IS NULL OR d.col6 = x6.col6))
OR  NOT (NVL(d.col1, x1.col1) = x1.col1
OR  NVL(d.col2, '-') = x2.col2
OR  NVL(d.col3, '-') = x3.col3
OR  NVL(d.col4, '-') = x4.col4
OR  NVL(d.col5, '-') = x5.col5
OR  NVL(d.col6, '-') = x6.col6)
THEN
1
ELSE
0
END
FROM yourdata d
WHERE (d.col1 = x1.col1
OR  d.col2 = x2.col2
OR  d.col3 = x3.col3
OR  d.col4 = x4.col4
OR  d.col5 = x5.col5
OR  d.col6 = x6.col6))
Most Valuable Expert 2011
Top Expert 2012

Commented:
I don't expect that to be very fast with large data samples, but it works for what you posted
Most Valuable Expert 2011
Top Expert 2012
Commented:
here's a slightly more compact version, there were some redundant conditions in my previous post

SELECT DISTINCT x1.col1,
x2.col2,
x3.col3,
x4.col4,
x5.col5,
x6.col6
FROM (SELECT DISTINCT col1
FROM yourdata
WHERE col1 IS NOT NULL) x1,
(SELECT DISTINCT col2
FROM yourdata
WHERE col2 IS NOT NULL) x2,
(SELECT DISTINCT col3
FROM yourdata
WHERE col3 IS NOT NULL) x3,
(SELECT DISTINCT col4
FROM yourdata
WHERE col4 IS NOT NULL) x4,
(SELECT DISTINCT col5
FROM yourdata
WHERE col5 IS NOT NULL) x5,
(SELECT DISTINCT col6
FROM yourdata
WHERE col6 IS NOT NULL) x6,
yourdata d2
WHERE 1 =
ALL (SELECT CASE
WHEN (d.col1 IS NULL OR d.col1 = x1.col1)
AND (d.col2 IS NULL OR d.col2 = x2.col2)
AND (d.col3 IS NULL OR d.col3 = x3.col3)
AND (d.col4 IS NULL OR d.col4 = x4.col4)
AND (d.col5 IS NULL OR d.col5 = x5.col5)
AND (d.col6 IS NULL OR d.col6 = x6.col6)
THEN
1
END
FROM yourdata d
WHERE (d.col1 = x1.col1
OR  d.col2 = x2.col2
OR  d.col3 = x3.col3
OR  d.col4 = x4.col4
OR  d.col5 = x5.col5
OR  d.col6 = x6.col6));
Most Valuable Expert 2011
Top Expert 2012

Commented:
I recommend accept of http:#36361130

It answers the question and is reasonably efficient for something that inherently suffers from combinatoric explosion
Most Valuable Expert 2012
Distinguished Expert 2018

Commented:
I have no problem with sdstuber's suggestion.

Do more with