I'm trying to find a relatively efficient algorithm to solve the following problem:

I have a finite set S = {s_1, s_2, s_3,...}, where each s_i is itself a non-empty finite set. I need to find a subset of S -- call it T -- where the number of elements in the union over all elements in T is equal to the number of elements in T itself.

For example, if I have the set S = {{1,2,8}, {3,5}, {1,3,8}, {2,5,7}, {1,2,5,8}, {3,8}, {2,7}, {7}}, then the algorithm should return the set T = {{3,5}, {2,5,7}, {2,7}, {7}} because T has 4 elements and union({3,5}, {2,5,7}, {2,7}, {7}) also has 4 elements.

The only way I can think of solving this problem is examining each element of the power set of S to see if it matches the criteria, but that algorithm runs in exponential time. Can anybody think of a more efficient way to solve this problem, or point me in the right direction? Thanks.

I have a finite set S = {s_1, s_2, s_3,...}, where each s_i is itself a non-empty finite set. I need to find a subset of S -- call it T -- where the number of elements in the union over all elements in T is equal to the number of elements in T itself.

For example, if I have the set S = {{1,2,8}, {3,5}, {1,3,8}, {2,5,7}, {1,2,5,8}, {3,8}, {2,7}, {7}}, then the algorithm should return the set T = {{3,5}, {2,5,7}, {2,7}, {7}} because T has 4 elements and union({3,5}, {2,5,7}, {2,7}, {7}) also has 4 elements.

The only way I can think of solving this problem is examining each element of the power set of S to see if it matches the criteria, but that algorithm runs in exponential time. Can anybody think of a more efficient way to solve this problem, or point me in the right direction? Thanks.

Do more with

EXPERT OFFICE^{®} is a registered trademark of EXPERTS EXCHANGE^{®}

This is not a trivial problem

• Reject potential sets where any any card(s_i) > card(t)

• start with the lowest card(t), yielding {{7}} as solution (and then {{2,7},{7}} for example)

• arrange sets in decreasing cardinal order, prune as soon as the cumulative union has card(u) > card(t), and eliminate potential sets starting the same with equal or lower cardinality.

If you need all possible answers, I still feel there should be ways to optimise the search. If you can arrange all potential sets in a binary tree, there should be nodes where you can prune the search, thus eliminating entire branches. For example:

S = {{1,2,3,4},{1,2},{2,3}}

All sets containing s_1 can be eliminated without exploring them in more detail. I don't see the tree, however...

(°v°)

Store for each s_i:

card(s_i)

Range(s_i) == (Min_i, Max_i)

When considering (s_i U s_j) , if the ranges do not overlap, then immediately you know that

card( s_i U s_j ) = card( s_i ) + card( s_j ).

If the ranges do overlap, then to determine the intersection, only the overlapped range need be considered.

Some partial matrix can be pre-computed to also improve performance. Some candidates are:

card( s_i U s_j ) and possibly card( s_i Intersect s_j )

===

For your example...

Other solutions where T has 5 elements are:

{3,5}, {3,8}, {1,2,8}, {1,3,8}, {1,2,5,8}

{7}, {2,7}, {3,5}, {3,8}, {2,5,7}

A solution where T has 6 elements is:

{7}, {2,7}, {3,5}, {3,8}, {2,5,7}, {1,2,8},

===

Sometimes, there is only one solution. For example:

S = { {2,6} {1,7} {1,2,6} {2,6,7} }

Only solution is: T = S

## Premium Content

You need an Expert Office subscription to comment.Start Free Trial