troubleshooting Question

# Calculating variations, permutations and subsets using trees

Programming.NET ProgrammingC#Algorithms
Hi all,

Can anyone suggest an efficient method of sorting / arranging / printing the following...

I have a text file with approx 200,000 lines. On each line is a space-separated list of positive integer values, and the same number does not appear twice on any given line.
The integer values are sorted in ascending order from left to right, and each list has a variable length - some lines contain a few numbers, and the longest line has 73 unique integer values.

I can count the number of occurrences of each unique value within the file, however what i'm trying to do is a bit more complicated than that. I need to find out how many occurrences exist of each possible set and subset of values, and once that's been calculated to return all occurrences of more than N.

i.e. say the first line contains the following:

1 13 82 83 130 216 395 400 412

Possible subsets are 1, 1 and 13, 1 and 82, 1 and 83.... etc.. 1,13,82 1,13,83 1,13,130 etc
all the way up to a subset that contains every number in that line.

I have looked through numerous different algorithms and methods, but most of what i'm finding comes down to permutations, but i'm after values that have already been sorted, and i don't necessarily want every number in my searches.
For example, the set { 1, 83, 216 } occurs in the line that I gave an example of above, as does { 82, 83, 412 } but the integer values are not next to each other in the array.

I realise that I probably need to provide a fair bit more detail, but does this give anyone any ideas at this point?

TIA!
Join the community to see this answer!
###### Learn from the best

Network and collaborate with thousands of CTOs, CISOs, and IT Pros rooting for you and your success.

Andrew Hancock - VMware vExpert
See if this solution works for you by signing up for a 7 day free trial.