C++: find with iterator of map

I am writing code for the Apriori algorithm, and I want to find the maximal frequent itemsets and then write them to a file.

The code contains:
// elements
typedef std::string DataItem;

// set of elements
typedef std::set< DataItem > DataItemSet;

//Transactions
typedef DataItemSet DataTransaction;

// set of transactions
typedef std::vector< DataTransaction > DataTransactionSet;

// set of candidates
// key: a collection of items , value:count
typedef std::map< DataItemSet, int > DataItemSetCountTable;
typedef std::vector< DataItemSetCountTable > DataCandeItemTableList;

// Frequent item set
typedef std::set<DataItemSet> DataFreItSet;


I managed to write this part

// find the Maximal Frequent item sets
	cout << "\n\n Writing Frequent item sets in (MaximalFrequentItemsets.txt) file ...\n";
	outFile3 << "\n Maximal Frequent item sets With minimum Support " << minSup << " are: " << endl << endl;
	
	DataItemSetCountTable allCanTalbe1;
	for (size_t i = 0; i<allCanTableList.size(); ++i)
	{
		allCanTalbe1.insert(allCanTableList[i].begin(), allCanTableList[i].end());
	}

	for (DataItemSetCountTable::iterator CandIter = allCanTalbe1.begin(); CandIter != allCanTalbe1.end(); ++CandIter)
	{
		for (DataItemSetCountTable::const_iterator CandIter1 = allCanTalbe1.begin(); CandIter1 != allCanTalbe1.end(); ++CandIter1)
		{
			// just compare item set with the level next to it
			if (CandIter->first.size() == CandIter1->first.size() - 1)
			{
				
				bool isThere = includes(CandIter1->first.begin(), CandIter1->first.end(), CandIter->first.begin(), CandIter->first.end());
				if (isThere)
				{
						// check if its count(support) greater than the one found in the next level
					if (CandIter->second > CandIter1->second)
					{
						cout << CandIter1->first << "  " << CandIter->first << endl;

						outFile3 << CandIter1->first << "\t Sup: " << (float(CandIter->second) / transSet.size()) * 100 << endl;
					}
					
				}
			}
		}
	}

But instead of using:

bool isThere = includes(CandIter1->first.begin(), CandIter1->first.end(), CandIter->first.begin(), CandIter->first.end());
				if (isThere)

I would like to use find, because I want to check whether the whole value in CandIter (e.g. i1i2) is in CandIter1 (e.g. i1i2i3), and not each individual data item (e.g. i1, i2):
CandIter1->first.find(CandIter->first)

But it gives me an error.
Is there another function or method that I can use, or is the whole code one big error?

Please help.
AaeshahAsked:

Karrtik IyerSoftware ArchitectCommented:
I'm not sure I understand your requirement. Can you please write your algorithm in plain English (step 1, step 2, etc.) so that we can help? I don't understand why you iterate over the same map twice in the same direction. Both for loops below run from allCanTalbe1.begin() to allCanTalbe1.end(); the only difference is that one uses a const_iterator and the other a non-const iterator.
for (DataItemSetCountTable::iterator CandIter = allCanTalbe1.begin(); CandIter != allCanTalbe1.end(); ++CandIter)
      {
            for (DataItemSetCountTable::const_iterator CandIter1 = allCanTalbe1.begin(); CandIter1 != allCanTalbe1.end(); ++CandIter1)
            {
0
AaeshahAuthor Commented:
Simply put: how can I compare two values in a map using iterators, matching the same sequence of letters as a whole word rather than separating each string? includes did the comparison, but on each separate string in the word.

For example, given:
first word: sad          second word: sadness      other word: sand
I want to find sad in sadness but not in sand. includes returns true for sand because it has s, a, d in it, though not in the same order.

I hope that is clear. The const_iterator was a mistake; both iterators should be the same type. I use two for loops because I want to compare each value with the one next to it.
0
Karrtik IyerSoftware ArchitectCommented:
So if I understand correctly, the set (typedef std::set< DataItem > DataItemSet) contains all the strings, which in your previous example are, say:
sad
sadness
sand
Now you want to find out whether each of these strings is a substring of the other strings in the collection, and how many times each string occurs as a substring of the items in your set.
So in your previous example the output would be:
sad --> 2 times
sadness --> 1 time
sand --> 1 time
Is this the expected output?
0
Karrtik IyerSoftware ArchitectCommented:
If I understand what you are looking for, just use std::string's find method. It will not find sad in sand, which is what you say includes is doing in your code. See the reference below:
std::string::find
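To make the suggestion concrete, here is a minimal sketch of a substring test with std::string::find, using the sad / sadness / sand example from earlier in the thread. The helper name containsSubstring is hypothetical and is not taken from the asker's code.

```cpp
#include <string>

// Hypothetical helper: true when `needle` occurs in `haystack` as one
// contiguous, in-order run of characters.
bool containsSubstring(const std::string& haystack, const std::string& needle)
{
    // find returns std::string::npos when there is no match.
    return haystack.find(needle) != std::string::npos;
}
```

With this, containsSubstring("sadness", "sad") is true and containsSubstring("sand", "sad") is false, because the letters s, a, d are not adjacent in "sand".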
0
evilrixSenior Software Engineer (Avast)Commented:
What if you have "ss" and a target of "ssss", would you expect that to find 1, 2 or 3 matches? If the latter you can't just use find; you'll need a smarter algorithm.
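To illustrate the "ss" in "ssss" question, here is a rough sketch that counts matches with repeated calls to std::string::find. The helper name countOccurrences and its allowOverlap flag are assumptions made for illustration only.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts occurrences of `needle` in `haystack`.
// With allowOverlap == true the search resumes one character after the start
// of each hit; otherwise it resumes just past the end of the hit.
std::size_t countOccurrences(const std::string& haystack,
                             const std::string& needle,
                             bool allowOverlap)
{
    if (needle.empty()) return 0;   // avoid matching at every position

    std::size_t count = 0;
    std::size_t pos = haystack.find(needle);
    while (pos != std::string::npos) {
        ++count;
        pos = haystack.find(needle, pos + (allowOverlap ? 1 : needle.size()));
    }
    return count;
}
```

For "ssss" and "ss" this gives 3 when overlapping matches are counted and 2 when they are not, while a single find call only answers whether there is at least one match.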

Can we forget code for the moment? Can you explain the use case in simple layman's terms and give a simple representative example that covers all use cases and known edge cases? The trouble with focusing on the existing code is it will skew our thinking. Let's make sure we're all on the same page in terms of requirements and then we can return to the code and see what needs doing to it.

Thanks.
0
AaeshahAuthor Commented:
Karrtik Iyer,
the first thing that came to my mind was using find, but it gives me errors:
CandIter1->first.find(CandIter->first);


 'std::_Tree<std::_Tset_traits<_Kty,_Pr,_Alloc,false>>::find' : 2 overloads have no legal conversion for 'this' pointer	

IntelliSense: no instance of overloaded function "std::set<_Kty, _Pr, _Alloc>::find [with _Kty=DataItem, _Pr=std::less<DataItem>, _Alloc=std::allocator<DataItem>]" matches the argument list
            argument types are: (const DataItemSet)
            object type is: const DataItemSet	


evilrix,
I believe you're right. The concept itself is easy to understand, but it gets more complicated when it comes to programming it. The simple concept is that I have items in sets (lines) like this:

T1: I1,I2,I5
T2: I2,I4
T3: I2,I3
T4: I1,I2,I4
T5: I1,I3
T6: I2,I3
T7: I1,I3
T8: I1,I2,I3,I5
T9: I1,I2,I3

I read this file and count how many times I1 occurs, how many times I2 occurs, and so on.
Then I do the same for pairs: how many transactions contain I1 and I2 together, how many contain I2 and I3, and so on.
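The counting step described here could be sketched roughly as follows, assuming the typedefs from the question. The names sampleTransactions and countSupport are hypothetical, and the data is just the nine transactions T1 to T9 listed above.

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <vector>

typedef std::string DataItem;
typedef std::set<DataItem> DataItemSet;
typedef std::vector<DataItemSet> DataTransactionSet;

// Hypothetical helper: the sample transactions T1..T9 from the post.
DataTransactionSet sampleTransactions()
{
    DataTransactionSet t;
    t.push_back(DataItemSet{"I1", "I2", "I5"});
    t.push_back(DataItemSet{"I2", "I4"});
    t.push_back(DataItemSet{"I2", "I3"});
    t.push_back(DataItemSet{"I1", "I2", "I4"});
    t.push_back(DataItemSet{"I1", "I3"});
    t.push_back(DataItemSet{"I2", "I3"});
    t.push_back(DataItemSet{"I1", "I3"});
    t.push_back(DataItemSet{"I1", "I2", "I3", "I5"});
    t.push_back(DataItemSet{"I1", "I2", "I3"});
    return t;
}

// Hypothetical helper: number of transactions that contain every item of
// `candidate`. std::includes works here because std::set iterates in sorted order.
int countSupport(const DataTransactionSet& transactions, const DataItemSet& candidate)
{
    int count = 0;
    for (const DataItemSet& t : transactions) {
        if (std::includes(t.begin(), t.end(), candidate.begin(), candidate.end()))
            ++count;
    }
    return count;
}
```

On the sample data, countSupport gives 6 for {I1}, 7 for {I2} and 4 for {I1, I2}.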

I did this part very well,
but when I need to find the maximal frequent itemsets I get lost. The algorithm is:
1. Compare each element in the first level (I1, I2, I3, ...) with the elements in the next level (I1I2, I2I3, ...).
2. If the element in the first level has a higher frequency count, it is closed.
3. Then check whether there is an element not mentioned in the next level; if so, it is maximal.

I want to get those maximal itemsets.

Book definitions:
Closed: an itemset is closed if none of its immediate supersets has the same support (count) as the itemset.
Maximal: a frequent itemset is maximal if none of its immediate supersets is frequent.
0
AaeshahAuthor Commented:
So, as the result for the frequent itemsets, I got:

Frequent item sets With minimum Support 20 (which means repeated more than once) are:

I1,       cout: 6       Sup: 66.6667
I2,       cout: 7       Sup: 77.7778
I3,       cout: 6       Sup: 66.6667
I4,       cout: 2       Sup: 22.2222
I5,       cout: 2       Sup: 22.2222
I1,I2,       cout: 4       Sup: 44.4444
I1,I3,       cout: 4       Sup: 44.4444
I1,I5,       cout: 2       Sup: 22.2222
I2,I3,       cout: 4       Sup: 44.4444
I2,I4,       cout: 2       Sup: 22.2222
I2,I5,       cout: 2       Sup: 22.2222
I1,I2,I3,       cout: 2       Sup: 22.2222
I1,I2,I5,       cout: 2       Sup: 22.2222

the closed and maximal should be:
Closed Frequent itemsets = {I1, I2, I3, I1I2, I1I3, I2I3, I2I4, I1I2I3, I1I2I5}
Maximal Frequent itemsets = {I2I4, I1I2I3, I1I2I5}
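For what it's worth, the book definitions can be applied directly to the table above. The following is only a sketch, assuming the table holds frequent itemsets only (so any superset missing from it is infrequent). The names sampleFrequentTable, isImmediateSuperset and splitClosedAndMaximal are hypothetical and not part of the asker's code.

```cpp
#include <algorithm>
#include <map>
#include <set>
#include <string>

typedef std::string DataItem;
typedef std::set<DataItem> DataItemSet;
typedef std::map<DataItemSet, int> DataItemSetCountTable;
typedef std::set<DataItemSet> DataFreItSet;

// Hypothetical helper: the frequent item sets and counts listed above.
DataItemSetCountTable sampleFrequentTable()
{
    DataItemSetCountTable t;
    t[DataItemSet{"I1"}] = 6;  t[DataItemSet{"I2"}] = 7;  t[DataItemSet{"I3"}] = 6;
    t[DataItemSet{"I4"}] = 2;  t[DataItemSet{"I5"}] = 2;
    t[DataItemSet{"I1", "I2"}] = 4;  t[DataItemSet{"I1", "I3"}] = 4;
    t[DataItemSet{"I1", "I5"}] = 2;  t[DataItemSet{"I2", "I3"}] = 4;
    t[DataItemSet{"I2", "I4"}] = 2;  t[DataItemSet{"I2", "I5"}] = 2;
    t[DataItemSet{"I1", "I2", "I3"}] = 2;
    t[DataItemSet{"I1", "I2", "I5"}] = 2;
    return t;
}

// True when `big` has exactly one more item than `small` and contains all of it.
bool isImmediateSuperset(const DataItemSet& big, const DataItemSet& small)
{
    return big.size() == small.size() + 1 &&
           std::includes(big.begin(), big.end(), small.begin(), small.end());
}

// Closed: no immediate superset in the table has the same count.
// Maximal: no immediate superset appears in the table at all.
void splitClosedAndMaximal(const DataItemSetCountTable& frequent,
                           DataFreItSet& closed, DataFreItSet& maximal)
{
    for (const auto& cand : frequent) {
        bool hasFrequentSuperset = false;
        bool hasEqualCountSuperset = false;
        for (const auto& other : frequent) {
            if (!isImmediateSuperset(other.first, cand.first)) continue;
            hasFrequentSuperset = true;
            if (other.second == cand.second) hasEqualCountSuperset = true;
        }
        if (!hasEqualCountSuperset) closed.insert(cand.first);
        if (!hasFrequentSuperset) maximal.insert(cand.first);
    }
}
```

Run on the sample table, this yields the same closed and maximal sets as listed above. Note that the closed test uses equality of counts, whereas the loop in the question tests for a strictly greater count against one superset at a time.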
0
Karrtik IyerSoftware ArchitectCommented:
Hi Aaeshah, before we get to the problem of finding the itemsets: I see that your code prints CandIter1->first directly, but CandIter1->first is a std::set, and the standard library provides no operator<< for std::set. Unless you have defined your own overload, you have to print the items individually in a for loop, which works because each item in the set is a string. So, without such an overload, the code cout << CandIter1->first won't compile.
Also, you are calling includes on the complete set, whereas in my earlier post I suggested using find on the strings inside the set (each item of your set).
I partially understood your requirements from your post above, but not completely. Coming back to your original code, I assume the data set (DataItemSet) contains a set of strings. I would like to know how you have stored the level-wise strings in the set, which is in turn stored in a map. In other words, what does your map of DataItemSet versus count contain?
With the above example, can you give at least two entries of this map, including the set?
It would be useful if you could fill in the outline below.
1> map Item 1,
First - DataItemSet (what does this dataitemset contain, some sample values)
Second - Count (some integer value)
2> Map Item 2
First - DataItemSet (what does this dataitemset contain, some sample values, how is this different to item 1 above)
Second - Count (some integer value)

If you are unable to provide the data above, pardon me; without understanding your data, I can only suggest looking at the references below to see whether either of these functions could be useful to you. I am sorry to give such a generic answer, but I cannot work out your data from the code above. Neither may be a direct solution; they are just hints based on the little I understood of what you are trying to do.
Search algorithm
Set::Find

Thanks,
Karrtik
0

Karrtik IyerSoftware ArchitectCommented:
Regarding your other comment: the find you tried is the set's find, and its argument must be a single element, not another set. You get a compilation error because your code passes another set (CandIter->first):
CandIter1->first.find(CandIter->first); // Not correct: set::find takes the value you want to find, which here means a string; you cannot pass another set.
See the example below.
#include <iostream>
#include <set>

int main()
{
  std::set<int> myset;
  std::set<int>::iterator it;

  // set some initial values:
  for (int i=1; i<=5; i++) myset.insert(i*10);    // set: 10 20 30 40 50

  it=myset.find(20);               // find takes a single element value
  myset.erase (it);                // set: 10 30 40 50
  myset.erase (myset.find(40));    // set: 10 30 50

  std::cout << "myset contains:";
  for (it=myset.begin(); it!=myset.end(); ++it)
    std::cout << ' ' << *it;
  std::cout << '\n';

  return 0;
}
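Applied to the typedefs in the question, the difference between looking up one item and testing a whole item set might look like the sketch below. The helper names containsItem and containsAllItems are hypothetical. One point that may resolve the original worry: std::includes on a std::set< std::string > compares whole strings as elements, not individual characters.

```cpp
#include <algorithm>
#include <set>
#include <string>

typedef std::string DataItem;
typedef std::set<DataItem> DataItemSet;

// set::find looks up a single element, here one string.
bool containsItem(const DataItemSet& items, const DataItem& item)
{
    return items.find(item) != items.end();
}

// Subset test: true when every element of `small` is also in `big`.
// std::includes needs sorted ranges, which std::set iteration provides.
bool containsAllItems(const DataItemSet& big, const DataItemSet& small)
{
    return std::includes(big.begin(), big.end(), small.begin(), small.end());
}
```

So containsAllItems is true for {I1, I2} inside {I1, I2, I3}, and false for {"sad"} inside {"sand"}, since "sad" and "sand" are different elements.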
0
Karrtik IyerSoftware ArchitectCommented:
Any updates, Aaeshah?
0