Link to home
Start Free TrialLog in
Avatar of Unimatrix_001
Unimatrix_001Flag for United Kingdom of Great Britain and Northern Ireland

asked on

Need help speeding up a search function...

Hi,

See the attached code, I'm needing this to be as fast as possible - if you can see any improvements I could make I'd be very grateful.

Thanks,
Uni
unsigned int countNeedles(vector<unsigned char> &aHaystack, vector<unsigned char> &aNeedle){
	vector<unsigned char>::iterator resultPointer=aHaystack.begin();
	unsigned int occurences=0;
	while(true){
		resultPointer=search(resultPointer, aHaystack.end(), aNeedle.begin(), aNeedle.end());
		if(resultPointer==aHaystack.end()){
			break;
		}else{
			resultPointer+=aNeedle.size();
			occurences++;
		}
	}
	return occurences;
}

Open in new window

Avatar of evilrix
evilrix
Flag of United Kingdom of Great Britain and Northern Ireland image

This is just doing a count of the intersection right?

Try looking at set_intersection...

http://www.sgi.com/tech/stl/set_intersection.html
http://www.cplusplus.com/reference/algorithm/set_intersection.html

...you might find the STL algorithm for doing this is faster. Obviously you end up with a vector of the intersection, of which size() will give you the matching occurances.
// set_intersection example
#include <iostream>
#include <algorithm>
#include <vector>
using namespace std;
 
int main () {
  int first[] = {5,10,15,20,25};
  int second[] = {50,40,30,20,10};
  vector<int> v(10);                           // 0  0  0  0  0  0  0  0  0  0
  vector<int>::iterator it;
 
  sort (first,first+5);     //  5 10 15 20 25
  sort (second,second+5);   // 10 20 30 40 50
 
  it=set_intersection (first, first+5, second, second+5, v.begin());
                                               // 10 20 0  0  0  0  0  0  0  0
 
  cout << "intersection has " << int(it - v.begin()) << " elements.\n";
 
  return 0;
}

Open in new window

Avatar of Unimatrix_001

ASKER

Hi there,

>>This is just doing a count of the intersection right?
I'm sorry, I don't understand? It is counting the number of times the needle occurs.

The problem is with the above is that I can't use the sort function. See, my haystack is like:
A,B,C,A,C,A etc...
But the needle spans more than one element, such as:
C,A
In which case the count there would be 2. Clearly if I sorted the haystack, then I can't count the occurences.
>> I'm sorry, I don't understand? It is counting the number of times the needle occurs.
It's counting the intersection

>> But the needle spans more than one element
Not sure I follow this as both vectors are the same type and if search works then so should intersection. That being the case the intersection code I posted above (not mine but from one of the sites) should work.
Well, before set_intersection is called, the code runs stl::sort. I can't use sort on neither the needle or the haystack, doing so would mean that I wouldn't get the correct results.
>> I can't use sort on neither the needle or the haystack, doing so would mean that I wouldn't get the correct results.
So, are you saying search works because even though it does a char by char comparison it includes char pairs? If that's the case then I guess I can see why sort won't work and search will. You could resolve this by wrapping each char pair in a class/struct and overloading the operator < and operator = so that you could sort and compare the char pairs... if you see what I mean. It'd be a little bit of work but the result would likely be simpler and more efficient. Obviously, you'd need to change the vectors to store the struct/class instead.
>>So, are you saying search works because even though it does a char by char comparison it includes char pairs?
Correct. :) Although the needle is not restricted to just a pair, it may contain up to 16 chars.

>>You could resolve this by wrapping each char pair in a class/struct and overloading the operator < and operator = so that you could sort and compare the char pairs.
I'd rather not, because that would then mean wrapping the haystack into pairs; although I don't just search for pairs in the haystack, so it would mean I would have to rewrap the haystack depending on the size of the needle.
ASKER CERTIFIED SOLUTION
Avatar of evilrix
evilrix
Flag of United Kingdom of Great Britain and Northern Ireland image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Okay, thanks anyways - I'll leave it open for a few days before bumping it up...
>> Okay, thanks anyways - I'll leave it open for a few days before bumping it up...
Sure, no worries :)
In order to optimize an algorithm like this, we'll need to exploit special properties of the haystack and needle data.

What does the data represent ? Is the data somehow aligned ? Would it be possible to use unsigned ints instead of unsigned chars ? Comparing an int instead of chars would make this quite a bit faster.
Hi Infinity,

The data is nothing more than a string of values each one between 0 and 4 inclusive. No the data isn't aligned in any way shape or form. I could use unsigned ints, although that would force memory usage up by a factor of 4, something I'd prefer to avoid if at all possible. It may be a length of string question, but could you estimate the "quite a bit faster" approach?

Thanks,
Uni
What I meant by using unsigned ints is to compare more than one byte at a time (ie. 4 on a 32bit system). The speedup would potentially be a factor 4. But data alignment makes this more complicated, as well as data lengths that are not a multiple of sizeof(int) bytes.

Since you say the bytes can only have 0-4 as value, would it be possible to pack multiple values into one byte. An 8bit byte can contain 3 such values unambiguously (5^3 = 125 <= 256). This would again require certain alignment characteristics, which I assume are not there.
>> The data is nothing more than a string of values each one between 0 and 4 inclusive
In the example you have above they were strings of letters...

"The problem is with the above is that I can't use the sort function. See, my haystack is like:
A,B,C,A,C,A etc...
But the needle spans more than one element, such as:
C,A"

Is this not actually the case? If the data is really just a string of numbers them, as I8 eludes, there is room for further optimization.
Since you say the bytes can only have 0-4 as value, would it be possible to pack multiple values into one byte. An 8bit byte can contain 3 such values unambiguously (5^3 = 125 <= 256). This would again require certain alignment characteristics, which I assume are not there.

Yes, this is a possibility and something I have considered, but I'm unsure how to get around the issue of where the needle is embedded in a byte or crosses the byte boundary. I.e:

Byte: 1,3,0
Needle: 1,3

Or:

Byte: [0, 0, 1], [3, 0, 1]
Needle: 1, 3

Also, packing the bytes, I then have the issue of where a needle may be 7 values long, in which case this would span 3 bytes.

Have you any idea how I could approach these problems?

-----------------------------------------------------------------------------------

What I meant by using unsigned ints is to compare more than one byte at a time (ie. 4 on a 32bit system). The speedup would potentially be a factor 4. But data alignment makes this more complicated, as well as data lengths that are not a multiple of sizeof(int) bytes.

I'm sorry, could you explain that further, I'm not understanding you 100%.

-----------------------------------------------------------------------------------

Thanks,
Uni
>>In the example you have above they were strings of letters...
A-E, 0-4, take your pick, we still store each in a byte... From the users point of view it is A-E, but internally we use 0-4. I was expecting optimisations such as 'to use arrays', etc. to appear more than some enhancements that could be done using the data, so I didn't wish to add what I thought was unnecessary information to the question. :)

>>If the data is really just a string of numbers them, as I8 eludes, there is room for further optimization.
As above, you're free to call them what you wish; I guess the important part here is there are only 5 choices, each one uses a byte.

Thanks,
Uni
>> but I'm unsure how to get around the issue of where the needle is embedded in a byte or crosses the byte boundary.

That's exactly what I meant with alignment issues ;) If a needle can start at any given byte, then this makes speeding up things by packing bytes pretty much impossible.


>> I'm sorry, could you explain that further, I'm not understanding you 100%.

Right now, you're comparing one byte at a time. Assuming that you are working on a 32bit system : If your needle is 4 bytes long, then you potentially need to make 4 byte comparisons. You can however achieve the same result with 1 unsigned int comparison (comparing all 4 bytes at once).

There are a few issues with this :

1) if the needle is not a multiple of 4 bytes, you have a few trailing bytes that need to be handled separately, for example by masking the unused bytes.

2) you would perform int comparisons at byte boundaries (ie. not at word boundaries). That might be a problem on some systems.


Note that we're slowly moving into the extreme optimization area here ;) Do you really need such an optimized search ?


One more thought : do you have control over how the haystack is constructed ? At the moment the haystack is constructed, do you have any idea about what the needle will be ?
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
>>That's exactly what I meant with alignment issues ;) If a needle can start at any given byte, then this makes speeding up things by packing bytes pretty much impossible.

Ah I understand what you mean now...

Yes, the 4 byte idea is okay, but unfortunately the first issue you raise will most likely be the highest occuring - probably to the point where all possible savings would be undone by the cost of masking.

It needn't be extreme, I was looking for perhaps a obvious observations.

No, I have no ideas of the haystack contents until it comes into the program. I know that the needle will be a length between 2 and 16.
>> I was looking for perhaps a obvious observations.

Nothing that I can see ... The way it's implemented looks pretty good to me. Do you agree evilrix ?
Hm, it doesn't seem I'm going to get better than what I have, perhaps I should close this question... Can you both put in your recommendations for what to do (delete, split points, paq & refund) etc...

Thanks,
Uni
Maybe just one minor thing : you can stop checking at (aHaystack.end() - aNeedle.size()) instead of just aHaystack.end() ...
>> perhaps I should close this question...

If you want, you can wait for a third opinion ;)
That is okay - I shall PAQ this question. Thank you both for your efforts. :)

Uni
>> Nothing that I can see ... The way it's implemented looks pretty good to me. Do you agree evilrix ?
I agree, I don't think there is much you can do given the restrictions imposed

>> Can you both put in your recommendations for what to do (delete, split points, paq & refund) etc...
I originally gave you the answer in the post {http:#21633376}, Infinity08 has then worked through it with you to see if he can see any other way forward and, I believe, has reached the same conclusion in this post here {http:#21634129}. Although I accept they may not be the answer you want they do answer your question.

-Rx.
Very well - points split 50/50.
Thank you. :)
>> has reached the same conclusion in this post here

Yep, sorry about that ;) I thought there might be some special properties about the data that could maybe be exploited, but that didn't seem the case :(
>> Yep, sorry about that
Ditto... it's never nice to have to admit defeat :(