Avatar of Unimatrix_001
Unimatrix_001Flag for United Kingdom of Great Britain and Northern Ireland asked on

Need help speeding up a search function...

Hi,

See the attached code, I'm needing this to be as fast as possible - if you can see any improvements I could make I'd be very grateful.

Thanks,
Uni
unsigned int countNeedles(vector<unsigned char> &aHaystack, vector<unsigned char> &aNeedle){
	vector<unsigned char>::iterator resultPointer=aHaystack.begin();
	unsigned int occurences=0;
	while(true){
		resultPointer=search(resultPointer, aHaystack.end(), aNeedle.begin(), aNeedle.end());
		if(resultPointer==aHaystack.end()){
			break;
		}else{
			resultPointer+=aNeedle.size();
			occurences++;
		}
	}
	return occurences;
}

Open in new window

C++

Avatar of undefined
Last Comment
evilrix

8/22/2022 - Mon
evilrix

This is just doing a count of the intersection right?

Try looking at set_intersection...

http://www.sgi.com/tech/stl/set_intersection.html
http://www.cplusplus.com/reference/algorithm/set_intersection.html

...you might find the STL algorithm for doing this is faster. Obviously you end up with a vector of the intersection, of which size() will give you the matching occurances.
// set_intersection example
#include <iostream>
#include <algorithm>
#include <vector>
using namespace std;
 
int main () {
  int first[] = {5,10,15,20,25};
  int second[] = {50,40,30,20,10};
  vector<int> v(10);                           // 0  0  0  0  0  0  0  0  0  0
  vector<int>::iterator it;
 
  sort (first,first+5);     //  5 10 15 20 25
  sort (second,second+5);   // 10 20 30 40 50
 
  it=set_intersection (first, first+5, second, second+5, v.begin());
                                               // 10 20 0  0  0  0  0  0  0  0
 
  cout << "intersection has " << int(it - v.begin()) << " elements.\n";
 
  return 0;
}

Open in new window

ASKER
Unimatrix_001

Hi there,

>>This is just doing a count of the intersection right?
I'm sorry, I don't understand? It is counting the number of times the needle occurs.

The problem is with the above is that I can't use the sort function. See, my haystack is like:
A,B,C,A,C,A etc...
But the needle spans more than one element, such as:
C,A
In which case the count there would be 2. Clearly if I sorted the haystack, then I can't count the occurences.
evilrix

>> I'm sorry, I don't understand? It is counting the number of times the needle occurs.
It's counting the intersection

>> But the needle spans more than one element
Not sure I follow this as both vectors are the same type and if search works then so should intersection. That being the case the intersection code I posted above (not mine but from one of the sites) should work.
Experts Exchange has (a) saved my job multiple times, (b) saved me hours, days, and even weeks of work, and often (c) makes me look like a superhero! This place is MAGIC!
Walt Forbes
ASKER
Unimatrix_001

Well, before set_intersection is called, the code runs stl::sort. I can't use sort on neither the needle or the haystack, doing so would mean that I wouldn't get the correct results.
evilrix

>> I can't use sort on neither the needle or the haystack, doing so would mean that I wouldn't get the correct results.
So, are you saying search works because even though it does a char by char comparison it includes char pairs? If that's the case then I guess I can see why sort won't work and search will. You could resolve this by wrapping each char pair in a class/struct and overloading the operator < and operator = so that you could sort and compare the char pairs... if you see what I mean. It'd be a little bit of work but the result would likely be simpler and more efficient. Obviously, you'd need to change the vectors to store the struct/class instead.
ASKER
Unimatrix_001

>>So, are you saying search works because even though it does a char by char comparison it includes char pairs?
Correct. :) Although the needle is not restricted to just a pair, it may contain up to 16 chars.

>>You could resolve this by wrapping each char pair in a class/struct and overloading the operator < and operator = so that you could sort and compare the char pairs.
I'd rather not, because that would then mean wrapping the haystack into pairs; although I don't just search for pairs in the haystack, so it would mean I would have to rewrap the haystack depending on the size of the needle.
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
ASKER CERTIFIED SOLUTION
evilrix

Log in or sign up to see answer
Become an EE member today7-DAY FREE TRIAL
Members can start a 7-Day Free trial then enjoy unlimited access to the platform
Sign up - Free for 7 days
or
Learn why we charge membership fees
We get it - no one likes a content blocker. Take one extra minute and find out why we block content.
See how we're fighting big data
Not exactly the question you had in mind?
Sign up for an EE membership and get your own personalized solution. With an EE membership, you can ask unlimited troubleshooting, research, or opinion questions.
ask a question
ASKER
Unimatrix_001

Okay, thanks anyways - I'll leave it open for a few days before bumping it up...
evilrix

>> Okay, thanks anyways - I'll leave it open for a few days before bumping it up...
Sure, no worries :)
Infinity08

In order to optimize an algorithm like this, we'll need to exploit special properties of the haystack and needle data.

What does the data represent ? Is the data somehow aligned ? Would it be possible to use unsigned ints instead of unsigned chars ? Comparing an int instead of chars would make this quite a bit faster.
Experts Exchange is like having an extremely knowledgeable team sitting and waiting for your call. Couldn't do my job half as well as I do without it!
James Murphy
ASKER
Unimatrix_001

Hi Infinity,

The data is nothing more than a string of values each one between 0 and 4 inclusive. No the data isn't aligned in any way shape or form. I could use unsigned ints, although that would force memory usage up by a factor of 4, something I'd prefer to avoid if at all possible. It may be a length of string question, but could you estimate the "quite a bit faster" approach?

Thanks,
Uni
Infinity08

What I meant by using unsigned ints is to compare more than one byte at a time (ie. 4 on a 32bit system). The speedup would potentially be a factor 4. But data alignment makes this more complicated, as well as data lengths that are not a multiple of sizeof(int) bytes.

Since you say the bytes can only have 0-4 as value, would it be possible to pack multiple values into one byte. An 8bit byte can contain 3 such values unambiguously (5^3 = 125 <= 256). This would again require certain alignment characteristics, which I assume are not there.
evilrix

>> The data is nothing more than a string of values each one between 0 and 4 inclusive
In the example you have above they were strings of letters...

"The problem is with the above is that I can't use the sort function. See, my haystack is like:
A,B,C,A,C,A etc...
But the needle spans more than one element, such as:
C,A"

Is this not actually the case? If the data is really just a string of numbers them, as I8 eludes, there is room for further optimization.
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
ASKER
Unimatrix_001

Since you say the bytes can only have 0-4 as value, would it be possible to pack multiple values into one byte. An 8bit byte can contain 3 such values unambiguously (5^3 = 125 <= 256). This would again require certain alignment characteristics, which I assume are not there.

Yes, this is a possibility and something I have considered, but I'm unsure how to get around the issue of where the needle is embedded in a byte or crosses the byte boundary. I.e:

Byte: 1,3,0
Needle: 1,3

Or:

Byte: [0, 0, 1], [3, 0, 1]
Needle: 1, 3

Also, packing the bytes, I then have the issue of where a needle may be 7 values long, in which case this would span 3 bytes.

Have you any idea how I could approach these problems?

-----------------------------------------------------------------------------------

What I meant by using unsigned ints is to compare more than one byte at a time (ie. 4 on a 32bit system). The speedup would potentially be a factor 4. But data alignment makes this more complicated, as well as data lengths that are not a multiple of sizeof(int) bytes.

I'm sorry, could you explain that further, I'm not understanding you 100%.

-----------------------------------------------------------------------------------

Thanks,
Uni
ASKER
Unimatrix_001

>>In the example you have above they were strings of letters...
A-E, 0-4, take your pick, we still store each in a byte... From the users point of view it is A-E, but internally we use 0-4. I was expecting optimisations such as 'to use arrays', etc. to appear more than some enhancements that could be done using the data, so I didn't wish to add what I thought was unnecessary information to the question. :)

>>If the data is really just a string of numbers them, as I8 eludes, there is room for further optimization.
As above, you're free to call them what you wish; I guess the important part here is there are only 5 choices, each one uses a byte.

Thanks,
Uni
Infinity08

>> but I'm unsure how to get around the issue of where the needle is embedded in a byte or crosses the byte boundary.

That's exactly what I meant with alignment issues ;) If a needle can start at any given byte, then this makes speeding up things by packing bytes pretty much impossible.


>> I'm sorry, could you explain that further, I'm not understanding you 100%.

Right now, you're comparing one byte at a time. Assuming that you are working on a 32bit system : If your needle is 4 bytes long, then you potentially need to make 4 byte comparisons. You can however achieve the same result with 1 unsigned int comparison (comparing all 4 bytes at once).

There are a few issues with this :

1) if the needle is not a multiple of 4 bytes, you have a few trailing bytes that need to be handled separately, for example by masking the unused bytes.

2) you would perform int comparisons at byte boundaries (ie. not at word boundaries). That might be a problem on some systems.


Note that we're slowly moving into the extreme optimization area here ;) Do you really need such an optimized search ?


One more thought : do you have control over how the haystack is constructed ? At the moment the haystack is constructed, do you have any idea about what the needle will be ?
All of life is about relationships, and EE has made a viirtual community a real community. It lifts everyone's boat
William Peck
SOLUTION
Log in to continue reading
Log In
Sign up - Free for 7 days
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
ASKER
Unimatrix_001

>>That's exactly what I meant with alignment issues ;) If a needle can start at any given byte, then this makes speeding up things by packing bytes pretty much impossible.

Ah I understand what you mean now...

Yes, the 4 byte idea is okay, but unfortunately the first issue you raise will most likely be the highest occuring - probably to the point where all possible savings would be undone by the cost of masking.

It needn't be extreme, I was looking for perhaps a obvious observations.

No, I have no ideas of the haystack contents until it comes into the program. I know that the needle will be a length between 2 and 16.
Infinity08

>> I was looking for perhaps a obvious observations.

Nothing that I can see ... The way it's implemented looks pretty good to me. Do you agree evilrix ?
ASKER
Unimatrix_001

Hm, it doesn't seem I'm going to get better than what I have, perhaps I should close this question... Can you both put in your recommendations for what to do (delete, split points, paq & refund) etc...

Thanks,
Uni
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
Infinity08

Maybe just one minor thing : you can stop checking at (aHaystack.end() - aNeedle.size()) instead of just aHaystack.end() ...
Infinity08

>> perhaps I should close this question...

If you want, you can wait for a third opinion ;)
ASKER
Unimatrix_001

That is okay - I shall PAQ this question. Thank you both for your efforts. :)

Uni
This is the best money I have ever spent. I cannot not tell you how many times these folks have saved my bacon. I learn so much from the contributors.
rwheeler23
evilrix

>> Nothing that I can see ... The way it's implemented looks pretty good to me. Do you agree evilrix ?
I agree, I don't think there is much you can do given the restrictions imposed

>> Can you both put in your recommendations for what to do (delete, split points, paq & refund) etc...
I originally gave you the answer in the post {http:#21633376}, Infinity08 has then worked through it with you to see if he can see any other way forward and, I believe, has reached the same conclusion in this post here {http:#21634129}. Although I accept they may not be the answer you want they do answer your question.

-Rx.
ASKER
Unimatrix_001

Very well - points split 50/50.
ASKER
Unimatrix_001

Thank you. :)
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
Infinity08

>> has reached the same conclusion in this post here

Yep, sorry about that ;) I thought there might be some special properties about the data that could maybe be exploited, but that didn't seem the case :(
evilrix

>> Yep, sorry about that
Ditto... it's never nice to have to admit defeat :(