enigmaedge

Rabin-Karp string searching algorithm help

Hello everyone,

I was looking at the Rabin-Karp string searching algorithm, and I have a slight problem. Most implementations have this line in their pseudocode: h = d^(m-1) mod q, where d is the size of the alphabet (in my case at least 52 characters, to allow for case sensitivity) and m is the pattern length; I need it to accommodate strings of at least 8 characters. For those familiar with the algorithm, q is the large prime number chosen to minimize spurious hits.

My question is: how does one properly calculate "h = d^(m-1) mod q" without causing an overflow and without using a large-number library (such as GMP)? Most implementations assume an alphabet of 10 characters just for the sake of explanation, but to make it useful I need at least 52. Am I missing something essential?

Thank you for any input!
grg99

52 chars is going to require 6 bits per character, so 8 chars would make a very convenient 48 bits. Many languages have 64-bit arithmetic.
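
To illustrate the 6-bits-per-character idea, here is a minimal sketch in C. The names letter_code and pack8 are made up for this example, and it assumes an ASCII-compatible character set and a pattern of exactly 8 letters. With the whole window packed into one 64-bit integer, two windows can be compared directly and no modulus is needed.

```c
#include <stdint.h>

/* Map a letter to a 6-bit code: 'A'..'Z' -> 0..25, 'a'..'z' -> 26..51.
   Returns -1 for anything else. Assumes an ASCII-compatible character set. */
int letter_code(char c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    return -1;
}

/* Pack exactly 8 letters into the low 48 bits of a 64-bit integer.
   Returns UINT64_MAX if any of the first 8 characters is not a letter
   (this also covers strings shorter than 8 characters). */
uint64_t pack8(const char *s)
{
    uint64_t key = 0;
    for (int i = 0; i < 8; i++) {
        int code = letter_code(s[i]);
        if (code < 0) return UINT64_MAX;
        key = (key << 6) | (uint64_t)code;   /* 6 bits per character */
    }
    return key;
}
```

This only works while m * 6 stays within 64 bits, so it would not extend to patterns longer than 10 letters.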

ASKER CERTIFIED SOLUTION
dhyanesh

enigmaedge,

I'm not familiar with the algorithm, but your question suggests m is relatively small. Perhaps you could start with 'x = 1' and loop 'm-1' times, taking 'x = (x * d) mod q' each time. Because x is reduced mod q after every step, the intermediate product never exceeds q*d. The overhead of adding generic huge-number maths handling may be far more than the cost of a simple loop.

Hmmm, something tells me this is too obvious to be right.
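
The loop described above might look roughly like this in C. This is only a sketch: the function name rk_high_order_weight is invented here, and it assumes m >= 1, q >= 1, and q below 2^32 so that the product of two values smaller than q fits in an unsigned long long.

```c
/* Compute d^(m-1) mod q by repeated multiplication.
   Assumes m >= 1, q >= 1, and q < 2^32 so that (q-1)*(q-1)
   cannot overflow an unsigned long long. */
unsigned long long rk_high_order_weight(unsigned long long d,
                                        unsigned int m,
                                        unsigned long long q)
{
    unsigned long long x = 1 % q;   /* d^0 mod q */
    d %= q;                         /* keep both factors below q */
    for (unsigned int i = 1; i < m; i++) {
        x = (x * d) % q;            /* reduce after every step */
    }
    return x;
}
```

For d = 52 and m = 8 this is only seven multiplications, so the linear loop costs next to nothing.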


ozo
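/* Computes d^m mod q by repeated squaring.
   Assuming 0 <= d < q, m >= 0 and q*q < LONG_MAX */
long powmod(long d, long m, long q){
    long p=1;                    /* long, so p*p cannot overflow under the assumption above */
    if( m >= 2 ){
        p = powmod(d,m/2,q);     /* d^(m/2) mod q */
        p *= p;                  /* square it: d^(2*(m/2)) */
        p %= q;
    }
    if( m%2 ){                   /* odd exponent: one more factor of d */
        p *= d;
        p %= q;
    }
    return p;
}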
/* h=d^(m-1) mod q */
h = powmod(d%q,m-1,q);
I believe I answered the question of how to calculate d^(m-1) mod q
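
For context, here is a rough sketch of where h ends up being used in a Rabin-Karp search. The function rk_search and its signature are invented for this example and are not taken from any post in this thread. It assumes q is below 2^31 so that no intermediate product overflows a long long, and it hashes raw byte values rather than remapping letters to 0..51.

```c
#include <stddef.h>
#include <string.h>

/* Return the index of the first occurrence of pat in text, or -1.
   d is the radix (alphabet size) and q the prime modulus.
   Assumes d >= 1 and 1 <= q < 2^31. */
long rk_search(const char *text, const char *pat, long long d, long long q)
{
    size_t n = strlen(text), m = strlen(pat);
    if (m == 0 || m > n) return -1;
    d %= q;

    long long h = 1 % q;                      /* h = d^(m-1) mod q */
    for (size_t i = 1; i < m; i++) h = (h * d) % q;

    long long p = 0, t = 0;                   /* pattern hash, window hash */
    for (size_t i = 0; i < m; i++) {
        p = (p * d + (unsigned char)pat[i]) % q;
        t = (t * d + (unsigned char)text[i]) % q;
    }

    for (size_t s = 0; ; s++) {
        /* Equal hashes may be a spurious hit, so confirm with memcmp. */
        if (p == t && memcmp(text + s, pat, m) == 0) return (long)s;
        if (s + m >= n) break;
        /* Remove the leading character, shift, append the next one. */
        t = (t - (unsigned char)text[s] * h % q + q) % q;
        t = (t * d + (unsigned char)text[s + m]) % q;
    }
    return -1;
}
```

A call such as rk_search(text, pattern, 52, 1000003) uses the 52-letter radix from the question with a prime modulus small enough to stay clear of overflow.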