If your organization has a need to mass-create AD user accounts, watch this video to see how its done without the need for scripting or other unnecessary complexities.

int any (char *s1, char *s2)

{

int i, n ;

for (n=0; s1[n] != '\0' ; n++){

for (i=0 ; s2[i] != '\0' ; i++){

if (s1[n] == s2[i]){

return n;}

}

}

return -1;

}

That is a function that I wrote in C, any(s1,s2) returns the first location in s1 where any character from s2 occurs, or -1 if no match found.

Can any expert out there help me to improve the function

1) Make it more time-efficient

2) Make it more space-efficient

thanks

{

int i, n ;

for (n=0; s1[n] != '\0' ; n++){

for (i=0 ; s2[i] != '\0' ; i++){

if (s1[n] == s2[i]){

return n;}

}

}

return -1;

}

That is a function that I wrote in C, any(s1,s2) returns the first location in s1 where any character from s2 occurs, or -1 if no match found.

Can any expert out there help me to improve the function

1) Make it more time-efficient

2) Make it more space-efficient

thanks

If you compile this program wuch optimation "turned on" you'll get a function without the integers *i* and *n*. The compiler will generate code that uses pointers instead of indices. There is a net savings of adding 1 to a pointer instead of adding 1 to an integer and then adding the integer to a base address. The code will look something like this:

int any (char *s1, char *s2)

{

char *s1loc, *s2loc;

s1loc = s1;

s2loc = s2;

while (*s1loc){

s2loc = s2;

while (*s2loc){

if (*s1loc == *s2loc)

return s1loc - s1;

s2loc++;

}

s1loc++;

}

return -1;

}

Kent

int any (char *s1, char *s2)

{

char *_s1, *_s2;

_s1 = s1;

while (*_s1)

{

_s2 = s2;

while (*_s2)

{

if (*_s1 == *_s2)

{

return (_s1-s1);

}

_s2++;

}

_s1++;

}

return -1;

}

In that case you can use Stef's nice trick of pre-computing a table of first occurences for that string and use that inside the function.

Also you may know from your data that certain characters appear more frequently, it would help to put those near the start of the pattern string.

-or- the function could dynamically organize the pattern string, say every time there's a match, move that character with the first one on the list. That will tend in many cases to self-organize the pattern array to be more efficient.

-actually- in many cases it's best to call a library function if there is one that does this. Quite often library functions are optimized all to heck, using special instructins that most C compilers don't generate, like "scan for matching byte".

Going back to the original question, do you have an expectation of what the data will look like? If the first character in string 1 is likely to occur in string 2, then your solution (and the modifications provide by AlexFM and me) will usually give you the result faster than other methods. But, if the returned value is greater than zero (not the first character in string 1), the solution offered by Stefan73 is usually superior. (When n==1, the timings will be similar. Also, timings may vary a bit depending on string size.)

If you're going to use Stefan73's solution as a "general purpose" routine, you'll need to call memset() to initialize the array every call. This may slighly skew the timings when the returned value is small.

Kent

Yes, speed very much depends on the average string sizes.

That's quite similar to SQL join techniques (like nested loops, hash join)...

int firsts[256];

unsigned char c;

Why do you want the array to have 256 characters? Is it because there are 256 characters that computer use?

I am not sure whether I am right but is the algorithm doing something like : (if there are 256 unique character in computer),

then we mark off the character in that array, those that does not occur in s1 will stay '-1'. This is some sort of arranging, then we do the same for s2.

thanks for reading

normal chars are signed, they range from -128 to 127. As you can imagine, that's not a good array index. There are two ways to solve this problem:

1. Use an int as array index and always add 128.

2. Use an unsigned char (0 to 255) instead.

From both performance and code clarity, #2 is preferable.

And yes, chars are 8 bits, so there are 256 possible values.

The algorithm is simple, but here are a bit more comments:

int firsts[256]; /* Lookup table. An entry != -1 is the index of that character in s1. */

unsigned char c;

/* Clear table */

for(i=0;i<256;i++) firsts[i]=-1 ; /* -1 == not found */

/* Scan s1 for character occurences. Ignore repetitions of characters we already have in our lookup table. */

for(n=0;s1[n];n++){

c=(unsigned char)s1[n];

if(firsts[c] == -1) /* We don't have this character yet? */

firsts[c]=n; /* Yes -> store index */

}

/* For each s2 character, check if we have marked it as existing in our lookup table. If yes, return the first s1 character found in s2. */

for(i=0;s2[i];i++){

c=(unsigned char)s2[i];

if(firsts[c] != -1) /* Does this character exist in s1? */

return firsts[c]; /* Yes -> return it */

return -1; /* Empty intersection -> -1 */

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.

> 1) Make it more time-efficient

You could store all first occurences in a pointer array:

int firsts[256];

unsigned char c;

for(i=0;i<256;i++) firsts[i]=-1 ; /* -1 == not found */

for(n=0;s1[n];n++){

c=(unsigned char)s1[n];

if(firsts[c] == -1)

firsts[c]=n;

}

for(i=0;s2[i];i++){

c=(unsigned char)s2[i];

if(firsts[c] != -1)

return firsts[c];

return -1;

This has O(n), not O(n^2).

> 2) Make it more space-efficient

It's already space-efficient.

Cheers,

Stefan