Solved

# Implement strtok function

Posted on 2006-07-11
3,006 Views
How do I implement the strtok function?

Thanks!
0
Question by:gromul

LVL 16

Expert Comment

ID: 17085054
Hi gromul,

strtok is already implemented in standard libraries. If you could explain why you need to do this it would help, even if this is for homework we can help.

Paul
0

Author Comment

ID: 17085112
I'm just practicing for an interview. What I'm having trouble with is that the function apparently uses some global variable, and I'm not sure how to replicate that. Would this global variable be defined inside the function, or is there some external definition that the function uses?
0

LVL 11

Expert Comment

ID: 17085160
It uses a static, which is why it's not thread safe.

I seem to remember an improved strtok that uses a handle to track the state, but I don't think I've ever seen it in any standard libraries.
0

LVL 11

Assisted Solution

KurtVon earned 100 total points
ID: 17085297
Here we go, an accurate and concise description of how strtok works and the strtok_r function that is the thread-safe version that uses a handle: http://www.opengroup.org/onlinepubs/000095399/functions/strtok.html

I'd point you to some source, but that's not going to help you prepare for the interview.
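
For readers who have not used the handle-based variant, here is a minimal sketch of how strtok_r (POSIX, not ISO C) lets two scans run side by side. The helper name count_interleaved is made up for illustration; the point is only that each scan keeps its position in its own save pointer rather than in a hidden static.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for POSIX declarations such as strtok_r */
#include <string.h>

/* Hypothetical helper: counts the tokens in two strings by interleaving
   two strtok_r scans. Both buffers are modified in place. */
int count_interleaved(char *a, char *b, const char *delims)
{
    char *save_a = NULL;   /* handle for the scan over a */
    char *save_b = NULL;   /* handle for the scan over b */
    char *ta = strtok_r(a, delims, &save_a);
    char *tb = strtok_r(b, delims, &save_b);
    int count = 0;

    while (ta != NULL || tb != NULL) {
        if (ta != NULL) {
            count++;
            ta = strtok_r(NULL, delims, &save_a);
        }
        if (tb != NULL) {
            count++;
            tb = strtok_r(NULL, delims, &save_b);
        }
    }
    return count;
}
```

Doing the same interleaving with plain strtok would not work, because the second strtok(b, ...) call would overwrite the single saved position of the first scan.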

0

Author Comment

ID: 17085454
I already wrote an algorithm and found some code, so if you have the source code, I'd like to see it and compare. I'm mystified by the first line in the function: if it's not a first call, where will pos be pointing to?

> char *strtok(char *str, char *delims)
> {
>     static char *pos   = (char *)0;
>     char        *start = (char *)0;
>     if (str)    /* Start a new string? */
>         pos = str;
>     if (pos)
>     {
>         /* Skip delimiters */
>         while (*pos && strchr(delims, *pos))
>             pos++;
>         if (*pos)
>         {
>             start = pos;
>             /* Skip non-delimiters */
>             while (*pos && !strchr(delims, *pos))
>                 pos++;
>             if (*pos)
>                 *pos++ = '\0';
>         }
>     }
>     return start;
> }
0

LVL 45

Expert Comment

ID: 17085541
Hi gromul,

Note the line:    *pos++ = '\0';     near the end of the function.  pos (position) points to where you want to start the search the next time that strtok() is called.

The only thing that I'd do is replace 'pos++' with '++pos' as it's slightly faster on most implementations.

Good Luck!
Kent
0

Author Comment

ID: 17085553
I get that, but is the first line necessary?
0

LVL 45

Assisted Solution

Kent Olsen earned 100 total points
ID: 17085618
Hi gromul,
> char *strtok(char *str, char *delims)
> {
>     static char *pos   = (char *)0;
>     char        *start = (char *)0;
>     if (str)    /* Start a new string? */
>         pos = str;

"First Line" is a bit ambiguous.  :)

The code is trying to ensure that strtok() doesn't wander off into the ozone if NULL is passed as the string the first time that strtok() is called.

The normal sequence is to pass the string address to strtok() then repeatedly pass NULL for the string until strtok() returns NULL.  The "extra code" simply means that if, for some reason, the code passes NULL instead of a string address, strtok() will gracefully return NULL instead of "undefined results".

Kent
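
The normal calling sequence described above can be sketched as follows. This uses the standard library strtok; count_tokens is just an illustrative name, not something from the thread.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the number of tokens in buf.
   buf is modified in place, as strtok always does. */
size_t count_tokens(char *buf, const char *delims)
{
    size_t n = 0;
    char *tok = strtok(buf, delims);   /* first call: pass the string */

    while (tok != NULL) {
        n++;
        tok = strtok(NULL, delims);    /* later calls: pass NULL */
    }
    return n;                          /* loop ends when strtok returns NULL */
}
```

Runs of adjacent delimiters count as a single separator, so "a,b;;c" with delimiters ",;" yields three tokens, and a string made only of delimiters yields none.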
0

Author Comment

ID: 17085720
Is "static char *pos   = (char *)0;" the same as "static char *pos   = NULL;"?
0

LVL 11

Expert Comment

ID: 17085725
I think gromul means the static declaration.

A static is pretty much a global variable, so when it has an initializer, it is assigned once at program start, not each time the function is called.  This just makes sure that the static pos variable starts as NULL (and using (char*)0 instead of NULL is a giveaway that this is part of a standard library).
0

Author Comment

ID: 17085807
Kdo, wouldn't this change the wrong character?

>The only thing that I'd do is replace 'pos++' with '++pos' as it's slightly faster on most implementations.
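
As far as I can tell, gromul's concern is justified: in the expression `*pos++ = '\0'` the two forms are not interchangeable. The sketch below isolates the two variants in small functions (the names terminate_post and terminate_pre are invented for this illustration) so the difference can be checked directly.

```c
#include <string.h>

/* What the posted code does: overwrite the delimiter that p points at,
   then step past it. Returns the new position. */
char *terminate_post(char *p)
{
    *p++ = '\0';
    return p;
}

/* The suggested variant: step first, then overwrite. This leaves the
   delimiter in place and clobbers the character after it. */
char *terminate_pre(char *p)
{
    *++p = '\0';
    return p;
}
```

With the buffer "ab,cd" and p pointing at the comma, the post-increment form leaves "ab" followed by a resume position at "cd". The pre-increment form leaves "ab," and destroys the 'c', so the token is not terminated where it should be.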

0

Author Comment

ID: 17085847
What I don't get is, if I initialize pos to NULL, how is the static value saved across function calls?
0

LVL 45

Expert Comment

ID: 17085969
Hi KurtVon,

In this case, static means that the variable is NOT a stack variable, but is placed in the globals block.  However, the definition remains local to the function.  You could put "static char *pos;" in every one of your functions and they would all have their own variable called "pos" that kept its value between function calls.

Kent
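
A small sketch of that point, using two made-up functions that each declare a static variable with the same name. Each keeps its own value between calls, and neither initializer runs again after the first time.

```c
/* Two unrelated functions, each with its own static variable named count. */
int next_a(void)
{
    static int count = 0;     /* set once, not on every call */
    return ++count;
}

int next_b(void)
{
    static int count = 100;   /* a different object despite the same name */
    return ++count;
}
```

Calling next_a() twice gives 1 and then 2; a call to next_b() in between gives 101 and does not disturb next_a's counter.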
0

LVL 11

Expert Comment

ID: 17086336
Hey Kdo.

I knew about it being local, as you can see from my response to the other static question.  I just didn't consider it relevant here.  I guess it is pretty important, though, so thanks for clarifying.
0

LVL 16

Assisted Solution

PaulCaswell earned 100 total points
ID: 17086477
Hi gromul,

>>What I don't get is, if I initialize pos to NULL, how is the static value saved across function calls?
Because, confusingly, the initialisation of the variable only happens once at startup time. From then on:

static char *pos   = (char *)0;

will do nothing at all; in particular, it won't change 'pos'.

Paul
0

Author Comment

ID: 17086560
So it's the same as if I write "static char *pos;"? It's just a declaration.
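
To the best of my understanding, yes: in C, an object with static storage duration and no initializer starts out zeroed, which for a pointer means a null pointer, so the two spellings should behave the same. The sketch below checks this with two hypothetical probe functions; each reports whether its static pointer was NULL on entry and then stores a non-NULL value in it.

```c
#include <stddef.h>

int implicit_is_null(void)
{
    static char *pos;               /* no initializer: starts as a null pointer */
    static char dummy;              /* something for pos to point at */
    int was_null = (pos == NULL);

    pos = &dummy;
    return was_null;
}

int explicit_is_null(void)
{
    static char *pos = (char *)0;   /* explicit initializer, same starting value */
    static char dummy;
    int was_null = (pos == NULL);

    pos = &dummy;
    return was_null;
}
```

Both functions return 1 on their first call and 0 on every later call, which also shows that the explicit initializer is not applied again.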

0

Author Comment

ID: 17086582
I found an interesting behavior if strtok (both library and my own) is called to print all the tokens of the same string twice: it just prints all the tokens once, then just the first one. It works correctly on two different strings. Do you know what could be causing this behavior?

s1 = strtok( inputStr, delims );
while( s1 != NULL )
{
    cout << s1 << endl;
    s1 = strtok( NULL, delims );
}

// Second try
s2 = strtok( inputStr, delims );
while( s2 != NULL )
{
    cout << s2 << endl;
    s2 = strtok( NULL, delims );
}
0

LVL 45

Expert Comment

ID: 17086808
Hi gromul,

That's because strtok() replaces the separator(s) with '\0' and returns the starting address (pos) of the string.

If you try to pass the same string again, the '\0' after the first token means that you're only passing the first token and not the entire string.

Kent
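
One common way around this is to tokenize a copy and leave the original untouched. The following is only a sketch under that assumption; count_tokens_copy is an invented name, and a real program would probably do more with each token than count it.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: counts tokens in src without modifying it,
   by tokenizing a heap copy. Returns -1 if the copy cannot be allocated. */
int count_tokens_copy(const char *src, const char *delims)
{
    size_t len = strlen(src) + 1;   /* include the terminator */
    char *copy = malloc(len);
    char *tok;
    int n = 0;

    if (copy == NULL)
        return -1;
    memcpy(copy, src, len);

    for (tok = strtok(copy, delims); tok != NULL; tok = strtok(NULL, delims))
        n++;

    free(copy);
    return n;
}
```

Because the '\0' bytes are written into the copy, calling this twice on the same input gives the same count both times.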
0

LVL 8

Assisted Solution

manish_regmi earned 100 total points
ID: 17087285
Also remember that almost all userspace C libraries implement the string functions in assembly to speed things up.

Here is the generic implementation in glibc:

static char *olds;
/* Parse S into tokens separated by characters in DELIM.
   If S is NULL, the last string strtok() was called with is
   used.  For example:
        char s[] = "-abc-=-def";
        x = strtok(s, "-");             // x = "abc"
        x = strtok(NULL, "-=");         // x = "def"
        x = strtok(NULL, "=");          // x = NULL
                // s = "-abc\0=-def\0"
*/
char *
strtok (s, delim)
     char *s;
     const char *delim;
{
  char *token;

  if (s == NULL)
    s = olds;

  /* Skip leading delimiters.  */
  s += strspn (s, delim);
  if (*s == '\0')
    {
      olds = s;
      return NULL;
    }

  /* Find the end of the token.  */
  token = s;
  s = strpbrk (token, delim);
  if (s == NULL)
    /* This token finishes the string.  */
    olds = __rawmemchr (token, '\0');
  else
    {
      /* Terminate the token and make OLDS point past it.  */
      *s = '\0';
      olds = s + 1;
    }
  return token;
}

regards
Manish Regmi
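
The glibc listing relies on the internal helper __rawmemchr and uses an old-style parameter list, so it will not compile as ordinary user code. Below is a rough portable adaptation using only standard <string.h> calls. The name my_strtok is hypothetical (chosen to avoid clashing with the library function), and the extra NULL check on the first call is my addition, not part of the glibc source.

```c
#include <stddef.h>
#include <string.h>

char *my_strtok(char *s, const char *delim)
{
    static char *olds;              /* resume position for the next call */
    char *token;

    if (s == NULL)
        s = olds;
    if (s == NULL)                  /* NULL passed on the very first call */
        return NULL;

    s += strspn(s, delim);          /* skip leading delimiters */
    if (*s == '\0') {
        olds = s;
        return NULL;
    }

    token = s;
    s = strpbrk(token, delim);      /* find the end of the token */
    if (s == NULL) {
        olds = token + strlen(token);   /* token runs to the end of the string */
    } else {
        *s = '\0';                  /* terminate the token */
        olds = s + 1;               /* resume just past it */
    }
    return token;
}
```

Run against the example in the glibc comment ("-abc-=-def" with the delimiter sets "-", "-=" and "="), this should return "abc", then "def", then NULL.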
0

LVL 7

Accepted Solution

nafis_devlpr earned 100 total points
ID: 17096775
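You can try the code below for strtok(). It takes the same arguments and returns the same values as the standard library strtok():

char* str_tok(char* src, char* deli)
{
    static char* sr;        /* where the next call resumes */
    char *temp, *t;
    int i;

    if(src)
        sr=src;
    if(!sr)                 /* NULL passed on the very first call */
        return 0;

    for(temp=sr; *temp; temp++)
    {
        /* Is *temp one of the delimiters? */
        for(i=0; deli[i]; i++)
        {
            if(deli[i] == *temp)
                break;
        }
        if(deli[i])
        {
            *temp=0;
            if(temp == sr)
                sr++;       /* leading delimiter: skip it */
            else
            {
                ++temp;     /* end of token: resume after it */
                break;
            }
        }
    }

    if(!(*sr))
        return 0;
    t=sr;
    sr=temp;
    return t;
}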

Nafis
0
