Solved

Implement strtok function

Posted on 2006-07-11
Last Modified: 2008-01-09
How do I implement strtok function?

Thanks!
Question by:gromul
20 Comments
 
LVL 16

Expert Comment

by:PaulCaswell
ID: 17085054
Hi gromul,

strtok is already implemented in the standard library. If you could explain why you need to do this, it would help. Even if this is for homework, we can help.

Paul
0
 

Author Comment

by:gromul
ID: 17085112
I'm just practicing for an interview. What I have a problem with is that the function apparently uses some global variable, and I'm not sure how to replicate that. Would this global variable be defined in the function, or is there some external definition that the function uses?
0
 
LVL 11

Expert Comment

by:KurtVon
ID: 17085160
It uses a static variable to remember its position between calls, which is why it's not thread-safe.

I seem to remember an improved strtok that uses a handle to track the state, but I don't think I've ever seen it in any standard libraries.
0
 
LVL 11

Assisted Solution

by:KurtVon
KurtVon earned 100 total points
ID: 17085297
Here we go: an accurate and concise description of how strtok works, along with strtok_r, the thread-safe version that uses a handle: http://www.opengroup.org/onlinepubs/000095399/functions/strtok.html
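
To illustrate the handle idea, here is a minimal sketch of a re-entrant tokenizer. The name tok_r and its exact behaviour are assumptions made for illustration, not the library's source; the point is only that the scan position lives in a pointer the caller owns rather than in a static. The caller is assumed to initialise that pointer to NULL before the first call.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical re-entrant tokenizer: same calling pattern as strtok,
   except the scan position is kept in *save, which the caller owns. */
char *tok_r(char *str, const char *delims, char **save)
{
    char *start;

    if (str == NULL)
        str = *save;             /* continue where the last call stopped */
    if (str == NULL)
        return NULL;             /* nothing to continue from */

    str += strspn(str, delims);  /* skip leading delimiters */
    if (*str == '\0') {
        *save = str;
        return NULL;             /* only delimiters (or nothing) left */
    }

    start = str;
    str += strcspn(str, delims); /* advance to the end of the token */
    if (*str != '\0')
        *str++ = '\0';           /* terminate the token, step past it */
    *save = str;
    return start;
}
```

Because each caller passes its own save pointer, two strings can be tokenized in an interleaved fashion without one scan disturbing the other.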

I'd point you to some source, but that's not going to help you prepare for the interview.

0
 

Author Comment

by:gromul
ID: 17085454
I already wrote an algorithm and found some code, so if you have the source code, I'd like to see it and compare. I'm mystified by the first line in the function: if it's not a first call, where will pos be pointing to?

> char *strtok(char *str, char *delims)
> {
>     static char *pos   = (char *)0;
>     char        *start = (char *)0;
>     if (str)    /* Start a new string? */
>         pos = str;
>     if (pos)
>     {
>         /* Skip delimiters */
>         while (*pos && strchr(delims, *pos))
>             pos++;
>         if (*pos)
>         {
>             start = pos;
>             /* Skip non-delimiters */
>             while (*pos && !strchr(delims, *pos))
>                 pos++;
>             if (*pos)
>                 *pos++ = '\0';
>         }
>     }
>     return start;
> }
0
 
LVL 45

Expert Comment

by:Kdo
ID: 17085541
Hi gromul,

Note the line:    *pos++ = '\0';     near the end of the function.  pos (position) points to where you want to start the search the next time that strtok() is called.

The only thing that I'd do is replace 'pos++' with '++pos' as it's slightly faster on most implementations.



Good Luck!
Kent
0
 

Author Comment

by:gromul
ID: 17085553
I get that, but is the first line necessary?
0
 
LVL 45

Assisted Solution

by:Kdo
Kdo earned 100 total points
ID: 17085618
Hi gromul,
1> char *strtok(char *str, char *delims)
2> {
3>     static char *pos   = (char *)0;
4>     char        *start = (char *)0;
5>     if (str)    /* Start a new string? */
6>                 pos = str;

"First Line" is a bit ambiguous.  :)

The code is trying to ensure that strtok() doesn't wander off into the ozone if NULL is passed as the string the first time that strtok() is called.

The normal sequence is to pass the string address to strtok() then repeatedly pass NULL for the string until strtok() returns NULL.  The "extra code" simply means that if, for some reason, the code passes NULL instead of a string address, strtok() will gracefully return NULL instead of "undefined results".
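
As a rough check of that behaviour, here is the posted function again under the hypothetical name my_strtok (renamed only so it doesn't clash with the library version, and with delims made const). Assuming it is called before any string has been supplied, the static is still NULL, the "if (pos)" test fails, and NULL comes straight back.

```c
#include <stddef.h>
#include <string.h>

/* The posted implementation, renamed my_strtok (hypothetical name). */
char *my_strtok(char *str, const char *delims)
{
    static char *pos = NULL;    /* scan position, kept between calls */
    char *start = NULL;

    if (str)                    /* start a new string? */
        pos = str;
    if (pos) {                  /* false if NULL was passed first */
        while (*pos && strchr(delims, *pos))
            pos++;              /* skip delimiters */
        if (*pos) {
            start = pos;
            while (*pos && !strchr(delims, *pos))
                pos++;          /* skip non-delimiters */
            if (*pos)
                *pos++ = '\0';  /* end the token, step past it */
        }
    }
    return start;
}
```

With a buffer holding "a,b,,c" and "," as the delimiter, successive calls should give "a", "b", "c" and then NULL.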


Kent
0
 

Author Comment

by:gromul
ID: 17085720
Is "static char *pos   = (char *)0;" the same as "static char *pos   = NULL;"?
0
 
LVL 11

Expert Comment

by:KurtVon
ID: 17085725
I think gromul means the static declaration.

A static is pretty much a global variable, so when it has an initializer, the initial value is assigned once at program start, not each time the function is called. This just makes sure that the static pos variable starts out as NULL (and using (char*)0 instead of NULL is a giveaway that this is part of a standard library).
0
 

Author Comment

by:gromul
ID: 17085807
Kdo, wouldn't this change the wrong character?

>The only thing that I'd do is replace 'pos++' with '++pos' as it's slightly faster on most implementations.
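
A small sketch of the difference, using two hypothetical helpers (cut_post and cut_pre are names made up for this illustration). Each one is handed a pointer to the delimiter that ends a token:

```c
/* What the posted code does: write '\0' over the delimiter, then
   advance. Returns the new scan position. */
char *cut_post(char *p)
{
    *p++ = '\0';
    return p;
}

/* The suggested variant: advance first, then write '\0'. */
char *cut_pre(char *p)
{
    *++p = '\0';
    return p;
}
```

On "ab,cd" with p pointing at the comma, cut_post leaves the token "ab" and returns a pointer to "cd". cut_pre leaves the comma in place and overwrites the 'c' instead, so the two forms are not interchangeable here.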

0
 

Author Comment

by:gromul
ID: 17085847
What I don't get is, if I initialize pos to NULL, how is the static value saved across function calls?
0
 
LVL 45

Expert Comment

by:Kdo
ID: 17085969
Hi KurtVon,

In this case, static means that the variable is NOT a stack variable, but is placed in the globals block.  However, the definition remains local to the function.  You could put "static char *pos;" in every one of your functions and they would all have their own variable called "pos" that kept its value between function calls.
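
A quick sketch of that point, with two made-up functions (next_a and next_b are illustrative names only). Each has a static called count, and neither one sees the other's:

```c
/* Two functions, each with its own static variable named count. */
int next_a(void)
{
    static int count;   /* zero-initialized once, keeps its value */
    return ++count;
}

int next_b(void)
{
    static int count;   /* unrelated to the count inside next_a */
    count += 10;
    return count;
}
```

Calling next_a twice and then next_b gives 1, 2 and 10; the calls to next_a have no effect on the count inside next_b.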


Kent
0
 
LVL 11

Expert Comment

by:KurtVon
ID: 17086336
Hey Kdo.

I knew about it being local, as you can see from my response to the other static question.  I just didn't consider it relevant here.  I guess it is pretty important, though, so thanks for clarifying.
0
 
LVL 16

Assisted Solution

by:PaulCaswell
PaulCaswell earned 100 total points
ID: 17086477
Hi gromul,

>>What I don't get is, if I initialize pos to NULL, how is the static value saved across function calls?
Because, confusingly, the initialisation of the variable only happens once at startup time. From then on:

static char *pos   = (char *)0;

will do nothing at all; in particular, it won't change 'pos'.
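
Here is a small sketch showing this, using a hypothetical function swap_last (the name and its purpose are invented for the example). If the initializer ran on every call, it would always return NULL:

```c
#include <stddef.h>

/* Returns the pointer remembered from the previous call, then
   remembers p for the next one. */
const char *swap_last(const char *p)
{
    static const char *last = NULL;  /* applied once, not on every call */
    const char *prev = last;

    last = p;
    return prev;
}
```

The first call returns NULL, and each later call returns whatever was passed the time before. Since objects with static storage duration start out as zero (a null pointer here) even without an initializer, writing "= NULL" only makes that explicit.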

Paul
0
 

Author Comment

by:gromul
ID: 17086560
So it's the same as if I write "static char *pos;"? It's just a declaration.

0
 

Author Comment

by:gromul
ID: 17086582
I found an interesting behavior if strtok (both library and my own) is called to print all the tokens of the same string twice: it just prints all the tokens once, then just the first one. It works correctly on two different strings. Do you know what could be causing this behavior?

      s1 = strtok( inputStr, delims );
      while( s1 != NULL )
      {
            cout << s1 << endl;
            s1 = strtok( NULL, delims );
      }

      // Second try
      s2 = strtok( inputStr, delims );
      while( s2 != NULL )
      {
            cout << s2 << endl;
            s2 = strtok( NULL, delims );
      }
0
 
LVL 45

Expert Comment

by:Kdo
ID: 17086808
Hi gromul,

That's because strtok() overwrites the separator that ends each token with '\0' and returns the starting address of that token.

If you pass the same string again, the '\0' written after the first token now ends the string, so you're only passing the first token and not the entire original string.
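
One way around it is to tokenize a copy and leave the original alone. A minimal sketch, assuming a hypothetical helper count_tokens (not part of any library) that just counts the tokens:

```c
#include <stdlib.h>
#include <string.h>

/* Count the tokens in text without modifying it, by tokenizing a copy.
   Returns -1 if the copy cannot be allocated. */
int count_tokens(const char *text, const char *delims)
{
    size_t len = strlen(text) + 1;
    char *copy = malloc(len);
    int n = 0;

    if (copy == NULL)
        return -1;
    memcpy(copy, text, len);

    for (char *tok = strtok(copy, delims); tok != NULL;
         tok = strtok(NULL, delims))
        n++;

    free(copy);
    return n;
}
```

Because only the copy gets the '\0' characters written into it, calling this twice on the same input gives the same answer both times.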


Kent
0
 
LVL 8

Assisted Solution

by:manish_regmi
manish_regmi earned 100 total points
ID: 17087285
Also remember that almost all userspace libraries implement the string functions in assembly to speed things up.

here is the generic implementation in glibc.

static char *olds;
/* Parse S into tokens separated by characters in DELIM.
   If S is NULL, the last string strtok() was called with is
   used.  For example:
      char s[] = "-abc-=-def";
      x = strtok(s, "-");            // x = "abc"
      x = strtok(NULL, "-=");            // x = "def"
      x = strtok(NULL, "=");            // x = NULL
            // s = "-abc\0=-def\0"
*/
char *
strtok (s, delim)
     char *s;
     const char *delim;
{
  char *token;

  if (s == NULL)
    s = olds;

  /* Scan leading delimiters.  */
  s += strspn (s, delim);
  if (*s == '\0')
    {
      olds = s;
      return NULL;
    }

  /* Find the end of the token.  */
  token = s;
  s = strpbrk (token, delim);
  if (s == NULL)
    /* This token finishes the string.  */
    olds = __rawmemchr (token, '\0');
  else
    {
      /* Terminate the token and make OLDS point past it.  */
      *s = '\0';
      olds = s + 1;
    }
  return token;
}
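
The two library calls doing the work above are strspn, which counts leading characters that are in the delimiter set, and strpbrk, which finds the first character that is in the set. A small sketch of how they locate a token; the wrapper names token_start and token_length are made up for illustration:

```c
#include <stddef.h>
#include <string.h>

/* Offset of the first token in s: the number of leading delimiters. */
size_t token_start(const char *s, const char *delim)
{
    return strspn(s, delim);
}

/* Length of the token starting at token, found with strpbrk as in the
   code above; if no delimiter follows, the token runs to the end. */
size_t token_length(const char *token, const char *delim)
{
    const char *end = strpbrk(token, delim);
    return end ? (size_t)(end - token) : strlen(token);
}
```

For "-abc-=-def" with "-" as the delimiter, token_start gives 1 and token_length on the text from that offset gives 3, which matches the "abc" in the comment.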

regards
Manish Regmi
0
 
LVL 7

Accepted Solution

by:nafis_devlpr
nafis_devlpr earned 100 total points
ID: 17096775
You can try the code below for strtok(). It takes the same arguments and returns the same result as the standard library strtok() does:

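      char* str_tok(char* src, const char* deli)
      {
            static char* sr;        /* where the next scan starts; kept between calls */

            if(src)
                  sr=src;
            if(!sr)                 /* NULL passed before any string was supplied */
                  return 0;

            char *temp=sr,*t;
            int i=0;

            for(; *temp; temp++)
            {
                  /* is *temp one of the delimiters? */
                  for(i=0; deli[i]; i++)
                  {
                        if(deli[i] == *temp)
                              break;
                  }
                  if(deli[i])
                  {
                        *temp=0;
                        if(temp == sr)
                              sr++;       /* leading delimiter: skip it */
                        else
                        {
                              ++temp;     /* end of token: resume after it */
                              break;
                        }
                  }
            }

            if(!(*sr))              /* nothing but delimiters was left */
                  return 0;
            t=sr;
            sr=temp;
            return t;
      }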

Nafis
0
