?
Solved

String problem....Data is duplicated in array?...

Posted on 2003-03-05
8
Medium Priority
?
222 Views
Last Modified: 2010-04-15
Ok I have a function thats part of a Shell Program.  I check for semicolon, and then continue to store commands from the command line prompt.  When I hit the semicolon I store whats left in nextexpr.  But while debugging my code, I used the input "dir /l; ls -alg; gnome-terminal" and nextexpr should equal gnome-terminal after tokenizing ls -alg but instead it equals gnome-tergnome-terminal\n"...This is a very odd bug because I am working with indices in an array.  And when I do one *inbuf++ call it stores a whole bunch of unwanted characters.  It must be the method I'm doing it with?

Anyway here is the code function:

void tokenize(inbuf, args)
char *inbuf;
char **args;
{
     int breakout = 0;
     while(*inbuf != '\0' && *inbuf != ';' && !breakout)
     {
          /*Strip whitespace.  Use nulls, so that the previous argument
          is terminated automatically.*/
          while ((*inbuf == ' ') || (*inbuf == '\t') || (*inbuf == '\n'))
               *inbuf++ = '\0';
         
          /*Save Argument*/
          if(*inbuf != ';' && *inbuf != '\0')
               *args++ = inbuf;
         
          /*MAYBE CHECK FOR SEMICOLON IN *inbuf
         
          /*Skip over Argument*/
          while ((*inbuf != '\0') && (*inbuf != ' ') && (*inbuf != '\t') && (*inbuf != '\n') && (*inbuf != ';'))
               inbuf++;
         
          if(*inbuf == ';')
          {
               semi_used = 1;
               *inbuf++ = '\0';
               while ((*inbuf == ' ') || (*inbuf == '\t') || (*inbuf == '\n'))
                    *inbuf++ = '\0';
               /*Clear nextexpre*/
               int k = 0;
               while(nextexpr[k] != '\0')
                    nextexpr[k] = '\0';
               free(nextexpr);
               /*Copy Remainder of INBUF into nextexpr (index by index)*/
               int i = 0;
               while(*inbuf != '\0')
               {    
                    nextexpr[i] = *inbuf++;
                    i++;
               }
               breakout = 1;
          }
          else
               semi_used = 0;
         
     }
     
     *(args) = '\0';
}
0
Comment
Question by:themaxxx
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
8 Comments
 
LVL 6

Expert Comment

by:gj62
ID: 8071928
What is nextexpr?  

One problem is that you are freeing nextexpr, then making an assignment to it a few lines later...

It is generally a bad idea to free memory inside a function that does a bunch of other stuff, like this one is trying to do...
0
 
LVL 6

Expert Comment

by:gj62
ID: 8071963
This code is dangerous.  While you may have checks to insure that at least one null character is in nextexpr outside of this function, it is very poor style to clear it by doing:

          /*Clear nextexpre*/
              int k = 0;
              while(nextexpr[k] != '\0')
                   nextexpr[k] = '\0';

If you don't have a null, you will overflow the array.
0
 
LVL 6

Expert Comment

by:gj62
ID: 8071995
Lastly, you say

             while(*inbuf != '\0')
              {    
                   nextexpr[i] = *inbuf++;
                   i++;
              }

This will copy the remainder of inbuf into nextexpr - even including intervening semicolons.  Are you sure you don't want:

             while(*inbuf != '\0' && *inbuf != ';')
              {    
                   nextexpr[i] = *inbuf++;
                   i++;
              }
0
Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
LVL 86

Expert Comment

by:jkr
ID: 8072356
Have you thought of using 'strtok()'? It is designed for that very purpose
0
 
LVL 6

Expert Comment

by:gj62
ID: 8072520
jkr makes a good point - here's a quick example...

#include <string.h>
#include <stdio.h>

char string[] = "progname arg1 -n; arg2 -y -z; arg3";
char sep[]   = ";";
char *token;

void main( void )
{
   /* Establish string and get the first token: */
   token = strtok( string, seps );
   while( token != NULL )
   {
      /* While there are tokens in "string" */
      printf( " %s\n", token );
      /* HERE YOU CAN STORE THE CHAR* POINTER OR STRCPY() THE TOKENS AS YOU WISH */
      /* YOU CAN WRITE SIMPLE TRIM FUNCTION TO TRIM WHITESPACE TOO */
      /* Get next token: */
      token = strtok( NULL, seps );
   }
}

Note that strtok() modifies the input string by replacing any separator found (semicolons, in your case) with nulls.  If you want the original string kept intact you have to make a copy.  

You can also create a parser to parse off the switches (-n for example), if needed by changing the seps to "-" and working with each individual arguement string.
0
 
LVL 5

Accepted Solution

by:
Kocil earned 800 total points
ID: 8074843
I can see **args. Double * is confusing, no wonder you made some mistakes.
I assume you want like this,
input  : inbuf = "dir /l; ls -alg; gnome-terminal"
output : args[0] = "dir /l"
         args[1] = "ls -alg"
         args[2] = "gnome-terminal"
**args may contain OLD allocated memory, we have to free it before we malloc a new one.

Pardon me to make a 60% totally new code.
Strtok should be used, by I want to show you the funny syntax of the double **args;

void tokenize(inbuf, args)
char *inbuf;
char **args;
{
    int k;

    // Free the old args
    for (k=0; args[k]; k++) free(args[k]);

    while(*inbuf) {
       // skip white and clear out inbuf
       while((*inbuf == ' ') || (*inbuf == '\t')) {
         *inbuf=0; inbuff++;
       }

       if (*inbuf) {
          // count the length
          for (k=0; (inbuf[k] && (inbuf[k] != ';'); k++);

          // malloc a new one
          *args = malloc(k+1);

          // copy the args and clear out inbuf
          while (*inbuf && (*inbuf != ';')) {
             **args = *inbuf; // see how to assign args string
             *inbuf = '\0';
             (*args)++; // see how to increment args string
             inbuf++;
          }
          **args = '\0'; // terminated args string

          // done, next argument please
          args++;  // see how to increment args array to string
       }
    }
};

Note that if somebody make an empty expression like
ls;;cat
you will get
args[0]="ls"
args[1]=""   (empty)
args[2]="cat"

Cheers
0
 
LVL 5

Expert Comment

by:Kocil
ID: 8074984
My humble apologize, I didn't read your question carefully.
When you hit the semicolon you want to store what is left in nextexpr. But I can't see nextexprs in the tokenize's arguments.

So, which one is your case ?

Posibility 1 .. the nextexpr is a global variables like this.
=================
char *nextexpr;

void tokenize(char* inbuf, char **args)
{
   // you modify global nextexprs here
}

main()
{
   char inbuf[100]="ls;gnome";
   char *args;

   // you call it like this
   tokenize(inbuf, &args);

   /* here you will get
   inbuf = "ls"
   args = "ls" (dynamically allocated)*/
   nextexpr = "gnome" (dynamically allocated)
}
================

Or posibility 2
Your question/code is not consistent. The nextexpr is the **args in the tokenize function;
====================
void tokenize(char* inbuf, char **args)
{
   // you modify nextexprs as LOCAL **args here
}

main()
{
   char inbuf[100]="ls;gnome";
   char *nextexprs;

   // you call it like this
   tokenize(inbuf, &nextexprs);

   /* here you will get
   inbuf = "ls"
   nextexpr = "gnome" (dynamically allocated)
   */
}

Let us know more.

0
 

Author Comment

by:themaxxx
ID: 8075577
Thanks for all the input.  I saw this string parsing technique with the **args and *inbuf in the function definition in an example small shell program, and I found that it worked at first.  After implementing the semicolon idea, it fell apart.  I'm new to pointers to character strings and the like so all this got me pissed.  But nextexpr is declared outside tokenize function, sorry about not specifying that.  Thanks for the info on better methods of accomplishing this.  I have found this confusing enough, and I'm just gone deal directly with character arrays read off the command line.  I'll execute each command as I parse them, and return to the tokenize function so parse some more.  Seems easiest for me, although not very modular.  Thanks to Kocil, qj62, and jkr.  I'll award points to Kocil for a complete function.  I wish I could award all of you people for being A. smart enough to use strtok  B. and grasping this very confusing but useful mechanism of C.  I am picking up a book by the creaters of C, kernighan/ritchie and hopefully they can teach me...Thanks all

tony
0

Featured Post

Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

An Outlet in Cocoa is a persistent reference to a GUI control; it connects a property (a variable) to a control.  For example, it is common to create an Outlet for the text field GUI control and change the text that appears in this field via that Ou…
Windows programmers of the C/C++ variety, how many of you realise that since Window 9x Microsoft has been lying to you about what constitutes Unicode (http://en.wikipedia.org/wiki/Unicode)? They will have you believe that Unicode requires you to use…
The goal of this video is to provide viewers with basic examples to understand how to use strings and some functions related to them in the C programming language.
Video by: Grant
The goal of this video is to provide viewers with basic examples to understand and use while-loops in the C programming language.
Suggested Courses

777 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question