String Manipulation

Hello, I'm trying to understand how to do some string manipulation with C. Since I've been programming in Java, the way you string manipulate in C is confusing due to the lack of some string functions. Right now I'm trying to do the following in C to get the idea of string manipulation:

1. Being able to search for a character in a given string and count the number of occurrences

2. Being able to tokenize a phone number and print out the area code and the phone number separately

To me, doing this in C requires some very large arrays and error checking, but there must be a more efficient way to do this. Please help! All of you have taught me so much about C; I'm pretty sure you can help me out with this as well.
DancingFighterGAsked:
Harisha M GEngineerCommented:
1) strchr(str, chr);
2) strtok(str,substr);
Harisha M GEngineerCommented:
But, I would suggest writing your own functions for this.

For counting:

int countchar(char *str, char chr)
{
      int i, count;
      for (i = 0, count = 0; str[i] != '\0'; i++)
      {
            if (str[i] == chr)
                  count++;
      }
      return count;
}

brettmjohnsonCommented:
> the way you string manipulate in C is confusing due to the lack of some string functions.

I beg to differ.  There is a large variety of string functions defined in string.h


> But, I would suggest writing your own functions for this.

Poor advice for anything more than the most trivial of functions.   Using the standard library
functions allows most people to write code faster with fewer errors.  For instance, your
countchar() routine segfaults if NULL str is passed, and returns an incorrect count if chr is '\0'.
And to be consistent with the rest of the system library string functions, its prototype should be:
  size_t countchar(const char *str, int ch)
especially the const char * part.


Harisha M GEngineerCommented:
>> Poor advice for anything more than the most trivial of functions.
Can you please give the code for getting the number of times a character is found in a string ?

I will talk later.
MysidiaCommented:
In C strings are arrays of characters, and you can manipulate them
as such.

The convention is that they end at the first \0.

You need to be careful not to go outside the bounds of allocated
memory, otherwise you can think of

xyz[5]  being like  Java's xyz.charAt(5)  
except writable since you may safely do xyz[5] = 'b';

(if sufficient space is allocated and you are not wiping out the \0
terminator for the string, anyway)


If say you wanted to count characters, the best way would be to
do something like..

#include <stdio.h>
#include <stdlib.h>   /* abort() */
#include <string.h>

int countch(char const* s, int ch)
{
     unsigned short count = 0;

     if (s == NULL || ch == '\0')
         abort();

     while (*s != '\0')
               if (*s++ == ch) count++;
     return count;
}
---

But seriously, C provides a large library of string functions, including
str* functions like strstr, strchr, strncpy, strncat,
and sprintf

mem* functions like memchr, memcpy, memcmp, memmove


As for parsing a phone number, for simple applications where you can
assume it will be entered a certain way, I would probably go with
the sscanf function, since you can do things like

   int day, month, year;

   sscanf(enteredText, "%d/%d/%d", &year, &month, &day);

MysidiaCommented:
Here's a C library reference:
 http://www.acm.uiuc.edu/webmonkeys/book/c_guide/

Particular attention to the end:
 
     2.14 string.h
DancingFighterGAuthor Commented:
Question, Mysidia: in the code given above, I would simply ask the user for a character, using the variable ch for the scanf, right? Just confirming, because that's what I think your code implies.
balderCommented:
how about ?

#include <stddef.h>

size_t countch(char const* s, char ch) /* return and count of same type, ch as same type as we point to */
{
     if( !s ) return 0; /* no valid string, no occurences of ch */
     if( ch == NULL ) return 1; /* ch is string end, always one end of string in valid string, or is it outside the string and we should return 0? if regarded inside we should find it through the loop to check if it really is a valid string */

     size_t count = 0; /*

     while (*s != NULL )
     {
               if( *s++ == ch ) ++count;
               if( !s ) return 0; /* invalid string, no end of string. not sure wrap around is defined behaviour, should we find type of size_t and check for max value ? */
     }
     return count;
}

creating a function without any preconditions is impossible :)
balderCommented:
Paul,

absolutely no insults taken, but I must say I am a bit baffled.

There is one bug (as far as I can spot in the code), a /* on the loose; the line should read:

    size_t count = 0;

The code is otherwise fully commented with regard to the design decisions that must be taken by the implementer/user, and it corrects several flaws in the code posted by Mysidia (return type, 2nd input argument, use of abort() in a function)

My comments in the source, and last comment in the post should make it clear what level this code is in.

Why you stepped in to comment on this code, and not Mysidia's code, is what baffles me.

balder

balderCommented:
to elaborate,
 brettmjohnson made a very wise comment about writing new functions.

I was trying to show all the problems you stumble into when insisting on writing new functions instead of using what is in the C library.

(surely failed)

balder
PaulCaswellCommented:
balder,

>>I was trying to show all the problems you stumble into when insisting on writing new functions instead of using what is in the C library.
And an excellent job you did. The issues I noticed (because today I feel pedantic) consist of:

>size_t countch(char const* s, char ch) /* return and count of same type, ch as same type as we point to */
Generally, const char * is the more conventional spelling. Both forms mean a pointer to unchangeable characters; an unchangeable pointer to characters would be written char * const.

>     if( !s ) return 0; /* no valid string, no occurences of ch */
a. Good coding practice recommends if(s!=NULL)
b. Good coding practice recommends only one exit point from a function.

>     if( ch == NULL ) return 1; /* ch is string end, always one end of string in valid string, or is it outside the string and we should return 0? if regarded inside we should find it through the loop to check if it really is a valid string */
NULL is generally used as a pointer. The correct code here is:
if( ch == '\0' ) ...
but see point 'b' above and the following loop will deal with this situation anyway.

>     size_t count = 0; /*
Apart from the unclosed comment, it is also invalid to declare variables away from the start of the enclosing block.

>     while (*s != NULL )
Same as for 'if ( ch == NULL )'.

>               if( *s++ == ch ) ++count;
This makes this loop a 'for' loop and therefore should be coded as such.

>               if( !s ) return 0; /* invalid string, no end of string. not sure wrap around is defined behaviour, should we find type of size_t and check for max value ? */
See 'b' above but otherwise a good idea.

My apologies for finding a fault on almost every line. My work teaching programming has made me sometimes exceptionally critical and it seems that today is one of those days.

>>Why you stepped in to comment on this code, and not Mysidia's code, is what baffles me.
This was because I was far too late to step in on Mysidia's code to have any chance of doing some good. :(

>>I was trying to show all the problems you stumble into when insisting on writing new functions instead of using what is in the C library. (surely failed)
Not at all! Your input has been valuable and your point is well made.

Paul

balderCommented:

flat on the floor :), I don't drink coffee, but maybe I should start :)

>>     if( !s ) return 0; /* no valid string, no occurences of ch */
>a. Good coding practice recommends if(s!=NULL)
yepp, a hasty change, the NULL was supposed to go to this line, not the ch check. But you switched the logic in there.
if( s == NULL ) would be correct

> it is also invalid to declare variables away from the start of the enclosing block.
I thought that was changed in C99 (to align with C++)?

You really think
for(;*s;++s )
{
   if( !s ) return 0;
   if( *s == ch ) ++count;
}

is more correct than
while( *s )
{
   if( *s++ == ch ) ++count;
   if( !s ) return 0;
}

or did I get you wrong?
PaulCaswellCommented:
>>yepp
Good point! My mistake.

For me, I'd do it as:

...
int count = 0;

if ( s != NULL )
{
 int i;
 for ( i = 0; s[i] != '\0'; i++ )
 {
  if (s[i] == ch ) count += 1;
 }
}

return count;
...

but, as we all know, there are so many ways to do it, mine may not be the best but it certainly would not be the worst. As an advert says in England right now, 'It does what it says on the tin'.

Some days I might even remove the 's != NULL' check.

Paul
Kent OlsenDBACommented:

Hi DancingFighterG,

Lots of good advice here.  :)

One of the things to consider in writing your function is the "real task" and not just a single step.  If your goal is to count the number of '0's or the number of 'A's, etc. then the suggestions above are exactly what you want.

But perhaps the "real task" is to test for several items.  Perhaps you want to know the number of spaces and if the number of spaces is zero then count the number of periods, else count the number of 'A's.  For efficiency, you probably don't want to repeatedly scan the string.  Instead, consider building a small table that counts each character.

#include <string.h>        /*  memset  */

#define CHARACTERS 256

int CharCount[CHARACTERS];

void CountChars (const void *string)  /*  Pass the string as void* for compatibility */
{
  const unsigned char *str;  /*  We need to examine the characters as unsigned values  */

  str = (const unsigned char*)string;
  memset (CharCount, 0, sizeof (CharCount));

  while (*str)
    CharCount [*(str++)]++;  /*  Count the number of times each character occurs  */
}


Now you simply call CountChars() one time to build the table.  You can then test for any character (or combination of characters) with a simple test.

  if (CharCount['.'] > 0)
    printf ("  The string contains a period");

  if (CharCount['A'] + CharCount['a'] > 0)
    printf ("  The string contains an 'A' in lower or upper case");

  if (CharCount['A'] + CharCount['E'] + CharCount['I'] + CharCount['O'] + CharCount['U'] > 0)
    printf ("  The string contains an upper case vowel");


Good Luck,
Kent

DancingFighterGAuthor Commented:
Ok, this is some good stuff from everyone. I'm going to try a multitude of string manipulations so I can get the hang of this. I don't know why Java came easier to me when it came to this, but oh well. This is what I have right now, but it keeps crashing and I don't know why:

#include <stdio.h>

int countch(char const* s, char ch);

int main(void)
{

      char const* s;
      unsigned short a;
      char ch;
      printf("\nEnter in a string:");
      scanf("%s", s);
      printf("\nWhat character are you looking for:");
      scanf("%s", ch);
      a = countch(s, ch);
      printf("\nThere are %d occurences of %s in the string", a, ch);
      
return (0);

}

int countch(char const* s, char ch)
{
     unsigned short count = 0;

     if (s == NULL || ch == '\0')
       {
        printf("\nInvalid");
       }

     while (*s != '\0')
       {
         if (*s++ == ch)
             {
                  count++;
             }
       }
     return count;
}
DancingFighterGAuthor Commented:
Hey Kdo, I was working on your way of dealing with a real situation and I tried to add some alterations like such:

#include <stdio.h>
#include <string.h>

#define CHARACTERS 256
CountChars (void *string);

int main (void)
{      
      int CharCount[CHARACTERS];
      unsigned char *str;
      int i = 0;
      printf("\nEnter in aa string");
      scanf("%s", str);
      printf("\n");
      // convert string to upper or lower case and count
      // print out the number of characters and occurences


      return 0;
}

CountChars (void *string)  
{
  unsigned char *str;        

  str = (unsigned char*) string;
  memset (CharCount, 0, sizeof (CharCount));

  while (*str)
    CharCount [*(str++)]++;  
}

Now the only thing that I don't understand in your functions is how you are allowing a user to input a string and print out the occurrences and characters without using an if statement. I want to convert the string entered by the user to either lower case or upper case so it is easier to handle.
Kent OlsenDBACommented:
Hi DancingfighterG,

I didn't code the part about inputting a string.  I gave a solution, not how to get there.  :)

There are several options for getting the string.  Use fgets(), scanf(), or accept the string as a command-line argument to the program.


Kent