Solved

Problem while separating words from a sentence

Posted on 2007-11-13
Last Modified: 2008-02-01
Hello, to extract words from a sentence I use this code:

#include <iostream.h>
#include <string.h>

using namespace std;

int main ()
{
    int i,k;
    string s, word;
    cout << "Enter Sentence:\n";
    getline(cin, s);
    cout << "You entered:" << s << "\n";
    s += ',';
    int npos = 0;
    int lpos = 0;

    while ((npos = (int)s.find_first_of(",", lpos)) != string::npos )
    {
        word = s.substr(lpos, npos - lpos);
        cout << word << endl;
        lpos = npos + 1;
    }

    system("PAUSE");
    return 0;
}


The input must be a sentence like this: dog,cat,food,table,carrot,expert,exchange.
The last word is followed by a dot.
The output should be:
dog
cat
food
table
carrot
expert
exchange
-----------------------
But right now, I get:
dog
cat
food
table
carrot
expert
exchange.

How can I make sure there is no dot "." after the last word? (Please write a sample.)

One more question: let's say that later I need to check whether a word has more than 5 letters. How could I do that? I know I can't just write "if (word>5)", because word is declared as a string. Please help me, thanks!
Question by:moonskyland
 
LVL 7

Expert Comment

by:UrosVidojevic
ID: 20276310
Add this, before printing the word.

              if (!word.empty() && word[word.length()-1] == '.')
                    word = word.substr(0, word.length()-1);
0
 
LVL 7

Accepted Solution

by:
UrosVidojevic earned 300 total points
ID: 20276363
Or, even better, if you are sure that '.' is the last character of the sentence,
remove it up front, immediately after you read the sentence:

s = s.substr(0, s.length()-1);
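
To show how the pieces fit together, here is a minimal sketch that strips one trailing '.' and then splits on commas. The function name split_sentence and the use of std::vector are assumptions for illustration, not something from the original code, and the sketch checks for the dot rather than assuming it is there.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): drops one trailing '.' from the
// sentence, then splits what is left on commas.
std::vector<std::string> split_sentence(std::string s)
{
    // Guard against empty input before looking at the last character.
    if (!s.empty() && s[s.length() - 1] == '.')
        s.erase(s.length() - 1);

    std::vector<std::string> words;
    std::string::size_type lpos = 0;   // start of the current word
    std::string::size_type pos;        // position of the next comma

    while ((pos = s.find(',', lpos)) != std::string::npos)
    {
        words.push_back(s.substr(lpos, pos - lpos));
        lpos = pos + 1;
    }
    words.push_back(s.substr(lpos));   // the last word has no comma after it
    return words;
}
```

Because the last word is pushed after the loop, there is no need to append a ',' to the sentence first.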
0
 
LVL 40

Assisted Solution

by:evilrix
evilrix earned 200 total points
ID: 20276374
You split up the string using a comma, so your code is doing exactly the right thing. The fact that there is a '.' at the end of the last word is a red herring; it could happen on any word, because any character that is not a comma is treated as part of a word. As UrosVidojevic has alluded to, you need to perform some post-processing of the string to 'clean it up'.

If you really need the power of parsing, consider investing time in using one of the following free regex engines.

boost regex
http://www.boost.org/libs/regex/doc/index.html

PCRE
http://www.pcre.org/

Greta
http://research.microsoft.com/projects/greta/

Regexes are pretty easy to learn -- and fun too :)
http://www.regular-expressions.info/
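
For a rough idea of what the regex approach looks like, here is a sketch that uses std::regex from the C++11 standard library instead of the engines listed above. That substitution is an assumption (it needs a C++11 compiler), and the name extract_words is made up for the example.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns every run of ASCII letters found in `s`.
// Uses std::regex (C++11), not boost regex, PCRE or Greta.
std::vector<std::string> extract_words(const std::string& s)
{
    static const std::regex word_re("[A-Za-z]+");
    std::vector<std::string> words;

    // A default-constructed sregex_iterator marks the end of the matches.
    for (std::sregex_iterator it(s.begin(), s.end(), word_re), end; it != end; ++it)
        words.push_back(it->str());

    return words;
}
```

Here the pattern describes what a word is rather than what separates words, so commas, dots and any other punctuation are skipped without being listed.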

I hope this helps.

-Rx.
0
 
LVL 40

Expert Comment

by:evilrix
ID: 20276405
>> Also one more question, let say later i will need to check if word  has more than 5 letters, how could i do that

if(word.size() > 5)
{
    // more than 5 chars
}
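
As a small sketch of the same check wrapped in a function (the name is_longer_than is hypothetical): note that size() counts every character in the string, so a word that still carries its trailing '.' counts one more than its letters.

```cpp
#include <string>

// Hypothetical helper: true when `word` has more than `n` characters.
// size() counts all characters, including punctuation left in the word.
bool is_longer_than(const std::string& word, std::string::size_type n)
{
    return word.size() > n;
}
```

With the words from the question, is_longer_than(word, 5) is true for "carrot", "expert" and "exchange", and false for the rest.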
0
 
LVL 40

Expert Comment

by:evilrix
ID: 20276506
This goes some way towards fixing your problem...

#include <cstdlib>   // std::system
#include <iostream>
#include <string>

int main ()
{
    std::string s;
    std::string word;

    std::cout << "Enter Sentence:\n";
    std::getline(std::cin, s);
    std::cout << "You entered:" << s << "\n";

    s += ',';   // sentinel so the last word is also followed by a comma

    std::string::size_type npos = 0;
    std::string::size_type lpos = 0;

    while ((npos = s.find_first_of(",", lpos)) != std::string::npos )
    {
        word = s.substr(lpos, npos - lpos);

        // Erase every character that is not an ASCII letter.
        std::string::iterator itr = word.begin();

        while(itr != word.end())
        {
            char c = *itr;

            if((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'))
            {
                ++itr;
            }
            else
            {
                itr = word.erase(itr);   // erase returns the next valid iterator
            }
        }

        std::cout << word << std::endl;
        lpos = npos + 1;
    }

    std::system("PAUSE");   // Windows-only: waits for a key press
    return 0;
}

NB. Code I post is for example only and is not guaranteed to be defect free!
0
 

Author Comment

by:moonskyland
ID: 20276514
Thanks for the help! :) I'm still learning C++.
0
 
LVL 40

Expert Comment

by:evilrix
ID: 20276529
You are very welcome :)
0
 

Expert Comment

by:crazybrker
ID: 20276572
To your first question: you are currently taking everything from the start of s (your sentence) up to the first comma and printing it with cout. The issue is that the "." (period) is considered part of your last word. So try adding it to your find_first_of call, i.e.
    while ((npos = (int)s.find_first_of(",.", lpos)) != string::npos )
Now it will split the sentence on every occurrence of a period or a comma.

As for your second question, the member function you are looking for is length(), i.e.

cout << "The length of word is " << word.length() << " characters.\n";
or
if (word.length() > 5)
do whatever...
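
One thing to watch with a multi-character delimiter set: when the sentence ends in "." and a ',' is appended as in the question's loop, the two delimiters sit next to each other and the loop produces one extra, empty word. A sketch that skips empty tokens, using a hypothetical helper named split_on_any:

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits `s` on any character in `delims` and
// skips the empty tokens that adjacent delimiters would produce.
std::vector<std::string> split_on_any(const std::string& s, const std::string& delims)
{
    std::vector<std::string> words;

    // Jump to the first character that is not a delimiter.
    std::string::size_type start = s.find_first_not_of(delims);

    while (start != std::string::npos)
    {
        std::string::size_type end = s.find_first_of(delims, start);

        // If end is npos, substr clamps the count and takes the rest of s.
        words.push_back(s.substr(start, end - start));

        start = s.find_first_not_of(delims, end);
    }
    return words;
}
```

Called as split_on_any(s, ",."), it handles both "word1,word2." and "word1,.word2,." without empty entries.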
0
 

Author Comment

by:moonskyland
ID: 20276854
crazybrker, thank you for the help as well. I checked and it works too. I thought it was not possible to write ",." (two characters); I mean, I thought it would only work if the input were word1,.word2,.word3,. and so on.
Thanks for the help again! :-)
0
 
LVL 40

Expert Comment

by:evilrix
ID: 20278274
Of course, the next problem you'll hit is that other non-alpha characters will also end up in your words. You would have to list them all in find_first_of, but then they become word separators, when all you probably want is to filter them out. This is why I provided a code snippet that shows how to filter them out rather than split on them.
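
A more compact take on the same filtering idea, as a sketch only: it swaps the hand-written loop for the standard erase/remove_if idiom and uses std::isalpha, which depends on the current C locale. The names is_not_alpha and keep_letters are made up for the example.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Predicate for std::remove_if. The cast avoids undefined behaviour when
// std::isalpha is handed a negative char value.
bool is_not_alpha(char c)
{
    return !std::isalpha(static_cast<unsigned char>(c));
}

// Hypothetical helper: returns `word` with every non-letter removed.
std::string keep_letters(std::string word)
{
    // remove_if moves the kept characters to the front and returns the new
    // logical end; erase then chops off the leftover tail.
    word.erase(std::remove_if(word.begin(), word.end(), is_not_alpha), word.end());
    return word;
}
```

Applied to each word after splitting on commas, this turns "exchange." into "exchange" and leaves clean words untouched.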
0
 
LVL 53

Expert Comment

by:Infinity08
ID: 20278315
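By the way, these headers:

#include <iostream.h>
#include <string.h>

are not the ones you want. <iostream.h> is a pre-standard header, and <string.h> is the C string header, which does not declare std::string. Use these instead: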

#include <iostream>
#include <string>
0
 
LVL 40

Expert Comment

by:evilrix
ID: 20278346
As per my example code :)
0
 
LVL 53

Expert Comment

by:Infinity08
ID: 20278428
I know ... Just wanted to say it explicitly.
0
 
LVL 40

Expert Comment

by:evilrix
ID: 20278474
You just wanted the final word :-p
0
 
LVL 53

Expert Comment

by:Infinity08
ID: 20278480
Mmmm ... did I ? ;)
0
 

Author Comment

by:moonskyland
ID: 20278846
final word is mine :D thanks to all for help
0
 
LVL 40

Expert Comment

by:evilrix
ID: 20278859
You are very welcome --- doh! :-s
0
