Solved

Problem while separating words from a sentence

Posted on 2007-11-13
Last Modified: 2008-02-01
Hello, to extract words from a sentence I use this code:

#include <iostream.h>
#include <string.h>

using namespace std;

int main ()
{
    int i,k;
    string s, word;
    cout << "Enter Sentence:\n";
    getline(cin, s);
    cout << "You entered:" << s << "\n";
    s += ',';
    int npos = 0;
    int lpos = 0;

    while ((npos = (int)s.find_first_of(",", lpos)) != string::npos )
    {
        word = s.substr(lpos, npos - lpos);
        cout << word << endl;
        lpos = npos + 1;
    }

    system("PAUSE");
    return 0;
}


The input must be entered as one sentence like this: dog,cat,food,table,carrot,expert,exchange.
There is a dot after the last word.
The output should be:
dog
cat
food
table
carrot
expert
exchange
-----------------------
But right now I get:
dog
cat
food
table
carrot
expert
exchange.

How can I make sure there is no dot "." after the last word? (Please write a sample.)

One more question: let's say that later I need to check whether a word has more than 5 letters. How could I do that? I know I can't just write "if (word>5)", because word is declared as a string. Please help me, thanks!
Question by:moonskyland
17 Comments
 
LVL 7

Expert Comment

by:UrosVidojevic
ID: 20276310
Add this before printing the word (the empty() check avoids indexing into an empty word):

              if (!word.empty() && word[word.length()-1] == '.')
                    word = word.substr(0, word.length()-1);
 
LVL 7

Accepted Solution

by:
UrosVidojevic earned 300 total points
ID: 20276363
Or, even better, if you are sure that '.' is the last character of the sentence,
eliminate it at the beginning, immediately after you read the sentence:

s = s.substr(0, s.length()-1);
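
The one-liner above assumes the sentence is non-empty and really ends with '.'. A minimal sketch of a more defensive variant follows; the helper name strip_trailing_period is hypothetical and not from the thread:

```cpp
#include <string>

// Hypothetical helper (not from the thread): returns a copy of s without a
// single trailing '.'. The string is returned unchanged when it is empty or
// ends in any other character.
std::string strip_trailing_period(const std::string& s)
{
    if (!s.empty() && s[s.size() - 1] == '.')
        return s.substr(0, s.size() - 1);
    return s;
}
```

Calling s = strip_trailing_period(s); right after getline would then be safe even when the user enters nothing or forgets the dot.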
 
LVL 40

Assisted Solution

by:evilrix
evilrix earned 200 total points
ID: 20276374
You split the string on commas, so your code is doing exactly the right thing. The fact that there is a '.' at the end of the last word is a red herring: it would happen with any word, because every character that is not a comma is treated as part of a word. As UrosVidojevic has alluded to, you need to post-process the string to 'clean it up'.

If you really need the power of parsing, consider investing time in implementing one of the following free regex engines.

boost regex
http://www.boost.org/libs/regex/doc/index.html

PCRE
http://www.pcre.org/

Greta
http://research.microsoft.com/projects/greta/

Regexes are pretty easy to learn, and fun too :)
http://www.regular-expressions.info/

I hope this helps.

-Rx.
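
As a rough illustration of the regex approach, here is a minimal sketch that uses std::regex from the C++11 standard library rather than any of the three engines listed above. That substitution is an assumption made for the example (it needs a C++11 compiler), and the function name extract_words is hypothetical:

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects every maximal run of ASCII letters in s.
// Commas, dots and any other non-letter characters act as separators and
// never end up inside a word.
std::vector<std::string> extract_words(const std::string& s)
{
    static const std::regex word_re("[A-Za-z]+");
    std::vector<std::string> words;
    for (std::sregex_iterator it(s.begin(), s.end(), word_re), end; it != end; ++it)
        words.push_back(it->str());
    return words;
}
```

With this approach the pattern describes what a word is, instead of listing every character that might separate two words.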
 
LVL 40

Expert Comment

by:evilrix
ID: 20276405
>> Also one more question, let say later i will need to check if word  has more than 5 letters, how could i do that

if(word.size() > 5)
{
    // more than 5 chars
}
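
The same size() check can be applied to a whole list of words. A minimal sketch, assuming the words have already been collected into a std::vector (the original program only prints them); the helper name count_longer_than is hypothetical:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: counts the words that have more than min_len
// characters. size() counts chars, which equals the number of letters only
// for plain single-byte text.
std::size_t count_longer_than(const std::vector<std::string>& words,
                              std::size_t min_len)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < words.size(); ++i)
    {
        if (words[i].size() > min_len)
            ++count;
    }
    return count;
}
```

Note that size() and length() are interchangeable on std::string; both return the number of characters.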
 
LVL 40

Expert Comment

by:evilrix
ID: 20276506
This goes some way towards fixing your problem...

#include <cstdlib>   // std::system
#include <iostream>
#include <string>

int main ()
{
      std::string s;
      std::string word;

      std::cout << "Enter Sentence:\n";
      std::getline(std::cin, s);
      std::cout << "You entered:" << s << "\n";

      // Append a comma so the last word is terminated like all the others.
      s += ',';

      std::string::size_type npos = 0;
      std::string::size_type lpos = 0;

      while ((npos = s.find_first_of(",", lpos)) != std::string::npos )
      {
            word = s.substr(lpos, npos - lpos);

            std::string::iterator itr = word.begin();

            // Erase every character that is not an ASCII letter.
            while(itr != word.end())
            {
                  char c = *itr;

                  if((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'))
                  {
                        ++itr;
                  }
                  else
                  {
                        itr = word.erase(itr);
                  }
            }

            std::cout << word << std::endl;
            lpos = npos + 1;
      }

      std::system("PAUSE");   // Windows-only: keeps the console window open
      return 0;
}

NB. Code I post is for example only and is not guaranteed to be defect free!
 

Author Comment

by:moonskyland
ID: 20276514
Thanks for the help! :) I'm still learning C++.
 
LVL 40

Expert Comment

by:evilrix
ID: 20276529
You are very welcome :)
 

Expert Comment

by:crazybrker
ID: 20276572
To your first question: you are currently taking everything from the start of s (your sentence) up to the first comma and printing it. The issue is that the "." (period) is treated as part of the last word. So try adding it to the delimiter set passed to find_first_of, i.e.
    while ((npos = (int)s.find_first_of(",.", lpos)) != string::npos )
Now it will split the sentence on every occurrence of a period or a comma.

As for your second question, the function you are looking for is .length(), i.e.

cout << "The length of word is " << word.length() << " characters.\n";
or
if (word.length()<5)
do whatever...
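
One side effect of splitting on ",." is worth noting: because the original program appends a ',' to the sentence, an input that ends in '.' leaves an empty word between the '.' and the added ','. A minimal sketch of a splitter that skips such empty words follows; the function name split_words is hypothetical, and returning a std::vector instead of printing directly is a choice made for this example:

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits s on any character found in delims and
// drops empty tokens, so "a,,b." yields just "a" and "b".
std::vector<std::string> split_words(const std::string& s,
                                     const std::string& delims)
{
    std::vector<std::string> words;
    std::string::size_type start = 0;

    while (start <= s.size())
    {
        std::string::size_type end = s.find_first_of(delims, start);
        if (end == std::string::npos)
            end = s.size();              // last token runs to the end of s

        if (end > start)                 // skip empty tokens
            words.push_back(s.substr(start, end - start));

        start = end + 1;
    }
    return words;
}
```

Called as split_words(s, ",."), this needs no trailing ',' appended to the sentence, since the end of the string also terminates a word.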
 

Author Comment

by:moonskyland
ID: 20276854
crazybrker, thank you for the help as well. I checked and it works too. I thought it was not possible to write ",." (two characters); I mean, I thought it would only work if the input were word1,.word2,.word3,. and so on.
Thanks again for the help! :-)
 
LVL 40

Expert Comment

by:evilrix
ID: 20278274
Of course, the next problem you'll hit is that other non-alpha characters will also end up in your words. You would have to list them all in the find_first_of call, but then they all become word separators, when all you probably want is to filter them out. This is why I provided a code snippet that shows how to filter these characters out rather than split on them.
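
The filtering idea can also be written with std::isalpha instead of explicit character ranges. This is a minimal sketch, assuming the default "C" locale, where std::isalpha accepts exactly the ASCII letters; the helper name keep_letters is hypothetical:

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: returns a copy of word that contains only its
// alphabetic characters, with everything else filtered out.
std::string keep_letters(const std::string& word)
{
    std::string out;
    for (std::string::size_type i = 0; i < word.size(); ++i)
    {
        // Cast to unsigned char first: passing a negative char value to
        // std::isalpha is undefined behaviour.
        unsigned char c = static_cast<unsigned char>(word[i]);
        if (std::isalpha(c))
            out += word[i];
    }
    return out;
}
```

Building a new string avoids the repeated erase calls, at the cost of one extra copy per word.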
 
LVL 53

Expert Comment

by:Infinity08
ID: 20278315
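By the way, these headers:

#include <iostream.h>
#include <string.h>

should not be used here: <iostream.h> is a pre-standard header, and <string.h> is the C string header, which does not declare std::string. Use these instead: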

#include <iostream>
#include <string>
 
LVL 40

Expert Comment

by:evilrix
ID: 20278346
As per my example code :)
 
LVL 53

Expert Comment

by:Infinity08
ID: 20278428
I know ... Just wanted to say it explicitly.
 
LVL 40

Expert Comment

by:evilrix
ID: 20278474
You just wanted the final word :-p
 
LVL 53

Expert Comment

by:Infinity08
ID: 20278480
Mmmm ... did I ? ;)
 

Author Comment

by:moonskyland
ID: 20278846
The final word is mine :D Thanks to all for the help.
 
LVL 40

Expert Comment

by:evilrix
ID: 20278859
You are very welcome --- doh! :-s