Solved

Problem while separating words from a sentence

Posted on 2007-11-13
Last Modified: 2008-02-01
Hello, to extract words from a sentence I use this code.

#include <iostream.h>
#include <string.h>

using namespace std;

int main ()
{
    int i,k;
    string s, word;
    cout << "Enter Sentence:\n";
    getline(cin, s);
    cout << "You entered:" << s << "\n";
    s += ',';
    int npos = 0;
    int lpos = 0;

    while ((npos = (int)s.find_first_of(",", lpos)) != string::npos )
    {
        word = s.substr(lpos, npos - lpos);
        cout << word << endl;
        lpos = npos + 1;
    }

    system("PAUSE");
    return 0;
}


The input must be given like this (a sentence): dog,cat,food,table,carrot,expert,exchange.
There is a dot after the last word.
And the output should be:
dog
cat
food
table
carrot
expert
exchange
-----------------------
But right now, I get:
dog
cat
food
table
carrot
expert
exchange.

How can I make sure there is no dot "." after the last word? (Please write a sample.)

Also, one more question: let's say that later I need to check whether a word has more than 5 letters. How could I do that? I know I can't just write "if (word>5)", because word is declared as a string. Please help me, thanks!
Question by:moonskyland
17 Comments
 
LVL 7

Expert Comment

by:UrosVidojevic
ID: 20276310
Add this, before printing the word.

              // Check for an empty word first: length()-1 would wrap around otherwise.
              if (!word.empty() && word[word.length()-1] == '.')
                    word = word.substr(0, word.length()-1);
 
LVL 7

Accepted Solution

by:
UrosVidojevic earned 300 total points
ID: 20276363
Or, even better, if you are sure that '.' is the last character of the sentence,
eliminate it at the beginning, immediately after you read the sentence:

s = s.substr(0, s.length()-1);
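
The one-liner above assumes the sentence always ends with '.'; if it does not, the last letter is cut off instead. A minimal sketch of a guarded version follows. The helper name strip_trailing_dot is hypothetical and not from this thread.

```cpp
#include <string>

// Hypothetical helper (not from the thread): removes one trailing '.'
// if the string ends with one, and returns the string unchanged otherwise.
std::string strip_trailing_dot(const std::string& s)
{
    if (!s.empty() && s[s.length() - 1] == '.')
        return s.substr(0, s.length() - 1);
    return s; // nothing to strip
}
```

It could be called right after getline, for example s = strip_trailing_dot(s);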
 
LVL 40

Assisted Solution

by:evilrix
evilrix earned 200 total points
ID: 20276374
You split the string on commas, so your code is doing exactly the right thing. The '.' at the end of the last word is a red herring: it would stay attached to any word, because every character that is not a comma is treated as part of a word. As UrosVidojevic has alluded, you need to post-process the string to 'clean it up'.

If you really need the power of parsing, consider investing time in using one of the following free regex engines.

boost regex
http://www.boost.org/libs/regex/doc/index.html

PCRE
http://www.pcre.org/

Greta
http://research.microsoft.com/projects/greta/

Regexes are pretty easy to learn -- and fun too :)
http://www.regular-expressions.info/

I hope this helps.

-Rx.
 
LVL 40

Expert Comment

by:evilrix
ID: 20276405
>> Also one more question, let say later i will need to check if word  has more than 5 letters, how could i do that

if(word.size() > 5)
{
    // more than 5 chars
}
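
Note that size() counts every character in the string, including punctuation such as a trailing '.'. If only letters should count, one possible sketch is shown below. The helper name count_letters is hypothetical, and std::isalpha follows the current C locale.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not from the thread): counts alphabetic characters only.
std::size_t count_letters(const std::string& word)
{
    std::size_t n = 0;
    for (std::string::size_type i = 0; i < word.size(); ++i)
    {
        // Cast to unsigned char: passing a negative char to isalpha is undefined.
        if (std::isalpha(static_cast<unsigned char>(word[i])))
            ++n;
    }
    return n;
}
```

The check from the question then becomes if (count_letters(word) > 5).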
 
LVL 40

Expert Comment

by:evilrix
ID: 20276506
This goes some way towards fixing your problem...

#include <cstdlib>   // system
#include <iostream>
#include <string>

int main ()
{
      std::string s;
      std::string word;

      std::cout << "Enter Sentence:\n";
      std::getline(std::cin, s);
      std::cout << "You entered:" << s << "\n";

      // Append a comma so the last word is terminated like the others
      s += ',';

      std::string::size_type npos = 0;
      std::string::size_type lpos = 0;

      while ((npos = s.find_first_of(",", lpos)) != std::string::npos )
      {
            word = s.substr(lpos, npos - lpos);

            std::string::iterator itr = word.begin();

            // Keep letters, erase everything else from the word
            while(itr != word.end())
            {
                  char c = *itr;

                  if((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'))
                  {
                        ++itr;
                  }
                  else
                  {
                        // erase returns an iterator to the next character
                        itr = word.erase(itr);
                  }
            }

            std::cout << word << std::endl;
            lpos = npos + 1;
      }

      system("PAUSE");
      return 0;
}

NB. Code I post is for example only and is not guaranteed to be defect free!
 

Author Comment

by:moonskyland
ID: 20276514
Thanks for the help! :) I'm still learning C++.
 
LVL 40

Expert Comment

by:evilrix
ID: 20276529
You are very welcome :)
 

Expert Comment

by:crazybrker
ID: 20276572
To your first question: you are currently taking everything from the start of s (your sentence) up to the first comma and printing it with cout. The issue is that the "." (period) is treated as part of the last word. So try adding it to your find_first_of call, i.e.
    while ((npos = (int)s.find_first_of(",.", lpos)) != string::npos )
Now it will split the sentence on every occurrence of a period or a comma.

As for your 2nd question, the function you are looking for is .length(), i.e.

cout << "The length of word is " << word.length() << " characters.\n";
or
if (word.length() > 5)
    do whatever...
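
One side effect worth knowing: with the ",." delimiter set and the comma appended by s += ',', the input ends in ".," and the loop from the question yields one extra empty word after "exchange". A sketch that splits on any delimiter in a set and skips empty tokens is shown below. The function name split_words and its parameters are hypothetical, not from this thread.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): splits s on any character
// found in delims and skips empty tokens.
std::vector<std::string> split_words(const std::string& s, const std::string& delims)
{
    std::vector<std::string> words;
    std::string::size_type start = 0;
    while (start < s.size())
    {
        std::string::size_type end = s.find_first_of(delims, start);
        if (end == std::string::npos)
            end = s.size(); // last token runs to the end of the string
        if (end > start)    // skip empty tokens, e.g. after a trailing '.'
            words.push_back(s.substr(start, end - start));
        start = end + 1;
    }
    return words;
}
```

With this approach no trailing comma has to be appended, since the last token simply runs to the end of the string.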

Author Comment

by:moonskyland
ID: 20276854
crazybrker, thank you for the help as well. I checked and it works too. I thought it was not possible to write ",." (two characters); I mean, I thought it would only work for input like word1,.word2,.word3,. and so on.
Thanks for the help again! :-)
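
The confusion here is between two std::string members: find_first_of treats its argument as a set of single characters and stops at the first character that matches any of them, while find looks for the whole argument as one sequence. A small sketch of the difference (the two wrapper function names are hypothetical):

```cpp
#include <string>

// Position of the first ',' OR '.' in s (any single character of the set).
std::string::size_type first_of_set(const std::string& s)
{
    return s.find_first_of(",.");
}

// Position of the first occurrence of the two-character sequence ",.".
std::string::size_type first_of_sequence(const std::string& s)
{
    return s.find(",.");
}
```

For "dog,cat." the first function stops at the comma, while the second finds nothing and returns std::string::npos, because the characters ',' and '.' never appear next to each other in that order.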
 
LVL 40

Expert Comment

by:evilrix
ID: 20278274
Of course, the next problem you'll hit is that other non-alpha characters will also end up in your words. You would have to list them all in the find_first_of call, but then they become word separators, when all you probably want is to filter them out. This is why I provided a code snippet that shows how to filter these out rather than split on them.
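
For comparison, the same filtering idea can be sketched with the standard erase/remove_if idiom instead of a hand-written iterator loop. This is only an alternative sketch: the names not_alpha and keep_letters are hypothetical, and std::isalpha depends on the current C locale, whereas the range check in the earlier snippet accepts only a-z and A-Z.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical predicate: true for any character that is not a letter.
bool not_alpha(char c)
{
    return !std::isalpha(static_cast<unsigned char>(c));
}

// Hypothetical helper: returns a copy of word with all non-letters removed.
std::string keep_letters(std::string word)
{
    // remove_if moves the kept characters to the front; erase drops the rest.
    word.erase(std::remove_if(word.begin(), word.end(), not_alpha), word.end());
    return word;
}
```

Inside the splitting loop this would be used as word = keep_letters(word); before printing.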
 
LVL 53

Expert Comment

by:Infinity08
ID: 20278315
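btw, these headers :

#include <iostream.h>
#include <string.h>

are not the right ones here. <iostream.h> is a pre-standard header that is not part of standard C++, and <string.h> is the C string library header, which does not declare std::string. Use these instead :

#include <iostream>
#include <string>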
 
LVL 40

Expert Comment

by:evilrix
ID: 20278346
As per my example code :)
 
LVL 53

Expert Comment

by:Infinity08
ID: 20278428
I know ... Just wanted to say it explicitly.
 
LVL 40

Expert Comment

by:evilrix
ID: 20278474
You just wanted the final word :-p
 
LVL 53

Expert Comment

by:Infinity08
ID: 20278480
Mmmm ... did I ? ;)
 

Author Comment

by:moonskyland
ID: 20278846
The final word is mine :D Thanks to all for the help.
 
LVL 40

Expert Comment

by:evilrix
ID: 20278859
You are very welcome --- doh! :-s