Solved

Looking for sample program

Posted on 2007-04-06
13
246 Views
Last Modified: 2010-04-01
Hi all,
 I want you to find a program which can count words, and then make it also tell you the line numbers where the words occur.
 
 If you post the source code here( for future users too), please give me the references.  

PS: any person who need to edit the code in order to meet the requirements and have references will get 500 points.
0
Comment
Question by:valleytech
  • 6
  • 4
  • 2
  • +1
13 Comments
 
LVL 45

Expert Comment

by:Kdo
ID: 18865337

Hi Valleytech,

If you're on a unix system, this can be done very easily with a shell script.  That's a lot easier than writing a custom program.


Kent
0
 
LVL 16

Expert Comment

by:AlexNek
ID: 18865485
0
 
LVL 16

Expert Comment

by:AlexNek
ID: 18865494
0
 

Author Comment

by:valleytech
ID: 18868283
well, i want to it in c program.
 alexnek, those links you gave me only count word. However, I need to see for each particular word occurs at which those line numbers. Thanks.
please help.
0
 
LVL 3

Expert Comment

by:Darrylsh
ID: 18868405
create a map with the word as key and a vector of integers as the value.
map <string, vector<int> > wordcount;
now read each line.  
for each word in the line push back the line number into the vector whose key is the word.
wordcount[a_word].push_back(line_number);

in the end the size of the vector will be your count:  wordcount[word_in_list].size();
and the vector will be the index of all the lines it appears


0
 
LVL 3

Accepted Solution

by:
Darrylsh earned 500 total points
ID: 18868451
#include<iostream>
#include <vector>
#include <string>
#include <sstream>
#include <fstream>
#include <map>

using namespace std;

vector<string>  getTokens(string str);

int main()
{
      ifstream infile ("input.txt",ios::in);
      if (!infile.is_open())
      {
            cout << "Error Opening input File.\n";
            return 0;
      }      
      map <string, vector<int> > wordcount;
      int line_number = 1;
      string line;
      while( getline(infile, line,'\n'))
      {
            vector<string> words = getTokens(line);
            for (int i = 0;i < words.size();++i)
            {
                  wordcount[words[i]].push_back(line_number);
            }
            ++line_number;
      }
      map <string, vector<int> >::iterator it;
      for (it = wordcount.begin();it!= wordcount.end(); ++it)
      {
            cout << "word=" << it->first <<": count=" <<(it->second).size() << ": line numbers= ";
            for (int i = 0; i < (it->second).size();++i)
            {
                  cout << (it->second).at(i)<< " ";
            }
            cout << endl;
      }

}
vector<string>  getTokens(string str)
{
      string buf;  // Have a buffer string
      stringstream ss(str);  // Insert the string into a stream
      vector<string> tokens;  // Create vector to hold our words
      while (ss >> buf)
            tokens.push_back(buf);
      return tokens;
}
0
Is Your Active Directory as Secure as You Think?

More than 75% of all records are compromised because of the loss or theft of a privileged credential. Experts have been exploring Active Directory infrastructure to identify key threats and establish best practices for keeping data safe. Attend this month’s webinar to learn more.

 

Author Comment

by:valleytech
ID: 18868696
wow Darrylsh.
 You are so good. I just thought of binary tree with the node include the line number and word. Let me run yours. Thanks.
0
 
LVL 16

Expert Comment

by:AlexNek
ID: 18869166
> I need to see for each particular word occurs at which those line numbers.
It is a base, I think it is easy to modify it.
If you want it in C that would be better to use a tree. With node like this one
struct Node{
char Letter;
BOOL EndOfWord;
int LineNumber;
int WordCount;
Node* pNext;
Node* pChildren;
} Node;

It is not Memory optimized but easy to realize;
It is possibel to use sorted array too with items
stuct Item
{
char* pWord;
int LineNumber;
int WordCount;
}
0
 

Author Comment

by:valleytech
ID: 18870225
do you have the sample code for it? Thanks.
0
 
LVL 16

Expert Comment

by:AlexNek
ID: 18870350
No, I haven't but I can't see a big problem to write it.
I'll try to write a pseudo code for it.
0
 
LVL 16

Expert Comment

by:AlexNek
ID: 18871999
I'm sorry, I have no time to write something but I've found the error in my data construction.
We need to have an array for the LineNumber, one number for each word. For C it is possible to have a list
0
 

Author Comment

by:valleytech
ID: 18872479
it's ok alexnek.
0
 
LVL 16

Expert Comment

by:AlexNek
ID: 18873198
I did it! But I hate C now. ... The code could be better.

typedef struct LineNumNode {
      int m_nLineNumber;
      LineNumNode* m_pNext;
}LineNumNode;

typedef struct Node {
      char m_Letter;
      bool m_EndOfWord;
      LineNumNode* m_pLineNumRoot;
      int m_WordCount;
      Node* m_pNext;
      Node* m_pChildren;
} Node;

Node* FindSymbolNext(Node* pStart, char Symbol)
{
      Node* pCurrent = pStart;
      Node* Find = NULL;
      if (pStart!= NULL)
      {
        do
        {
           if (pCurrent->m_Letter == Symbol)
             {
                   Find = pCurrent;
                   break;
             }
         pCurrent = pCurrent->m_pNext;
        }
        while (pCurrent != NULL);
      }
      return Find;
}

Node* FindSymbolChild(Node* pStart, char Symbol)
{
      Node* pCurrent = pStart;
      Node* Find = NULL;
      if (pStart!= NULL)
      {
        do
        {
           if (pCurrent->m_Letter == Symbol)
             {
                   Find = pCurrent;
                   break;
             }
             pCurrent = pCurrent->m_pChildren;
        }
        while (pCurrent != NULL);
      }
      return Find;
}

void AddLineNumber(Node* pNode, int LineNumber)
{
      if (pNode != NULL)
      {
            LineNumNode* pNumNode = NULL;
            pNumNode = new LineNumNode();
            //Set all fields to 0
            memset(pNumNode,0, sizeof(LineNumNode));

            if (pNode->m_pLineNumRoot == NULL)
            {
                  pNode->m_pLineNumRoot = pNumNode;
            }
            else
            {
                  LineNumNode* pCurrNumNode = pNode->m_pLineNumRoot;
                  while (pCurrNumNode->m_pNext != NULL)
                  {
                        pCurrNumNode = pCurrNumNode->m_pNext;
                  }
                  pCurrNumNode->m_pNext = pNumNode;
            }
            pNumNode->m_nLineNumber = LineNumber;
      }
}

Node* GetLastWordNode(Node* pStart)
{
      Node* pCurrent = pStart;
      Node* Find = pCurrent;
      if (pStart!= NULL)
      {
            do
            {
                  Find = pCurrent;
                  pCurrent = pCurrent->m_pNext;
            }
            while (pCurrent != NULL);
      }
      return Find;
}

//Test Line
//
//this is simple text
//
//tree structure - next line is Next, symbols on the same line is children
//root
//!
//t->h->i->s
//!  !
//!  e->x->t
//i->s
//!
//s->a->m->p->l->e
//
void test03()
{
      Node* Root = new Node();
      Node* LastNode = Root;
      Node* FoundNode = NULL;
      Node* CurrentNode = NULL;
      //Set all fields to 0
      memset(Root,0, sizeof(Node));
      
      int LineNumber = 0;

      char* WordDelimiters = " ;-.,\t\n";
      char* Test = "  This is  sample Text is Trick";
    char Ch;
      bool NewWord = true;
      bool BeginOfLine = true;
      bool MiddleOfWord = false;

      //skip delimiters
      while (*Test!= '\0' && strchr(WordDelimiters, *Test))
      {
            Test++;
      }

      while (*Test != '\0')
      {
        Ch = *Test;
            //Word Delimiter found
            if (strchr(WordDelimiters, Ch) != NULL)
            {
                  NewWord = true;
                  LastNode->m_WordCount++;
                  AddLineNumber(LastNode,LineNumber);
                  LastNode = Root;
      
                  //skip delimiters
                  while (*Test!= '\0' && strchr(WordDelimiters, *Test))
                  {
                        Test++;
                  }
                  Ch = *Test;
                  FoundNode = FindSymbolNext(LastNode,Ch);
            }
            else
            {
                  FoundNode = FindSymbolChild(LastNode,Ch);
            }

            if (FoundNode != NULL)
            {
                  LastNode = FoundNode;
                  MiddleOfWord = true;
            }
            else
            {
                  Node* CurrentNode = new Node();
                  //Set all fields to 0
                  memset(CurrentNode,0, sizeof(Node));
                  CurrentNode->m_Letter = Ch;
                  if (NewWord || BeginOfLine)
                  {
                        if (MiddleOfWord)
                        {
                              LastNode = GetLastWordNode(LastNode->m_pChildren);
                        }
                        else
                        {
                              LastNode = GetLastWordNode(Root);
                        }
                        if (LastNode != NULL)
                        {
                              LastNode->m_pNext = CurrentNode;
                        }
                  }
                  else
                  {
                        LastNode->m_pChildren = CurrentNode;
                  }
                  NewWord = false;
                  MiddleOfWord = false;
                  LastNode = CurrentNode;
            }
            BeginOfLine = false;
            Test++;
      }
      //set values for the last word in line
      LastNode->m_WordCount++;
      AddLineNumber(LastNode,LineNumber);

      //print all words
      PrintAllWords(Root,NULL);
}
0

Featured Post

Is Your Active Directory as Secure as You Think?

More than 75% of all records are compromised because of the loss or theft of a privileged credential. Experts have been exploring Active Directory infrastructure to identify key threats and establish best practices for keeping data safe. Attend this month’s webinar to learn more.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Preface I don't like visual development tools that are supposed to write a program for me. Even if it is Xcode and I can use Interface Builder. Yes, it is a perfect tool and has helped me a lot, mainly, in the beginning, when my programs were small…
This article shows you how to optimize memory allocations in C++ using placement new. Applicable especially to usecases dealing with creation of large number of objects. A brief on problem: Lets take example problem for simplicity: - I have a G…
The viewer will learn how to use the return statement in functions in C++. The video will also teach the user how to pass data to a function and have the function return data back for further processing.
The viewer will be introduced to the technique of using vectors in C++. The video will cover how to define a vector, store values in the vector and retrieve data from the values stored in the vector.

867 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

12 Experts available now in Live!

Get 1:1 Help Now