Solved

Parsing Text Using C

Posted on 2011-03-01
Last Modified: 2012-05-11
I'm looking to create a (hopefully simple) program in C that looks in the folder C:/test, goes through all of the text files in that folder, and for each one parses the data into a Word document.

I'm pretty rough on C programming, any ideas? Also, what's the best free compiler?

For now I just want to get the concept started; I'll ask additional questions down the line about how to properly parse the specifics in which I'm interested.

Thanks!
Question by:PGRBryant
18 Comments
 
LVL 86

Assisted Solution

by:jkr
jkr earned 125 total points
Well, listing text files in a directory is the easier part; that could be done like the following (see also http://msdn.microsoft.com/en-us/library/aa365200%28VS.85%29.aspx - "Listing the Files in a Directory"):
#include <windows.h>
#include <tchar.h>
#include <stdio.h>

int main () {

	WIN32_FIND_DATA fd;

	// FindFirstFile returns INVALID_HANDLE_VALUE (not NULL) on failure
	HANDLE hFind = FindFirstFile(_T("c:\\path\\*.txt"),&fd);

	if (INVALID_HANDLE_VALUE == hFind) return 1;

	do {

		_tprintf(_T("File: %s\n"),fd.cFileName);

	} while (FindNextFile(hFind,&fd));

	FindClose(hFind);

	return 0;
}



However - how would you want these files to be added to a Word document?
 
LVL 4

Assisted Solution

by:parnasso
parnasso earned 125 total points
Once you have listed the files within a directory, you can create a new doc file and add content paragraphs to it with COM Word Automation in C++.

There is a very nice example of COM Word automation in the following link http://1code.codeplex.com/releases/view/59632#DownloadId=201236

The sample creates a new doc file and adds some paragraphs to it.

Hope this suits your needs
 
LVL 32

Assisted Solution

by:sarabande
sarabande earned 250 total points
visual studio 2010 express is a good and free C and C++ compiler.

you also could use the cygwin or mingw compilers/ide, which already include dirent.h. dirent.h is a portable means to retrieve a list of text files from a folder.
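As a rough illustration of the dirent.h approach, here is a minimal sketch. The helper names `has_suffix` and `list_text_files` are made up for this example, and the suffix match is case-sensitive, so a file named NOTES.TXT would not be picked up.

```cpp
#include <dirent.h>   // POSIX directory API, also shipped with Cygwin and MinGW
#include <string>
#include <vector>

// Hypothetical helper: true if `name` ends with `suffix` (case-sensitive).
static bool has_suffix(const std::string& name, const std::string& suffix) {
    return name.size() >= suffix.size() &&
           name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Hypothetical helper: collect the names of all entries in `dir` ending in ".txt".
// Returns an empty vector if the directory cannot be opened.
std::vector<std::string> list_text_files(const std::string& dir) {
    std::vector<std::string> names;
    DIR* d = opendir(dir.c_str());
    if (!d) return names;
    while (struct dirent* entry = readdir(d)) {
        std::string name = entry->d_name;
        if (has_suffix(name, ".txt")) names.push_back(name);
    }
    closedir(d);
    return names;
}
```

Calling `list_text_files("c:/test")` would then give the file names to open one by one; the names are relative to the directory, so the folder path has to be prepended before opening each file.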

the word document can be done via com and automation like parnasso said. but that probably is not the easiest way. you might think of putting the data into an intermediate database and getting the data from there into word/excel. or writing an rtf file from your data, which can then more easily be converted to a word document.
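To give an idea of the rtf route, here is a minimal sketch that builds an RTF document as a string, one paragraph per item. The helper names `rtf_escape` and `make_rtf` are made up for this example, the font choice is arbitrary, and the sketch assumes plain ASCII text (other characters would need additional RTF escapes).

```cpp
#include <string>
#include <vector>

// Hypothetical helper: escape the three characters that are special in RTF.
std::string rtf_escape(const std::string& text) {
    std::string out;
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        char c = text[i];
        if (c == '\\' || c == '{' || c == '}') out += '\\';
        out += c;
    }
    return out;
}

// Hypothetical helper: build a minimal RTF document, one paragraph per item.
// Assumes ASCII-only input.
std::string make_rtf(const std::vector<std::string>& paragraphs) {
    // Header: RTF version 1, ANSI charset, one font (f0), 12pt text (fs24).
    std::string doc = "{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0 Arial;}}\\f0\\fs24 ";
    for (std::vector<std::string>::size_type i = 0; i < paragraphs.size(); ++i) {
        doc += rtf_escape(paragraphs[i]);
        doc += "\\par\n";   // end of paragraph
    }
    doc += "}\n";
    return doc;
}
```

Writing the returned string to a file with an .rtf extension (for example through `std::ofstream`) gives a file that Word can open and save as a .doc.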

Sara
 
LVL 11

Expert Comment

by:DeepuAbrahamK
To convert a text file into a Word document, try this:


char szOrigFilename[MAX_PATH];
char szNewFilename[MAX_PATH];

strcpy(szOrigFilename,"c:\\test\\text.txt");
strcpy(szNewFilename,"c:\\test\\text.doc");

// Note: MoveFile only renames the file; the contents remain plain text
BOOL bRename = MoveFile(szOrigFilename,szNewFilename);

 
LVL 1

Author Comment

by:PGRBryant
Okay Sarabande, the point of this process is to parse out tidbits of information that are stored in the text file and then automatically convert them and display them as needed.

I've attached a really silly .txt file that has basically random information.

But let's say that I wanted to pull out the "1" the "#" and the "Tom".

And place those into a word document, how would I do that?

See, down the line I'm actually going to be parsing the data from a series of text files and pasting them into another program. But in order to get there I'm trying to start with something relatively simple.
Test.txt
 
LVL 1

Author Comment

by:PGRBryant
I didn't mean to exclude the names of you other experts, my apologies.

@jkr: I'm not sure what you're doing here? You're looking through a folder for all the .txt files and then you're printing them to what?

Let's say the txt files are all in the directory c:\test and the filename of the Word document is Parsing.doc in the same folder.

@Parnasso, for some reason my visual basic express didn't know what to do with the file that you supplied, it said "conversion failed" and wouldn't load your example... although that does look very interesting.

@DeepuAbrahamK, what are the #includes and main type for that code? I'm assuming in your pseudocode that szOrigFilename is supposed to be the name of my .txt file, so let's say test.txt, and szNewFilename is the name of my Word document, so let's say parsing.doc? Yes?
 
LVL 1

Author Comment

by:PGRBryant
Visual C++ express, not basic*... I really wish I could edit posts.
 
LVL 32

Expert Comment

by:sarabande
to parse the text file you would define a struct

struct Test
{
    int title1;
    std::string title2;
    std::string title3;
};

then open text file

   std::ifstream  testfile("test.txt");

and read line by line like

  Test record;
   std::string s;
   // ignore title line
   std::getline(testfile, s);

   std::vector<Test> alltests;
   while (testfile >> record.title1 >> record.title2 >> record.title3 )
   {
        // remove here leading and trailing spaces from title2 and title3
       ...
       alltests.push_back(record);
   }

the word part is not so easy.

I would suggest you store those data into ms access table  or .csv and get the data from there into your word doc. but i am no expert in word or automation.
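For the .csv route, a minimal sketch could look like the following. It reuses the `Test` struct from above; the helper names `csv_field` and `write_csv` and the header row are assumptions made for this example. Text fields are always quoted, with embedded quotes doubled, which is the usual CSV convention.

```cpp
#include <ostream>
#include <sstream>   // std::ostringstream, handy for trying this out in memory
#include <string>
#include <vector>

struct Test
{
    int title1;
    std::string title2;
    std::string title3;
};

// Hypothetical helper: wrap a text field in quotes and double any embedded quotes.
std::string csv_field(const std::string& s) {
    std::string out = "\"";
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        if (s[i] == '"') out += '"';
        out += s[i];
    }
    out += '"';
    return out;
}

// Hypothetical helper: write a header row followed by one row per record.
void write_csv(std::ostream& out, const std::vector<Test>& rows) {
    out << "title1,title2,title3\n";
    for (std::vector<Test>::size_type i = 0; i < rows.size(); ++i) {
        out << rows[i].title1 << ','
            << csv_field(rows[i].title2) << ','
            << csv_field(rows[i].title3) << '\n';
    }
}
```

Passing a `std::ofstream` opened on something like "tests.csv" writes the file to disk; Excel and Access can both import such a file.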

Sara

 
LVL 1

Author Comment

by:PGRBryant
Okay, so let's back it up a bit and perhaps make it simpler... I'll ask more questions down the line.

Let's start with one .txt file and create a new txt file just with the data that I want parsed into it?

From what I gather of your code it should look something like the following, assuming I'm writing a win32 console application w/ defaults on in Visual C++ Express 2010.


#include "stdafx.h"
#include "targetver.h"
#include <iostream>
#include <fstream>   // std::ifstream
#include <string>    // std::string, std::getline
#include <vector>    // std::vector

int _tmain(int argc, _TCHAR* argv[])
{
   struct Test //define parts of interest
   {
       int title1;
       std::string title2;
       std::string title3;
   };	
   
   std::ifstream testfile("test.txt"); //Open text file
    
   Test record;
   std::string s;
   std::getline(testfile, s); // read and discard the title line
    
   std::vector<Test> alltests;
   while (testfile >> record.title1 >> record.title2 >> record.title3 )
   {
    // remove here leading and trailing spaces from title2 and title3
    ... // what goes here?
    alltests.push_back(record);
    }

   return 0;
}

 
LVL 86

Expert Comment

by:jkr
>>    // remove here leading and trailing spaces from title2 and title3
>>    ... // what goes here?

A call to a function like the following to do that for each string variable:

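void trim_whitespace(std::string& x) {

  // Check for an empty string before looking at either end
  while (!x.empty() && ' ' == (*x.begin())) x.erase(x.begin());
  // erase() needs a forward iterator, so use end() - 1 for the last character
  while (!x.empty() && ' ' == (*x.rbegin())) x.erase(x.end() - 1);

}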

 
LVL 32

Accepted Solution

by:sarabande
sarabande earned 250 total points
the member strings title2 and title3 will have spaces because of

   testfile >> record.title1 >> record.title2 >> record.title3

which does not extract the spaces (blanks) of your lines.

you can remove them using the following helper

// returns by value; a reference to a local temporary would dangle
std::string trim(const std::string & s)
{
    int n1 = (int)s.find_first_of(" \t");
    if (n1 == (int)std::string::npos)
        return std::string();  // empty string
    int n2 = (int)s.find_last_of(" \t");
    if (n2 == (int)std::string::npos)
        n2 = (int)s.length() -1;
    return s.substr(n1, n2+1-n1);
}

Sara
 
LVL 4

Expert Comment

by:parnasso
About my example: if it doesn't make the conversion, please create a new solution in your Visual Studio and add the cpp files to it. In all likelihood this is an issue with the Visual Studio versions.
 
LVL 32

Expert Comment

by:sarabande
parnasso, i also downloaded the zip file but it can't extract the files with my winzip (decompression failed) which is fairly new (version 14.5).

Sara
 
LVL 32

Expert Comment

by:sarabande
if using my trim helper you would need to replace find_first_of by find_first_not_of and find_last_of by find_last_not_of.
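With that replacement applied, a trim helper could look like the sketch below. It is written here with `std::string::size_type` instead of `int` and returns the result by value; both are choices made for this example. Note that `operator>>` into a `std::string` already skips leading whitespace and stops at the next whitespace, so trimming matters mostly for text read with `std::getline`.

```cpp
#include <string>

// Sketch of a trim helper: returns a copy of `s` without leading and
// trailing spaces or tabs. An all-blank or empty input gives an empty string.
std::string trim(const std::string& s)
{
    const char* blanks = " \t";
    std::string::size_type first = s.find_first_not_of(blanks);
    if (first == std::string::npos)
        return std::string();   // nothing but blanks
    std::string::size_type last = s.find_last_not_of(blanks);
    return s.substr(first, last - first + 1);
}
```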

Sara
 
LVL 1

Author Comment

by:PGRBryant
I'm still working on this, I don't quite understand what you guys are doing, I'm too green I guess, give me a bit to figure it out... in the meantime I think I've asked something simpler:
http://www.experts-exchange.com/Programming/Languages/C/Q_26866379.html
 
LVL 32

Expert Comment

by:sarabande
actually it was 4 questions:

  - search directory for text files
  - read text file
  - parse text lines
  - put results into word document

where each expert made (valid) comments to different parts.
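The two middle steps (read text file, parse text lines) can be sketched together as follows. The layout of Test.txt is not shown in the thread, so this assumes one title line followed by lines holding three whitespace-separated fields (a number and two words); the function name `parse_records` is made up for this example.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

struct Test
{
    int title1;
    std::string title2;
    std::string title3;
};

// Hypothetical helper: skip the title line, then parse each remaining line
// into a Test record. Lines that do not match the assumed layout are skipped.
std::vector<Test> parse_records(std::istream& in) {
    std::vector<Test> records;
    std::string line;
    std::getline(in, line);               // read and discard the title line
    while (std::getline(in, line)) {
        std::istringstream fields(line);  // parse one line at a time
        Test r;
        if (fields >> r.title1 >> r.title2 >> r.title3)
            records.push_back(r);
    }
    return records;
}
```

Taking a `std::istream&` means the same function works on a `std::ifstream` opened on "test.txt" and on a `std::istringstream` for quick experiments. Parsing line by line also keeps one malformed line from stopping the whole read, which is what happens when extracting directly from the file stream in the loop condition.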

Sara
 
LVL 1

Author Closing Comment

by:PGRBryant
Concur with sarabande, points split accordingly.

This question was poorly worded, and I got distracted with other projects and didn't come back until later to verify the experts' quality advice.