Solved

need to parse in c++

Posted on 2004-10-07
5
233 Views
Last Modified: 2010-04-17
I need to parse a file in C++.  Just wondering if there are any good functions out there to help with.  I'm using visual studio 6.0.  Any help would be great.
0
Comment
Question by:vonsolo
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
5 Comments
 
LVL 1

Expert Comment

by:Feldspar
ID: 12255500
For general purpose text-parsing you can use regular expressions via the the regex library.  It is very versatile although there are certain types of cases that require extra coding to work with (for example, making sure an arbitrary number of parenthesis are matched, so it would be difficult to parse complex mathematical expressions).  If you are using linux unix then the regexp library is probably already on your machine, or you can get a windows build at http://sourceforge.net/project/showfiles.php?group_id=23617&package_id=57165
0
 
LVL 11

Expert Comment

by:pratap_r
ID: 12260648
Since you are using VS6.0 your best bet is to use the string crt functions .. load the file into a buffer and parse it off.. :-) how much processing are you expecting in the parsing? vs6.0 does not have support for regex as suggested by Feldspar for linux.. however vs7.0 or .Net supports them.

you could use the traditional fopen, fread, fseek, strtok, strchr, strstr etc for this!!

please clarify your exact requirements, probably i can come up with a sample code.. :)

enjoy
pratap
0
 

Author Comment

by:vonsolo
ID: 12260745
the file is and xml file and looks something like this:
<article mdate="2002-10-03" key="tr/dec/SRC1997-018">
<editor>Paul R. McJones</editor>
<title>The 1995 SQL Reunion: People, Project, and Politics, May 29, 1995.</title>
<journal>Digital System Research Center Report</journal>
<volume>SRC1997-018</volume>
<year>1997</year>
<ee>db/labs/dec/SRC1997-018.html</ee>
<ee>http://www.mcjones.org/System_R/SQL_Reunion_95/</ee>
<cdrom>decTR/src1997-018.pdf</cdrom>
</article>

<article mdate="2002-10-03" key="tr/gte/TR-0263-08-94-165">
<ee>db/labs/gte/TR-0263-08-94-165/html</ee>
<author>Frank Manola</author>
<title>An Evaluation of Object-Oriented DBMS Developments: 1994 Edition</title>
<journal>GTE Laboratories Incorporated</journal>
<volume>TR-0263-08-94-165</volume>
<month>August</month>
<year>1994</year>
<url>db/labs/gte/index.html#Tr</url>
<cdrom>GTE/src1997-018.pdf</cdrom>
</article>

not all the sections start with the article tag.  I need to extract was is in the title tag, for those that come from articles.

So my resutls from the top two entries will be:
The 1995 SQL Reunion: People, Project, and Politics, May 29, 1995.
An Evaluation of Object-Oriented DBMS Developments: 1994 Edition

I hope that helps
0
 
LVL 1

Expert Comment

by:suryaxchange
ID: 12265246
In C++, You can use fstream for copying the data to the buffer.Use STL string for parsing it has many functions which can make your work easy.fstream and string are located in namespace std.

Surya.
0
 
LVL 11

Accepted Solution

by:
pratap_r earned 50 total points
ID: 12265467
hmm this is xml parsing.. since you are using vs6.0 and c++ may be you should use the xmldom for this.. which is much more easier...

try this
http://www.devhood.com/tutorials/tutorial_details.aspx?tutorial_id=608

and the xpath for your problem will be //article/title

that should solve your problem. let me know if you need more help.. ill try to post a code snip

Enjoy
Pratap

0

Featured Post

Free Tool: ZipGrep

ZipGrep is a utility that can list and search zip (.war, .ear, .jar, etc) archives for text patterns, without the need to extract the archive's contents.

One of a set of tools we're offering as a way to say thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

This article is meant to give a basic understanding of how to use R Sweave as a way to merge LaTeX and R code seamlessly into one presentable document.
Computer science students often experience many of the same frustrations when going through their engineering courses. This article presents seven tips I found useful when completing a bachelors and masters degree in computing which I believe may he…
An introduction to basic programming syntax in Java by creating a simple program. Viewers can follow the tutorial as they create their first class in Java. Definitions and explanations about each element are given to help prepare viewers for future …
Explain concepts important to validation of email addresses with regular expressions. Applies to most languages/tools that uses regular expressions. Consider email address RFCs: Look at HTML5 form input element (with type=email) regex pattern: T…

752 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question