Solved

need to parse in c++

Posted on 2004-10-07
Last Modified: 2010-04-17
I need to parse a file in C++ and am wondering whether there are any good functions out there to help with that. I'm using Visual Studio 6.0. Any help would be great.
Question by:vonsolo
 
LVL 1

Expert Comment

by:Feldspar
ID: 12255500
For general-purpose text parsing you can use regular expressions via the regex library. It is very versatile, although certain cases require extra coding (for example, making sure an arbitrary number of parentheses are matched, so it would be difficult to parse complex mathematical expressions). If you are using Linux or Unix, the regexp library is probably already on your machine; otherwise you can get a Windows build at http://sourceforge.net/project/showfiles.php?group_id=23617&package_id=57165
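As a rough illustration of the regular-expression approach, here is a minimal sketch. It assumes a C++11 compiler and uses `std::regex` from the standard library instead of the regex library linked above, so it will not build on Visual Studio 6.0. The helper name `find_pairs` and the `name=value` input format are made up for the example.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: collects every name=value pair found in `text`.
// Assumes names and values consist only of letters, digits and underscores.
std::vector<std::pair<std::string, std::string>> find_pairs(const std::string& text) {
    static const std::regex pattern(R"((\w+)=(\w+))");
    std::vector<std::pair<std::string, std::string>> result;
    // sregex_iterator walks over each non-overlapping match in turn.
    for (std::sregex_iterator it(text.begin(), text.end(), pattern), end; it != end; ++it) {
        result.emplace_back((*it)[1].str(), (*it)[2].str());
    }
    return result;
}
```

Calling `find_pairs("width=10 height=20")` should yield the two pairs `width`/`10` and `height`/`20`. A flat pattern like this is where regular expressions are comfortable; arbitrarily nested input, such as balanced parentheses, needs extra code around it.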
 
LVL 11

Expert Comment

by:pratap_r
ID: 12260648
Since you are using VS 6.0, your best bet is to use the CRT string functions: load the file into a buffer and parse it from there. :-) How much processing are you expecting in the parsing? VS 6.0 does not have support for regex as Feldspar suggested for Linux; however, VS 7.0 or .NET supports them.

You could use the traditional fopen, fread, fseek, strtok, strchr, strstr, etc. for this!
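A minimal sketch of that buffer-and-scan approach, using only `fopen`, `fread`, `strstr` and `strlen` from the C runtime. The helper names `load_file` and `text_between` are made up for illustration, and the code is written for a current compiler (it uses `nullptr` and the `std::` prefixed headers), so it may need small adjustments for VS 6.0.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical helper: reads a whole file into memory with the C runtime.
// Returns an empty string if the file cannot be opened.
std::string load_file(const char* path) {
    std::string data;
    std::FILE* fp = std::fopen(path, "rb");
    if (!fp) return data;
    char chunk[4096];
    std::size_t n;
    while ((n = std::fread(chunk, 1, sizeof chunk, fp)) > 0) {
        data.append(chunk, n);
    }
    std::fclose(fp);
    return data;
}

// Hypothetical helper: returns the text between the first `open` marker found
// at or after `from` and the next `close` marker. On success *next points just
// past the close marker so the caller can continue scanning; on failure *next
// is set to nullptr and an empty string is returned.
std::string text_between(const char* from, const char* open,
                         const char* close, const char** next) {
    *next = nullptr;
    const char* start = std::strstr(from, open);
    if (!start) return std::string();
    start += std::strlen(open);
    const char* stop = std::strstr(start, close);
    if (!stop) return std::string();
    *next = stop + std::strlen(close);
    return std::string(start, stop - start);
}
```

A caller would typically pass `load_file(path).c_str()` as the starting point and keep calling `text_between` with the returned `next` pointer until it comes back null. `strtok` is left out on purpose here, since it modifies the buffer it scans.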

Please clarify your exact requirements; I can probably come up with some sample code. :)

enjoy
pratap
 

Author Comment

by:vonsolo
ID: 12260745
The file is an XML file and looks something like this:
<article mdate="2002-10-03" key="tr/dec/SRC1997-018">
<editor>Paul R. McJones</editor>
<title>The 1995 SQL Reunion: People, Project, and Politics, May 29, 1995.</title>
<journal>Digital System Research Center Report</journal>
<volume>SRC1997-018</volume>
<year>1997</year>
<ee>db/labs/dec/SRC1997-018.html</ee>
<ee>http://www.mcjones.org/System_R/SQL_Reunion_95/</ee>
<cdrom>decTR/src1997-018.pdf</cdrom>
</article>

<article mdate="2002-10-03" key="tr/gte/TR-0263-08-94-165">
<ee>db/labs/gte/TR-0263-08-94-165/html</ee>
<author>Frank Manola</author>
<title>An Evaluation of Object-Oriented DBMS Developments: 1994 Edition</title>
<journal>GTE Laboratories Incorporated</journal>
<volume>TR-0263-08-94-165</volume>
<month>August</month>
<year>1994</year>
<url>db/labs/gte/index.html#Tr</url>
<cdrom>GTE/src1997-018.pdf</cdrom>
</article>

Not all the sections start with the article tag. I need to extract what is in the title tag, for those that come from articles.

So my results from the top two entries will be:
The 1995 SQL Reunion: People, Project, and Politics, May 29, 1995.
An Evaluation of Object-Oriented DBMS Developments: 1994 Edition

I hope that helps
 
LVL 1

Expert Comment

by:suryaxchange
ID: 12265246
In C++, you can use fstream to copy the data into a buffer. Use the STL string for parsing; it has many functions that can make your work easier. fstream and string are located in namespace std.
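A minimal sketch of the fstream-plus-string idea, applied to the sample data posted above. The function names `article_titles` and `article_titles_in_file` are made up for illustration. This is plain substring searching, not a real XML parser: it assumes each article holds at most one `<title>` with no attributes, that `<article` does not also prefix some other tag name, and it does not decode entities such as `&amp;`.

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: returns the text of the first <title> element inside
// each <article ...> ... </article> block of `xml`. Titles that appear
// outside an article block are ignored.
std::vector<std::string> article_titles(const std::string& xml) {
    const std::string open_article = "<article";
    const std::string close_article = "</article>";
    const std::string open_title = "<title>";
    const std::string close_title = "</title>";

    std::vector<std::string> titles;
    std::string::size_type pos = 0;
    while ((pos = xml.find(open_article, pos)) != std::string::npos) {
        std::string::size_type end = xml.find(close_article, pos);
        if (end == std::string::npos) break;  // unterminated article: stop
        std::string::size_type t = xml.find(open_title, pos);
        if (t != std::string::npos && t < end) {
            t += open_title.size();
            std::string::size_type e = xml.find(close_title, t);
            if (e != std::string::npos && e < end) {
                titles.push_back(xml.substr(t, e - t));
            }
        }
        pos = end + close_article.size();  // continue after this article
    }
    return titles;
}

// Hypothetical helper: loads the whole file through an ifstream, then scans it.
// An unreadable file simply produces an empty result.
std::vector<std::string> article_titles_in_file(const char* path) {
    std::ifstream in(path);
    std::string xml((std::istreambuf_iterator<char>(in)),
                    std::istreambuf_iterator<char>());
    return article_titles(xml);
}
```

Run over the two sample entries, this should return the two title strings listed in the question. For anything beyond simple, regular input like this, a proper XML parser is the safer choice.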

Surya.
 
LVL 11

Accepted Solution

by:
pratap_r earned 50 total points
ID: 12265467
Hmm, this is XML parsing. Since you are using VS 6.0 and C++, maybe you should use the XML DOM for this, which is much easier.

try this
http://www.devhood.com/tutorials/tutorial_details.aspx?tutorial_id=608

The XPath for your problem will be //article/title

That should solve your problem. Let me know if you need more help; I'll try to post a code snippet.

Enjoy
Pratap
