Solved

need to parse in c++

Posted on 2004-10-07
Last Modified: 2010-04-17
I need to parse a file in C++. I'm wondering whether there are any good functions out there to help with this. I'm using Visual Studio 6.0. Any help would be great.
Question by:vonsolo
5 Comments
 
LVL 1

Expert Comment

by:Feldspar
ID: 12255500
For general-purpose text parsing you can use regular expressions via the regex library. It is very versatile, although certain cases require extra coding (for example, making sure an arbitrary number of parentheses are matched, which makes it difficult to parse complex mathematical expressions). If you are using Linux or Unix, the regex library is probably already on your machine; otherwise you can get a Windows build at http://sourceforge.net/project/showfiles.php?group_id=23617&package_id=57165
 
LVL 11

Expert Comment

by:pratap_r
ID: 12260648
Since you are using VS 6.0, your best bet is to use the CRT string functions: load the file into a buffer and parse it from there. :-) How much processing are you expecting in the parsing? VS 6.0 does not have regex support like the library Feldspar suggested for Linux; however, VS 7.0 (.NET) supports it.

You could use the traditional fopen, fread, fseek, strtok, strchr, strstr, etc. for this!
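
A minimal sketch of the buffer-and-scan approach described above, using only C runtime calls (fopen, fread, strstr). The helper names load_file and next_between are hypothetical and chosen for illustration; the code assumes the file contains no embedded NUL bytes, since strstr stops at the first one.

```cpp
#include <cstdio>
#include <cstring>
#include <string>

// Read a whole file into `out` with the C runtime. Returns false if the
// file cannot be opened. (Hypothetical helper, not part of any library.)
bool load_file(const char* path, std::string& out)
{
    std::FILE* fp = std::fopen(path, "rb");
    if (!fp)
        return false;

    char chunk[4096];
    std::size_t n;
    out.erase();
    while ((n = std::fread(chunk, 1, sizeof chunk, fp)) > 0)
        out.append(chunk, n);

    std::fclose(fp);
    return true;
}

// Find the next occurrence of `open` at or after `cursor`, then the next
// `close` after it. On success, store the text in between in `out`, move
// `cursor` past `close`, and return true. (Hypothetical helper.)
bool next_between(const char*& cursor, const char* open,
                  const char* close, std::string& out)
{
    const char* start = std::strstr(cursor, open);
    if (!start)
        return false;
    start += std::strlen(open);

    const char* end = std::strstr(start, close);
    if (!end)
        return false;

    out.assign(start, end - start);
    cursor = end + std::strlen(close);
    return true;
}
```

To use it, load the file with load_file, set a const char* cursor to the buffer's c_str(), and call next_between in a loop until it returns false. Each call yields one delimited field and advances the cursor.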

Please clarify your exact requirements, and I can probably come up with some sample code. :)

enjoy
pratap
 

Author Comment

by:vonsolo
ID: 12260745
The file is an XML file and looks something like this:
<article mdate="2002-10-03" key="tr/dec/SRC1997-018">
<editor>Paul R. McJones</editor>
<title>The 1995 SQL Reunion: People, Project, and Politics, May 29, 1995.</title>
<journal>Digital System Research Center Report</journal>
<volume>SRC1997-018</volume>
<year>1997</year>
<ee>db/labs/dec/SRC1997-018.html</ee>
<ee>http://www.mcjones.org/System_R/SQL_Reunion_95/</ee>
<cdrom>decTR/src1997-018.pdf</cdrom>
</article>

<article mdate="2002-10-03" key="tr/gte/TR-0263-08-94-165">
<ee>db/labs/gte/TR-0263-08-94-165/html</ee>
<author>Frank Manola</author>
<title>An Evaluation of Object-Oriented DBMS Developments: 1994 Edition</title>
<journal>GTE Laboratories Incorporated</journal>
<volume>TR-0263-08-94-165</volume>
<month>August</month>
<year>1994</year>
<url>db/labs/gte/index.html#Tr</url>
<cdrom>GTE/src1997-018.pdf</cdrom>
</article>

Not all the sections start with the article tag. I need to extract what is in the title tag, for those that come from articles.

So my results from the top two entries will be:
The 1995 SQL Reunion: People, Project, and Politics, May 29, 1995.
An Evaluation of Object-Oriented DBMS Developments: 1994 Edition

I hope that helps
 
LVL 1

Expert Comment

by:suryaxchange
ID: 12265246
In C++, you can use fstream to copy the data into a buffer. Use the STL string for parsing; it has many functions that can make your work easier. fstream and string are both in namespace std.
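
A rough sketch of the fstream-plus-std::string approach, applied to the sample data posted above: it collects the text of the first title element inside each article block and ignores titles in other sections. The function names read_stream, article_titles, and titles_from_file are hypothetical. This is plain substring searching, not a real XML parser, so it assumes the tags appear exactly as in the sample (no attributes on title, no comments or CDATA, no nested articles) and it does not decode entities such as &amp;amp;.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Copy everything from a stream into a std::string.
std::string read_stream(std::istream& in)
{
    std::ostringstream buf;
    buf << in.rdbuf();
    return buf.str();
}

// Return the text of the first <title> inside each <article ...> block.
std::vector<std::string> article_titles(const std::string& xml)
{
    const std::string open_article  = "<article";
    const std::string close_article = "</article>";
    const std::string open_title    = "<title>";
    const std::string close_title   = "</title>";

    std::vector<std::string> titles;
    std::string::size_type pos = 0;
    while ((pos = xml.find(open_article, pos)) != std::string::npos) {
        std::string::size_type after = pos + open_article.size();
        // Skip longer tag names such as <articles>.
        if (after >= xml.size() ||
            (xml[after] != ' ' && xml[after] != '\t' && xml[after] != '\r' &&
             xml[after] != '\n' && xml[after] != '>')) {
            pos = after;
            continue;
        }
        std::string::size_type end = xml.find(close_article, after);
        if (end == std::string::npos)
            break;  // unterminated block: stop scanning

        std::string::size_type t = xml.find(open_title, after);
        if (t != std::string::npos && t < end) {
            t += open_title.size();
            std::string::size_type te = xml.find(close_title, t);
            if (te != std::string::npos && te < end)
                titles.push_back(xml.substr(t, te - t));
        }
        pos = end + close_article.size();
    }
    return titles;
}

// Convenience wrapper: returns an empty list if the file cannot be opened.
std::vector<std::string> titles_from_file(const char* path)
{
    std::ifstream in(path, std::ios::in | std::ios::binary);
    if (!in)
        return std::vector<std::string>();
    return article_titles(read_stream(in));
}
```

Calling titles_from_file with the path to the data file returns one string per article that has a title. For anything beyond this fixed layout, a proper XML parser is the safer choice.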

Surya.
 
LVL 11

Accepted Solution

by:
pratap_r earned 50 total points
ID: 12265467
Hmm, this is XML parsing. Since you are using VS 6.0 and C++, maybe you should use the XML DOM for this, which is much easier.

Try this:
http://www.devhood.com/tutorials/tutorial_details.aspx?tutorial_id=608

The XPath for your problem will be //article/title

That should solve your problem. Let me know if you need more help; I'll try to post a code snippet.

Enjoy
Pratap
