Solved

need to parse in c++

Posted on 2004-10-07
Last Modified: 2010-04-17
I need to parse a file in C++ and am wondering whether there are any good functions out there to help with that.  I'm using Visual Studio 6.0.  Any help would be great.
Question by:vonsolo
5 Comments
 
LVL 1

Expert Comment

by:Feldspar
ID: 12255500
For general-purpose text parsing you can use regular expressions via the regex library.  It is very versatile, although certain cases require extra coding (for example, making sure an arbitrary number of parentheses are matched, which makes it difficult to parse complex mathematical expressions).  If you are using Linux or Unix, the regex library is probably already on your machine; otherwise you can get a Windows build at http://sourceforge.net/project/showfiles.php?group_id=23617&package_id=57165
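
As a rough illustration of the regular-expression approach, here is a minimal sketch. It uses std::regex from C++11, which is an assumption made for the example: it is not the library linked above and is not available in Visual Studio 6.0. The helper name find_pairs and the key=value line format are hypothetical.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: collects every key=value pair found in one line of text.
// Assumes a C++11 compiler with a working <regex> implementation.
std::vector<std::pair<std::string, std::string> > find_pairs(const std::string& line)
{
    // A word, optional spaces around '=', then a run of non-space characters.
    static const std::regex pattern("(\\w+)\\s*=\\s*(\\S+)");

    std::vector<std::pair<std::string, std::string> > result;
    std::sregex_iterator it(line.begin(), line.end(), pattern);
    std::sregex_iterator end;
    for (; it != end; ++it) {
        // Sub-match 1 is the key, sub-match 2 is the value.
        result.push_back(std::make_pair((*it)[1].str(), (*it)[2].str()));
    }
    return result;
}
```

A flat pattern like this handles simple, non-nested input. Nested constructs such as balanced parentheses are where the extra coding mentioned above comes in.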
 
LVL 11

Expert Comment

by:pratap_r
ID: 12260648
Since you are using VS 6.0, your best bet is to use the CRT string functions: load the file into a buffer and parse it from there. :-) How much processing are you expecting in the parsing? VS 6.0 does not have the regex support that Feldspar suggested for Linux; however, VS 7.0 or .NET supports it.

You could use the traditional fopen, fread, fseek, strtok, strchr, strstr, etc. for this.
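
A minimal sketch of that buffer-and-search approach, assuming the goal is to pull out the text between two known markers. The helper names load_file and text_between are hypothetical, and a std::string is used as the buffer purely to avoid manual memory management.

```cpp
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical helper: reads a whole file into `out` with fopen/fseek/fread.
// Returns false if the file cannot be opened or fully read.
bool load_file(const char* path, std::string& out)
{
    std::FILE* fp = std::fopen(path, "rb");
    if (!fp) return false;

    std::fseek(fp, 0, SEEK_END);      // jump to the end to learn the size
    long size = std::ftell(fp);
    std::fseek(fp, 0, SEEK_SET);      // rewind before reading
    if (size < 0) { std::fclose(fp); return false; }

    out.resize(static_cast<std::size_t>(size));
    std::size_t got = 0;
    if (size > 0) got = std::fread(&out[0], 1, out.size(), fp);
    std::fclose(fp);
    return got == out.size();
}

// Hypothetical helper: returns the text between the first `open` marker and
// the next `close` marker in a NUL-terminated buffer, or "" if either is missing.
std::string text_between(const char* buffer, const char* open, const char* close)
{
    const char* start = std::strstr(buffer, open);
    if (!start) return std::string();
    start += std::strlen(open);       // skip past the opening marker

    const char* end = std::strstr(start, close);
    if (!end) return std::string();
    return std::string(start, end - start);
}
```

After a successful load_file(path, data), a call such as text_between(data.c_str(), "[", "]") returns whatever sits between the first pair of brackets.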

Please clarify your exact requirements, and I can probably come up with some sample code. :)

enjoy
pratap
 

Author Comment

by:vonsolo
ID: 12260745
The file is an XML file and looks something like this:
<article mdate="2002-10-03" key="tr/dec/SRC1997-018">
<editor>Paul R. McJones</editor>
<title>The 1995 SQL Reunion: People, Project, and Politics, May 29, 1995.</title>
<journal>Digital System Research Center Report</journal>
<volume>SRC1997-018</volume>
<year>1997</year>
<ee>db/labs/dec/SRC1997-018.html</ee>
<ee>http://www.mcjones.org/System_R/SQL_Reunion_95/</ee>
<cdrom>decTR/src1997-018.pdf</cdrom>
</article>

<article mdate="2002-10-03" key="tr/gte/TR-0263-08-94-165">
<ee>db/labs/gte/TR-0263-08-94-165/html</ee>
<author>Frank Manola</author>
<title>An Evaluation of Object-Oriented DBMS Developments: 1994 Edition</title>
<journal>GTE Laboratories Incorporated</journal>
<volume>TR-0263-08-94-165</volume>
<month>August</month>
<year>1994</year>
<url>db/labs/gte/index.html#Tr</url>
<cdrom>GTE/src1997-018.pdf</cdrom>
</article>

Not all the sections start with the article tag.  I need to extract what is in the title tag, for those that come from articles.

So my results from the top two entries will be:
The 1995 SQL Reunion: People, Project, and Politics, May 29, 1995.
An Evaluation of Object-Oriented DBMS Developments: 1994 Edition

I hope that helps
 
LVL 1

Expert Comment

by:suryaxchange
ID: 12265246
In C++, you can use fstream to copy the data into a buffer. Use the STL string for parsing; it has many functions that can make your work easy. fstream and string are both located in namespace std.
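
A minimal sketch of this idea applied to the sample file above, assuming every title sits directly between <title> and </title> inside an <article> ... </article> block. The helper names read_file and article_titles are hypothetical, and this is plain string searching, not a real XML parser.

```cpp
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads the whole file into a std::string.
// Returns an empty string if the file cannot be opened.
std::string read_file(const char* path)
{
    std::ifstream in(path, std::ios::in | std::ios::binary);
    std::ostringstream contents;
    contents << in.rdbuf();
    return contents.str();
}

// Hypothetical helper: returns the <title> text of every <article> element,
// skipping titles that belong to any other kind of element.
std::vector<std::string> article_titles(const std::string& xml)
{
    const std::string open_title = "<title>";
    const std::string close_title = "</title>";
    std::vector<std::string> titles;

    std::string::size_type pos = 0;
    while ((pos = xml.find("<article", pos)) != std::string::npos) {
        // Limit the title search to the current article block.
        std::string::size_type end = xml.find("</article>", pos);
        if (end == std::string::npos) break;

        std::string::size_type t = xml.find(open_title, pos);
        if (t != std::string::npos && t < end) {
            t += open_title.size();
            std::string::size_type te = xml.find(close_title, t);
            if (te != std::string::npos && te < end)
                titles.push_back(xml.substr(t, te - t));
        }
        pos = end;  // continue after this article
    }
    return titles;
}
```

Calling article_titles(read_file("yourfile.xml")) (the file name is a placeholder) should give one string per article. Note the limits of the sketch: it does not decode entities such as &amp;, and it would also match a tag whose name merely starts with "article".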

Surya.
 
LVL 11

Accepted Solution

by:
pratap_r earned 50 total points
ID: 12265467
Hmm, this is XML parsing. Since you are using VS 6.0 and C++, maybe you should use the XML DOM for this, which is much easier.

Try this:
http://www.devhood.com/tutorials/tutorial_details.aspx?tutorial_id=608

The XPath for your problem will be //article/title

That should solve your problem. Let me know if you need more help; I'll try to post a code snippet.

Enjoy
Pratap
