Solved

Looking for HTML Parser for C/C++

Posted on 2003-12-11
Last Modified: 2010-05-18
Does anyone know where I can get a good, free HTML parser for C/C++?
Question by:chenwei
7 Comments
 
LVL 17

Expert Comment

by:dorward
ID: 9919416
 

Author Comment

by:chenwei
ID: 9919452
Thanks for the info. I forgot to say that what I am looking for is an HTML parser for C/C++ that can be compiled with MS Visual Studio.

Dillo seems to be written for GNU C.

 

Expert Comment

by:jdewerth
ID: 9934032
http://www.thefreecountry.com/sourcecode/cpp.shtml

Try the C++ class libraries listed there.

A Google search for "html parser" c++ also brings up all sorts of parsers.
 
LVL 1

Expert Comment

by:bsimser
ID: 9972294
There's a GPL-licensed program called HTMLDOC. It's made for converting HTML to PDF (or RTF) files, but it has a pretty good HTML parser in it and can handle all the basic tags. While it's not exactly made for what you're trying to do, once you read in the HTML it's all arranged in an object tree that you can do whatever you want with.

You can find the source here:
http://www.easysw.com/htmldoc/software.php

An alternative is to first run your HTML through HTML Tidy to create XHTML, then use a regular XML parser (like Xerces) to parse out what you want.

You can find HTML Tidy here:
http://tidy.sourceforge.net/

and Xerces here:
http://xml.apache.org/xerces-c/

Everything is open source and can probably give you what you want.

-Bil
 

Author Comment

by:chenwei
ID: 9972783
Thanks for all the links. I've found an HTML parser: libxml2.

Please don't answer my question any more.
 
LVL 1

Accepted Solution

by:
Computer101 earned 0 total points
ID: 12515483
PAQed with points refunded (20)

Computer101
EE Admin