Solved

Looking for HTML Parser for C/C++

Posted on 2003-12-11
Last Modified: 2010-05-18
Does anyone know where I can get a good, free HTML parser for C/C++?
Question by:chenwei
7 Comments
 
LVL 17

Expert Comment

by:dorward
ID: 9919416
 

Author Comment

by:chenwei
ID: 9919452
Thanks for the info. I forgot to say that I am looking for an HTML parser for C/C++ that can be compiled with MS Visual Studio.

Dillo seems to be written for GNU C.

 

Expert Comment

by:jdewerth
ID: 9934032
http://www.thefreecountry.com/sourcecode/cpp.shtml

Try the C++ class libraries listed there.

Searching Google for "html parser" c++ also brings up all sorts of parsers.
 
LVL 1

Expert Comment

by:bsimser
ID: 9972294
There's a GPL piece of software called HTMLDOC. It's made for converting HTML to PDF or PostScript, but it has a pretty good HTML parser in it that can handle all the basic tags. While it's not exactly made for what you're trying to do, once the HTML is read in it is held in an object tree that you can do whatever you want with.

You can find the source here:
http://www.easysw.com/htmldoc/software.php

An alternative is to first run your HTML through HTML Tidy to create XHTML, then use a regular XML parser (like Xerces) to pull out what you want.

You can find HTML Tidy here:
http://tidy.sourceforge.net/

and Xerces here:
http://xml.apache.org/xerces-c/

Everything is open source and can probably give you what you want.

-Bil
 

Author Comment

by:chenwei
ID: 9972783
Thanks for all the links. I've found an HTML parser: libxml2.

There is no need to answer my question any more.
 
LVL 1

Accepted Solution

by:
Computer101 earned 0 total points
ID: 12515483
PAQed with points refunded (20)

Computer101
EE Admin