Solved

Looking for HTML Parser for C/C++

Posted on 2003-12-11
7
7,530 Views
Last Modified: 2010-05-18
Who knows where can I get a good HTML-Parser for C/C++ for free?
0
Comment
Question by:chenwei
7 Comments
 
LVL 17

Expert Comment

by:dorward
Comment Utility
0
 

Author Comment

by:chenwei
Comment Utility
Thanks for the info. I forget to say what I am looking for is an HTML Parser for C/C++ for MS Visual Sutdio. That means it could be compiled with MS Visual Studio.

The dillo seems for GNU-C

0
 

Expert Comment

by:jdewerth
Comment Utility
http://www.thefreecountry.com/sourcecode/cpp.shtml

try c++ class library

"html parser" c++ in a google search brings up all sort of parsers
0
Highfive + Dolby Voice = No More Audio Complaints!

Poor audio quality is one of the top reasons people don’t use video conferencing. Get the crispest, clearest audio powered by Dolby Voice in every meeting. Highfive and Dolby Voice deliver the best video conferencing and audio experience for every meeting and every room.

 
LVL 1

Expert Comment

by:bsimser
Comment Utility
There's a GPL piece of software called HTMLDOC. It's made for converting HTML to PDF files (or RTF) but has a pretty good HTML parser in it and can handle all the basic tags. While it's not exactly made for what you're trying to do, once you suck in the HTML it's all strung up in a object tree that you can do whatever you want with.

You can find the source here:
http://www.easysw.com/htmldoc/software.php

An alternative is to first run your HTML through HTMLTidy to create XHMTL then use a regular XML parser (like Xerces) to parse out what you want.

You can find HTML Tidy here:
http://tidy.sourceforge.net/

and Xerces here:
http://xml.apache.org/xerces-c/

Everything is open source and can probably give you what you want.

-Bil
0
 

Author Comment

by:chenwei
Comment Utility
Thanks to all sites. I've found out an HTML Parser, libxml2.

Please don't answer my question any more.
0
 
LVL 1

Accepted Solution

by:
Computer101 earned 0 total points
Comment Utility
PAQed with points refunded (20)

Computer101
EE Admin
0

Featured Post

Threat Intelligence Starter Resources

Integrating threat intelligence can be challenging, and not all companies are ready. These resources can help you build awareness and prepare for defense.

Join & Write a Comment

Suggested Solutions

Most of the sites are being standardized with W3C Web Standards. W3C provides lot of web standard services to the web. They have the web specification, process and documentation for all the web standards. You can apply HTML, CSS and Accessibility st…
Browsers only know CSS so your awesome SASS code needs to be translated into normal CSS. Here I'll try to explain what you should aim for in order to take full advantage of SASS.
Viewers will learn about the different types of variables in Java and how to declare them. Decide the type of variable desired: Put the keyword corresponding to the type of variable in front of the variable name: Use the equal sign to assign a v…
The viewer will learn the basics of jQuery including how to code hide show and toggles. Reference your jQuery libraries: (CODE) Include your new external js/jQuery file: (CODE) Write your first lines of code to setup your site for jQuery…

771 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

9 Experts available now in Live!

Get 1:1 Help Now