Solved

How to extract <IMG> tags from HTML file?

Posted on 2003-11-23
Last Modified: 2013-11-19
Hello everyone:

Can anyone tell me how to extract <IMG> tags from an HTML file using C#/.NET?
Maybe by using an XML parsing function; I am not sure. Please help me! Thanks!

brownsbay

Question by:brownsbay
4 Comments
 
LVL 6

Accepted Solution

by:
purpleblob earned 20 total points
ID: 9809332
If the HTML is well formed (i.e. every element is closed and attribute values are quoted, as in XHTML), you could load it into an XML DOM and find all the img elements. That is probably not the case for real-world HTML, so a very simple alternative is to use the string class methods such as IndexOf. Do you actually wish to extract (i.e. remove) the <img> tags, or simply find all of them? If you wish to remove them, you will need to find the start of each tag (<img) and its closing > and Remove that span. Note that <img> has no separate </img> end tag in HTML; in XHTML it is written self-closed, as <img ... />.
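
As a rough illustration of the IndexOf approach described above, here is a minimal sketch. The class and method names (ImgTagFinder, FindImgTags) are hypothetical and not from the original answer. It also uses List<string> and StringComparison, which assume .NET 2.0 or later. It is a plain text scan, so it does not understand comments, scripts, or a '>' inside a quoted attribute value.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: finds <img ...> tags with plain string searches.
public static class ImgTagFinder
{
    // Returns every "<img ...>" tag found in html, in document order.
    public static List<string> FindImgTags(string html)
    {
        var tags = new List<string>();
        int pos = 0;
        while (pos < html.Length)
        {
            // Case-insensitive search so that <IMG ...> is found as well.
            int start = html.IndexOf("<img", pos, StringComparison.OrdinalIgnoreCase);
            if (start < 0) break;

            int nameEnd = start + 4;
            // Require a delimiter after the tag name so "<imgfoo>" is skipped.
            bool isImg = nameEnd < html.Length &&
                (char.IsWhiteSpace(html[nameEnd]) || html[nameEnd] == '>' || html[nameEnd] == '/');

            int end = html.IndexOf('>', nameEnd);
            if (end < 0) break; // unterminated tag: stop scanning

            if (isImg) tags.Add(html.Substring(start, end - start + 1));
            pos = end + 1;
        }
        return tags;
    }
}
```

Calling ImgTagFinder.FindImgTags(html) returns the tags as strings, which can then be printed or inspected further.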

If you do want to remove the <img> tags, note that the string class is not very efficient with repeated operations such as Remove, since each call creates a new string. You might instead build an ArrayList of the start/end indices of the tags in the string, then copy the bits you want to keep into a StringBuilder. It's a shame StringBuilder has a Remove method but not Find or IndexOf - ah well, we can't have it all :-)
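
The copy-into-a-StringBuilder idea might look roughly like the sketch below. The names (ImgTagRemover, RemoveImgTags) are hypothetical. Instead of first collecting the indices in an ArrayList, this version copies each kept segment as soon as the next tag is located, which is a simplification of the approach described above. It also treats anything starting with "<img" as an image tag and assumes .NET 2.0 or later for StringComparison.

```csharp
using System;
using System.Text;

// Hypothetical helper: removes <img ...> tags by copying everything else.
public static class ImgTagRemover
{
    public static string RemoveImgTags(string html)
    {
        var kept = new StringBuilder(html.Length);
        int pos = 0;
        while (pos < html.Length)
        {
            int start = html.IndexOf("<img", pos, StringComparison.OrdinalIgnoreCase);
            int end = start < 0 ? -1 : html.IndexOf('>', start);
            if (end < 0) break; // no further complete tag

            // Copy the text between the previous tag and this one.
            kept.Append(html, pos, start - pos);
            pos = end + 1; // continue after the tag's closing '>'
        }
        // Copy whatever follows the last tag.
        kept.Append(html, pos, html.Length - pos);
        return kept.ToString();
    }
}
```

Each character of the input is copied at most once, whereas calling Remove on the string for every tag would allocate a new string each time.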
 
LVL 10

Assisted Solution

by:ptmcomp
ptmcomp earned 20 total points
ID: 9812004
You can use SGML: http://www.gotdotnet.com/Community/UserSamples/Details.aspx?SampleGuid=b90fddce-e60d-43f8-a5c4-c3bd760564bc

or Regex:

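using System;
using System.Text.RegularExpressions;

// [^>]* stops at the first '>', even across line breaks; IgnoreCase also matches <IMG ...>.
MatchCollection matches = Regex.Matches(html, @"<img\b[^>]*>", RegexOptions.IgnoreCase);
foreach (Match match in matches)
{
     Console.WriteLine(match.Value);
}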