Solved

Advanced C# string parse to pull specific string and substrings out of a larger string.

Posted on 2008-10-27
Last Modified: 2013-12-17
I have a C# string that represents HTML code.  I need to parse the string and find each instance of the img element.  Once the img element is taken out, I need to then extract the src attribute from the img element.  

For example, given the C# string, I need to find <img src="http://www.someurl.com/image1.gif" /> and assign it to a string.  Then I need to parse the new string for the src value, http://www.someurl.com/image1.gif.  I am somewhat familiar with C# string functions, but I am not sure how to begin working on this.

If you need more clarification please let me know.  Thank you for your help.
Question by:shanemay
LVL 9

Accepted Solution

by:
gdupadhyay earned 2000 total points
ID: 22816337
You can do this easily with a regular expression.
Add the following using directives:

using System.IO;
using System.Text;
using System.Text.RegularExpressions;

Then, in your function, write the following:
// File path.
string filename = "C:\\test\\Test.txt";

// Lazy match (.*?) so that each <img ... /> tag is matched separately.
string pattern = @"<img .*? />";

// Read the whole file; the using block closes the reader afterwards.
string strFileText;
using (StreamReader sr = new StreamReader(filename))
{
    strFileText = sr.ReadToEnd();
}
Regex re = new Regex(pattern);
MatchCollection mc = re.Matches(strFileText);

string strTemp1;
string strTemp2 = null;
foreach (Match m in mc)
{
    // Strip the tag text around the attribute value, then the quotes.
    string strTemp = m.ToString();
    strTemp1 = strTemp.Replace("<img src=", "");
    strTemp2 = strTemp1.Replace(" />", "").Trim('"');
}

The final string strTemp2 is "http://www.someurl.com/image1.gif".

I have tested this code and it works fine.

Please let me know if you have any questions.

Good Luck

 

Author Closing Comment

by:shanemay
ID: 31510481
Thank you so much for the quick response.  I really appreciate the crystal clear code example.  This is exactly what I needed; I did not think I would have this working today.  Again, thank you so much.
LVL 8

Expert Comment

by:mkosbie
ID: 22816810
Regular expressions are definitely the way to go, but the code provided is pretty cumbersome.  It won't handle any img tag with more than a src attribute (e.g. <img src="img.jpg" id="img1">), and it does a lot of extra processing.  You can extract everything you need in one pass with a function like this (it returns the sources in an ArrayList):
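    // Requires: using System.Collections; using System.Text.RegularExpressions;
    private ArrayList getImageSources(String HTML)
    {
        // Group 1 captures the quoted value of the src attribute.
        Regex re = new Regex("<img\\s[^>]*?src=[\"']([^\"']+)[\"'][^>]*>", RegexOptions.IgnoreCase);
        MatchCollection matches = re.Matches(HTML);

        ArrayList sources = new ArrayList();
        foreach (Match m in matches) {
            // Store the captured text itself, not the Group object.
            sources.Add(m.Groups[1].Value);
        }

        return sources;
    }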
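
For anyone who wants to try the one-pass approach end to end, here is a minimal, self-contained sketch. It is only an illustration under a few assumptions: the class name ImageSourceDemo, the named group "src", the List<string> return type, and the sample HTML string are not from the posts above. Like the function above, it assumes the src value is quoted, and a regular expression is not a full HTML parser, so unusual markup may slip through.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ImageSourceDemo
{
    // Same pattern idea as above, with a named group for the src value.
    private static readonly Regex ImgSrc = new Regex(
        "<img\\s[^>]*?src=[\"'](?<src>[^\"']+)[\"'][^>]*>",
        RegexOptions.IgnoreCase);

    // Returns the src value of every img tag found in the HTML string.
    public static List<string> GetImageSources(string html)
    {
        var sources = new List<string>();
        foreach (Match m in ImgSrc.Matches(html))
        {
            sources.Add(m.Groups["src"].Value);
        }
        return sources;
    }

    public static void Main()
    {
        // Hypothetical sample input: one tag from the question, one with extra attributes.
        string html = "<p><img src=\"http://www.someurl.com/image1.gif\" />"
                    + "<IMG id='img2' src='img.jpg'></p>";
        foreach (string src in GetImageSources(html))
        {
            Console.WriteLine(src);
        }
    }
}
```

With the sample string above, this prints http://www.someurl.com/image1.gif and then img.jpg, one per line. Returning List<string> instead of ArrayList simply avoids casting each item back to a string at the call site.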



Question has a verified solution.
