Asked by awilderbeast

Function to find and replace all image sources in a string: I have code, but some bugs are outstanding

Hi all, my code and the bugs are below.

BUG 1 happens when there are no images in the string.
BUG 2 happens with the HTML string at the bottom.
#################### BUG 1 #################################
System.ArgumentOutOfRangeException: Index was out of range. Must be non-negative and less than the size of the collection.
Parameter name: startIndex
   at System.String.IndexOf(String value, Int32 startIndex, Int32 count, StringComparison comparisonType)
   at System.String.IndexOf(String value, Int32 startIndex, StringComparison comparisonType)
   at Functions.ExtractImages(String src, String replacement, Int32 counter, Dictionary`2& imageSources) in e:\netfolder\App_Code\functions.cs:line 114
   at Documents.SendEmail(Object sender, EventArgs e) in e:\netfolder\documents\documents.aspx.cs:line 333

Line 114 is: toIndex = src.IndexOf("/>", fromIndex);
Line 333 is the call to the function:
            htmlBody = Functions.ExtractImages(
                Editor1.XHTML, // The HTML source to have <img> tags replaced
                replacement, // What to replace the src with - the {0} part represents where the number should go - so you could use image{0}.jpg to get image1.jpg, image2.jpg, image3.jpg, etc
                firstNumber, // Start with cid:image1 and go up 1 for each <img> tag
                ref imgSources  // The list of strings that will contain the original sources
            );

##################  BUG 2 ###############################
System.ArgumentOutOfRangeException: Index was out of range. Must be non-negative and less than the size of the collection.
Parameter name: startIndex
   at System.String.IndexOf(String value, Int32 startIndex, Int32 count, StringComparison comparisonType)
   at System.String.IndexOf(String value, Int32 startIndex, StringComparison comparisonType)
   at Functions.ExtractImages(String src, String replacement, Int32 counter, Dictionary`2& imageSources) in e:\netfolder\App_Code\functions.cs:line 114
   at Documents.SendEmail(Object sender, EventArgs e) in e:\netfolder\documents\documents.aspx.cs:line 333


##################### FUNCTIONS.cs ########################################
    public static string ExtractImages(string src, string replacement, int counter,
            ref Dictionary<string, string> imageSources)
    {
        int toIndex = 0, fromIndex = 0;

        do
        {
            if (toIndex > src.Length) break;
            fromIndex = src.IndexOf("<img", toIndex);
            toIndex = src.IndexOf("/>", fromIndex);
            string part = src.Substring(fromIndex, toIndex - fromIndex + 2);
            string[] tokens = part.Split(new string[] { "\"" }, System.StringSplitOptions.RemoveEmptyEntries);
            imageSources.Add(tokens[1], string.Format(replacement, counter));
            src = src.Replace(tokens[1], string.Format(replacement, counter++));
        } while (true);
        return src;
    }
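
Both traces end at line 114, and the loop above never checks what IndexOf returned. When no further "<img" is found, IndexOf returns -1, and passing -1 as the startIndex of the next IndexOf call throws ArgumentOutOfRangeException. The guard "toIndex > src.Length" cannot catch that. Below is a minimal sketch of a guarded scan, assuming only that tags are written as "<img ... />"; the class and method names (ImgTagScanner, FindImgTags) are hypothetical and not part of the original code.

```csharp
using System;
using System.Collections.Generic;

public static class ImgTagScanner
{
    // Hypothetical helper: returns every "<img ... />" substring found in src.
    public static List<string> FindImgTags(string src)
    {
        var tags = new List<string>();
        int searchFrom = 0;

        while (searchFrom < src.Length)
        {
            int fromIndex = src.IndexOf("<img", searchFrom, StringComparison.OrdinalIgnoreCase);
            if (fromIndex < 0) break;   // IndexOf returns -1 when nothing is found

            int toIndex = src.IndexOf("/>", fromIndex, StringComparison.Ordinal);
            if (toIndex < 0) break;     // tag is never closed: stop instead of computing a bad length

            tags.Add(src.Substring(fromIndex, toIndex - fromIndex + 2));
            searchFrom = toIndex + 2;   // resume the search just after this tag
        }

        return tags;
    }
}
```

With this shape, a string containing no images simply yields an empty list, and the loop ends cleanly after the last tag.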
################## DOCUMENTS.cs ##################################
Dictionary<string, string> imgSources = new Dictionary<string, string>();
            // Start with cid:image1 and go up 1 for each <img> tag
            int firstNumber = 1;
            // What to replace the src with - the {0} part represents where the number should go - so you could use image{0}.jpg to get image1.jpg, image2.jpg, image3.jpg, etc
            string replacement = "cid:image{0}";
            // Run the SourceTextBox contents through the <img> tag replacer, and assign
            // the results to the DestinationTextBox
            htmlBody = Functions.ExtractImages(
                Editor1.XHTML, // The HTML source to have <img> tags replaced
                replacement, // What to replace the src with - the {0} part represents where the number should go - so you could use image{0}.jpg to get image1.jpg, image2.jpg, image3.jpg, etc
                firstNumber, // Start with cid:image1 and go up 1 for each <img> tag
                ref imgSources  // The list of strings that will contain the original sources
            );
            AlternateView htmlView = AlternateView.CreateAlternateViewFromString(htmlBody, null, "text/html");
            // Now, write the original images and what they were replaced with
            // to a temporary StringBuilder
            foreach (KeyValuePair<string, string> originalSource in imgSources)
            {
                string strImageUrl = System.Web.HttpContext.Current.Server.MapPath(originalSource.Key);
                LinkedResource image = new LinkedResource(strImageUrl);
                image.ContentId = originalSource.Value.ToString().Replace("cid:", "");
                htmlView.LinkedResources.Add(image);
                              
                Editor1.Text += "<br />Key - " + originalSource.Key;
                Editor1.Text += "<br />Value - " + originalSource.Value;
                Editor1.Text += "<br />ImageURL - " + strImageUrl;
                Editor1.Text += "<br />Imagename - " + originalSource.Value.ToString().Replace("cid:", "");

                firstNumber++;
            }


################ BUG 2 String ###########################

<title> News Letter</title>
<div style="background-position: center; background-repeat: no-repeat; font-family: Arial, Helvetica, sans-serif;
font-size: 12px; color: #333333; background-color: #220000; margin: 0px; padding: 0px;" width="100%">
<table width="800" border="0" cellspacing="0" cellpadding="0" bgcolor="#FFFFFF" align="center">
     <tbody>
         <tr>
             <td width="50">
             <div unselectable="ON" contenteditable="false">
             <img src="/documents/mail_images/outer_top_left.jpg" alt="bg" />
             </div>
             </td>
             <td width="700">
             <div unselectable="ON" contenteditable="false">
             &nbsp;</div>
             </td>
             <td width="50" align="right">
             <div unselectable="ON" contenteditable="false">
             <img src="/documents/mail_images/outer_top_right.jpg" alt="bg" />
             </div>
             </td>
         </tr>
         <tr>
             <td width="50">
             <div unselectable="ON" contenteditable="false">
             &nbsp;</div>
             </td>
             <td width="700">
             <table width="700" border="0" cellspacing="0" cellpadding="0">
                 <tbody>
                     <tr>
                         <td width="300">
                         <div unselectable="ON" contenteditable="false">
                         <img src="/documents/mail_images/header_cw.jpg" alt="Construction Works Logo" />
                         </div>
                         </td>
                         <td width="400" align="center">
                         <div unselectable="ON" contenteditable="false">
                         <p style="font-size: 10px;">
                         <a href="javascript:;">Click here for printer friendly version</a>
                         </p>
                         </div>
                         <h2 style="color: #990000;">
                         Type Title Here</h2>
                         </td>
                     </tr>
                 </tbody>
             </table>
             </td>
             <td width="50">
             <div unselectable="ON" contenteditable="false">
             &nbsp;</div>
             </td>
         </tr>
         <tr>
             <td width="50">
             <div unselectable="ON" contenteditable="false">
             <img src="/documents/mail_images/spacer.jpg" alt="Spacer" width="50" />
             </div>
             </td>
             <td width="700">
             <table width="700" border="0" cellspacing="0" cellpadding="0" bgcolor="#EEEEEE">
                 <tbody>
                     <tr>
                         <td width="20" valign="top">
                         <div unselectable="ON" contenteditable="false">
                         <img src="/documents/mail_images/inner_top_left.jpg" alt="bg" />
                         </div>
                         </td>
                         <td width="640" colspan="3" align="center">
                         <div unselectable="ON" contenteditable="false">
                         <img src="/documents/mail_images/shadow2.jpg" align="top" alt="background image 1" />
                         </div>
                         </td>
                         <td width="20" align="right" valign="top">
                         <div unselectable="ON" contenteditable="false">
                         <img src="/documents/mail_images/inner_top_right.jpg" alt="bg" />
                         </div>
                         </td>
                     </tr>
                     <tr>
                         <td width="20">
                         <div unselectable="ON" contenteditable="false">
                         &nbsp;</div>
                         </td>
                         <td width="180" bgcolor="#CCCCCC" align="center" valign="top">
                         <table width="100%" height="100%" border="0" cellspacing="0" cellpadding="0">
                             <tbody>
                                 <tr>
                                     <td valign="top" align="left">
                                     <div unselectable="ON" contenteditable="false">
                                     <img src="/documents/mail_images/images_top_left.jpg" alt="bg" />
                                     </div>
                                     </td>
                                     <td align="center">
                                     <div unselectable="ON" contenteditable="false">
                                     <img src="/documents/mail_images/shadow3.jpg" alt="bg" />
                                     </div>
                                     </td>
                                     <td valign="top" align="right">
                                     <div unselectable="ON" contenteditable="false">
                                     <img src="/documents/mail_images/images_top_right.jpg" alt="bg" />
                                     </div>
                                     </td>
                                 </tr>
                                 <tr>
                                     <td>
                                     <div unselectable="ON" contenteditable="false">&nbsp;</div>
                                     </td>
                                     <td align="center">
                                     <p>
                                     <span style="border: 1px solid #666666;">
                                     <img src="/documents/mail_images/image1.jpg" alt="bg" />
                                     </span>
                                     </p>
                                     <p>
                                     <span style="border: 1px solid #666666;">
                                     <img src="/documents/mail_images/image2.jpg" alt="" />
                                     </span>
                                     </p>
                                     <p>
                                     <span style="border: 1px solid #666666;">
                                     <img src="/documents/mail_images/image3.jpg" alt="" />
                                     </span>
                                     </p>
                                     </td>
                                     <td>
                                     <div unselectable="ON" contenteditable="false">&nbsp;</div>
                                     </td>
                                 </tr>
                             </tbody>
                         </table>
                         </td>
                         <td width="20">
                         <div unselectable="ON" contenteditable="false">
                         &nbsp;</div>
                         </td>
                         <td width="440">
                         Type Content Here
                         </td>
                         <td width="20">
                         <div unselectable="ON" contenteditable="false">
                         &nbsp;</div>
                         </td>
                     </tr>
                     <tr>
                     </tr>
                     <tr>
                         <td width="20">
                         <div unselectable="ON" contenteditable="false">
                         <img src="/documents/mail_images/inner_bot_left.jpg" alt="" />
                         </div>
                         </td>
                         <td width="660" colspan="3" align="center">
                         <div unselectable="ON" contenteditable="false">
                         &nbsp;</div>
                         </td>
                         <td width="20" align="right">
                         <div unselectable="ON" contenteditable="false">
                         <img src="/documents/mail_images/inner_bot_right.jpg" alt="" />
                         </div>
                         </td>
                     </tr>
                     <tr>
                         <td colspan="5" bgcolor="#FFFFFF" align="center">
                         <div unselectable="ON" contenteditable="false">
                         <img src="/documents/mail_images/shadow.jpg" align="top" alt="background image 2" />
                         </div>
                         </td>
                     </tr>
                     <tr>
                         <td colspan="5" bgcolor="#FFFFFF">
                         <div unselectable="ON" contenteditable="false">
                         <table width="700" border="0" cellspacing="0" cellpadding="0">
                             <tbody>
                                 <tr>
                                     <td width="340" align="right" valign="top">
                                     </td>
                                 </tr>
                             </tbody>
                         </table>
                         </div>
                         </td>
                     </tr>
                 </tbody>
             </table>
             </td>
             <td width="50">
             <div unselectable="ON" contenteditable="false">
             <img src="/documents/mail_images/spacer.jpg" alt="Spacer" width="50" />
             </div>
             </td>
         </tr>
         <tr>
             <td width="50" align="left" valign="bottom">
             <div unselectable="ON" contenteditable="false">
             <img src="/documents/mail_images/outer_bot_left.jpg" alt="" />
             </div>
             </td>
             <td width="700">
             <div unselectable="ON" contenteditable="false">
             &nbsp;</div>
             </td>
             <td width="50" align="right" valign="bottom">
             <div unselectable="ON" contenteditable="false">
             <img src="/documents/mail_images/outer_bot_right.jpg" alt="" />
             </div>
             </td>
         </tr>
     </tbody>
</table>
</div>


SOLUTION
Meir Rivkin

ASKER

Thanks, that's sorted bug one.

Bug two has now changed to the error below,

which I actually don't think is to do with the functions mentioned.

The new bug is at line 344, which is:
string strImageUrl = System.Web.HttpContext.Current.Server.MapPath(originalSource.Key);

If you look at the outputs that I return, the key isn't cid:image4; it has a path.
I don't understand how this new error is even occurring. Perhaps you can shed some light?

Thanks
#################### NEW BUG CODE ########################
            foreach (KeyValuePair<string, string> originalSource in imgSources)
            {
                string strImageUrl = System.Web.HttpContext.Current.Server.MapPath(originalSource.Key);
                LinkedResource image = new LinkedResource(strImageUrl);
                image.ContentId = originalSource.Value.ToString().Replace("cid:", "");
                htmlView.LinkedResources.Add(image);
                              
                Editor1.Text += "<br />Key - " + originalSource.Key;
                Editor1.Text += "<br />Value - " + originalSource.Value;
                Editor1.Text += "<br />ImageURL - " + strImageUrl;
                Editor1.Text += "<br />Imagename - " + originalSource.Value.ToString().Replace("cid:", "");

                firstNumber++;
            }


System.Web.HttpException (0x80004005): 'cid:image4' is not a valid virtual path.
   at System.Web.Util.UrlPath.CheckValidVirtualPath(String path)
   at System.Web.Util.UrlPath.Combine(String appPath, String basepath, String relative)
   at System.Web.VirtualPath.Combine(VirtualPath relativePath)
   at System.Web.HttpRequest.MapPath(VirtualPath virtualPath, VirtualPath baseVirtualDir, Boolean allowCrossAppMapping)
   at System.Web.HttpServerUtility.MapPath(String path)
   at Documents.SendEmail(Object sender, EventArgs e) in e:\netfolder\documents\documents.aspx.cs:line 344


############################ OUTPUTS ##############################
Key - /documents/mail_images/outer_top_left.jpg
Value - cid:image1
ImageURL - E:\netfolder\documents\mail_images\outer_top_left.jpg
Imagename - image1
Key - /documents/mail_images/outer_top_right.jpg
Value - cid:image2
ImageURL - E:\netfolder\documents\mail_images\outer_top_right.jpg
Imagename - image2
Key - /documents/mail_images/header_cw.jpg
Value - cid:image3
ImageURL - E:\netfolder\documents\mail_images\header_cw.jpg
Imagename - image3
Key - /documents/mail_images/spacer.jpg
Value - cid:image4
ImageURL - E:\netfolder\documents\mail_images\spacer.jpg
Imagename - image4
Key - /documents/mail_images/inner_top_left.jpg
Value - cid:image5
ImageURL - E:\netfolder\documents\mail_images\inner_top_left.jpg
Imagename - image5
Key - /documents/mail_images/shadow2.jpg
Value - cid:image6
ImageURL - E:\netfolder\documents\mail_images\shadow2.jpg
Imagename - image6
Key - /documents/mail_images/inner_top_right.jpg
Value - cid:image7
ImageURL - E:\netfolder\documents\mail_images\inner_top_right.jpg
Imagename - image7
Key - /documents/mail_images/images_top_left.jpg
Value - cid:image8
ImageURL - E:\netfolder\documents\mail_images\images_top_left.jpg
Imagename - image8
Key - /documents/mail_images/shadow3.jpg
Value - cid:image9
ImageURL - E:\netfolder\documents\mail_images\shadow3.jpg
Imagename - image9
Key - /documents/mail_images/images_top_right.jpg
Value - cid:image10
ImageURL - E:\netfolder\documents\mail_images\images_top_right.jpg
Imagename - image10
Key - /documents/mail_images/image1.jpg
Value - cid:image11
ImageURL - E:\netfolder\documents\mail_images\image1.jpg
Imagename - image11
Key - /documents/mail_images/image2.jpg
Value - cid:image12
ImageURL - E:\netfolder\documents\mail_images\image2.jpg
Imagename - image12
Key - /documents/mail_images/image3.jpg
Value - cid:image13
ImageURL - E:\netfolder\documents\mail_images\image3.jpg
Imagename - image13
Key - /documents/mail_images/inner_bot_left.jpg
Value - cid:image14
ImageURL - E:\netfolder\documents\mail_images\inner_bot_left.jpg
Imagename - image14
Key - /documents/mail_images/inner_bot_right.jpg
Value - cid:image15
ImageURL - E:\netfolder\documents\mail_images\inner_bot_right.jpg
Imagename - image15
Key - /documents/mail_images/shadow.jpg
Value - cid:image16
ImageURL - E:\netfolder\documents\mail_images\shadow.jpg
Imagename - image16
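
A likely explanation, inferred from the posted ExtractImages code rather than confirmed anywhere in the thread: the sample HTML uses spacer.jpg in two tags, and src.Replace(tokens[1], ...) rewrites every occurrence in the whole string, not just the one in the current tag. By the time the loop reaches the second spacer tag its src already reads cid:image4, so that value is stored as a dictionary key and later handed to MapPath. The sketch below only demonstrates that behaviour, assuming the loop still uses String.Replace on the full string; ReplaceAllDemo and both method names are hypothetical.

```csharp
using System;

public static class ReplaceAllDemo
{
    // String.Replace swaps EVERY occurrence of the old value in the string,
    // so one call rewrites both tags that point at spacer.jpg.
    public static string RewriteSpacer(string html)
    {
        return html.Replace("/documents/mail_images/spacer.jpg", "cid:image4");
    }

    // Hypothetical guard: true when a src value was already rewritten to a
    // content id and therefore must not be stored as a key or passed to MapPath.
    public static bool IsAlreadyRewritten(string src)
    {
        return src.StartsWith("cid:", StringComparison.OrdinalIgnoreCase);
    }
}
```

Skipping keys for which IsAlreadyRewritten returns true would avoid the MapPath error, although replacing only inside the matched tag addresses the cause more directly.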


SOLUTION
Hey tgerbert, the code you just posted: I never got it to work with a dictionary. I didn't know how to split the m.Groups[2].Value up.
If you put breakpoints in your code you can step through it a line at a time, and just hover the mouse over variables to see their values - m.Groups[2].Value just has "/original/path/to/picture.jpg" in it, you wouldn't want to split it up.
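
To see what the groups hold without a debugger, here is a small sketch that runs the regular expression from the listing below against a single tag and returns group 2. The GroupDemo class and FirstSource method are hypothetical names used only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupDemo
{
    // Same pattern as in the ImgTagReplacer listing: group 1 is the opening
    // quote, group 2 is the src value, group 3 is the closing quote.
    private static readonly Regex ImgRegex = new Regex(
        @"<img\s+[^>]*src\s*=\s*('|"")(.*?)('|"")[^>]*/>",
        RegexOptions.IgnoreCase);

    // Returns the src value of the first <img /> tag, or null when there is none.
    public static string FirstSource(string html)
    {
        Match m = ImgRegex.Match(html);
        return m.Success ? m.Groups[2].Value : null;
    }
}
```

Group 2 is already the whole path, so it can be used directly as a dictionary key or list entry.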

Also, it's important to consult documentation - you can go to http://msdn.microsoft.com/library and look up things like String.Format(), Regex.Replace(), the delegate keyword, etc.

This is a little verbose; I hope it makes sense. I'm not really sure where you're stuck and where you're good, so I just tried to cover all the bases.

DON'T COPY & PASTE THIS CODE - UNDERSTAND ITS CONCEPTS AND RE-APPLY THEM (do copy/paste the regular expression though, they're such a pain to get right).

using System;
using System.Collections.Generic;
using System.Web;
using System.Text.RegularExpressions;

public class ImgTagReplacer
{
	// This regular expression looks for and matches an <img /> tag: <img\s+[^>]*src\s*=\s*('|"")(.*?)('|"")[^>]*/>
	// The parts in parentheses are remembered as "Groups", and the first
	// group starts at 0 and is always the entire match - group 1 is
	// the part of the match inside the first set of parentheses, group 2
	// is the second set of parentheses, etc.  In our particular case
	// the second set of parentheses represents the path to the image, e.g. /images/picture.jpg,
	// so if the regular expression matches we can then get the original
	// path from the ".Groups[2].Value" of the match

	// This Regex object is declared at the class-level as a static, this way we only ever use
	// one Regex object (as opposed to creating a new Regex every time the ReplaceImgTags() method
	// is called) - and this regex is marked with the "Compiled" option which improves performance
	private static Regex regex = new Regex(@"<img\s+[^>]*src\s*=\s*('|"")(.*?)('|"")[^>]*/>", RegexOptions.Compiled | RegexOptions.IgnoreCase);

	public static string ReplaceImgTags(string SourceHtml, string ReplacementFormat, int FirstNumber, List<string> OriginalImageSources)
	{
		// The "ReplacementFormat" is expected to look like "cid:image{0}",
		// where {0} will be replaced by a number, so you'd end up with "cid:image1", "cid:image2", etc.
		// If instead ReplacementFormat was "{0}cid:image" you'd end up with "1cid:image", "2cid:image", etc.
		// See how "{0}" is really just a place-holder for where the number goes? This is kinda how
		// String.Format() and related methods work (in fact, later on String.Format() will be used
		// to achieve the desired result).
		// This try/catch block just checks to make sure ReplacementFormat is non-null and contains
		// AT LEAST {0} (if it doesn't contain "{0}" somewhere, String.Format() will throw a fit later)
		try
		{
			String.Format(ReplacementFormat, FirstNumber);
		}
		catch (ArgumentNullException)
		{
			throw new ArgumentException(
				"ReplacementFormat should contain the string to serve as the replacement for the <img> source; it MUST contain at least \"{0}\", to indicate where the number occurs in the replacement.",
				"ReplacementFormat");
		}
		catch (FormatException)
		{
			throw new ArgumentException(
				"ReplacementFormat should contain the string to serve as the replacement for the <img> source; it MUST contain at least \"{0}\", to indicate where the number occurs in the replacement.",
				"ReplacementFormat");
		}

		// The FirstNumber parameter lets you start the numbering at whatever number you want
		// So you could START at cid:image17 if you needed to for some reason.
		// The little section of code starting on the "counter++" line will be called for each
		// match (i.e. for each <img /> tag), and since the first line of that code block
		// is counter++, we initially need counter to start off at 1 less than FirstNumber,
		// so that the first time the code section runs it gets incremented to the FirstNumber
		int counter = FirstNumber - 1;

		// The regex.Replace method as used here expects two parameters.
		// The first is the SourceHtml string we're searching for <img /> tags.
		// The second parameter is a method - but instead of actually writing
		// a separate method in this class and just giving that method's name as the parameter,
		// the CONTENTS of that method IS the second parameter - which is done with the "delegate"
		// keyword and is known as an anonymous method (since it is in fact a method without a name)

		// regex.Replace is going to search SourceHtml for every occurrence of
		// a sequence of characters that matches the regular expression we defined
		// at the top of the class - which means in this case it's going to search SourceHtml
		// for each <img ... /> tag.  Then it's going to replace that <img ... /> tag
		// with whatever is returned by the "embedded" method
		return regex.Replace(SourceHtml, delegate(Match m)
		{
			// Start of "embedded" method (aka Anonymous Method)
			// m is a parameter passed into our function by regex.Replace, it is a "Match" object that
			// contains details about the piece of SourceHtml that matched the expression (i.e. our <img ... /> tag)

			counter++; // Increment the counter (self explanatory)

			// OriginalImageSources is a List<string> (a list of strings) that may or may not have been passed
			// into our ReplaceImgTags method
			if (OriginalImageSources != null)
			{
				// If it is non-null add to the string list 
				OriginalImageSources.Add(m.Groups[2].Value);
			}
			return m.Value.Replace(m.Groups[2].Value, String.Format(ReplacementFormat, counter)); // end of embedded/anonymous method
		});
	}
}


OK, I'm beginning to understand it, and I'm going to read your comments a few more times (like 20) to see if I can get it to sink in.

But at the moment, with that big string that I've inserted above, I just tried it again and got the error below:
System.ArgumentException: An item with the same key has already been added.
   at System.Collections.Generic.Dictionary`2.Insert(TKey key, TValue value, Boolean add)
   at Functions.<>c__DisplayClass1.b__0(Match m) in e:\netfolder\App_Code\functions.cs:line 98
   at System.Text.RegularExpressions.RegexReplacement.Replace(MatchEvaluator evaluator, Regex regex, String input, Int32 count, Int32 startat)
   at System.Text.RegularExpressions.Regex.Replace(String input, MatchEvaluator evaluator)
   at Functions.ReplaceImgTags(String SourceHtml, String ReplacementFormat, Int32 FirstNumber, Dictionary`2 imageSources) in e:\netfolder\App_Code\functions.cs:line 94
   at Documents.SendEmail(Object sender, EventArgs e) in e:\netfolder\documents\documents.aspx.cs:line 333
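
The trace shows Dictionary.Add being called with a key that is already present, which the sample HTML would trigger because spacer.jpg appears in two tags. One possible way around it, sketched below, is to look the source up first and reuse its existing replacement when it has been seen before. This is not the hidden accepted answer; the class name ImgTagReplacerWithMap and the choice to reuse one cid per distinct source are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ImgTagReplacerWithMap
{
    private static readonly Regex ImgRegex = new Regex(
        @"<img\s+[^>]*src\s*=\s*('|"")(.*?)('|"")[^>]*/>",
        RegexOptions.Compiled | RegexOptions.IgnoreCase);

    // Hypothetical variant of ReplaceImgTags: fills imageSources with
    // original src -> replacement, one entry per distinct source.
    public static string ReplaceImgTags(string sourceHtml, string replacementFormat,
        int firstNumber, Dictionary<string, string> imageSources)
    {
        int next = firstNumber;

        return ImgRegex.Replace(sourceHtml, delegate(Match m)
        {
            string original = m.Groups[2].Value;
            string replacement;

            // Only add a new entry (and use up a number) for unseen sources.
            if (!imageSources.TryGetValue(original, out replacement))
            {
                replacement = string.Format(replacementFormat, next++);
                imageSources.Add(original, replacement);
            }

            // Rebuild the tag, swapping only the captured src span so that
            // identical text elsewhere in the tag is left alone.
            int offset = m.Groups[2].Index - m.Index;
            return m.Value.Substring(0, offset) + replacement
                 + m.Value.Substring(offset + original.Length);
        });
    }
}
```

Because each distinct image ends up with exactly one dictionary entry, the later foreach loop would attach one LinkedResource per file, and both spacer tags would point at the same content id.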


SOLUTION
ASKER CERTIFIED SOLUTION
Thanks a lot!
I'm beginning to understand this now. I've saved the sample to study as I set up my test Windows application.

Thanks again!

If you've got the time, tgerbert, can you show me what you mean by updating the LINQ using a dictionary in this post:
https://www.experts-exchange.com/questions/26858845/LINQ-function-no-errors-but-not-updating-or-inserting.html

I've googled LINQ and updating with a dictionary and I get nothing.

thanks