awilderbeast
asked on
Function to find and replace all image sources in a string: I have code, but some bugs are outstanding
Hi all, my code is below, along with the two bugs.
BUG 1 happens when there are no images in the string.
BUG 2 happens with the HTML string at the bottom.
#################### BUG 1 #################################
System.ArgumentOutOfRangeException: Index was out of range. Must be non-negative and less than the size of the collection.
Parameter name: startIndex
   at System.String.IndexOf(String value, Int32 startIndex, Int32 count, StringComparison comparisonType)
   at System.String.IndexOf(String value, Int32 startIndex, StringComparison comparisonType)
   at Functions.ExtractImages(String src, String replacement, Int32 counter, Dictionary`2& imageSources) in e:\netfolder\App_Code\functions.cs:line 114
   at Documents.SendEmail(Object sender, EventArgs e) in e:\netfolder\documents\documents.aspx.cs:line 333
Line 114 is: toIndex = src.IndexOf("/>", fromIndex);
Line 333 is the call to the function:
htmlBody = Functions.ExtractImages(
    Editor1.XHTML, // The HTML source to have <img> tags replaced
    replacement, // What to replace the src with - the {0} part represents where the number should go - so you could use image{0}.jpg to get image1.jpg, image2.jpg, image3.jpg, etc
    firstNumber, // Start with cid:image1 and go up 1 for each <img> tag
    ref imgSources // The list of strings that will contain the original sources
);
################## BUG 2 ###############################
System.ArgumentOutOfRangeException: Index was out of range. Must be non-negative and less than the size of the collection.
Parameter name: startIndex
   at System.String.IndexOf(String value, Int32 startIndex, Int32 count, StringComparison comparisonType)
   at System.String.IndexOf(String value, Int32 startIndex, StringComparison comparisonType)
   at Functions.ExtractImages(String src, String replacement, Int32 counter, Dictionary`2& imageSources) in e:\netfolder\App_Code\functions.cs:line 114
   at Documents.SendEmail(Object sender, EventArgs e) in e:\netfolder\documents\documents.aspx.cs:line 333
##################### FUNCTIONS.cs ########################################
public static string ExtractImages(string src, string replacement, int counter,
    ref Dictionary<string, string> imageSources)
{
    int toIndex = 0, fromIndex = 0;
    do
    {
        if (toIndex > src.Length) break;
        fromIndex = src.IndexOf("<img", toIndex);
        toIndex = src.IndexOf("/>", fromIndex);
        string part = src.Substring(fromIndex, toIndex - fromIndex + 2);
        string[] tokens = part.Split(new string[] { "\"" }, System.StringSplitOptions.RemoveEmptyEntries);
        imageSources.Add(tokens[1], string.Format(replacement, counter));
        src = src.Replace(tokens[1], string.Format(replacement, counter++));
    } while (true);
    return src;
}
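A note on why both traces end at line 114: String.IndexOf returns -1 when it finds nothing, and the loop above passes that -1 straight into the next IndexOf call as the start index, which throws ArgumentOutOfRangeException. With no images that happens on the first pass; with images it happens right after the last one, because the only exit test (toIndex > src.Length) never becomes true. Below is a minimal sketch of a guarded loop, not a definitive fix. The names ImageSourceSketch and ExtractImagesSafe are made up for illustration, and the sketch assumes double-quoted src attributes and self-closing tags, as in the sample HTML.

```csharp
using System;
using System.Collections.Generic;

public static class ImageSourceSketch
{
    // Hypothetical rework of ExtractImages: every IndexOf result is checked
    // for -1 before it is used, and only the current tag is rewritten.
    public static string ExtractImagesSafe(string src, string replacement, int counter,
        Dictionary<string, string> imageSources)
    {
        const string marker = "src=\"";
        int searchFrom = 0;
        while (searchFrom < src.Length)
        {
            int fromIndex = src.IndexOf("<img", searchFrom, StringComparison.OrdinalIgnoreCase);
            if (fromIndex < 0) break;               // no more <img tags: stop cleanly

            int toIndex = src.IndexOf("/>", fromIndex, StringComparison.Ordinal);
            if (toIndex < 0) break;                 // tag is never closed: stop cleanly

            string part = src.Substring(fromIndex, toIndex - fromIndex + 2);
            int srcStart = part.IndexOf(marker, StringComparison.OrdinalIgnoreCase);
            int valueStart = srcStart + marker.Length;
            int valueEnd = srcStart < 0 ? -1 : part.IndexOf('"', valueStart);
            if (valueEnd < 0)
            {
                searchFrom = toIndex + 2;           // tag without a usable src: skip it
                continue;
            }

            string original = part.Substring(valueStart, valueEnd - valueStart);

            // Reuse the existing replacement when the same image appears twice.
            string cid;
            if (!imageSources.TryGetValue(original, out cid))
            {
                cid = string.Format(replacement, counter++);
                imageSources.Add(original, cid);
            }

            // Rewrite only this tag, then resume the search just after it.
            string newPart = part.Remove(valueStart, original.Length).Insert(valueStart, cid);
            src = src.Remove(fromIndex, part.Length).Insert(fromIndex, newPart);
            searchFrom = fromIndex + newPart.Length;
        }
        return src;
    }
}
```

The dictionary is passed without ref here because the method only adds entries to it and never reassigns it. Rewriting one tag at a time also avoids the side effect of src.Replace, which changes every occurrence of the path in the whole string at once.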
################## DOCUMENTS.cs ##################################
Dictionary<string, string> imgSources = new Dictionary<string, string>();
// Start with cid:image1 and go up 1 for each <img> tag
int firstNumber = 1;
// What to replace the src with - the {0} part represents where the number should go - so you could use image{0}.jpg to get image1.jpg, image2.jpg, image3.jpg, etc
string replacement = "cid:image{0}";
// Run the SourceTextBox contents through the <img> tag replacer, and assign
// the results to the DestinationTextBox
htmlBody = Functions.ExtractImages(
    Editor1.XHTML, // The HTML source to have <img> tags replaced
    replacement, // What to replace the src with - the {0} part represents where the number should go - so you could use image{0}.jpg to get image1.jpg, image2.jpg, image3.jpg, etc
    firstNumber, // Start with cid:image1 and go up 1 for each <img> tag
    ref imgSources // The list of strings that will contain the original sources
);
AlternateView htmlView = AlternateView.CreateAlternateViewFromString(htmlBody, null, "text/html");
// Now, write the original images and what they were replaced with
// to a temporary StringBuilder
foreach (KeyValuePair<string, string> originalSource in imgSources)
{
    string strImageUrl = System.Web.HttpContext.Current.Server.MapPath(originalSource.Key);
    LinkedResource image = new LinkedResource(strImageUrl);
    image.ContentId = originalSource.Value.ToString().Replace("cid:", "");
    htmlView.LinkedResources.Add(image);
    Editor1.Text += "<br />Key - " + originalSource.Key;
    Editor1.Text += "<br />Value - " + originalSource.Value;
    Editor1.Text += "<br />ImageURL - " + strImageUrl;
    Editor1.Text += "<br />Imagename - " + originalSource.Value.ToString().Replace("cid:", "");
    firstNumber++;
}
################ BUG 2 String ###########################
<title> News Letter</title>
<div style="background-position: center; background-repeat: no-repeat; font-family: Arial, Helvetica, sans-serif;
font-size: 12px; color: #333333; background-color: #220000; margin: 0px; padding: 0px;" width="100%">
<table width="800" border="0" cellspacing="0" cellpadding="0" bgcolor="#FFFFFF" align="center">
<tbody>
<tr>
<td width="50">
<div unselectable="ON" contenteditable="false">
<img src="/documents/mail_images/outer_top_left.jpg" alt="bg" />
</div>
</td>
<td width="700">
<div unselectable="ON" contenteditable="false">
</div>
</td>
<td width="50" align="right">
<div unselectable="ON" contenteditable="false">
<img src="/documents/mail_images/outer_top_right.jpg" alt="bg" />
</div>
</td>
</tr>
<tr>
<td width="50">
<div unselectable="ON" contenteditable="false">
</div>
</td>
<td width="700">
<table width="700" border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td width="300">
<div unselectable="ON" contenteditable="false">
<img src="/documents/mail_images/header_cw.jpg" alt="Construction Works Logo" />
</div>
</td>
<td width="400" align="center">
<div unselectable="ON" contenteditable="false">
<p style="font-size: 10px;">
<a href="javascript:;">Click here for printer friendly version</a>
</p>
</div>
<h2 style="color: #990000;">
Type Title Here</h2>
</td>
</tr>
</tbody>
</table>
</td>
<td width="50">
<div unselectable="ON" contenteditable="false">
</div>
</td>
</tr>
<tr>
<td width="50">
<div unselectable="ON" contenteditable="false">
<img src="/documents/mail_images/spacer.jpg" alt="Spacer" width="50" />
</div>
</td>
<td width="700">
<table width="700" border="0" cellspacing="0" cellpadding="0" bgcolor="#EEEEEE">
<tbody>
<tr>
<td width="20" valign="top">
<div unselectable="ON" contenteditable="false">
<img src="/documents/mail_images/inner_top_left.jpg" alt="bg" />
</div>
</td>
<td width="640" colspan="3" align="center">
<div unselectable="ON" contenteditable="false">
<img src="/documents/mail_images/shadow2.jpg" align="top" alt="background image 1" />
</div>
</td>
<td width="20" align="right" valign="top">
<div unselectable="ON" contenteditable="false">
<img src="/documents/mail_images/inner_top_right.jpg" alt="bg" />
</div>
</td>
</tr>
<tr>
<td width="20">
<div unselectable="ON" contenteditable="false">
</div>
</td>
<td width="180" bgcolor="#CCCCCC" align="center" valign="top">
<table width="100%" height="100%" border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td valign="top" align="left">
<div unselectable="ON" contenteditable="false">
<img src="/documents/mail_images/images_top_left.jpg" alt="bg" />
</div>
</td>
<td align="center">
<div unselectable="ON" contenteditable="false">
<img src="/documents/mail_images/shadow3.jpg" alt="bg" />
</div>
</td>
<td valign="top" align="right">
<div unselectable="ON" contenteditable="false">
<img src="/documents/mail_images/images_top_right.jpg" alt="bg" />
</div>
</td>
</tr>
<tr>
<td>
<div unselectable="ON" contenteditable="false"> </div>
</td>
<td align="center">
<p>
<span style="border: 1px solid #666666;">
<img src="/documents/mail_images/image1.jpg" alt="bg" />
</span>
</p>
<p>
<span style="border: 1px solid #666666;">
<img src="/documents/mail_images/image2.jpg" alt="" />
</span>
</p>
<p>
<span style="border: 1px solid #666666;">
<img src="/documents/mail_images/image3.jpg" alt="" />
</span>
</p>
</td>
<td>
<div unselectable="ON" contenteditable="false"> </div>
</td>
</tr>
</tbody>
</table>
</td>
<td width="20">
<div unselectable="ON" contenteditable="false">
</div>
</td>
<td width="440">
Type Content Here
</td>
<td width="20">
<div unselectable="ON" contenteditable="false">
</div>
</td>
</tr>
<tr>
</tr>
<tr>
<td width="20">
<div unselectable="ON" contenteditable="false">
<img src="/documents/mail_images/inner_bot_left.jpg" alt="" />
</div>
</td>
<td width="660" colspan="3" align="center">
<div unselectable="ON" contenteditable="false">
</div>
</td>
<td width="20" align="right">
<div unselectable="ON" contenteditable="false">
<img src="/documents/mail_images/inner_bot_right.jpg" alt="" />
</div>
</td>
</tr>
<tr>
<td colspan="5" bgcolor="#FFFFFF" align="center">
<div unselectable="ON" contenteditable="false">
<img src="/documents/mail_images/shadow.jpg" align="top" alt="background image 2" />
</div>
</td>
</tr>
<tr>
<td colspan="5" bgcolor="#FFFFFF">
<div unselectable="ON" contenteditable="false">
<table width="700" border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td width="340" align="right" valign="top">
</td>
</tr>
</tbody>
</table>
</div>
</td>
</tr>
</tbody>
</table>
</td>
<td width="50">
<div unselectable="ON" contenteditable="false">
<img src="/documents/mail_images/spacer.jpg" alt="Spacer" width="50" />
</div>
</td>
</tr>
<tr>
<td width="50" align="left" valign="bottom">
<div unselectable="ON" contenteditable="false">
<img src="/documents/mail_images/outer_bot_left.jpg" alt="" />
</div>
</td>
<td width="700">
<div unselectable="ON" contenteditable="false">
</div>
</td>
<td width="50" align="right" valign="bottom">
<div unselectable="ON" contenteditable="false">
<img src="/documents/mail_images/outer_bot_right.jpg" alt="" />
</div>
</td>
</tr>
</tbody>
</table>
</div>
ASKER
Hey tgerbert, with the code you just posted, I never got it to work with a dictionary; I didn't know how to split the m.Groups[2].Value up.
If you put breakpoints in your code you can step through it a line at a time, and just hover the mouse over variables to see their values - m.Groups[2].Value just has "/original/path/to/picture.jpg" in it, so you wouldn't want to split it up.
Also, it's important to consult the documentation - you can go to http://msdn.microsoft.com/library and look up things like String.Format(), Regex.Replace(), the delegate keyword, etc.
This is a little verbose, but I hope it makes sense. I'm not really getting where you're stuck and where you're good, so I just tried to cover all the bases.
DON'T COPY & PASTE THIS CODE - UNDERSTAND ITS CONCEPTS AND RE-APPLY THEM (do copy/paste the regular expression though, they're such a pain to get right).
using System;
using System.Collections.Generic;
using System.Web;
using System.Text.RegularExpressions;

public class ImgTagReplacer
{
    // This regular expression looks for and matches an <img /> tag: <img\s+[^>]*src\s*=\s*('|")(.*?)('|")[^>]*/>
    // The parts in parentheses are remembered as "Groups"; the first
    // group starts at 0 and is always the entire match - group 1 is
    // the part of the match inside the first set of parentheses, group 2
    // is the second set of parentheses, etc. In our particular case
    // the second set of parentheses represents the path to the image, e.g. /images/picture.jpg,
    // so if the regular expression matches we can then get the original
    // path from the ".Groups[2].Value" of the match
    // This Regex object is declared at the class-level as a static, this way we only ever use
    // one Regex object (as opposed to creating a new Regex every time the ReplaceImgTags() method
    // is called) - and this regex is marked with the "Compiled" option which improves performance
    private static Regex regex = new Regex(@"<img\s+[^>]*src\s*=\s*('|"")(.*?)('|"")[^>]*/>", RegexOptions.Compiled | RegexOptions.IgnoreCase);

    public static string ReplaceImgTags(string SourceHtml, string ReplacementFormat, int FirstNumber, List<string> OriginalImageSources)
    {
        // The "ReplacementFormat" is expected to look like "cid:image{0}",
        // where {0} will be replaced by a number, so you'd end up with "cid:image1", "cid:image2", etc.
        // If instead ReplacementFormat was "{0}cid:image" you'd end up with "1cid:image", "2cid:image", etc.
        // See how "{0}" is really just a place-holder for where the number goes? This is kinda how
        // String.Format() and related methods work (in fact, later on String.Format() will be used
        // to achieve the desired result).
        // This try/catch block checks that ReplacementFormat is non-null and is a well-formed
        // format string. Note that String.Format() does not throw when "{0}" is missing altogether;
        // it just returns the string unchanged, so a missing place-holder is not detected here.
        try
        {
            String.Format(ReplacementFormat, FirstNumber);
        }
        catch (ArgumentNullException)
        {
            throw new ArgumentException(
                "ReplacementFormat should contain the string to serve as the replacement for the <img> source; it MUST contain at least \"{0}\", to indicate where the number occurs in the replacement.",
                "ReplacementFormat");
        }
        catch (FormatException)
        {
            throw new ArgumentException(
                "ReplacementFormat should contain the string to serve as the replacement for the <img> source; it MUST contain at least \"{0}\", to indicate where the number occurs in the replacement.",
                "ReplacementFormat");
        }

        // The FirstNumber parameter lets you start the numbering at whatever number you want
        // So you could START at cid:image17 if you needed to for some reason.
        // The little section of code starting on the "counter++" line will be called for each
        // match (i.e. for each <img /> tag), and since the first line of that code block
        // is counter++, we initially need counter to start off at 1 less than FirstNumber,
        // so that the first time the code section runs it gets incremented to the FirstNumber
        int counter = FirstNumber - 1;

        // The regex.Replace method as used here expects two parameters.
        // The first is the SourceHtml string we're searching for <img /> tags.
        // The second parameter is a method - but instead of actually writing
        // a separate method in this class and just giving that method's name as the parameter,
        // the CONTENTS of that method IS the second parameter - which is done with the "delegate"
        // keyword and is known as an anonymous method (since it is in fact a method without a name)
        // regex.Replace is going to search SourceHtml for every occurrence of
        // a sequence of characters that matches the regular expression we declared
        // at the top of this class - which means in this case it's going to search SourceHtml
        // for each <img ... /> tag. Then it's going to replace that <img ... /> tag
        // with whatever is returned by the "embedded" method
        return regex.Replace(SourceHtml, delegate(Match m)
        {
            // Start of "embedded" method (aka Anonymous Method)
            // m is a parameter passed into our function by regex.Replace, it is a "Match" object that
            // contains details about the piece of SourceHtml that matched the expression (i.e. our <img ... /> tag)
            counter++; // Increment the counter (self explanatory)
            // OriginalImageSources is a List<string> (a list of strings) that may or may not have been passed
            // into our ReplaceImgTags method
            if (OriginalImageSources != null)
            {
                // If it is non-null add to the string list
                OriginalImageSources.Add(m.Groups[2].Value);
            }
            return m.Value.Replace(m.Groups[2].Value, String.Format(ReplacementFormat, counter)); // end of embedded/anonymous method
        });
    }
}
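To make the point about groups concrete, here is a small sketch that runs the same regular expression against a single tag and returns what each group holds. GroupsDemo and DescribeMatch are hypothetical names used only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupsDemo
{
    // Returns { whole match, opening quote character, image path } for the
    // first <img ... /> tag in the input, or an empty array if none is found.
    public static string[] DescribeMatch(string html)
    {
        Regex regex = new Regex(
            @"<img\s+[^>]*src\s*=\s*('|"")(.*?)('|"")[^>]*/>",
            RegexOptions.IgnoreCase);

        Match m = regex.Match(html);
        if (!m.Success)
        {
            return new string[0];
        }

        return new[]
        {
            m.Groups[0].Value, // the entire <img ... /> tag
            m.Groups[1].Value, // the quote character that opens the src value
            m.Groups[2].Value  // the src value itself, e.g. /images/picture.jpg
        };
    }
}
```

Since Groups[2].Value is already the complete path, it can be used directly as a dictionary key or list entry without any splitting.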
ASKER
OK, I'm beginning to understand it, and I'm going to read your comments a few more times (like 20) to see if I can get it to sink in.
But at the moment, with that big string I inserted above, I just tried it again and got the error below:
System.ArgumentException: An item with the same key has already been added.
   at System.Collections.Generic.Dictionary`2.Insert(TKey key, TValue value, Boolean add)
   at Functions.<>c__DisplayClass1.b__0(Match m) in e:\netfolder\App_Code\functions.cs:line 98
   at System.Text.RegularExpressions.RegexReplacement.Replace(MatchEvaluator evaluator, Regex regex, String input, Int32 count, Int32 startat)
   at System.Text.RegularExpressions.Regex.Replace(String input, MatchEvaluator evaluator)
   at Functions.ReplaceImgTags(String SourceHtml, String ReplacementFormat, Int32 FirstNumber, Dictionary`2 imageSources) in e:\netfolder\App_Code\functions.cs:line 94
   at Documents.SendEmail(Object sender, EventArgs e) in e:\netfolder\documents\documents.aspx.cs:line 333
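Dictionary.Add throws this exception whenever the key is already present, and the sample HTML uses /documents/mail_images/spacer.jpg in two places, so the second occurrence would trip it. The sketch below is one possible dictionary-based variant of the regex approach, not the code from the hidden answers: it looks the path up first and reuses the existing cid for repeated images. ImgTagDictionarySketch is a hypothetical name, and the asker's actual functions.cs is not shown in the thread.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ImgTagDictionarySketch
{
    private static readonly Regex ImgRegex = new Regex(
        @"<img\s+[^>]*src\s*=\s*('|"")(.*?)('|"")[^>]*/>",
        RegexOptions.Compiled | RegexOptions.IgnoreCase);

    // imageSources maps each original path to the replacement it was given.
    public static string ReplaceImgTags(string sourceHtml, string replacementFormat,
        int firstNumber, Dictionary<string, string> imageSources)
    {
        int next = firstNumber;
        return ImgRegex.Replace(sourceHtml, delegate(Match m)
        {
            string original = m.Groups[2].Value;

            // Only allocate a new number the first time a path is seen;
            // a repeated image reuses the replacement already stored.
            string cid;
            if (!imageSources.TryGetValue(original, out cid))
            {
                cid = String.Format(replacementFormat, next++);
                imageSources.Add(original, cid);
            }

            return m.Value.Replace(original, cid);
        });
    }
}
```

With this shape each distinct image ends up in the dictionary once, so the later loop that builds the LinkedResource objects attaches each file a single time.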
ASKER
Thanks a lot!
I'm beginning to understand this now. I've saved the sample to study as I set up my test Windows application.
Thanks again!
If you've got the time, bert, can you show me what you mean by updating the LINQ using a dictionary in this post?
https://www.experts-exchange.com/questions/26858845/LINQ-function-no-errors-but-not-updating-or-inserting.html
I've googled LINQ and updating with a dictionary and I get nothing.
Thanks
ASKER
Bug two has now changed to the one below, which I actually don't think has to do with the functions mentioned.
The new bug is at line 344, which is:
string strImageUrl = System.Web.HttpContext.Cur
And if you look at the outputs that I return, the key isn't cid:image4; it has a path.
I don't understand how this new error is even occurring. Perhaps you can shed some light?
Thanks