innerHTML of a span containing an IMG object incorrect in IE

Ok, here's the deal: I'm trying to make a user interface where a person can click on a "SPAN", then see a popup with a textbox containing the innerHTML of that SPAN.

So, here's a fun one:
<SPAN onclick='alert(this.innerHTML)'><img id="mylogo" src="/images/logo/fader/small_allsmall.png" alt="LanguageCache!" style="border-width: 0px;"></img></SPAN>

When you click on it in Firefox, you get "<img id="mylogo" src="/images/logo/fader/small_allsmall.png" alt="LanguageCache!" style="border-width: 0px;">"

When you click on it in IE, you get "<IMG id=mylogo style="BORDER-TOP-WIDTH: 0px; BORDER-LEFT-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; BORDER-RIGHT-WIDTH: 0px" alt=LanguageCache! src="/images/logo/fader/small_allsmall.png">"

What the hell????  

My program saves the original innerHTML along with the new HTML added by the user.  Then, on a new page load, it cycles through the elements of the page looking for matches of that saved innerHTML and swaps them out with the new HTML.

Works wonderfully..... oops....  Edits made in IE only show up in IE.  Edits made in Firefox only show up in Firefox.

So.... that's not going to work.

I'm really tired of screwing with this.  Anyone have any ideas?
b0lsc0tt (IT Manager) commented:
Ignoring the style and even element attributes would seem to be acceptable for what you described.  Then it won't matter if the browser views them differently since you won't be writing that.  A "pain" is still probably right  but at least not the big issue you saw.
If you need to provide an option for modifying the attributes in the tag, e.g. the style, then depend on the server script and how it reads the file.  That would be exact and could be modified reliably.
Let us know if you need any more help with this or still have questions about what you asked here.  Good luck with that project.  Sounds interesting although I don't have the time now to explore the page you linked to.
Julian Matz (Joint Chairperson) commented:
How about you use external stylesheets instead of inline CSS?
Danielcmorris (Author) commented:
I'm writing an application that pastes into other people's websites.  Kind of an editing tool.  

So, the idea is that they can cut and paste a single JavaScript include file onto the page and my script will allow them to edit their page's contents.

So.... I have to be able to deal with sloppy code.
I'm surprised you chose to do it that way, since innerHTML is only the HTML representation of the SPAN, you might just as well have used innerText and complained about that!

I'd have an array (JScript) of all my SPANs and store the innerHTML contents there. When the user clicks the SPAN I'd display the data from the array, and when he changes it I'd store it back into the array as well as the SPAN.
Danielcmorris (Author) commented:
The only problem I'm having is that IE seems to change the innerHTML to match its style sheet formats.

So, if my user has <span id=sentence1 >Here is a picture of me <img src="pic.gif" style="border-width:0px"> in Vermont</span>

the results of "sentence1.innerHTML" will be different in IE and Firefox.

Calling alert(sentence1.innerHTML) will replace "border-width:0px" with "BORDER-TOP-WIDTH: 0px; BORDER-LEFT-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; BORDER-RIGHT-WIDTH: 0px"

I NEED to get the original HTML, not the alterations added by IE.

The HTML is NOT mine, I can't change the way these people program, I just need to be able to get that data exactly as it was coded by the original developer.

If I were to use innerText, it wouldn't give me the HTML..... which I NEED, otherwise I'd have used innerText.

Now, as far as an Array..... that's exactly what I have.

Say the user has 3 spans: one says "Hello World", another says "<b>Hello</b> World", and the last says "Hello <img src='world.jpg' style='border-width:0px'>".

Just like you suggested, I already have a script which pulls up the innerHTML of the span and lets the user edit it.  Hitting "save" updates the innerHTML of that span as well as saving it to the Array - along with the original data.

When the page loads, it cycles through the SPAN elements and checks to see if there are any replacements to make.

So, if you clicked on "hello world" and changed it to "hola mundo", when the page opened up, it would have a 2-dimensional array with "hello world" in column 1 and "hola mundo" in column 2.  It would call getElementsByTagName("span"), cycle through those elements, and see if the innerHTML matched "hello world".  If we used innerText, then both "hello world" and "<b>hello</b> world" would match, but that would be wrong.

Once it sees that there is a match for the innerHTML of the first Span, with the data held in the first column of the array, it simply says, "ok, swap out the contents with column2".  Now the page, when reloaded, has "hola mundo" displayed.

Simple enough.
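
A minimal sketch of the match-and-swap pass described above. The function names and the shape of the `replacements` array (pairs of original and edited HTML) are assumptions for illustration, not the actual LanguageCache code.

```javascript
// Hypothetical helper: `replacements` is an array of [original, edited] pairs,
// standing in for the two-column array described in the post.
function findReplacement(currentHtml, replacements) {
  for (const [original, edited] of replacements) {
    // Exact string comparison: this is the step that breaks when two
    // browsers serialize the same markup differently.
    if (currentHtml === original) {
      return edited;
    }
  }
  return null; // no saved edit for this content
}

// `elements` is any iterable of objects with an innerHTML property,
// e.g. document.getElementsByTagName("span") in a browser.
function applyReplacements(elements, replacements) {
  let swapped = 0;
  for (const el of elements) {
    const edited = findReplacement(el.innerHTML, replacements);
    if (edited !== null) {
      el.innerHTML = edited;
      swapped++;
    }
  }
  return swapped; // number of elements whose contents were replaced
}
```

Because the comparison is on the exact string, "hello world" and "<b>hello</b> world" stay distinct, which is the reason for matching on innerHTML rather than innerText.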

The problem is when IE gets to the IMG.  When you click on that span and it has an IMG within it, IE changes the innerHTML to meet the style structures of IE.  So, it saves that altered innerHTML into the array.  Then, when the page reloads, it compares the innerHTML of those spans with the first column of the array.  If it is the same, just like "hello world", it swaps it.  When it gets to the span with the IMG, it is looking for this altered BORDER-TOP-WIDTH: 0px, etc...  that IE put in.  If you are reloading in IE, that's just fine, it works great, but if you try to use that same array that you built in IE when you load it up in Firefox..... the "innerHTML" doesn't contain those IE alterations to the style, so the match isn't made.

All I need to know is HOW do I get the original innerHTML  PRIOR to IE making any alterations.  Maybe at least a work-around that would tell me how to undo the IE changes?
I have understood your problem. I just see no reason for reading back the contents of SPAN tags.

I also see no reason why the property innerHTML should be an "identity", that is, what goes in must come out. My point with innerText was that you could put something in with innerHTML and not expect it to come out with innerText.

Thirdly as far as I can see the extra properties added are just the default properties taken at the time when the image was rendered. I don't believe that they'd really make any difference semantically, although I agree it looks a bit of a mess.

Fourthly why aren't you using an index (id or something) to identify what goes where, instead of comparing contents?

Lastly I thought that IMG was a block element and SPAN a line element, although I don't see why that would make any difference to the attributes being added. But you might just try a test with DIV.
Danielcmorris (Author) commented:
If I use an ID, then what do I do when the user doesn't have his objects ID'd?

<p>hello world</p>

So, the "identifier" is the text.  Period.  There are more reasons beyond identification that have to do with the actual business logic, but we need to work from there.

The contents of the Span tags must be exactly the same regardless of the browser, HTML code or whatever, it must be included - even commented out HTML.  I want everything between <span> and </span>, just as the programmer  typed it in.  That's just part of the business logic.

All I need is a function that will get the original contents.  No changes to the HTML (it isn't mine to begin with), no changes to the business logic (there's almost 100 pages and there are many more than just this feature).

I just need a way to rip out the original code... exactly as it was typed by the programmer.  Exactly.
b0lsc0tt (IT Manager) commented:
I have done some testing and I don't see any way to do this reliably.  The problem is actually bigger than you mentioned.  Also it isn't limited to IE.  The issue is browsers interpret the html and handle it as they are designed to.  When script looks at the DOM it is as the browser has seen it.  The difference in the browsers means there are often differences in what is seen in innerHTML or other DOM properties.
To be specific here what you notice happens in IE happens in the reverse in Firefox.  If you have ...
border-color: black; border-width: 5px; border-style: solid;
... then Firefox's innerHTML, cssText, or attributes will see ...
border: 5px solid black;
The two are equivalent, but what the browser reports differs from what was entered.  There is no way to reliably know how it was entered even in just one browser.  Since they each have the problem in different ways it is even harder to try to do this cross browser.
Luckily the result will be the same for the page.  The style that was entered and the style shown by the browser are the same.  You could make script to guess at what was original and change it to what you think was entered but it would be nothing more than a guess since the style could've been entered exactly as the browser shows it.  The result is the same in either case and there is no way to know if it "modified" it.
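
Since the original text can't be recovered, one alternative is to compare a normalized form of the style instead of the raw string. The sketch below is only an illustration: the function name `canonicalStyle` is made up, and it expands only the single-value `border-width` shorthand seen in this thread. It does not handle the `border` shorthand Firefox produces in the example above, or any other shorthand.

```javascript
// Reduce a style string to a canonical form: lowercase names and values,
// expand single-value border-width into its four longhands, sort by name.
// Note: lowercasing values is an assumption that would be wrong for
// case-sensitive values such as url(...) paths.
function canonicalStyle(cssText) {
  const declarations = new Map();
  for (const part of cssText.split(";")) {
    const colon = part.indexOf(":");
    if (colon === -1) continue; // skip empty or malformed pieces
    const name = part.slice(0, colon).trim().toLowerCase();
    const value = part.slice(colon + 1).trim().toLowerCase();
    if (name === "border-width" && !/\s/.test(value)) {
      // One value applies to all four sides.
      for (const side of ["top", "right", "bottom", "left"]) {
        declarations.set("border-" + side + "-width", value);
      }
    } else {
      declarations.set(name, value);
    }
  }
  return [...declarations.keys()]
    .sort()
    .map((name) => name + ": " + declarations.get(name))
    .join("; ");
}
```

With this, the style as typed ("border-width: 0px;") and the style as IE reports it (the four BORDER-*-WIDTH declarations) reduce to the same string, so they can be compared for equality.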
Let me know if you have a question or need more info.
Danielcmorris (Author) commented:
Well that sure is frustrating.  Maybe I'll parse out the "style" attribute.  Right now, it works Server-side.... where every damn thing always works for me.  I really really really hate Javascript.  I miss the days of thin-client.

Well.  I'm going to see if I can come up with some regular expression to do the search.  I really hope there is some way I can identify the contents of each span/div/P/ etc... somehow.
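
One possible shape for that regular expression, as a rough sketch: drop the style attribute entirely before comparing. The function name is hypothetical, and a regex is not a real HTML parser, so this can also hit text that merely looks like a style attribute. The other IE differences visible in the first post (uppercase tag names, unquoted attribute values) are left as they are.

```javascript
// Remove style="...", style='...' or unquoted style=value from an HTML string.
// Case-insensitive, so it also matches IE's uppercase output.
function stripStyleAttributes(html) {
  return html.replace(/\s+style\s*=\s*("[^"]*"|'[^']*'|[^\s>]+)/gi, "");
}
```

For example, `<img src="pic.gif" style="border-width:0px">` becomes `<img src="pic.gif">`, whichever way the browser rewrote the style value.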

I hate being brought to a dead stop because of a #*%&*@(@ browser issue.  It's just pathetic.
b0lsc0tt (IT Manager) commented:
Let us know how it works or if you have any more questions about this.
You mentioned this is to try to match the element so you can make the change.  Just a note or two about what you said on that.  The innerText property is IE only so would cause an even bigger issue if you were to rely on it.  Instead of trying to do a match to get the right element using its html or contents, why not use its order in the DOM?  If there is an ID then use it.  However if there isn't then use document.getElementsByTagName to get the collection of those elements and match the index.  The modifications made by the user should not change this and it seems to be a very reliable way to get the right element.
Avoid the elements collection though because there are cases where that can be different from browser to browser.
Let me know if you have a question about any of this.
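
A sketch of the ID-or-index approach suggested above. The names `locateElement` and `resolveElement` and the shape of the returned key are assumptions for illustration. `doc` is expected to behave like `document`, providing `getElementById` and `getElementsByTagName`.

```javascript
// Build a key for an element: its id if it has one, otherwise its tag name
// and its position among all elements with that tag name.
function locateElement(doc, el) {
  if (el.id) {
    return { id: el.id };
  }
  const tagName = el.tagName.toLowerCase();
  const all = doc.getElementsByTagName(tagName);
  for (let i = 0; i < all.length; i++) {
    if (all[i] === el) {
      return { tagName: tagName, index: i };
    }
  }
  return null; // element is not in this document
}

// Find the element for a previously saved key, or null if it is gone.
function resolveElement(doc, key) {
  if (key.id) {
    return doc.getElementById(key.id);
  }
  const all = doc.getElementsByTagName(key.tagName);
  return key.index < all.length ? all[key.index] : null;
}
```

The key depends only on document structure, so it is the same in every browser, but it stops pointing at the right element if the page author adds or removes elements of that tag before it.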
Danielcmorris (Author) commented:
Well, I've been using getElementsByTagName to get the text right now.  The program is actually a language caching script.  So, if you're getting your site translated by Systran, and you want some custom work done, say the links in the menu, a few paragraphs about your company, some slogans or company terms that you want translated perfectly... things like that.  We keep it in a database.

When the user selects Spanish, it cycles through all the elements and calls a LanguageCache script to look for a Spanish translation of the phrases of each element.  If there isn't a translation, it redirects the request for a Spanish translation to Google or Systran.

This also includes objects, like Images or even full Divs.  So, if you want a custom menu for your products which has specific Chinese names, you want all the image buttons to be changed to the Chinese image buttons, you want to remove a few links to products which aren't available in Chinese, all you need to do is open the site in the LanguageCache designer, click on that div... it opens up and displays all the HTML (exactly), then you can alter it as you want, check GLOBAL, and click save.  The database saves the entire contents of that DIV, and saves the new HTML that you've altered as well.

The next time someone loads it and chooses Chinese, the system hits that Div... checks for a match to the English contents of that DIV, finds those contents in the DB, gets the ID for the english, then goes to the translation table, looks for that ID and the ID for Chinese.  If it finds a record, it just swaps out the contents.

Since it is a "Global" change for the domain, every single page with that menu on it will be available in Chinese - custom.  

If, from the language dropdown, they were to select "Greek", and you didn't have a custom menu for Greek, then it would simply redirect the text contents of the objects within that DIV to google or Systran.
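
The lookup just described could be sketched roughly like this, with two in-memory Maps standing in for the phrase table and the translation table. The function name, the Map layout, and the "phraseId:languageId" key format are all assumptions for illustration. The real system is a database and is not shown in this thread.

```javascript
// phrases:      Map from original (English) HTML to a phrase ID
// translations: Map from "phraseId:languageId" to the custom translated HTML
// Returns the custom translation, or null when there is none and the caller
// should fall back to machine translation (Google or Systran in the post).
function lookupTranslation(originalHtml, languageId, phrases, translations) {
  const phraseId = phrases.get(originalHtml);
  if (phraseId === undefined) {
    return null; // contents were never saved, so nothing custom exists
  }
  const translated = translations.get(phraseId + ":" + languageId);
  return translated === undefined ? null : translated;
}
```

The first lookup is keyed on the element's contents, which is why the browser-specific innerHTML matters: if the stored English contents were captured in IE, a Firefox visitor's contents will not find the phrase ID.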

It seems like a real pain, but it is actually pretty easy.  All the webmaster of the site needs to do is paste in a single JavaScript include file.  The database is really straightforward, the AJAX for Google and Systran is designed by them, and the text content moving to and from the client site is negligible.

However....  if you have a Global matching pair, and it is.... say, an image, that image may be placed anywhere on the page, so I can't really use element order.  The key really was getting the contents to match.  Now I think I may need to have a set of regular expressions to filter out styles of individual objects.

What a pain.

If you'd like, you can check out.  It's still in extreme rough-draft, but I've actually got it working on 5 sites  (each installed in a single day)

Unfortunately, I generally do database systems that are Internal - inventory/call management/OLAP - all even designed for blind people!  (government regs)  So don't expect anything remotely professional!  I'm going to have to hire a real web developer as soon as I get some decent revenue from my existing clients.
Danielcmorris (Author) commented:
Thanks for all the help.  -dan
b0lsc0tt (IT Manager) commented:
You're welcome!  I'm glad I could help.  Thanks for the grade, the points and the fun question.
Danielcmorris (Author) commented:
Anytime.  Sometimes just having fresh eyes look at a problem can make things a lot easier.

Or.... make me feel that I'm not a total idiot who can't figure out something simple.  If someone else can't come up with a quick solution I know I've got an excuse for tossing that laptop out the window!

Luckily, I've been sitting on a beach in Sicily working on this project, so it isn't nearly as stressful as my normal office.  :)

b0lsc0tt (IT Manager) commented:
I've been sitting on a beach in Sicily working on this project
How can you even use the word "pain" then?? :D  I've never been there but will let myself imagine it is very nice.  A beach (just about) anywhere beats an office any day so you are a lucky one.  Growing up and living pretty close to one makes it so I love them.  Enjoy it!  It couldn't really be that stressful on a beach. ;)
Danielcmorris (Author) commented:
Just for fun, I'll tell you my awful hack that actually works.  :)

So, I used a regex to strip out the extra style junk that IE puts in there.  However..... it turns out that IE also reorders the attributes, so the "STYLE" attribute is now near the front, while in the original it was at the end..... nice, so that didn't work out worth a damn.

What to do???

You're going to love this.

I split(",") the entire paragraph or whatever the innerHTML was into an array, then used sort() to sort the array, then I used join() to slap it back together.

I did the same to my copy and, tada, as long as I don't get a bunch of innerHTML that happens to have the exact combination of letters, numbers and symbols, I'm good to go.  

It is surprisingly fast!
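
A sketch of that hack, under one loudly stated assumption: the split is taken to be per character, i.e. split(""), rather than on commas as written above, because the remark about "the exact combination of letters, numbers and symbols" only makes sense for a character-level sort. The function names are made up for illustration.

```javascript
// Order-insensitive signature: the same characters in any order give the
// same result, so reordered attributes no longer break the match.
function sortedSignature(html) {
  return html.split("").sort().join("");
}

// Compare two HTML strings by signature instead of by exact text.
function sameCharacters(a, b) {
  return sortedSignature(a) === sortedSignature(b);
}
```

The trade-off is the one already admitted: any two strings that are anagrams of each other will match, whether or not they are the same markup. It also only absorbs reordering, so case, quoting and style differences still have to be stripped or normalized first.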