notepad loses formatting after saving changes

I do some translation work. When I download the file and do not change anything, it saves normally, formatting and all.
But when I translate some strings, all the formatting gets lost when I save the file.
I have been forced to use another program instead of the normal Windows Notepad as a workaround, but now I'd like to know WHY it happens and whether there is a cure, since it keeps coming back (I believe I even had the problem on my 32-bit OS).

I noticed it happens only with the file for that one program; other files do NOT have the problem.
The file is in UTF-8 format.

David Johnson, CD, MVP (Owner) commented:
the file is in UTF-8 format

Did you forget to change the encoding?
nobus (Author) commented:
What do you mean? It is in UTF-8 and I did not change anything.
What am I supposed to change?
The options are Unicode, Unicode big endian, and ANSI.
What ve3ofa means is that when you save the Notepad file, check that you select UTF-8 in the drop-down box when using Save As.

The options I get in my Notepad Save As are:
ANSI, Unicode, Unicode big endian and UTF-8.

If UTF-8 is missing from the notepad save as option list, please let us know. Perhaps it is due to the version of Windows you have.

Another very good notepad application which can do the same is Notepad++.

nobus (Author) commented:
If you care to read, I posted: it is in UTF-8 and I did not change anything, meaning it is saved as UTF-8.
And I know Notepad++; it's the one I used before. As I said, I want to find out WHY it happens only with this file.
I understand that. The question I asked was: What version of Windows are you using?

And do you get the option to save as UTF-8 in your notepad version?

I do understand that the file was previously UTF-8 and that it should still be in UTF-8. A file usually loses its formatting only when its encoding changes. That is why we need to know whether your Notepad still saves as UTF-8.
nobus (Author) commented:
I said I did not change it, SO IT WAS SAVED AS UTF-8.
Yes, the option is there; I meant the OTHER options were the 3 I posted.
The Windows version I use is posted in my question.
I am sorry, I cannot help you further with this problem.

I have no idea why a file would not save the layout properly, especially for this one type of file.

I wish you luck in finding the answer.
Are you sure it's actually "losing" the formatting, or is it really just not displaying the formatting that was previously shown in Notepad?

i.e. can you copy and paste it from Notepad to a web form (into a Post a Comment box here, for example), and the formatting magically returns?
nobus (Author) commented:
Here is how it looks AFTER saving the file:
Hi Nobus, could you please upload a copy of the file from before saving?

Well, that's not really what I asked.
e.g. here's what a procedure to make a bootable USB for Dell XP-SP3 OEM looks like in Notepad after saving it, but below that is the same procedure, simply copied and pasted from Notepad to this comment box. Notice how the formatting that 'seems' missing in Notepad has magically reappeared.

How the procedure below 'looks' in Notepad

/media/MBW00      nfs      rw,timeo=10,intr      0      0

[root@AX4P3000 ~]# fdisk /dev/sdb

Command (m for help): d
Selected partition 1

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
Partition number (1-4): 1
First cylinder (1-124, default 1):
Using default value 1
Last cylinder, +cylinders or +size{K,M,G} (1-124, default 124):
Using default value 124

Command (m for help): t
Selected partition 1
Hex code (type L to list codes): 6
Changed system type of partition 1 to 6 (FAT16)

Command (m for help): a
Partition number (1-4): 1

Command (m for help): w
The partition table has been altered!

    (Remove USB stick and re-insert)

[root@AX4P3000 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda3              50G   41G  6.5G  87% /
tmpfs                 2.0G  384K  2.0G   1% /dev/shm
/dev/sda2             1.2G  132M 1020M  12% /boot
/dev/sda5             4.1G  3.9G   71M  99% /home
/dev/sda1              47G   26G   22G  55% /media/win7-666
/dev/sr1              5.5M  5.5M     0 100% /media/U3 System
/dev/sdb1             971M  4.0K  971M   1% /media/3810-1D5D

[root@AX4P3000 ~]# umount /dev/sdb1

[root@AX4P3000 ~]# mkdosfs -F 32 -n XPPro /dev/sdb1
mkdosfs 3.0.9 (31 Jan 2010)

[root@AX4P3000 ~]# dd if=/Temp/Dell/Dell-XPPro-wSP3.iso of=/dev/sdb bs=8M
85+1 records in
85+1 records out
718554816 bytes (719 MB) copied, 106.441 s, 6.8 MB/s

/usr/bin/livecd-iso-to-disk --format --reset-mbr /Temp/Dell/Dell-XPPro-wSP3.iso /dev/sdb

qemu -hda /dev/sdf -m 256 -vga std
Hmmm...  I'm thinking that sample NFS /etc/fstab 'mount' command doesn't actually belong to that bootable USB procedure...  it just got mixed in because the formatting appears to be missing in Notepad.
nobus (Author) commented:
Darr247, I think your idea is good: when I copy the (let's call it) scrambled text to the comment box, the formatting reappears.
So the good thing is that it's still there.
Only point left: how to get it back in Notepad?
Weirdly, if I paste it into Wordpad, the formatting is there.
Then, in Wordpad, if I use Save as on the file menu 'tab', and choose 'Other formats' then in Save as type select Unicode Text Document (with .txt extension), I get a warning that it's going to lose all formatting, which I OK through, then finish the Save... when I double-click it, it still opens in Notepad (because of the .txt association), but now the formatting is visible again.

There is no UTF-8 option in Wordpad...  but I *think* UTF-8 is a subset of Unicode, isn't it?
nobus (Author) commented:
Where is WordPad? I have Office 2010.
Orb/Start -> All Programs -> Accessories

The path to it in my win7 x64 is "%ProgramFiles%\Windows NT\Accessories\wordpad.exe"

But if you have Word 2010, that should work too.
Nobus, any chance you can upload one of the files from before you save it in Notepad?
Then we can find out the original encoding that's in it.
nobus (Author) commented:
I copied it into WordPad, but it was also without formatting.
I attach the original file.
The whole thing reminds me of when I take a text file made in linux (which has only the 0x0A <LF> line terminator instead of 0x0D0A <CR><LF>) and open it in notepad.

You might try opening the subject files in a HEX editor and see what's actually there.
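
For anyone without a hex editor to hand, a few lines of Python can stand in for one. This is only a sketch: the `hex_dump` helper and the sample bytes are made up for illustration. It prints offsets, hex values, and printable characters so that 0A versus 0D 0A line endings become visible.

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Return an offset / hex / ASCII dump of data, `width` bytes per row."""
    rows = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        # Show printable ASCII as-is and everything else (CR, LF, BOM bytes) as a dot.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        rows.append(f"{offset:08X}  {hex_part.ljust(width * 3 - 1)}  {text_part}")
    return "\n".join(rows)


# Hypothetical sample: a Linux-style line (LF only) followed by a Windows-style line (CR LF).
sample = b"one=1\ntwo=2\r\n"
print(hex_dump(sample))
```

To look at a real file, read it in binary mode first (for example `open(path, "rb").read()`, with `path` pointing at your own file) and pass the bytes to `hex_dump`.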
I just opened Notepad, hit Enter, did Save As, chose Unicode as the encoding and named it Enter-Unicode.txt, then closed Notepad.
Repeated that but choosing UTF-8 encoding and ANSI encoding, completely closing Notepad at the end of making each file.

Here's the 'dir' of the 3 files:

2012-03-17  04:56                 2 Enter-ANSI.txt
2012-03-17  04:55                 6 Enter-Unicode.txt
2012-03-17  04:56                 5 Enter-UTF-8.txt

Will open them in a HEX editor, next, because I *really* want to see how 1 Enter uses 6 bytes.
Quite a difference, though all 3 actually have 0x0D0A rather than just 0x0A.

1 Enter character; 3 encodings
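
The three sizes in that listing can be reproduced without Notepad. A small sketch, assuming that Notepad's "ANSI" is Windows code page 1252 (it depends on the system locale) and that its "Unicode" is UTF-16 little endian with a byte order mark:

```python
import codecs

enter = "\r\n"  # one press of Enter in Notepad: CR followed by LF

# Assumed mapping of Notepad's Save As names to concrete encodings.
encodings = {
    "ANSI": enter.encode("cp1252"),                              # no BOM
    "Unicode": codecs.BOM_UTF16_LE + enter.encode("utf-16-le"),  # 2-byte BOM
    "UTF-8": codecs.BOM_UTF8 + enter.encode("utf-8"),            # 3-byte BOM
}

for name, data in encodings.items():
    print(f"{name:8} {len(data)} bytes  {data.hex(' ').upper()}")
```

Under those assumptions the lengths come out as 2, 6 and 5 bytes, matching the 'dir' listing: the extra bytes are the byte order mark, plus the two-bytes-per-character width of UTF-16.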
David Johnson, CD, MVP (Owner) commented:
What you are seeing is the byte order mark (BOM), which can cause some problems.

I did notice that Notepad++ calls the original file a Macintosh UTF-8 file.

Note: for Macintosh UTF-8, presumably new lines are just 0A, not 0D0A.
Actually, on further review, it has the BOM EF BB BF and its newlines are 0D 0D 0A (CR/CR/LF); when it is saved with Notepad, the single newlines are stripped out. Notepad++ keeps the formatting as per the original.

With a straight save in WordPad, the BOM is stripped, and so are the newlines, as with Notepad.
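
To check a file for the same two findings (the EF BB BF byte order mark and the 0D 0D 0A line endings) without a hex editor, something like the following should do. It is a sketch only; `describe` and the sample bytes are hypothetical, not taken from the attached file.

```python
import codecs
import re

# Longest sequence first, so CR CR LF is not miscounted as CR plus CR LF.
LINE_ENDING = re.compile(rb"\r\r\n|\r\n|\n|\r")
NAMES = {b"\r\r\n": "CR CR LF", b"\r\n": "CR LF", b"\n": "LF", b"\r": "CR"}


def describe(data: bytes) -> dict:
    """Report whether data starts with a UTF-8 BOM and count each line-ending style."""
    has_bom = data.startswith(codecs.BOM_UTF8)
    body = data[len(codecs.BOM_UTF8):] if has_bom else data
    counts = {name: 0 for name in NAMES.values()}
    for match in LINE_ENDING.finditer(body):
        counts[NAMES[match.group()]] += 1
    return {"bom": has_bom, **counts}


# Hypothetical sample: BOM, two CR CR LF endings, one ordinary CR LF ending.
sample = codecs.BOM_UTF8 + b"a=1\r\r\nb=2\r\r\nc=3\r\n"
print(describe(sample))
```

A file like the one discussed here would be expected to show the BOM and a non-zero "CR CR LF" count.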
nobus (Author) commented:
And what is a solution then, ve3ofa? (Where did you find that nickname?)
nobus (Author) commented:
I want to add: if I do not change anything in the file, I can save it without having the problem; only when I replace the string after the = sign and save it does the formatting get lost.
David Johnson, CD, MVP (Owner) commented:
Well, for this application I'd use Notepad++ and not WordPad/Notepad or even Word. What was really surprising was the 2 x CR and then the LF in the original file. As for the handle, it is/was my amateur radio callsign. I'm an advanced amateur without a radio.
As people have pointed out, it is standard UTF-8 Encoding with a <CR><CR><LF> at the end of a line.
I have also been able to replicate your issue in notepad and fix it.
Start notepad and turn word wrap off. (NO File loaded, just notepad itself)
Close notepad.

Then try opening your text document in Notepad, editing it, and saving it. (No need for Save As, even.)
With Word Wrap turned on, Notepad seems to strip the <CR><CR><LF> when you save.
When it is turned off, it does not.
A blog article pointed me to Word Wrap doing something funny with <CR><CR><LF>: it seems that, to do the word wrapping on screen, Notepad inserts <CR><CR><LF>, but strips <CR><CR><LF> when saving. Because your file has <CR><CR><LF> at the end of each line, those get stripped when saving with Word Wrap on.
If your file had the normal <CR><LF> at the end of lines, it would not be an issue.

This can be tested simply by opening the file in Notepad with Word Wrap on.
Then turn Word Wrap off. All the <CR><CR><LF>s get removed, and with them the formatting of your file, without you changing a thing other than turning off Word Wrap in Notepad.
Just stick with Word Wrap turned off in Notepad for those files and you will be fine. The formatting and the <CR><CR><LF> will remain as they were.
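
The effect described above can be mimicked with a toy model. This is not Notepad's actual code, only an assumption based on the behaviour observed in this thread: with Word Wrap on, every <CR><CR><LF> is treated as a soft break and dropped on save. `simulate_save` and the sample strings are invented for the illustration.

```python
SOFT_BREAK = b"\r\r\n"  # the sequence Notepad reportedly uses for on-screen wrapping


def simulate_save(data: bytes, word_wrap: bool) -> bytes:
    """Toy model: with Word Wrap on, drop every CR CR LF; otherwise save unchanged."""
    if word_wrap:
        return data.replace(SOFT_BREAK, b"")
    return data


# Hypothetical translation file whose real line endings happen to be CR CR LF.
original = b"greeting=Hello\r\r\nfarewell=Goodbye\r\r\n"

print(simulate_save(original, word_wrap=False))  # line endings survive
print(simulate_save(original, word_wrap=True))   # lines run together
```

In this model a file with ordinary <CR><LF> endings passes through untouched either way, which fits the observation that only this one file is affected.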

nobus (Author) commented:
OK guys, this looks like it's going in the right direction.
To summarise, I have 2 solutions: use Notepad++, or use Notepad with Word Wrap off, right?

Last questions: do you see any way to get the "scrambled" file to display properly again?

And if I want to change the Notepad exe, which hex editor do you recommend for that?
Wow@ the notepad bug. And it's not just in XP... I'm running Win7. I saw it happening and didn't even realize Notepad was doing it. The Notepad (v6.1.7600.16385, 192,536 bytes) in my Win7 x64 ( I don't know how to tell if it's actually a 64-bit application) does not have the string that blog article cites, so it apparently can't be fixed.

The method I described earlier -- paste into Wordpad, Save As... 'Other' ->Unicode, then open in Notepad (with Word Wrap turned off, ALWAYS from now on, I guess) and then back in Notepad using File->Save As... 'UTF-8'  seems like it should convert them back, but I don't have any files to test it on that were UTF-8 to begin with (I've always just used Unicode, not UTF-8).
last questions : do you see any way to get the "scrambled" file to display properly again?

No. once the <CR><CR><LF> are gone there is no way to put them back other than manually going through the file and hitting enter where it's needed.(And that will put in the standard <CR><LF>)

And you have a 3rd option: use a hex editor or advanced text editor to replace all occurrences of <CR><CR> with <CR>.
That would bring the file back in line with standard text, and then Notepad with Word Wrap turned on will be fine. But your other 2 options are correct: Notepad with Word Wrap turned off, or another editor like Notepad++.
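
That 3rd option can also be scripted. A minimal sketch in Python, with hypothetical function and file names; it is slightly narrower than replacing every <CR><CR>, in that it only collapses runs of CR that sit directly in front of an LF. It works on raw bytes, so the UTF-8 text and the BOM are left alone.

```python
import re
from pathlib import Path


def normalize_line_endings(data: bytes) -> bytes:
    """Collapse any run of CRs followed by LF into a single CR LF."""
    return re.sub(rb"\r+\n", b"\r\n", data)


def normalize_file(source: str, target: str) -> None:
    """Read `source` as raw bytes and write the normalized copy to `target`."""
    Path(target).write_bytes(normalize_line_endings(Path(source).read_bytes()))


# Hypothetical sample with a UTF-8 BOM and CR CR LF endings.
sample = b"\xef\xbb\xbfgreeting=Hello\r\r\nfarewell=Goodbye\r\r\n"
print(normalize_line_endings(sample))
```

This has to be run on a copy of the file that still contains its <CR><CR><LF> endings, for example `normalize_file("original.txt", "fixed.txt")` with your own file names; it cannot bring back line breaks that have already been stripped.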

And you are correct, my Win7 64 bit notepad does not have the hex string either, but that is not surprising as they would not just be re-compiling it for all these years.
It can be fixed, just a different hex string so notepad would need to be debugged again to find the correct point to make the change. I'd not really call it a bug, just the method Microsoft has chosen to do word wrap so it is easily removable when saving. Unfortunately the file is using <CR><CR><LF> for end of line.
nobus (Author) commented:
So the Control characters are effectively removed?

>>  , use a hex or advanced text editor   <<   Which one do you recommend?
David Johnson, CD, MVP (Owner) commented:
For your purposes, an advanced text editor; a hex editor is just too clunky.
nobus (Author) commented:
OK, but what should I use?
Notepad++ maybe?
As shown in the screen grab posted earlier, I use XVI32 in win7 x64 without any problems.

Note I adjust the window size so it shows 16 bytes wide... that way it makes the offset addresses shown on the left side on the standard boundaries.
nobus (Author) commented:
Hey Darr, thanks for that post; I will definitely try it.

nobus (Author) commented:
I was VERY pleased with the ideas, solutions, and help offered.
I sure would like to hand out more points, since you helped me understand an OLD problem that kept reappearing.

Thanks to all!