
Notepad loses formatting after saving changes

I do some translation work. When I download the file and do not change anything, it saves normally, formatting and all.
But when I translate some strings, all formatting gets lost when I save the file.
I have been forced to use another program instead of the normal Windows Notepad as a workaround, but right now I'd like to know WHY it happens and whether there is a cure, since it keeps coming back (I believe I even had the problem on my 32-bit OS).

I noticed that it happens only with the file for that program; others do NOT have the problem.
The file is in UTF-8 format.

David Johnson, CD, Simple Geek from the '70s
CERTIFIED EXPERT
Distinguished Expert 2019

Commented:
the file is in UTF-8 format

Did you forget to change the encoding?
CERTIFIED EXPERT
Distinguished Expert 2019

Author

Commented:
What do you mean? It is in UTF-8 and I did not change anything.
What am I supposed to change?
Options are: Unicode, Unicode big endian, ANSI.

Commented:
What ve3ofa means is that when you save the Notepad file, check that you select UTF-8 in the dropdown box when using Save As.

The options I get in my notepad Save As are:
ANSI, Unicode, Unicode big endian and UTF-8.

If UTF-8 is missing from the notepad save as option list, please let us know. Perhaps it is due to the version of Windows you have.

Another very good notepad application which can do the same is Notepad++.
CERTIFIED EXPERT
Distinguished Expert 2019

Author

Commented:
If you care to read, I posted: it is in UTF-8 and I did not change anything, meaning it is saved as UTF-8.
And I know Notepad++, it's the one I used before. As said, I want to find out WHY it happens only on this file.

Commented:
I understand that. The question I asked was: What version of Windows are you using?

And do you get the option to save as UTF-8 in your notepad version?

I do understand that the file was previously UTF-8 and that it should still be in UTF-8. A file usually only loses its formatting when its encoding type changes. That is why we need to know if your Notepad still saves as UTF-8.
CERTIFIED EXPERT
Distinguished Expert 2019

Author

Commented:
I said I did not change it, SO IT WAS SAVED AS UTF-8.
Yes, the option is there; I meant the OTHER options were the 3 I posted.
The Windows I use is posted in my question.

Commented:
I am sorry, I cannot help you further with this problem.

I have no idea why a file would not save the layout properly, especially for this one type of file.

I wish you luck in finding the answer.
CERTIFIED EXPERT
Commented:
Are you sure it's actually "losing" the formatting, or is it really just not displaying the formatting that was previously shown in Notepad?

i.e. can you copy and paste it from Notepad to a web form (into a Post a Comment box here, for example), and the formatting magically returns?
CERTIFIED EXPERT
Distinguished Expert 2019

Author

Commented:
Here is how it looks AFTER saving the file:
goed.txt
CERTIFIED EXPERT

Commented:
Hi Nobus, could you please upload a copy of the file from before saving?

Thanks
Tel
CERTIFIED EXPERT

Commented:
Well, that's not really what I asked.
e.g. here's what a procedure to make a bootable USB for Dell XP-SP3 OEM looks like in Notepad after saving it, but below that is the same procedure, simply copied and pasted from Notepad to this comment box. Notice how the formatting that 'seems' missing in Notepad has magically reappeared.

How procedure below 'looks' in Notepad

192.168.23.126:/nfs/Public      /media/MBW00      nfs      rw,timeo=10,intr      0      0


[root@AX4P3000 ~]# fdisk /dev/sdb

Command (m for help): d
Selected partition 1

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-124, default 1):
Using default value 1
Last cylinder, +cylinders or +size{K,M,G} (1-124, default 124):
Using default value 124

Command (m for help): t
Selected partition 1
Hex code (type L to list codes): 6
Changed system type of partition 1 to 6 (FAT16)

Command (m for help): a
Partition number (1-4): 1

Command (m for help): w
The partition table has been altered!

    (Remove USB stick and re-insert)

[root@AX4P3000 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda3              50G   41G  6.5G  87% /
tmpfs                 2.0G  384K  2.0G   1% /dev/shm
/dev/sda2             1.2G  132M 1020M  12% /boot
/dev/sda5             4.1G  3.9G   71M  99% /home
/dev/sda1              47G   26G   22G  55% /media/win7-666
/dev/sr1              5.5M  5.5M     0 100% /media/U3 System
/dev/sdb1             971M  4.0K  971M   1% /media/3810-1D5D

[root@AX4P3000 ~]# umount /dev/sdb1

[root@AX4P3000 ~]# mkdosfs -F 32 -n XPPro /dev/sdb1
mkdosfs 3.0.9 (31 Jan 2010)

[root@AX4P3000 ~]# dd if=/Temp/Dell/Dell-XPPro-wSP3.iso of=/dev/sdb bs=8M
85+1 records in
85+1 records out
718554816 bytes (719 MB) copied, 106.441 s, 6.8 MB/s







/usr/bin/livecd-iso-to-disk --format --reset-mbr /Temp/Dell/Dell-XPPro-wSP3.iso /dev/sdb

qemu -hda /dev/sdf -m 256 -vga std
CERTIFIED EXPERT

Commented:
Hmmm...  I'm thinking that sample NFS /etc/fstab 'mount' command doesn't actually belong to that bootable USB procedure...  it just got mixed in because the formatting appears to be missing in Notepad.
CERTIFIED EXPERT
Distinguished Expert 2019

Author

Commented:
Darr247, I think your idea is good: when I copy the (let's call it scrambled) txt to the comment box, the formatting reappears.
So the good thing is, it's still there.
Only point left: how to get it back in Notepad.
CERTIFIED EXPERT

Commented:
Weirdly, if I paste it into Wordpad, the formatting is there.
Then, in Wordpad, if I use Save as on the file menu 'tab', and choose 'Other formats' then in Save as type select Unicode Text Document (with .txt extension), I get a warning that it's going to lose all formatting, which I OK through, then finish the Save... when I double-click it, it still opens in Notepad (because of the .txt association), but now the formatting is visible again.

There is no UTF-8 option in Wordpad...  but I *think* UTF-8 is one of the Unicode encodings, isn't it?
CERTIFIED EXPERT
Distinguished Expert 2019

Author

Commented:
Where is Wordpad? I have Office 2010.
CERTIFIED EXPERT

Commented:
Orb/Start -> All Programs -> Accessories

The path to it in my win7 x64 is "%ProgramFiles%\Windows NT\Accessories\wordpad.exe"

But if you have Word 2010, that should work too.
CERTIFIED EXPERT

Commented:
Nobus, any chance you can upload one of the files from before you save it in Notepad?
Then we can find out the original encoding that's in it.
CERTIFIED EXPERT
Distinguished Expert 2019

Author

Commented:
I copied it into Wordpad, but it was also without formatting.
I attach the original file:
LangFile.txt
CERTIFIED EXPERT
Commented:
The whole thing reminds me of when I take a text file made in Linux (which has only the 0x0A <LF> line terminator instead of 0x0D0A <CR><LF>) and open it in Notepad.

You might try opening the subject files in a HEX editor and see what's actually there.
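
If a hex editor is not at hand, a few lines of Python can tally the line terminators instead. This is only a minimal sketch: the function name `count_line_endings` and the sample bytes are made up for illustration, and it assumes a single-byte or UTF-8 file (in a UTF-16 file each terminator byte is padded with a zero byte, so the patterns would not match).

```python
import re

# Longest pattern first, so <CR><CR><LF> is not miscounted as <CR> plus <CR><LF>.
_TERMINATOR = re.compile(rb"\r\r\n|\r\n|\n|\r")

_NAMES = {b"\r\r\n": "CRCRLF", b"\r\n": "CRLF", b"\n": "LF", b"\r": "CR"}


def count_line_endings(data: bytes) -> dict:
    """Count each kind of line terminator found in raw file bytes."""
    counts = {name: 0 for name in _NAMES.values()}
    for match in _TERMINATOR.finditer(data):
        counts[_NAMES[match.group()]] += 1
    return counts


# Hypothetical sample containing one terminator of each kind.
sample = b"a=1\r\nb=2\nc=3\r\r\nd=4\r"
print(count_line_endings(sample))
```

To check a real file, read it in binary mode first, for example `open(path, "rb").read()`, and pass the result in.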
CERTIFIED EXPERT
Commented:
I just opened Notepad, hit Enter, did Save As, chose Unicode as the encoding and named it Enter-Unicode.txt, then closed Notepad.
Repeated that but choosing UTF-8 encoding and ANSI encoding, completely closing Notepad at the end of making each file.

Here's the 'dir' of the 3 files:

2012-03-17  04:56                 2 Enter-ANSI.txt
2012-03-17  04:55                 6 Enter-Unicode.txt
2012-03-17  04:56                 5 Enter-UTF-8.txt

Will open them in a HEX editor, next, because I *really* want to see how 1 Enter uses 6 bytes.
CERTIFIED EXPERT

Commented:
Quite a difference, though all 3 actually have 0x0D0A rather than just 0x0A.

1 Enter character; 3 encodings
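
The three file sizes in the 'dir' listing can be reproduced without Notepad. The sketch below rests on assumptions: that Notepad's "Unicode" means UTF-16 little endian with an FF FE byte order mark, that its "UTF-8" adds the EF BB BF mark, and that "ANSI" is the Western European code page cp1252. The helper name `notepad_bytes` is invented for this example.

```python
import codecs

ENTER = "\r\n"  # one press of Enter in Notepad: <CR><LF>


def notepad_bytes(text: str, encoding_name: str) -> bytes:
    """Approximate the bytes Notepad writes for each Save As encoding."""
    if encoding_name == "ANSI":
        # Assumption: Western European code page; no byte order mark.
        return text.encode("cp1252")
    if encoding_name == "Unicode":
        # Assumption: UTF-16 little endian, preceded by the FF FE mark.
        return codecs.BOM_UTF16_LE + text.encode("utf-16-le")
    if encoding_name == "UTF-8":
        # Assumption: UTF-8 preceded by the EF BB BF mark.
        return codecs.BOM_UTF8 + text.encode("utf-8")
    raise ValueError("unknown encoding name: " + encoding_name)


for name in ("ANSI", "Unicode", "UTF-8"):
    data = notepad_bytes(ENTER, name)
    hex_dump = " ".join(f"{byte:02x}" for byte in data)
    print(f"{name:8} {len(data)} bytes: {hex_dump}")
```

Under those assumptions the extra bytes are all byte order mark: 2 bytes of mark plus 2 bytes per character for "Unicode", and 3 bytes of mark plus 1 byte per character for UTF-8.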
David Johnson, CD, Simple Geek from the '70s
CERTIFIED EXPERT
Distinguished Expert 2019
Commented:
What you are seeing is the Byte Order Mark, which can cause some problems.

I did notice that Notepad++ calls the original file a Macintosh UTF-8 file.

Note: Macintosh format presumably means the newlines are just 0d (CR), not 0d 0a.
Actually, on further review, it has the BOM EF BB BF and the newlines are 0d 0d 0a (CR/CR/LF), and when saved with Notepad the single newlines are stripped out. Notepad++ keeps the formatting as per the original.

With a straight save in Wordpad the BOM is stripped, and the newlines too, same as with Notepad.
CERTIFIED EXPERT
Distinguished Expert 2019

Author

Commented:
And what is a solution then, ve3ofa? (Where did you find that nickname?)
CERTIFIED EXPERT
Distinguished Expert 2019

Author

Commented:
I want to say: if I do not change anything in the file, I can save it without having the problem; only when I replace the string after the = sign and save it does the formatting get lost.
David Johnson, CD, Simple Geek from the '70s
CERTIFIED EXPERT
Distinguished Expert 2019

Commented:
Well, for this application I'd use Notepad++ and not Wordpad/Notepad or even Word. What was really surprising was the 2 x CR and then the LF in the original file. As for the handle, it is/was my amateur radio callsign. I'm an advanced amateur without a radio.
CERTIFIED EXPERT
Commented:
As people have pointed out, it is standard UTF-8 Encoding with a <CR><CR><LF> at the end of a line.
I have also been able to replicate your issue in notepad and fix it.
Start notepad and turn word wrap off. (NO File loaded, just notepad itself)
Close notepad.

Then try opening your text document in notepad and editing it, and then saving it.(No need for Save as even)
Word wrap turned on seems to be stripping the <CR><CR><LF> when you save it.
When it is turned off it does not.

http://bavih.blogspot.com/2008/07/notepad-bug.html
pointed me to word wrap doing something funny with <CR><CR><LF>. It seems that, to do the word wrapping on screen, Notepad inserts <CR><CR><LF>, but strips <CR><CR><LF> when saving. Because your file has <CR><CR><LF> at the end of each line, those get stripped when saving with word wrap on.
If your file had the normal <CR><LF> at the end of lines it would not be an issue.

This can be tested simply by having word wrap on in notepad and opening the file.
Then turn word wrap off. All <CR><CR><LF>'s get removed and also the formatting of your file without you changing a thing other than turning off word wrap in notepad.
Just stick with word wrap turned off in notepad for those files and you will be fine. Formatting and <CR><CR><LF> will remain as it was.

Cheers
Tel
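
The behaviour described above can be modelled in a couple of lines. This is a toy model of the description in this comment, not Notepad's actual code; the function name `save_with_word_wrap_on` and the sample strings are hypothetical.

```python
def save_with_word_wrap_on(data: bytes) -> bytes:
    """Toy model: drop every <CR><CR><LF>, as if each were a soft wrap marker."""
    return data.replace(b"\r\r\n", b"")


# Hypothetical language file whose lines end in <CR><CR><LF>.
odd_file = b"Hello=Hallo\r\r\nBye=Dag\r\r\n"
# The same content with the normal <CR><LF> line endings.
normal_file = b"Hello=Hallo\r\nBye=Dag\r\n"

print(save_with_word_wrap_on(odd_file))     # every line break is lost
print(save_with_word_wrap_on(normal_file))  # comes back unchanged
```

In this model the file with normal line endings survives untouched, while the <CR><CR><LF> file collapses into one long line, which matches the "scrambled" result reported here.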
CERTIFIED EXPERT
Distinguished Expert 2019

Author

Commented:
Ok guys, this looks like it's going in the right direction.
To summarise, I have 2 solutions: use Notepad++, or use Notepad with word wrap off, right?

Last questions: do you see any way to get the "scrambled" file to display properly again?

And if I want to change the notepad exe, which hex editor do you recommend for that?
CERTIFIED EXPERT

Commented:
Wow@ the notepad bug. And it's not just in XP... I'm running Win7. I saw it happening and didn't even realize Notepad was doing it. The Notepad (v6.1.7600.16385, 192,536 bytes) in my Win7 x64 ( I don't know how to tell if it's actually a 64-bit application) does not have the string that blog article cites, so it apparently can't be fixed.

The method I described earlier -- paste into Wordpad, Save As... 'Other' ->Unicode, then open in Notepad (with Word Wrap turned off, ALWAYS from now on, I guess) and then back in Notepad using File->Save As... 'UTF-8'  seems like it should convert them back, but I don't have any files to test it on that were UTF-8 to begin with (I've always just used Unicode, not UTF-8).
CERTIFIED EXPERT

Commented:
last questions : do you see any way to get the "scrambled" file to display properly again?

No. Once the <CR><CR><LF> are gone there is no way to put them back, other than manually going through the file and hitting Enter where it's needed. (And that will put in the standard <CR><LF>.)

And you have a 3rd option: use a hex or advanced text editor to replace all occurrences of <CR><CR> with <CR>.
That would bring the file back in line with standard text and then notepad with word wrap turned on will be fine, but your other 2 options are correct. Notepad with word wrap turned off, or another editor like notepad++.
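
For anyone who would rather script that 3rd option than do it in an editor, here is a minimal sketch. It has to be run on a copy of the file that still contains the <CR><CR><LF> sequences (the original download, not the scrambled one). The function names and the way paths are passed in are assumptions for illustration, and whether the program that reads the language file accepts plain <CR><LF> is something only a test will show.

```python
def fix_double_cr(data: bytes) -> bytes:
    """Turn each <CR><CR><LF> into the standard <CR><LF>; leave all else alone."""
    return data.replace(b"\r\r\n", b"\r\n")


def fix_file(src_path: str, dst_path: str) -> None:
    """Read src_path as raw bytes and write the repaired bytes to dst_path."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(fix_double_cr(data))


# Hypothetical sample: UTF-8 byte order mark followed by two key=value lines.
sample = b"\xef\xbb\xbfA=1\r\r\nB=2\r\r\n"
print(fix_double_cr(sample))
```

Working on bytes rather than decoded text keeps the byte order mark and every other character exactly as it was, and writing to a separate output path leaves the original file intact in case the result is not what was wanted.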

@Darr247
And you are correct, my Win7 64 bit notepad does not have the hex string either, but that is not surprising as they would not just be re-compiling it for all these years.
It can be fixed, just a different hex string so notepad would need to be debugged again to find the correct point to make the change. I'd not really call it a bug, just the method Microsoft has chosen to do word wrap so it is easily removable when saving. Unfortunately the file is using <CR><CR><LF> for end of line.
CERTIFIED EXPERT
Distinguished Expert 2019

Author

Commented:
So the Control characters are effectively removed?

>>  , use a hex or advanced text editor   <<   Which one do you recommend?
David Johnson, CD, Simple Geek from the '70s
CERTIFIED EXPERT
Distinguished Expert 2019

Commented:
For your purposes, an advanced text editor; a hex editor is just too clunky.
CERTIFIED EXPERT
Distinguished Expert 2019

Author

Commented:
OK, but what should I use?
Notepad++ maybe?
CERTIFIED EXPERT

Commented:
As shown in the screen grab posted earlier, I use XVI32 in win7 x64 without any problems.

Note I adjust the window size so it shows 16 bytes wide... that way it makes the offset addresses shown on the left side on the standard boundaries.
CERTIFIED EXPERT
Distinguished Expert 2019
Commented:
Hey Darr, tx for that post. I will definitely try it.
CERTIFIED EXPERT
Distinguished Expert 2019

Author

Commented:
I was VERY pleased with the ideas, solutions, and help offered.
I sure would like to hand out more points, since you helped me understand an OLD problem that kept reappearing.

tx to all!