Solved

Vim: getting rid of non-printable characters

Posted on 2007-11-19
Last Modified: 2012-08-14
I have a number of text files, generated on Windows machines, that appear strangely in Vim due to a massive number of non-printable characters, and I'm not sure how to remove them. For example, in Windows, a particular XML file looks like this in Notepad:

<?xml version="1.0" encoding="Unicode" ?>
<SYSTEMINFO>
<SYSTEM>
      <OSNAME>Microsoft Windows Server 2003 Advanced Server</OSNAME>
      <OSVER>5.2.3790 1.0</OSVER>
      <OSLANGUAGE>1033</OSLANGUAGE>
</SYSTEM>


But in Vim, the same file looks like this, with non-printable characters after every character:

ÿþ<?^@x^@m^@l^@ ^@v^@e^@r^@s^@i^@o^@n^@=^@"^@1^@.^@0^@"^@ ^@e^@n^@c^@o^@d^@i^@n^@g^@=^@"^@U^@n^@i^@c^@o^@d^@e^@"^@ ^@?^@>^@^M
<^@S^@Y^@S^@T^@E^@M^@I^@N^@F^@O^@>^@^M
^@<^@S^@Y^@S^@T^@E^@M^@>^@^M
^@      ^@<^@O^@S^@N^@A^@M^@E^@>^@M^@i^@c^@r^@o^@s^@o^@f^@t^@ ^@W^@i^@n^@d^@o^@w^@s^@ ^@S^@e^@r^@v^@e^@r^@ ^@2^@0^@0^@3^@ ^@A^@d^@v^@a^@n^@c^@e^@d^@ ^@S^@e^@r^@v^@e^@r^@<^@/^@O^@S^@N^@A^@M^@E^@>^@^M
^@      ^@<^@O^@S^@V^@E^@R^@>^@5^@.^@2^@.^@3^@7^@9^@0^@ ^@1^@.^@0^@<^@/^@O^@S^@V^@E^@R^@>^@^M
^@      ^@<^@O^@S^@L^@A^@N^@G^@U^@A^@G^@E^@>^@1^@0^@3^@3^@<^@/^@O^@S^@L^@A^@N^@G^@U^@A^@G^@E^@>^@^M
^@<^@/^@S^@Y^@S^@T^@E^@M^@>^@^M


This doesn't affect all files, mostly just .xml files generated by a Microsoft program and .reg registry files.
Question by:Marketing_Insists
5 Comments
 
LVL 84

Accepted Solution

by:
ozo earned 500 total points
:%s/[^ -~]//g
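
To see what that character class matches, here is a minimal Python sketch of the same filter. It is only an illustration of the pattern, not how Vim implements the command, and the helper name `strip_non_printable` and the sample line are made up for the example.

```python
import re

# Same character class as the Vim command: anything outside the range
# from space (0x20) to tilde (0x7E) is deleted.
NON_PRINTABLE = re.compile(r"[^ -~]")


def strip_non_printable(line: str) -> str:
    """Remove every character that is not printable ASCII (hypothetical helper)."""
    return NON_PRINTABLE.sub("", line)


# One line roughly as Vim shows it: a NUL (^@) after each character
# and a carriage return (^M) at the end.
sample = "<\x00S\x00Y\x00S\x00T\x00E\x00M\x00>\x00\r"
cleaned = strip_non_printable(sample)
print(cleaned)
```

Note that the class also deletes tabs and any genuine non-ASCII text, so this approach only suits files whose real content is plain ASCII. The sketch works on a single line; run over a whole string it would delete the newlines too.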
 
LVL 3

Expert Comment

by:amirs80
Before opening the file, run the command
#dos2unix filename
Now check it.
 
LVL 34

Expert Comment

by:Duncan Roe
In the encoding Windows calls "Unicode" (UTF-16), characters are 2 bytes wide. The original ASCII characters map into one byte of each pair, leaving the other byte zero (in this little-endian file the zero comes second, which is why Vim shows ^@ after each character). Also, Microsoft XML files often seem to have some weird binary garbage at the front (the ÿþ here, which is a byte order mark) - you can remove that with no ill effect in my limited experience.
So ozo's advice is sound: remove all characters that aren't printable. I don't understand the :% at the front, but the rest is a sed command that would do what you want.
dos2unix will convert CRLF pairs to LF, but the sed command will effectively do that for you anyway.
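
The byte layout can be checked with a short Python sketch. It assumes the file is little-endian UTF-16 with a byte order mark, which is what the ÿþ at the start of the question's sample suggests; the sample string is made up for the example.

```python
import codecs

text = "<SYSTEM>\r\n"

# Build the bytes the way a little-endian UTF-16 file with a byte order
# mark would store them: FF FE first, then two bytes per character.
raw = codecs.BOM_UTF16_LE + text.encode("utf-16-le")
print(raw)

# Read as a single-byte encoding, the byte order mark shows up as two
# odd-looking characters.
print(raw[:2].decode("latin-1"))

# Each ASCII character is followed by a zero byte, which Vim displays as ^@.
print(raw[2:6])

# Decoding as UTF-16 recovers the text; the byte order mark is consumed.
decoded = raw.decode("utf-16")
print(decoded == text)
```

For a big-endian file the zero byte would come before each ASCII character instead of after it.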
 
LVL 34

Expert Comment

by:Duncan Roe
One other thing: for the benefit of XML parsers, you need to change "Unicode" in line 1 to "utf-8", so the parser will know characters are 1 byte wide.
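
As a rough sketch of doing both steps outside the editor, the Python below decodes the bytes as UTF-16, rewrites the declaration, and re-encodes as UTF-8. It is an alternative to stripping characters in Vim, not the method from this thread. The function name `utf16_xml_to_utf8` is hypothetical, and it assumes the input starts with a byte order mark and uses exactly the `encoding="Unicode"` declaration shown in the question.

```python
import codecs


def utf16_xml_to_utf8(raw: bytes) -> bytes:
    """Decode UTF-16 XML bytes and re-encode them as UTF-8 (hypothetical helper)."""
    # The byte order mark tells the decoder which byte order to use
    # and is dropped from the result.
    text = raw.decode("utf-16")
    # Update the declaration so it matches the new encoding (first match only).
    text = text.replace('encoding="Unicode"', 'encoding="utf-8"', 1)
    return text.encode("utf-8")


# Sample input built in memory, mimicking the start of the file in the question.
sample = '<?xml version="1.0" encoding="Unicode" ?>\r\n<SYSTEMINFO>\r\n'
raw = codecs.BOM_UTF16_LE + sample.encode("utf-16-le")

converted = utf16_xml_to_utf8(raw)
print(converted.decode("utf-8"))
```

Unlike deleting non-printable characters, this keeps any non-ASCII text in the file. The CRLF line endings are left as they are.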
 
LVL 84

Expert Comment

by:ozo
:% tells Vim to execute a sed-style command on every line in the buffer: the colon starts an Ex command and % is the range meaning all lines.