curiouswebster

I need to strip out ASCII control characters

I view a text file in Notepad++ and it shows characters this way:

[ESC][DC2][DC4]

and I need to remove these characters.

Using this table, I have found the hex and decimal values for these characters:

http://www.cs.tut.fi/~jkorpela/chars/c0.html

But how do I remove them from a text string?

Thanks,
newbieweb


HalfAsleep

Do you mean how to remove non-printable characters from a string in C#?

Notepad++ is a program written in C++, so how it displays these characters is a separate issue.

For C# strings, maybe this will give you an idea.

http://social.msdn.microsoft.com/Forums/en-US/csharpgeneral/thread/d490653d-479c-4a40-90ff-76870309c801

curiouswebster

ASKER

Notepad++ displays [ESC] to represent the ASCII character with a hex value of 0x1B. This code removes that character by replacing it with an empty string, which is what I want it to do.


    char ch = (char)0x1B;
    while (newStr.Contains(ch.ToString()))
    {
        newStr = newStr.Replace(ch.ToString(), "");
    }

BUT, it only replaces the first one. I assumed that putting it inside a loop would keep calling Replace() until there are none left.

OR, isn't there a function which replaces ALL instances in the string? I am a bit puzzled that it's not immediately evident.

Thanks,
newbieweb

And I am even more puzzled that this loop only executes once, even though the string has dozens of instances of that exact character.

??
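
For reference, String.Replace in .NET returns a new string with every occurrence replaced, not just the first. That would explain why the loop above runs only once: after a single call, no 0x1B characters remain, so Contains() returns false. The following is a minimal sketch of that behavior. The class name ReplaceDemo, the method RemoveEsc, and the sample string are made up for illustration.

```csharp
using System;

public static class ReplaceDemo
{
    // Hypothetical helper: removes every ESC character (0x1B) from the input.
    public static string RemoveEsc(string input)
    {
        const char Esc = (char)0x1B;
        // String.Replace replaces ALL occurrences in one call,
        // so no surrounding while loop is needed.
        return input.Replace(Esc.ToString(), "");
    }

    public static void Main()
    {
        // Made-up sample containing three ESC characters.
        string sample = "a\u001Bb\u001Bc\u001B";
        string cleaned = RemoveEsc(sample);
        Console.WriteLine(cleaned);        // abc
        Console.WriteLine(cleaned.Length); // 3
    }
}
```

If characters still look wrong after this call, they are probably different control characters rather than leftover 0x1B values.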
ASKER CERTIFIED SOLUTION
HalfAsleep

This solution is only available to members of Experts Exchange.

Glad you asked.

While making up a file containing the answer to your question, I realized that more than just the first character was being removed.

It contains the before and after strings, with the integer value of each character above it.

All the 27s (0x1B, ESC) were being removed. But the little square is also used to display other non-printable characters.

Bravo!
chars2.txt
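
Since the remaining squares are other non-printable characters (such as the [DC2] and [DC4] from the original question), one option is to filter on char.IsControl instead of replacing one value at a time. The sketch below is one possible approach, not the accepted solution. The class ControlCharStripper and the method StripControlChars are hypothetical names, and keeping tab, carriage return, and line feed is an assumption based on the input being a text file.

```csharp
using System;
using System.Text;

public static class ControlCharStripper
{
    // Hypothetical helper: removes control characters from the input.
    // Assumption: tab, carriage return and line feed are kept because the
    // source is a text file; drop that check to remove them as well.
    public static string StripControlChars(string input)
    {
        var sb = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            bool keepWhitespace = c == '\t' || c == '\r' || c == '\n';
            if (keepWhitespace || !char.IsControl(c))
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // ESC (0x1B), DC2 (0x12) and DC4 (0x14) embedded in ordinary text.
        string sample = "AB\u001BCD\u0012EF\u0014\r\n";
        string cleaned = StripControlChars(sample);
        Console.WriteLine(cleaned == "ABCDEF\r\n"); // True
    }
}
```

Note that char.IsControl also matches DEL (0x7F) and the C1 range (0x80 to 0x9F), so this removes more than the C0 table linked in the question covers.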