Remove backwards question mark inside box character from ASCII file

I have a file that is picking up a character that, when viewed in Excel, is a backwards question mark inside a box. In Unix, the character appears as a backwards question mark. I want to remove it from the ASCII file. Is there a sed command to do this? I cannot get it to display in this example; it appears between the code markers below. It is NOT an upside-down question mark: the question mark is actually facing the opposite direction.

[
¿


eshapleyAsked:
 
Gerwin Jansen, EE MVETopic Advisor Commented:
Hmm, A0 is octal 240 - this is removing the special character just fine:
tr -d '\240' < test.txt


output:

DET0005   UNI61130022DD                                                                                                                            EA 0000000000000020000000000024.5                                                                                      4144874                         50000
DET0008   UNI61130022DD                                                                                                                            EA 0000000000000020000000000024.5                                                                                      4144874                         80000


@duncan_roe - sed does understand hex escapes; this works as well:
sed 's/\xa0//g' test.txt


output:

DET0005   UNI61130022DD                                                                                                                            EA 0000000000000020000000000024.5                                                                                      4144874                         50000
DET0008   UNI61130022DD                                                                                                                            EA 0000000000000020000000000024.5                                                                                      4144874                         80000
 
tfewsterCommented:
I had the same issue recently, creating a .csv file in Unix to be imported into Excel; I can't remember what character combo generated it (probably a "\0nn" being interpreted as a special character), but it should be obvious if you `vi` the source file and display special characters using ":set list"

Once you know the character combo that is generating the odd displayed character, you can remove it using sed.

Hope that helps!
 
eshapleyAuthor Commented:
It is a backward question mark.  No other characters are present.  If I cut and paste it back into vi, the question mark changes back to normal.

 
Eric AKA NetminderCommented:
Can you upload the file (with anything sensitive removed)?
 
DavidPresidentCommented:
Run od or hd to get a hex dump and see what the actual byte value is. Don't arbitrarily cut off the last character of a file: depending on the file type it may be an end-of-file indicator, and trimming it would corrupt the file and break things.
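
As a rough sketch of the hex-dump approach, the commands below list which non-ASCII byte values occur in a file and how often. The file name sample.txt and its contents are made up for illustration; point the od command at your own file instead.

```shell
# Build a small sample file containing one 0xA0 byte (octal 240).
# sample.txt is a hypothetical name used only for this illustration.
printf 'UNI61130022DD\240EA\n' > sample.txt

# Dump every byte as two hex digits, put one byte on each line,
# keep only the values 80-ff (outside 7-bit ASCII), and count each value.
od -An -v -tx1 sample.txt | tr -s ' ' '\n' | grep -E '^[89a-f][0-9a-f]$' | sort | uniq -c
```

For the sample above this reports a single occurrence of a0. A file that is pure ASCII produces no output at all.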
 
PaulHewsCommented:
First I would double-check that there are not actually three bytes. This character (hex BF) is the last byte of the UTF-8 byte order mark, which some software inserts when writing a UTF-8 file. The three bytes are EF BB BF, and they appear at the very beginning of the file.

If it is a BOM, it can safely be discarded using a number of techniques:

Using awk/sed to detect/remove the byte order mark (BOM)
http://muzso.hu/2011/11/08/using-awk-sed-to-detect-remove-the-byte-order-mark-bom
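
A minimal sketch of the BOM check and removal, assuming GNU sed (the \x escapes are a GNU extension) and a head that supports -c. The file names bom_sample.txt and bom_removed.txt are hypothetical.

```shell
# Create a hypothetical sample file that starts with the UTF-8 BOM (EF BB BF).
printf '\357\273\277DET0005\n' > bom_sample.txt

# Show the first three bytes in hex; a BOM shows up as "ef bb bf".
head -c 3 bom_sample.txt | od -An -tx1

# Strip the BOM from line 1 only. LC_ALL=C makes sed treat the input as raw bytes.
LC_ALL=C sed '1s/^\xEF\xBB\xBF//' bom_sample.txt > bom_removed.txt
```

Restricting the substitution to line 1 and anchoring it with ^ leaves any other EF BB BF sequence in the file untouched.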
 
Gerwin Jansen, EE MVETopic Advisor Commented:
The extended-ASCII code (code page 437) for the upside-down question mark is 168, or a8 hex, so removing it with sed would look like:
sed -i 's/\xa8//g' <file_name>


This removes every a8 byte (upside-down question mark) from the file file_name.

To try it before changing your file, leave out the -i.

A tr alternative:
tr -d '\250' < file_name > new_file_name


 
eshapleyAuthor Commented:
In UltraEdit, it looks like a space; in vi it is a backwards question mark. For example, see the character following UNI61130022DD in column 24. In hex it is a0.
test.txt
 
eshapleyAuthor Commented:
The hex dump reports it in hex as a0.
Tried this:  sed 's/\\xa0/\\x20/g' test2.txt > testout.txt
Using KSH.  Writes the testout.txt, but leaves the backwards question mark a0 in the file.
 
Duncan RoeSoftware DeveloperCommented:
The character in your original post is octal 277. You can always tell what the character is if the file is displaying:
Select (highlight) the character
In a bash shell window, type Control-v, then paste the character (middle button)
bash will echo the interpretation of the character
sed does not itself understand octal or hex escapes. But you can get bash to do it
sed $'s/\xa0/\x20/g' test2.txt > testout.txt


The trick is to use $' ... '
 
Duncan RoeSoftware DeveloperCommented:
Yes sed does understand hex escapes. I tested with octal, which it doesn't seem to understand :(
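
The sample records look fixed-width (an assumption based on the output posted earlier), so deleting the a0 byte would shift every later column by one. A sketch of the replace-with-space variant the author was attempting, using tr; the file names fixed_sample.txt and fixed_out.txt are hypothetical. Note that the doubled backslashes in the earlier attempt ('s/\\xa0/\\x20/g') make sed search for the literal text \xa0 rather than the byte, which is why nothing was replaced.

```shell
# Hypothetical sample record with a 0xA0 byte (octal 240) between two fields.
printf 'UNI61130022DD\240EA\n' > fixed_sample.txt

# Replace each 0xA0 byte with an ASCII space instead of deleting it,
# so the record keeps its original length. LC_ALL=C forces byte-wise handling.
LC_ALL=C tr '\240' ' ' < fixed_sample.txt > fixed_out.txt

# Print both byte counts; they should be identical.
wc -c < fixed_sample.txt
wc -c < fixed_out.txt
```

If the columns do not matter, tr -d '\240' as shown earlier in the thread is the simpler choice.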
Question has a verified solution.
