Solved

OSX grep (or sed? awk?) to find/replace non-ASCII hex values

Posted on 2014-01-06
Last Modified: 2014-01-08
I need to use an OS X shell script (bash is the current shell; I would rather not rewrite for other shells, but could if needed) to find and replace specific non-ASCII hex values (specifically Unicode #65533, i.e. U+FFFD) in text files.

This appears to work, but I wonder whether there is something more elegant:
grep `echo -e 's/\xEF\xBF\xBD//'` fileName.txt

I have not otherwise been able to get grep to understand the hex values or find them. Any ideas?
Question by:michaellanham
5 Comments
 
LVL 84

Expert Comment

by:ozo
ID: 39761210
awk '/s\/\xEF\xBF\xBD\/\//' fileName.txt
perl -ne 'print if m{s/\xEF\xBF\xBD//}' fileName.txt
 

Author Comment

by:michaellanham
ID: 39763953
Oddly, I can't work out why my OS X 10.9.1 won't play nice. I've attached a copy of a test file containing this character sequence. I tried both suggested solutions and also went back to my own example, and I'll be darned: none of the three has any discernible effect on the source file. I can close it and reopen it in the hex editor and, sure enough, the bad hex symbols are still there.
The grep --version output is: grep (BSD grep) 2.5.1-FreeBSD, and that might be useful.

Diagnosis assistance would be great!
Screen-Shot-2014-01-07-at-8.34.0.png
 

Author Comment

by:michaellanham
ID: 39763955
Well, weirdness: a minor modification to suggestion #2 seems to be working, but I'm not clear on what the difference is. I concede I'm replacing with 'foo' instead of deleting, but the 'm' in front of the first brace seemed to be interfering with proper execution.

perl -ane '{if(s/[\xEF\xBF\xBD]+/foo/) { print } }' foo.csv

but
perl -e s/[\xEF\xBF\xBD]+/foo/ foo.csv

does not work. Argh! Why not?
 
LVL 84

Accepted Solution

by:
ozo earned 2000 total points
ID: 39764012
In the screenshot, I see the character sequence "\xEF\xBF\xBD", but I don't see the character sequence "s/\xEF\xBF\xBD//", which is what your grep command would have been searching for.
If you just want to replace all instances of those characters, in any sequence, with "foo", then you can do:
perl -i.bak -pe 's/[\xEF\xBF\xBD]+/foo/' foo.csv
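
Note that the character class in the command above matches any of the three bytes individually, and that without the g flag only the first run on each line is replaced. What follows is a minimal sketch of a variant, assuming the goal is to replace every complete EF BF BD sequence. It uses sed rather than perl, and the file name demo.csv and the variable name fffd are hypothetical.

```shell
# Build the three UTF-8 bytes of U+FFFD (EF BF BD) using octal escapes,
# which POSIX printf accepts (hex \x escapes are not portable).
fffd=$(printf '\357\277\275')

# Hypothetical sample file: two replacement characters on one line.
printf 'a%sb%sc\n' "$fffd" "$fffd" > demo.csv

# LC_ALL=C makes sed treat the input as raw bytes; the g flag replaces
# every occurrence on a line, not just the first.
LC_ALL=C sed "s/$fffd/foo/g" demo.csv > demo.csv.new

# Keep the original as a backup, similar in spirit to perl's -i.bak.
mv demo.csv demo.csv.bak
mv demo.csv.new demo.csv

cat demo.csv
```

Writing to a temporary file and renaming it avoids sed's in-place option, whose syntax differs between the BSD and GNU versions.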
 

Author Comment

by:michaellanham
ID: 39764974
Zoinks, you are of course correct: I was searching with grep for more characters than existed, hence no match.

I also noticed that I had not used the -i (to edit <> in place, with backup) nor quotes around the perl segment. Grrr.....
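
The effect of the missing quotes can be seen without running perl at all. A minimal sketch, assuming a POSIX-style shell and no path in the current directory that happens to match the unquoted word as a glob; the variable names are hypothetical. Separately, perl needs -p or -n to read the file line by line, so a bare -e never touches the input.

```shell
# Unquoted: the shell removes each backslash before perl ever sees it.
unquoted=$(printf '%s' s/[\xEF\xBF\xBD]+/foo/)

# Single-quoted: the expression reaches the program unchanged.
quoted=$(printf '%s' 's/[\xEF\xBF\xBD]+/foo/')

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

The unquoted form arrives as s/[xEFxBFxBD]+/foo/, which matches the letters x, E, F, B and D rather than the three bytes.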

When using grep, this worked...
grep -e `echo -e $'\xEF\xBF\xBD'` foo.csv

Notice that I had to have bash interpret the hex characters before passing them to grep. I found an example after much searching and mostly blind modifications to see whether they would work as expected. Other than painful discovery learning, any suggestions on how to identify the actual problem with grep? I've read multiple conflicting posts saying that the version on the Mac does/does not handle Unicode characters, and my experience thus far puts me in the 'does not' camp.
Thank you!
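
One way to narrow this down, independent of how a given grep handles Unicode, is to look at the raw bytes and then search in the C locale with a fixed-string pattern. A minimal sketch, assuming a POSIX shell; the file name sample.txt and the variable names are hypothetical. In bash, $'\xEF\xBF\xBD' on its own already expands to the three bytes, so the extra echo -e should not be needed.

```shell
# U+FFFD encoded as UTF-8 is the bytes EF BF BD (octal 357 277 275).
fffd=$(printf '\357\277\275')

# Hypothetical sample: one line with the character, one without.
printf 'bad%sline\nclean line\n' "$fffd" > sample.txt

# Dump the raw bytes so the sequence can be confirmed by eye.
od -An -tx1 sample.txt

# Count matching lines; LC_ALL=C makes grep compare bytes, and -F treats
# the pattern as a fixed string rather than a regular expression.
count=$(LC_ALL=C grep -c -F "$fffd" sample.txt)
echo "lines containing U+FFFD: $count"
```

If the byte dump shows ef bf bd but a given grep invocation finds nothing, the pattern being passed to grep is the more likely culprit than the file.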


Question has a verified solution.
