Solved

OSX grep (or sed? awk?) to find/replace non-ASCII hex values

Posted on 2014-01-06
5
2,079 Views
Last Modified: 2014-01-08
I need to use OSX shell script (bash is current shell, would rather not re-write for others, but could if needed) to find and replace specific non-ASCII hex values (specifically Unicode #65533) from text files.

This appears to work, but wonder if there is something more elegant.
grep `echo -e 's/\xEF\xBF\xBD//'` fileName.txt

Have not otherwise been able to get the hex understood by grep or found. Any ideas?
0
Comment
Question by:michaellanham
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 3
  • 2
5 Comments
 
LVL 84

Expert Comment

by:ozo
ID: 39761210
awk '/s\/\xEF\xBF\xBD\/\//' fileName.txt
perl -ne 'print if m{s/\xEF\xBF\xBD//}' fileName.txt
0
 

Author Comment

by:michaellanham
ID: 39763953
Oddly, I'm unable to diagnose why I can't get my OS X 10.9.1 to play nice. I've attached a copy of a test file with this character sequence in it. I tried both the suggested solutions, as well as returning to my own example. And I'll be darned that all three do not have any discernable affect on the source file. I can close it and reopen it in the hex editor and sure enough, bad Hex Symbols still there.
The grep --version output is: grep (BSD grep) 2.5.1-FreeBSD, and that might be useful.

Diagnosis assistance would be great!
Screen-Shot-2014-01-07-at-8.34.0.png
0
 

Author Comment

by:michaellanham
ID: 39763955
Well...weirdness..a minor modification to suggestion #2 seems to be working, but I'm not clear what the difference is...I concede I'm doing a replacement with 'foo' instead of deleting, but the 'm' in front of the first brace seemed to be interfering with proper execution.

perl -ane '{if(s/[\xEF\xBF\xBD]+/foo/) { print } }' foo.csv

but
perl -e s/[\xEF\xBF\xBD]+/foo/ foo.csv

does not work. Argh! Why not?
0
 
LVL 84

Accepted Solution

by:
ozo earned 500 total points
ID: 39764012
In the screen shot, I see the  character sequence "\xEF\xBF\xBD", but I don't see the character sequence "s/\xEF\xBF\xBD//", which is what your grep command would have been searching for
If you just want to replace all instances of those characters in any sequence with "foo" then you can do
perl -i.bak -pe 's/[\xEF\xBF\xBD]+/foo/' foo.csv
0
 

Author Comment

by:michaellanham
ID: 39764974
Zoinks, you are of course correct I was searching with grep for more characters than existed--hence no match.

I also noticed that I had not used the -i (to edit <> in place, with backup) nor quotes around the perl segment. Grrr.....

When using grep, this worked...
grep -e `echo -e $'\xEF\xBF\xBD'` foo.csv

Notice I had to have bash interpret the Hex characters before passing to grep. found an example after much searching and mostly-blind modifications to see if they would work as expected. Other than painful discovery learning, any suggestions on how to ID the actual problem with grep? I've read multiple conflicting posts that the version on Mac does/does not handle unicode characters, and my exposure thus far goes with the 'does not' camp.
Thank you!
0

Featured Post

Transaction Monitoring Vs. Real User Monitoring

Synthetic Transaction Monitoring Vs. Real User Monitoring: When To Use Each Approach? In this article, we will discuss two major monitoring approaches: Synthetic Transaction and Real User Monitoring.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

In this article we have discussed about the OS X EI Capitan and how to fix Wi-Fi issue in OS X El Capitan. We have explained how to delete system level preferences and create a new Wi-Fi location to resolve Wi-Fi issue.
In this post we will learn how to connect and configure Android Device (Smartphone etc.) with Android Studio. After that we will run a simple Hello World Program.
Six Sigma Control Plans
Progress

728 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question