  • Status: Solved
  • Priority: Medium
  • Security: Public

Anyone have a good UNICODE filter class?

I need to convert an AFP (mainframe) file to a PDF document.  I can do all that just fine, but there are plenty of garbage characters sprinkled throughout the file that I want to get rid of.  I want a method that removes all invalid characters.  The only characters I want to keep are the alphanumerics, CR and LF, and the normal keyboard characters.  I know how to do this painstakingly, but I bet one of you guys already has code to do this.  The method should only ALLOW characters from an approved list.  Unless there's a better way, of course.
Asked by: jackjeckyl
1 Solution
 
CEHJ Commented:
Can you post (attach) the unclean file here?
 
jackjeckyl (Author) Commented:
Forgot to add: the characters are so off the wall that I've been using their UNICODE escapes with replaceAll to remove them.  Some aren't even visible.
 
jackjeckyl (Author) Commented:
I can't post the file.  It also won't always be the same file; it will be something different every time.
 
CEHJ Commented:
The best thing to do would be to implement a FilterReader that cleans out anything that is not in ISO8859-1:

http://www.technojeeves.com/joomla/index.php/free/48-iso8859-1
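
For readers who cannot reach the link, here is a minimal sketch of what such a FilterReader might look like. It is not the code from the linked page; the class name Latin1FilterReader and the rule "keep any char up to U+00FF" are assumptions made for illustration.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;

// Hypothetical class name; drops every char outside the ISO-8859-1 range.
class Latin1FilterReader extends FilterReader {

    Latin1FilterReader(Reader in) {
        super(in);
    }

    // Assumption: "in ISO8859-1" means code point 0x00..0xFF.
    private static boolean keep(int c) {
        return c <= 0xFF;
    }

    @Override
    public int read() throws IOException {
        int c;
        // Skip rejected chars until an accepted one or end of stream.
        while ((c = in.read()) != -1) {
            if (keep(c)) {
                return c;
            }
        }
        return -1;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        int kept = 0;
        // Keep reading until at least one char survives or the stream ends.
        while (kept == 0) {
            int n = in.read(buf, off, len);
            if (n == -1) {
                return -1;
            }
            // Compact the accepted chars to the front of the chunk.
            for (int i = off; i < off + n; i++) {
                if (keep(buf[i])) {
                    buf[off + kept++] = buf[i];
                }
            }
        }
        return kept;
    }
}
```

Wrap whatever Reader is used to load the file, for example new Latin1FilterReader(new FileReader(path)), and read from the wrapper as usual. Note that the 0x00..0xFF range still includes control characters, so this alone may not remove every invisible character.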
 
jackjeckyl (Author) Commented:
I ended up just using an approved list of characters and ignoring UNICODE.  Thanks for your responses.
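
The author's actual code was not posted. A minimal sketch of an approved-list filter, assuming "normal keyboard characters" means printable ASCII (space through tilde) plus CR and LF, might look like this. The class and method names are hypothetical.

```java
// Hypothetical helper; the approved set is an assumption, adjust as needed.
final class ApprovedCharFilter {

    // Approved: printable ASCII (0x20..0x7E), carriage return, line feed.
    static boolean isApproved(char c) {
        return (c >= 0x20 && c <= 0x7E) || c == '\r' || c == '\n';
    }

    // Copies only approved chars into the result.
    static String clean(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (isApproved(c)) {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Same rule expressed as a negated character class for replaceAll.
    static String cleanWithRegex(String input) {
        return input.replaceAll("[^\\x20-\\x7E\\r\\n]", "");
    }
}
```

Because the rule names what is allowed rather than what is forbidden, it keeps working when the next file contains garbage characters that have not been seen before.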
 
CEHJ Commented:
:-)

I would just use the ISO8859-1 charset
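
One possible reading of this suggestion, sketched with the standard java.nio.charset API: ask an ISO-8859-1 encoder whether it can encode each character and drop the ones it cannot. The class and method names are hypothetical.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; keeps only chars that ISO-8859-1 can represent.
final class Latin1Check {

    static String keepEncodable(String input) {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            // canEncode is false for anything the charset has no byte for.
            if (encoder.canEncode(c)) {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

The trade-off against an approved list is that ISO-8859-1 keeps accented Latin letters but also keeps control characters such as U+0001, so invisible characters in that range would still need separate handling.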