Solved

Translate Word Smart Quotes submitted in form data?

Posted on 2008-10-29
Last Modified: 2013-12-25
If someone copies and pastes the contents of a Word document into a web form, I would like to translate the special Word characters to their ASCII equivalents. I found the following statement online...

      $form_fields->{comments} =~ tr/\x91\x92\x93\x94\x96\x97/''""\-\-/;

But I cannot get this to work correctly. For example, when a word that contains a single quote is entered into the form, the statement above gives me: client?'s   It translates the character correctly to the ASCII single quote, but I'm not sure where the question mark is coming from.
Question by:yamabob217
LVL 39

Assisted Solution

by:Adam314
Adam314 earned 100 total points
ID: 22836769
When you display quotes in a browser, you should use "&quot;" (without the double-quotes), not the actual quote character.
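A minimal sketch of that advice, assuming the HTML::Entities module is installed and the value is being written into an HTML page. The variable `$comment` and its contents are made up for illustration.

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities);

# Hypothetical form value containing plain ASCII quotes and an ampersand.
my $comment = q{She said "hello" & left};

# The second argument lists the characters to treat as unsafe, so only
# <, >, & and " are replaced with entities here.
my $safe = encode_entities($comment, q{<>&"});

print "$safe\n";    # prints: She said &quot;hello&quot; &amp; left
```

Restricting the unsafe set this way leaves every other character untouched, which is usually what you want when the page encoding already matches the data.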
 
LVL 51

Assisted Solution

by:ahoffmann
ahoffmann earned 100 total points
ID: 22848338
Please post an example of your data
and an example of what you expect.
 
LVL 1

Expert Comment

by:unobserved
ID: 22887680
I would suggest that you first use HTML::Entities to convert all high-bit characters to HTML entities.

####################################
use HTML::Entities;
my $clean_input = encode_entities( $input );
####################################

This will convert the MSWord Smart quotes to &ldquo; (Left Double Quote) and &rdquo; (Right Double Quote) respectively. They will now display properly in a browser.

If your goal was to remove them completely and replace them with regular quotes you could then:
#############################
$clean_input =~ s|&ldquo;|"|gi;
$clean_input =~ s|&rdquo;|"|gi;
#############################
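Here is a self-contained sketch of the two steps above. One caveat, as far as I understand the module: encode_entities only emits named entities such as &ldquo; when it is handed decoded characters. Given raw Windows-1252 bytes it falls back to numeric references, so the sketch decodes first. The sample bytes in `$raw` and the choice of 'cp1252' are assumptions; if your form page is served as UTF-8, decode as 'UTF-8' instead.

```perl
use strict;
use warnings;
use Encode qw(decode);
use HTML::Entities qw(encode_entities);

# Hypothetical raw form bytes: Word's curly double quotes in Windows-1252.
my $raw = "\x93quoted\x94";

# Assumption: the browser submitted Windows-1252 bytes.
my $text = decode('cp1252', $raw);

# With no second argument, control characters, high-bit characters and
# the HTML-special characters are all replaced with entities.
my $clean_input = encode_entities($text);

# Optional second step: turn the curly-quote entities into plain quotes.
(my $plain = $clean_input) =~ s/&[lr]dquo;/"/g;

print "$clean_input\n";    # prints: &ldquo;quoted&rdquo;
print "$plain\n";          # prints: "quoted"
```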
 
LVL 1

Expert Comment

by:unobserved
ID: 22887709
Also, if you're only worried about quote characters, you might also want to consider &lsquo; (Left Single) and &rsquo; (Right Single). But the list of entities that will get converted coming out of MSWord goes far beyond just those four. HTML::Entities really is a godsend in this case.
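If plain ASCII is the goal, as in the original question, a rough sketch is to decode the input and then translate by code point rather than by byte. This assumes the form data arrives as Windows-1252 (decode as 'UTF-8' instead if that is what your page submits); the sample string in `$raw` is made up, and the sketch covers only the six characters from the question's tr statement.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical form bytes in Windows-1252: a curly apostrophe and an en dash.
my $raw = "client\x92s notes \x96 draft";

# Assumption: Windows-1252 input. After decoding, each Word character is a
# single Unicode code point regardless of how it was encoded on the wire.
my $text = decode('cp1252', $raw);

# Map curly single quotes, curly double quotes, en dash and em dash
# to their closest ASCII equivalents.
(my $ascii = $text) =~ tr/\x{2018}\x{2019}\x{201C}\x{201D}\x{2013}\x{2014}/''""\-\-/;

print "$ascii\n";    # prints: client's notes - draft
```

Translating after decoding avoids matching on individual bytes, which stops working as soon as the same characters arrive in a multi-byte encoding.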
 
LVL 1

Accepted Solution

by:unobserved
unobserved earned 300 total points
ID: 22887718
You can read the documentation for HTML::Entities here
http://search.cpan.org/~gaas/HTML-Parser-3.56/lib/HTML/Entities.pm

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

I hope you'll find this tutorial useful and interesting. So let's try to extend Tcl with a new package.  For anyone more deeply interested please check out the book "Practical Programming in Tcl and Tk". It's really one of the best written books abo…
Checking the Alert Log in AWS RDS Oracle can be a pain through their user interface.  I made a script to download the Alert Log, look for errors, and email me the trace files.  In this article I'll describe what I did and share my script.
Learn the basics of while and for loops in Python.  while loops are used for testing while, or until, a condition is met: The structure of a while loop is as follows:     while <condition>:         do something         repeate: The break statement m…
The viewer will learn how to look for a specific file type in a local or remote server directory using PHP.

630 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question