yamabob217 asked:
Translate Word Smart Quotes submitted in form data?
In the event that someone copies and pastes the contents of a Word document into a web form, I would like to translate the special Word characters to their ASCII equivalents. I found the following statement online...
$form_fields->{comments} =~ tr/\x91\x92\x93\x94\x96\x97/''""\-\-/;
But I cannot get this to work correctly. For example, when a word containing a single quote is entered into the form, the statement above gives me: client?'s. It translates the character to the ASCII single quote correctly, but I'm not sure where the question mark is coming from.
Also, if you're only worried about quote characters, you might want to consider &lsquo; (Left Single) and &rsquo; (Right Single) as well. But the list of entities that will get converted coming out of MSWord goes far beyond just those four. HTML::Entities really is a godsend in this case.
##########################
use HTML::Entities;
my $clean_input = encode_entities( $input );
##########################
This will convert the MSWord smart quotes to &ldquo; (Left Double Quote) and &rdquo; (Right Double Quote) respectively, provided $input holds decoded characters rather than raw bytes. They will then display properly in a browser.
If your goal was to remove them completely and replace them with regular quotes you could then:
##########################
$clean_input =~ s|&ldquo;|"|gi;
$clean_input =~ s|&rdquo;|"|gi;
##########################
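If plain ASCII is all you want, here is a rough sketch of going straight there without the entity round trip. It assumes the form data arrives as UTF-8 bytes (common, but check your page's charset), in which case the curly quote is a multi-byte sequence and a byte-level tr on \x91-\x97 will not match it cleanly. The helper name plain_quotes and its optional encoding argument are made up for this example; decode comes from the core Encode module.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: decode the raw form bytes, then map the Word
# punctuation to ASCII by Unicode code point.
# The encoding defaults to UTF-8, which is an assumption about the form.
sub plain_quotes {
    my ($bytes, $encoding) = @_;
    $encoding = 'UTF-8' unless defined $encoding;

    # Turn bytes into Perl characters so each smart quote is one character.
    my $text = decode($encoding, $bytes);

    # U+2018/U+2019 single quotes, U+201C/U+201D double quotes,
    # U+2013 en dash, U+2014 em dash.
    $text =~ tr/\x{2018}\x{2019}\x{201C}\x{201D}\x{2013}\x{2014}/''""\-\-/;

    return $text;
}

# "client" + right single quote (as UTF-8 bytes) + "s"
print plain_quotes("client\xE2\x80\x99s"), "\n";   # prints: client's
```

If the form really is submitted as Windows-1252, passing 'cp1252' as the second argument should cover that case too, since decoding maps \x91-\x97 onto the same code points.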