stakor
asked on
Perl Substitution Table
I need to translate chunks of HTML that have made it into my text file into plain text. I am thinking of creating a text file with two columns: one with the HTML "'", and the other with the text to be inserted "'". Is there a way to have Perl go through a text file and use a second text file as a translation reference? I am not sure how many things I am going to need to change, which is why I was thinking of using a text file, as it could handle anywhere from 2 to 1000 changes...
Any thoughts on how to do this, without the ability to install any modules on the system?
ASKER
This seems to be close, but I am seeing the mark up characters disappear, instead.
The translate.txt file:
' '
&quot; "
The Source file:
boss's
as &quot;This Test&quot;.
ASKER CERTIFIED SOLUTION
ASKER
The output now looks like:
boss's
as "This Test".
The program looks like:
use strict;
use warnings;

# Load the translation table: one "old<TAB>new" pair per line.
open my $txt, '<', 'translate.txt' or die "could not open translate.txt: $!";
my %xlate;
while (<$txt>) {
    chomp;
    my ($old, $new) = split /\t+/;    # columns must be tab-separated
    next unless defined $new;         # skip blank or malformed lines
    $xlate{$old} = $new;
}
close $txt;

my $fil = shift or die "Usage: $0 file_with_bad_text\n";
open my $in, '<', $fil or die "could not open $fil: $!";
while (<$in>) {
    foreach my $pat (keys %xlate) {
        s{\Q$pat\E}{$xlate{$pat}}g;   # \Q...\E matches $pat literally, not as a regex
    }
    print;
}
close $in;
The Translate text looks like:
' '
&quot; "
So, the 's looks good, but the " does not yet.
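One thing that may be worth checking, offered as an assumption rather than a diagnosis: the program splits each table line on tabs, so a line whose two columns are separated by spaces would not load as a pair. A minimal sketch of a loader that accepts either separator follows; `parse_pair` is a hypothetical helper name, not part of the program above.

```perl
use strict;
use warnings;

# Hypothetical helper: split one table line into (old, new) on the first
# run of whitespace, so both "old<TAB>new" and "old new" are accepted.
sub parse_pair {
    my ($line) = @_;
    chomp $line;
    my ($old, $new) = split ' ', $line, 2;
    return unless defined $new;    # blank or one-column line
    return ($old, $new);
}

# Show what would be loaded for a few sample lines.
for my $line ("&quot; \"\n", "&quot;\t\"\n", "&quot;\n") {
    my ($old, $new) = parse_pair($line);
    print defined $old ? "[$old] => [$new]\n" : "skipped\n";
}
```

Printing the parsed pairs like this before running the substitutions makes it easy to see whether every line of translate.txt was read the way you expected.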
see
perldoc -q "How do I efficiently match many regular expressions at once?"
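For anyone who finds the FAQ entry heavy going, here is a rough sketch of one common way to handle many substitutions at once. It is not quoted from the FAQ. It joins all the table keys into a single alternation and looks each match up in the hash, so every line is scanned only once. The sample entries in `%xlate` and the helper name `translate` are assumptions for illustration; in practice the hash would be loaded from translate.txt.

```perl
use strict;
use warnings;

# Hypothetical sample table; in practice this would be loaded from translate.txt.
my %xlate = ('&quot;' => '"', '&amp;' => '&', '&lt;' => '<');

# Build one alternation of all keys, longest first, each quoted literally.
my $alt = join '|', map { quotemeta } sort { length($b) <=> length($a) } keys %xlate;
my $re  = qr/($alt)/;

# Hypothetical helper: apply every substitution in a single pass over the text.
sub translate {
    my ($text) = @_;
    $text =~ s/$re/$xlate{$1}/g;
    return $text;
}

print translate('as &quot;This Test&quot; &lt; &amp;'), "\n";
```

Because the text is scanned once, a replacement is never translated a second time, which can happen when the substitutions are run one after another in a loop. The sketch assumes the table is not empty.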
ASKER
I have found:
http://perldoc.perl.org/5.10.1/perlfaq6.html#How-do-I-efficiently-match-many-regular-expressions-at-once%3f
But honestly, I am not that good at Perl yet. I think I can accomplish what I need with a set of sed commands, so I will see if I can get that to work for this project.
You can certainly do it with a series of sed commands.
Did you ever get the code I provided working completely? If not, are you still interested in it? If so, I'll take a look sometime "soon" (I've been busy lately).
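#!/usr/bin/perl
use strict;
use warnings;

@ARGV or die "Usage: $0 file_with_bad_text\n";
-r 'translate.txt' or die "cannot read translate.txt: $!";
my $xlate;
{
    # Read translate.txt through <> by temporarily overriding @ARGV.
    local @ARGV = qw(translate.txt);
    my $s = '';
    while (<>) {
        # Build one s{old}{new}g statement per "old new" line.
        $s .= "s{\\Q$1\\E}{$2}g;" if /(\S+)\s+(\S+)/;
    }
    # Compile all the substitutions into a single sub that edits $_.
    $xlate = eval "sub{$s}" or die "could not compile translation table: $@";
}
while (<>) {
    &$xlate;    # apply every substitution to the current line in $_
    print;
}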