大きなドキュメントのハッシュからキーワードをできるだけ早く見つけて置き換える必要があります。私は以下の 2 つの方法にうんざりしています。
辞書ハッシュに存在するキーワードのみを置き換え、存在しないキーワードを保持して、辞書にないことを知りたいという考えです。
以下の両方の方法は、2回スキャンして、私が思うように見つけて置き換えます。ルックアヘッドまたはビハインドのような正規表現は、はるかに高速に最適化できると確信しています。
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(:all);
my %dictionary = (
pollack => "pollard",
polynya => "polyoma",
pomaces => "pomaded",
pomades => "pomatum",
practic => "praetor",
prairie => "praised",
praiser => "praises",
prajnas => "praline",
quakily => "quaking",
qualify => "quality",
quamash => "quangos",
quantal => "quanted",
quantic => "quantum",
);
my $content =qq{
Start this is the text that contains the words to replace. {quantal} A computer {pollack} is a general {pomaces} purpose device {practic} that
can be {quakily} programmed to carry out a set {quantic} of arithmetic or logical operations automatically {quamash}.
Since a {prajnas} sequence of operations can {praiser} be readily changed, the computer {pomades} can solve more than {prairie}
one kind of problem {qualify} {doesNotExist} end.
};
# just duplicate content many times
$content .= $content;
cmpthese(100000, {
replacer_1 => sub {my $text = replacer1($content)},
replacer_2 => sub {my $text = replacer2($content)},
});
print replacer1($content) , "\n--------------------------\n";
print replacer2($content) , "\n--------------------------\n";
exit;
sub replacer1 {
my ($content) = shift;
$content =~ s/\{(.+?)\}/exists $dictionary{$1} ? "[$dictionary{$1}]": "\{$1\}"/gex;
return $content;
}
sub replacer2 {
my ($content) = shift;
my @names = $content =~ /\{(.+?)\}/g;
foreach my $name (@names) {
if (exists $dictionary{$name}) {
$content =~ s/\{$name\}/\[$dictionary{$name}\]/;
}
}
return $content;
}
ベンチマーク結果は次のとおりです。
Rate replacer_2 replacer_1
replacer_2 5565/s -- -76%
replacer_1 23397/s 320% --