
php: Sort and count instances of words in a given string

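From this post I learned how to count the instances of each word in a given string and sort them by frequency. Now I want to go one step further: match the resulting words against another array ($keywords) and then keep only the top 5 words. I don't know how to do that, so I am opening this question. Thanks.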

$txt = <<<EOT
The 2013 Monaco Grand Prix (formally known as the Grand Prix de Monaco 2013) was a Formula One motor race that took place on 26 May 2013 at the Circuit de Monaco, a street circuit that runs through the principality of Monaco. The race was won by Nico Rosberg for Mercedes AMG Petronas, repeating the feat of his father Keke Rosberg in the 1983 race. The race was the sixth round of the 2013 season, and marked the seventy-second time the Monaco Grand Prix has been held. Rosberg had started the race from pole.
Background
Mercedes protest
Just before the race, Red Bull and Ferrari filed an official protest against Mercedes, having learned on the night before the race of a three-day tyre test undertaken by Pirelli at the venue of the last grand prix using Mercedes' car driven by both Hamilton and Rosberg. They claimed this violated the rule against in-season testing and gave Mercedes a competitive advantage in both the Monaco race and the next race, which would both be using the tyre that was tested (with Pirelli having been criticised following some tyre failures earlier in the season, the tests had been conducted on an improved design planned to be introduced two races after Monaco). Mercedes stated the FIA had approved the test. Pirelli cited their contract with the FIA which allows limited testing, but Red Bull and Ferrari argued this must only be with a car at least two years old. It was the second test conducted by Pirelli in the season, the first having been between race 4 and 5, but using a 2011 Ferrari car.[4]
Tyres
Tyre supplier Pirelli brought its yellow-banded soft compound tyre as the harder "prime" tyre and the red-banded super-soft compound tyre as the softer "option" tyre, just as they did the previous two years. It was the second time in the season that the super-soft compound was used at a race weekend, as was the case with the soft tyre compound.
EOT;

$words = array_count_values(str_word_count($txt, 1));
arsort($words);
var_dump($words);

$keywords = array("Monaco","Prix","2013","season","Formula","race","motor","street","Ferrari","Mercedes","Hamilton","Rosberg","Tyre");
// Goal: keep only the entries of $words that also appear in $keywords, then take the top 5 words.

1 Answer


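You already have $words as an associative array, indexed by word with the count as the value, so use array_flip() to turn your $keywords array into an associative array indexed by word as well. You can then use array_intersect_key() to return only those entries from $words whose key also exists as a key in the flipped $keywords array.

This gives you a resulting $matchWords array that is still keyed by word but contains only the entries of the original $words array that match $keywords. It is also still sorted by frequency.

Then use array_slice() to extract the first 5 entries from that array.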

$matchWords = array_intersect_key(
    $words,
    array_flip($keywords)
);

$matchWords = array_slice($matchWords, 0, 5);
var_dump($matchWords);

which gives:

array(5) {
  'race' =>
  int(11)
  'Monaco' =>
  int(7)
  'Mercedes' =>
  int(5)
  'Rosberg' =>
  int(4)
  'season' =>
  int(4)
}
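One detail worth checking: "2013" is in $keywords, but by default str_word_count() only treats letters (plus ' and -) as word characters, so purely numeric tokens never reach $words. If numbers should be counted too, the optional third argument can list extra characters to accept. The short sample text below is a made-up stand-in for the question's $txt, and this is only a sketch of that option:

```php
<?php
// Hypothetical short sample; the question's $txt behaves the same way.
$txt = "The 2013 Monaco Grand Prix was the sixth round of the 2013 season.";

// Default behaviour: digits are not word characters, so "2013" is dropped.
$default = str_word_count($txt, 1);

// The third argument lists additional characters to treat as part of a word.
$withDigits = str_word_count($txt, 1, '0123456789');

var_dump(in_array('2013', $default, true));    // bool(false)
var_dump(in_array('2013', $withDigits, true)); // bool(true)

// Counting now includes the numeric token.
$counts = array_count_values($withDigits);
```

With the digits allowed, "2013" is counted like any other word and can match the entry in $keywords.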

Warning: you may run into problems with case sensitivity. Because "Race" !== "race", the line $words = array_count_values(str_word_count($txt, 1)); treats them as two different words.
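One possible way around the case problem, assuming you are happy to report every word in lowercase, is to lowercase both the text and the keywords before counting and matching. The sample text and the shortened keyword list here are invented for illustration; this is a sketch, not the only way to do it:

```php
<?php
// Hypothetical sample text; the question's $txt would work the same way.
$txt = "Race day: the race at Monaco was a close race. "
     . "Tyre supplier Pirelli brought a soft tyre.";

// Lowercase the text first so "Race" and "race" share a single count.
$words = array_count_values(str_word_count(strtolower($txt), 1));
arsort($words);

// Lowercase the keywords too, so they match the lowercased keys of $words.
$keywords = array("Monaco", "race", "Tyre", "Ferrari");
$lookup = array_flip(array_map('strtolower', $keywords));

// Same approach as above: intersect on keys, then keep the first 5 entries.
$matchWords = array_slice(array_intersect_key($words, $lookup), 0, 5);
var_dump($matchWords);
```

Here "Race" and "race" are merged into one entry with a count of 3, and "Tyre" and "tyre" into one entry with a count of 2. The trade-off is that the keys of $matchWords are lowercase, so the original capitalisation of the keywords is lost.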

answered 2013-05-30T08:59:40.277