php - Calculate with regex group matches?

Question

Is it possible to do calculations with regex group matches?

String:

(00) Bananas
...
(02) Apples (red ones)
...
(05) Oranges
...
(11) Some Other Fruit
...

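If the difference between the numbers at the start of two consecutive numbered lines is 3 or less, remove the "..." between them. So the string should be returned like this: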

(00) Bananas
(02) Apples (red ones)
(05) Oranges
...
(11) Some Other Fruit

Regex:

$match = '/(*ANYCRLF)\((\d+)\) (.+)$
\.{3}
\((\d+)\) (.+)/m';

The tricky part here is how to grab the matches and feed them into a condition like

if ($3 - $1 <= 3) {
  //replace
}

Test: http://codepad.viper-7.com/f6iI4m

Thanks!

score 3 · Accepted Answer

Here's a way to do it with preg_replace_callback():

$callback = function ($match) {
    if ($match[3] <= $match[2] + 3) {
        return $match[1];
    } else {
        return $match[0];
    }
};

$newtxt = preg_replace_callback('/(^\((\d+)\).+$)\s+^\.{3}$(?=\s+^\((\d+)\))/m', $callback, $txt);

/(^\((\d+)\).+$)\s+^\.{3}$(?=\s+^\((\d+)\))/m

Here is the pattern in pieces:

(^\((\d+)\).+$)      # subpattern 1, first line; subpattern 2, the number
\s+^\.{3}$           # newline(s) and second line ("...")
(?=\s+^\((\d+)\))    # lookahead that matches another numbered line 
                     # without consuming it; contains subpattern 3, next number

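So the match of the entire pattern is the first two lines (i.e., the numbered line and the "..." line).

If the difference between the numbers is greater than 3, the callback returns the original text in $match[0] (i.e., no change). If the difference is less than or equal to 3, it returns only the first line (found in $match[1]).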
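To check this behavior against the sample from the question, here is a minimal sketch. The wrapper function removeSmallGaps() and its $maxGap parameter are hypothetical names added for illustration; the pattern and the comparison are the ones from this answer, and the input is assumed to use "\n" line endings.

```php
<?php
// Hypothetical wrapper around the pattern above; $maxGap is an illustrative parameter.
function removeSmallGaps(string $txt, int $maxGap = 3): string
{
    $pattern = '/(^\((\d+)\).+$)\s+^\.{3}$(?=\s+^\((\d+)\))/m';

    return preg_replace_callback($pattern, function (array $match) use ($maxGap) {
        // $match[2] is the current line's number, $match[3] the next numbered line's.
        $gap = (int) $match[3] - (int) $match[2];

        // Small gap: keep only the numbered line, dropping the "..." line.
        // Large gap: return the whole match unchanged.
        return $gap <= $maxGap ? $match[1] : $match[0];
    }, $txt);
}

// Sample input from the question, written with explicit "\n" line endings.
$txt = "(00) Bananas\n...\n(02) Apples (red ones)\n...\n(05) Oranges\n...\n(11) Some Other Fruit\n...";

echo removeSmallGaps($txt), "\n";
```

The final "..." stays in place because no numbered line follows it, so the lookahead fails there.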

score 0 · Accepted Answer

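You can use preg_replace_callback and return the replacement string from arbitrary PHP code. The callback receives the captures. However, to produce your output you would need overlapping matches for the replacement:

compare (00) Bananas vs (02) Apples -> 2-0=2 replace
compare (02) Apples vs (05) Oranges -> 5-2=3 replace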
...

However, the (02) Apples part of the input has already been consumed by the previous match, so it won't be picked up the second time.
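The consumption problem can be seen by counting matches. The sketch below uses two simplified patterns written for this illustration (they are not the exact patterns from the question) and assumes "\n" line endings: one consumes the following numbered line, the other only inspects it with a lookahead.

```php
<?php
$s = "(00) Bananas\n...\n(02) Apples\n...\n(05) Oranges";

// Consuming version: the following numbered line is part of the match,
// so it cannot serve as the first line of the next pair.
$consuming = '/^\((\d+)\) .+\n\.{3}\n\((\d+)\) .+$/m';

// Lookahead version: the following numbered line is only inspected,
// so the next match can start on it.
$lookahead = '/^\((\d+)\) .+\n\.{3}\n(?=\((\d+)\) .+$)/m';

$consumingCount = preg_match_all($consuming, $s, $a, PREG_SET_ORDER);
$lookaheadCount = preg_match_all($lookahead, $s, $b, PREG_SET_ORDER);

// Prints: consuming: 1 pair(s), lookahead: 2 pair(s)
echo "consuming: $consumingCount pair(s), lookahead: $lookaheadCount pair(s)\n";
```

Captures made inside the lookahead are still available to the callback, which is why the number of the next line can be compared without consuming that line.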

Edit:

Here is a regex-based solution using a lookahead. Credit goes to Wiseguy.

$s = "(00) Bananas
...
(02) Apples (red ones)
...
(05) Oranges
...
(11) Some Other Fruit
...";

$match = '/(*ANYCRLF)\((\d+)\) (.+)$
\.{3}
(?=\((\d+)\) (.+))/m';

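// PHP 5.3 anonymous function syntax
$s = preg_replace_callback($match, function ($m) {
    if ($m[3] - $m[1] <= 3) {
        // numbers are close: strip the line break and the "..." line from the match
        return preg_replace("/[\r\n]+\.{3}/", '', $m[0]);
    }
    // otherwise leave the match unchanged
    return $m[0];
}, $s);
echo $s;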

This is my first take, based on the logic "find the dots, then check the previous/next line".

$s = "(00) Bananas
...
(02) Apples (red ones)
...
(05) Oranges
...
(11) Some Other Fruit
...
(18) Some Other Fruit
...
(19) Some Other Fruit
...
";

$s = preg_replace("/[\r\n]{2}/", "\n", $s);

$num_pattern = '/^\((?<num>\d+)\)/';
$dots_removed = 0;

preg_match_all('/\.{3}/', $s, $m, PREG_OFFSET_CAPTURE);
foreach ($m[0] as $i => $dots) {
    $offset = $dots[1] - ($dots_removed * 4); // fix offset of changing input

    $prev_line_end = $offset - 2; // -2 since the offset is pointing to the first '.', prev char is "\n"
    $prev_line_start = $prev_line_end; // start the search for the prev line's start from its end
    while ($prev_line_start > 0 && $s[$prev_line_start] != "\n") {
        --$prev_line_start;
    }

    $next_line_start = $offset + strlen($dots[0]) + 1;
    $next_line_end = strpos($s, "\n", $next_line_start);
    $next_line_end or $next_line_end = strlen($s);

    $prev_line = trim(substr($s, $prev_line_start, $prev_line_end - $prev_line_start));
    $next_line = trim(substr($s, $next_line_start, $next_line_end - $next_line_start));

    if (!$next_line) {
        break;
    }

    // get the numbers
    preg_match($num_pattern, $prev_line, $prev);
    preg_match($num_pattern, $next_line, $next);

    if (intval($next['num']) - intval($prev['num']) <= 3) {
        // delete the "..." line
        $s = substr_replace($s, '', $offset-1, strlen($dots[0]) + 1);
        ++$dots_removed;
    }
}

print $s;
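The same "find the dots, then check the previous/next line" idea can also be sketched by splitting the text into lines, which avoids the offset bookkeeping. This is a different technique from the offset-based code above, not a rewrite of it. The function name dropDotsBetweenCloseNumbers() and the $maxGap parameter are hypothetical, and the sketch assumes it is acceptable to normalize all line endings to "\n" in the output.

```php
<?php
// Hypothetical helper: line-based variant of the "check prev/next line" idea.
function dropDotsBetweenCloseNumbers(string $s, int $maxGap = 3): string
{
    // Split on any common line ending; the output is rejoined with "\n".
    $lines = preg_split('/\r\n|\r|\n/', $s);
    $numPattern = '/^\((\d+)\)/';
    $out = [];

    foreach ($lines as $i => $line) {
        if ($line === '...' && $i > 0 && isset($lines[$i + 1])
            && preg_match($numPattern, $lines[$i - 1], $prev)
            && preg_match($numPattern, $lines[$i + 1], $next)
            && (int) $next[1] - (int) $prev[1] <= $maxGap
        ) {
            continue; // both neighbours are numbered and close together: skip the dots
        }
        $out[] = $line;
    }

    return implode("\n", $out);
}

$s = "(00) Bananas\n...\n(02) Apples (red ones)\n...\n(05) Oranges\n...\n"
   . "(11) Some Other Fruit\n...\n(18) Some Other Fruit\n...\n(19) Some Other Fruit\n...\n";

print dropDotsBetweenCloseNumbers($s);
```

Because each "..." line is judged against the original neighbouring lines, earlier removals never shift anything that later checks depend on.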
