python - Removing noisy background lines from a captcha image using Python PIL

Question

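I have a processed captcha image (enlarged) that looks like this:

As you can see, the font size of the "TEXT" is slightly larger than the width of the noisy lines.
So I need an algorithm or code to remove the noisy lines from this image.

Using the Python PIL library and the chopping algorithm shown below, I could not get an output image that OCR can read easily.

Here is the Python code I tried: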

import PIL.Image
import sys

# python chop.py [chop-factor] [in-file] [out-file]

chop = int(sys.argv[1])
image = PIL.Image.open(sys.argv[2]).convert('1')
width, height = image.size
data = image.load()

# Iterate through the rows.
for y in range(height):
    # A while loop is used here because reassigning the loop variable of a
    # for loop does not skip iterations in Python.
    x = 0
    while x < width:

        # Make sure we're on a dark pixel.
        if data[x, y] > 128:
            x += 1
            continue

        # Keep a total of non-white contiguous pixels.
        total = 0

        # Check a sequence ranging from x to image.width.
        for c in range(x, width):

            # If the pixel is dark, add it to the total.
            if data[c, y] <= 128:
                total += 1

            # If the pixel is light, stop the sequence.
            else:
                break

        # If the total is at most the chop, replace everything with white.
        if total <= chop:
            for c in range(total):
                data[x + c, y] = 255

        # Skip the sequence we just examined (total is always >= 1 here).
        x += total


# Iterate through the columns.
for x in range(width):
    # A while loop is used here because reassigning the loop variable of a
    # for loop does not skip iterations in Python.
    y = 0
    while y < height:

        # Make sure we're on a dark pixel.
        if data[x, y] > 128:
            y += 1
            continue

        # Keep a total of non-white contiguous pixels.
        total = 0

        # Check a sequence ranging from y to image.height.
        for c in range(y, height):
            # If the pixel is dark, add it to the total.
            if data[x, c] <= 128:
                total += 1

            # If the pixel is light, stop the sequence.
            else:
                break

        # If the total is at most the chop, replace everything with white.
        if total <= chop:
            for c in range(total):
                data[x, y + c] = 255

        # Skip the sequence we just examined (total is always >= 1 here).
        y += total

image.save(sys.argv[3])

So, basically, I would like to know a better algorithm or code for removing the noise so that the image can be read by OCR (Tesseract or pytesser).

score 1 · Accepted Answer

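To get rid of most of the lines quickly, turn every black pixel that has two or fewer adjacent black pixels white. That should take care of the stray lines. Then, if you are left with a lot of "blocks", you can remove the smaller ones.

This assumes that the sample image has been enlarged and that the lines are only one pixel wide.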
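The two steps above can be sketched roughly as follows. This is only an illustration under some assumptions the answer does not state: "adjacent" is taken to mean the 8 surrounding pixels, the image is represented as a plain 2D list of booleans (True for black), and the helper names `remove_thin_lines` and `remove_small_blocks`, along with the `min_size` threshold, are hypothetical.

```python
from collections import deque

# 8-connected neighbour offsets (an assumption; the answer does not say 4 or 8).
NEIGHBOURS = [(-1, -1), (0, -1), (1, -1), (-1, 0),
              (1, 0), (-1, 1), (0, 1), (1, 1)]


def black_neighbours(grid, x, y):
    """Count black (True) cells among the 8 neighbours of (x, y)."""
    height, width = len(grid), len(grid[0])
    count = 0
    for dx, dy in NEIGHBOURS:
        nx, ny = x + dx, y + dy
        if 0 <= nx < width and 0 <= ny < height and grid[ny][nx]:
            count += 1
    return count


def remove_thin_lines(grid, max_neighbours=2):
    """Return a copy where black cells with few black neighbours are cleared."""
    # Counts are taken from the unmodified input so removals do not cascade.
    return [
        [bool(cell) and black_neighbours(grid, x, y) > max_neighbours
         for x, cell in enumerate(row)]
        for y, row in enumerate(grid)
    ]


def remove_small_blocks(grid, min_size):
    """Clear 8-connected black components smaller than min_size cells."""
    height, width = len(grid), len(grid[0])
    result = [row[:] for row in grid]
    seen = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not grid[y][x] or seen[y][x]:
                continue
            # Breadth-first search collects one connected component.
            component, queue = [], deque([(x, y)])
            seen[y][x] = True
            while queue:
                cx, cy = queue.popleft()
                component.append((cx, cy))
                for dx, dy in NEIGHBOURS:
                    nx, ny = cx + dx, cy + dy
                    if (0 <= nx < width and 0 <= ny < height
                            and grid[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            if len(component) < min_size:
                for cx, cy in component:
                    result[cy][cx] = False
    return result
```

With PIL, such a grid could be built from a mode '1' image by testing `image.getpixel((x, y)) == 0` for each coordinate, and written back with `putpixel`. Where a noise line touches or crosses a character, its pixels gain extra neighbours and survive the first pass, so the thresholds will likely need tuning per captcha.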

score 0 · Accepted Answer

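Personally, I use dilate and erode as mentioned above, but I combine that with some basic statistics on width and height to find outliers and remove those lines as needed. After that, a filter should work that takes the minimum value within a kernel and sets the central pixel of a temporary image to that value (iterating over the old image), with the temporary image then used as the original. In Pillow/PIL, the minimum-based task is accomplished with img.filter(ImageFilter.MinFilter(size)).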
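The erode/dilate idea can be sketched with Pillow's rank filters. This is a minimal illustration, not the answerer's actual code: it assumes dark text on a white background, where `ImageFilter.MinFilter` thickens dark strokes and `ImageFilter.MaxFilter` thins them. For that polarity the sketch applies the maximum filter first, to wipe out dark lines narrower than the kernel, and then the minimum filter, to grow the surviving strokes back. The function name `remove_thin_dark_lines` is hypothetical.

```python
from PIL import Image, ImageFilter


def remove_thin_dark_lines(img, size=3):
    """Drop dark features narrower than `size` pixels, then restore the rest.

    Assumes dark-on-white input; for white-on-dark images the two
    filters would need to be swapped.
    """
    gray = img.convert("L")
    # Max filter: a pixel stays dark only if its whole neighbourhood is dark,
    # so 1-pixel-wide dark lines disappear.
    thinned = gray.filter(ImageFilter.MaxFilter(size))
    # Min filter: surviving dark regions grow back to roughly their old extent.
    return thinned.filter(ImageFilter.MinFilter(size))


if __name__ == "__main__":
    # Synthetic example: a 1-pixel line plus a 6x6 block on a white canvas.
    canvas = Image.new("L", (20, 20), 255)
    for x in range(20):
        canvas.putpixel((x, 3), 0)
    for y in range(8, 14):
        for x in range(8, 14):
            canvas.putpixel((x, y), 0)
    cleaned = remove_thin_dark_lines(canvas)
    print(cleaned.getpixel((0, 3)), cleaned.getpixel((10, 10)))
```

The kernel size has to exceed the line width but stay below the stroke width of the characters; otherwise the text is erased along with the noise.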

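If that is not enough, you will need to use OpenCV contours and minimum rotated bounding boxes to rotate the characters and produce an identifiable set for comparison (at this point I would recommend Tesseract or a commercial OCR, since they come with a huge number of fonts and extras such as clustering and cleanup).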
