python - Parsing email BODYSTRUCTURE in Python

Question

Possible duplicate:
Parsing parenthesized list in python's imaplib

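Python's email package lets you take the full text of an email message, parse it into parts, and walk through those parts. However, is there a library that will parse the BODYSTRUCTURE response returned by the IMAP protocol into email.message.Message parts?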
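For context, the full-text approach mentioned above can be sketched roughly as follows. This is only an illustration: the sample message and the helper name list_content_types are made up for the example, and a real message would come from a full IMAP fetch rather than a string literal.

```python
import email

# Hand-written sample message (an assumption for illustration only).
RAW_MESSAGE = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: demo\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain; charset=us-ascii\n"
    "\n"
    "Hello\n"
    "--XYZ\n"
    "Content-Type: text/html; charset=us-ascii\n"
    "\n"
    "<p>Hello</p>\n"
    "--XYZ--\n"
)

def list_content_types(raw_text):
    """Parse the full message text and return the content type of every part."""
    msg = email.message_from_string(raw_text)
    # walk() yields the container message first, then each subpart depth-first
    return [part.get_content_type() for part in msg.walk()]

content_types = list_content_types(RAW_MESSAGE)
print(content_types)
```

The drawback is that this needs the whole message body, whereas BODYSTRUCTURE describes the same tree without downloading any content.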

EDIT:

The equivalent method in PHP is imap_fetchbody(), which handles parsing the structure automatically.

EDIT2:

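This question was incorrectly closed as a duplicate. It is about parsing the nested BODYSTRUCTURE expression into a convenient format (such as a dictionary), not about parsing the raw string into Python types. In any case, I ended up rolling my own solution. Here is the code for anyone who runs into a similar problem in the future. It is meant to be used together with the imapclient library.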

# ----- Parsing BODYSTRUCTURE into parts dictionaries ----- #

def tuple2dict(pairs):
    """Build a dict from a flat (key, value, key, value, ...) tuple; nested tuples recurse."""
    if not pairs:
        return None
    return {k: tuple2dict(v) if isinstance(v, tuple) else v
            for k, v in zip(pairs[::2], pairs[1::2])}

def parse_singlepart(var, part_no):
    """Convert a non-multipart body into a dict."""
    # Basic fields for non-multipart (Required)
    part = dict(zip(['maintype', 'subtype', 'params', 'id', 'description', 'encoding', 'size'], var[:7]), part_no=part_no)
    part['params'] = tuple2dict(part['params'])

    # Type specific fields (Required for 'message' or 'text' type)
    index = 7
    if part['maintype'].lower() == 'message' and part['subtype'].lower() == 'rfc822':
        part.update(zip(['envelope', 'bodystructure', 'lines'], var[7:10]))
        index = 10
    elif part['maintype'].lower() == 'text':
        part['lines'] = var[7]
        index = 8

    # Extension fields for non-multipart (Optional)
    part.update(zip(['md5', 'disposition', 'language', 'location'], var[index:]))
    # Extension fields may be absent; .get() avoids a KeyError in that case
    part['disposition'] = tuple2dict(part.get('disposition'))

    return part

def parse_multipart(var, part_no):
    """Convert a multipart body into a dict."""
    part = { 'child_parts': [], 'part_no': part_no }

    # First parse the child parts
    index = 0
    if isinstance(var[0], list):
        part['child_parts'] = [parse_part(v, ('%s.%d' % (part_no, i+1)).replace('TEXT.', '')) for i, v in enumerate(var[0])]
        index = 1
    elif isinstance(var[0], tuple):
        # Bounds check guards against malformed input with no trailing subtype
        while index < len(var) and isinstance(var[index], tuple):
            part['child_parts'].append(parse_part(var[index], ('%s.%d' % (part_no, index+1)).replace('TEXT.', '')))
            index += 1

    # Then parse the required field subtype and optional extension fields
    part.update(zip(['subtype', 'params', 'disposition', 'language', 'location'], var[index:]))
    # 'params' and 'disposition' are optional extension data; .get() avoids a KeyError
    part['params'] = tuple2dict(part.get('params'))
    part['disposition'] = tuple2dict(part.get('disposition'))

    return part

def parse_part(var, part_no=None):
    """Parse IMAP email BODYSTRUCTURE into nested dictionary

    See http://tools.ietf.org/html/rfc3501#section-6.4.5 for structure of email messages
    See http://tools.ietf.org/html/rfc3501#section-7.4.2 for specification of BODYSTRUCTURE
    """
    if isinstance(var[0], (tuple, list)):
        return parse_multipart(var, part_no or 'TEXT')
    else:
        return parse_singlepart(var, part_no or '1')
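
Once parse_part has produced the nested dictionary, a small walker can list the leaf parts and their part numbers. The following is a minimal sketch, not part of the original solution: iter_leaf_parts is a hypothetical helper, and SAMPLE is a hand-written stand-in for the shape parse_part appears to return, not real server output. Depending on the imapclient version, actual values may be bytes or differently cased.

```python
def iter_leaf_parts(part):
    """Yield (part_no, content_type) for every non-multipart part, depth-first."""
    children = part.get('child_parts')
    if children is not None:
        # Multipart container: descend into each child in order
        for child in children:
            yield from iter_leaf_parts(child)
    else:
        yield part['part_no'], '%s/%s' % (part['maintype'], part['subtype'])

# Hand-written example of the nested structure (an assumption for illustration).
SAMPLE = {
    'part_no': 'TEXT', 'subtype': 'mixed',
    'params': {'boundary': 'XYZ'}, 'disposition': None,
    'child_parts': [
        {'part_no': '1', 'subtype': 'alternative',
         'params': {'boundary': 'ABC'}, 'disposition': None,
         'child_parts': [
             {'part_no': '1.1', 'maintype': 'text', 'subtype': 'plain',
              'params': {'charset': 'utf-8'}, 'encoding': '7bit', 'size': 12},
             {'part_no': '1.2', 'maintype': 'text', 'subtype': 'html',
              'params': {'charset': 'utf-8'}, 'encoding': '7bit', 'size': 40},
         ]},
        {'part_no': '2', 'maintype': 'application', 'subtype': 'pdf',
         'params': {'name': 'report.pdf'}, 'encoding': 'base64', 'size': 2048},
    ],
}

leaves = list(iter_leaf_parts(SAMPLE))
for part_no, content_type in leaves:
    print(part_no, content_type)
```

The part numbers collected this way are the section identifiers one would then pass in a BODY[...] fetch to download only the parts of interest.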
