python - Pythonスクリプトにデータブロックを保存するPythonicの方法は何ですか?

Question

Perl では__DATA__、スクリプトでトークンを使用して、データブロックの開始をマークすることができます。DATA ファイルハンドルを使用してデータを読み取ることができます。スクリプトにデータブロックを格納する Pythonic の方法は何ですか?

score 11 · Accepted Answer

データにもよりますが、辞書リテラルと複数行の文字列はどちらも非常に良い方法です。

state_abbr = {
    'MA': 'Massachusetts',
    'MI': 'Michigan',
    'MS': 'Mississippi',
    'MN': 'Minnesota',
    'MO': 'Missouri',
    }

gettysburg = """
Four score and seven years ago,
our fathers brought forth on this continent
a new nation, 
conceived in liberty
and dedicated to the proposition
that all men are created equal.
"""

score 6 · Accepted Answer

StringIOモジュールを使用して、ソース内のファイルのようなオブジェクトを作成します。

from StringIO import StringIO

textdata = """\
Now is the winter of our discontent,
Made glorious summer by this sun of York.
"""

# in place of __DATA__ = open('richard3.txt')
__DATA__ = StringIO(textdata)
for d in __DATA__:
    print d

__DATA__.seek(0)
print __DATA__.readline()

プリント：

Now is the winter of our discontent,

Made glorious summer by this sun of York.

Now is the winter of our discontent,

（私__DATA__はあなたの元の質問に合わせるためにこれを呼んだ。実際には、これは良いPythonの命名スタイルではないだろう-のようなものdatafileがより適切だろう。）

score 1 · Accepted Answer

IMO データの種類に大きく依存します: テキストしかなく、内部に「」または「」が含まれていないことが確実な場合は、このバージョンのテキストを保存することができます。たとえば、「」または「」が存在する、または存在する可能性があることがわかっているテキストを保存したい場合はどうしますか? それからそれはお勧めです

何らかの方法でコード化されたデータを保存するか、
別ファイルに入れる

例: テキストは

Python ライブラリには多くの ''' と """ があります。

この場合、三重引用符を介してそれを行うのは難しいかもしれません。だからあなたはすることができます

__DATA__ = """There are many '''s and \"""s in Python libraries.""";
print __DATA__

ただし、テキストを編集または置換する際には注意が必要です。この場合、次のようにする方が便利かもしれません

$ python -c 'import sys; print sys.stdin.read().encode("base64")'
There are many '''s and """s in Python libraries.<press Ctrl-D twice>

それからあなたは得る

VGhlcmUgYXJlIG1hbnkgJycncyBhbmQgIiIicyBpbiBQeXRob24gbGlicmFyaWVzLg==

出力として。これを取得して、次のようにスクリプトに入れます

__DATA__ = 'VGhlcmUgYXJlIG1hbnkgJycncyBhbmQgIiIicyBpbiBQeXRob24gbGlicmFyaWVzLg=='.decode('base64')
print __DATA__

そして結果を見る。

score 0 · Accepted Answer

Perl の__DATA__変数に慣れていない Google は、テストによく使用されると言っています。コードのテストも検討していると仮定すると、doctest (http://docs.python.org/library/doctest.html) を検討することをお勧めします。たとえば、代わりに

import StringIO

__DATA__ = StringIO.StringIO("""lines
of data
from a file
""")

DATAをファイルオブジェクトにしたいと仮定すると、それは現在取得しているものであり、今後は他のほとんどのファイルオブジェクトと同じように使用できます。例えば：

if __name__=="__main__":
    # test myfunc with test data:
    lines = __DATA__.readlines()
    myfunc(lines)

しかし、DATAの唯一の用途がテストである場合は、doctest を作成するか、PyUnit / Nose でテストケースを作成する方がよいでしょう。

例えば：

import StringIO

def myfunc(lines):
    r"""Do something to each line

    Here's an example:

    >>> data = StringIO.StringIO("line 1\nline 2\n")
    >>> myfunc(data)
    ['1', '2']
    """
    return [line[-2] for line in lines]

if __name__ == "__main__":
    import doctest
    doctest.testmod()

これらのテストを次のように実行します。

$ python ~/doctest_example.py -v
Trying:
    data = StringIO.StringIO("line 1\nline 2\n")
Expecting nothing
ok
Trying:
    myfunc(data)
Expecting:
    ['1', '2']
ok
1 items had no tests:
    __main__
1 items passed all tests:
   2 tests in __main__.myfunc
2 tests in 2 items.
2 passed and 0 failed.
Test passed.

Doctest は、プレーンテキストファイルで Python テストを見つけて実行するなど、さまざまなことを行います。個人的には、私は大ファンではなく、より構造化されたテストアプローチを好みます ( import unittest) が、コードをテストするための明らかに Pythonic な方法です。

python - Pythonスクリプトにデータブロックを保存するPythonicの方法は何ですか?

4 に答える 4

Related

Reference