python - Extracting a block of text from a ReST document by :ref:?

Question

I have some reStructuredText documentation, and I would like to use snippets of it in online help. One approach seems to be to "snip out" pieces of the markup by reference, e.g.

.. _my_boring_section:

Introductory prose
------------------

blah blah blah

.. _my_interesting_section:

About this dialog
-----------------

talk about stuff which is relevant in contextual help

How can I extract the markup for the _my_interesting_section marker using python/docutils/sphinx?

score 2 · Accepted Answer

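I'm not sure how to do this other than by subclassing and customizing the Docutils parser. If you only need the relevant section of the reStructuredText and don't mind losing some of the markup, you could try the following. Alternatively, the processed markup for a particular section (i.e. the reStructuredText converted to HTML or LaTeX) is very easy to get. See my answer to this question for an example of extracting part of the processed XML. Let me know if this is what you want. Anyway, here goes...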

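You can manipulate reStructuredText very easily using Docutils. First, you can publish the Docutils document tree (doctree) representation of the reStructuredText using the Docutils publish_doctree function. This doctree can be easily traversed and searched for particular document elements, i.e. sections, with particular attributes. The easiest way to search for a particular section reference is to examine the ids attribute of the doctree itself. doctree.ids is just a dictionary containing a mapping of all references to the appropriate part of the document.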

from docutils.core import publish_doctree

s = """.. _my_boring_section:

Introductory prose
------------------

blah blah blah

.. _my_interesting_section:

About this dialog
-----------------

talk about stuff which is relevant in contextual help
"""

# Parse the above string to a Docutils document tree:
doctree = publish_doctree(s)

# Get element in the document with the reference id `my-interesting-section`:
ids = 'my-interesting-section'

try:
    section = doctree.ids[ids]
except KeyError:
    # Do some exception handling here...
    raise KeyError('No section with ids {0}'.format(ids))

# We can also make sure that the element we found is in fact a section:
import docutils.nodes
assert isinstance(section, docutils.nodes.section)

# Finally, get the section text (the markup itself is lost):
text = section.astext()
print(repr(text))

# This will print:
# 'About this dialog\n\ntalk about stuff which is relevant in contextual help'

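So the markup has been lost. If there is nothing too fancy, it is easy to insert some dashes under the first line of the result above to get back to your section heading. I'm not sure what you would need to do for more complicated inline markup. Hopefully the above is a good starting point, though.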
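As a rough illustration of re-inserting the dashes, here is a minimal sketch. The helper name restore_heading is hypothetical (it is not part of Docutils), and it assumes the text has the shape returned by section.astext(): the title, a blank line, then the body. It only restores the heading underline, not any inline markup.

```python
def restore_heading(section_text, underline_char="-"):
    """Re-insert an underline beneath the first line of a section's plain text.

    Hypothetical helper: assumes the title is everything before the first
    blank line, as in the output of section.astext().
    """
    title, sep, body = section_text.partition("\n\n")
    # Use one underline character per character of the title.
    underline = underline_char * len(title)
    if not sep:
        # The section has a title but no body.
        return f"{title}\n{underline}\n"
    return f"{title}\n{underline}\n\n{body}\n"


text = "About this dialog\n\ntalk about stuff which is relevant in contextual help"
print(restore_heading(text))
```

Note that len() counts code points, so titles containing wide characters may need a longer underline than this sketch produces.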

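Note: when querying doctree.ids, the ids value you pass is slightly different from the definition in the reStructuredText. The leading underscore has been removed and all other underscores have been replaced by -s. This is how Docutils normalizes references. It would be very simple to write a function that transforms reStructuredText references into Docutils' internal representation. Otherwise, I'm sure that if you dig through Docutils you can find the routine that does this.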
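A minimal sketch of such a conversion function follows. The name ref_to_id is hypothetical, and the function is only an ASCII approximation of the normalization described above: lowercase the label, turn every run of other characters into a single hyphen, and trim the ends. The routine inside Docutils appears to be docutils.nodes.make_id, which additionally handles non-ASCII characters, so prefer it when Docutils is importable.

```python
import re


def ref_to_id(label):
    """Approximate the id that Docutils derives from a target label.

    Hypothetical helper, ASCII-only: it does not transliterate accented or
    other non-ASCII characters the way Docutils itself does.
    """
    text = label.strip().lower()
    # Replace each run of characters outside a-z and 0-9 with one hyphen.
    text = re.sub(r"[^a-z0-9]+", "-", text)
    # Drop leading hyphens/digits and trailing hyphens.
    return re.sub(r"^[-0-9]+|-+$", "", text)


print(ref_to_id("_my_interesting_section"))
print(ref_to_id(".. _my_interesting_section:"))
```

Both calls should yield my-interesting-section, the key used with doctree.ids in the example above.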
