python - Deleting specific entries from a bibtex file based on the citation key using Python

Question

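How can I delete a specific entry from a bibtex file based on its citation key using Python? Basically, I want a function that takes two arguments (the path to the bibtex file and the citation key) and deletes the entry corresponding to that key from the file. I played around with regular expressions but without success. I also looked around a bit for a bibtex parser, but that seems like overkill. In the skeleton function below, the crucial part is content_modified =.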

def deleteEntry(path, key):
  # get content of bibtex file
  f = open(path, 'r')
  content = f.read()
  f.close() 
  # delete entry from content string
  content_modified = 

  # rewrite file
  f = open(path, 'w')
  f.write(content_modified)
  f.close()

Here is an example bibtex file (note the whitespace in the abstract).

@article{dai2008thebigfishlittlepond,
    title = {The {Big-Fish-Little-Pond} Effect: What Do We Know and Where Do We Go from Here?},
    volume = {20},
    shorttitle = {The {Big-Fish-Little-Pond} Effect},
    url = {http://dx.doi.org/10.1007/s10648-008-9071-x},
    doi = {10.1007/s10648-008-9071-x},
    abstract = {The big-fish-little-pond effect {(BFLPE)} refers to the theoretical prediction that equally able students will have lower academic
self-concepts in higher-achieving or selective schools or programs than in lower-achieving or less selective schools or programs,
largely due to social comparison based on local norms. While negative consequences of being in a more competitive educational
setting are highlighted by the {BFLPE}, the exact nature of the {BFLPE} has not been closely scrutinized. This article provides
a critique of the {BFLPE} in terms of its conceptualization, methodology, and practical implications. Our main argument is that
of the {BFLPE.}},
    number = {3},
    journal = {Educational Psychology Review},
    author = {Dai, David Yun and Rinn, Anne N.},
    year = {2008},
    keywords = {education, composition by performance, education, peer effect, education, school context, education, social comparison/big-fish{\textendash}little-pond effect},
    pages = {283--317},
    file = {Dai_Rinn_2008_The Big-Fish-Little-Pond Effect.pdf:/Users/jpl2136/Documents/Literatur/Dai_Rinn_2008_The Big-Fish-Little-Pond Effect.pdf:application/pdf}
}

@book{coleman1966equality,
    title = {Equality of Educational Opportunity},
    shorttitle = {Equality of educational opportunity},
    publisher = {{U.S.} Dept. of Health, Education, and Welfare, Office of Education},
    author = {Coleman, James},
    year = {1966},
    keywords = {\_task\_obtain, education, school context, soz. Ungleichheit, education}
}

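Edit: Here is the solution I came up with. It is not based on matching the whole bibtex entry. Instead, it looks for every entry beginning such as @article{dai2008thebigfishlittlepond, and slices the content string to delete the corresponding entry.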

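import re
content_keys = [(m.group(1), m.start(0)) for m in re.finditer(r"@\w{1,20}\{([\w\d-]+),", content)]
idx = [k[0] for k in content_keys].index(key)
# The entry ends where the next one starts, or at the end of the file for the last entry.
end = content_keys[idx + 1][1] if idx + 1 < len(content_keys) else len(content)
content_modified = content[:content_keys[idx][1]] + content[end:]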

score 1 · Accepted Answer

As Beni Cherniavsky-Paskin mentioned in the comments, you will have to rely on the fact that BibTex entries start and end right at the beginning of a line (without any tabs or spaces). Then you can do this:

import re
# re.escape keeps regex metacharacters in the key from being interpreted
pattern = re.compile(r"^@\w+\{" + re.escape(key) + r",.*?^\}", re.S | re.M)
content_modified = pattern.sub("", content)

Note the two modifiers: S makes . match line breaks, and M makes ^ match at the beginning of each line.
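Putting the pieces together, a minimal sketch of the complete function might look like the following. It assumes, as above, that every entry starts with @ in the first column and ends with } in the first column. The names remove_entry and delete_entry are chosen here for illustration. The re.escape call and the optional trailing \n? (which also removes the line break after the closing brace) are additions that are not part of the pattern above.

```python
import re


def remove_entry(content, key):
    """Return content without the entry whose citation key is key."""
    # re.M: ^ matches at the start of every line; re.S: . also matches newlines.
    # Assumption: the entry opens with "@" and closes with "}" in column 0.
    pattern = re.compile(
        r"^@\w+\{" + re.escape(key) + r",.*?^\}\n?", re.S | re.M
    )
    return pattern.sub("", content)


def delete_entry(path, key):
    """Rewrite the bibtex file at path without the entry for key."""
    with open(path, "r") as f:
        content = f.read()
    with open(path, "w") as f:
        f.write(remove_entry(content, key))
```

If the key does not occur in the file, the content is written back unchanged.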

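If you cannot rely on this fact, the BibTex format is simply not a regular language (since {} can be nested, you have to count the nesting to get the right result). There are regex flavors that might still make this task possible (using recursion or balancing groups), but I don't think Python supports these features. So you would actually have to use a BibTex parser (which, I suspect, would make the code even more fragile).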
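Counting the nesting by hand is also possible without a full parser. The following is only a rough sketch of that idea, not a BibTex parser: it assumes the braces inside the entry are balanced and that no backslash-escaped braces occur. The function name remove_entry_by_braces is hypothetical.

```python
import re


def remove_entry_by_braces(content, key):
    """Remove the entry for key by counting brace nesting.

    Assumption: braces inside the entry are balanced and there are
    no backslash-escaped braces.
    """
    # Find the entry header, e.g. "@article{somekey,", anywhere in the text.
    header = re.search(r"@\w+\s*\{\s*" + re.escape(key) + r"\s*,", content)
    if header is None:
        return content  # key not found: leave the content unchanged
    depth = 0
    for i in range(header.start(), len(content)):
        ch = content[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                # i is the brace that closes the entry
                return content[:header.start()] + content[i + 1:]
    return content  # unbalanced braces: give up rather than truncate
```

Unlike the regex above, this does not depend on where in the line the entry starts or ends, so it also works for indented entries or several entries on one line.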
