python - PyPDF2: appending a PDF starting from the second page

Question

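I'm learning to program with the book "Automate the Boring Stuff", and I've hit a roadblock in chapter 13: "Merge multiple PDFs, but omit the title page from all but the first."

In the book they do this by looping over the PDFs, but while exploring the PyPDF2 module I found that the 'pages' option looks like a cleaner solution. However, I can't get it to work.

Please don't judge yet whether it's Pythonic or not. I haven't learned classes yet ;-) After this book I plan to start on classes, objects, decorators, *args and **kwargs ;-)

I need help with the last line of the code snippet.

My code: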

    for fn_PdfObjects in range(len(l_fn_PdfObjects)):
        if fn_PdfObjects != 0:
            break
        else:
            ## watermark the first sheet
            addWatermark(l_fn_PdfObjects[fn_PdfObjects])
            watermarkedPage = PyPDF2.PdfFileReader(open('watermarkedCover.pdf', 'rb'))
            #   the 'position = ' is the page in the destination PDF it will receive
            tempMergerFile.merge(position=fn_PdfObjects, fileobj=watermarkedPage)
            tempMergerFile.merge(position=fn_PdfObjects+1, fileobj=l_fn_PdfObjects[fn_PdfObjects], pages='0:')

Looking at the module, I find this: src: https://pythonhosted.org/PyPDF2/PdfFileMerger.html

merge(position, fileobj, bookmark=None, pages=None, import_bookmarks=True)

pages – can be a Page Range or a (start, stop[, step]) tuple to merge only the specified range of pages from the source document into the output document.
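
To make the tuple form concrete, here is a minimal sketch in plain Python (no PyPDF2 needed). It assumes the tuple is zero-based and half-open, like the arguments to the built-in range(), which matches the default of (0, pdfr.getNumPages()) in merger.py. The helper name pages_after_title is made up for illustration.

```python
def pages_after_title(num_pages):
    """Return a (start, stop) tuple that skips page index 0 (the title page).

    Hypothetical helper for illustration; it is not part of PyPDF2.
    """
    return (1, num_pages)


# Assumption: the tuple is expanded like range(*pages), zero-based and half-open.
selected = list(range(*pages_after_title(5)))
print(selected)  # [1, 2, 3, 4]
```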

I also found this about page_ranges, but whatever I try, I can't get it to work: src: https://github.com/mstamy2/PyPDF2/blob/master/PyPDF2/pagerange.py

    class PageRange(object):
        """
        A slice-like representation of a range of page indices,
            i.e. page numbers, only starting at zero.
        The syntax is like what you would put between brackets [ ].
        The slice is one of the few Python types that can't be subclassed,
        but this class converts to and from slices, and allows similar use.
          o  PageRange(str) parses a string representing a page range.
          o  PageRange(slice) directly "imports" a slice.
          o  to_slice() gives the equivalent slice.
          o  str() and repr() allow printing.
          o  indices(n) is like slice.indices(n).
        """

        def __init__(self, arg):
            """
            Initialize with either a slice -- giving the equivalent page range,
            or a PageRange object -- making a copy,
            or a string like
                "int", "[int]:[int]" or "[int]:[int]:[int]",
                where the brackets indicate optional ints.
            {page_range_help}
            Note the difference between this notation and arguments to slice():
                slice(3) means the first three pages;
                PageRange("3") means the range of only the fourth page.
                However PageRange(slice(3)) means the first three pages.
            """

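The docstring says a PageRange string uses the syntax that goes between the brackets of a slice, and that indices(n) behaves like slice.indices(n). The difference between slice(3) and PageRange("3") can therefore be previewed with plain slices. A small sketch using only built-ins; the 10-page document is an assumption for illustration.

```python
page_indices = list(range(10))  # stand-in for a 10-page PDF, zero-based

# PageRange("1:") is written like the slice [1:]: everything after the title page.
print(page_indices[1:])         # [1, 2, 3, 4, 5, 6, 7, 8, 9]

# slice(3) is [:3]: the first three pages.
print(page_indices[slice(3)])   # [0, 1, 2]

# PageRange("3") is a single index, like [3:4]: only the fourth page.
print(page_indices[3:4])        # [3]

# slice.indices(n) resolves open ends against the page count; the docstring
# says PageRange.indices(n) works the same way.
print(slice(1, None).indices(10))  # (1, 10, 1)
```
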
The error is: TypeError: "pages" must be a tuple of (start, stop[, step])

    Traceback (most recent call last):
      File "combining_select_pages_from_many_pdfs.py", line 112, in <module>
        main()
      File "combining_select_pages_from_many_pdfs.py", line 104, in main
        newPdfFile = mergePdfFiles(l_PdfObjects)
      File "combining_select_pages_from_many_pdfs.py", line 63, in mergePdfFiles
        tempMergerFile.merge(position=fn_PdfObjects+1, fileobj=l_fn_PdfObjects[fn_PdfObjects],pages=[0])
      File "/home/sybie/.local/lib/python3.5/site-packages/PyPDF2/merger.py", line 143, in merge
        raise TypeError('"pages" must be a tuple of (start, stop[, step])')

This is what I can find about it:

    # Find the range of pages to merge.
    if pages == None:
        pages = (0, pdfr.getNumPages())
    elif isinstance(pages, PageRange):
        pages = pages.indices(pdfr.getNumPages())
    elif not isinstance(pages, tuple):
        raise TypeError('"pages" must be a tuple of (start, stop[, step])')

Source: https://github.com/mstamy2/PyPDF2/blob/master/PyPDF2/merger.py#L137
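
This check accounts for the TypeError: a plain string such as '0:' or a list such as [0] is neither None, a PageRange, nor a tuple, so it falls through to the raise. Below is a stand-alone imitation of that branch. FakePageRange and resolve_pages are hypothetical stand-ins written for illustration, not PyPDF2 code.

```python
class FakePageRange:
    """Hypothetical stand-in for PyPDF2's PageRange; it only wraps a slice."""

    def __init__(self, page_slice):
        self._slice = page_slice

    def indices(self, n):
        return self._slice.indices(n)


def resolve_pages(pages, num_pages):
    """Mirror the quoted branch, with the page count passed in directly."""
    if pages is None:
        return (0, num_pages)
    if isinstance(pages, FakePageRange):
        return pages.indices(num_pages)
    if not isinstance(pages, tuple):
        raise TypeError('"pages" must be a tuple of (start, stop[, step])')
    return pages


print(resolve_pages(None, 5))                           # (0, 5)
print(resolve_pages((1, 5), 5))                         # (1, 5)
print(resolve_pages(FakePageRange(slice(1, None)), 5))  # (1, 5, 1)

# A bare string or a list is rejected, as in the traceback.
for bad in ("0:", [0]):
    try:
        resolve_pages(bad, 5)
    except TypeError as exc:
        print(type(bad).__name__, "->", exc)
```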

Thanks in advance for any help!

score 0 · Accepted Answer

I solved the problem by doing this:

    pages=(1,l_fn_PdfObjects[fn_PdfObjects].numPages)

In effect, I made it a tuple. If someone could tell me how page ranges work, I'd be grateful!
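
On how page ranges work: judging by the merger.py snippet quoted in the question, a bare string is rejected, but the same string wrapped in a PageRange object passes the isinstance check. A minimal sketch, assuming the PyPDF2 1.x API used in the question (PdfFileMerger and PageRange importable from the PyPDF2 package; later releases renamed these classes). The function names and file names are hypothetical.

```python
def page_range_spec(pdf_index):
    """Return the page-range string for the PDF at pdf_index.

    None keeps every page (first PDF); "1:" skips page index 0 (the title page).
    """
    return None if pdf_index == 0 else "1:"


def merge_without_title_pages(pdf_paths, output_path):
    """Append all PDFs, dropping the first page of every PDF except the first."""
    # Imported here so page_range_spec can be used without PyPDF2 installed.
    from PyPDF2 import PdfFileMerger, PageRange

    merger = PdfFileMerger()
    for index, path in enumerate(pdf_paths):
        spec = page_range_spec(index)
        # Wrap the string in PageRange; a bare string raises the TypeError
        # shown in the question.
        pages = None if spec is None else PageRange(spec)
        merger.append(path, pages=pages)
    with open(output_path, "wb") as out_file:
        merger.write(out_file)
    merger.close()
```

Called as merge_without_title_pages(['a.pdf', 'b.pdf'], 'merged.pdf'), this should select the same pages as passing pages=(1, numPages) for every PDF after the first.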
