python - Inserting a row into an Excel spreadsheet using openpyxl in Python

Question

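I'm looking for the best approach for inserting a row into a spreadsheet using openpyxl.

In effect, I have a spreadsheet (Excel 2007) with a header row followed by (at most) a few thousand rows of data. I want to insert the row as the first row of the actual data, so directly after the header. My understanding is that the append function is suited to adding content to the end of the file.

Reading the documentation for both openpyxl and xlrd (and xlwt), I can't find any clear-cut way of doing this, short of looping through the content manually and inserting it into a new sheet (after inserting the required row).

Given my so far limited experience with Python, I'm trying to understand whether this is really the best option (the most Pythonic!). Specifically, can I read and write rows with openpyxl, or do I have to access cells? Additionally, can I (over)write the same file (name)?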

score 10 · Accepted Answer

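I'm answering this with the code I'm currently using to achieve the desired result. Note that I insert the row manually at position 1, but that is easy enough to adjust for specific needs. You could also easily tweak this to insert more than one row, and simply populate the rest of the data starting at the relevant position.

Also note that, due to downstream dependencies, I manually specify the data from 'Sheet1'. The data is copied to a new sheet that is inserted at the beginning of the workbook, while the original worksheet is renamed to 'Sheet1.5'.

Edit: I also (later) added a change to format_code to fix an issue where the default copy operation here strips all formatting: new_cell.style.number_format.format_code = 'mm/dd/yyyy'. I couldn't find any documentation saying this was settable; it was the result of trial and error.

Lastly, don't forget that this example saves over the original. You can change the save path where applicable to avoid this.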

    import openpyxl

    wb = openpyxl.load_workbook(file)
    old_sheet = wb.get_sheet_by_name('Sheet1')
    old_sheet.title = 'Sheet1.5'
    max_row = old_sheet.get_highest_row()
    max_col = old_sheet.get_highest_column()
    wb.create_sheet(0, 'Sheet1')

    new_sheet = wb.get_sheet_by_name('Sheet1')

    # Do the header.
    for col_num in range(0, max_col):
        new_sheet.cell(row=0, column=col_num).value = old_sheet.cell(row=0, column=col_num).value

    # The row to be inserted. We're manually populating each cell.
    new_sheet.cell(row=1, column=0).value = 'DUMMY'
    new_sheet.cell(row=1, column=1).value = 'DUMMY'

    # Now do the rest of it. Note the row offset.
    for row_num in range(1, max_row):
        for col_num in range (0, max_col):
            new_sheet.cell(row = (row_num + 1), column = col_num).value = old_sheet.cell(row = row_num, column = col_num).value

    wb.save(file)
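
The code above targets an old openpyxl release (0-based cell indices, get_highest_row(), get_sheet_by_name()). As a rough sketch of the same idea on a recent release, where cell indices are 1-based and max_row/max_column replace the older getters, something like the following should work. The helper name copy_with_inserted_row and the sample data are illustrative assumptions, not part of the original answer.

```python
from openpyxl import Workbook


def copy_with_inserted_row(wb, sheet_name, new_values):
    """Rebuild `sheet_name` in a new sheet with `new_values` inserted as row 2.

    Hypothetical helper: the old sheet is kept and renamed to '<name>.5',
    mirroring the answer above.
    """
    old_sheet = wb[sheet_name]
    old_sheet.title = sheet_name + '.5'
    new_sheet = wb.create_sheet(title=sheet_name, index=0)

    max_row, max_col = old_sheet.max_row, old_sheet.max_column

    # The header row (row 1) is copied unchanged.
    for col in range(1, max_col + 1):
        new_sheet.cell(row=1, column=col).value = old_sheet.cell(row=1, column=col).value

    # The inserted row goes directly under the header.
    for col, value in enumerate(new_values, start=1):
        new_sheet.cell(row=2, column=col).value = value

    # Remaining rows are shifted down by one; number formats are carried over.
    for row in range(2, max_row + 1):
        for col in range(1, max_col + 1):
            source = old_sheet.cell(row=row, column=col)
            target = new_sheet.cell(row=row + 1, column=col)
            target.value = source.value
            target.number_format = source.number_format
    return new_sheet


# Small in-memory demonstration (the sample data is made up).
wb = Workbook()
ws = wb.active
ws.title = 'Sheet1'
ws.append(['name', 'qty'])
ws.append(['apple', 3])
ws.append(['pear', 5])

new_ws = copy_with_inserted_row(wb, 'Sheet1', ['DUMMY', 'DUMMY'])
rows = [list(r) for r in new_ws.iter_rows(values_only=True)]
```

Calling wb.save() with the path the workbook was loaded from overwrites the original file; passing a different path keeps the original intact.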

score 5 · Accepted Answer

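Openpyxl worksheets have limited functionality when it comes to row-level or column-level operations. The only Worksheet properties that relate to rows/columns are row_dimensions and column_dimensions, which store "RowDimensions" and "ColumnDimensions" objects for each row and column, respectively. These dictionaries are also used by functions such as get_highest_row() and get_highest_column().

Everything else operates at the cell level, with Cell objects being tracked in the dictionary _cells (and their styles tracked in the dictionary _styles). Most functions that look like they do something at the row or column level are actually operating on a range of cells (such as the aforementioned append()).

The simplest thing to do is what you suggested: create a new sheet, append the header row, append the new data rows, append the old data rows, delete the old sheet, then rename the new sheet to the old one's name. A problem you may run into with this method is the loss of row/column dimension attributes and cell styles, unless you specifically copy them too.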
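
The steps in the previous paragraph could be sketched roughly as follows on a recent openpyxl release. The helper name insert_after_header, the temporary sheet title 'tmp_copy', and the sample data are assumptions made for illustration. As noted above, this copies values only, so styles and row/column dimensions are not preserved.

```python
from openpyxl import Workbook


def insert_after_header(wb, sheet_name, new_row):
    """Replace `sheet_name` with a copy that has `new_row` right under the header."""
    old_sheet = wb[sheet_name]
    position = wb.index(old_sheet)
    # 'tmp_copy' is a placeholder title used until the old sheet is removed.
    new_sheet = wb.create_sheet(title='tmp_copy', index=position)

    rows = old_sheet.iter_rows(values_only=True)
    header = next(rows, None)
    if header is not None:
        new_sheet.append(header)   # header row
    new_sheet.append(new_row)      # new data row
    for row in rows:               # old data rows
        new_sheet.append(row)

    wb.remove(old_sheet)           # delete the old sheet
    new_sheet.title = sheet_name   # give the new sheet the old name
    return new_sheet


# In-memory demonstration with made-up data.
wb = Workbook()
ws = wb.active
ws.title = 'Sheet1'
ws.append(['name', 'qty'])
ws.append(['apple', 3])
ws.append(['pear', 5])

result = insert_after_header(wb, 'Sheet1', ['DUMMY', 0])
values = list(result.iter_rows(values_only=True))
```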

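Alternatively, you could write your own functions that insert rows or columns.

I had a large number of very simple worksheets that I needed to delete columns from. Since you asked for explicit examples, I'll provide the function I quickly threw together to do this.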

from openpyxl.cell import get_column_letter

def ws_delete_column(sheet, del_column):

    for row_num in range(1, sheet.get_highest_row()+1):
        for col_num in range(del_column, sheet.get_highest_column()+1):

            coordinate = '%s%s' % (get_column_letter(col_num),
                                   row_num)
            adj_coordinate = '%s%s' % (get_column_letter(col_num + 1),
                                       row_num)

            # Handle Styles.
            # This is important to do if you have any differing
            # 'types' of data being stored, as you may otherwise get
            # an output Worksheet that's got improperly formatted cells.
            # Or worse, an error gets thrown because you tried to copy
            # a string value into a cell that's styled as a date.

            if adj_coordinate in sheet._styles:
                sheet._styles[coordinate] = sheet._styles[adj_coordinate]
                sheet._styles.pop(adj_coordinate, None)
            else:
                sheet._styles.pop(coordinate, None)

            if adj_coordinate in sheet._cells:
                sheet._cells[coordinate] = sheet._cells[adj_coordinate]
                sheet._cells[coordinate].column = get_column_letter(col_num)
                sheet._cells[coordinate].row = row_num
                sheet._cells[coordinate].coordinate = coordinate

                sheet._cells.pop(adj_coordinate, None)
            else:
                sheet._cells.pop(coordinate, None)

        # sheet.garbage_collect()

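Pass it the worksheet you're working with and the number of the column you want deleted, and away it goes. I know this isn't exactly what you wanted, but I hope this information helps.

Edit: I noticed someone gave this another vote and figured I should update it. The coordinate system in Openpyxl has gone through some changes over the past couple of years, introducing a coordinate attribute on the items in _cells. This needs to be edited too, or the rows are left blank (instead of deleted) and Excel throws an error about problems with the file. This works with Openpyxl 2.2.3 (untested with later versions).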
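
Later openpyxl releases provide Worksheet.delete_cols(idx, amount=1) and Worksheet.delete_rows(idx, amount=1), so a hand-written helper like the one above is usually unnecessary there. A minimal sketch, assuming a recent openpyxl version and made-up sample data:

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
ws.append(['id', 'scratch', 'name'])
ws.append([1, 'x', 'apple'])
ws.append([2, 'y', 'pear'])

# Delete column 2 ('B'); the columns to its right shift left by one.
ws.delete_cols(2)

remaining = list(ws.iter_rows(values_only=True))
```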

score 3 · Accepted Answer

How to insert a row into an Excel spreadsheet using openpyxl in Python

The code below may help:

import openpyxl

file = "xyz.xlsx"
# Load the workbook from the file name provided by the user.
book = openpyxl.load_workbook(file)
# Open the sheet at index 0.
sheet = book.worksheets[0]

# insert_rows(idx, amount=1) inserts one or more rows before row idx;
# amount is the number of rows to add and is optional.
sheet.insert_rows(13)

For inserting columns, openpyxl has a similar function, i.e. insert_cols(idx, amount=1).
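
Applied to the original question, a minimal sketch could look like this: insert an empty row directly below the header, fill it, then insert a column. It assumes a recent openpyxl release and uses made-up sample data in an in-memory workbook. Note that, as far as the openpyxl documentation describes, these calls only move cells and do not adjust formulas, merged ranges, or charts.

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
ws.append(['name', 'qty'])
ws.append(['apple', 3])
ws.append(['pear', 5])

# Insert one empty row before row 2, i.e. directly below the header.
ws.insert_rows(2)
for col, value in enumerate(['DUMMY', 0], start=1):
    ws.cell(row=2, column=col, value=value)

# Insert one empty column before column 1; existing columns shift right.
ws.insert_cols(1)

values = list(ws.iter_rows(values_only=True))
```

To persist the change, call wb.save() with a file path; using the original path overwrites the file.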

score 1 · Accepted Answer

I took Dallas's solution and added support for merged cells:

    def insert_rows(self, row_idx, cnt, above=False, copy_style=True, fill_formulae=True):
        skip_list = []
        try:
            idx = row_idx - 1 if above else row_idx
            for (new, old) in zip(range(self.max_row+cnt,idx+cnt,-1),range(self.max_row,idx,-1)):
                for c_idx in range(1,self.max_column):
                  col = self.cell(row=1, column=c_idx).column #get_column_letter(c_idx)
                  print("Copying %s%d to %s%d."%(col,old,col,new))
                  source = self["%s%d"%(col,old)]
                  target = self["%s%d"%(col,new)]
                  if source.coordinate in skip_list:
                      continue

                  if source.coordinate in self.merged_cells:
                      # This is a merged cell
                      for _range in self.merged_cell_ranges:
                          merged_cells_list = [x for x in cells_from_range(_range)][0]
                          if source.coordinate in merged_cells_list:
                              skip_list = merged_cells_list
                              self.unmerge_cells(_range)
                              new_range = re.sub(str(old),str(new),_range)
                              self.merge_cells(new_range)
                              break

                  if source.data_type == Cell.TYPE_FORMULA:
                    target.value = re.sub(
                      "(\$?[A-Z]{1,3})%d"%(old),
                      lambda m: m.group(1) + str(new),
                      source.value
                    )
                  else:
                    target.value = source.value
                  target.number_format = source.number_format
                  target.font   = source.font.copy()
                  target.alignment = source.alignment.copy()
                  target.border = source.border.copy()
                  target.fill   = source.fill.copy()
            idx = idx + 1
            for row in range(idx,idx+cnt):
                for c_idx in range(1,self.max_column):
                  col = self.cell(row=1, column=c_idx).column #get_column_letter(c_idx)
                  #print("Clearing value in cell %s%d"%(col,row))
                  cell = self["%s%d"%(col,row)]
                  cell.value = None
                  source = self["%s%d"%(col,row-1)]
                  if copy_style:
                    cell.number_format = source.number_format
                    cell.font      = source.font.copy()
                    cell.alignment = source.alignment.copy()
                    cell.border    = source.border.copy()
                    cell.fill      = source.fill.copy()
                  if fill_formulae and source.data_type == Cell.TYPE_FORMULA:
                    #print("Copying formula from cell %s%d to %s%d"%(col,row-1,col,row))
                    cell.value = re.sub(
                      "(\$?[A-Z]{1,3})%d"%(row - 1),
                      lambda m: m.group(1) + str(row),
                      source.value
                    )
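
On recent openpyxl releases, the merged-cell part of this idea can be sketched on top of the built-in insert_rows(). The helper name insert_rows_keep_merges is hypothetical. This sketch does not copy styles or rewrite formulas, and a merged range that straddles the insertion row is re-merged unchanged rather than expanded.

```python
from openpyxl import Workbook
from openpyxl.worksheet.cell_range import CellRange


def insert_rows_keep_merges(ws, idx, amount=1):
    """Insert rows before `idx` and move merged ranges at or below `idx` down."""
    # Remember the merged ranges, then unmerge so only plain cells get moved.
    merged = [rng.coord for rng in ws.merged_cells.ranges]
    for coord in merged:
        ws.unmerge_cells(coord)

    ws.insert_rows(idx, amount)

    # Re-merge, shifting every range that starts at or below the insertion row.
    for coord in merged:
        rng = CellRange(coord)
        if rng.min_row >= idx:
            rng.shift(row_shift=amount)
        ws.merge_cells(rng.coord)


# In-memory demonstration with made-up data.
wb = Workbook()
ws = wb.active
ws['A1'] = 'header'
ws['A3'] = 'merged'
ws.merge_cells('A3:B3')

insert_rows_keep_merges(ws, 2)
merged_after = sorted(rng.coord for rng in ws.merged_cells.ranges)
```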

score 0 · Accepted Answer

This version, edited from Nick's solution, takes a starting row, the number of rows to insert, and a filename, and inserts the required number of blank rows.

#! python3
# Arguments: starting row, number of rows to insert, workbook filename.

import sys

import openpyxl

my_start = int(sys.argv[1])
my_rows = int(sys.argv[2])
str_wb = str(sys.argv[3])

wb = openpyxl.load_workbook(str_wb)
old_sheet = wb['Sheet']
mcol = old_sheet.max_column
mrow = old_sheet.max_row
old_sheet.title = 'Sheet1.5'
wb.create_sheet(index=0, title='Sheet')

new_sheet = wb['Sheet']

# Copy the rows above the insertion point unchanged.
for row_num in range(1, my_start):
    for col_num in range(1, mcol + 1):
        new_sheet.cell(row=row_num, column=col_num).value = old_sheet.cell(row=row_num, column=col_num).value

# Copy the remaining rows, shifted down by my_rows to leave the blank rows.
for row_num in range(my_start, mrow + 1):
    for col_num in range(1, mcol + 1):
        new_sheet.cell(row=row_num + my_rows, column=col_num).value = old_sheet.cell(row=row_num, column=col_num).value

wb.save(str_wb)

score 0 · Accepted Answer

This worked for me:

    openpyxl.worksheet.worksheet.Worksheet.insert_rows(wbs,idx=row,amount=2)

This inserts 2 rows before row==idx.

Reference: http://openpyxl.readthedocs.io/en/stable/api/openpyxl.worksheet.worksheet.html
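
The call above invokes the method through the class and passes the worksheet explicitly. Assuming wbs is a Worksheet instance, it should be equivalent to the more usual bound-method form wbs.insert_rows(idx=row, amount=2). A small sketch with an in-memory workbook and made-up data:

```python
from openpyxl import Workbook
from openpyxl.worksheet.worksheet import Worksheet

wb = Workbook()
ws = wb.active
for n in range(1, 4):
    ws.append([n])

# Unbound form, as in the answer above ...
Worksheet.insert_rows(ws, idx=2, amount=2)
# ... which does the same as the bound-method call:
# ws.insert_rows(idx=2, amount=2)

column_a = [row[0] for row in ws.iter_rows(values_only=True)]
```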

score -1 · Accepted Answer

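Unfortunately, there isn't really a better way than reading in the file and writing out a new Excel file (with your new row inserted at the top) using a library like xlwt. Excel doesn't work like a database that you can read from and append to. You have to read in the information, manipulate it in memory, and write it out to what is essentially a new file.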
