I'm trying to programmatically modify an excel file (xlsx). I can successfully decompress, modify the xml as needed, and re-compress. However, i'm getting a warning everytime i open excel, even though it does read the file. I believe the error is due to the compression method used. This is an example of the closest I can get:
Decompress
7z x original.xlsx -o./decomp_xlsx
..Do some stuff..
Compress
7z a -tzip new ./decomp_xlsx/*
Rename
mv ./new.zip ./new.xlsx
The Error i get is: Excel found unreadable content in 'new.xlsx'. Do you want to recover the contents of this workbook? If you trust the source of this workbook, click Yes.
From ECMA-376-2 Office Open Formats Part 2 (Packaging Conventions) The compression algorithm supported is DEFLATE, as described in the .ZIP specification. The package implementer shall not use any compression algorithm other than DEFLATE.
So, what switches do i need to use in 7z or other linux compatible program to get the job done without the warning? I've tried dropping the -tzip and using -m0=COPY, but excel can't even recover from that one.
So here's the result of the zip program and zipinfo. I'm guessing i'm not going to find a tool to do this, other than the one provided below, so i'm going to award that answer, and see if i can find someone to translate to python for testing. I"m not sure it handles the differences between the 4.5 / 3.0, then b- / tx or the defS / defF though.
$ zipinfo original.xlsx
Archive: original.xlsx
Zip file size: 228039 bytes, number of entries: 20
-rw---- 4.5 fat 1969 b- defS 80-Jan-01 00:00 [Content_Types].xml
-rw---- 4.5 fat 588 b- defS 80-Jan-01 00:00 _rels/.rels
-rw---- 4.5 fat 1408 b- defS 80-Jan-01 00:00 xl/_rels/workbook.xml.rels
-rw---- 4.5 fat 908 b- defS 80-Jan-01 00:00 xl/workbook.xml
-rw---- 4.5 fat 35772 b- defS 80-Jan-01 00:00 xl/worksheets/sheet4.xml
-rw---- 4.5 fat 322 b- defS 80-Jan-01 00:00 xl/worksheets/_rels/sheet4.xml.rels
-rw---- 4.5 fat 322 b- defS 80-Jan-01 00:00 xl/worksheets/_rels/sheet1.xml.rels
-rw---- 4.5 fat 230959 b- defS 80-Jan-01 00:00 xl/worksheets/sheet2.xml
-rw---- 4.5 fat 263127 b- defS 80-Jan-01 00:00 xl/worksheets/sheet3.xml
-rw---- 4.5 fat 295775 b- defS 80-Jan-01 00:00 xl/worksheets/sheet1.xml
-rw---- 4.5 fat 1947 b- defS 80-Jan-01 00:00 xl/sharedStrings.xml
-rw---- 4.5 fat 22698 b- defS 80-Jan-01 00:00 xl/styles.xml
-rw---- 4.5 fat 7079 b- defS 80-Jan-01 00:00 xl/theme/theme1.xml
-rw---- 4.5 fat 220 b- defS 80-Jan-01 00:00 xl/printerSettings/printerSettings2.bin
-rw---- 4.5 fat 464247 b- defS 80-Jan-01 00:00 xl/externalLinks/externalLink1.xml
-rw---- 4.5 fat 338 b- defS 80-Jan-01 00:00 xl/externalLinks/_rels/externalLink1.xml.rels
-rw---- 4.5 fat 220 b- defS 80-Jan-01 00:00 xl/printerSettings/printerSettings1.bin
-rw---- 4.5 fat 593 b- defS 80-Jan-01 00:00 docProps/core.xml
-rw---- 4.5 fat 62899 b- defS 80-Jan-01 00:00 xl/calcChain.xml
-rw---- 4.5 fat 1031 b- defS 80-Jan-01 00:00 docProps/app.xml
20 files, 1392422 bytes uncompressed, 223675 bytes compressed: 83.9%
$ zipinfo new.xlsx
Archive: new.xlsx
Zip file size: 233180 bytes, number of entries: 20
-rw-r--r-- 3.0 unx 1031 tx defF 80-Jan-01 00:00 docProps/app.xml
-rw-r--r-- 3.0 unx 593 tx defF 80-Jan-01 00:00 docProps/core.xml
-rw-r--r-- 3.0 unx 62899 tx defF 80-Jan-01 00:00 xl/calcChain.xml
-rw-r--r-- 3.0 unx 464247 tx defF 80-Jan-01 00:00 xl/externalLinks/externalLink1.xml
-rw-r--r-- 3.0 unx 338 tx defF 80-Jan-01 00:00 xl/externalLinks/_rels/externalLink1.xml.rels
-rw-r--r-- 3.0 unx 220 bx defF 80-Jan-01 00:00 xl/printerSettings/printerSettings1.bin
-rw-r--r-- 3.0 unx 220 bx defF 80-Jan-01 00:00 xl/printerSettings/printerSettings2.bin
-rw-r--r-- 3.0 unx 1947 tx defF 80-Jan-01 00:00 xl/sharedStrings.xml
-rw-r--r-- 3.0 unx 22698 tx defF 80-Jan-01 00:00 xl/styles.xml
-rw-r--r-- 3.0 unx 7079 tx defF 80-Jan-01 00:00 xl/theme/theme1.xml
-rw-r--r-- 3.0 unx 908 tx defF 80-Jan-01 00:00 xl/workbook.xml
-rw-r--r-- 3.0 unx 295775 tx defF 80-Jan-01 00:00 xl/worksheets/sheet1.xml
-rw-r--r-- 3.0 unx 230959 tx defF 80-Jan-01 00:00 xl/worksheets/sheet2.xml
-rw-r--r-- 3.0 unx 263127 tx defF 80-Jan-01 00:00 xl/worksheets/sheet3.xml
-rw-r--r-- 3.0 unx 35772 tx defF 80-Jan-01 00:00 xl/worksheets/sheet4.xml
-rw-r--r-- 3.0 unx 322 tx defF 80-Jan-01 00:00 xl/worksheets/_rels/sheet1.xml.rels
-rw-r--r-- 3.0 unx 322 tx defF 80-Jan-01 00:00 xl/worksheets/_rels/sheet4.xml.rels
-rw-r--r-- 3.0 unx 1408 tx defF 80-Jan-01 00:00 xl/_rels/workbook.xml.rels
-rw-r--r-- 3.0 unx 1969 tx defF 80-Jan-01 00:00 [Content_Types].xml
-rw-r--r-- 3.0 unx 588 tx defF 80-Jan-01 00:00 _rels/.rels
20 files, 1392422 bytes uncompressed, 229608 bytes compressed: 83.5%