最初の例で、私がイメージングしたようにメモリ消費が発生する理由を理解したいと思います。
s = StringIO()
s.write('abc'*10000000)
# Memory increases: OK
s.seek(0)
s.truncate()
# Memory decreases: OK
この2番目の例では、最後に同じものを使用していますが、切り捨てメソッドの後でメモリが減少していないようです。次のコードは、クラスのメソッドに含まれています。
from StringIO import StringIO
import requests
self.BUFFER_SIZE = 5 * 1024 * 2 ** 10 # 5 MB
self.MAX_MEMORY = 3 * 1024 * 2 ** 10 # 3 MB
r = requests.get(self.target, stream=True) # stream=True to not download the data at once
chunks = r.iter_content(chunk_size=self.BUFFER_SIZE)
buff = StringIO()
# Get the MAX_MEMORY first data
for chunk in chunks:
buff.write(chunk)
if buff.len > self.MAX_MEMORY:
break
# Left the loop because there is no more chunks: it stays in memory
if buff.len < self.MAX_MEMORY:
self.data = buff.getvalue()
# Otherwise, prepare a temp file and process the remaining chunks
else:
self.path = self._create_tmp_file_path()
with open(self.path, 'w') as f:
# Write the first downloaded data
buff.seek(0)
f.write(buffer.read())
# Free the buffer ?
buff.seek(0)
buff.truncate()
###################
# Memory does not decrease
# And another 5MB will be added to the memory hiting the next line which is normal because it is the size of a chunk
# But if the buffer was freed, the memory would stay steady: - 5 MB + 5 MB
# Write the remaining chunks directly into the file
for chunk in chunks:
f.write(chunk)
何かご意見は?ありがとう。