python - Thread memory usage keeps growing

Question

I am trying to visit web pages and check whether the website owner allows people to contact them.

This is the function that each thread calls:

def getpage():
    try:
        curl = urls.pop(0)
        print "working on " +str(curl)
        thepage1 = requests.get(curl).text
        global ctot
        if "Contact Us" in thepage1:
            slist.write("\n" +curl)
            ctot = ctot + 1
    except:
        pass
    finally:
        if len(urls)>0 :
            getpage()

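The problem is that the program's memory usage (pythonw.exe) keeps growing.

The thread calls the function again if the condition is true. I would expect the program's memory to stay at roughly the same level.

For a list of about 100,000 URLs, the program consumes well over 3 GB and is still growing.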

score 3 · Accepted Answer

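Your program is recursive for no reason. Recursion means that a new set of local variables is created for every page you fetch. Because none of the calls ever returns, those variables are still referenced by the local scope of each pending call, so garbage collection cannot free them and the program keeps eating memory forever.

Read up on the while statement. It is what you want to use here instead of recursion.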

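def getpage():
    global ctot
    # Loop instead of recursing: rebinding thepage1 on each pass
    # releases the previous page's text.
    while len(urls) > 0:
        try:
            curl = urls.pop(0)
            thepage1 = requests.get(curl).text
            if "Contact Us" in thepage1:
                slist.write("\n" + curl)
                ctot = ctot + 1
        except Exception:
            # Skip URLs that fail to download.
            pass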
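
To make the difference concrete, here is a minimal sketch (Python 3, not the asker's code) that counts how many "pages" are alive at the same time under each approach. `Page`, `fetch`, `crawl_recursive`, `crawl_loop` and `peak_live_pages` are hypothetical names invented for this illustration; `fetch` stands in for `requests.get(curl).text` so that nothing touches the network. It relies on CPython's reference counting to free objects as soon as they are unreferenced.

```python
import gc


class Page:
    """Stand-in for a downloaded page body; tracks how many instances are alive."""
    live = 0
    peak = 0

    def __init__(self):
        Page.live += 1
        Page.peak = max(Page.peak, Page.live)

    def __del__(self):
        Page.live -= 1


def fetch(url):
    # Hypothetical replacement for requests.get(url).text
    return Page()


def crawl_recursive(urls):
    if not urls:
        return
    page = fetch(urls.pop(0))  # stays referenced by this call's frame...
    crawl_recursive(urls)      # ...until every deeper call has returned


def crawl_loop(urls):
    while urls:
        page = fetch(urls.pop(0))  # rebinding drops the previous page


def peak_live_pages(crawl, n=200):
    """Run crawl over n fake URLs and return the peak number of live pages."""
    Page.live = Page.peak = 0
    crawl(["http://example.invalid/%d" % i for i in range(n)])
    gc.collect()
    return Page.peak
```

Calling `peak_live_pages(crawl_recursive)` reports one live page per URL, while `peak_live_pages(crawl_loop)` stays at a small constant regardless of `n`. The default `n` is kept below CPython's default recursion limit so that the recursive version can finish at all.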

score -1 · Accepted Answer

I looked at your code: http://pastebin.com/J4Rd3NhA

Use join once 100 threads are running.

for xd in range(0,noofthreads):
    t = threading.Thread(target=getpage)
    t.daemon = True
    t.start()
    tarray.append(t)
    # my additional code
    if len(tarray) >= 100:
        tarray[-100].join()

How does this work for you? If anything goes wrong, let me know.
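
As an alternative way to cap the number of threads, here is a minimal sketch (Python 3) of a fixed pool of workers that pull URLs from a thread-safe `queue.Queue` instead of popping from a shared list. It is not taken from either answer or from the pastebin code. `run_workers`, `check` and `num_threads` are hypothetical names chosen for this illustration, and `check(url)` is assumed to wrap the download and the "Contact Us" test.

```python
import queue
import threading


def run_workers(urls, check, num_threads=8):
    """Return the URLs for which check(url) is true, using a fixed pool of threads."""
    todo = queue.Queue()
    for url in urls:
        todo.put(url)

    hits = []
    hits_lock = threading.Lock()

    def worker():
        while True:
            try:
                url = todo.get_nowait()
            except queue.Empty:
                return  # no work left, so this thread finishes
            try:
                if check(url):
                    with hits_lock:
                        hits.append(url)
            except Exception:
                pass  # skip URLs that fail, as the original code does

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every worker before returning the results
    return hits
```

With a pool like this, the number of threads stays fixed no matter how long the URL list is, and each worker holds at most one page in memory at a time.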
