queue - Python マルチスレッドの時間依存タスク

Question

from random import randrange
from time import sleep
#import thread
from threading import Thread
from Queue import Queue
'''The idea is that there is a Seeker method that would search a location
for task, I have no idea how many task there will be, could be 1 could be 100.
Each task needs to be put into a thread, does its thing and finishes. I have
stripped down a lot of what this is really suppose to do just to focus on the
correct queuing and threading aspect of the program. The locking was just
me experimenting with locking'''
class Runner(Thread):
    current_queue_size = 0
    def __init__(self, queue):
        self.queue = queue
        data = queue.get()
        self.ID = data[0]
        self.timer = data[1]
        #self.lock = data[2]
        Runner.current_queue_size += 1
        Thread.__init__(self)

    def run(self):
        #self.lock.acquire()
        print "running {ID}, will run for: {t} seconds.".format(ID = self.ID,
                                                                t = self.timer)
        print "Queue size: {s}".format(s = Runner.current_queue_size)
        sleep(self.timer)        
        Runner.current_queue_size -= 1
        print "{ID} done, terminating, ran for {t}".format(ID = self.ID,
                                                                t = self.timer)
        print "Queue size: {s}".format(s = Runner.current_queue_size)
        #self.lock.release()
        sleep(1)
        self.queue.task_done()

def seeker():
    '''Gathers data that would need to enter its own thread.
    For now it just uses a count and random numbers to assign
    both a task ID and a time for each task'''
    queue = Queue()
    queue_item = {}
    count = 1
    #lock = thread.allocate_lock()
    while (count <= 40):
        random_number = randrange(1,350)
        queue_item[count] = random_number
        print "{count} dict ID {key}: value {val}".format(count = count, key = random_number,
                                                          val = random_number)
        count += 1

    for n in queue_item:
        #queue.put((n,queue_item[n],lock))
        queue.put((n,queue_item[n])) 
        '''I assume it is OK to put a tulip in and pull it out later'''
        worker = Runner(queue)
        worker.setDaemon(True)
        worker.start()
    worker.join() 
    '''Which one of these is necessary and why? The queue object
    joining or the thread object'''

    #queue.join()    

if __name__ == '__main__':
    seeker()

私はほとんどの質問をコード自体に入れましたが、要点 (Python2.7) について説明します。

後で自分のために大量のメモリリークを作成していないことを確認したいと思います。
Linuxbox のパテまたは VNC で 40 カウントで実行すると、常にすべての出力が得られるとは限らないことに気付きましたが、Windows で IDLE と Aptana を使用すると得られます。
はい、キューのポイントはスレッドをずらしてシステムのメモリをあふれさせないようにすることだと理解していますが、目の前のタスクは時間に敏感であるため、スレッドの数に関係なく検出されたらすぐに処理する必要がありますがある; Queue を使用すると、ガベージコレクターに推測させるのではなく、タスクがいつ終了したかを明確に指示できることがわかりました。
スレッドまたはキューオブジェクトで .join() を使用して回避できる理由はまだわかりません。
ヒント、コツ、一般的なヘルプ。
読んでくれてありがとう。

score 0 · Accepted Answer

あなたが正しく理解している場合は、実行する必要があるタスクがあるかどうかを確認するために何かを監視するスレッドが必要です。タスクが見つかった場合、それをシーカーおよび現在実行中の他のタスクと並行して実行する必要があります。

これが事実である場合、私はあなたがこれについて間違っている可能性があると思います。GIL が Python でどのように機能するかを見てみましょう。ここで本当に必要なのはマルチプロセッシングだと思います。

pydocs からこれを見てください:

CPython 実装の詳細: CPython では、グローバルインタープリターロックにより、一度に 1 つのスレッドしか Python コードを実行できません (パフォーマンス指向のライブラリによっては、この制限を克服できる場合もあります)。アプリケーションでマルチコアマシンの計算リソースをより有効に活用したい場合は、マルチプロセッシングを使用することをお勧めします。ただし、複数の I/O バウンドタスクを同時に実行する場合は、スレッド化が適切なモデルです。

queue - Python マルチスレッドの時間依存タスク

1 に答える 1

Related

Reference