I am a college freshman and Python newbie so bear with me. I am trying to parallelize some matrix operations. Here is my attempt using the ParallelPython module:
def testfunc(connectionMatrix, qCount, iCount, Htry, tStepCount):
test = connectionMatrix[0:qCount,0:iCount].dot(Htry[tStepCount-1, 0:iCount])
return test
f1 = job_server.submit(testfunc, (self.connectionMatrix, self.qCount, self.iCount, self.iHtry, self.tStepCount), modules = ("scipy.sparse",))
f2 = job_server.submit(testfunc, (self.connectionMatrix, self.qCount, self.iCount, self.didtHtry, self.tStepCount), modules = ("scipy.sparse",))
r1 = f1()
r2 = f2()
self.qHtry[self.tStepCount, 0:self.qCount] = self.qHtry[self.tStepCount-1, 0:self.qCount] + self.delT * r1 + 0.5 * (self.delT**2) * r2
It seems that there is a normal curve with size of the matrix on the x-axis and the percent speed-up on the y-axis. It seems to cap out at a 30% speed increase at 100x100 matrices. Smaller and larger matrices result in less increase and with small enough and large enough matrices, the serial code is faster. My guess is that the problem lies within the passing of the arguments. The overhead of copying the large matrix is actually taking longer than the job itself. What can I do to get around this? Is there some way to incorporate memory sharing and passing the matrix by reference? As you can see, none of the arguments are modified so it could be read-only access.
Thanks.