Here is my Python 3.6 code for simulating a 1D reflected random walk, using the joblib module to generate 400 realizations concurrently across K workers on a Linux cluster machine.
I note, however, that the runtime for K=3 is worse than for K=1, and that the runtime for K=5 is even worse!
Can anyone please see a way to improve my use of joblib?
from math import sqrt
import numpy as np
import joblib as jl
import os
K = int(os.environ['SLURM_CPUS_PER_TASK'])
def f(j):
    N = 10**6
    p = 1/3
    np.random.seed(None)
    X = 2*np.random.binomial(1,p,N)-1   # X = 1 with probability p
    s = 0                               # X =-1 with probability 1-p
    m = 0
    for t in range(0,N):
        s = max(0,s+X[t])
        m = max(m,s)
    return m
pool = jl.Parallel(n_jobs=K)
W = np.asarray(pool(jl.delayed(f)(j) for j in range(0,400)))
W
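
For reference, the recursion inside f can also be written without the explicit Python loop. The following is only a minimal sketch, assuming the standard identity s_t = S_t - min(0, min_{k<=t} S_k) for a walk reflected at 0, where S_t is the cumulative sum of the steps. The helper names max_reflected_walk_loop and max_reflected_walk_vectorized are hypothetical and not part of the code above, and nothing here has been timed on the cluster.

```python
import numpy as np

def max_reflected_walk_loop(X):
    """Same recursion as in f, applied to a given array of +/-1 steps."""
    s = 0
    m = 0
    for x in X:
        s = max(0, s + x)
        m = max(m, s)
    return int(m)

def max_reflected_walk_vectorized(X):
    """Hypothetical loop-free equivalent based on cumulative sums."""
    X = np.asarray(X)
    if X.size == 0:
        return 0
    S = np.cumsum(X)                                  # unreflected walk S_t
    floor = np.minimum(np.minimum.accumulate(S), 0)   # min(0, min_{k<=t} S_k)
    return int((S - floor).max())                     # max over t of the reflected walk

if __name__ == "__main__":
    rng = np.random.RandomState(0)   # fixed seed only so the comparison is repeatable
    X = 2 * rng.binomial(1, 1/3, 10**5) - 1
    print(max_reflected_walk_loop(X) == max_reflected_walk_vectorized(X))
```

If the two functions agree on test inputs like this, the vectorized form could replace the loop body of f, which might matter more for the runtime than the choice of K.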