python-2.7 - Multiprocessing with joblib.Parallel - Error when parallelizing a self-written algorithm

Question

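I have a class, ftrl_proximal(), that fits a model to data. It is a self-written classifier (not one from sklearn).
The algorithm works perfectly when I run it on a single CPU, but when I try to run it with multiprocessing (a kind of cross-validation), I get the error shown below. The code is as follows: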

from FTRL import ftrl_proximal 
from sklearn.externals import joblib
from sklearn.base import clone
import multiprocessing
from sklearn.cross_validation import StratifiedKFold

def ftrl_train(estimator, X, y, train_index, test_index):
    y_ftrl_pred_test = []
    y_ftrl_true = []

    # Split the data to train and test
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]  

    # Fit a model on sample of the data
    for idx, x in enumerate(X_train):
        # predict
        _p = estimator.predict(x.indices)
        # update
        estimator.update(x.indices, _p, y_train[idx])

    for idx, x in enumerate(X_test):
        _v = estimator.predict(x.indices)
        y_ftrl_pred_test.append(_v) # Predicted
        y_ftrl_true.append(y_test[idx]) # True

    return y_ftrl_pred_test, y_ftrl_true


cv_fold = 3 # determines the number of folds. 
skf = StratifiedKFold(y, n_folds=cv_fold, random_state=0)

ftrl = ftrl_proximal(alpha, beta, L1, L2, D, interaction) # initialize a learner

parallel = joblib.Parallel(n_jobs=num_cores, verbose=0, pre_dispatch='2*n_jobs')

# ftrl_train() accepts only (estimator, X, y, train_index, test_index),
# so no extra keyword arguments are passed to it here.
preds_blocks = parallel(joblib.delayed(ftrl_train)(clone(ftrl), X, y,
                                                   train_index, test_index)
                        for train_index, test_index in skf)

Error:

Traceback (most recent call last):
  File "/home/workspace/Predictor/modelSelection.py", line 61, in <module>
    class Main():
  File "/home/workspace/Predictor/modelSelection.py", line 199, in Main
    for train_index, test_index in skf)
  File "/home/anaconda2/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py", line 658, in __call__
    for function, args, kwargs in iterable:
  File "/home/anaconda2/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py", line 184, in next
    return next(self._it)
  File "/home/workspace/Predictor/modelSelection.py", line 199, in <genexpr>
    for train_index, test_index in skf)
NameError: global name '_ftrl' is not defined
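
One possible reading of the traceback, offered as an assumption rather than a confirmed diagnosis: the frames "in Main" and "in <genexpr>" suggest the Parallel call runs directly in the body of class Main, and the generator expression refers to a name (_ftrl in the real script) that is bound at class level. In Python, names bound in a class body are not visible from scopes nested inside it, and a generator expression is such a nested scope; only its outermost iterable is evaluated in the class scope. The sketch below reproduces that behavior without joblib. The names make_broken_class and run_folds are hypothetical, and the string "learner" stands in for the estimator.

```python
def make_broken_class():
    """Build a class whose body uses a class-level name inside a generator expression."""
    class Main(object):
        _ftrl = "learner"   # class-level name, standing in for the learner
        folds = [0, 1, 2]   # stands in for the cross-validation splits
        # `folds` (the outermost iterable) is evaluated in the class scope,
        # but `_ftrl` is looked up from inside the generator's own scope,
        # which cannot see class-level names, so this raises NameError.
        results = list((_ftrl, fold) for fold in folds)
    return Main


def run_folds(learner, folds):
    """Same loop inside a function: `learner` is an ordinary local name,
    so the nested scope can see it through the enclosing function."""
    return list((learner, fold) for fold in folds)


try:
    make_broken_class()
    error_message = None
except NameError as exc:
    error_message = str(exc)

fixed = run_folds("learner", [0, 1, 2])
```

If this reading applies, moving the cross-validation code out of the class body into a method or a module-level function (and passing the learner in as an argument) would let the generator expression resolve the name.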
