apache-spark - SGD with L2 regularization in mllib

Question

I am having difficulty reading the open-source mllib code for SGD with L2 regularization.

The code is

class SquaredL2Updater extends Updater {
  override def compute(
      weightsOld: Vector,
      gradient: Vector,
      stepSize: Double,
      iter: Int,
      regParam: Double): (Vector, Double) = {
    // add up both updates from the gradient of the loss (= step) as well as
    // the gradient of the regularizer (= regParam * weightsOld)
    // w' = w - thisIterStepSize * (gradient + regParam * w)
    // w' = (1 - thisIterStepSize * regParam) * w - thisIterStepSize * gradient
    val thisIterStepSize = stepSize / math.sqrt(iter)
    val brzWeights: BV[Double] = weightsOld.toBreeze.toDenseVector
    brzWeights :*= (1.0 - thisIterStepSize * regParam)
    brzAxpy(-thisIterStepSize, gradient.toBreeze, brzWeights)
    val norm = brzNorm(brzWeights, 2.0)

    (Vectors.fromBreeze(brzWeights), 0.5 * regParam * norm * norm)
  }
}

The part I am struggling with is

brzWeights :*= (1.0 - thisIterStepSize * regParam)

The Breeze library documents the :*= operator as follows.

/** Mutates this by element-wise multiplication of b into this. */
final def :*=[TT >: This, B](b: B)(implicit op: OpMulScalar.InPlaceImpl2[TT, B]): This = {
 op(repr, b)
 repr
}

It looks like a simple multiplication of a vector by a scalar.
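To make that reading concrete, here is a minimal sketch of in-place scalar scaling written with plain Scala arrays. The helper name `scaleInPlace` is hypothetical and is not part of Breeze; it only imitates what the doc comment describes (mutate the receiver, then return it). It should run as a Scala script or in the REPL.

```scala
// Hypothetical helper (not a Breeze API): multiply every element of `v` by `s` in place.
def scaleInPlace(v: Array[Double], s: Double): Array[Double] = {
  var i = 0
  while (i < v.length) {
    v(i) *= s
    i += 1
  }
  v // return the same, now mutated, array, much as :*= returns `repr`
}

val v = Array(1.0, -2.0, 4.0)
val returned = scaleInPlace(v, 0.5)
// `v` now holds 0.5, -1.0, 2.0 and `returned` is the same object as `v`
```

No new vector is allocated, which is presumably why the updater mutates brzWeights instead of building a scaled copy.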

The formula I found for the gradient in the case of L2 regularization is

gradient + regParam * w

How does the code express this gradient in the update? Can someone please help?

score 0 · Accepted Answer

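OK, I figured it out. The updater equation is

w' = w - thisIterStepSize * (gradient + regParam * w)

Rearranging the terms gives

w' = (1 - thisIterStepSize * regParam) * w - thisIterStepSize * gradient

The last term is just the gradient scaled by -thisIterStepSize, which is equivalent to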

brzAxpy(-thisIterStepSize, gradient.toBreeze, brzWeights)

Expanding that call gives

brzWeights = brzWeights + -thisIterStepSize * gradient.toBreeze

In the previous line, brzWeights :*= (1.0 - thisIterStepSize * regParam)

which means brzWeights = brzWeights * (1.0 - thisIterStepSize * regParam)

So, finally,

brzWeights = brzWeights * (1.0 - thisIterStepSize * regParam) + (-thisIterStepSize) * gradient.toBreeze
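The same steps can be written as math. The symbols here are introduced only for this illustration: $w$ is brzWeights before the update, $g$ is gradient, $\lambda$ is regParam, and $\alpha_t$ is thisIterStepSize, that is stepSize / sqrt(iter). This assumes the regularizer is $\tfrac{\lambda}{2}\lVert w \rVert^2$, which matches the value 0.5 * regParam * norm * norm that the code returns and whose gradient is $\lambda w$.

```latex
\begin{aligned}
w' &= w - \alpha_t \,(g + \lambda w) \\
   &= (1 - \alpha_t \lambda)\, w - \alpha_t\, g,
\qquad \alpha_t = \frac{\text{stepSize}}{\sqrt{\text{iter}}}
\end{aligned}
```

The factor $(1 - \alpha_t \lambda)$ corresponds to the :*= line, and the term $-\alpha_t g$ to the brzAxpy line.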

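Now the code and the equation match term by term. The lines that follow do not change the weights any further: they compute the L2 norm of the new weights and return 0.5 * regParam * norm * norm, the value of the regularization term, as the second element of the result.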
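As a quick numerical check of the argument above, the following sketch re-implements the update with plain Scala arrays and compares the two-step form (scale, then add the scaled gradient) with the direct form from the code comment. The names `l2Update` and `directUpdate` are hypothetical stand-ins, not MLlib APIs, and the input values are made up for illustration.

```scala
// Hypothetical stand-in for SquaredL2Updater.compute, using Array[Double]
// instead of MLlib/Breeze vectors so that it runs without Spark.
def l2Update(
    weightsOld: Array[Double],
    gradient: Array[Double],
    stepSize: Double,
    iter: Int,
    regParam: Double): (Array[Double], Double) = {
  val thisIterStepSize = stepSize / math.sqrt(iter.toDouble)
  // Step 1 (the :*= line): shrink a copy of the old weights.
  val weights = weightsOld.map(_ * (1.0 - thisIterStepSize * regParam))
  // Step 2 (the brzAxpy line): add -thisIterStepSize * gradient.
  for (i <- weights.indices) weights(i) -= thisIterStepSize * gradient(i)
  // Regularization value 0.5 * regParam * ||w'||^2, returned with the weights.
  val normSq = weights.map(x => x * x).sum
  (weights, 0.5 * regParam * normSq)
}

// Direct form: w' = w - step * (gradient + regParam * w)
def directUpdate(
    w: Array[Double],
    g: Array[Double],
    step: Double,
    regParam: Double): Array[Double] =
  w.indices.map(i => w(i) - step * (g(i) + regParam * w(i))).toArray

val w = Array(1.0, -2.0)   // illustrative weights
val g = Array(0.5, 0.25)   // illustrative loss gradient
val (twoStep, regVal) = l2Update(w, g, stepSize = 0.1, iter = 4, regParam = 0.2)
val oneStep = directUpdate(w, g, step = 0.1 / math.sqrt(4.0), regParam = 0.2)
// Largest element-wise difference between the two forms.
val maxDiff = twoStep.zip(oneStep).map { case (a, b) => math.abs(a - b) }.max
```

With these inputs both forms give weights of about 0.965 and -1.9925, so any difference comes only from floating-point rounding.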
