c++ - Sample from multivariate normal/Gaussian distribution in C++

Question

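I've been hunting for a convenient way to sample from a multivariate normal distribution. Does anyone know of a readily available code snippet that does this? For matrices/vectors, I'd prefer to use Boost or Eigen or another phenomenal library I'm not familiar with, but I could use GSL in a pinch. I'd also like it if the method accepted non-negative-definite covariance matrices rather than requiring positive-definite ones (as, for example, the Cholesky decomposition does). This exists in MATLAB, NumPy, and others, but I've had a hard time finding a ready-made C/C++ solution.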

If I have to implement it myself, I'll grumble, but that's fine. If I do, Wikipedia makes it sound like I should:

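Generate n zero-mean, unit-variance, independent normal samples (Boost will do this)
Find the eigendecomposition of the covariance matrix
Scale each of the n samples by the square root of the corresponding eigenvalue
Rotate the vector of samples by pre-multiplying the scaled vector by the matrix of orthonormal eigenvectors found by the decomposition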
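The four steps above can be sketched with the standard library alone. This is only an illustration under stated assumptions: it is restricted to a 2x2 symmetric covariance [[a, b], [b, c]] so that the eigendecomposition has a closed form, and the names `Transform2`, `eigenTransform2x2`, and `sample2` are hypothetical helpers, not part of Boost or Eigen. A zero eigenvalue is allowed, so a non-negative-definite covariance works.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <random>

// Hypothetical helper type: a 2x2 transform T with T * T^T == covariance.
struct Transform2 { double t[2][2]; };

// Steps 2-4 for the symmetric matrix [[a, b], [b, c]]:
// T = V * diag(sqrt(lambda)), where V holds the orthonormal eigenvectors.
Transform2 eigenTransform2x2(double a, double b, double c)
{
  const double half = 0.5 * (a + c);
  const double r    = std::hypot(0.5 * (a - c), b);
  const double lam1 = half + r;   // larger eigenvalue
  const double lam2 = half - r;   // smaller eigenvalue
  assert(lam2 > -1e-12 && "covariance must be non-negative definite");
  // Rotation angle of the eigenvector that belongs to lam1.
  const double theta = 0.5 * std::atan2(2.0 * b, a - c);
  const double cs = std::cos(theta), sn = std::sin(theta);
  // Clamp round-off negatives before taking the square root.
  const double s1 = std::sqrt(std::max(lam1, 0.0));
  const double s2 = std::sqrt(std::max(lam2, 0.0));
  // Columns: eigenvectors scaled by the square root of their eigenvalue.
  return {{{cs * s1, -sn * s2}, {sn * s1, cs * s2}}};
}

// Step 1 plus the final shift: draw z ~ N(0, I) and return mean + T * z.
std::array<double, 2> sample2(const Transform2& T,
                              const std::array<double, 2>& mean,
                              std::mt19937& rng)
{
  std::normal_distribution<double> norm(0.0, 1.0);
  const double z1 = norm(rng);
  const double z2 = norm(rng);
  return {mean[0] + T.t[0][0] * z1 + T.t[0][1] * z2,
          mean[1] + T.t[1][0] * z1 + T.t[1][1] * z2};
}
```

For a general n x n covariance the closed form no longer applies, and a library eigensolver would take the place of `eigenTransform2x2`.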

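I would like this to run quickly. Does anyone have an intuition for when it would be worthwhile to check whether the covariance matrix is positive definite and, if so, use Cholesky instead?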
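One way to frame the question: attempting the Cholesky factorization is itself a positive-definiteness test, because the factorization breaks down at the first non-positive pivot. The sketch below assumes a symmetric matrix stored row-major in a `std::vector<double>`; `tryCholesky` is a hypothetical helper name, and the strict `d > 0` test is a simplification (a production check would probably use a tolerance scaled to the matrix).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: tries to factor the symmetric n x n matrix `a`
// (row-major) as L * L^T. Returns true and fills the lower-triangular `l`
// only if every pivot is strictly positive, i.e. `a` is numerically
// positive definite. On failure the contents of `l` are unspecified.
bool tryCholesky(const std::vector<double>& a, std::size_t n,
                 std::vector<double>& l)
{
  assert(a.size() == n * n);
  l.assign(n * n, 0.0);
  for (std::size_t j = 0; j < n; ++j) {
    // Pivot: diagonal entry minus the squares already placed in row j.
    double d = a[j * n + j];
    for (std::size_t k = 0; k < j; ++k) d -= l[j * n + k] * l[j * n + k];
    if (!(d > 0.0)) return false;  // zero or negative pivot (also rejects NaN)
    l[j * n + j] = std::sqrt(d);
    // Fill the rest of column j below the diagonal.
    for (std::size_t i = j + 1; i < n; ++i) {
      double s = a[i * n + j];
      for (std::size_t k = 0; k < j; ++k) s -= l[i * n + k] * l[j * n + k];
      l[i * n + j] = s / l[j * n + j];
    }
  }
  return true;
}
```

If `tryCholesky` returns false, the caller can fall back to the eigendecomposition route; if it returns true, `l` can be used directly as the transform applied to the standard normal samples.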

score 24 · Accepted Answer

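Since this question has gathered a lot of views, I decided to post the code for the final answer that I found, in part by posting to the Eigen forums. The code uses Boost for the univariate normal and Eigen for matrix handling. It feels rather unorthodox, since it involves using the "internal" namespace, but it works. I'm open to improving it if someone suggests a way.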

#include <iostream>   // std::cout, used in main()
#include <Eigen/Dense>
#include <boost/random/mersenne_twister.hpp>
#include <boost/random/normal_distribution.hpp>

/*
  We need a functor that can pretend it's const,
  but to be a good random number generator 
  it needs mutable state.
*/
namespace Eigen {
namespace internal {
template<typename Scalar> 
struct scalar_normal_dist_op 
{
  static boost::mt19937 rng;    // The uniform pseudo-random algorithm
  mutable boost::normal_distribution<Scalar> norm;  // The gaussian combinator

  EIGEN_EMPTY_STRUCT_CTOR(scalar_normal_dist_op)

  template<typename Index>
  inline const Scalar operator() (Index, Index = 0) const { return norm(rng); }
};

template<typename Scalar> boost::mt19937 scalar_normal_dist_op<Scalar>::rng;

template<typename Scalar>
struct functor_traits<scalar_normal_dist_op<Scalar> >
{ enum { Cost = 50 * NumTraits<Scalar>::MulCost, PacketAccess = false, IsRepeatable = false }; };
} // end namespace internal
} // end namespace Eigen

/*
  Draw nn samples from a size-dimensional normal distribution
  with a specified mean and covariance
*/
int main()
{
  int size = 2; // Dimensionality (rows)
  int nn=5;     // How many samples (columns) to draw
  Eigen::internal::scalar_normal_dist_op<double> randN; // Gaussian functor
  Eigen::internal::scalar_normal_dist_op<double>::rng.seed(1); // Seed the rng

  // Define mean and covariance of the distribution
  Eigen::VectorXd mean(size);       
  Eigen::MatrixXd covar(size,size);

  mean  <<  0,  0;
  covar <<  1, .5,
           .5,  1;

  Eigen::MatrixXd normTransform(size,size);

  Eigen::LLT<Eigen::MatrixXd> cholSolver(covar);

  // We can only use the cholesky decomposition if 
  // the covariance matrix is symmetric, pos-definite.
  // But a covariance matrix might be pos-semi-definite.
  // In that case, we'll go to an EigenSolver
  if (cholSolver.info()==Eigen::Success) {
    // Use cholesky solver
    normTransform = cholSolver.matrixL();
  } else {
    // Use eigen solver
    Eigen::SelfAdjointEigenSolver<Eigen::MatrixXd> eigenSolver(covar);
    normTransform = eigenSolver.eigenvectors() 
                   * eigenSolver.eigenvalues().cwiseSqrt().asDiagonal();
  }

  Eigen::MatrixXd samples = (normTransform 
                           * Eigen::MatrixXd::NullaryExpr(size,nn,randN)).colwise() 
                           + mean;

  std::cout << "Mean\n" << mean << std::endl;
  std::cout << "Covar\n" << covar << std::endl;
  std::cout << "Samples\n" << samples << std::endl;
}

score 0 · Accepted Answer

What about doing an SVD and then checking whether the matrix is PD? Note that this does not require you to compute the Cholesky factorization. I think SVD is slower than Cholesky, though both must be cubic in the number of flops.
