c++ - 3D reconstruction from 2 images without info about the camera

Question

I'm new to this field and I'm trying to model a simple scene in 3D from 2D images, but I have no information about the cameras. I know that there are 3 options:

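I have two images and I know the model of my camera (intrinsics), loaded from an XML file for instance: loadXMLFromFile() => stereoRectify() => reprojectImageTo3D()
I don't have the intrinsics, but I can calibrate my camera => stereoCalibrate() => stereoRectify() => reprojectImageTo3D()
I can't calibrate the camera (this is my case, because I don't have the camera that took the 2 images). Then I need to find pairs of keypoints in both images, with SURF or SIFT for instance (any blob detector can be used), compute descriptors for these keypoints, match the keypoints of the right and left images according to their descriptors, and then find the fundamental matrix from the matches. The processing is much harder and would look like this: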
1. detect keypoints (SURF, SIFT) =>
2. extract descriptors (SURF, SIFT) =>
3. compare and match descriptors (BruteForce, Flann based approaches) =>
4. find the fundamental matrix (findFundamentalMat()) from these pairs =>
5. stereoRectifyUncalibrated() =>
6. reprojectImageTo3D()

I'm using the last approach and my questions are:

1) Is it right?

2) If it is, I have a doubt about the last step, stereoRectifyUncalibrated() => reprojectImageTo3D(). The signature of the reprojectImageTo3D() function is:

void reprojectImageTo3D(InputArray disparity, OutputArray _3dImage, InputArray Q, bool handleMissingValues=false, int ddepth=-1 )

cv::reprojectImageTo3D(imgDisparity8U, xyz, Q, true) (in my code)

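Parameters:

disparity – Input single-channel 8-bit unsigned, 16-bit signed, 32-bit signed or 32-bit floating-point disparity image.
_3dImage – Output 3-channel floating-point image of the same size as disparity. Each element of _3dImage(x,y) contains the 3D coordinates of the point (x,y) computed from the disparity map.
Q – 4x4 perspective transformation matrix that can be obtained with stereoRectify().
handleMissingValues – Indicates whether the function should handle missing values (i.e. points where the disparity was not computed). If handleMissingValues=true, then pixels with the minimal disparity that corresponds to the outliers (see StereoBM::operator()) are transformed to 3D points with a very large Z value (currently set to 10000).
ddepth – The optional output array depth. If it is -1, the output image will have CV_32F depth. ddepth can also be set to CV_16S, CV_32S or CV_32F.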

How can I get the Q matrix? Is it possible to obtain the Q matrix from F, H1 and H2, or in another way?

3) Is there another way to obtain the xyz coordinates without calibrating the cameras?

My code is:

#include <opencv2/core/core.hpp>
#include <opencv2/calib3d/calib3d.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/contrib/contrib.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <stdio.h>
#include <iostream>
#include <vector>
#include <conio.h>
#include <opencv/cv.h>
#include <opencv/cxcore.h>
#include <opencv/cvaux.h>


using namespace cv;
using namespace std;

int main(int argc, char *argv[]){

    // Read the images
    Mat imgLeft = imread( argv[1], CV_LOAD_IMAGE_GRAYSCALE );
    Mat imgRight = imread( argv[2], CV_LOAD_IMAGE_GRAYSCALE );

    // check
    if (!imgLeft.data || !imgRight.data)
            return 0;

    // 1] find pair keypoints on both images (SURF, SIFT):::::::::::::::::::::::::::::

    // vector of keypoints
    std::vector<cv::KeyPoint> keypointsLeft;
    std::vector<cv::KeyPoint> keypointsRight;

    // Construct the SIFT feature detector object
    cv::SiftFeatureDetector sift(
            0.01, // feature threshold
            10); // threshold to reduce
                // sensitivity to lines

    // Detection of the SIFT features
    sift.detect(imgLeft,keypointsLeft);
    sift.detect(imgRight,keypointsRight);

    std::cout << "Number of SIFT points (1): " << keypointsLeft.size() << std::endl;
    std::cout << "Number of SIFT points (2): " << keypointsRight.size() << std::endl;

    // 2] compute descriptors of these keypoints (SURF,SIFT) ::::::::::::::::::::::::::

    // Construction of the SURF descriptor extractor
    cv::SurfDescriptorExtractor surfDesc;

    // Extraction of the SURF descriptors
    cv::Mat descriptorsLeft, descriptorsRight;
    surfDesc.compute(imgLeft,keypointsLeft,descriptorsLeft);
    surfDesc.compute(imgRight,keypointsRight,descriptorsRight);

    std::cout << "descriptor matrix size: " << descriptorsLeft.rows << " by " << descriptorsLeft.cols << std::endl;

    // 3] matching keypoints from image right and image left according to their descriptors (BruteForce, Flann based approaches)

    // Construction of the matcher
    cv::BruteForceMatcher<cv::L2<float> > matcher;

    // Match the two image descriptors
    std::vector<cv::DMatch> matches;
    matcher.match(descriptorsLeft,descriptorsRight, matches);

    std::cout << "Number of matched points: " << matches.size() << std::endl;


    // 4] find the fundamental mat ::::::::::::::::::::::::::::::::::::::::::::::::::::

    // Convert 1 vector of keypoints into
    // 2 vectors of Point2f for compute F matrix
    // with cv::findFundamentalMat() function
    std::vector<int> pointIndexesLeft;
    std::vector<int> pointIndexesRight;
    for (std::vector<cv::DMatch>::const_iterator it= matches.begin(); it!= matches.end(); ++it) {

         // Get the indexes of the selected matched keypoints
         pointIndexesLeft.push_back(it->queryIdx);
         pointIndexesRight.push_back(it->trainIdx);
    }

    // Convert keypoints into Point2f
    std::vector<cv::Point2f> selPointsLeft, selPointsRight;
    cv::KeyPoint::convert(keypointsLeft,selPointsLeft,pointIndexesLeft);
    cv::KeyPoint::convert(keypointsRight,selPointsRight,pointIndexesRight);

    /* check by drawing the points
    std::vector<cv::Point2f>::const_iterator it= selPointsLeft.begin();
    while (it!=selPointsLeft.end()) {

            // draw a circle at each corner location
            cv::circle(imgLeft,*it,3,cv::Scalar(255,255,255),2);
            ++it;
    }

    it= selPointsRight.begin();
    while (it!=selPointsRight.end()) {

            // draw a circle at each corner location
            cv::circle(imgRight,*it,3,cv::Scalar(255,255,255),2);
            ++it;
    } */

    // Compute F matrix from n>=8 matches
    cv::Mat fundemental= cv::findFundamentalMat(
            cv::Mat(selPointsLeft), // points in first image
            cv::Mat(selPointsRight), // points in second image
            CV_FM_RANSAC);       // RANSAC method

    std::cout << "F-Matrix size= " << fundemental.rows << "," << fundemental.cols << std::endl;

    /* draw the left points corresponding epipolar lines in right image
    std::vector<cv::Vec3f> linesLeft;
    cv::computeCorrespondEpilines(
            cv::Mat(selPointsLeft), // image points
            1,                      // in image 1 (can also be 2)
            fundemental,            // F matrix
            linesLeft);             // vector of epipolar lines

    // for all epipolar lines
    for (vector<cv::Vec3f>::const_iterator it= linesLeft.begin(); it!=linesLeft.end(); ++it) {

        // draw the epipolar line between first and last column
        cv::line(imgRight,cv::Point(0,-(*it)[2]/(*it)[1]),cv::Point(imgRight.cols,-((*it)[2]+(*it)[0]*imgRight.cols)/(*it)[1]),cv::Scalar(255,255,255));
    }

    // draw the right points corresponding epipolar lines in left image
    std::vector<cv::Vec3f> linesRight;
    cv::computeCorrespondEpilines(cv::Mat(selPointsRight),2,fundemental,linesRight);
    for (vector<cv::Vec3f>::const_iterator it= linesRight.begin(); it!=linesRight.end(); ++it) {

        // draw the epipolar line between first and last column
        cv::line(imgLeft,cv::Point(0,-(*it)[2]/(*it)[1]), cv::Point(imgLeft.cols,-((*it)[2]+(*it)[0]*imgLeft.cols)/(*it)[1]), cv::Scalar(255,255,255));
    }

    // Display the images with points and epipolar lines
    cv::namedWindow("Right Image Epilines");
    cv::imshow("Right Image Epilines",imgRight);
    cv::namedWindow("Left Image Epilines");
    cv::imshow("Left Image Epilines",imgLeft);
    */

    // 5] stereoRectifyUncalibrated()::::::::::::::::::::::::::::::::::::::::::::::::::

    //H1, H2 – The output rectification homography matrices (3x3) for the first and for the second images.
    // The point order must match the one used to compute F above (left first, right second).
    cv::Mat H1, H2;
    cv::stereoRectifyUncalibrated(selPointsLeft, selPointsRight, fundemental, imgLeft.size(), H1, H2);


    // create the image in which we will save our disparities
    Mat imgDisparity16S = Mat( imgLeft.rows, imgLeft.cols, CV_16S );
    Mat imgDisparity8U = Mat( imgLeft.rows, imgLeft.cols, CV_8UC1 );

    // Call the constructor for StereoBM
    int ndisparities = 16*5;      // < Range of disparity >
    int SADWindowSize = 5;        // < Size of the block window > Must be odd. Is the 
                                  // size of averaging window used to match pixel  
                                  // blocks(larger values mean better robustness to
                                  // noise, but yield blurry disparity maps)

    StereoBM sbm( StereoBM::BASIC_PRESET,
        ndisparities,
        SADWindowSize );

    // Calculate the disparity image
    sbm( imgLeft, imgRight, imgDisparity16S, CV_16S );

    // Check its extreme values
    double minVal; double maxVal;

    minMaxLoc( imgDisparity16S, &minVal, &maxVal );

    printf("Min disp: %f Max value: %f \n", minVal, maxVal);

    // Display it as a CV_8UC1 image
    imgDisparity16S.convertTo( imgDisparity8U, CV_8UC1, 255/(maxVal - minVal));

    namedWindow( "windowDisparity", CV_WINDOW_NORMAL );
    imshow( "windowDisparity", imgDisparity8U );


    // 6] reprojectImageTo3D() :::::::::::::::::::::::::::::::::::::::::::::::::::::

    //Mat xyz;
    //cv::reprojectImageTo3D(imgDisparity8U, xyz, Q, true);

    //How can I get the Q matrix? Is it possible to obtain the Q matrix with
    //F, H1 and H2 or in another way?
    //Is there another way to obtain the xyz coordinates?

    cv::waitKey();
    return 0;
}
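
One detail worth noting about the disparity values: when StereoBM writes a CV_16S map, the values are fixed-point disparities scaled by 16, and the 8-bit image built for display is rescaled for viewing only. A minimal sketch in plain C++ of recovering pixel disparities from the raw 16-bit values follows. The function name toPixelDisparity is hypothetical, and a std::vector stands in for the image buffer.

```cpp
#include <cstdint>
#include <vector>

// Converts raw 16-bit fixed-point disparities (scaled by 16, i.e. 4 fractional
// bits) into disparities in pixels. Values are converted as-is; deciding which
// ones are invalid is left to the caller.
std::vector<float> toPixelDisparity(const std::vector<std::int16_t>& raw) {
    std::vector<float> pixels;
    pixels.reserve(raw.size());
    for (std::int16_t value : raw)
        pixels.push_back(static_cast<float>(value) / 16.0f);
    return pixels;
}
```

In OpenCV the same scaling would typically be done with convertTo() and a factor of 1/16 into a CV_32F image, and that floating-point map, not the display image, is what a reprojection step should receive.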

score 5 · Accepted Answer

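stereoRectifyUncalibrated() simply calculates a planar perspective transformation, not a rectification transformation in object space. To extract the Q matrix, this planar transformation has to be converted into an object-space transformation, and I think some of the camera calibration parameters (such as the camera intrinsics) are required for that. There may be ongoing research on this subject.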

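You may have to add some steps for estimating the camera intrinsics and extracting the relative orientation of the cameras to make your flow work correctly. I think camera calibration parameters are vital for extracting the proper 3D structure of the scene if no active lighting method is used.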

Also, bundle block adjustment based solutions are needed to refine all the estimated values into more accurate ones.

score 2 · Accepted Answer

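The procedure looks OK to me.
As far as I know, in image-based 3D modelling the cameras are either explicitly calibrated or implicitly calibrated. You don't want to calibrate the camera explicitly, but you will make use of that information anyway. Matching corresponding point pairs is definitely a heavily used approach.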

score 1 · Accepted Answer

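I think you need to use stereoRectify() to rectify your images and get Q. This function needs two parameters (R and T), the rotation and translation between the two cameras. So you can compute these parameters using solvePnP. That function needs the real 3D coordinates of a certain object and the corresponding 2D points in the images.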
