iphone - How do I get the Y component from the CMSampleBuffer that results from an AVCaptureSession?

Question

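Hey there, I am trying to access raw data from the iPhone camera using AVCaptureSession. I am following the guide provided by Apple (link here).

The raw data from the sample buffer is in YUV format (am I correct here about the raw video frame format?). How can I get the data for the Y component directly out of the raw data stored in the sample buffer?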

score 22 · Accepted Answer

When setting up the AVCaptureVideoDataOutput that returns the raw camera frames, you can set the format of the frames using code like the following:

[videoOutput setVideoSettings:[NSDictionary dictionaryWithObject:[NSNumber numberWithInt:kCVPixelFormatType_32BGRA] forKey:(id)kCVPixelBufferPixelFormatTypeKey]];

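In this case a BGRA pixel format is specified (I used this to match the color format of an OpenGL ES texture). Each pixel in that format has one byte each for blue, green, red, and alpha, in that order. Going with this makes it easy to pull out color components, but you sacrifice a little performance because the frame has to be converted from the camera-native YUV colorspace.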
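As a rough illustration of the BGRA layout described above, here is a minimal sketch in plain C (which also compiles as Objective-C). The `BGRAPixel` type and the `bgra_pixel_at` helper are hypothetical names of mine, and the sketch assumes `base` and `bytesPerRow` come from CVPixelBufferGetBaseAddress and CVPixelBufferGetBytesPerRow on a locked buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* One BGRA pixel, in the byte order described above. */
typedef struct {
    uint8_t blue, green, red, alpha;
} BGRAPixel;

/* Hypothetical helper: reads the pixel at (x, y) from a BGRA frame.
   Assumes `base` points at the first byte of the locked frame and
   `bytesPerRow` is the row stride reported for that frame. */
static BGRAPixel bgra_pixel_at(const uint8_t *base, size_t bytesPerRow,
                               size_t x, size_t y)
{
    const uint8_t *p = base + y * bytesPerRow + x * 4; /* 4 bytes per pixel */
    BGRAPixel px = { p[0], p[1], p[2], p[3] };
    return px;
}
```

Indexing by `bytesPerRow` rather than `width * 4` keeps the lookup correct if rows carry padding.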

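The other supported colorspaces are kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange and kCVPixelFormatType_420YpCbCr8BiPlanarFullRange on newer devices, and kCVPixelFormatType_422YpCbCr8 on the iPhone 3G. The VideoRange or FullRange suffix simply indicates whether the bytes are returned in the range 16 - 235 for Y and 16 - 240 for UV, or in the full 0 - 255 range for each component.

I believe the default colorspace used by an AVCaptureVideoDataOutput instance is the YUV 4:2:0 planar colorspace (except on the iPhone 3G, where it's YUV 4:2:2 interleaved). This means that there are two planes of image data contained within the video frame, with the Y plane coming first. For every pixel in your resulting image, there is one byte for the Y value at that pixel.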
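If you receive video-range luma but want values that span 0 - 255, one common approach is a linear rescale. The sketch below is plain C and an assumption on my part rather than anything the frameworks do for you; `luma_video_to_full` is a hypothetical helper name.

```c
#include <stdint.h>

/* Hypothetical helper: maps a video-range luma byte (16..235) onto 0..255.
   Values outside the nominal range are clamped before scaling. */
static uint8_t luma_video_to_full(uint8_t y)
{
    if (y < 16)  y = 16;
    if (y > 235) y = 235;
    /* 219 = 235 - 16 is the width of the video range; +109 rounds to nearest. */
    return (uint8_t)(((y - 16) * 255 + 109) / 219);
}
```

With the FullRange format no such step is needed, because Y already covers 0 - 255.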
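To make the two-plane layout concrete, here is a small plain C sketch of how many bytes each plane holds. It assumes even frame dimensions and no row padding (real buffers may pad rows, so treat the numbers as a lower bound); `PlaneSizes` and `biplanar_420_sizes` are hypothetical names.

```c
#include <stddef.h>

typedef struct {
    size_t lumaBytes;   /* plane 0: one Y byte per pixel */
    size_t chromaBytes; /* plane 1: one Cb/Cr byte pair per 2x2 block of pixels */
} PlaneSizes;

/* Hypothetical helper: unpadded plane sizes for a bi-planar 4:2:0 frame. */
static PlaneSizes biplanar_420_sizes(size_t width, size_t height)
{
    PlaneSizes s;
    s.lumaBytes = width * height;
    s.chromaBytes = (width / 2) * (height / 2) * 2;
    return s;
}
```

For a 640x480 frame this gives 307200 luma bytes and 153600 chroma bytes, so the Y plane is about two thirds of the pixel data.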

You would get at this raw Y data by implementing something like this in your delegate callback:

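- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
{
    CVImageBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    CVPixelBufferLockBaseAddress(pixelBuffer, 0);

    // For planar formats (the YUV 4:2:0 default), plane 0 is the Y plane.
    // CVPixelBufferGetBaseAddress only points at pixel data for non-planar formats such as BGRA.
    unsigned char *rawPixelBase = CVPixelBufferIsPlanar(pixelBuffer)
        ? (unsigned char *)CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0)
        : (unsigned char *)CVPixelBufferGetBaseAddress(pixelBuffer);

    // Do something with the raw pixels here

    CVPixelBufferUnlockBaseAddress(pixelBuffer, 0);
}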

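You could then figure out the location in the frame data for each X, Y coordinate on the image and pull out the byte that corresponds to the Y component at that coordinate.

Apple's FindMyiCone sample from WWDC 2010 (accessible along with the videos) shows how to process raw BGRA data from each frame. I also created a sample application that performs color-based object tracking using the live video from the iPhone's camera; you can download the code for it here. Both show how to process raw pixel data, but neither of them works in the YUV colorspace.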
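The lookup described above can be sketched in plain C as follows. This assumes `yPlane` is the base address of plane 0 of a locked bi-planar buffer and `bytesPerRow` is the stride of that plane; `luma_at` is a hypothetical helper name.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns the Y byte for the pixel at column x, row y.
   The Y plane stores one byte per pixel, row after row, with each row
   occupying bytesPerRow bytes (which may exceed the image width). */
static uint8_t luma_at(const uint8_t *yPlane, size_t bytesPerRow,
                       size_t x, size_t y)
{
    return yPlane[y * bytesPerRow + x];
}
```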

score 19 · Accepted Answer

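In addition to Brad's answer and your own code, you want to consider the following:

Since your image has two separate planes, the function CVPixelBufferGetBaseAddress will not return the base address of the plane but rather the base address of an additional data structure. It's probably due to the current implementation that you get an address close enough to the first plane that you can see the image. But it's the reason the image is shifted and has garbage at the top left. The correct way to get the first plane is: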

unsigned char *rowBase = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0);

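A row in the image might be longer than the width of the image (due to rounding). That's why there are separate functions for getting the width and the number of bytes per row. You don't have this problem at the moment, but that might change with the next version of iOS. So your code should be:

// The CoreVideo getters return size_t, so avoid truncating them to int.
size_t bufferHeight = CVPixelBufferGetHeight(pixelBuffer);
size_t bufferWidth = CVPixelBufferGetWidth(pixelBuffer);
size_t bytesPerRow = CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 0);
size_t size = bufferHeight * bytesPerRow;

unsigned char *pixel = (unsigned char *)malloc(size);

// Plane 0 is the Y plane; copy it including any per-row padding.
unsigned char *rowBase = (unsigned char *)CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0);
memcpy(pixel, rowBase, size);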

Please also note that your code will fail miserably on an iPhone 3G.
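If you would rather end up with a tightly packed luma image (exactly `width` bytes per row), the rows can be copied one at a time so that any padding is dropped. This is a plain C sketch under the assumption that `yPlane` and `bytesPerRow` describe plane 0 of a locked buffer; `copy_luma_tight` is a hypothetical helper name.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies the Y plane into a new buffer of exactly
   width * height bytes, skipping the padding at the end of each source row.
   Returns NULL if allocation fails; the caller frees the result. */
static uint8_t *copy_luma_tight(const uint8_t *yPlane, size_t width,
                                size_t height, size_t bytesPerRow)
{
    uint8_t *out = malloc(width * height);
    if (out == NULL) {
        return NULL;
    }
    for (size_t row = 0; row < height; row++) {
        memcpy(out + row * width, yPlane + row * bytesPerRow, width);
    }
    return out;
}
```

The single `memcpy` of `bufferHeight * bytesPerRow` bytes shown earlier is faster, but then every consumer of the copy has to keep using `bytesPerRow` as the stride.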

score 8 · Accepted Answer

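If you only need the luminance channel, I recommend against using the BGRA format, as it comes with a conversion overhead. Apple suggests using BGRA if you're doing rendering, but you don't need it for extracting the luminance information. As Brad already mentioned, the most efficient format is the camera-native YUV format.

However, extracting the right bytes from the sample buffer is a bit tricky, especially on the iPhone 3G with its interleaved YUV 422 format. So here is my code, which works fine with the iPhone 3G, 3GS, iPod Touch 4 and iPhone 4S.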

#pragma mark -
#pragma mark AVCaptureVideoDataOutputSampleBufferDelegate Methods
#if !(TARGET_IPHONE_SIMULATOR)
- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
{
    // get image buffer reference
    CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);

    // extract needed information from image buffer
    CVPixelBufferLockBaseAddress(imageBuffer, 0);
    size_t bufferSize = CVPixelBufferGetDataSize(imageBuffer);
    uint8_t *baseAddress = (uint8_t *)CVPixelBufferGetBaseAddress(imageBuffer);
    CGSize resolution = CGSizeMake(CVPixelBufferGetWidth(imageBuffer), CVPixelBufferGetHeight(imageBuffer));

    // variables for grayscaleBuffer
    uint8_t *grayscaleBuffer = NULL;
    size_t grayscaleBufferSize = 0;

    // the pixelFormat differs between iPhone 3G and later models
    OSType pixelFormat = CVPixelBufferGetPixelFormatType(imageBuffer);

    if (pixelFormat == '2vuy') { // iPhone 3G
        // kCVPixelFormatType_422YpCbCr8     = '2vuy',
        /* Component Y'CbCr 8-bit 4:2:2, ordered Cb Y'0 Cr Y'1 */

        // copy every second byte (luminance bytes from the Y channel) to a new buffer
        grayscaleBufferSize = bufferSize/2;
        grayscaleBuffer = malloc(grayscaleBufferSize);
        if (grayscaleBuffer == NULL) {
            NSLog(@"ERROR in %@:%@:%d: couldn't allocate memory for grayscaleBuffer!", NSStringFromClass([self class]), NSStringFromSelector(_cmd), __LINE__);
            // the method returns void, and the buffer must be unlocked on every exit path
            CVPixelBufferUnlockBaseAddress(imageBuffer, 0);
            return;
        }
        memset(grayscaleBuffer, 0, grayscaleBufferSize);
        const uint8_t *sourceMemPos = baseAddress + 1; // first Y' byte follows the first Cb byte
        uint8_t *destinationMemPos = grayscaleBuffer;
        uint8_t *destinationEnd = grayscaleBuffer + grayscaleBufferSize;
        // strictly less than: destinationEnd is one past the last valid byte
        while (destinationMemPos < destinationEnd) {
            *destinationMemPos = *sourceMemPos;
            destinationMemPos += 1;
            sourceMemPos += 2;
        }
    }

    if (pixelFormat == '420v' || pixelFormat == '420f') {
        // kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange = '420v',
        // kCVPixelFormatType_420YpCbCr8BiPlanarFullRange  = '420f',
        // Bi-Planar Component Y'CbCr 8-bit 4:2:0, video-range (luma=[16,235] chroma=[16,240]).
        // Bi-Planar Component Y'CbCr 8-bit 4:2:0, full-range (luma=[0,255] chroma=[1,255]).
        // baseAddress points to a big-endian CVPlanarPixelBufferInfo_YCbCrBiPlanar struct
        // i.e.: the Y channel is plane 0 and makes up roughly the first two thirds of the pixel data
        size_t bytesPerRow = CVPixelBufferGetBytesPerRowOfPlane(imageBuffer, 0);
        baseAddress = (uint8_t *)CVPixelBufferGetBaseAddressOfPlane(imageBuffer, 0);
        grayscaleBufferSize = (size_t)resolution.height * bytesPerRow;
        grayscaleBuffer = malloc(grayscaleBufferSize);
        if (grayscaleBuffer == NULL) {
            NSLog(@"ERROR in %@:%@:%d: couldn't allocate memory for grayscaleBuffer!", NSStringFromClass([self class]), NSStringFromSelector(_cmd), __LINE__);
            CVPixelBufferUnlockBaseAddress(imageBuffer, 0);
            return;
        }
        memset(grayscaleBuffer, 0, grayscaleBufferSize);
        memcpy(grayscaleBuffer, baseAddress, grayscaleBufferSize);
    }

    // do whatever you want with the grayscale buffer
    ...

    // clean-up
    CVPixelBufferUnlockBaseAddress(imageBuffer, 0);
    free(grayscaleBuffer);
}
#endif
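The '2vuy' branch above treats the frame as one contiguous run of bytes. If rows in that format ever carry padding, a row-aware variant may be safer. Here is a plain C sketch of that idea, not the answer's own method; `extract_luma_2vuy` is a hypothetical helper, and it assumes `src` and `bytesPerRow` come from CVPixelBufferGetBaseAddress and CVPixelBufferGetBytesPerRow and that `dst` has room for `width * height` bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: pulls the Y' bytes out of an interleaved 4:2:2
   frame ordered Cb Y'0 Cr Y'1. Each pixel takes two bytes and its luma
   is the second of them, so pixel x of a row sits at offset 2 * x + 1. */
static void extract_luma_2vuy(const uint8_t *src, size_t width, size_t height,
                              size_t bytesPerRow, uint8_t *dst)
{
    for (size_t row = 0; row < height; row++) {
        const uint8_t *line = src + row * bytesPerRow;
        for (size_t x = 0; x < width; x++) {
            dst[row * width + x] = line[2 * x + 1];
        }
    }
}
```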

score 4 · Accepted Answer

This is merely the culmination of everyone else's hard work, above and on other threads, converted to Swift 3 for anyone who finds it useful.

func captureOutput(_ captureOutput: AVCaptureOutput!, didOutputSampleBuffer sampleBuffer: CMSampleBuffer!, from connection: AVCaptureConnection!) {
    if let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) {
        CVPixelBufferLockBaseAddress(pixelBuffer, CVPixelBufferLockFlags.readOnly)

        let pixelFormatType = CVPixelBufferGetPixelFormatType(pixelBuffer)
        if pixelFormatType == kCVPixelFormatType_420YpCbCr8BiPlanarFullRange
           || pixelFormatType == kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange {

            let bufferHeight = CVPixelBufferGetHeight(pixelBuffer)
            let bufferWidth = CVPixelBufferGetWidth(pixelBuffer)

            let lumaBytesPerRow = CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 0)
            let size = bufferHeight * lumaBytesPerRow
            let releaseDataCallback: CGDataProviderReleaseDataCallback = { (info: UnsafeMutableRawPointer?, data: UnsafeRawPointer, size: Int) -> () in
                // https://developer.apple.com/reference/coregraphics/cgdataproviderreleasedatacallback
                // N.B. 'CGDataProviderRelease' is unavailable: Core Foundation objects are automatically memory managed
                return
            }

            // Plane 0 is the luma (Y) plane. Unwrap its address instead of bit-casting an optional pointer.
            if let lumaBaseAddress = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0),
               let dataProvider = CGDataProvider(dataInfo: nil, data: lumaBaseAddress, size: size, releaseData: releaseDataCallback) {
                let colorSpace = CGColorSpaceCreateDeviceGray()
                // 8 bits per pixel of gray leaves no room for an alpha or skipped byte
                let bitmapInfo = CGBitmapInfo(rawValue: CGImageAlphaInfo.none.rawValue)

                // CGImage's initializer is failable, so unwrap the result rather than forcing it
                if let cgImage = CGImage(width: bufferWidth, height: bufferHeight, bitsPerComponent: 8, bitsPerPixel: 8, bytesPerRow: lumaBytesPerRow, space: colorSpace, bitmapInfo: bitmapInfo, provider: dataProvider, decode: nil, shouldInterpolate: false, intent: CGColorRenderingIntent.defaultIntent) {
                    let greyscaleImage = UIImage(cgImage: cgImage)
                    // do what you want with the greyscale image.
                    // N.B. the image wraps the pixel buffer's memory without copying it,
                    // so use it (or copy the bytes) before the buffer is unlocked below.
                }
            }
        }

        CVPixelBufferUnlockBaseAddress(pixelBuffer, CVPixelBufferLockFlags.readOnly)
    }
}
