ios - Encode PCM (CMSampleBufferRef) to AAC on iOS - how to set frequency and bitrate?

Question

I want to encode PCM (CMSampleBufferRef(s) arriving live from AVCaptureAudioDataOutputSampleBufferDelegate) into AAC.

When the first CMSampleBufferRef arrives, I set up both (in/out) AudioStreamBasicDescription(s), the "out" one according to the documentation:

AudioStreamBasicDescription inAudioStreamBasicDescription = *CMAudioFormatDescriptionGetStreamBasicDescription((CMAudioFormatDescriptionRef)CMSampleBufferGetFormatDescription(sampleBuffer));

AudioStreamBasicDescription outAudioStreamBasicDescription = {0}; // Always initialize the fields of a new audio stream basic description structure to zero, as shown here: ...
outAudioStreamBasicDescription.mSampleRate = 44100; // The number of frames per second of the data in the stream, when the stream is played at normal speed. For compressed formats, this field indicates the number of frames per second of equivalent decompressed data. The mSampleRate field must be nonzero, except when this structure is used in a listing of supported formats (see “kAudioStreamAnyRate”).
outAudioStreamBasicDescription.mFormatID = kAudioFormatMPEG4AAC; // kAudioFormatMPEG4AAC_HE does not work. Can't find `AudioClassDescription`. `mFormatFlags` is set to 0.
outAudioStreamBasicDescription.mFormatFlags = kMPEG4Object_AAC_SSR; // Format-specific flags to specify details of the format. Set to 0 to indicate no format flags. See “Audio Data Format Identifiers” for the flags that apply to each format.
outAudioStreamBasicDescription.mBytesPerPacket = 0; // The number of bytes in a packet of audio data. To indicate variable packet size, set this field to 0. For a format that uses variable packet size, specify the size of each packet using an AudioStreamPacketDescription structure.
outAudioStreamBasicDescription.mFramesPerPacket = 1024; // The number of frames in a packet of audio data. For uncompressed audio, the value is 1. For variable bit-rate formats, the value is a larger fixed number, such as 1024 for AAC. For formats with a variable number of frames per packet, such as Ogg Vorbis, set this field to 0.
outAudioStreamBasicDescription.mBytesPerFrame = 0; // The number of bytes from the start of one frame to the start of the next frame in an audio buffer. Set this field to 0 for compressed formats. ...
outAudioStreamBasicDescription.mChannelsPerFrame = 1; // The number of channels in each frame of audio data. This value must be nonzero.
outAudioStreamBasicDescription.mBitsPerChannel = 0; // ... Set this field to 0 for compressed formats.
outAudioStreamBasicDescription.mReserved = 0; // Pads the structure out to force an even 8-byte alignment. Must be set to 0.

and the AudioConverterRef:

AudioClassDescription audioClassDescription;
memset(&audioClassDescription, 0, sizeof(audioClassDescription));
UInt32 size;
NSAssert(AudioFormatGetPropertyInfo(kAudioFormatProperty_Encoders, sizeof(outAudioStreamBasicDescription.mFormatID), &outAudioStreamBasicDescription.mFormatID, &size) == noErr, nil);
uint32_t count = size / sizeof(AudioClassDescription);
AudioClassDescription descriptions[count];
NSAssert(AudioFormatGetProperty(kAudioFormatProperty_Encoders, sizeof(outAudioStreamBasicDescription.mFormatID), &outAudioStreamBasicDescription.mFormatID, &size, descriptions) == noErr, nil);
for (uint32_t i = 0; i < count; i++) {

    if ((outAudioStreamBasicDescription.mFormatID == descriptions[i].mSubType) && (kAppleSoftwareAudioCodecManufacturer == descriptions[i].mManufacturer)) {

        memcpy(&audioClassDescription, &descriptions[i], sizeof(audioClassDescription));

    }
}
NSAssert(audioClassDescription.mSubType == outAudioStreamBasicDescription.mFormatID && audioClassDescription.mManufacturer == kAppleSoftwareAudioCodecManufacturer, nil);
AudioConverterRef audioConverter;
memset(&audioConverter, 0, sizeof(audioConverter));
NSAssert(AudioConverterNewSpecific(&inAudioStreamBasicDescription, &outAudioStreamBasicDescription, 1, &audioClassDescription, &audioConverter) == 0, nil);

Then I convert every CMSampleBufferRef into raw AAC data:

AudioBufferList inAaudioBufferList;
CMBlockBufferRef blockBuffer;
CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(sampleBuffer, NULL, &inAaudioBufferList, sizeof(inAaudioBufferList), NULL, NULL, 0, &blockBuffer);
NSAssert(inAaudioBufferList.mNumberBuffers == 1, nil);

uint32_t bufferSize = inAaudioBufferList.mBuffers[0].mDataByteSize;
uint8_t *buffer = (uint8_t *)malloc(bufferSize);
memset(buffer, 0, bufferSize);
AudioBufferList outAudioBufferList;
outAudioBufferList.mNumberBuffers = 1;
outAudioBufferList.mBuffers[0].mNumberChannels = inAaudioBufferList.mBuffers[0].mNumberChannels;
outAudioBufferList.mBuffers[0].mDataByteSize = bufferSize;
outAudioBufferList.mBuffers[0].mData = buffer;

UInt32 ioOutputDataPacketSize = 1;

NSAssert(AudioConverterFillComplexBuffer(audioConverter, inInputDataProc, &inAaudioBufferList, &ioOutputDataPacketSize, &outAudioBufferList, NULL) == 0, nil);

NSData *data = [NSData dataWithBytes:outAudioBufferList.mBuffers[0].mData length:outAudioBufferList.mBuffers[0].mDataByteSize];

free(buffer);
CFRelease(blockBuffer);

The inInputDataProc() implementation:

OSStatus inInputDataProc(AudioConverterRef inAudioConverter, UInt32 *ioNumberDataPackets, AudioBufferList *ioData, AudioStreamPacketDescription **outDataPacketDescription, void *inUserData)
{
    AudioBufferList audioBufferList = *(AudioBufferList *)inUserData;

    ioData->mBuffers[0].mData = audioBufferList.mBuffers[0].mData;
    ioData->mBuffers[0].mDataByteSize = audioBufferList.mBuffers[0].mDataByteSize;

    return  noErr;
}

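Now data holds the raw AAC, which I wrap into an ADTS frame with a proper ADTS header. A sequence of these ADTS frames is a playable AAC document.

But I don't understand this code as well as I'd like to. In general, I don't understand audio... I wrote it over quite a long time by following blogs, forums and docs, and it works now, but I don't know why, or how to change some of the parameters. So here are my questions: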
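The question does not show the ADTS wrapping step, so here is a minimal sketch of it in plain C. The helper names `adts_frequency_index` and `adts_write_header` are hypothetical, and the sketch assumes a 7-byte header without CRC and that the caller knows which audio object type the encoder actually produced (2 = AAC LC is a common case, but that is an assumption here).

```c
#include <stddef.h>
#include <stdint.h>

/* Sampling frequency index table used by ADTS (MPEG-4 audio). */
static const uint32_t kAdtsSampleRates[] = {
    96000, 88200, 64000, 48000, 44100, 32000, 24000,
    22050, 16000, 12000, 11025, 8000, 7350
};

/* Returns the ADTS sampling frequency index for a rate, or -1 if the
   rate is not in the table. (Hypothetical helper.) */
static int adts_frequency_index(uint32_t sampleRate)
{
    size_t count = sizeof(kAdtsSampleRates) / sizeof(kAdtsSampleRates[0]);
    for (size_t i = 0; i < count; i++) {
        if (kAdtsSampleRates[i] == sampleRate) {
            return (int)i;
        }
    }
    return -1;
}

/* Fills a 7-byte ADTS header (no CRC) for one raw AAC packet.
   audioObjectType: 1 = Main, 2 = LC, 3 = SSR, 4 = LTP.
   channelConfig:   ADTS channel configuration, 1 for mono, 2 for stereo.
   payloadLength:   size of the raw AAC packet in bytes.
   Returns 0 on success, -1 on invalid arguments. (Hypothetical helper.) */
static int adts_write_header(uint8_t header[7], int audioObjectType,
                             uint32_t sampleRate, int channelConfig,
                             size_t payloadLength)
{
    int freqIndex = adts_frequency_index(sampleRate);
    size_t frameLength = payloadLength + 7; /* the header is counted too */

    if (freqIndex < 0 || channelConfig < 1 || channelConfig > 7 ||
        audioObjectType < 1 || audioObjectType > 4 ||
        frameLength > 0x1FFF) {             /* length field is 13 bits */
        return -1;
    }

    header[0] = 0xFF;                       /* syncword, high 8 bits */
    header[1] = 0xF1;                       /* syncword low 4 bits, MPEG-4, layer 0, no CRC */
    header[2] = (uint8_t)(((audioObjectType - 1) << 6) |
                          (freqIndex << 2) |
                          (channelConfig >> 2));
    header[3] = (uint8_t)(((channelConfig & 3) << 6) | (frameLength >> 11));
    header[4] = (uint8_t)((frameLength >> 3) & 0xFF);
    header[5] = (uint8_t)(((frameLength & 7) << 5) | 0x1F); /* buffer fullness, high bits */
    header[6] = 0xFC;                       /* buffer fullness low bits, one AAC frame */
    return 0;
}
```

For each encoded packet, the 7 header bytes would be written first, followed by the packet bytes. The sample rate passed in has to be the rate of the audio that was actually encoded, which is what question 2 below runs into.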

1. I need to use this converter while the HW encoder is occupied (by AVAssetWriter). That is why I create a SW converter via AudioConverterNewSpecific() and not AudioConverterNew(). But now setting outAudioStreamBasicDescription.mFormatID = kAudioFormatMPEG4AAC_HE; does not work: no AudioClassDescription can be found, even when mFormatFlags is set to 0. What do I lose by using kAudioFormatMPEG4AAC (kMPEG4Object_AAC_SSR) instead of kAudioFormatMPEG4AAC_HE? What should I use for a live stream, kMPEG4Object_AAC_SSR or kMPEG4Object_AAC_Main?
2. How do I change the sample rate properly? If I set outAudioStreamBasicDescription.mSampleRate to 22050 or 8000, for example, audio playback is slowed down. I set the sampling frequency index in the ADTS header to the same frequency as outAudioStreamBasicDescription.mSampleRate.
3. How do I change the bitrate? ffmpeg -i shows this info for the produced aac: Stream #0:0: Audio: aac, 44100 Hz, mono, fltp, 64 kb/s. How do I change it to 16 kbps, for example? The bitrate drops as I lower the frequency, but I believe that is not the only way? And lowering the frequency damages playback anyway, as mentioned in 2.
4. How do I calculate the size of buffer? For now I set it to uint32_t bufferSize = inAaudioBufferList.mBuffers[0].mDataByteSize; because I believe the compressed format won't be larger than the uncompressed one... but isn't that unnecessarily large?
5. How do I set ioOutputDataPacketSize properly? If I read the documentation correctly, I should set it as UInt32 ioOutputDataPacketSize = bufferSize / outAudioStreamBasicDescription.mBytesPerPacket; but mBytesPerPacket is 0. If I set it to 0, AudioConverterFillComplexBuffer() returns an error. If I set it to 1, it works, but I don't know why...
6. inInputDataProc() has 3 "out" parameters. I set only ioData. Should I also set ioNumberDataPackets and outDataPacketDescription? Why and how?
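To put rough numbers on questions 3 to 5, the arithmetic can be sketched in plain C. Both function names are hypothetical, and the example assumes 16-bit mono PCM input, which the question does not state. The point it illustrates is that ioOutputDataPacketSize counts packets in the output format rather than bytes, and that one AAC packet here covers 1024 frames.

```c
#include <stdint.h>

/* PCM bytes consumed to produce one AAC packet of framesPerPacket frames.
   (Hypothetical helper; assumes interleaved integer PCM.) */
static uint32_t pcm_bytes_per_aac_packet(uint32_t framesPerPacket,
                                         uint32_t channels,
                                         uint32_t bitsPerChannel)
{
    return framesPerPacket * channels * (bitsPerChannel / 8);
}

/* Average size in bytes of one encoded packet at a given bit rate.
   (Hypothetical helper; individual packets vary around this value.) */
static double average_aac_packet_bytes(uint32_t bitRate,
                                       uint32_t framesPerPacket,
                                       double sampleRate)
{
    double packetsPerSecond = sampleRate / framesPerPacket;
    return (bitRate / 8.0) / packetsPerSecond;
}
```

Under these assumptions, pcm_bytes_per_aac_packet(1024, 1, 16) gives 2048 bytes of input per packet, while average_aac_packet_bytes(64000, 1024, 44100.0) comes to roughly 186 bytes of output. That suggests an output buffer sized like the input is far larger than one packet needs, although the average alone is not a safe upper bound for a single packet.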

score 0 · Accepted Answer

You may need to change the sample rate of the raw audio data with a resampling audio unit before feeding the audio into the AAC converter. Otherwise there will be a mismatch between the AAC header and the audio data.
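The effect of such a mismatch can be illustrated with a little arithmetic in C. This is only a sketch with hypothetical function names, and it assumes the captured samples reach the file unresampled while the header declares a lower rate.

```c
#include <stdint.h>

/* Seconds of audio covered by a number of PCM frames at a sample rate.
   (Hypothetical helper.) */
static double duration_seconds(uint32_t frames, double sampleRate)
{
    return frames / sampleRate;
}

/* Speed factor heard when audio sampled at actualRate is played back as
   if it were sampled at declaredRate: 1.0 is normal speed, values below
   1.0 mean slowed-down playback. (Hypothetical helper.) */
static double playback_speed(double actualRate, double declaredRate)
{
    return declaredRate / actualRate;
}
```

For example, playback_speed(44100.0, 22050.0) is 0.5: the 1024 frames of one packet cover about 23 ms at 44100 Hz, but a player that trusts a 22050 Hz header spreads them over about 46 ms, so the audio sounds slowed down.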
