ios - Recording, modifying, and playing back audio on iOS

Question

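EDIT: In the end I did exactly what I describe below: AVRecorder for recording the speech, and openAL for the pitch shift and playback. It worked out quite well.

I have a question about recording, modifying, and playing back audio. I asked a similar question before (Record, modify pitch and play back audio in real time on iOS), but I now have more information and would appreciate some further advice.

First, this is what I am trying to do (on a thread separate from the main thread):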

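monitor the iPhone mic
check for sound above a certain volume
if it is above the threshold, start recording, e.g. a person starts talking
continue recording until the volume drops below the threshold, e.g. the person stops talking
modify the pitch of the recorded sound
play back the sound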
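
The start/stop steps in the list above can be sketched as a small state machine. This is only a sketch under stated assumptions: the `LevelGate` name, the use of two separate thresholds, and the idea that a level in dBFS is delivered once per audio block are all hypothetical, and none of it comes from AVRecorder or openAL.

```cpp
// Hypothetical sketch: decides when to start and stop recording
// from a stream of level measurements (in dBFS).
class LevelGate
{
public:
    enum class Event { None, StartRecording, StopRecording };

    // openDb:  level at or above which recording starts
    // closeDb: level below which recording stops (assumed to be lower
    //          than openDb so the gate does not chatter around one value)
    LevelGate(float openDb, float closeDb)
    : _openDb(openDb)
    , _closeDb(closeDb)
    , _recording(false)
    {
    }

    // feed one level measurement; returns what the caller should do next
    Event update(float levelDb)
    {
        if (!_recording && levelDb >= _openDb)
        {
            _recording = true;
            return Event::StartRecording;
        }

        if (_recording && levelDb < _closeDb)
        {
            _recording = false;
            return Event::StopRecording;
        }

        return Event::None;
    }

    bool isRecording() const { return _recording; }

private:
    float _openDb;
    float _closeDb;
    bool _recording;
};
```

With a single threshold, pass the same value for both arguments; the gap between the two is just one way to keep short dips in speech from ending the recording.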

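I was thinking of using AVRecorder to monitor and record the sound; there is a good tutorial here: http://mobileorchard.com/tutorial-detecting-when-a-user-blows-into-the-mic/

And I was thinking of using openAL to modify the pitch of the recorded audio.

So my question is: is my thinking in the list above correct, am I missing something, or is there a better/easier way to do it? Can I avoid mixing audio libraries and just use AVFoundation to change the pitch too?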

score 2 · Accepted Answer

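You can use either AVRecorder or something lower level, such as the realtime IO audio unit.

The concept of "volume" is pretty vague. I recommend looking at the difference between calculating peak and RMS values, and understanding how to integrate an RMS value over a given time (say 300ms, which is what a VU meter uses).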
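
To make the peak versus RMS distinction concrete, here is a minimal sketch that computes both over one block of samples. The function names are made up for illustration, and the samples are assumed to be floats in the range -1.0 to 1.0.

```cpp
#include <cmath>
#include <cstddef>

// Peak: the largest absolute sample value in the block.
float peakLevel(const float *audio, std::size_t samples)
{
    float peak = 0.0f;

    for (std::size_t i = 0; i < samples; ++i)
    {
        float a = std::fabs(audio[i]);

        if (a > peak)
            peak = a;
    }

    return peak;
}

// RMS: the square root of the mean of the squared samples.
float rmsLevel(const float *audio, std::size_t samples)
{
    if (samples == 0)
        return 0.0f;

    double sum = 0.0;

    for (std::size_t i = 0; i < samples; ++i)
        sum += (double) audio[i] * audio[i];

    return (float) std::sqrt(sum / samples);
}
```

A single loud click in an otherwise quiet block gives a high peak but a low RMS, which is why RMS tends to track perceived loudness of speech better than peak does.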

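Basically, you sum the squares of all the sample values. Divide by the number of samples, take the square root, and convert to dBFS with 20 * log10f(sqrt(sum / num_samples)); you can do the same thing in one step without the sqrt using 10 * log10f(sum / num_samples).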
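
Since 20 * log10(sqrt(x)) equals 10 * log10(x), the RMS level in dBFS can be computed with or without the square root. The sketch below shows both forms side by side; the function names are hypothetical, and the caller is assumed to pass at least one sample.

```cpp
#include <cmath>
#include <cstddef>

// Mean of the squared samples (assumes samples > 0).
static double meanSquare(const float *audio, std::size_t samples)
{
    double sum = 0.0;

    for (std::size_t i = 0; i < samples; ++i)
        sum += (double) audio[i] * audio[i];

    return sum / samples;
}

// RMS level in dBFS, taking the square root first.
float rmsDbWithSqrt(const float *audio, std::size_t samples)
{
    return 20.0f * log10f((float) std::sqrt(meanSquare(audio, samples)));
}

// The same level in one step, without the square root.
float rmsDb(const float *audio, std::size_t samples)
{
    return 10.0f * log10f((float) meanSquare(audio, samples));
}
```

A block of pure silence has a mean square of zero, so both functions return negative infinity; clamp the result to some floor if that is a problem for the threshold comparison.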

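You will need to tweak the integration time and the thresholds quite a bit to get it to behave the way you want.

For pitch shifting, I think OpenAL will do the trick. The technique behind it is called band limited interpolation - https://ccrma.stanford.edu/~jos/resample/Theory_Ideal_Bandlimited_Interpolation.html

This example shows the RMS calculation as a running average. The circular buffer maintains a history of the squares, which removes the need to sum all the squares on every operation. I haven't run it, so treat it as pseudocode ;)

Example: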

#include <cmath>   // log10f

class VUMeter
{

protected:

    // samples per second
    float _sampleRate;

    // the integration time in seconds (vu meter is 300ms)
    float _integrationTime;

    // these maintain a circular buffer which contains
    // the 'squares' of the audio samples

    int _integrationBufferLength;
    float *_integrationBuffer;
    float *_integrationBufferEnd;
    float *_cursor;

    // this is a sort of accumulator to make a running
    // average more efficient

    float _sum;

public:

    VUMeter()
    : _sampleRate(48000.0f)
    , _integrationTime(0.3f)
    , _sum(0.)
    {
        // create a buffer of values to be integrated
        // e.g 300ms @ 48khz is 14400 samples

        _integrationBufferLength = (int) (_integrationTime * _sampleRate);

        // the trailing () value-initializes every element to zero
        _integrationBuffer = new float[_integrationBufferLength]();

        // set the pointers for our circular buffer

        _integrationBufferEnd = _integrationBuffer + _integrationBufferLength;
        _cursor = _integrationBuffer;

    }

    ~VUMeter()
    {
        // allocated with new[], so it must be released with delete[]
        delete[] _integrationBuffer;
    }

    float getRms(float *audio, int samples)
    {
        // process the samples
        // this part accumulates the 'squares'

        for (int i = 0; i < samples; ++i)
        {
            // get the input sample

            float s = audio[i];

            // remove the oldest value from the sum

            _sum -= *_cursor;

            // calculate the square and write it into the buffer

            float square = s * s;
            *_cursor = square;

            // add it to the sum

            _sum += square;

            // increment the buffer cursor and wrap

            ++_cursor;

            if (_cursor == _integrationBufferEnd)
                _cursor = _integrationBuffer;
        }

        // now calculate the 'root mean' value in dB
        // 10 * log10 of the mean square equals 20 * log10 of the RMS

        return 10 * log10f(_sum / _integrationBufferLength);
    }
};

score 1 · Accepted Answer

OpenAL's resampling changes pitch and duration inversely. For example, a sound resampled to a higher pitch plays back in a shorter amount of time, and therefore sounds faster.
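
The inverse relationship is easy to see in a toy resampler. The sketch below is not how OpenAL is implemented; it is a hypothetical helper that uses plain linear interpolation (which is not band limited) just to show that reading the input faster yields fewer output samples.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: resample `input` by reading it `ratio` times
// faster (ratio > 1 raises the pitch, ratio < 1 lowers it).
// Linear interpolation is used for brevity; it is not band limited.
std::vector<float> resampleLinear(const std::vector<float> &input, double ratio)
{
    std::vector<float> output;

    if (input.size() < 2 || ratio <= 0.0)
        return output;

    // pos is the read position in the input; it advances by `ratio`
    // for every output sample, so a larger ratio gives a shorter output
    for (double pos = 0.0; pos < input.size() - 1; pos += ratio)
    {
        std::size_t i = (std::size_t) pos;
        float frac = (float) (pos - i);

        // blend the two neighbouring input samples
        output.push_back(input[i] + frac * (input[i + 1] - input[i]));
    }

    return output;
}
```

Played back at the original sample rate, the output for a ratio of 2.0 is an octave higher and about half as long, while a ratio of 0.5 is an octave lower and about twice as long.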
