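I recognize that Hellium3's answer is genuinely informative about what pitch is and about whether doing this kind of thing in Swift is a good idea. My question, however, was originally about whether tapping the PCM bus is the way to obtain the input signal from the microphone.

Since asking, I have done exactly that: I tap the PCM bus and analyse the buffer windows it delivers. It works really well. What made me ask in the first place was that I did not understand what a PCM bus, a buffer, and a sampling frequency are. Once you know those three, it is easy to see that this is the right approach.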
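For readers who want to see how the three concepts fit together, here is a minimal sketch. It assumes AVAudioEngine is the audio stack (the post does not say which API was used), and `windowDuration`, `frequency(forLag:sampleRate:)` and `startMicrophoneTap` are hypothetical helper names chosen for illustration. The tap itself is only compiled where AVFoundation is available.

```swift
import Foundation
#if canImport(AVFoundation)
import AVFoundation
#endif

/// Seconds of audio covered by one buffer window of `frameCount` samples.
func windowDuration(frameCount: Int, sampleRate: Double) -> Double {
    return Double(frameCount) / sampleRate
}

/// Frequency in Hz of a signal that repeats every `lag` samples.
func frequency(forLag lag: Double, sampleRate: Double) -> Double {
    return sampleRate / lag
}

#if canImport(AVFoundation)
/// Hypothetical helper: taps bus 0 of the microphone input and passes each
/// buffer window, together with the sampling frequency, to `handler`.
func startMicrophoneTap(on engine: AVAudioEngine,
                        bufferSize: AVAudioFrameCount = 4096,
                        handler: @escaping ([Float], Double) -> Void) throws {
    let input = engine.inputNode
    let format = input.outputFormat(forBus: 0)
    input.installTap(onBus: 0, bufferSize: bufferSize, format: format) { buffer, _ in
        // floatChannelData is nil when the buffer does not hold Float32 samples
        guard let channels = buffer.floatChannelData else { return }
        let samples = Array(UnsafeBufferPointer(start: channels[0],
                                                count: Int(buffer.frameLength)))
        handler(samples, format.sampleRate)
    }
    try engine.start()
}
#endif
```

At 44.1 kHz a 4096-frame window covers roughly 93 ms of audio. As far as I know, the requested `bufferSize` is only a hint and the system may deliver windows of a different length, so it is safer to read `frameLength` from each buffer, as the sketch does. The app also needs microphone permission before any samples arrive.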
Edit: By request, here is my (deprecated) implementation of the PitchDetector.
import Accelerate

// `Pitch` and `PitchManager` are defined elsewhere and are not part of this snippet.
class PitchDetector {
    var samplingFrequency: Float
    var harmonicConstant: Float

    init(harmonicConstant: Float, samplingFrequency: Float) {
        self.harmonicConstant = harmonicConstant
        self.samplingFrequency = samplingFrequency
    }

    //------------------------------------------------------------------------------
    // MARK: Signal processing
    //------------------------------------------------------------------------------

    func detectPitch(_ samples: [Float]) -> Pitch? {
        // The peak interpolation needs at least three samples
        guard samples.count > 2 else { return nil }
        let snac = self.snac(samples)
        let (lags, peaks) = self.findKeyMaxima(snac)
        let (τBest, clarity) = self.findBestPeak(lags, peaks: peaks)
        if τBest > 0 {
            // A period of τBest samples corresponds to samplingFrequency / τBest Hz
            let frequency = self.samplingFrequency / τBest
            if PitchManager.sharedManager.inManageableRange(frequency) {
                return Pitch(measuredFrequency: frequency, clarity: clarity)
            }
        }
        return nil
    }

    // Returns a Special Normalisation of the AutoCorrelation (SNAC) function array for various lags with values between -1 and 1
    private func snac(_ samples: [Float]) -> [Float] {
        // Largest lag of interest: the period of the lowest note, clamped to the window size
        let τMax = min(Int(self.samplingFrequency / PitchManager.sharedManager.noteFrequencies.first!) + 1, samples.count)
        var snac = [Float](repeating: 0.0, count: samples.count)
        let acf = self.acf(samples)
        let norm = self.m(samples)
        let zeroLag = acf.count / 2 // index of lag 0 in the two-sided correlation
        // Skip lags whose normalisation term is zero to avoid dividing by zero
        for τ in 1 ..< τMax where norm[τ] > 0 {
            snac[τ] = 2 * acf[τ + zeroLag] / norm[τ]
        }
        return snac
    }

    // Auto correlation function (two-sided; lag 0 sits at index x.count - 1)
    private func acf(_ x: [Float]) -> [Float] {
        let resultSize = 2 * x.count - 1
        var result = [Float](repeating: 0, count: resultSize)
        // Zero-pad both sides so the correlation can slide over every lag
        let xPad = [Float](repeating: 0.0, count: x.count - 1)
        let xPadded = xPad + x + xPad
        // With a positive filter stride vDSP_conv computes a correlation
        vDSP_conv(xPadded, 1, x, 1, &result, 1, vDSP_Length(resultSize), vDSP_Length(x.count))
        return result
    }

    // Normalisation term: for lag τ, the sum of x[j]² + x[j + τ]² over the overlapping samples
    private func m(_ samples: [Float]) -> [Float] {
        let count = samples.count
        var sum: Float = 0.0
        for sample in samples {
            sum += 2.0 * sample * sample
        }
        var m = [Float](repeating: 0.0, count: count)
        m[0] = sum
        for i in 1 ..< count {
            // Each additional lag drops one sample from the start and one from the end of the window
            m[i] = m[i - 1] - samples[i - 1] * samples[i - 1] - samples[count - i] * samples[count - i]
        }
        return m
    }

    /**
     * Finds the indices of all key maximum points in data
     */
    private func findKeyMaxima(_ data: [Float]) -> (lags: [Float], peaks: [Float]) {
        var keyMaximaLags: [Float] = []
        var keyMaximaPeaks: [Float] = []
        var newPeakIncoming = false
        var currentBestPeak: Float = 0.0
        var currentBestτ = -1
        // Stop one short of the end because every step reads data[τ + 1]
        for τ in 0 ..< data.count - 1 {
            newPeakIncoming = newPeakIncoming || ((data[τ] < 0) && (data[τ + 1] > 0))
            if newPeakIncoming {
                if data[τ] > currentBestPeak {
                    currentBestPeak = data[τ]
                    currentBestτ = τ
                }
                let zeroCrossing = (data[τ] > 0) && (data[τ + 1] < 0)
                if zeroCrossing {
                    let (τEst, peakEst) = self.approximateTruePeak(currentBestτ, data: data)
                    keyMaximaLags.append(τEst)
                    keyMaximaPeaks.append(peakEst)
                    newPeakIncoming = false
                    currentBestPeak = 0.0
                    currentBestτ = -1
                }
            }
        }
        if keyMaximaLags.count <= 1 {
            let globalMax = data.max()!
            let unwantedPeakOfLowPitchTone = (keyMaximaLags.count == 1 && data[Int(keyMaximaLags[0])] < globalMax)
            if unwantedPeakOfLowPitchTone {
                keyMaximaLags.removeAll()
                keyMaximaPeaks.removeAll()
            }
            let (τEst, peakEst) = self.approximateTruePeak(data.firstIndex(of: globalMax)!, data: data)
            keyMaximaLags.append(τEst)
            keyMaximaPeaks.append(peakEst)
        }
        return (lags: keyMaximaLags, peaks: keyMaximaPeaks)
    }

    /**
     * Approximates the true peak according to https://www.dsprelated.com/freebooks/sasp/Quadratic_Interpolation_Spectral_Peaks.html
     */
    private func approximateTruePeak(_ τ: Int, data: [Float]) -> (τEst: Float, peakEst: Float) {
        // Without a neighbour on both sides there is nothing to interpolate
        guard τ > 0, τ < data.count - 1 else { return (Float(τ), data[τ]) }
        let α = data[τ - 1]
        let β = data[τ]
        let γ = data[τ + 1]
        let denominator = α - 2.0 * β + γ
        // Three equal values have no parabola vertex; fall back to the sampled value
        guard denominator != 0 else { return (Float(τ), β) }
        let p = 0.5 * ((α - γ) / denominator)
        let peakEst = min(1.0, β - 0.25 * (α - γ) * p)
        let τEst = Float(τ) + p
        return (τEst, peakEst)
    }

    // Picks the first key maximum that exceeds harmonicConstant times the highest peak
    private func findBestPeak(_ lags: [Float], peaks: [Float]) -> (τBest: Float, clarity: Float) {
        let threshold: Float = self.harmonicConstant * peaks.max()!
        for i in 0 ..< peaks.count {
            if peaks[i] > threshold {
                return (τBest: lags[i], clarity: peaks[i])
            }
        }
        return (τBest: lags[0], clarity: peaks[0])
    }
}
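
The quadratic interpolation in `approximateTruePeak` can be checked in isolation. The sketch below is not part of the original class: `parabolicPeak` and `y` are hypothetical names, and the test curve is an arbitrary parabola chosen so that its true vertex (x = 10.3, height 1) lies between two samples.

```swift
/// Vertex of the parabola through (-1, α), (0, β), (1, γ): returns the offset
/// from the centre sample and the interpolated height.
func parabolicPeak(_ α: Float, _ β: Float, _ γ: Float) -> (offset: Float, height: Float) {
    let denominator = α - 2 * β + γ
    // Three equal values have no vertex; report the centre sample unchanged
    guard denominator != 0 else { return (0, β) }
    let p = 0.5 * (α - γ) / denominator
    return (p, β - 0.25 * (α - γ) * p)
}

/// Arbitrary test curve with its maximum of 1 at x = 10.3.
func y(_ x: Float) -> Float {
    return 1 - 0.01 * (x - 10.3) * (x - 10.3)
}

// The highest integer sample is at x = 10; its neighbours are x = 9 and x = 11.
let (offset, height) = parabolicPeak(y(9), y(10), y(11))
let estimatedPosition = 10 + offset   // close to 10.3
print(estimatedPosition, height)
```

Because the test curve is itself a parabola, the three samples recover the vertex up to floating-point rounding. On a real correlation curve the result is only an approximation, which is why the method is named `approximateTruePeak`.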
All credit goes to Philip McLeod, whose research the PitchDetector implementation above is based on: http://www.cs.otago.ac.nz/research/publications/oucs-2008-03.pdf
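
To see the idea work without a microphone or the Accelerate framework, here is a deliberately simplified sketch that runs on a synthetic 440 Hz sine. It is not the class above. It computes the same ratio 2·acf(τ)/m(τ) with plain nested loops and takes the first key maximum without applying the `harmonicConstant` threshold. The names `normalizedAutocorrelation`, `firstKeyMaximumLag` and `estimateFrequency` are hypothetical.

```swift
import Foundation

/// 2·acf(τ)/m(τ) for lags 0 ..< maxLag, computed directly in O(n · maxLag).
func normalizedAutocorrelation(_ x: [Float], maxLag: Int) -> [Float] {
    let lagCount = max(0, min(maxLag, x.count))
    var result = [Float](repeating: 0, count: lagCount)
    for τ in 0 ..< lagCount {
        var acf: Float = 0   // sum of x[j] · x[j + τ]
        var m: Float = 0     // sum of x[j]² + x[j + τ]²
        for j in 0 ..< x.count - τ {
            acf += x[j] * x[j + τ]
            m += x[j] * x[j] + x[j + τ] * x[j + τ]
        }
        result[τ] = m > 0 ? 2 * acf / m : 0
    }
    return result
}

/// Interpolated lag of the highest value inside the first positive lobe that
/// follows a negative-to-positive crossing, or nil if there is no such lobe.
func firstKeyMaximumLag(_ data: [Float]) -> Float? {
    guard data.count > 2 else { return nil }
    var best = -1
    var insideLobe = false
    for τ in 1 ..< data.count - 1 {
        if !insideLobe {
            insideLobe = data[τ - 1] < 0 && data[τ] >= 0
        }
        if insideLobe {
            if data[τ] < 0 { break }   // the lobe has ended
            if best < 0 || data[τ] > data[best] { best = τ }
        }
    }
    guard best > 0 else { return nil }
    // Refine the integer lag with the same quadratic interpolation as above
    let α = data[best - 1], β = data[best], γ = data[best + 1]
    let denominator = α - 2 * β + γ
    return denominator == 0 ? Float(best) : Float(best) + 0.5 * (α - γ) / denominator
}

func estimateFrequency(_ samples: [Float], sampleRate: Float, maxLag: Int) -> Float? {
    let curve = normalizedAutocorrelation(samples, maxLag: maxLag)
    guard let lag = firstKeyMaximumLag(curve), lag > 0 else { return nil }
    return sampleRate / lag
}

// Synthetic test tone: 2048 samples of a 440 Hz sine at 44.1 kHz
let tone: [Float] = (0 ..< 2048).map { i in
    Float(sin(2.0 * Double.pi * 440.0 * Double(i) / 44100.0))
}
let estimate = estimateFrequency(tone, sampleRate: 44100, maxLag: 1024)
print(estimate ?? -1)
```

For real recordings with strong harmonics, the first lobe is not always the right one. That is the case the threshold in `findBestPeak` is meant to handle, so treat this sketch as an illustration of the principle only.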