c# - Google Speech API returns empty results for some files and not for others (C#)

Question

An EMPTY result looks like this:

json[0] "{\"result\":[]}"
json[1] ""

A NON-EMPTY result (the desired result) looks like this:

json[0] "{\"result\":[]}"
json[1] "{\"result\":[{\"alternative\":[{\"transcript\":\"good morning Google how are you feeling today\",\"confidence\":0.987629}],\"final\":true}],\"result_index\":0}"
json[2] ""

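I have this function that is supposed to take a ".flac" file and convert it to text. For some reason, only these two sample ".flac" files return a string when passed through the Google Speech API; my other flac files return an EMPTY result. It is the same problem these people are having: link

Here are all my flac files: link

For my.flac and this_is_a_test.flac, the Google Speech API gives me a JSON object containing the text.

However, recorded.flac does not work with the Google Speech API; an EMPTY JSON object is returned.

Debugging:

I thought the problem was the microphone, so I recorded recorded.flac many times, loudly and clearly, and converted it to flac with ffmpeg. The Google Speech API still cannot recognize recorded.flac.
I thought the format in my code was wrong, so I tried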

_HWR_SpeechToText.ContentType = "audio/116; rate=16000";

instead of

_HWR_SpeechToText.ContentType ="audio/x-flac; rate=44100";

Then none of them worked, not a single flac file, so I changed it back.

Here is my Google Speech API code that converts the FLAC file to TEXT (I don't think it's needed, but whatever):

// Requires: using System.IO; using System.Net; using Newtonsoft.Json;
// ACCESS_GOOGLE_SPEECH_KEY, text, and label1 are members defined elsewhere in the class.
public void convert_to_text()
{
    // Read the whole file; a single Stream.Read call is not guaranteed to fill the buffer.
    byte[] BA_AudioFile = File.ReadAllBytes("recorded.flac"); // my.flac

    HttpWebRequest _HWR_SpeechToText = (HttpWebRequest)WebRequest.Create(
        "https://www.google.com/speech-api/v2/recognize?output=json&lang=en-us&key=" + ACCESS_GOOGLE_SPEECH_KEY);
    _HWR_SpeechToText.Credentials = CredentialCache.DefaultCredentials;
    _HWR_SpeechToText.Method = "POST";
    _HWR_SpeechToText.ContentType = "audio/x-flac; rate=44100";
    _HWR_SpeechToText.ContentLength = BA_AudioFile.Length;

    // Upload the audio bytes as the request body.
    using (Stream stream = _HWR_SpeechToText.GetRequestStream())
    {
        stream.Write(BA_AudioFile, 0, BA_AudioFile.Length);
    }

    string responseFromServer;
    using (HttpWebResponse HWR_Response = (HttpWebResponse)_HWR_SpeechToText.GetResponse())
    using (StreamReader SR_Response = new StreamReader(HWR_Response.GetResponseStream()))
    {
        responseFromServer = SR_Response.ReadToEnd();
    }

    // The response holds one JSON object per line; in the samples above the first is {"result":[]}.
    foreach (string j in responseFromServer.Split('\n'))
    {
        dynamic jsonObject = JsonConvert.DeserializeObject(j);
        if (jsonObject == null || jsonObject.result.Count <= 0)
        {
            continue; // blank line or empty result
        }
        text = jsonObject.result[0].alternative[0].transcript;
        break; // stop at the first non-empty result
    }
    label1.Content = text;
}

score 1 · Accepted Answer

First, make sure the file is 16-bit PCM mono, not stereo. You can do this easily with http://www.audacityteam.org/
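
If you would rather check the file from code than open it in an editor, the sketch below reads the channel count, bit depth, and sample rate from the FLAC header. It relies only on the STREAMINFO layout from the FLAC format specification; the `FlacInfo` class and its method names are made up for this illustration, and it assumes the file starts directly with the `fLaC` marker (no ID3 tag in front). Whether a mismatch here is what causes the empty result for a given file is an assumption to verify, not something this code proves.

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of any Google or .NET API).
public static class FlacInfo
{
    // Parses the first 22 bytes of a FLAC stream: 4-byte "fLaC" marker,
    // 4-byte metadata block header, then the start of STREAMINFO.
    public static (int SampleRate, int Channels, int BitsPerSample) Read(byte[] header)
    {
        if (header == null || header.Length < 22 ||
            header[0] != (byte)'f' || header[1] != (byte)'L' ||
            header[2] != (byte)'a' || header[3] != (byte)'C')
        {
            throw new InvalidDataException("Not a FLAC stream (missing fLaC marker).");
        }

        // Low 7 bits of the block header are the block type; STREAMINFO is type 0.
        if ((header[4] & 0x7F) != 0)
        {
            throw new InvalidDataException("First metadata block is not STREAMINFO.");
        }

        // STREAMINFO starts at offset 8; its first 10 bytes are block and frame sizes.
        int sampleRate = (header[18] << 12) | (header[19] << 4) | (header[20] >> 4); // 20 bits
        int channels = ((header[20] >> 1) & 0x07) + 1;                               // 3 bits, stored minus 1
        int bitsPerSample = (((header[20] & 0x01) << 4) | (header[21] >> 4)) + 1;    // 5 bits, stored minus 1
        return (sampleRate, channels, bitsPerSample);
    }

    // Convenience overload for a file on disk.
    public static (int SampleRate, int Channels, int BitsPerSample) ReadFile(string path)
    {
        return Read(File.ReadAllBytes(path));
    }
}
```

For example, `var info = FlacInfo.ReadFile("recorded.flac");` lets you confirm that `info.Channels` is 1 and `info.BitsPerSample` is 16, and compare `info.SampleRate` against the `rate=` value you send in the Content-Type header.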

Then you can do it with this simple code:

string api_key = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx";
string path = @"C:\temp\good-morning-google.flac";

byte[] bytes = System.IO.File.ReadAllBytes(path);

WebClient client = new WebClient();
client.Headers.Add("Content-Type", "audio/x-flac; rate=44100");
byte[] result = client.UploadData(string.Format(
                "https://www.google.com/speech-api/v2/recognize?client=chromium&lang=en-us&key={0}", api_key), "POST", bytes);

string s = client.Encoding.GetString(result);
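
The string `s` still has the line-per-object shape shown in the question, so the transcript has to be pulled out of it. Here is a minimal sketch of that step, assuming the response always looks like the samples in the question (one JSON object per line, with the text under `result[0].alternative[0].transcript`). It uses `System.Text.Json` rather than the Json.NET calls from the question, and `SpeechResponse.ExtractTranscript` is a made-up name for this example.

```csharp
using System;
using System.Text.Json;

// Hypothetical helper for the line-delimited response shown in the question.
public static class SpeechResponse
{
    // Returns the first transcript found, or null when every line is blank
    // or carries an empty "result" array (the EMPTY case).
    public static string ExtractTranscript(string response)
    {
        foreach (string line in response.Split('\n'))
        {
            if (string.IsNullOrWhiteSpace(line))
            {
                continue; // trailing blank line
            }

            using (JsonDocument doc = JsonDocument.Parse(line))
            {
                JsonElement root = doc.RootElement;
                if (root.ValueKind != JsonValueKind.Object ||
                    !root.TryGetProperty("result", out JsonElement results) ||
                    results.ValueKind != JsonValueKind.Array ||
                    results.GetArrayLength() == 0)
                {
                    continue; // {"result":[]} or an unexpected shape
                }

                JsonElement first = results[0];
                if (first.ValueKind == JsonValueKind.Object &&
                    first.TryGetProperty("alternative", out JsonElement alternatives) &&
                    alternatives.ValueKind == JsonValueKind.Array &&
                    alternatives.GetArrayLength() > 0 &&
                    alternatives[0].ValueKind == JsonValueKind.Object &&
                    alternatives[0].TryGetProperty("transcript", out JsonElement transcript))
                {
                    return transcript.GetString();
                }
            }
        }
        return null;
    }
}
```

With the code above you would call `string text = SpeechResponse.ExtractTranscript(s);` and treat a `null` return as the EMPTY case.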
