ffmpeg - Google Speech to Text for a WAV file

Question

I am using the Google Speech to Text API to convert a WAV file to text. The WAV file plays back without any problem, but when I run it through the Google Speech to Text API I get the following error:

WAV header indicates an unsupported format.

When I try to analyze the file with the ffmpeg tool, I get the following error:

Output #0, wav, to '/home/shubham/workspace/intent-service/scripts/audio2.tmp.wav':
Metadata:
  ISFT            : Lavf57.83.100
  Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 8000 Hz, mono, s16, 128 kb/s
  Metadata:
    encoder         : Lavc57.107.100 pcm_s16le
[gsm_ms @ 0x55d4c255cd20] Packet is too small
Error while decoding stream #0:0: Invalid data found when processing input
size=7924kB time=00:08:27.16 bitrate= 128.0kbits/s speed=3.72e+03x
video:0kB audio:7924kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000961%

What am I missing?

score 0 · Accepted Answer

Yes, for .wav files you need to add a flag that specifies the encoding as Linear16: --encoding=linear16.
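Before relying on that flag, it may be worth checking what the WAV header actually declares. The `[gsm_ms @ ...]` line in the ffmpeg log suggests the input could be GSM-encoded rather than linear PCM; that is an inference from the log, not something confirmed in the question. The sketch below reads the format tag from the `fmt ` chunk. The function names and the 4 KiB read size are illustrative assumptions.

```python
import struct

# A few well-known WAVE format tags (wFormatTag values).
FORMAT_TAGS = {
    0x0001: "PCM (linear)",
    0x0006: "A-law",
    0x0007: "mu-law",
    0x0031: "GSM 6.10 (Microsoft)",
}

def wav_format_tag(data: bytes) -> int:
    """Return wFormatTag from the 'fmt ' chunk of RIFF/WAVE bytes."""
    if len(data) < 12 or data[0:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        if chunk_id == b"fmt ":
            (tag,) = struct.unpack_from("<H", data, pos + 8)
            return tag
        pos += 8 + size + (size & 1)  # chunks are padded to even length
    raise ValueError("no 'fmt ' chunk found")

def describe_wav(path: str) -> str:
    """Read the start of a file and name its declared encoding."""
    with open(path, "rb") as fh:
        tag = wav_format_tag(fh.read(4096))  # assumes 'fmt ' is near the start
    return FORMAT_TAGS.get(tag, f"unknown tag 0x{tag:04x}")
```

If `describe_wav("audio2.wav")` reports anything other than PCM, passing --encoding=linear16 alone is unlikely to help, and the file would need to be re-encoded first.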

FLAC is the other lossless format the Google STT API accepts. Since you already have ffmpeg, you can convert the file to .flac and add the flag --encoding=flac to the command.
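A minimal sketch of that conversion, assuming ffmpeg is on the PATH. The helper names `build_ffmpeg_cmd` and `convert` and the file names are hypothetical; the 8000 Hz mono defaults simply mirror the stream shown in the ffmpeg log above. Whether ffmpeg can fully decode a file that already triggers "Packet is too small" is not guaranteed.

```python
import subprocess

def build_ffmpeg_cmd(src, dst, codec="flac", sample_rate=8000):
    """Build an ffmpeg command that re-encodes src to mono audio in dst."""
    return [
        "ffmpeg", "-y",           # overwrite dst if it already exists
        "-i", src,                # input file
        "-ac", "1",               # one channel (mono)
        "-ar", str(sample_rate),  # output sample rate in Hz
        "-c:a", codec,            # audio codec: "flac" or "pcm_s16le"
        dst,
    ]

def convert(src, dst, **kwargs):
    # Raises CalledProcessError if ffmpeg exits with a non-zero status.
    subprocess.run(build_ffmpeg_cmd(src, dst, **kwargs), check=True)
```

For example, `convert("audio2.wav", "audio2.flac")` would pair with --encoding=flac, while `convert("audio2.wav", "audio2.pcm.wav", codec="pcm_s16le")` would produce a 16-bit PCM WAV to use with --encoding=linear16.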
