speech-to-text - Using Sphinx4 with es_MX_broadcast_cont_2500

Question

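I'm currently developing an audio transcriber for short Spanish (MX) interviews (about 2 minutes long). I've searched the web but can't find an answer to this; maybe it's too simple :/ . While running the .jar, I get this warning for (I guess) every accented word in /etc/h4.dict of the es_MX_broadcast... voxforge package. I get no transcription and no other errors at all.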

...

WARNING dictionary The dictionary is missing a phonetic transcription for the word 'kyrgyzst�'

WARNING dictionary The dictionary is missing a phonetic transcription for the word 'explotaci�'

WARNING dictionary The dictionary is missing a phonetic transcription for the word 'inclu�'

...

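My guess is that there is a configuration problem with the text encoding, or perhaps that I need to build a language model. I really do want to train one, but first I need to get this working. Here is the linguist/dictionary/language_model/acoustic_model part of my config.xml file: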

<component name="lexTreeLinguist" 
            type="edu.cmu.sphinx.linguist.lextree.LexTreeLinguist">
    <property name="logMath" value="logMath"/>
    <property name="acousticModel" value="wsj"/>
    <property name="languageModel" value="trigramModel"/>
    <property name="dictionary" value="dictionary"/>
    <property name="addFillerWords" value="false"/>
    <property name="fillerInsertionProbability" value="1E-10"/>
    <property name="generateUnitStates" value="false"/>
    <property name="wantUnigramSmear" value="true"/>
    <property name="unigramSmearWeight" value="1"/>
    <property name="wordInsertionProbability" 
            value="${wordInsertionProbability}"/>
    <property name="silenceInsertionProbability" 
            value="${silenceInsertionProbability}"/>
    <property name="languageWeight" value="${languageWeight}"/>
    <property name="unitManager" value="unitManager"/>
</component>    

<component name="dictionary" 
    type="edu.cmu.sphinx.linguist.dictionary.FastDictionary">
    <property name="dictionaryPath"
              value="/home/csampez/Desktop/JavaDev/Sphinx/sphinx4/models/acoustic/es_MX_broadcast_cont_2500/etc/h4.dict"/>
    <property name="fillerPath" 
      value="/home/csampez/Desktop/JavaDev/Sphinx/sphinx4/models/acoustic/es_MX_broadcast_cont_2500/etc/filler.dict"/>
    <property name="addSilEndingPronunciation" value="false"/>
    <property name="wordReplacement" value="&lt;sil&gt;"/>
    <property name="unitManager" value="unitManager"/>
</component>

<component name="trigramModel" 
      type="edu.cmu.sphinx.linguist.language.ngram.large.LargeTrigramModel">
    <property name="unigramWeight" value=".7"/>
    <property name="maxDepth" value="3"/>
    <property name="logMath" value="logMath"/>
    <property name="dictionary" value="dictionary"/>
    <property name="location"
     value="/home/csampez/Desktop/JavaDev/Sphinx/sphinx4/models/acoustic/es_MX_broadcast_cont_2500/etc/H4.arpa.Z.DMP"/>
</component>

<component name="wsj"
           type="edu.cmu.sphinx.linguist.acoustic.tiedstate.TiedStateAcousticModel">
    <property name="loader" value="wsjLoader"/>
    <property name="unitManager" value="unitManager"/>
</component>

<component name="wsjLoader" type="edu.cmu.sphinx.linguist.acoustic.tiedstate.Sphinx3Loader">
    <property name="logMath" value="logMath"/>
    <property name="unitManager" value="unitManager"/>
    <property name="location" value="/home/csampez/Desktop/JavaDev/Sphinx/sphinx4/models/acoustic/es_MX_broadcast_cont_2500/model_parameters/hub4_spanish_itesm.cd_cont_2500"/>
</component>

-------This is new information (October 3, 2013)----------

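Thanks, but that isn't the problem. The files are already UTF-8 and I have set JAVA_TOOL_OPTIONS to UTF-8. I also ran the .jar with -Dfile.encoding, and whatever I change, I get the same list of warnings. I looked for another dictionary list somewhere in the files but couldn't find one. What makes this really strange is that h4.dict is in uppercase while the warnings are in lowercase, and some accented words don't appear in the warning list at all. I tried saving a different .dict file with fewer words, but that didn't work; in fact, the warnings listed even more words.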

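I don't know whether the problem is that I'm not using a .jar for the acoustic model, as the other demos do, or whether it is related to the fact that I get no transcription and no other errors.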

I really hope someone can help me understand this. In the meantime, I'll keep trying.

Thanks in advance.

score 0 · Accepted Answer

You need to convert the files to UTF-8.
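Before converting anything, it can help to confirm whether the dictionary bytes really are valid UTF-8. The following is a minimal sketch, not part of Sphinx4: the class name `DictEncoding` and its helpers are hypothetical, and the conversion step assumes the original file is ISO-8859-1, which may not hold for your copy of h4.dict.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Sphinx4.
public class DictEncoding {

    // Returns true only if the bytes decode as UTF-8 without any malformed sequence.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Re-encodes bytes as UTF-8. ASSUMPTION: the input is ISO-8859-1.
    static byte[] latin1ToUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1)
                .getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java DictEncoding <path-to-dict>");
            return;
        }
        Path in = Paths.get(args[0]);
        byte[] raw = Files.readAllBytes(in);
        if (isValidUtf8(raw)) {
            System.out.println(in + " is already valid UTF-8");
        } else {
            // Write a converted copy next to the original; the original is left untouched.
            Path out = Paths.get(args[0] + ".utf8");
            Files.write(out, latin1ToUtf8(raw));
            System.out.println("wrote " + out);
        }
    }
}
```

Run it with the dictionary path as the only argument. If it reports that the file is already valid UTF-8, the file itself is probably not what is being mis-decoded.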

You also need to pass the Java option -Dfile.encoding=utf-8 so that the Java VM treats all input files as UTF-8.
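To check that the option actually reached the JVM, a small program can print the charset in effect. This is only an illustrative sketch: the class name `DefaultCharsetCheck` and the sample word are assumptions, and it does not touch Sphinx4 at all. It also shows why the charset matters, by decoding the same UTF-8 bytes with two different charsets.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical diagnostic, not part of Sphinx4.
public class DefaultCharsetCheck {

    // Decodes bytes with an explicitly chosen charset.
    static String decodeAs(byte[] bytes, Charset cs) {
        return new String(bytes, cs);
    }

    public static void main(String[] args) {
        // What the JVM believes after processing -Dfile.encoding / JAVA_TOOL_OPTIONS.
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        System.out.println("default charset = " + Charset.defaultCharset());

        // Sample uppercase dictionary-style word, written with an escape so the
        // source file's own encoding does not matter.
        String word = "EXPLOTACI\u00d3N";
        byte[] utf8 = word.getBytes(StandardCharsets.UTF_8);

        String right = decodeAs(utf8, StandardCharsets.UTF_8);
        String wrong = decodeAs(utf8, StandardCharsets.ISO_8859_1);

        System.out.println("decoded as UTF-8 matches:      " + right.equals(word));
        System.out.println("decoded as ISO-8859-1 matches: " + wrong.equals(word));
    }
}
```

Run it the same way as the recognizer, for example `java -Dfile.encoding=utf-8 DefaultCharsetCheck`, and compare the two printed charset values with what you expect.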

Most importantly, es_MX_broadcast_cont requires a specific feature extractor. In the config file you need to replace DeltasFeatureExtractor with S3FeatureExtractor; otherwise accuracy will be zero.
