java - Open source speech recognition software in Java

Question

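I have recently been thinking about starting an application based on speech recognition, meaning that specific recognized results trigger specific tasks. I have been wondering what the best way to proceed is. I am considering targeting either the PC or Android. I consider Java my strongest programming language.

I have done some searching, but I still don't know the best way to approach this.

Can I leave the speech recognition part to open source software and handle the rest myself? Or should I do it all myself? If so, is that possible in Java?

Any information is welcome.

Thanks in advance.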

score 6 · Accepted Answer

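The best way to approach this is to use an existing recognition toolkit together with the language and acoustic models that come with it. You can train the models to fit your needs.

CMUSphinx is probably the best FOSS speech recognition toolkit. CMUSphinx also provides good Java integration and demo applications.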

score 4 · Accepted Answer

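After evaluating several third-party speech recognition options, I found Google's speech recognition to be the most accurate. There are two basic approaches to using Google speech recognition. The easiest is to launch an Intent and handle the results accordingly.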

    Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);

    intent.addFlags(Intent.FLAG_ACTIVITY_CLEAR_TOP);
    intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);

    startActivityForResult(intent, VOICE_RECOGNITION_REQUEST_CODE );

Then, in onActivityResult(), handle the matches returned by the service.

    /**
     * Handle the results from the recognition activity.
     */
    @Override
    protected void onActivityResult(int requestCode, int resultCode, Intent data) {
        super.onActivityResult(requestCode, resultCode, data);
        //Toast.makeText(this, "voice recog result: " + resultCode, Toast.LENGTH_LONG).show();
        if (requestCode == VOICE_RECOGNITION_REQUEST_CODE && resultCode == RESULT_OK) {
            // Fill the list view with the strings the recognizer thought it could have heard
            ArrayList<String> matches = data.getStringArrayListExtra(
                    RecognizerIntent.EXTRA_RESULTS);
            // handleResults
            if (matches != null) {
                handleResults(matches);
            }
        }
    }

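The second approach is more involved, but it lets you handle error conditions that can occur while the recognition service is running. With this approach you create your own recognition listener and callback methods. For example:

Start listening:

    mSpeechRecognizer.startListening(mRecognizerIntent);

where mSpeechRecognizer and mRecognizerIntent are set up as follows: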

    mSpeechRecognizer = SpeechRecognizer.createSpeechRecognizer(getBaseContext());
    mSpeechRecognizer.setRecognitionListener(mRecognitionListener);
    mRecognizerIntent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
    mRecognizerIntent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
            RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
    mRecognizerIntent.putExtra("calling_package", "com.you.package");

Next, create the listener.

    private RecognitionListener mRecognitionListener = new RecognitionListener() {
        @Override
        public void onBufferReceived(byte[] buffer) {
            //Log.d(TAG, "onBufferReceived");
        }

        @Override
        public void onError(int error) {
            // here is where you handle the error...
        }

        @Override
        public void onEvent(int eventType, Bundle params) {
            Log.d(TAG, "onEvent");
        }

        @Override
        public void onPartialResults(Bundle partialResults) {
            Log.d(TAG, "onPartialResults");
        }

        @Override
        public void onReadyForSpeech(Bundle params) {
            Log.d(TAG, "onReadyForSpeech");
        }

        @Override
        public void onResults(Bundle results) {
            Log.d(TAG, ">>> onResults");
            //Toast.makeText(getBaseContext(), "got voice results!", Toast.LENGTH_SHORT);

            // Candidate transcriptions returned by the recognizer
            ArrayList<String> matches = results.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
            if (matches != null) {
                handleResults(matches);
            }
        }

        @Override
        public void onRmsChanged(float rmsdB) {
            //Log.d(TAG, "onRmsChanged");
        }

        @Override
        public void onBeginningOfSpeech() {
            Log.d(TAG, "onBeginningOfSpeech");
        }

        @Override
        public void onEndOfSpeech() {
            Log.d(TAG, "onEndOfSpeech");
        }
    };
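
The onError callback in the listener receives only an integer code, so it can help to translate it into something readable before deciding what to do. The following is a minimal sketch, not part of the original answer: RecognitionErrors, describe and shouldListenAgain are hypothetical names, and the integer values are copied from the documented SpeechRecognizer.ERROR_* constants so that the example runs outside Android. In a real app you would reference those constants directly.

```java
/** Hypothetical helper that turns a recognition error code into a readable message. */
class RecognitionErrors {
    // Assumption: these values mirror the documented SpeechRecognizer.ERROR_* constants.
    // Inside an Android project, use SpeechRecognizer.ERROR_* instead of these copies.
    static final int ERROR_NETWORK_TIMEOUT = 1;
    static final int ERROR_NETWORK = 2;
    static final int ERROR_AUDIO = 3;
    static final int ERROR_SERVER = 4;
    static final int ERROR_CLIENT = 5;
    static final int ERROR_SPEECH_TIMEOUT = 6;
    static final int ERROR_NO_MATCH = 7;
    static final int ERROR_RECOGNIZER_BUSY = 8;
    static final int ERROR_INSUFFICIENT_PERMISSIONS = 9;

    /** Returns a short description for the given error code. */
    static String describe(int error) {
        switch (error) {
            case ERROR_NETWORK_TIMEOUT: return "network timeout";
            case ERROR_NETWORK: return "network error";
            case ERROR_AUDIO: return "audio recording error";
            case ERROR_SERVER: return "server error";
            case ERROR_CLIENT: return "client error";
            case ERROR_SPEECH_TIMEOUT: return "no speech input";
            case ERROR_NO_MATCH: return "no match";
            case ERROR_RECOGNIZER_BUSY: return "recognizer busy";
            case ERROR_INSUFFICIENT_PERMISSIONS: return "insufficient permissions";
            default: return "unknown error (" + error + ")";
        }
    }

    /**
     * Policy choice for this sketch only: listen again when the user said
     * nothing or nothing was recognized; treat everything else as a failure.
     */
    static boolean shouldListenAgain(int error) {
        return error == ERROR_SPEECH_TIMEOUT || error == ERROR_NO_MATCH;
    }
}
```

Inside onError you could then log RecognitionErrors.describe(error) and call startListening() again only when shouldListenAgain(error) returns true.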

You can then add a handleResults() method that does whatever you like with the matches.
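
As an illustration of what handleResults() might do for the use case in the question (a recognized phrase triggers a task), here is a minimal sketch in plain Java. It is not from the original answer: VoiceCommandDispatcher and register are hypothetical names, and matching by exact phrase after trimming and lower-casing is an assumption; a real app may need fuzzier matching.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper that maps recognized phrases to actions. */
class VoiceCommandDispatcher {
    private final Map<String, Runnable> commands = new HashMap<>();

    /** Registers an action for a phrase; matching ignores case and surrounding whitespace. */
    void register(String phrase, Runnable action) {
        commands.put(normalize(phrase), action);
    }

    /**
     * Runs the action for the first candidate that equals a registered phrase.
     * Returns the normalized phrase that matched, or null if none did.
     */
    String handleResults(List<String> matches) {
        if (matches == null) {
            return null;
        }
        for (String candidate : matches) {
            if (candidate == null) {
                continue;
            }
            String key = normalize(candidate);
            Runnable action = commands.get(key);
            if (action != null) {
                action.run();
                return key;
            }
        }
        return null;
    }

    private static String normalize(String s) {
        return s.trim().toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        VoiceCommandDispatcher dispatcher = new VoiceCommandDispatcher();
        dispatcher.register("lights on", () -> System.out.println("Turning lights on"));
        dispatcher.register("lights off", () -> System.out.println("Turning lights off"));
        // Candidates are tried in the order the recognizer returned them.
        dispatcher.handleResults(Arrays.asList("light song", "Lights On"));
    }
}
```

Both approaches above can pass their ArrayList<String> of matches straight to a method like this, since the candidates are checked in order and unknown phrases are simply ignored.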

score 1 · Accepted Answer

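You can also use the Google Speech API. On Android, it is accessible through the SpeechRecognizer class; see the SpeechRecognizer Class Reference.

Here is a link to a Stack Overflow question that also includes demo code in Java: Speech recognition in Java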
