c# - Ultra Fast Text to Speech (WAV -> MP3) in ASP.NET MVC

Question

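This question is essentially about the suitability of Microsoft's Speech API (SAPI) for server workloads, and whether it can be used reliably inside w3wp for speech synthesis. We have an asynchronous controller that uses the native System.Speech assembly in .NET 4 (not the Microsoft.Speech one that ships as part of Microsoft Speech Platform - Runtime Version 11) and lame.exe to generate MP3s, as follows: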

        [CacheFilter]
        public void ListenAsync(string url)
        {
                string fileName = string.Format(@"C:\test\{0}.wav", Guid.NewGuid());                       

                try
                {
                    var t = new System.Threading.Thread(() =>
                    {
                        using (SpeechSynthesizer ss = new SpeechSynthesizer())
                        {
                            ss.SetOutputToWaveFile(fileName, new SpeechAudioFormatInfo(22050, AudioBitsPerSample.Eight, AudioChannel.Mono));
                            ss.Speak("Here is a test sentence...");
                            ss.SetOutputToNull();
                            ss.Dispose();
                        }

                        var process = new Process() { EnableRaisingEvents = true };
                        process.StartInfo.FileName = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, @"bin\lame.exe");
                        process.StartInfo.Arguments = string.Format("-V2 {0} {1}", fileName, fileName.Replace(".wav", ".mp3"));
                        process.StartInfo.UseShellExecute = false;
                        process.StartInfo.RedirectStandardOutput = false;
                        process.StartInfo.RedirectStandardError = false;
                        process.Exited += (sender, e) =>
                        {
                            System.IO.File.Delete(fileName);

                            AsyncManager.OutstandingOperations.Decrement();
                        };

                        AsyncManager.OutstandingOperations.Increment();
                        process.Start();
                    });

                    t.Start();
                    t.Join();
                }
                catch { }

            AsyncManager.Parameters["fileName"] = fileName;
        }

        public FileResult ListenCompleted(string fileName)
        {
            return base.File(fileName.Replace(".wav", ".mp3"), "audio/mp3");
        }

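The question is: why does SpeechSynthesizer need to run on a separate thread like that in order to return (this has been reported elsewhere on SO, here and here), and would implementing a STAThreadRouteHandler for this request be more efficient/scalable than the approach above?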

Second, what are the options for running SpeakAsync in an ASP.NET (MVC or WebForms) context? None of the options I've tried seem to work (see the update below).

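Any other suggestions for improving this pattern (i.e. two dependencies that must execute serially, one after the other, but each of which supports async) are welcome. I don't feel this scheme is sustainable under load, especially considering the known memory leaks in SpeechSynthesizer. I'm considering running this service on a different stack altogether.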

Update: Neither the Speak nor the SpeakAsync option appears to work under the STAThreadRouteHandler. The former produces:

System.InvalidOperationException: Asynchronous operations are not allowed in this context. Page starting an asynchronous operation has to have the Async attribute set to true and an asynchronous operation can only be started on a page prior to PreRenderComplete event.
   at System.Web.LegacyAspNetSynchronizationContext.OperationStarted()
   at System.ComponentModel.AsyncOperationManager.CreateOperation(Object userSuppliedState)
   at System.Speech.Internal.Synthesis.VoiceSynthesis..ctor(WeakReference speechSynthesizer)
   at System.Speech.Synthesis.SpeechSynthesizer.get_VoiceSynthesizer()
   at System.Speech.Synthesis.SpeechSynthesizer.SetOutputToWaveFile(String path, SpeechAudioFormatInfo formatInfo)

The latter results in:

System.InvalidOperationException: The asynchronous action method 'Listen' cannot be executed synchronously.
   at System.Web.Mvc.Async.AsyncActionDescriptor.Execute(ControllerContext controllerContext, IDictionary`2 parameters)

A custom STA thread pool (with ThreadStatic instances of the COM object) seems like a better approach: http://marcinbudny.blogspot.ca/2012/04/dealing-with-sta-coms-in-web.html

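Update #2: It doesn't seem that System.Speech.SpeechSynthesizer needs STA treatment; it appears to run fine on MTA threads as long as you follow the Start/Join pattern. Here is a new version that uses SpeakAsync correctly (the issue was disposing of it prematurely!) and splits the WAV generation and MP3 generation into two separate requests: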

[CacheFilter]
[ActionName("listen-to-text")]
public void ListenToTextAsync(string text)
{
    AsyncManager.OutstandingOperations.Increment();   

    var t = new Thread(() =>
    {
        SpeechSynthesizer ss = new SpeechSynthesizer();
        string fileName = string.Format(@"C:\test\{0}.wav", Guid.NewGuid());

        ss.SetOutputToWaveFile(fileName, new SpeechAudioFormatInfo(22050,
                                                                   AudioBitsPerSample.Eight,
                                                                   AudioChannel.Mono));
        ss.SpeakCompleted += (sender, e) =>
        {
            ss.SetOutputToNull();
            ss.Dispose();

            AsyncManager.Parameters["fileName"] = fileName;
            AsyncManager.OutstandingOperations.Decrement();
        };

        CustomPromptBuilder pb = new CustomPromptBuilder(settings.DefaultVoiceName);
        pb.AppendParagraphText(text);
        ss.SpeakAsync(pb);               
    });

    t.Start();
    t.Join();                    
}

[CacheFilter]
public ActionResult ListenToTextCompleted(string fileName)
{
    return RedirectToAction("mp3", new { fileName = fileName });
}

[CacheFilter]
[ActionName("mp3")]
public void Mp3Async(string fileName) 
{
    var process = new Process()
    {
        EnableRaisingEvents = true,
        StartInfo = new ProcessStartInfo()
        {
            FileName = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, @"bin\lame.exe"),
            Arguments = string.Format("-V2 {0} {1}", fileName, fileName.Replace(".wav", ".mp3")),
            UseShellExecute = false,
            RedirectStandardOutput = false,
            RedirectStandardError = false
        }
    };

    process.Exited += (sender, e) =>
    {
        System.IO.File.Delete(fileName);
        AsyncManager.Parameters["fileName"] = fileName;
        AsyncManager.OutstandingOperations.Decrement();
    };

    AsyncManager.OutstandingOperations.Increment();
    process.Start();
}

[CacheFilter]
public ActionResult Mp3Completed(string fileName) 
{
    return base.File(fileName.Replace(".wav", ".mp3"), "audio/mp3");
}
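
For readers who want the Exited-event bookkeeping in Mp3Async as a single awaitable step, here is a minimal sketch that wraps a process in a Task using TaskCompletionSource. The names ProcessTasks, BuildLameArguments and RunAsync are hypothetical and not part of the original post. It also quotes the two paths, which the original format string does not do; that only matters if a path ever contains spaces.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class ProcessTasks
{
    // Hypothetical helper: same "-V2 in out" arguments as the post,
    // but with both paths quoted so spaces survive argument parsing.
    public static string BuildLameArguments(string wavPath, string mp3Path)
    {
        return string.Format("-V2 \"{0}\" \"{1}\"", wavPath, mp3Path);
    }

    // Hypothetical helper: the returned task completes with the exit code
    // once the process has exited, or faults if the process cannot start.
    public static Task<int> RunAsync(string fileName, string arguments)
    {
        var tcs = new TaskCompletionSource<int>();
        var process = new Process
        {
            EnableRaisingEvents = true,
            StartInfo = new ProcessStartInfo
            {
                FileName = fileName,
                Arguments = arguments,
                UseShellExecute = false
            }
        };

        // Subscribe before Start so a fast exit is not missed.
        process.Exited += (sender, e) =>
        {
            tcs.TrySetResult(process.ExitCode);
            process.Dispose();
        };

        try
        {
            process.Start();
        }
        catch (Exception ex)
        {
            // e.g. the executable was not found
            process.Dispose();
            tcs.TrySetException(ex);
        }

        return tcs.Task;
    }
}
```

A caller could then chain the cleanup as a continuation, for example `ProcessTasks.RunAsync(lamePath, args).ContinueWith(t => File.Delete(wavPath))`, instead of incrementing and decrementing AsyncManager.OutstandingOperations by hand.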

score 0 · Accepted Answer

This question is a bit old now, but this is what I'm doing, and it has been working well so far:

    public Task<FileStreamResult> Speak(string text)
    {
        return Task.Factory.StartNew(() =>
        {
            using (var synthesizer = new SpeechSynthesizer())
            {
                var ms = new MemoryStream();
                synthesizer.SetOutputToWaveStream(ms);
                synthesizer.Speak(text);

                ms.Position = 0;
                return new FileStreamResult(ms, "audio/wav");
            }
        });
    }

Might help someone...
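
The snippet above returns WAV. If MP3 is still wanted, one possible extension is to pipe the in-memory WAV through the encoder over stdin/stdout rather than via temporary files. This is only a sketch under assumptions: AudioPipe, PipeThroughAsync and WavToMp3Async are hypothetical names, it relies on lame accepting "-" as both the input and output file name, and it uses async/await, so it needs .NET 4.5 or later rather than the .NET 4 mentioned in the question.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Threading.Tasks;

public static class AudioPipe
{
    // Hypothetical helper: feeds `input` to the process's stdin and
    // returns everything the process writes to stdout.
    public static async Task<byte[]> PipeThroughAsync(
        string fileName, string arguments, byte[] input)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = fileName,
            Arguments = arguments,
            UseShellExecute = false,
            RedirectStandardInput = true,
            RedirectStandardOutput = true,
            CreateNoWindow = true
        };

        using (var process = Process.Start(startInfo))
        using (var output = new MemoryStream())
        {
            // Start draining stdout first, so a full pipe buffer
            // cannot block the child while we are still writing.
            Task readTask = process.StandardOutput.BaseStream.CopyToAsync(output);

            await process.StandardInput.BaseStream
                .WriteAsync(input, 0, input.Length)
                .ConfigureAwait(false);
            process.StandardInput.Close(); // signals end of input

            await readTask.ConfigureAwait(false);
            process.WaitForExit();

            if (process.ExitCode != 0)
            {
                throw new InvalidOperationException(
                    "Process exited with code " + process.ExitCode);
            }

            return output.ToArray();
        }
    }

    // Assumption: "-V2 - -" makes lame read WAV from stdin and write MP3 to stdout.
    public static Task<byte[]> WavToMp3Async(string lamePath, byte[] wavBytes)
    {
        return PipeThroughAsync(lamePath, "-V2 - -", wavBytes);
    }
}
```

In the action above, the bytes from `ms.ToArray()` could be passed to WavToMp3Async and the result returned with `File(mp3Bytes, "audio/mpeg")`. Whether this is faster than writing files would need to be measured under load.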
