Characters won't do. Use XML markup to control this better, passing the SPF_IS_XML flag:
HRESULT hr = pVoice->Speak(L"Hello <silence msec=\"1000\"/> world",
                           SPF_IS_XML, NULL);
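If the pause length needs to vary at run time, the marked-up string can be built first and then passed to Speak. The following is only a minimal sketch: withPause is a hypothetical helper name (it is not part of SAPI), and it assumes the text fragments contain no XML special characters.

```cpp
#include <string>

// Hypothetical helper (not part of SAPI): joins two text fragments with a
// SAPI XML <silence> tag that pauses for `msec` milliseconds.
// Assumption: `before` and `after` contain no XML special characters (<, >, &).
std::wstring withPause(const std::wstring& before, unsigned msec,
                       const std::wstring& after)
{
    return before + L" <silence msec=\"" + std::to_wstring(msec) + L"\"/> " + after;
}

// Usage on Windows, assuming pVoice is an already initialized ISpVoice*:
//   std::wstring text = withPause(L"Hello", 1000, L"world");
//   HRESULT hr = pVoice->Speak(text.c_str(), SPF_IS_XML, NULL);
```

Keeping the string building separate from the Speak call means the markup can be checked without a speech engine present.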
Or you can use an SSML document with the SPF_PARSE_SSML flag and insert the pause with the <break>
element:
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
Hello<break time="1000ms" />world
</speak>
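The same document can be assembled in code when the break time is not fixed. This is a rough sketch under assumptions: ssmlWithBreak is a hypothetical helper name, the language is hard-coded to en-US as in the document above, and the fragments are assumed to contain no XML special characters.

```cpp
#include <string>

// Hypothetical helper (not part of SAPI): wraps two text fragments in a
// minimal SSML 1.0 document, separated by a <break> of `msec` milliseconds.
// Assumption: the fragments contain no XML special characters (<, >, &).
std::wstring ssmlWithBreak(const std::wstring& before, unsigned msec,
                           const std::wstring& after)
{
    return L"<speak version=\"1.0\" "
           L"xmlns=\"http://www.w3.org/2001/10/synthesis\" xml:lang=\"en-US\">"
           + before
           + L"<break time=\"" + std::to_wstring(msec) + L"ms\"/>"
           + after
           + L"</speak>";
}

// Usage on Windows, assuming pVoice is an already initialized ISpVoice*:
//   std::wstring ssml = ssmlWithBreak(L"Hello", 1000, L"world");
//   HRESULT hr = pVoice->Speak(ssml.c_str(), SPF_PARSE_SSML, NULL);
```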
If you can use C#, the PromptBuilder class is a handy way to build the SSML:
// Requires a reference to System.Speech and: using System.Speech.Synthesis;
private SpeechSynthesizer synth = new SpeechSynthesizer();

private void sayHello() {
    var builder = new PromptBuilder();
    builder.AppendText("Hello");
    builder.AppendBreak(TimeSpan.FromMilliseconds(1000));  // one-second pause
    builder.AppendText("world");
    synth.SpeakAsync(new Prompt(builder));
}