AudioGraph throws XAUDIO2_E_INVALID_CALL on second frame input node

Question:

I'm attempting to use the UWP AudioGraph API to play a mix of synthesised speech and short notification sounds ("earcons").

UWP has a speech synthesis API which gives me a stream containing a WAV file, but I don't want to make too many assumptions about its parameters (bit rate, sample depth, etc.), so the idea is to have an AudioSubmixNode and add an AudioFrameInputNode whenever there's some speech to play. There's some complexity around queueing up separate utterances so that they don't overlap.

The graph is initialised as

<pre class="lang-cs prettyprint-override">private async Task InitAudioGraph()
{
    var graphCreated = await AudioGraph.CreateAsync(new AudioGraphSettings(Windows.Media.Render.AudioRenderCategory.Speech)
    {
        QuantumSizeSelectionMode = QuantumSizeSelectionMode.LowestLatency
    });
    if (graphCreated.Status != AudioGraphCreationStatus.Success)
        return;

    _Graph = graphCreated.Graph;
    var outputCreated = await _Graph.CreateDeviceOutputNodeAsync();
    if (outputCreated.Status != AudioDeviceNodeCreationStatus.Success)
        return;

    _Mixer = _Graph.CreateSubmixNode();
    _Mixer.AddOutgoingConnection(outputCreated.DeviceOutputNode);
    _Graph.Start();
}
</pre>

and then the current utterance is played with

<pre class="lang-cs prettyprint-override">class SpeechStreamPlayer : IDisposable
{
    internal static void Play(AudioGraph graph, AudioSubmixNode mixer, SpeechSynthesisStream speechStream)
    {
        if (!speechStream.ContentType.Equals("audio/wav", StringComparison.OrdinalIgnoreCase))
            throw new NotSupportedException("Content type: " + speechStream.ContentType);

        var stream = speechStream.AsStreamForRead();

        // Read the RIFF header
        uint chunkId = stream.ReadUint(); // "RIFF" - but in little-endian
        if (chunkId != 0x46464952)
            throw new NotSupportedException("Magic: " + chunkId);
        uint chunkSize = stream.ReadUint(); // Length of rest of stream
        uint format = stream.ReadUint(); // "WAVE"
        if (format != 0x45564157)
            throw new NotSupportedException("Stream format: " + format);

        // "fmt " sub-chunk
        uint subchunkId = stream.ReadUint();
        if (subchunkId != 0x20746d66)
            throw new NotSupportedException("Expected fmt sub-chunk, found " + subchunkId);
        uint subchunkSize = stream.ReadUint();
        uint subchunk2Off = (uint)stream.Position + subchunkSize;
        uint audioFormat = (uint)stream.ReadShort();
        uint chans = (uint)stream.ReadShort();
        uint sampleRate = stream.ReadUint();
        uint byteRate = stream.ReadUint();
        uint blockSize = (uint)stream.ReadShort();
        uint bitsPerSample = (uint)stream.ReadShort();

        // Possibly extra stuff added, so...
        stream.Seek(subchunk2Off, SeekOrigin.Begin);

        subchunkId = stream.ReadUint(); // "data"
        if (subchunkId != 0x61746164)
            throw new NotSupportedException("Expected data sub-chunk, found " + subchunkId);
        subchunkSize = stream.ReadUint();

        // Ok, the stream is in the correct place to start extracting data and we have the parameters.
        var props = AudioEncodingProperties.CreatePcm(sampleRate, chans, bitsPerSample);

        var frameInputNode = graph.CreateFrameInputNode(props);
        frameInputNode.AddOutgoingConnection(mixer);

        new SpeechStreamPlayer(frameInputNode, mixer, stream, blockSize);
    }

    internal event EventHandler StreamFinished;

    private SpeechStreamPlayer(AudioFrameInputNode frameInputNode, AudioSubmixNode mixer, Stream stream, uint sampleSize)
    {
        _FrameInputNode = frameInputNode;
        _Mixer = mixer;
        _Stream = stream;
        _SampleSize = sampleSize;

        _FrameInputNode.QuantumStarted += Source_QuantumStarted;
        _FrameInputNode.Start();
    }

    private AudioFrameInputNode _FrameInputNode;
    private AudioSubmixNode _Mixer;
    private Stream _Stream;
    private readonly uint _SampleSize;

    private unsafe void Source_QuantumStarted(AudioFrameInputNode sender, FrameInputNodeQuantumStartedEventArgs args)
    {
        if (args.RequiredSamples <= 0)
            return;

        System.Diagnostics.Debug.WriteLine("Requested {0} samples", args.RequiredSamples);

        var frame = new AudioFrame((uint)args.RequiredSamples * _SampleSize);
        using (var buffer = frame.LockBuffer(AudioBufferAccessMode.Write))
        {
            using (var reference = buffer.CreateReference())
            {
                byte* pBuffer;
                uint capacityBytes;

                var directBuffer = reference as IMemoryBufferByteAccess;
                ((IMemoryBufferByteAccess)reference).GetBuffer(out pBuffer, out capacityBytes);

                uint bytesRemaining = (uint)_Stream.Length - (uint)_Stream.Position;
                uint bytesToCopy = Math.Min(capacityBytes, bytesRemaining);

                for (uint i = 0; i < bytesToCopy; i++)
                    pBuffer[i] = (byte)_Stream.ReadByte();
                for (uint i = bytesToCopy; i < capacityBytes; i++)
                    pBuffer[i] = 0;

                if (bytesRemaining <= capacityBytes)
                {
                    Dispose();
                    StreamFinished?.Invoke(this, EventArgs.Empty);
                }
            }
        }

        sender.AddFrame(frame);
    }

    public void Dispose()
    {
        if (_FrameInputNode != null)
        {
            _FrameInputNode.QuantumStarted -= Source_QuantumStarted;
            _FrameInputNode.Dispose();
            _FrameInputNode = null;
        }
        if (_Stream != null)
        {
            _Stream.Dispose();
            _Stream = null;
        }
    }
}
</pre>

This works once. When the first utterance finishes, the StreamFinished?.Invoke(this, EventArgs.Empty); notifies the queue management system that the next utterance should be played, and the line

<pre class="lang-cs prettyprint-override">var frameInputNode = graph.CreateFrameInputNode(props);
</pre>

throws an Exception with message Exception from HRESULT: 0x88960001. A bit of digging shows that <a href="https://msdn.microsoft.com/en-us/library/windows/desktop/ee419234(v=vs.85).aspx" rel="nofollow">it corresponds to XAUDIO2_E_INVALID_CALL</a>, but that's not very descriptive.

In both cases the parameters passed to AudioEncodingProperties.CreatePcm are (22050, 1, 16).

How could I find out more detail about what went wrong? In the worst case I suppose I could throw the whole graph away and build a new one each time, but that seems rather inefficient.

Answer1:

The problem seems to be in

<blockquote>

When the first utterance finishes, the StreamFinished?.Invoke(this, EventArgs.Empty); notifies the queue management system that the next utterance should be played

</blockquote>

Although the documentation for <a href="https://docs.microsoft.com/en-us/uwp/api/windows.media.audio.audioframeinputnode#Windows_Media_Audio_AudioFrameInputNode_QuantumStarted" rel="nofollow">AudioFrameInputNode.QuantumStarted</a> doesn't say anything about forbidden actions, the docs for <a href="https://docs.microsoft.com/en-us/uwp/api/Windows.Media.Audio.AudioGraph#Windows_Media_Audio_AudioGraph_QuantumStarted" rel="nofollow">AudioGraph.QuantumStarted</a> say

<blockquote>

The QuantumStarted event is synchronous, which means that you can't update the properties or state of the AudioGraph or the individual audio nodes in the handler for this event. Attempting perform an operation such as stopping the audio graph or adding, removing, or starting an individual audio node will result in an exception being thrown.

</blockquote>

It appears that this applies also to the node's QuantumStarted event.

The simple solution is to move the graph manipulation to another thread with

<pre class="lang-cs prettyprint-override">Task.Run(() => StreamFinished?.Invoke(this, EventArgs.Empty));
</pre>
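To make the pattern concrete without needing a UWP project, here is a minimal console sketch. `FakeGraph`, `RaiseQuantum` and `CreateNode` are hypothetical stand-ins, not part of the real API, and the thread-based check inside the mock is only an assumption made so the demo behaves deterministically; it is not a claim about how AudioGraph detects the situation. The point is just the shape of the fix: the handler schedules the graph change rather than performing it.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for AudioGraph: it rejects structural changes
// attempted from inside its own synchronous callback.
class FakeGraph
{
    // Assumption for the demo only: "inside the callback" is tracked per thread.
    [ThreadStatic] private static bool _inCallback;

    public event Action QuantumStarted;

    public void RaiseQuantum()
    {
        _inCallback = true;
        try { QuantumStarted?.Invoke(); }
        finally { _inCallback = false; }
    }

    public string CreateNode()
    {
        if (_inCallback)
            throw new InvalidOperationException("graph modified inside callback");
        return "node";
    }
}

static class Demo
{
    static void Main()
    {
        var graph = new FakeGraph();

        // 1. Changing the graph directly from the handler is rejected.
        Action direct = () =>
        {
            try
            {
                graph.CreateNode();
                Console.WriteLine("direct: ok");
            }
            catch (InvalidOperationException)
            {
                Console.WriteLine("direct: rejected");
            }
        };
        graph.QuantumStarted += direct;
        graph.RaiseQuantum();
        graph.QuantumStarted -= direct;

        // 2. The handler only schedules the change; it runs later on a
        //    thread-pool thread, outside the callback.
        var done = new TaskCompletionSource<string>();
        graph.QuantumStarted += () =>
            Task.Run(() => { done.TrySetResult(graph.CreateNode()); });
        graph.RaiseQuantum();

        Console.WriteLine("deferred: " + done.Task.Result);
    }
}
```

In the question's code, the equivalent change is confined to `Source_QuantumStarted`: the direct `StreamFinished?.Invoke(...)` call becomes the `Task.Run` call shown above. Note that subscribers to `StreamFinished` will then run on a thread-pool thread, so any queue state they touch needs to be safe to access from there.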
