Speech-to-text algorithms have evolved, and now, conversational AI can engage with customers while identifying speakers and labeling their contributions. NVIDIA® Riva fuses multi-sensor audio and vision data into a single stream of information that can be used for advanced transcription, like differentiating multiple voices in real time.