The Mercedes-Benz User Experience (MBUX), Mercedes-Benz’s new human-machine interface and automotive assistant with intelligent voice controls, is one of a kind in more than one way. It got great reviews in the media, where reviewers called the system “smarter than the average Siri or Alexa” because it understands and continuously learns the needs and preferences of users and provides an increasingly personalized and connected experience. It’s also one of the first automotive systems to feature a customized wakeup word. And the premium interface is being introduced in an entry-class vehicle: the all-new Mercedes A-Class.
For me personally, one thing about this system stands out above the rest: It’s the first voice-enabled in-car system launched during my time at Nuance. You can imagine that my expectations were quite high when I got the chance to experience the system myself while we were shooting a short video showcasing the most important voice-enabled use cases. Full disclosure: These expectations were fully met, and I was so excited about the system that I even agreed to add another first-time experience to my résumé: I made my debut in front of a camera.
But I’m not here to talk about myself or lay the foundation for a great acting career: Let’s look at the technical details and key technologies that enable the performance of this voice system. Fortunately, Udo Haiber, VP Automotive Engineering and Services at Nuance, agreed to have a thorough look at the system with me.
MBUX uses Dragon Drive’s hybrid approach, which combines embedded and cloud services to achieve the best results and to offer a fallback when driving in areas with little or no network coverage. At the same time, the system is deeply integrated with the car itself and processes as much data as possible in the vehicle. As a result, in-car functions can be controlled without compromising data security, all while enhancing driving safety.
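To make the fallback idea a bit more concrete, here is a minimal Python sketch of a "cloud first, embedded as fallback" flow. It is only an illustration of the general pattern, not how Dragon Drive is actually built: the function `recognize_hybrid`, the `network_available` flag, and the stand-in recognizers are all hypothetical names made up for this example.

```python
from typing import Callable, Tuple


def recognize_hybrid(
    utterance: str,
    cloud: Callable[[str], str],
    embedded: Callable[[str], str],
    network_available: bool,
) -> Tuple[str, str]:
    """Return (source, result), preferring the cloud and falling back to the car.

    All names here are hypothetical; this is a sketch of the pattern only.
    """
    if network_available:
        try:
            return "cloud", cloud(utterance)
        except ConnectionError:
            # Coverage dropped mid-request: fall through to the embedded engine.
            pass
    return "embedded", embedded(utterance)


# Stand-in recognizers used purely for demonstration.
def fake_cloud(text: str) -> str:
    return f"cloud understood: {text}"


def flaky_cloud(text: str) -> str:
    raise ConnectionError("no coverage")


def fake_embedded(text: str) -> str:
    return f"embedded understood: {text}"


if __name__ == "__main__":
    print(recognize_hybrid("navigate home", fake_cloud, fake_embedded, True))
    print(recognize_hybrid("navigate home", flaky_cloud, fake_embedded, True))
    print(recognize_hybrid("navigate home", fake_cloud, fake_embedded, False))
```

The point of the sketch is that the driver gets an answer in every case; only the source of that answer changes with the network situation.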
“Hey Mercedes” wakeup word: One small word for the user, one giant leap for driving safety
MBUX is one of the first in-car systems to offer a unique, brand-specific wakeup word, a technology widely known from smart assistants on mobile phones and in the connected home.
Natural Language: Truly conversational interface beats stereotypes
MBUX understands natural language and implicit commands in 23 languages, including different accents, slang words, and dialects. In addition, the system adapts its dialogue output to the user’s questions and follows the conversation: It has the “memory” to recall what a user said earlier and to resolve references to it. Drivers and passengers can bypass lengthy dialogues and recurring commands by interrupting the system, a feature that mimics human-to-human speech dynamics.
“Studies in our Drive Lab confirm that using voice is less distracting than pushing a button or touching a screen,” explains Udo Haiber.
Broad domain coverage
Nuance’s Dragon Drive platform, the core technology of all MBUX speech features, supports questions across more than 500 domains. “From a technical point of view, the abundance of domains is very challenging to manage as certain trigger words can potentially belong to a series of domains,” explains Udo Haiber. For example, the trigger word “cold” in the implicit command “I am cold” could refer to the weather outside as well as to the in-vehicle air conditioning. “The system needs a broad contextual understanding to address the right domain,” Udo Haiber explains. “Thanks to what we call ‘hybrid arbitration,’ the system understands the context and activates the corresponding function or domain – either in the car or in the cloud.”
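The “cold” example can be sketched in a few lines of Python. Please read this as a toy illustration of the arbitration problem, not as Nuance’s actual method: the domain names, the context cue words, and the simple word-overlap score are all assumptions I made up for the example. A production system would rely on far richer context and statistical models.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical data: which domains a trigger word may belong to,
# and where each domain is handled (in the car or in the cloud).
TRIGGERS = {"cold": ["climate_control", "weather"]}
DOMAIN_LOCATION = {"climate_control": "car", "weather": "cloud"}

# Hypothetical context cues that hint at one domain or the other.
CONTEXT_CUES = {
    "climate_control": {"i", "i'm", "me", "feel"},
    "weather": {"outside", "today", "tomorrow", "forecast"},
}


@dataclass
class Candidate:
    domain: str    # e.g. "climate_control" or "weather"
    location: str  # "car" or "cloud"
    score: int     # number of context cues found in the utterance


def arbitrate(utterance: str) -> Optional[Candidate]:
    """Pick the candidate domain whose context cues best match the utterance."""
    words = {w.strip(".,!?") for w in utterance.lower().split()}
    candidates = []
    for trigger, domains in TRIGGERS.items():
        if trigger in words:
            for domain in domains:
                score = len(words & CONTEXT_CUES[domain])
                candidates.append(Candidate(domain, DOMAIN_LOCATION[domain], score))
    if not candidates:
        return None  # no known trigger word in this utterance
    # On a tie, max() keeps the first candidate in the list.
    return max(candidates, key=lambda c: c.score)


if __name__ == "__main__":
    print(arbitrate("I am cold"))
    print(arbitrate("Is it cold outside tomorrow?"))
```

In this sketch, the first-person phrasing of “I am cold” tips the decision toward the in-car climate domain, while words like “outside” or “tomorrow” steer the same trigger word toward the cloud-based weather domain.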
In-car office and advanced mobile phone operation
The MBUX in-car office capabilities give users access to important data and office functions such as telephone conferences and appointments. Messages can be dictated and sent in a single step, and the advanced text-to-speech capabilities allow the system to read emails and messages aloud.
Constantly learning system
MBUX is a learning system: New relationships can be added that make commands like “call my brother” or “send a message to my boss” possible, even if these designations are not stored in a user’s mobile contacts. “Thanks to its cloud connection, the system is constantly being updated over-the-air,” says Haiber. “Voice control is enriched with new words and language evolution over time, and new domains are added to broaden the spectrum of information available to the driver at any time to ensure the vehicle can always include state-of-the-art capabilities.”
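For readers who like to see the idea in code, here is a small Python sketch of how learned relationships such as “my brother” could sit on top of an ordinary contact list. This is an assumption-laden illustration, not the MBUX implementation: the class `ContactResolver`, its methods, and the sample contacts and numbers are all invented for the example.

```python
from typing import Dict, Optional


class ContactResolver:
    """Hypothetical resolver that maps learned labels to stored contacts."""

    def __init__(self, contacts: Dict[str, str]) -> None:
        self.contacts = dict(contacts)            # contact name -> phone number
        self.relationships: Dict[str, str] = {}   # learned label -> contact name

    def learn(self, label: str, name: str) -> None:
        """Remember that a label such as 'brother' refers to a stored contact."""
        if name not in self.contacts:
            raise KeyError(f"unknown contact: {name}")
        self.relationships[label.lower()] = name

    def resolve(self, spoken: str) -> Optional[str]:
        """Return the number for 'my brother' or a plain contact name, if known."""
        key = spoken.lower().strip()
        if key.startswith("my "):
            key = key[3:]
        name = self.relationships.get(key)
        if name is None:
            # Fall back to a case-insensitive match on the stored contact names.
            for contact in self.contacts:
                if contact.lower() == key:
                    name = contact
                    break
        return self.contacts[name] if name is not None else None


if __name__ == "__main__":
    # Made-up sample data for demonstration only.
    resolver = ContactResolver({"Jonas Weber": "555-0100", "Maria Keller": "555-0101"})
    print(resolver.resolve("my brother"))   # not learned yet
    resolver.learn("brother", "Jonas Weber")
    print(resolver.resolve("my brother"))   # now resolves to Jonas Weber's number
```

The design choice worth noting is that the relationship labels live in their own mapping, so the phone’s contact list stays untouched while the assistant still learns who “my brother” or “my boss” is.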