February 19, 2021

By Cassandra Lee

As part of an ongoing initiative to bring new experiences into the car, we recently launched Cerence Pay, a new app for mobility assistants that allows drivers to purchase things like coffee or gas while driving. As the app’s lead user experience (UX) designer, one of the first things I asked myself when embarking upon this project was whether such an idea would be valuable to users. I have to admit, I was pretty skeptical. It’s already easy to tap a credit card or use a phone e-wallet, so above and beyond existing payment tools, how could a voice-controlled e-wallet add new value for drivers? To ground my creative process in real driver feedback, I launched a series of user surveys, interviews, and prototypes to understand if and why drivers would adopt a new payment concept in the car.

The results provided an intriguing insight. Preliminary data from North American and European drivers suggested that most users are fairly interested in trying in-car payments by voice. In an online survey of 113 drivers, 51% selected voice as the preferred way to make purchases while in the car. When asked to explain their choice, participants stated that interfacing with their vehicle seemed quicker and easier than using a machine or a cashier. Paying in your car means you don’t have to wait in lines or fuss around with change. It also provides an alternative should you forget your phone and wallet at home.

But the most compelling insight from participants was their belief that a voice assistant could uniquely bring highly desirable context into their purchasing decisions. We make thousands of choices every day. The context in which we make these decisions is shaped by our personal situations, mindsets, and understanding of the system at hand. Let’s say I’m buying tickets for an experience at a museum. Given my schedule that day, I might arrive 15 minutes late for my allotted timeslot. If I don’t arrive on time, will I still be allowed to participate? The answer to that question may change whether I make a purchase at all, and what I choose to buy.

Also consider that people deviate from default behaviour for simple and unpredictable reasons like good weather or spur-of-the-moment plans. On an average day, we might choose to buy take-out from cheap, quick, and familiar restaurants. But on a different day, for whatever reason, we’re in the mood to try new flavours from a restaurant outside our comfort zone. A machine which empowers users to flex spontaneous desires and break out of usual search patterns provides value far beyond facilitating transactions. 

Cerence Pay

Voice assistants empower users to make decisions on more personal and contextual grounds than traditional user interfaces, because natural language affords users the ability to bring that context into the interaction. Consider the simple use case illustrated in the video above. A driver pulls into a street parking spot in an unfamiliar area. She begins the interaction by asking the assistant: “Can I park here?” The assistant responds, “Yes,” lets her know how much it costs, and finishes the transaction in just two steps. When study participants watch this video, they are not especially drawn to how easily the assistant processes the payment – this is straightforward and already expected of modern technology. For most users, the primary appeal of this product is the ability to ask: “Can I park here?”

“First, I like how the car tells me whether I'm required to pay or not. Sometimes it is ambiguous. Second, I like that I don't have to even leave my car or search for a meter, or take out money (change is hard to find), or enter a password, or anything else like that: I don't waste time. I just simply authorize the transaction, and I'm ready to go. Fabulous.” 
– Study participant

I am hardly surprised participants react strongly to this feature. Who hasn’t agonized over finding a good parking spot, only to feel duped by an unclear parking sign or confused in an unfamiliar area? Having an assistant directly answer: “Yes. You can park here until 6pm for $3.50 per hour” translates otherwise hard-to-use parking data into the user’s own terms. Completing the payment transaction from there is just the cherry on top.

I can point to many machine-mediated transactions which fail to engage us on a human level. How many times have you searched for answers on a bus map or a long settings page and wished you could just ask another person your question instead? Speech recognition is well adapted for these kinds of problems because it is a highly efficient input modality: we expend fewer cognitive resources posing a question out loud than searching for it on a screen. It is this capability which makes voice assistants good partners in making decisions, because users can frame their tasks in their own language, rather than the convoluted language of the machine.

Technologists, particularly in the e-commerce domain, tend to promote products and services which make experiences faster and more convenient. But beyond speed and convenience, consumers still value technology which makes their experiences more personal, meaningful, and engaging. As consumers' expectations of convenient digital shopping experiences grow, we must consider how to deepen these experiences. Cerence Pay is well on its way to making it into vehicles in the coming years. As we move ahead with development, our vision is to continually ask ourselves: how can this experience be more than just convenient?
