We are often asked why our robot kiosk Heasy doesn’t talk. After all, other robots already have this ability and we see more and more digital assistants like Alexa or Siri where interactions are only verbal. Are we just a bunch of lazy guys?
« It’s not a bug, it’s a feature! »
People in the software industry are used to this joke. In our case it’s true. Not speaking is not a bug. Having a robot that does not talk is exactly what we wanted from the beginning for our robot kiosk. Let us explain the reasons behind this choice.
Can you repeat, please?
Yes, Alexa works great, its speech recognition engine is awesome, but it works at your place, in a quiet environment. Things get a little trickier when the interaction is happening in a public area. It’s still hard to extract a specific voice from a noisy environment, even with the latest microphones and good software. We believe this is something that is going to be possible in the future at some point, but we’re just not there yet.
The consequence is that the robot works great during rehearsals and internal tests, but can’t handle field tests. After failing to understand you once, the robot will ask you to repeat again and again, and you’ll end up leaving and thinking the robot is stupid. Or, when it’s equipped with one, it asks you to use a touch screen for the interaction. Either way, a talking robot creates an expectation that it fails to meet, so the interaction is disappointing.
By having a robot kiosk that doesn’t talk, we don’t create the expectation of a verbal interaction. The wow effect is of course lower at first, but this way we make sure that every interaction ends on a positive note, since the robot is able to deliver every time.
The map or the itinerary?
Speaking isn’t always the best way to share information. Let’s say you’re in a mall and want to know how to get to a specific point. It’s easier to look at a map showing you the path to follow than to try to remember a list of spoken instructions.
Another example: you’re attending a conference and you’re interested in the agenda. Would you listen to a voice reading all the information, session after session after session? Would you enter into a voice-based search and listen to the results? Or would you rather be able to navigate a visual calendar?
For a lot of actions, voice is not really ideal for us and for our brains. Images are extremely powerful in our world. After all, don’t we say that a picture is worth a thousand words?
Don’t waste my time
Reading an article in the newspaper or online is faster than listening to someone reading the very same article to you. We have all experienced this.
Looking at the average interaction time with Heasy, we can see that people spend 2 minutes with it. It doesn’t mean that they would spend 5 minutes if we added more things to the robot. It means that they don’t have more time to dedicate to the interaction: if they’re at the mall, they came first and foremost to shop; if they’re in an airport, they came to catch a plane, etc. We must make sure that we get as much done as we can during this two-minute timeframe. With this in mind, relying on voice interaction would reduce what Heasy can achieve in that time.
By now, you should all know about Alexa, the digital assistant embedded in the Amazon Echo. It’s all vocal, right? You talk to the speaker and that’s it. Alexa was released in the US in 2015, and in France in 2018. That means it took Amazon three years to be ready to address the French-speaking market. And we’re talking about Amazon and its almost infinite resources. A giant compared to us.
From the beginning, there was no doubt for us that people need to interact with a robot kiosk in their native language. But verbal interactions would slow us down, since training a speech model to be relevant in a specific language takes time. It would take time from us, but also from our clients when translating their apps. So that’s another reason for not having Heasy speak and interact verbally. Instead, we decided to give it the ability to express basic feelings that can be recognized regardless of the language spoken by the user: love with hearts in the eyes and a sweet little sound, sadness with tears, etc. This allows Heasy both to express basic emotions and to strengthen its bond with the user.
As we’ve seen, there are lots of elements that led us to make a robot that doesn’t talk. Our goal when creating our robot kiosk Heasy was to make sure that every user would finish an interaction with it fully satisfied with the experience. Voice recognition isn’t reliable enough for our use case in public areas, and having the robot speak is a slow and inefficient way to share information. As of today, we’re still convinced it is a good choice. In the end, it benefits both our clients and the people using Heasy.