Language-based interfaces are hip right now. Both the iPhone and Android have recently been equipped with the ability to listen to you talk. So the technology seems to have reached a level where it’s ready to hit the consumer market, making it easier to communicate with certain devices and services. But the technology alone won’t do the job. It needs an appropriate interface for us to make sense of it. Otherwise we won’t even realize that a specific device is able to understand our commands.
For instance, the Nokia N Series (N73, N80, N95, etc.) has included “voice recognition” for several years, but only very few people know about that particular feature (or even use it). So we asked ourselves some questions: What are the drawbacks of current voice interfaces? How can we get the human touch into language-based technology? And how do we prototype these kinds of interactions in order to come up with new solutions?
You talking to me? Using a visual recipient for voice commands
One problem we identified is the “recipient problem”: Who am I talking to? In which direction do I need to talk? Talking to a screen seems weird to us. The solution we came up with is a metaphor: in our case, a white rabbit.
Look me in the eyes! Activating devices by looking at them
“The system should always keep users informed about what is going on, through appropriate feedback within reasonable time.” (Jakob Nielsen, 10 Usability Heuristics)
Another problem we identified: the feedback! Is the device actually listening to me right now? Will the device listen to my commands all the time? Is it active? So we came up with two modes (quiet and listening) and a simple way of switching between them: using your eyes.
Our setup: a microphone, a webcam, MacSpeech Dictate, a Nabaztag and a bit of hacking mindset. The quick solution for our rough prototype: use a rabbit that knows your face! We’ve ‘augmented’ our Nabaztag bunny with a webcam and used a face detection algorithm through Processing to find out when somebody is looking straight into the eyes of our little bunny.
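To give an idea of how this can be wired up, here’s a minimal Processing sketch of the gaze-triggered mode switch. It assumes the OpenCV for Processing library (gab.opencv) and Processing’s video library; the `listening` flag and the on-screen indicator are illustrative stand-ins for whatever the rabbit actually does, so take it as a sketch of the idea rather than our exact prototype code:

```java
import gab.opencv.*;
import processing.video.*;
import java.awt.Rectangle;

Capture cam;
OpenCV opencv;
boolean listening = false;  // the two modes: quiet vs. listening

void setup() {
  size(640, 480);
  cam = new Capture(this, 640, 480);
  opencv = new OpenCV(this, 640, 480);
  // Haar cascade for frontal faces: a rough proxy for
  // "somebody is looking straight into the bunny's eyes"
  opencv.loadCascade(OpenCV.CASCADE_FRONTALFACE);
  cam.start();
}

void draw() {
  if (cam.available()) cam.read();
  image(cam, 0, 0);

  opencv.loadImage(cam);
  Rectangle[] faces = opencv.detect();

  // Switch modes: listening only while a face looks at the camera
  listening = faces.length > 0;

  // Draw detected faces for debugging
  noFill();
  stroke(0, 255, 0);
  for (Rectangle face : faces) {
    rect(face.x, face.y, face.width, face.height);
  }

  // Feedback on the current mode (in the prototype this is where
  // the rabbit would signal that it's ready for voice commands)
  fill(listening ? color(0, 255, 0) : color(255, 0, 0));
  noStroke();
  ellipse(width - 30, 30, 20, 20);
}
```

In the actual setup, the voice recognition side (MacSpeech Dictate listening on the microphone) is only acted upon while the sketch is in listening mode, which is exactly the feedback loop the heuristic above asks for.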
We captured a video to document the setup and two simple example implementations of voice commands.
All these findings and questions are certainly worth exploring further. We’ll keep our ears to the ground and watch for signals and innovations that will support these kinds of interactions.