Skype Journal: ETel: They told me
January 25, 2006 06:26 AM
I was really impressed by the pitch from Tellme earlier today. (Although, despite them being great voice usability experts, their web site is a pile of Flash-infested crud.) I’d assumed IVR stuff was dull. On the basis that you get the best insights from the session that superficially looks least attractive, I selected this one. Turns out it was a good choice.
I really liked the idea of telephony as the most intimate medium: a whisper in your ear. But what Tellme really have cracked is making the IVR experience much closer to interacting with a human, and not a string of audio files tacked together with some shell scripts. They played lots of examples of really, truly awful IVR experiences, and then showed what they did to fix them.
This is important because good experiences drive real business. The customer’s impression of your business and brand is derived right from the experience they have. The example Tellme gave was UPS. Their old IVR system was very, very slow. The messages went on and on, slowly read. Do you really want UPS to be associated with “slow”? Thought not. Most of all, a good experience creates trust in your brand.
The first thing they’ve cracked is making the voice experience more seamless. They’ve created vast libraries of all sorts of clever combinations of phrases, which get blended together by cognitive psychology and linguistics experts. And the result is super-impressive.
Their voice libraries go beyond what’s known as “single prosody”, the old-style IVR where you heard broken-up phrases glued together like “departing | Saturday. | July. | 22nd.” Instead they have multiple prosody: “departing | Saturday, | July 22nd” (note the comma after Saturday). It works. But they’ve had to record over 37,000 WAV files just to read back numbers!
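To make the multiple-prosody idea concrete, here is a minimal sketch of picking a differently intoned recording for a segment depending on where it falls in the phrase. The two contour labels, the file-naming scheme and the function names are my own assumptions for illustration, not anything Tellme showed.

```python
# Hypothetical clip naming: "<text>__<contour>.wav". Not Tellme's actual scheme.
def clip_name(text: str, contour: str) -> str:
    slug = text.lower().replace(" ", "_")
    return f"{slug}__{contour}.wav"


def plan_readback(segments: list[str]) -> list[str]:
    """Pick one recording per segment based on its position in the phrase.

    Assumption: only two contours exist. Every segment except the last gets
    a "continuing" (comma-like) reading; the last gets a "final"
    (full-stop-like) reading.
    """
    last = len(segments) - 1
    plan = []
    for i, text in enumerate(segments):
        contour = "final" if i == last else "continuing"
        plan.append(clip_name(text, contour))
    return plan


print(plan_readback(["departing", "Saturday", "July 22nd"]))
# ['departing__continuing.wav', 'saturday__continuing.wav', 'july_22nd__final.wav']
```

A single-prosody system would, in these terms, play the "final" clip for every segment, which is roughly why it sounds like a string of full stops.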
They’ve also cracked “points of co-articulation.” You can’t record every possible combination of terms. So you record the first term followed by an example second term starting with one of the 40 phonemes in English, such as “Hi John”. Then you record all the possible second terms: “James”, “Jim”, etc. Then you splice the right second term in place of the example one. Again, the result is impressive.
You really can tell the difference in terms of comprehension and memory retention.
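The co-articulation trick can be sketched roughly as follows: pick the recording of the first term that was made in front of a word starting with the same sound as the real second term, then follow it with the second term's own recording. The tiny lexicon, the phoneme labels and the file names are all hypothetical; a real system would use a pronunciation dictionary and actual audio splicing.

```python
# Hypothetical toy lexicon: second term -> its initial phoneme (ARPAbet-style label).
INITIAL_PHONEME = {
    "John": "JH",
    "James": "JH",
    "Jim": "JH",
    "Mary": "M",
    "Mike": "M",
}


def carrier_clip(first_term: str, phoneme: str) -> str:
    """Recording of the first term, cut from an utterance whose second word
    started with the given phoneme (e.g. "Hi" cut out of "Hi John")."""
    return f"{first_term.lower()}__before_{phoneme}.wav"


def plan_splice(first_term: str, second_term: str) -> list[str]:
    """Return the two clips to play back to back for "first second"."""
    phoneme = INITIAL_PHONEME[second_term]
    return [carrier_clip(first_term, phoneme), f"{second_term.lower()}.wav"]


print(plan_splice("Hi", "James"))
# ['hi__before_JH.wav', 'james.wav']
```

The payoff is in the recording count: for one greeting and N names you need at most 40 carrier clips plus N name clips, rather than a separate recording for every greeting and name pair.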
They also did a great pitch on optimising the usability of IVR systems. The phone is a linear presentation, and taxes short-term memory. You don’t have a 2D screen with bold, drop-down boxes, etc. The boundaries are also invisible. It’s not like the Web. There’s a strong “recency effect”: the last thing said (“press 0 for operator”) is the first thing remembered.
So they have a bag of tricks. Personalise. For example the sports team “squeaked by” if you support that side vs. “lost a close one to” if you don’t. They “instruct as you go”, deferring navigation instructions to the time they’re needed. (Lazy evaluation always deserved a comeback…) They use “progress markers” - “First, tell us…” “Next,” “Lastly”. Adopt colloquial language, not written English. Optimise to meet user goals, not sub-tasks. And so on.
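Two of those tricks, personalisation and progress markers, are simple enough to sketch. The team names, function names and the choice to give neutral callers the winner's phrasing are my assumptions, not Tellme's implementation.

```python
from typing import Optional


def score_line(winner: str, loser: str, fan_of: Optional[str] = None) -> str:
    """Word a close result from the caller's point of view.

    Assumption: callers with no recorded allegiance hear the winner's phrasing.
    """
    if fan_of == loser:
        return f"The {loser} lost a close one to the {winner}."
    return f"The {winner} squeaked by the {loser}."


def with_progress_markers(prompts: list[str]) -> list[str]:
    """Prefix each prompt with "First", "Next" or "Lastly" so the caller
    knows how far through the dialogue they are."""
    last = len(prompts) - 1
    marked = []
    for i, prompt in enumerate(prompts):
        if i == 0:
            marker = "First"
        elif i == last:
            marker = "Lastly"
        else:
            marker = "Next"
        marked.append(f"{marker}, {prompt}")
    return marked


# Hypothetical teams and prompts, purely for illustration.
print(score_line("Giants", "Eagles", fan_of="Eagles"))
# The Eagles lost a close one to the Giants.
print(with_progress_markers(["tell us your city.", "tell us the date.", "confirm the time."]))
# ['First, tell us your city.', 'Next, tell us the date.', 'Lastly, confirm the time.']
```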
I’ve glossed over a lot of interesting detail, and good stuff. If only they could put up a few corporate blogs and share their cool innovations and work on an everyday basis!
Martin's other tales are told on his Telepocalypse blog.