SpeechTEK trend – Alternative A/B call flow design
One very interesting trend that came up a handful of times during this year’s SpeechTEK East Conference was the use of Alternative A/B Call Flow designs, pushed by companies such as SpeechCycle and Google.
The basic premises of A/B Call Flow Design are that designs shouldn’t use a single path for all interactions, and that those paths should be driven by actual data.
And it definitely makes sense. Think about it. When you’re designing a call flow path, you’re basically implementing a version of all the assumptions that took place during the Context Gathering phase of your project. You gathered data, considered business requirements and user needs, monitored calls, and drafted what you considered to be the ideal design for that organization (which you hopefully also usability tested).
Unfortunately, those types of designs rely on data that could be coming from disparate systems or from existing designs (which we cannot guarantee are an accurate reflection of users’ behaviors), or that is sometimes flat-out missing. So what can we do?
The process is pretty straightforward. You create two or more alternate versions of a design (A and B), each using slight variants of prompting and/or call flow, and each instrumented with a metric tracking system that follows a call and assesses whether it was successful or not (based on criteria such as caller-generated transfers, call length, etc.). Then, a randomization script receives all calls and routes callers through the alternate paths. Finally, the VUI designer analyzes the data and is able to conclude (with actual data and statistics) which of the various alternatives yields the best results… taking some of the faith out of the design phase.
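To make the steps above a bit more concrete, here is a minimal Python sketch of the three pieces: routing calls to a variant, scoring each call, and comparing the variants statistically. Everything in it is an assumption for illustration only: the function names, the call record fields (`variant`, `transferred`, `duration_s`), the 180-second success threshold, and the choice of a two-proportion z-test are mine, not something presented at the conference.

```python
import hashlib
import math

VARIANTS = ["A", "B"]  # hypothetical design variants

def assign_variant(call_id, variants=VARIANTS):
    """Map a call ID to a variant. Hashing gives a stable, roughly even split."""
    digest = hashlib.sha256(call_id.encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]

def is_success(call, max_seconds=180.0):
    """Assumed success criterion: no caller-generated transfer and a short call."""
    return (not call["transferred"]) and call["duration_s"] <= max_seconds

def summarize(calls):
    """Return {variant: (successes, total_calls)} from a list of call records."""
    summary = {}
    for call in calls:
        successes, total = summary.get(call["variant"], (0, 0))
        summary[call["variant"]] = (successes + int(is_success(call)), total + 1)
    return summary

def two_proportion_z(s_a, n_a, s_b, n_b):
    """Two-sided z-test for a difference in success rates between A and B."""
    pooled = (s_a + s_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0  # both rates are identical at 0% or 100%
    z = (s_a / n_a - s_b / n_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

In practice you would call `assign_variant` when the call arrives, log the outcome when it ends, and run the comparison only once each variant has handled enough calls for the test to be meaningful.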
For example, imagine you have a “Say Anything” or “Speak Freely” type of system in place (which uses Statistical Language Models, or SLMs, to accurately route callers to specific destinations) and you’re wondering whether prefacing it with just a “How can I help you?” prompt will be enough for callers to understand what’s expected of them. Other alternatives of course include expanding that prompt to include a set of examples (sample phrases callers can say), a set of choices (options callers can choose from) or a combination of both. And on top of that, we could also be wondering how many examples and/or choices we should be presenting. Pretty tough, right?
This type of strategy is also very helpful when dealing with Main Menus. Sometimes the Information Architecture of a system is pretty obvious, but sometimes it allows for alternate designs. For example, on a financial application, some callers think about their institutions in terms of the products they offer (checking, savings, loans, credit cards, etc.), while some others think about them in terms of the services they offer (getting balances, making payments, executing transfers, etc.). So again, we could implement two alternate versions of a main menu, each using a different focus, and track the performance of each in real time. We would then be able to identify the one that yields the best results (completion rates, call lengths, transfer volumes, etc.).
Up until now, most designs followed a “me too” strategy, simply replicating what other systems out in the field were doing, and then had to wait until a full pilot analysis was completed before the state of the system could be assessed. But now, with A/B designs, systems themselves can provide the data we need to make those decisions and, in a way, even learn by themselves and self-tune the interaction using reinforcement learning.
Have you tried something similar before? If so, have you found other types of uses for it? And what were your results?
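As a rough illustration of what that kind of self-tuning could look like, here is a small sketch of an epsilon-greedy router, one of the simplest reinforcement learning strategies. This is my own assumption of how such a system might work, not a description of any vendor’s implementation; the class name, the `epsilon` value and the success signal are all hypothetical.

```python
import random

class EpsilonGreedyRouter:
    """Sends most calls to the best-performing variant, but keeps exploring."""

    def __init__(self, variants, epsilon=0.1, seed=None):
        self.epsilon = epsilon  # fraction of calls routed at random
        self.counts = {v: 0 for v in variants}
        self.successes = {v: 0 for v in variants}
        self.rng = random.Random(seed)

    def rate(self, variant):
        """Observed success rate for a variant (0.0 if it has no calls yet)."""
        n = self.counts[variant]
        return self.successes[variant] / n if n else 0.0

    def choose(self):
        """Pick the variant for the next incoming call."""
        for variant, n in self.counts.items():
            if n == 0:
                return variant  # try every variant at least once
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.counts))  # explore
        return max(self.counts, key=self.rate)  # exploit the current best

    def record(self, variant, success):
        """Feed back the outcome of a finished call."""
        self.counts[variant] += 1
        self.successes[variant] += int(success)
```

A router like this would call `choose()` at the start of each call and `record()` at the end. Unlike a fixed 50/50 split, it gradually shifts traffic toward the better variant, at the cost of making a clean statistical comparison harder.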
Eduardo,
Thanks for the shout-out! It was good to see you. Hope Nuance is treating you well.
Phillip
Hey Eduardo.
Glad you saw the talk. Just wanted to add that this approach can be used by Speech Science as well. Our speech scientists have compared grammar performance using a random split, while holding the VUI constant.
As far as VUI, we are also currently running an experiment to observe the effects of variations in opt-in prompt wording (By “opt-in prompt” I mean a prompt that gives the caller the option of working with us or waiting in line for an agent). No data yet.
Cheers,
jon bloom