Category Archives: Tools

Tools for VUI design.

More on documentation and creativity

It was very interesting that we were just recently talking about what it would take to make documentation actually useful, and a couple of weeks later I heard about a new book by the GuiMags guys (creators of a magnet-based prototyping tool for interface design) titled “The Unplugged.”

What I find very interesting about it is that as part of being a UI designer, I’ve always had a passion for creativity in general — tools, techniques, case studies, etc. — and one of the common themes that comes up in relation to Creativity and Design is that sometimes, the only way to design something right is to go analog.

Analog? Let me clarify. What I mean by that is that before you write your first line of code, before you create the first page of that design document, before you draw that first block in your call flow, you need to turn the computer off, step away from it (yes, laptops too), grab a pad of paper and a pencil, and start drafting your ideas.

I know it sounds counterintuitive, but sometimes all those software tools we’re so used to in fact hinder our creativity because they force us to adapt our way of thinking to the inherent rules and restrictions of each individual software package. We’re visual creatures: anyone can take a napkin and a pen and quickly sketch a new idea or a solution to a problem (this is my favorite book on this subject), yet those same individuals often remain quiet during design meetings and let the “creative” members take the lead role since they can create breathtaking digital images and amazing presentations with digital effects.

Next time you’re faced with a new design challenge, give it a try, unplug yourself, go analog for a little bit and let your brain and hand take control. Great things will emerge, believe me.

Making Documentation actually useful

I was recently reading an article about the future of wireframes in the context of user interface design documentation. Wireframes have been used mostly for visual elements and became a critical building block in the early days of the web.

But since I like drawing analogies between other UI fields and the VUI field, there were a few quotes that struck a chord because of their universality:

“The object was to create as many wireframes as possible, of every screen in the entire site, in big, monolithic and hugely detailed chunks. Rather than exploring different approaches to the information and structure of the site, the emphasis became entirely focused on using all of the time available to build a collection of wireframes, regardless of whether they were the right wireframes.”

Ouch, how much of that is still taking place nowadays? You create as many “detailed individual states” as possible, sometimes losing track of the real intent of the document. Nevertheless, some of those risks can be lowered by using a layered approach where you start as simple as possible and then add the details that make sense from a design perspective and that help clarify the overarching intent. For example: start with a high-level, 1-page interaction flow, then add details in the form of sample calls, which then evolve into detailed flows and become the source for an initial or “skeleton” specification document (containing mostly initial interactions, without error strategies), which after various reviews (including Usability) becomes a complete or “full” specification document.

“Why hold the information in a document that no one wants to read?”

Thank you, thank you, thank you. How many times do designers have to create alternative “views” of their documents because some groups may not be able to use (or care about) certain aspects of the design, which might be buried among other details or presented in a format that is neither usable nor efficient? But of course, the question that begs to be answered is “What’s the ideal document the developer would like to see to build a system from?” Suggestions, anyone?

“In a previous life at a big ‘old style’ new media agency, there often seemed to be a one tool fits all approach to projects. This applied to information architecture too.”

I’m sorry to say some of us might still be living that life. Methodologies/Systems, anyone? I totally agree with the notion of finding out what’s the best tool for a particular project. Not every project requires the 12-step program, and not every customer processes information the same way.

“The best sites are those where there’s a seamless divide between the look, the content and the experience.”

This one I would like to borrow and extend as a closing statement: “The best systems are those where there’s a seamless divide between the look, the sound, the content and the experience.”

Time to rethink our current documentation practices…

Hello, this is your medication. Have you forgotten about me?

Outbound calling (meaning automated phone calls that go out to specific individuals) is a very profitable business that thrives at times such as this one when companies need to reach more consumers yet want to reduce the costs of making those calls since most of the time they are nothing more than the equivalent of “phone spam”.

Therefore, I’ve never been a big fan of these types of services, except for those situations where I know we’re adding value to the conversation. Those situations where we’re providing a benefit to consumers, particularly in win-win scenarios where both parties benefit from the interaction.

One product/service I recently found out about that does exactly that is GlowCaps Connect. GlowCaps are electronic pill caps that use some very clever means to ensure patients take their medicine at the times and frequency that they should.

So picture this. If you know someone that needs to manage a chronic disease like diabetes or depression, daily medications are essential for their well being. What this device does is that every day, at the prescribed time, the GlowCap uses a myriad of modalities to remind users and attract their attention. For example, it may flash a visual reminder which is followed by sound if the bottle is not opened within the first hour. If the patient still doesn’t open the bottle, then the cap triggers a phone call to remind them and can even send weekly updates to friends and family as well as send reports to the patient’s doctor with a monthly summary of the bottle’s activity.
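For the VUI-minded, the escalation described above can be sketched as a simple decision function. This is only an illustration, not how GlowCaps actually work internally: the function name and the modality labels are made up, the one-hour delay before sound comes from the description above, and the two-hour delay before the phone call is purely an assumption.

```python
from datetime import timedelta

# The one-hour threshold comes from the description above.
SOUND_AFTER = timedelta(hours=1)
# Assumption for illustration only: the real delay before the call is not stated.
CALL_AFTER = timedelta(hours=2)


def reminder_modality(elapsed, bottle_opened):
    """Pick a reminder modality from the time elapsed since the dose was due.

    Hypothetical helper: the names and return values are invented for this sketch.
    """
    if bottle_opened:
        return "none"          # dose taken, nothing to remind
    if elapsed < SOUND_AFTER:
        return "light"         # first hour: visual reminder only
    if elapsed < CALL_AFTER:
        return "light+sound"   # still unopened: add sound
    return "phone call"        # last resort: the outbound call


print(reminder_modality(timedelta(minutes=90), bottle_opened=False))
```

The point of the sketch is the ordering: the least intrusive modality goes first, and the phone call only happens once the cheaper reminders have been ignored.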

So, to summarize, better prescription handling which can be rewarded with coupons and incentives, better healthcare management with the doctor, and an opportunity for pharmacies to handle automatic refills. Those are the types of calls I wouldn’t mind at dinner time.

Total Recall

I have to admit that in this day and age, even with all the advantages technology provides, when it comes down to keeping track of pending items, errands and to-do items, I tend to stay in the analog world (read pen and paper).

So I was very excited to read about a new smart phone app which seems to be tackling this problem in a very clever way by integrating the best aspects of disparate technologies such as Post-It notes, email, calendars, and voice for free! (with a Pro option available too)

It is called reQall and the way it works is that you call a free number to add “items” via your voice or text such as notes, appointments and memos.  If using your voice, they use speech recognition software to transcribe your message into text so that based on your situation (time, location, etc.), the system can remind you of those items via email, SMS, IM or even a “phone shake”.

A very nice feature is that you can also share your account with other family members and friends, so they can enter reminders for you.  Mmm, I wonder why wives love this feature so much…

Voice Recognition and Mobile Search

Mobile search has been identified as one of those applications where Speech Recognition can become the killer app.  There are many instances in which speech recognition has been integrated with mobile devices: some do the recognition embedded on the device, some others perform the recognition on “the network” (a remote server farm) which then returns the results to the device, and some others rely on real human beings transcribing the contents of the request so they can be processed accordingly.

Then, of course, comes the part of the search itself.  Some services, for example, provide you with a list of links to Web pages (such as Google and Yahoo). Others, like ChaCha, use humans to find the answers for you and then send you the response via text (and yes, you can become a “guide” for them). Some others attempt to integrate other features and capabilities of the devices such as the use of GPS and maps, or trigger subsequent reactions on other services such as changing your status in Facebook or Twitter (as is the case with Vlingo).

Now, if someone could simply find a way to use voice to find out where I left my phone, or my remote, or the car keys…

GrandCentral is now Google Voice

Just when you thought that having a single phone number that rings all your phones, coupled with a central voicemail inbox accessible from the web and the ability to screen calls by listening in live as callers leave a voicemail, all for free, couldn’t get any better, Google does it again.

That’s right. Google is revamping their GrandCentral system (which we’ve talked about before) and changing its name to Google Voice.

Aside from the privacy concerns that have been popping up everywhere in the blogosphere, the system and its features have received rave reviews and praise for the enhancements added to the platform. The one I like the best? Transcription of voicemail into text, of course! That way you can read your voicemails at leisure, copy/paste them, search for specific terms, etc.  Details about all the features are available here.

The official Google blog has more details about it and Mr. Pogue has a great video showing it in action.  Even though it’s currently only available to existing GrandCentral customers (what can I say, I got lucky ;) ), you can still request an invitation for when Google Voice becomes available to the public sometime next week.

Microsoft Recite

We’ve talked in the past about the use of speech recognition in the realm of note taking, where tools such as Jott allow you to obtain a text version of a voice message, making it easier to document and search for information.

Well, Microsoft just recently unveiled a new application of speech recognition, but this time with a twist. Microsoft Recite (available as a preview which can be downloaded) allows anyone using a Windows Mobile phone to record a voice message or “remembrance”, store it, and then retrieve it later using speech pattern recognition.

The obvious advantage of pattern recognition compared to other types of speech searches is that the message itself doesn’t have to be decoded, transcribed or converted.  It simply uses a “search” sample as a pattern to match one or more of the words against existing “remembrances”.

Even though initial tests have received positive feedback, I’m hoping they’ll expand the tool to include other devices and languages (it currently only works with US English).

Kindle 2.0 – An ebook “reader” in every sense of the word

It seems that after all the criticism Amazon received on the user interface of its original Kindle, they’ve not only addressed some of the concerns but also taken some of the suggestions, which are now part of the second version of the device.

Some of those suggestions included adding text-to-speech capabilities to the Kindle 2.0, making it indeed an e-book “reader”.  I think this is a magnificent idea because it not only addresses how devices should evolve to support people with disabilities but also gives control back to the users on how to best interact with the information.

I know what I want for Christmas

I know we’re used to associating the notion of IVRs with arcane self-service over-the-phone systems and IVR jails, yet a company called Moshi found a way to leverage the notion of “Interactive Voice Response” in a totally distinctive way.

The Moshi IVR Alarm Clock is the first one to my knowledge that allows you to set the time and the alarm by using your voice. To start interacting with it, you simply say “Hello Moshi” and the clock responds with “Command Please” (I know, a little VUI help never hurt anyone). It currently supports a list of 12 commands including things such as “time”, “set alarm”, “temperature” and “help” (apparently “help” still has its uses).

A demo is currently available at the Moshi website which shows how the clock responds to various commands, and Engadget has some more details about it. Personally, I think it’s pretty cool, plus the price is not bad either ($50). But from a design perspective, I think it’s just a shame they didn’t invest a little bit more in having better sounding prompts (with a professional voice talent), which combined with the use of more natural, concatenated prompting, would’ve yielded much better results (let’s face it, anyone still concatenating time in the form of “six” “o’clock” “a m” is being a lousy designer).
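To make the concatenation complaint concrete, here is a rough sketch of what a more natural time prompt could look like. It has nothing to do with Moshi's actual implementation: the function name, the phrase choices ("in the morning", "oh five") and the day-part boundaries are all my own assumptions, and in a real system each returned phrase would map to prompts recorded by a voice talent rather than to text.

```python
# Number words for 0-19; index 0 is empty so that hour 0 or 12 falls back to "twelve".
ONES = ["", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty"]


def minute_words(minute):
    """Spell out minutes 1-59 the way people say them ("oh five", "forty-five")."""
    if minute < 10:
        return "oh " + ONES[minute]
    if minute < 20:
        return ONES[minute]
    tail = "-" + ONES[minute % 10] if minute % 10 else ""
    return TENS[minute // 10] + tail


def natural_time(hour, minute):
    """Build a conversational phrase for a 24-hour time (hypothetical helper)."""
    if minute == 0 and hour in (0, 12):
        return "midnight" if hour == 0 else "noon"
    name = ONES[hour % 12] or "twelve"
    # Assumed day-part boundaries: before 12 morning, before 18 afternoon.
    if hour < 12:
        part = "in the morning"
    elif hour < 18:
        part = "in the afternoon"
    else:
        part = "in the evening"
    if minute == 0:
        return f"{name} {part}"
    return f"{name} {minute_words(minute)} {part}"


print(natural_time(6, 0))
print(natural_time(18, 30))
```

The design choice here is to concatenate at the phrase level ("six" plus "in the morning") rather than word by word, which is closer to how a person would actually say the time.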