What Companies Should Consider Before Investing In Smart Speakers

Featured in Harvard Business Review

By Gokhan Ozturk and Shri Santhanam
This article first appeared in Harvard Business Review on April 26, 2019.
The Arabic version was published in Harvard Business Review Arabic on May 29, 2019.

Just eight years after companies first began selling smart speakers to consumers, more than 100 million voice-enabled virtual assistants are now checking the weather, playing music, and answering basic questions in homes around the world. Smart speakers help users pay bills, get bank account balances, and balance budgets. Soon, many will suggest new products based on customers’ past purchases.

But businesses should refrain from embracing enthusiastic visions of replacing their customer service agents, sales forces, or financial advisers with next-generation smart speakers in the near future. Smart speakers can now recognize and respond relatively smoothly to pre-trained, unambiguous sets of commands and data, thanks to great strides in voice recognition technologies that translate what is said into text. In a few years, they will probably sound even more natural. But there is a difference between sounding more human and being able to truly interpret customers’ intentions.

The risk to business: If a company prematurely introduces smart speakers for more sophisticated purposes and the technology fails, the company’s hard-earned reputation could suffer.

The technological and organizational alterations required for smart speakers to understand what customers are actually saying accurately enough to advise them on complex decisions are five years away, if not more. Moreover, most companies are only starting to sort out how to stitch together the different systems, applications, devices, and data so that smart speakers can access enough information to fully understand a customer’s needs across various channels and firms.

As a result, customers will still prefer to use a mix of mobile apps, chatbots, call centers, and smart speakers for the foreseeable future. So how should businesses approach introducing smart speakers?

Avoid Overly Complex Applications

First, companies need to fully understand smart speakers’ limitations. Today’s voice-enabled devices still struggle to grasp the difference between what customers say, mean, and really want. The devices often miss when a customer’s query needs to be escalated to a person. They are known to misjudge when to start, stop, or continue listening in conversations. They also can misunderstand when people use colloquial expressions because they feel joy — or exactly the opposite.

Running a smart speaker pilot alone can cost anywhere from $200,000 to $2,000,000, yet until smart speakers can process natural language more easily, customers will still seek certain basic answers from chatbots or call centers. So for the foreseeable future, companies should avoid using smart speakers for complicated customer experiences. Instead, they should focus on using smart speakers for specific tasks with a clear answer set, and design customer experiences that stay well within smart speakers’ limits.

Prepare for Misinterpretations and Fraud

Beyond cost concerns, companies also need to confirm they can manage the reputational and operational risks that could accompany each new smart speaker application. For example, businesses will have to monitor whether smart speakers are giving inappropriate financial advice that could endanger a customer’s financial health.

Voice-enabled algorithms will not be able to advise customers on different options for products like mortgages, insurance, and car loans until they can grasp much more detailed financial information from separate accounts in potentially multiple institutions that customers provide vocally — beyond the partial data readily available. Important information could also fall through the cracks as clients move between devices and channels.

Safeguarding customers’ privacy will also be an issue. Requests made over smart speakers leave a trail of digital traces. While most people have accepted that data collected from smart speakers may be used for targeted ads, they will likely feel differently about sensitive sales or financial information that reveals a cash shortfall. That will be especially true if someone may overhear them explaining their financial needs in a public space.

Providing security against fraud committed with recordings of customers’ voice data could also prove problematic. Voice fraud incidents are already rising, as faking audio files is now, in many ways, easier than copying credit cards or fingerprints. Synthetic voice avatars are widely available. With enough data, artificial intelligence programs can generate fairly convincing audio files of anyone. Hackers can also trick voice assistants by burying “silent” malicious commands in white noise at audio frequencies beyond the range of human hearing.

To combat voice fraud, companies will have to build and maintain systems that verify vocal orders by doing much more than asking key questions. New ways will have to be found to detect and notify customers of fraudulent vocal orders as efficiently as for false orders placed on computers or in stores. For example, systems and algorithms will have to be developed that can quickly analyze links to previous fraud incidents and determine if voices are real or pre-recorded.

Find the Right Talent

Due to the vocal nature of the new technologies being introduced, customizing smart speaker services for business operations will involve developing broader skill sets than those that have been necessary for other artificial intelligence innovations. For example, to introduce natural-sounding smart speakers for more sophisticated purposes, companies will have to test and design systems that are prepared for millions of different possible customer responses. In addition to product managers and data scientists, most organizations will need content creators and user experience designers familiar with call centers. These people will need to write scripts and educate voice assistants on how best to respond to customers in distinct segments and geographies with a wide range of requests and responses.

Businesses will need to focus on developing a broad range of internal skills, while partnering with vendors when they can’t keep up with more innovative techniques for, say, voice identification. Companies may even be compelled to pool resources with competitors if they cannot invest enough to provide natural-sounding and safe smart speaker services on their own. Otherwise, customers could choose smart speaker services offered by tech giants with deeper pockets, as they have already done for other services like cloud computing.

Proceed in Stages

The rapidly expanding capabilities and adoption of smart speakers over the past several years have been nothing short of breathtaking, and there is no doubt that businesses are only beginning to scratch the surface.

But the next great leap to smart speakers that can provide valuable advice will not be easy. What’s more, there’s a real risk that overly optimistic companies that rush ahead could be blindsided. As Warren Buffett once famously observed, “it takes 20 years to build a reputation and five minutes to ruin it. If you think about that, you’ll do things differently.” Proceed with caution.


This article is posted with permission of Harvard Business Publishing. Any further copying, distribution, or use is prohibited without written consent from HBP - permissions@harvardbusiness.org