When can you rely on a machine for in-person interpretation in 2018?
Updated: Nov 10, 2019
Introducing a framework to help executives choose between human and machine interpretation, in light of the latest trends in AI and NLP.
Author - Matt Conger
Fifty years ago, in 1968, the U.S. began certifying certain combinations of planes and pilots to land using a machine. Today, planes don’t always land themselves, but as a society, we’ve gotten comfortable enough with so-called “autoland” to trust our lives to the programmatic logic built into the machinery of an airplane. In 2018, we are often confronted with a similar “man vs. machine” dilemma: would you trust a machine to translate (or, more precisely, interpret) a live event?
I’ve spent the last five years wrestling with this question as CEO of Cadence Translate. I’m an engineer by background, but my co-founder is a trained interpreter. The debate is especially acute since the human-powered solution can cost thousands of dollars for a one-hour meeting, and the machine solution can cost pennies in the form of API calls. Yeah, humans are better, but are they worth a 1000x cost differential versus machines?
My primary gripe with this question is that people always answer it on a binary scale. Those who work in data science or natural language processing give some variation of “all events can use machines for live interpretation”. Those who practice interpretation and translation give some variation of “machines will never outperform humans”. The most thoughtful commentaries I’ve seen on this issue boil down to “machines won’t wholly replace translators, but translators who use machines will replace translators who don’t”.
This article is written for executives who are faced with this issue. Most executives don’t have a one-size-fits-all approach to this question of “man vs. machine”. They want a more nuanced way of assessing which solutions are best for a given situation.
If a CEO is attending his daughter’s “take your parent to school day” and has to explain what he does in another language, he’s probably okay relying on his phone’s Google Translate app to convey the message. If later that day, the same CEO has to go in front of an international set of journalists to explain why his company’s products accidentally killed users, he’ll probably want a human interpreter.
But what about situations in the middle? And as technology gets better, when does the scale tip in favor of machines? In a society that allows our planes to land themselves, when will we let machines translate speech independently?
Red marks on your forehead
I’ve been in a lot of business meetings and attended a lot of live events that use interpretation. I’m proposing that we judge all such events using a dual-pronged set of criteria:
- “Misfires”: How frequently do errors occur in interpretation?
- “Facepalm rating”: How severe are the consequences of a misinterpretation?
The first question can be quantitatively answered. I’ve dubbed these “misfires” because machines and humans are simply taking inputs and generating outputs. The key issue here is that any misfire, big or small, will distract the listener. Every cognitive interruption makes the whole event suffer. Longtime executives in China have surely been in meetings where the classic “he/she/他/她” gets mistranslated, since both pronouns are pronounced “tā” in spoken Mandarin (“My wife couldn’t make it to dinner. He is at home tonight”). Our brains fix the interpretation quickly, but it pulls us out of the meeting, however briefly.
The second one, “facepalm rating,” cannot be measured quantitatively, but we can make broad generalizations here that will help us determine the risk factor of a given event.
A key rule of this framework: all participants know whether a human or a machine is behind the interpretation.
Is this approach scientific? Nope. Is it rigorous? Nope. Should it provoke conversations? Yep!
Let’s analyze a few event types:
- KOL livestreams
- Press conferences with Q&A
- Bilateral business negotiations
- Product demonstrations
- Company announcements
- Keynote speeches
KOL livestreams
Hundreds of key opinion leaders (KOLs) are livestreaming their daily lives around the world. They speak in one language, but brands (and sometimes the KOLs themselves) have increasingly wanted to expand their audience by making these livestreams multilingual.
Misfires: frequent. KOLs use a lot of colloquial, hard-to-translate expressions in their daily lives.
Facepalm rating: inconsequential. Misinterpretation of a daily life occurrence is unlikely to take away from the KOL’s authority or the viewership’s interest in the KOL. The (notable) exception here is if a brand name itself is not properly captured. I once saw a machine listen to the word “Nordstrom” and it rendered it as “North Korea”. That would be a legitimate facepalm!
Press conferences with Q&A
Whether it’s to announce a new product launch or address a major scandal, multinational companies will want to maximize their press coverage by hiring interpreters to reach a larger audience.
Misfires: infrequent. Questions asked by the press during these sessions can be unpredictable, but company and government representatives tend to speak clearly and deliberately.
Facepalm rating: significant. Depending on the subject of the press conference, potential ramifications of Q&A misfires could range from some light social media skewering to substantial PR damage control.
Bilateral business negotiations
Senior executives walking into high-stakes business meetings want every advantage available to them. The ability to speak their native language to perfectly capture their messaging is one such advantage.
Misfires: infrequent. Representatives of both sides of a high-level business meeting will be polished speakers, but communication beyond words, such as nuance, tone, and humor, is still at risk of misinterpretation.
Facepalm rating: devastating. Misfires can lead to severe misunderstandings between partners, or even the collapse of a partnership as a whole. This is definitely a situation where the pilot, or human interpreter, should be “landing the aircraft.”
Product demonstrations
As companies search for new markets abroad, explaining to foreign audiences how to effectively use their product is an essential step of the sales process.
Misfires: occasional. A product demonstration is informal and often hosted remotely, so colloquial language, unpredictable questions, and connection stability issues can all affect the quality of the interpretation. That said, the discussions are generally not very complex, which simplifies the interpretation task itself.
Facepalm rating: negligible. As long as the client finds your product useful, minor misinterpretations during the exchange can be forgiven. Of course, the client probably won’t be so lenient with misinterpretations surrounding the product specs!
Company announcements
Like press conferences, a corporation’s global launch of a new product or campaign will need interpreters to expand the reach of the message.
Misfires: practically non-existent. A prepared company announcement is highly predictable and carries little risk. The corresponding interpretation of the announcement is often prepared in advance.
Facepalm rating: manageable. Mistakes in your company announcements are embarrassing but, like press Q&A sessions, are correctable and most likely won’t have a lasting impact on your company or brand.
Keynote speeches
Interpretation enables companies to broaden their search for the most impactful keynote speakers for their next conference or summit.
Misfires: rare. A prepared speech on a predetermined topic, delivered by an eloquent speaker, leaves relatively little room for misfires.
Facepalm rating: embarrassing. My co-founder wrote about famed hedge fund investor Ray Dalio’s visit to China earlier this year. The misinterpretations were hilarious, e.g. translating “arrogant” as “Aragon”, and quite embarrassing for the event organizers, but did not ultimately damage the reputation of the speaker.
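For the engineers in the room, here is a rough sketch of how the six ratings above could be put into a lookup table and sorted. The labels are quoted from this article. Everything else is an assumption made purely for illustration: the ordering of the labels, the frequency-times-severity score (a common risk-matrix convention, not something this framework prescribes), the names `EventRating`, `risk_score`, and `zone`, and the cut-off `HUMAN_THRESHOLD`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventRating:
    event: str
    misfires: str  # how frequently interpretation errors occur
    facepalm: str  # how severe the consequences of an error are


# Labels quoted from the event-by-event analysis in this article.
RATINGS = [
    EventRating("KOL livestreams", "frequent", "inconsequential"),
    EventRating("Press conferences with Q&A", "infrequent", "significant"),
    EventRating("Bilateral business negotiations", "infrequent", "devastating"),
    EventRating("Product demonstrations", "occasional", "negligible"),
    EventRating("Company announcements", "practically non-existent", "manageable"),
    EventRating("Keynote speeches", "rare", "embarrassing"),
]

# ASSUMPTION: the article never ranks its labels. These orderings,
# from lowest to highest, are one plausible reading.
MISFIRE_ORDER = [
    "practically non-existent", "rare", "infrequent", "occasional", "frequent",
]
FACEPALM_ORDER = [
    "negligible", "inconsequential", "manageable",
    "embarrassing", "significant", "devastating",
]

# ASSUMPTION: hypothetical cut-off; tune it to your own risk appetite.
HUMAN_THRESHOLD = 15


def risk_score(rating: EventRating) -> int:
    """Multiply the 1-based frequency rank by the 1-based severity rank."""
    frequency = MISFIRE_ORDER.index(rating.misfires) + 1
    severity = FACEPALM_ORDER.index(rating.facepalm) + 1
    return frequency * severity


def zone(rating: EventRating) -> str:
    """Map a score to one of the two zones named in this article."""
    if risk_score(rating) >= HUMAN_THRESHOLD:
        return "no machines allowed"
    return "fly, machines, fly"


if __name__ == "__main__":
    # Print events from riskiest to safest.
    for rating in sorted(RATINGS, key=risk_score, reverse=True):
        print(f"{risk_score(rating):>2}  {zone(rating):<20} {rating.event}")
```

A multiplicative score is only one way to combine the two prongs. An executive who cares mostly about consequences could just as reasonably decide on the facepalm rating alone and use misfires as a tie-breaker.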
If this, then that
What options exist if your event is in the “no machines allowed” zone? What about the “fly, machines, fly” zone?
There are dozens of options for the “no machines allowed” zone, but my company Cadence Translate is one of the few that staffs interpreters not just for their interpretation prowess, but for their industry knowledge and business savvy. We’re first and foremost about empowering companies to do better business; we just happen to be a language services provider.
For “fly, machines, fly,” a few solutions exist on the market and have been deployed for events in Beijing. One of the strongest domestic players is iFlyTek, whose 听见 product gives event organizers an on-premises solution for displaying machine-generated captions on a screen. A startup, Akkadu (www.akkadu.com), offers a hybrid model with remote humans assisted by AI.
This framework is intended to help executives realize that there is more nuance out there when it comes to using machines versus humans for live business events. Every year, the gap between the misfires made by machines and those made by humans will narrow. Some events will remain forever out of the purview of machines, while other events may get overtaken by solutions like Akkadu.
And the next time you’re on an airplane, reflect for a moment about the fact that machines have been partially or wholly responsible for getting you safely on the ground for decades.
This post was provided by Cadence Translate, a leading provider of language services to the capital markets and consulting sectors.