The lengthy process Apple uses to teach Siri new languages

Sorry, Alexa: Siri still the most widespread AI assistant
Photo: Ste Smith/Cult of Mac

If you’ve ever tried to learn a new language, you’ll know it’s a hard and incredibly time-consuming process. It’s not much easier for virtual assistants like Siri.

Here are the mind-blowing steps Apple goes through to teach Siri new languages and dialects that help it stay one step ahead of the competition.

Despite being one of the oldest virtual assistants, Siri has fallen behind in recent years as rivals like the Google Assistant and Amazon Alexa arrive with wider skill sets and more flexibility. But there’s one thing Siri still does a lot better than the rest.

That’s speaking different languages. Siri already supports 21 languages localized for 36 different countries, and according to Reuters, Apple will soon start working to teach it Shanghainese, a dialect of Wu Chinese spoken only around Shanghai.

In comparison, Microsoft Cortana speaks eight languages tailored for 13 countries, while the Google Assistant speaks four. Amazon Alexa speaks just two: English and German. And it’s easy to see why adding new languages takes time.

When teaching Siri a new language, Apple starts by bringing in humans to read passages in a range of accents and dialects, explains Alex Acero, head of the company’s speech team. These are then transcribed by hand so that Siri has an accurate representation of what was said.

“Apple also captures a range of sounds in a variety of voices,” adds Reuters. “From there, a language model is built that tries to predict words sequences.”
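To make the idea of a model that predicts word sequences concrete, here is a minimal sketch of a bigram model in Python. It is a toy illustration, not Apple's method: the function names and the sample transcripts are made up for this example, and real systems train on far more data with much more sophisticated models.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Count how often each word follows another in the transcripts."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        # Pair every word with the word that comes right after it.
        for prev_word, next_word in zip(words, words[1:]):
            counts[prev_word][next_word] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical hand-transcribed passages (not real Apple data).
transcripts = [
    "what is the weather today",
    "what is the time",
    "what is the weather tomorrow",
]

model = train_bigram_model(transcripts)
print(predict_next(model, "the"))  # prints "weather"
```

In these sample transcripts, "weather" follows "the" twice and "time" only once, so the model guesses "weather". A word it has never seen gets no prediction at all, which hints at why so much transcribed speech is needed.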

Before supporting a new language in Siri, Apple adds it to dictation mode, its speech-to-text feature in iOS and macOS. As customers use this, Apple captures some of the audio recordings (anonymously), complete with background noise.

These recordings are transcribed by humans and fed back into the computer. This process alone cuts the speech recognition error rate in half.
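One common way to measure a speech recognition error rate is word error rate (WER): the number of word substitutions, insertions and deletions needed to turn the recognizer's output into the human transcript, divided by the length of that transcript. The report does not say which metric Apple uses, so the sketch below is only an assumption about how such a figure could be computed, and the sample sentences are invented.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: a human transcript and two recognizer outputs.
human_transcript = "set a timer for ten minutes"
before_retraining = "set a time for tin minutes"
after_retraining = "set a timer for tin minutes"

print(f"{word_error_rate(human_transcript, before_retraining):.2f}")  # 0.33
print(f"{word_error_rate(human_transcript, after_retraining):.2f}")   # 0.17
```

In this made-up case the recognizer goes from two wrong words out of six to one, which is what halving the error rate looks like on a single sentence.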

Once Apple has gathered enough data, it brings in a new voice actor to give Siri a voice in the new language. The language is then released to users with answers to what Apple believes will be the most common questions. But the work doesn’t stop there.

While Siri is in use, it continues to learn more about what users ask it in the real world. Apple makes tweaks and updates every two weeks to keep improving the service, and over time, Siri becomes more fluent and more accurate.
