Live Captions, new in iOS 16, generates subtitles for any audio playing in any app on your iPhone. Powered by the Neural Engine in Apple’s custom silicon, the ability to turn spoken words from music and videos into real-time text is a boon to many users, in many different situations.
If you’re hard of hearing, for instance, the ability to see instant captions on the screen is a game changer. Or, if you don’t have headphones when you’re sitting in bed late at night and your partner is asleep – or you’re in any situation where you don’t want to make noise, like on the bus or in an office – you can turn on Live Captions to get subtitles.
The applications are endless and exciting. Here’s how to use Live Captions in iOS 16.
How to use Live Captions in iOS 16
Live Captions made a big splash in May when Apple announced the feature alongside other new accessibility features coming to iOS. These features got their own day in the spotlight to mark Global Accessibility Awareness Day ahead of WWDC22, where Apple laid out its plans for iOS 16 and its other operating systems.
To use these features, you need to install iOS 16. iOS 16 is compatible with every iPhone released in 2017 and later, but this feature is only available on the iPhone 11, 12, 13 and 14 models and iPhone SE (second and third generation).
This feature is also limited to the United States and Canada. You can change the region of your device if you really want to try it out, but I don’t think it’s worth the compromise.
Enable Live Captions
Once you’re all set on iOS 16 (and after you’ve messed around with making a custom Lock Screen), go to Settings > Accessibility > Live Captions (Beta) near the bottom of the list.
Enable Live Captions to turn on the feature. A floating widget will appear, waiting for something to caption. Elsewhere in Settings, you can configure this widget to appear and disappear from Control Center or by triple-clicking the side button, but first, let’s take a look at how it works (click here to skip ahead).
Also, enable Live Captions in FaceTime while you’re here to get captions the next time you’re on a FaceTime call.
How well do Live Captions work?
Fair warning: Despite all of my best efforts, I cannot get the Live Captions widget to appear in my screenshots. I even tried plugging my phone into my computer and recording the screen in QuickTime to no avail.
What follows are pictures of my phone I took with an (old) iPad. I’m sorry.
Live Captions in YouTube videos
I tested Live Captions so you can directly see how accurate the captions are. See, I have been running a video experiment on YouTube for the last two years where I narrate Wikipedia articles in their entirety. This comes in handy, because you can compare how well Live Captions transcribes my narration to the original text on the screen.
As you can see, the results are … not great. This is a little surprising. Voice dictation works great for me in iMessage; Siri usually understands my voice pretty well. My voice, to my knowledge, sounds the same in the videos as it does during typical iPhone use. And yet, Live Captions transcribes my YouTube videos pretty poorly.
Live Captions for podcasts
I like to read the news through NetNewsWire and listen to commentary through podcasts. Could I consolidate this even further by using Live Captions to turn podcasts into written words? Kind of.
Even where it works best, we have to address a matter of implementation: You can’t run an entire podcast episode through Live Captions and scroll through the resulting text, reading like an article. Live Captions processes the audio as it hears it.
That means you can read a few lines at a time, at the same speed as the dialogue is spoken. Plus, as Live Captions hears the end of a sentence, it might work backward and correct the beginning of that sentence, adding punctuation or replacing sound-alike words, much as dictation does on your iPhone. In effect, three lines of text really only yield one that you can read.
If you tap the fullscreen arrows, you can read more at once, but that covers up the app controls for pausing and skipping chapters.
It’s also clear that this doesn’t play well with the “two dudes talking” genre of podcasts. You can’t tell who’s saying what by reading the text alone.
Live Captions for music lyrics
I’m absolutely terrible at understanding song lyrics. There are albums I have heard literally hundreds of times that I still can’t memorize the lyrics to, much less understand. Apple Music serves up live lyrics for a lot of popular music, but confusingly, not for the entire discography of Driftless Pony Club, the greatest indie rock band in the world.
So, can you use Live Captions to parse lyrics for you? Another mixed bag. When it can pick up the lyrics, they’re generally pretty accurate. However, they come and go unpredictably.
On an acoustic song that I thought would be an easy home run, Live Captions decided to transcribe the single guitar lightly strumming in the background as “de de de de de de” instead of the spoken lyrics. After updating to developer beta 3, when I went to take the above screenshot, Live Captions picked up a partial lyric instead (pictured on the left).
I also tried two notoriously challenging songs to try and throw off the Live Captions feature. The spoken word intro to “Ya Got Trouble” from The Music Man captioned surprisingly well — that is, until the full music kicked in (pictured on the right). “The Elements” was another pleasant surprise, with Live Captions correctly identifying about half of the elements reeled off in the rapid-fire recording.
Quickly turn on and off the Live Captions widget
If you want to keep using Live Captions, you don’t want to dig into Settings to enable and disable the floating control menu every time.
To keep it handy, go to Settings > Accessibility > Accessibility Shortcut (at the very bottom) and enable Live Captions. Now, you can enable or disable Live Captions by triple-clicking the iPhone’s side button.
You also can add a button to Control Center. Go back to Settings > Control Center and tap the green + next to Accessibility Shortcuts. Once enabled, swipe down from the top-right of the display to bring up Control Center (or, if you have an iPhone 8 or iPhone SE, swipe up from the bottom edge). Tap the Accessibility icon and select Live Captions to turn it on.
How could Live Captions improve?
Apple continues to make leaps and bounds of progress with its Neural Engine. Machine learning is clearly an area where the company excels, and the pace of improvement is only getting faster.
The captions themselves should get more accurate over time. Apple’s current dictation feature and its voice processing for Siri both work best when you speak clearly and hold your phone right up to your face. It will take some time for Live Captions to learn to handle voices in the wide variety of environments where people film YouTube videos and FaceTime their families.
I would love to see a developer API for this feature. The problem is that Live Captions transcribes any audio as it plays out of the speakers; the feature has no concept of what’s playing, how long the audio is, or what’s coming up ahead. It can’t process audio in advance and it can’t be connected inside an app’s interface.
Imagine if, in the Podcasts app, you could push a button that would process an entire episode at once. The text would scroll along with you as you listened, but you also could read ahead or back up.
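For what it’s worth, Apple already ships a developer API in this neighborhood: the Speech framework, which can transcribe a complete audio file ahead of time rather than only as it plays. It isn’t Live Captions, but it hints at how my imagined Podcasts feature could work. Here’s a minimal sketch, assuming a locally downloaded episode file (the function name and file URL are mine, not Apple’s):

```swift
import Speech

// Sketch: transcribe an entire local audio file in advance using
// Apple's Speech framework, rather than captioning audio as it plays.
// The caller supplies a file URL — e.g. a downloaded podcast episode.
func transcribeEpisode(at fileURL: URL) {
    SFSpeechRecognizer.requestAuthorization { status in
        guard status == .authorized,
              let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US")),
              recognizer.isAvailable else { return }

        let request = SFSpeechURLRecognitionRequest(url: fileURL)
        request.shouldReportPartialResults = false // wait for the finished transcript

        recognizer.recognitionTask(with: request) { result, error in
            if let result = result, result.isFinal {
                // The full transcript — readable, scrollable, searchable
                // before you ever press play.
                print(result.bestTranscription.formattedString)
            } else if let error = error {
                print("Transcription failed: \(error.localizedDescription)")
            }
        }
    }
}
```

One caveat: Speech framework requests have duration limits, so a full hour-long episode would likely need to be split into chunks. Still, it shows the per-app, process-in-advance model is within reach.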
Maybe in iOS 17. 🤞
This article was first published on July 19. It was republished after the release of iOS 16.