How to use automatic captions for Twitch streams with OBS
by Saralene
In this document, I will instruct you on how to install the OBS captions plugin by ratwithacompiler, starting with the shortest and simplest way to get it working, and then I will introduce its other features and how you can utilize it to its fullest. I’ll try to be as straightforward as possible, and will use images to help make it clear.
How captions appear to your audience
You might be asking yourself, first of all, Saralene, why do you want more streamers to use this plugin? What are the benefits of using Closed Captions on my stream? You might be saying, I know how inaccurate automatic captions can sometimes be, and this just doesn’t seem worth the trouble.
I totally get that. However, there are a lot of compelling scenarios that make captions worth having, and I personally believe that even if captions were to be only 50% accurate, it is highly worth having them for the reasons I outline below.
With all of this in mind, I believe that having captions at all opens your stream to a wider audience, lets your existing audience attend streams they would otherwise be unable to, and generally improves the conversion of short-term viewers into long-term viewers among otherwise marginalized groups of people.
This is really important, and it doesn’t take much effort at all to actually implement automatic Closed Captions via Google Cloud on your stream! It’s free and requires only a few basic setup steps on your part! If you want to go the extra mile to set it up to your exact liking, such as adding text replacement filters for usernames or other major terms, you can, but the basic setup takes only a couple minutes and a few very short steps!
Twitch’s player natively supports embedded Closed Captions in all streamed content. Captions will not show up in Clips, but they will appear anywhere else your stream is featured.
(An example of the default appearance of closed captions on a stream)
When Captions are enabled for a stream, they will default to Off for all viewers who do not ordinarily use Captions, so you don’t have to worry about this new feature suddenly surprising your existing viewers. Instead, a small “CC” button will appear on their player next to the other standard options that appear.
(How the closed captions toggle button appears on a typical stream player, disabled and enabled)
Users can customize the appearance of their captions to their liking, choosing from a variety of colors, background settings, opacity settings, and locations through the settings menu that Twitch provides as part of the player settings.
(Location of the Closed Caption Options in Twitch’s player settings menu)
(Example of the Closed Caption Settings dialog box)
All of this allows viewers to customize how their captions look to best suit their own situation, while keeping captions an entirely optional feature that goes unused by viewers who don’t need it or who find it too distracting.
Having said that, I realize some of you might be considering adding captions directly to your stream, baked into the video. That’s okay to do if you want to, but I suggest that you enable embedded captions as well if you do that.
The reasoning for this is simple: user needs can vary more widely than we expect. Different viewers will have different usage scenarios, and embedded captions can be customized to better suit their specific needs, while captions baked into the video cannot.
-Must have OBS Studio version 28.0 or later.
NOTE: Streamlabs Desktop and other similar programs might not work, even though they are based on OBS. SE.Live works fine, since it is just an add-on to OBS.
-Must NOT be using the AMD hardware encoder on Windows (Mac/Linux are fine). It is unclear why the plugin doesn’t work with the AMD hardware encoder on Windows. If you have a relatively modern CPU, you should probably be using software encoding anyway.
To get started with the captions plugin, we have to select a sound source and enable it. This should ideally be a separate microphone source. If your mic is separate from the rest of your sound, you will get better results than YouTube auto-captioning. Otherwise, the quality will be about the same as YouTube auto-captioning.
Testing your captions before going live is a good idea. There’s an easy way to do this, because the captions will display live in real time when they are active, and you can set them to be active while recording, not only while streaming. I’ll detail the process here.
One particularly advanced usage scenario involves using multiple audio tracks to capture multiple voices or sources for captioning. For example, if you are doing a collaboration with another streamer, you may need to do this if you want both of your voices to be captioned.
This is probably the most complicated and advanced part of setting up the captions plugin, but it serves an important purpose, so I want people to know how they can make use of it.
In OBS, there is a section for “Advanced Audio Properties” that you can access from the three-dot menu next to any of your audio sources in the Audio Mixer.
In this settings section, you can see which of the 6 audio tracks each audio source is sent to. By default, sources are typically sent to all of them.
Typically, most streamers never need to mess with these tracks, unless they are separating out licensed music from their VOD recordings. However, this setting can also be used to decide which audio tracks will be sent to the Captions plugin, and which will not.
Here is an example from my own stream audio settings, showing that Audio Track 6 is disabled for all of the sources except the ones that have some actual microphone audio being sent to them.
I do recommend using Track 6 for this. Do not disable the other tracks on the microphone sources, or you will most likely make yourself inaudible to your viewers.
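If it helps to think about the track checkboxes as data, here is a minimal sketch of the idea. It is only a model for checking your own plan on paper, not something OBS or the captions plugin reads. The source names, the helper functions, and the assumption that Track 1 is the track your viewers hear are all hypothetical and should be adjusted to match your own setup.

```python
# Hypothetical model of the "Tracks" checkboxes in Advanced Audio Properties.
# Source names are made up for illustration; OBS itself is configured through
# its interface, not through this script.
CAPTION_TRACK = 6

# Each source maps to the set of audio tracks it is sent to.
source_tracks = {
    "Microphone": {1, 2, 3, 4, 5, 6},
    "Guest Voice Chat": {1, 2, 3, 4, 5, 6},
    "Game Audio": {1, 2, 3, 4, 5},
    "Music": {1, 2, 3, 4, 5},
}

def sources_on_track(tracks_by_source, track):
    """Return the sorted names of the sources sent to the given track."""
    return sorted(
        name for name, tracks in tracks_by_source.items() if track in tracks
    )

def missing_from_track(tracks_by_source, track=1):
    """Return the sorted names of sources NOT sent to the given track.

    Assumption: Track 1 is the track that viewers hear on stream.
    """
    return sorted(
        name for name, tracks in tracks_by_source.items() if track not in tracks
    )

# Only the voice sources should feed the caption track.
print(sources_on_track(source_tracks, CAPTION_TRACK))

# Nothing should be missing from the track viewers hear.
print(missing_from_track(source_tracks))
```

In this sketch, only the two voice sources remain on Track 6, while every source, including the microphones, stays on the other tracks so that nothing becomes inaudible to viewers.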
You can then set the Caption Source in the Caption Settings to the audio track that you have separated away from the others, as seen below.
As a side note, if you absolutely cannot separate your microphone audio from the rest of your audio, this will all still work even when the captioned track contains a mix of microphone audio and other audio, e.g. music or gameplay audio.
In that particular situation, the quality of the captions will mostly just be reduced to something closer to that of the automatic captions you see on your average YouTube video. Certainly there are worse fates than that; it’s still much better than nothing.
The last tab of the Caption Settings is the Text Filtering section.
In short, Text Filtering allows you to define specific words that will be replaced with other words whenever they are caught on your stream. For example, you might find that your unique streamer name is not well understood by the captions, and you might want to fix that.
The text filters allow for scenarios like this or other common misunderstandings you want to correct.
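To make the idea of word replacement concrete, here is a minimal sketch of what such a filter does conceptually. This is not the plugin's actual code. The replacement table, the misheard phrases, and the function name are hypothetical examples, and the whole-word, case-insensitive matching is an assumption made for the illustration.

```python
import re

# Hypothetical replacement table: misheard text -> intended text.
REPLACEMENTS = {
    "sara lean": "Saralene",
    "o b s": "OBS",
}

def apply_text_filters(caption, replacements):
    """Replace each misheard phrase with its correction, ignoring case."""
    for wrong, right in replacements.items():
        # \b keeps the match on whole words; re.escape treats the phrase
        # as literal text rather than as a pattern.
        pattern = r"\b" + re.escape(wrong) + r"\b"
        # The lambda inserts the correction literally, with no escape handling.
        caption = re.sub(
            pattern, lambda match, text=right: text, caption, flags=re.IGNORECASE
        )
    return caption

print(apply_text_filters("welcome to the Sara Lean stream", REPLACEMENTS))
# prints: welcome to the Saralene stream
```

Matching on whole words means a phrase like "sara leaning" would be left alone in this sketch, which avoids accidental replacements inside longer words.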
You don’t need to use text filters at all. They are only there if you want to go the extra mile in improving how specific terms appear in your captions.
Text filtering also supports regular expressions (Regex). If you don’t already know what those are, you probably don’t need to use the Regex filter setting. Regular expressions are for detecting specific text patterns so that you can replace them in any situation.
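For anyone curious what a pattern-based replacement looks like, here is a small sketch using Python's `re` module. The pattern and the misheard phrases are hypothetical, and the plugin's own Regex setting may use a slightly different regular expression dialect than Python does, so treat this as an illustration of the concept rather than something to paste into the plugin.

```python
import re

# Hypothetical pattern: matches "twitch dot tv" as well as "twitch dot t v",
# in any mix of upper and lower case.
DOMAIN_PATTERN = re.compile(r"\btwitch\s+dot\s+t\s?v\b", re.IGNORECASE)

def fix_domain(caption):
    """Rewrite spoken-out variants of the domain as "twitch.tv"."""
    return DOMAIN_PATTERN.sub("twitch.tv", caption)

print(fix_domain("find me at Twitch dot t v slash saralene"))
# prints: find me at twitch.tv slash saralene
```

A single pattern covers several spellings of the same phrase here, which is the main advantage over listing each variant as its own plain word replacement.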
Given how captions work and because they are captioning your own trusted stream hosts, you probably do not need any sort of advanced text filtering at all.
The profanity filter censors profanity by replacing it with asterisks to keep your captions clean.
If the option is enabled, Google Cloud determines which words get censored, not you, but its list covers pretty much any profanity you would worry about.
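As a rough illustration of what censoring into asterisks means, here is a minimal sketch. It is not how Google Cloud implements its filter. The word list is a harmless hypothetical stand-in, and keeping the first letter of each censored word is an assumption about how the masked output looks.

```python
import re

# Hypothetical word list for illustration only; the real list is chosen by
# Google Cloud and is not visible to the streamer.
BLOCKED_WORDS = {"darn", "heck"}

def mask_word(word):
    """Keep the first character and replace the rest with asterisks."""
    return word[0] + "*" * (len(word) - 1)

def mask_profanity(caption, blocked=BLOCKED_WORDS):
    """Mask every word in the caption that appears in the blocked list."""
    def replace(match):
        word = match.group(0)
        return mask_word(word) if word.lower() in blocked else word
    # Treat runs of letters and apostrophes as words; leave everything else.
    return re.sub(r"[A-Za-z']+", replace, caption)

print(mask_profanity("Well heck, that was a Darn good run"))
# prints: Well h***, that was a D*** good run
```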
While the drop-down box says that the Profanity Filter is unreliable, it has proven extremely reliable in the majority of my tests!
Please note that the most common and undeniable slurs are always censored, regardless of whether the Profanity Filter is enabled or not. You do not necessarily need to enable the Profanity Filter if your only goal is to keep your stream safe.
You might consider experimenting with this in your own time to determine if you want to enable the Profanity Filter, or otherwise set up any text filtering of your own.
This option simply controls whether captions are captured when the Caption Source is audible (“Caption Source is heard on stream”, the default) or even when it is not (“Mute Source is heard on stream”).
This is mainly useful in very complex scenarios where you are capturing captions from a different source than the ones you are actually recording or sending to your viewers. Most users will never need to change this option.
Will fill in information on saving Transcripts at a later date. Sorry about that! It’s pretty straightforward though.
Will fill in information on using Open Captions at a later date. Sorry about that!
I’ll try to fill these sections in sometime soon!