Game Retention Study
Gavin Eisenbeisz/Two Star Games 2023
This document covers audience retention, specifically for long-form gameplay videos (YouTube) and live streaming (Twitch and YouTube).
When optimizing any piece of content for retention, you must follow this basic structure. First, you present the audience with a conflict/problem. Immediately after presenting this conflict, you need to tell the audience how this problem must be solved (this can be a general explanation, or even an untrue explanation if your content requires withholding information, or a twist ending of some sort). After the explanation of the overarching conflict and goal, you must show rapid progress toward this goal through clear, quick, and visual progress. Finally, fueled by the rapid progress, you must draw out intermediate goals, conflicts, and other smaller problems that must be solved, while periodically reminding people of the overall goal, your current progress toward it, and what must happen to reach that point. Essentially, you are drawing your audience a map that they can use to follow along with the flow of the content.

(Note: The number of intermediate goals/conflicts can be increased, decreased, or layered such that there are many intermediate goals in progress at any given time. Also, the frequency of progress reminders is simply representative of a constant need to keep people up to date with their progress.)
People lose interest in something if they can’t see the direction they are heading, or see any progress toward that destination. If you tell someone, “Build me furniture” without giving them tools, instructions, or any affirmation that they are building the correct thing, in the correct way, the vast majority of people might start trying to solve the problem but will give up very easily and feel aimless in their attempt. If you give that same person an Ikea table with all the tools, instructions and supplies necessary, they will gladly assemble the entire thing, and have fun doing it. As game developers, we need to give audiences the tools, instructions, and goals that must be followed in order to keep them engaged.
An unengaging video is one that feels aimless, lacks direction, and lacks progress. Accomplishing a task, then accomplishing another task, then accomplishing another task with no end or goal in sight is equivalent to watching a video of someone listing off a random array of numbers for ten minutes. Things are happening, but not toward any important or identifiable end, making it excruciatingly boring, killing your view duration and your chances of going viral.
On social media, seemingly small one or two percent optimizations to a piece of content make an exponential difference in the number of views that content receives. For example, on YouTube, a ten-minute video with a decent CTR, in a fairly broad genre, with a 35% average view duration (a three-and-a-half-minute watch time) might go on to get 10K-50K views. A video in the same genre, with the same stats, except with a 70% average view duration (a seven-minute watch time) could go on to pull tens of millions of views. Just doubling the view duration will earn a video exponentially more attention, making every small improvement to your retention incredibly valuable.
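As a quick sanity check on the numbers above, here is a minimal sketch of how average view duration converts to watch time. The function name `watch_time_minutes` is my own, not part of any YouTube API, and the sketch only reproduces the arithmetic. It says nothing about how many views a video will get.

```python
def watch_time_minutes(video_minutes: float, avg_view_duration: float) -> float:
    """Average watch time per view, in minutes.

    avg_view_duration is a fraction between 0 and 1 (0.35 means 35%).
    """
    if not 0.0 <= avg_view_duration <= 1.0:
        raise ValueError("avg_view_duration must be between 0 and 1")
    return video_minutes * avg_view_duration


# The two ten-minute videos from the example above.
baseline = watch_time_minutes(10, 0.35)
improved = watch_time_minutes(10, 0.70)

print(f"35% of a ten-minute video: {baseline:.1f} minutes")
print(f"70% of a ten-minute video: {improved:.1f} minutes")
print(f"Watch time ratio: {improved / baseline:.1f}x")
```

The watch time only doubles. The claim in this section is that the resulting view count grows far faster than that.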
Beyond the basic, fundamental formatting I’ve described above, there are specific mechanics and features that can increase or decrease retention. By avoiding the wrong features and adding the right ones, you can dramatically increase your game’s chances of going viral.
Features that Harm Retention:
- Main Menu: Retention will consistently drop until the gameplay actually starts, so any time spent on your main menu will cost you. You can easily lose 30-50% of your viewers to a bad menu on no-commentary videos (the loss is likely much smaller on normal commentated gameplay videos). To mitigate this, you need to either have a really interesting animated or interactive main menu, or make it so the player controls the character to navigate the menu.
- Player Deaths: In short, player deaths are bad and generally cause 6-15% dips in retention, but there’s a bit more detail to get into. These dips usually occur because the death is followed by a long, drawn-out “Game Over” screen, and sometimes by twenty to sixty seconds of retreading old ground that the player has already seen. Deaths also cause a jarring break in the flow of the game and in progress toward goals. To fix this, make player deaths infrequent, make respawns as close to instant as possible (no menus, long animations, or long loading screens), and respawn players exactly where they left off, or at least somewhere that doesn’t require backtracking over things they’ve already done. Deaths can also look like car crashes or karts going off track in a racing game (these cause the same jarring break in flow, resulting in a dip of the same severity). Also note that jumpscare deaths in horror games often mitigate death dips inherently, but I’ll go into more detail on these in the “Jumpscares” point.
- Level/Match Completions: Level and match completions are something you need to be very cautious with when making a level-based game. Each individual level automatically defines a goal/conflict for itself, but if people interpret that as the primary goal, they’ll leave once the level is completed. Level completions usually cause 3-10% dips, and are the most common and consistent type of dip logged during this study. Factors that contribute to the severity of the dips are the length of the level transition, and how heavy the sense of completion is. If players complete a level and are shown a five-second win animation, followed by a six-second level statistics screen, then a ten-second loading screen taking them back to the level select menu, then another twelve-second loading screen when the player selects the next level, you could easily lose 10% of your audience. The best scenario would be to avoid level- or match-based games if possible, but with the following optimizations, this format can still work pretty well for retention. Make your level transitions as quick and streamlined as you possibly can: no win animations, no score or completion time statistics, just a really quick loading screen with a colorful animation on it, and then the new level. Also note that if you have a game with many very short levels, you’ll have to deal with more transitions. Rather than having one hundred thirty-second levels, have twenty-five two-minute levels. Or instead of having five-minute matches in a match-based game, lengthen them to twenty or thirty minutes.
- Notes: If implemented incorrectly, individual notes will usually cause 2-3% dips, but I’ve seen a few cases of notes causing as high as 6% dips. There are also many examples of well implemented notes that cause no dip whatsoever. Notes that are worst for retention are very long (multiple paragraphs) and appear in the first few minutes of the game. The notes that don’t impact retention are very short (only a sentence or a few words), have pictures on them, and are placed later in the game (past the first three minutes). The main issue with notes is that they interrupt progress, especially when placed in the beginning of a game when progress should be quickest and most noticeable. Best practice would be to avoid using notes entirely, but if necessary, keep them short, add pictures and visuals with the text, and place them later in the game. Also, make sure that the font size is large enough for people watching on a phone to read easily.
- Expo-Dumps: Long exposition monologues and dialogues can cause 3-10% dips. Similar to notes, the longer they are, and the sooner in the game that they take place, the worse the dip will be. Small moments of monologue that last three to five seconds are completely fine, and very unlikely to cause a dip (especially if placed after the first few minutes of gameplay), but you start entering risky territory with fifteen-plus second expo-dumps. Dips are also made worse when there’s no voice acting, or when there’s nothing else going on outside of the expo-dump. To optimize exposition, keep things short, add voice acting, have an active cutscene, or allow the player to continue moving around, and hold off on exposition until a few minutes into the game.
- Backtracking: Backtracking typically causes 3-6% dips, with one fringe example showing a 10% dip. Backtracking usually happens when a player has died and respawned at a checkpoint ten-plus seconds before where they were killed, requiring them to replay some parts of the game. Dips like this can also happen if the player goes down a long dead-end path and must retrace their steps back through the environment. Some of this can be fixed by simply applying the strategies mentioned in the “Player Deaths” point, and the rest can be mitigated by reducing the number of useless dead-end paths and areas.
- Credits: Credits are a pretty simple one. As soon as they start playing, people realize, “Oh, the game is over and completed, I can leave now,” and they do exactly that: leave. Any viewers you had left will drop off once the credits start playing, so keep your credits short, and don’t put anything important that you want people to see after the credits. Things like post-credits scenes likely won’t get as much attention as a cutscene placed before the credits roll. It’s also worth noting that ten seconds of credits at the end of an hour-long game won’t have any significant effect on your overall retention, but five minutes of credits at the end of a twenty-minute game would be detrimental. The best case would be to not put credits at the end of your game at all, and instead put them behind a button on your main menu.
- Walking Simulators: Walking simulators (I mean literal walking simulators that are simply environment exploration) are generally bad for retention. But if a walking sim has a decent variety of scenery, and no area is too repetitive for too long, people will stay more engaged. This makes sense, and is pretty self-explanatory. If you have a long dirt path through a samey forest that players need to spend an entire minute running down, retention will start to drop. If you add things along the path, and interesting landmarks ahead, you can mitigate the retention drop-off. You can apply this simple tip to your environment design in any game to help retention. Keep your scenery fresh, and add various landmarks to spice things up and re-engage viewers.
- Lighting and Obscure Style: It’s hard to peg specific numbers to this point, but games that are incredibly dark (visually), or that have a very obscure/convoluted art style make the viewing experience frustrating, and make people more likely to leave. This doesn’t result in massive, focused dips, but instead speeds up the overall decline of retention throughout the video. This is especially an issue in horror games, but make sure to avoid having overly dark or visually obscure areas in general. Keep in mind that not only are most people watching videos on their phones, which are very small, but they may be watching with their screen brightness down or somewhere with glare on their screen.
- Dullness: This one is more general, and perhaps more obvious as well, but it’s worth adding as a point in this list. Dull, boring, or overly slow parts of your game can easily cause 3-15% dips. These dips are worsened by the length of the dull moment, and by whether or not the player is currently working toward a clear goal. Five seconds of walking down a hallway toward a door is not a dull moment, but if the player walks into an elevator, and spends fifteen seconds riding it, you could easily see a 5% dip. Usually, dips caused by dullness are noticeable after around ten seconds of the dull moment. Since this point is fairly self-explanatory, I’ll just round this out with a few other examples of dull moments that have caused noticeable dips.
  - Ten plus seconds of climbing a ladder.
  - Five plus seconds of a black screen during a cutscene.
  - Fifteen plus seconds of aimless walking.
  - Ten plus seconds of waiting to jump across to a moving platform.
  - Long boring tutorial sequences.
  - Ten plus seconds of waiting to get on/off an elevator.
- Repetition: Repetition is a tough one to get specific on, but still worth bringing up and thinking about. Essentially, if your core gameplay element constantly repeats in under ten seconds, you probably have a repetitive game that will cause people to become bored very quickly. This is very closely related to the “Action Scenes” point, and how constant repetitive action causes massive dips in retention. Examples would be games like Night of the Consumers, ASMR Simulator, basic platformer games, and One Last Game. General repetition still isn’t as bad for retention as high-action games, but it’s still something that you want to avoid. I think the best way to think of it is that while you don’t want your game to get dull, you need to give people variety in the gameplay loop, and let it breathe. Give players a little bit of time and space, instead of making a Flappy Bird style game where the player simply flaps over and over again until the end of time.
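To put rough numbers on the level transition and credits points above, here is a small sketch of the arithmetic. The helper names (`transition_overhead`, `overhead_share`) are hypothetical. I'm assuming one transition between each pair of consecutive levels, the 33-second transition from the example (5 + 6 + 10 + 12 seconds), and that the stated game length does not include the credits. It only measures how much of a video is spent away from gameplay, not the retention dip itself.

```python
def transition_overhead(level_count: int, transition_seconds: float) -> float:
    """Total seconds spent in level transitions.

    Assumes one transition between each pair of consecutive levels.
    """
    return max(level_count - 1, 0) * transition_seconds


def overhead_share(gameplay_seconds: float, overhead_seconds: float) -> float:
    """Fraction of the full runtime that is not gameplay."""
    total = gameplay_seconds + overhead_seconds
    return overhead_seconds / total if total else 0.0


# Win animation + stats screen + two loading screens from the example.
TRANSITION_SECONDS = 5 + 6 + 10 + 12

# Both layouts contain the same 3000 seconds (50 minutes) of gameplay.
many_short = transition_overhead(100, TRANSITION_SECONDS)  # 100 x 30 s levels
few_long = transition_overhead(25, TRANSITION_SECONDS)     # 25 x 120 s levels

print(f"100 short levels: {many_short} s of transitions, "
      f"{overhead_share(3000, many_short):.0%} of the runtime")
print(f"25 long levels:   {few_long} s of transitions, "
      f"{overhead_share(3000, few_long):.0%} of the runtime")

# Credits: 10 s after a 60-minute game vs. 5 minutes after a 20-minute game.
credits_short = overhead_share(60 * 60, 10)
credits_long = overhead_share(20 * 60, 5 * 60)
print(f"Credits share: {credits_short:.1%} vs. {credits_long:.1%}")
```

With the slow 33-second transition, the hundred-level layout spends more time in transitions than in gameplay, which is why both shortening transitions and lengthening levels matter.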
Features that Help Hold Retention:
- Chase Scenes: Chases are similar to jumpscares, causing 5-10% bumps, but there are examples of chases causing bumps as high as 35%. The really strong aspect of a chase scene is how well it retains viewers. A good chase scene can easily hold viewers’ attention rock solid for two minutes, with one example holding for an entire five minutes. Some shorter chases in the ten to thirty second range will cause small bumps and look more similar to a jumpscare. To make a perfectly optimized chase scene, start it off with an intense jumpscare to spike retention (only applicable for horror games), reveal the enemy, and engage viewers. Then slowly increase the intensity of the chase as time goes on. You want the audience to feel like they’re inching closer and closer to dying, but without actually dying (as mentioned in the “Player Deaths” point, if the player actually dies, the flow will break, and people will leave the video. Make your chases easy, while still feeling dangerous). To raise the tension, you can show the enemy character getting closer and closer to the player over time, or raise the stakes with new challenges and obstacles to overcome while trying to run away. Increasing the intensity of your music is also a great way of naturally upping the intensity. If the enemy is physically far away from the player, the audience will feel safer, the stakes will lower, and people will start leaving. Keep the enemy close, and make sure the player knows they’re close. When ending your chase scene, don’t end it too abruptly, or give the player too high a sense of safety, since this change of pace and lowering of stakes can also cause a 2-5% dip in retention. It’s best to slowly fade out tensions over twenty or thirty seconds while you give the player a new goal to achieve.
- Boss Fights: Boss fights are very good for holding attention, and there are many examples of them holding attention rock solid for one to three minutes (I suspect you could effectively hold people even longer; over five minutes fairly easily). The way most boss fights are formatted automatically lends itself well to retention. Fighting a unique enemy in a unique area is refreshing, and the enemy being extra dangerous and more in control of the situation than the player (similar to how a chase scene is formatted) raises the stakes and intensity. Also similar to chase scenes, having a multi-stage boss helps raise the intensity even further, and helps re-engage people who may have started getting a little bit bored, since there’s now a new twist/threat/danger to change the combat dynamic. Essentially, if you have a good standard boss fight, it will be good for retention. This same principle applies pretty well to “mini-bosses”, which can easily hold people’s attention for at least thirty seconds. Periodic encounters with larger, more unique enemies that are difficult to avoid or kill can help shift the power dynamic from a boring, repetitive action sequence to something more akin to a boss fight.
- Parkour: Parkour is a less common feature, but I think partially because of that, it holds attention fairly well. Since it is a rare feature, I only have a handful of examples using it, but there are some action shooter games that also have parkour, and the parkour levels always have way better holds than the high-action shooter levels. I don’t have much in the way of specific stats, but I thought it was worth mentioning.
- Clear Definition of Goals: The key pattern here that ties together all the features that hold attention, is the clear definition of a goal, anticipation of something to come, and an increase in stakes. I’ve gone over this concept pretty thoroughly in this section, but I feel like reiterating it. Chase scenes, boss fights, and everything else on this list don’t have special individual reasons why they hold attention. It’s simply because these common game mechanics happen to naturally meet the standard for identifying, and completing goals. The features that harm retention naturally work against it. I guess I wanted to say this since you can create your own unique game mechanics and features that meet the retention requirements just as well if not better than the common ones I’ve listed. Don’t get stuck in one bubble of thinking. Know that the examples in this list are simply a guideline and a few examples to help you structure your game.
Features that Bump Retention:
- Jumpscares: Jumpscares are the most effective way to get people to rewatch part of a video, commonly creating anywhere from 2-25% bumps in retention. There are many factors involved in this wide range that I’ll go over. The main factor is the intensity of the jumpscare. Subtle jumpscares and stingers usually cause smaller bumps of 2-5%, while intense jumpscares usually cause 10-20% bumps (you can find lots of specific examples in the full retention spreadsheet). An interesting attribute of jumpscares is that individual jumpscares lose their novelty very quickly. If you reuse the same (or a very similar) jumpscare many times, you may see a 10% bump the first time, 5% the next, 2% after that, and past around three to four uses of the same jumpscare, no noticeable bump will occur. To keep all your jumpscares as effective as possible, use multiple different enemy characters, or have variations of each enemy’s jumpscare. By keeping every jumpscare different, and avoiding reusing the same jumpscare more than a few times, you can keep your bumps highly effective.
- Secrets: Interesting secrets, or hidden rooms can cause slight 2-5% bumps in retention. Most likely, these bumps come from people who have already watched gameplay, or played through your game, but are unfamiliar with the secret being shown. This causes them to rewatch and get a closer look. These bumps are pretty subtle, but useful nonetheless.
- Random Interesting Moments: Some games include very random, funny, or interesting moments that can cause 2-7% bumps, depending on how impactful the moment is. For example, there was a 2% bump when the player in one game started riding a hedgehog with a tiny gnome, or 7% when the camera randomly zoomed in on a fish that started talking to the player. There was another moment when a fish randomly fell out of the sky in front of the player. I figure you'd want to use things like this sparingly, but a few random moments are certainly useful.
- Trippy Stuff and Illusions: Features like this are rare but in the few examples I have, trippy stuff and optical illusions can cause slight 1-3% bumps in retention. When it comes to optical illusions, some people like to rewatch and try to better understand the illusion. Trippy, colorful, and psychedelic visual styles seem to be somewhat interesting to people, and while they don’t seem to cause specific holds, people aren’t turned off by it (at least as long as the style isn’t too obscure and is still readable).
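The jumpscare novelty drop-off described above can be sketched as a simple lookup. This is only an illustration: the multipliers are taken from the single 10% → 5% → 2% example, I'm assuming the fourth and later uses give no bump at all, and the names (`NOVELTY_SCHEDULE`, `estimate_bumps`) and jumpscare IDs are made up. Real numbers will vary with the intensity of each jumpscare.

```python
from collections import Counter

# Relative strength of the 1st, 2nd, and 3rd use of the same jumpscare,
# derived from the 10% -> 5% -> 2% example. Later uses are assumed to be 0.
NOVELTY_SCHEDULE = (1.0, 0.5, 0.2)


def estimate_bumps(jumpscares, first_use_bump=10.0):
    """Estimate the retention bump (in percent) for each jumpscare in order.

    jumpscares is a sequence of IDs; repeated IDs count as reuse of the
    same (or a very similar) jumpscare.
    """
    uses = Counter()
    estimates = []
    for scare_id in jumpscares:
        use_index = uses[scare_id]
        if use_index < len(NOVELTY_SCHEDULE):
            factor = NOVELTY_SCHEDULE[use_index]
        else:
            factor = 0.0
        estimates.append((scare_id, first_use_bump * factor))
        uses[scare_id] += 1
    return estimates


# Hypothetical playthrough: the same closet scare four times, then a new one.
plan = ["closet", "closet", "closet", "closet", "vent"]
for scare_id, bump in estimate_bumps(plan):
    print(f"{scare_id}: ~{bump:.0f}% bump")
```

Running a planned jumpscare order through something like this makes it easy to spot where a scare has been reused past the point of having any effect, and where a new variation would help.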
Nuanced Features, and Other Notes:
- Action Scenes: Having read about how great chase scenes are, one might expect action scenes to be great for retention. Unfortunately, this is not the case. First I must make a clear distinction between action scenes and chase scenes. A chase scene is when the player is in danger, but an action scene is when the enemies are in danger. The dividing line is who’s exerting more power and control, and who is in the most danger. Furthermore, when I’m talking about “action” scenes for this point, I’m very specifically talking about anything that is heavily combat focused, where the primary gameplay is the player mowing down lots of enemies (things like boss fights, or slower more strategic combat segments are a little different). Action scenes can cause slight 2-6% bumps, and hold attention for ten to forty-five seconds. After this initial bump and hold however, the action becomes repetitive, exhausting, and people start leaving very quickly. There’s no set number as to how large the dip can be, since as long as the action continues, retention will keep dropping until it hits zero. This can happen in anywhere from one to three minutes of action, depending on how interesting the gameplay, or how unique the action scene is. Keeping people engaged through action scenes is very difficult, so games that are purely action will be the most difficult genre to make go viral, and should be avoided. Ideally your action scenes will be shorter, and more spread apart, giving them more weight and importance. High action shooters like Doom are unlikely to retain viewers, but slower shooters like PUBG, or Fortnite do very well. Essentially, the only way to retain viewers through action scenes is to keep them short, and communicate a clear goal for them to work toward during combat (like driving to, and boarding a moving plane).
- Cutscenes: Cutscenes land in a complete gray area, being neither good nor bad. The outcome depends on how the cutscene is used. When it comes to intro cutscenes that play at the start of your game, get to the point fast, and give people a goal/conflict. If you start with ten seconds of random environment shots, people will skip or leave, resulting in 15% dips in some examples. An intro cutscene that clearly defines a goal can hold attention very well, and be a fine way to start a game. Cutscenes that interrupt the flow of the game, or make a very abrupt pacing shift, seem to be dangerous and can cause 6-20% dips. If you have an intense chase scene, then the player runs through a door and a cutscene plays where the player is safe and sound inside the room, followed by an expo-dump or slow environment shots, people will likely leave. Ultimately, cutscenes follow the same principles as gameplay. As long as you don’t have long boring shots, long expo-dumps, or apply any of the other negative features listed here to your cutscenes, you’ll be just fine.
- Mystery and Tension: Coaxing people with a mysterious story can be another effective way to hold people’s attention, although it is risky since this is typically done through dialogue and expo-dumps. But, since we’re smart epic people, there’s a way around this. Just make stuff mysterious without long expo-dumps (wow, very surprising answer and very crazy *O*). You can seed mysterious ideas and ominous intentions over the course of a few minutes through small environmental cues, or very brief conversations that follow the best practices outlined in the “Expo-Dumps” point. Mysterious characters and lore can be a nice way to add an additional layer of conflict, but probably won’t be great at holding viewers’ attention on their own. Combine this with other forms of conflict and goals for the best results.
- Subtitles and Voice Acting: No specific stats here, but this is something I want to mention along with the “Notes” and “Expo-Dumps” points. When it comes to subtitles, notes, and any other important on-screen text, make sure the text is large enough to read, doesn’t use a stylish wacky font, and is easily readable by the audience watching on their phones. In addition, exposition, notes, and any other applicable instructions should be read to the player and audience through voice acting, as opposed to forcing people to read through dialogue themselves. This just makes the game a little easier to digest, and reduces viewer and player frustration.
Not only does your game need to be formatted such that it can create one viral video, it must be formatted such that it can be the foundation of an infinite number of viral videos. Your game should tell a unique and emergent story every time that it is played, so that every content creator can create hit videos automatically and consistently by simply recording a video of your game. If the story of the game isn’t randomized, it will become repetitive, audiences will become bored, and engagement will drop, stopping the game from gaining views and cutting its longevity short. It should give creators enough breathing room and opportunity to edit the gameplay in such a way that they can tell a story specifically catered to their audience, while still structured enough that an amateur creator can still go viral based on the formatting of the game alone.