Emulating CSGO Client-Side Demos With Server-Side Demos

TL;DR CSGO demos enable players to record and replay games. However, players need to make a choice when recording demos. They can either accurately record only their own perspective using a client-side demo (aka a POV demo) or approximately record all players' perspectives using a server-side demo (aka a GOTV demo). In this post, I'll explain how to emulate a client-side demo using a server-side demo and a new feature in HLAE. (Thanks Dominik "dtugend" Tugend!)

Trade-Offs For Replaying A CSGO Match

CSGO demo files record a match's state. The match can be replayed by the game engine using this state. Replaying a match with the game engine, known as rerendering, enables new techniques for studying CSGO player behavior that aren't possible with a normal video recording. The below images demonstrate this flexibility. The left image shows a normal replay of a game. The right image is from a rerendering that highlights visible enemies with bright colors and hides all other information. This effect makes it simple to automatically calculate player behaviors like reaction time to enemies becoming visible.

Two replays of the same match. The left image is a normal replay. The right image is a demo rerendering with a custom effect that simplifies measuring reaction times.

There are two types of CSGO demos. The demo types are based on the types of match state recorded in the demos. A 5v5 CSGO match has 11 copies of the match state: one per player's client (their desktop) and one on the server. A client-side demo records one client's state. [1] The single client's state enables the client-side demo to accurately rerender one player's perspective. That player is known as the client's local player. A server-side demo records the server's state. The server's state enables the server-side demo to less accurately rerender any player's perspective. [2]

The goal of this post is to reproduce a player's perspective from the server-side demo with reasonable accuracy. I will do this using knowledge of CSGO's engine to emulate a client's state using the server's state stored in a server-side demo. The reproduction is implemented using HLAE's mirv_pov command. Thank you to HLAE developer Dominik "dtugend" Tugend for adding this new feature to HLAE.

Server State Versus Client State

In this section, I will explain the relationship between the server state and the client state. There are three constraints that guide this relationship.

Server Constraint: The server is the single source of truth. It computes the correct match state, such as players' positions and who won firefights. This is necessary so cheating players can't use their clients to lie about kills.
Local Player Constraint: The client must immediately respond to the local player's inputs.
Non-Local Players Constraint: The client must smoothly animate non-local players' movement, even when the network drops packets.

Please note that we don't require the server and client to agree. The client can show a player being shot or shooting an enemy, and the server can reject this event.

Server Constraint

Due to the server constraint, the server continuously computes the match state and updates the clients. The match state is the server state. A tick is a period of time during which the server computes a match state. The tick rate is the number of ticks per second. A server with a tick rate of 128 computes the match state 128 times per second. A client impacts the match state by sending commands (like move forward) that the server uses during state computation. The server sends updates to the clients to inform them how the commands impacted the game state. Just like tick rate, the number of updates sent from the server per second is the update rate and the number of commands sent from the client per second is the command rate.

The following diagram demonstrates this process of commands, state computations, and updates. The diagram's top axis indicates the time of each server tick. Tick 2 occurs after tick 1, so tick 2 is to the right of tick 1. After each tick, the server sends an update (a blue arrow). The diagram's bottom axis indicates the time when the client receives the update for that tick. As it receives an update, the client also sends a command to the server (a red arrow). All diagrams in this post assume that the server's tick rate, the server's update rate, and the client's command rate are 128; so 128 times per second (or every 7.8125 ms): (1) the server finishes computing a match state, (2) the server sends an update to the client (a blue arrow), and (3) the client sends a command to the server (a red arrow). The numbers for the top (server) axis and bottom (client) axis are offset by 5 ticks, or roughly 40 ms (39.0625 ms before rounding), because the network latency from the server to the client is 5 ticks. An update takes roughly 40 ms to travel from the server to the client. The round-trip time (RTT) is the time from the server to the client and back to the server: 10 ticks or roughly 80 ms.

The server sends updates to the clients, indicated by blue arrows. The client sends commands to the server, indicated by red arrows.
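The tick arithmetic used in the diagrams can be checked with a few lines of Python. This is only a sketch of the unit conversion; the function name ticks_to_ms is my own and doesn't come from CSGO or HLAE. It shows that the 40 ms and 80 ms figures in this post are rounded values.

```python
TICK_RATE = 128  # ticks per second, as assumed by every diagram in this post


def ticks_to_ms(ticks: float, tick_rate: int = TICK_RATE) -> float:
    """Convert a duration measured in ticks to milliseconds."""
    return ticks * 1000.0 / tick_rate


one_way_latency_ticks = 5                 # server -> client latency in the diagram
rtt_ticks = 2 * one_way_latency_ticks     # server -> client -> server

print(ticks_to_ms(1))                      # 7.8125 (length of one tick)
print(ticks_to_ms(one_way_latency_ticks))  # 39.0625 (rounded to 40 ms in the text)
print(ticks_to_ms(rtt_ticks))              # 78.125 (rounded to 80 ms in the text)
```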

The above process explains how the client and server communicate. However, the client state for rendering is more complicated than just the last received update. The following sections will explain why using the last received update would violate the other constraints.

Local Player's Constraint

Rendering the last received update will violate the local player constraint as it will cause an unacceptable amount of delay between a player's input and the client's response. The below diagram demonstrates this delay. It's the same diagram as above, except it only has the commands and updates that are relevant for this example. Let's assume the client renders at 256 frames per second. Since the update rate is 128, the client renders two frames per update received. If the player inputs a keypress on frame 4.5 (the frame in between receiving updates 4 and 5), the client will send the command to the server when it receives update 5. This command takes 5 ticks (40 ms or RTT/2) to reach the server. Let's assume the server incorporates this command as soon as it arrives. The command will impact the match state produced during server tick 15. It will take another 5 ticks (40 ms or RTT/2) for update 15, which incorporates the command, to reach the client. If the client rendered the last received update, it wouldn't respond to the input for 10 ticks (80 ms or RTT). That is far too much delay. [3]

The local player inputs a keypress on frame 4.5. The client sends a command for that input when it receives update 5. The client won't find out the impact of that command until it receives update 15. A 10 tick (80 ms) delay is too long. The client must respond immediately to user input, not after 80 ms.
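The delay in this example can be written as a tiny calculation. The sketch below only models the simplified timeline from the diagram (fixed latency, no buffering, the server applies a command as soon as it arrives). The function name is hypothetical.

```python
def first_update_with_command(send_update: int, one_way_ticks: int) -> int:
    """Return the number of the first update that reflects a command.

    The client sends the command when it receives update `send_update`.
    That update left the server `one_way_ticks` ago, and the command needs
    another `one_way_ticks` to arrive, so the server is one RTT
    (2 * one_way_ticks) further along when it applies the command.
    """
    return send_update + 2 * one_way_ticks


ONE_WAY_TICKS = 5  # server <-> client latency from the diagram
sent_on = 5        # the command goes out when update 5 arrives
seen_on = first_update_with_command(sent_on, ONE_WAY_TICKS)
delay_ticks = seen_on - sent_on

print(seen_on, delay_ticks)  # 15 10
```

Without prediction, the delay is always one RTT, no matter which update triggers the command.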

Non-Local Players' Constraint

Rendering the last update received will violate the non-local players constraint as it will prevent the client from smoothly animating non-local players. This is for two reasons. First, if the frame rate is higher than the server's tick rate, then the client will be unable to animate non-local players' positions on frames in between updates. Second, if the network drops an update, then the client will be unable to animate any of the non-local players until a new update arrives. The below diagram demonstrates this issue. Update 4 is dropped, so it's dark blue. When the client tries to render frame 4.5, it has no new information, so it can't smoothly animate non-local players' movement for frame 4.5. Even if the client had received update 4, the non-local players' positions for frames 4 and 4.5 would've been the same, preventing smooth animation between different positions.

If the network drops update 4, the client won't be able to smoothly animate non-local players' movement on frame 4.5.

Prediction and Interpolation Solve The Constraints

In order to solve the constraints, the client state is produced by predicting and interpolating updates describing server states. The client predicts the local player's position to solve the local player constraint. When a client sends a command, the client predicts the local player's position in the update describing the server state impacted by the command. This enables the client to view the impact of the command immediately. Since it will take RTT for the server to receive the command and the client to receive the corresponding update, the client predicts the local player's position RTT updates beyond the most recently received one.
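One common way to implement this kind of prediction is to replay the commands that the server hasn't acknowledged yet on top of the last received server state. The sketch below illustrates that general idea with a made-up one-dimensional movement model. The Command class and predict_position function are hypothetical and are not taken from CSGO's engine.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Command:
    tick: int    # server tick this command is expected to affect
    move: float  # hypothetical 1D displacement applied during that tick


def predict_position(acked_position: float, acked_tick: int,
                     pending: List[Command]) -> float:
    """Replay unacknowledged commands on top of the last server state."""
    position = acked_position
    for cmd in sorted(pending, key=lambda c: c.tick):
        if cmd.tick > acked_tick:  # skip commands the server already applied
            position += cmd.move
    return position


# The client last received update 5. It has sent one RTT's worth of commands
# (ticks 6 through 15) that no received update reflects yet.
pending = [Command(tick=t, move=1.0) for t in range(6, 16)]
print(predict_position(acked_position=100.0, acked_tick=5, pending=pending))  # 110.0
```

When a later update arrives, the client would drop the commands that update already covers and replay only the rest.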

The client interpolates between updates prior to the most recent one received to solve the non-local players constraint. Interpolating between two updates enables the client to smoothly animate frames in between updates. The client handles dropped updates by interpolating between two updates prior to the current one. The number of ticks between the current update and the prior ones is known as the interpolation period. If one update is dropped, it isn't a problem immediately as the non-local players' positions are based on prior updates. When it's time to use the dropped update, the client can just use an adjacent update for interpolation.
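As a minimal sketch of this interpolation, assume each update stores a single one-dimensional position for a non-local player. The data layout and function names below are my own simplifications. The code looks for the nearest received updates around the render time, so a dropped update is skipped automatically.

```python
from typing import Dict, Tuple


def bracketing_updates(received: Dict[int, float], t: float) -> Tuple[int, int]:
    """Find the closest received updates at or before t and after t.

    Raises ValueError if no received update exists on one side of t.
    """
    before = max(tick for tick in received if tick <= t)
    after = min(tick for tick in received if tick > t)
    return before, after


def interpolate_position(received: Dict[int, float], t: float) -> float:
    """Linearly interpolate a non-local player's position at render time t."""
    t0, t1 = bracketing_updates(received, t)
    fraction = (t - t0) / (t1 - t0)
    return received[t0] + fraction * (received[t1] - received[t0])


# Hypothetical positions keyed by update number. Update 1 was dropped.
received = {0: 0.0, 2: 4.0, 3: 6.0}
print(interpolate_position(received, 0.5))  # 1.0, between updates 0 and 2
print(interpolate_position(received, 2.5))  # 5.0, between updates 2 and 3
```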

Due to interpolation and prediction, the client state stores the local player's position from an update that is interpolation period + RTT ticks ahead of the update used for non-local players' positions. Since updates describe server states, this is the relationship between the client state and the server state.

The below image demonstrates this interpolation period + RTT offset for the client state of frame 4.5. The local player's position is predicted between updates 14 and 15, which is RTT beyond the client's last received update. [3] The non-local players' positions are computed by interpolating between updates 0 and 1. This is an interpolation period of 4 ticks before the last received update. If update 1 was dropped by the network, the client could interpolate between updates 0 and 2.

The prediction and interpolation of updates describing server states used to produce the client state for frame 4.5.
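The numbers in this example follow from two offsets relative to the frame being rendered. The sketch below only reproduces the arithmetic from the example, with a hypothetical function name, measuring frames in received updates (frame 4.5 is halfway between receiving updates 4 and 5).

```python
from typing import Tuple


def client_state_ticks(frame: float, rtt_ticks: float,
                       interp_ticks: float) -> Tuple[float, float]:
    """Return (local_tick, non_local_tick) for a client frame.

    local_tick: predicted RTT ticks beyond the last received update.
    non_local_tick: interpolated interp_ticks before the last received update.
    """
    local_tick = frame + rtt_ticks
    non_local_tick = frame - interp_ticks
    return local_tick, non_local_tick


local, non_local = client_state_ticks(frame=4.5, rtt_ticks=10, interp_ticks=4)
print(local)              # 14.5, between updates 14 and 15
print(non_local)          # 0.5, between updates 0 and 1
print(local - non_local)  # 14.0, interpolation period + RTT
```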

Emulating Client State Using Server State

Now that we understand the relationship between client and server states, I will explain how to emulate the client state in a client-side demo using the server state in a server-side demo. Rerendering a demo replays the recorded server states from tick 0 until the final tick. At each moment during the replay, there is a current demo server state for the current tick. The key idea is that we use server states prior to the current demo server state to emulate a client state's interpolation and prediction. Our emulation should be accurate if we ensure that the demo rerenders the local player's position using a server state that is interpolation period + RTT ticks after the interpolated server states used for the non-local players' positions. The relationship between these server states and the current demo server state is irrelevant. We only need to consider the relationship between the server states for the local and non-local players.

The HLAE command mirv_pov produces the desired relationship between local and non-local players. The command renders the local player's position by interpolating between server states interpolation period ticks before the current demo server state. This is necessary since the CSGO engine requires that all players' positions are interpolated between current and/or prior states during demo rerendering. The command renders the non-local players' positions by interpolating between server states 2 * interpolation period + RTT ticks before the current demo server state. RTT is determined using the local player's ping. The server records each player's ping during each tick and stores it in the demo file.

The result of the above interpolations is that mirv_pov renders the local player's position from server states that are 2 * interpolation period + RTT - interpolation period = interpolation period + RTT ticks after the server states for the non-local players' positions. Thus, we have achieved the desired relationship between the server states for local and non-local players.

The below image demonstrates how to rerender the client state for frame 4.5 during demo rerendering. The demo is currently in between server states 18 and 19. The local player's position is computed by interpolating the server states 14 and 15, interpolation period before the current demo server state. The non-local players' positions are computed by interpolating the server states 0 and 1, 2 * interpolation period + RTT before the current demo server state. We have reproduced the client state from the above example for frame 4.5 using only the server states from the server-side demo.

The interpolation of demo server states that emulates the client state for frame 4.5.
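The offsets described above can be sanity-checked with a short calculation. This sketch only models the arithmetic explained in this post. It isn't HLAE's implementation, and emulated_ticks is a hypothetical name.

```python
from typing import Tuple


def emulated_ticks(demo_tick: float, rtt_ticks: float,
                   interp_ticks: float) -> Tuple[float, float]:
    """Return (local_tick, non_local_tick) sampled during demo rerendering.

    Both values are at or before demo_tick, since the engine can only
    interpolate between current and/or prior states during rerendering.
    """
    local_tick = demo_tick - interp_ticks
    non_local_tick = demo_tick - (2 * interp_ticks + rtt_ticks)
    return local_tick, non_local_tick


RTT_TICKS = 10
INTERP_TICKS = 4

# The demo is halfway between server states 18 and 19.
local, non_local = emulated_ticks(18.5, RTT_TICKS, INTERP_TICKS)
print(local, non_local)  # 14.5 0.5

# The gap matches the client's interpolation period + RTT relationship.
print(local - non_local == INTERP_TICKS + RTT_TICKS)  # True
```

The absolute demo tick drops out of the difference, which is why only the gap between the local and non-local server states matters.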

Example Gameplay Footage Using mirv_pov

The below image demonstrates that mirv_pov emulates the client-side demo using the server-side demo. There are four components to the image. All components show the exact moment when an enemy is peeking around the corner of cat. I've highlighted the tip of the enemy's weapon with a red box in each component. (You may want to open the image in a new tab and zoom in, since only a few pixels of the weapon are visible.)

mirv_pov successfully performs the emulation since I can use it to reproduce the bottom right component's perspective. The bottom right component is taken from an OBS screen capture of a match. I used a screen capture as a theoretically perfect client-side demo, since I'm not confident in the current client-side demo implementation.

The left components show progress towards successful emulation. The top left component is the original, server-side demo. You can tell that the local player is closer to A site in the top left component than in the bottom right one, since there's more light under the P250 in the viewmodel. The bottom left component is the server-side demo with an adjustment for prediction by RTT but not interpolation by interpolation period. The result is closer to the correct answer, the bottom right component. The player is farther from A site and there is less light below the P250 in the viewmodel. However, it isn't perfect.

The top right component is the server-side demo with adjustments for prediction by RTT and interpolation by interpolation period. It's a near perfect reproduction of the screen capture. At least for this situation, mirv_pov correctly emulates the theoretically perfect client-side demo using the server-side demo.

Four different perspectives of the same moment. The top left is the server-side demo. The bottom left is the server-side demo adjusted for prediction. The top right is the server-side demo adjusted for prediction and interpolation. The bottom right is an OBS screen capture.

Request For Feedback

Dominik and I have made some very cool progress! You can try the changes out for yourself with the latest version of HLAE (2.123.0). It works for the above example and a couple other demos that I've tested using more quantitative techniques. I'll write about those techniques in a later blog post. You may need to adjust the mirv_cfg mirvPov interpOffset values for players who aren't using default interpolation period values. [4]

I suspect that the current implementation isn't perfect. For example, we just fixed a bug where the firing animations had a different interpolation period than the local player's view model. If you find issues with the approach, please reach out by emailing me at durst@stanford.edu.

Footnotes

[1] In the CSGO community, client-side demos are known as POV demos and server-side demos are known as GOTV demos.
[2] See Gamasutra's article on demos for a detailed discussion of the different types. There are more options than just the client state and server state ones in CSGO.
[3] I'm ignoring the complexity of buffering in this post. I won't deal with the small increase in latency resulting from clients waiting half a tick to send the command. A more realistic (and complicated) analysis would result in a latency of RTT + Tick Time / 2.
[4] cl_interp_ratio and cl_interp are two ways to set interpolation period. cl_interp sets it in terms of time (seconds) and cl_interp_ratio sets it in terms of ticks. CSGO takes the max interpolation period between these two values.
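The max rule for the interpolation period can be sketched as follows. This assumes that cl_interp_ratio is converted to seconds by dividing by the update rate (cl_updaterate), which is my understanding of the Source engine's behavior rather than something verified here. The function names and example values are hypothetical.

```python
def interpolation_period_seconds(cl_interp: float, cl_interp_ratio: float,
                                 cl_updaterate: float) -> float:
    """Interpolation period in seconds: the larger of the two settings.

    Assumption: cl_interp_ratio counts updates, so dividing by the update
    rate converts it to seconds.
    """
    return max(cl_interp, cl_interp_ratio / cl_updaterate)


def interpolation_period_ticks(cl_interp: float, cl_interp_ratio: float,
                               cl_updaterate: float, tick_rate: float) -> float:
    """Interpolation period expressed in server ticks."""
    seconds = interpolation_period_seconds(cl_interp, cl_interp_ratio, cl_updaterate)
    return seconds * tick_rate


# Example values only, using the 128 tick and update rates from this post.
print(interpolation_period_ticks(0.0, 2, 128, 128))      # 2.0, the ratio wins
print(interpolation_period_ticks(0.03125, 2, 128, 128))  # 4.0, cl_interp wins
```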

David Durst's Blog