Abstract

This dissertation presents optimizations that can be applied to the Entity Component System (ECS) pattern in game development to reduce CPU overhead and memory usage.
The areas of focus of this dissertation are data structures for component storage and approaches to iteration over groups of entities.
We also demonstrate simple benchmarks for gauging the strengths and weaknesses of different ECS approaches.
This dissertation concludes that Hashed Array Trees are an ideal component storage data structure, using less memory than both vectors and tightly packed arrays whilst still providing good access times.
The concept of subscriptions is also presented, offering a more efficient way of keeping track of entity compositions, and further optimizations to subscriptions are evaluated.
The optimizations presented offer up to a 400% increase in performance over a naïve ECS implementation.


Declaration

“I declare that this dissertation represents my own work except where otherwise stated”

Acknowledgement

I would like to thank my project supervisor Dr Rich Davison as well as Dr Willian Blewitt for the guidance and support provided throughout the course of this project.


Table of Contents

1 Introduction        5

1.1 Terminology        7

1.2 Aim and objectives        8

2 Background        9

2.1 Component Systems        9

2.2 Entity Component Systems        9

2.3 Existing ECS implementations        10

3 Implementation        12

3.1 Requirements        12

3.2 Overall Structure        12

3.4 Entities and the Entity Manager        12

3.5 Components and component storage        13

3.5.1 Components        13

3.5.2 Component storage        13

3.5.3 Component storage – vector        14

3.5.4 Component storage – tightly packed array        14

3.5.5 Component Storage - Hashed Array Tree        15

3.6 Iterating over entities        18

3.6.1 Naïve Iteration        18

3.6.2 Subscriptions – component listening        18

3.6.3 Subscriptions – delayed updates        18

3.6.4 Active-Passive Sets        19

3.6.5 Aspect vectors        20

4 Testing Methodology        21

4.1 Benchmarks        21

4.2 Data collection        21

5 Results        22

5.1 Entity Iteration        22

5.2 Component Store        25

5.2.1 Update times        25

5.2.2 Memory usage        27

5.3 Aspect Vectors        28

6 Conclusion        30

6.1 To design benchmarks for entity component systems.        30

6.2 To implement, profile and select component storage data structures.        30

6.3 To implement, profile and select Entity iteration schemes.        30

6.4 Further works        31

References        32


1 Introduction

Interactive video games are built from large numbers of game objects, each with differing behaviours. Traditionally, game objects have been represented by classes, using inheritance to reuse common code. For instance, a ‘Soldier’ game object may extend the ‘AI’ class, which in turn inherits from the ‘GameObject’ class.
However, this approach has several issues from both a software architecture standpoint and a game development standpoint.
Architecturally speaking, we run into the ‘diamond problem’, which arises when a class inherits from multiple superclasses that share a common base. For instance, suppose there is an ‘Enemy’ base class from which the ‘ArmouredEnemy’ and ‘MountedEnemy’ classes are derived, and a ‘MountedArmouredEnemy’ class which inherits behaviour from both of these subclasses. By inheriting from both, we end up with an ambiguity when calling an inherited function, as we do not know whether to call the version from ‘ArmouredEnemy’ or the one from ‘MountedEnemy’.
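As a minimal C++ illustration of this ambiguity, using the classes from the example above:

class Enemy {
public:
    virtual void update() {}
};

class ArmouredEnemy : public Enemy {
public:
    void update() override {}
};

class MountedEnemy : public Enemy {
public:
    void update() override {}
};

// Inherits two separate Enemy sub-objects and two overrides of update().
class MountedArmouredEnemy : public ArmouredEnemy, public MountedEnemy {};

void tick(MountedArmouredEnemy& e) {
    // e.update();               // error: ambiguous - ArmouredEnemy::update() or MountedEnemy::update()?
    e.ArmouredEnemy::update();   // the caller is forced to disambiguate explicitly
}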

From a game development standpoint, the standard class hierarchy also presents issues.
Since each class has to be written by an engineer, if a designer wants to introduce new GameObject types, they have to communicate the requirements to the engineers and await the completion of the specific feature. Going back to the previous example, to build the ‘MountedArmouredEnemy’, the engineer has to build a new class, even though all the required code is already present in the ‘MountedEnemy’ and ‘ArmouredEnemy’ classes.
This creates a lot of friction in the development pipeline and slows down prototyping. It has led to designers hacking missing features into games rather than having engineers implement them (Martin, 2007).

Furthermore, systemic game design has become far more popular in recent years, with games such as Far Cry and Zelda: Breath of the Wild being notable examples. Systemic design involves building systems that interact with each other organically, providing more interesting player experiences. For instance, there may be a puzzle where any metallic object can be placed down to complete a circuit.
Using a class hierarchy means having to create a ‘Metallic’ class to inherit from and using RTTI to check whether an object is of the metallic type. This further complicates the class hierarchy and makes it harder for designers to add or remove such properties from game objects.

The Entity Component System (ECS) pattern alleviates these issues and reduces the need for engineers in the design process by removing complex class hierarchies and favouring composition instead.
Rather than game objects being polymorphic classes extending existing behaviour, they are represented by a composition of self-contained components. For instance, our mounted enemy would be composed of an ‘Enemy’ and a ‘Mounted’ component. Adding the ‘Armoured’ component would turn it into an armoured, mounted enemy.
This allows the engineers to focus on each component individually and allows designers to modify the behaviour of game entities without needing to modify the underlying code, simply by selecting what components each entity should have.
The ECS pattern also lends itself rather well to systemic game design, as each piece of game code only focuses on a small subset of components, without having to worry about what other components are present or how a class hierarchy is structured.

Figure 1 - Entities individually simulated in Cities: Skylines

With the complexity of games ever increasing, and the rising popularity of open world games, it is common to have game worlds filled with many thousands of entities. It is therefore important to be able to manage large numbers of game objects in a scene efficiently, with relatively low CPU overhead and memory usage, and consequently important that Entity Component System implementations introduce little overhead of their own.

Figure 2 – Large number of entities in Supreme Commander 2

This dissertation explores optimizations that can be made to Entity Component Systems, in particular the data structures used to store components and the schemes used to iterate over entities. These implementation details are measured for run time and memory usage using benchmarks designed to emulate different ECS workloads.

1.1 Terminology

This dissertation follows the terminology and definitions outlined by the Entity Systems Wiki and Adam Martin’s T=Machine blog, and borrows terminology used by the Artemis Entity Component System.

1.2 Aim and objectives

The aim of the project:

“To implement and evaluate ECS optimisations”

To achieve this goal, the project was split into the following objectives:

  1. To design benchmarks for entity component systems.
  2. To implement, profile and select component storage data structures.
  3. To implement, profile and select entity iteration schemes.


2 Background

2.1 Component Systems

The Component System pattern, outlined by Scott Bilas in his 2002 GDC talk, was developed in order to reduce the need for engineers in the game development pipeline (Bilas, 2002). Rather than game objects being represented by large inheritance trees, each game object is instead a set of self-contained pieces of game logic called components. Examples of components include a rendering component or a physics component. Although the individual components still need to be built by engineers, game objects can easily be modified by game designers, who add or remove components in order to change their behaviour.


Component systems are widely used today in game engines such as Unity and Unreal.
Unity defines Game Objects as “containers for components”; “they do not accomplish much in themselves” (https://docs.unity3d.com/Manual/class-GameObject.html).
The complex behaviour of Game Objects in Unity comes from components querying for other components in their Game Object and acting accordingly. For instance, a Mesh Renderer component will query the Game Object for a Mesh Filter in order to render it.
Unity allows components to be added both in the Editor and at runtime using its scripting engine. This allows entities to change behaviour at runtime, and even allows multiple components of the same type to be added to the same Game Object.
All game behaviour in Unity is programmed through components, using its scripting engine. This gives both artists and engineers a common, flexible interface that allows for quick prototyping.

Unreal Engine’s Actor class works in much the same way as Unity’s Game Object class, also allowing for multiple components to be assigned to each individual actor.
In general, the words ‘Actor’, ‘Game Object’ and ‘Entity’ can be used interchangeably.

2.2 Entity Component Systems

The Entity Component System (ECS), sometimes referred to as just ‘Entity System’, is an evolution of the Component System that aims to separate data from logic, borrowing from the strategy pattern. The pattern introduces two new concepts: Entities and Systems.
Entities are essentially the game objects. Conceptually, they represent a set of components; in practice, however, they are simply an index shared between component arrays. This is done for performance reasons, as explored in the implementation section.
The ECS pattern also requires components to be data only, moving all logic to Systems instead. Systems are singleton objects that act on Entities. They will usually only act on the subset of entities that match a certain composition. This provides more flexibility than a Component System, as it allows behaviour to arise when an Entity has a particular set of components (Martin, 2007).
For instance, an entity with the ‘health’ component can take damage, and an entity with the ‘fire’ component will be rendered with flames engulfing it. Another system could be written to find all entities with both the ‘health’ and ‘fire’ components to gradually reduce their health.
In a Component System, this would have required the fire component to continually check for a ‘health’ component on top of doing all the flame rendering, but the ECS design allows for these concerns to be separated.

Garcia et al. (2014) describe how the ECS pattern can help with designing accessible games, outlining how attaching and detaching components from entities can allow the designers to enable or disable user interaction systems without needing to change any code at the engine level.
They present a system where entities are defined in external XML files to allow users to customize the experience.
Although the paper goes over the basic design of an ECS framework, it does not go into detail on how to implement such a system or give any thoughts on optimization.

The T=Machine blog (Martin, 2007, 2014) has a number of articles on the ECS pattern, describing its overall structure but also proposing some optimizations that can be made to component storage, and discussing cache optimization and memory usage. However, there is little information provided on the performance impact of each data structure, and no discussion of different iteration techniques.

Majerech (2015) presents an ECS framework designed to make use of multi-core architectures by allowing multithreading of systems. The paper focuses on a scheduling algorithm for ECS frameworks, as well as describing how game state can be represented as a ‘past’ and ‘future’ state to avoid data races. However, not much is said on the storage of components or the iteration of entities.

Vittorio Romeo (2016) presents ECST, a multi-threaded compile-time ECS library. Although the focus of the paper is on the multithreaded aspect of the implementation, it does include some thoughts on component storage. It explores the use of contiguous buffers in the form of arrays and vectors, as well as hash maps. However, there is little analysis of the chosen data structures.

Most of the work being done with Entity Component Systems focuses on the design benefits or on parallel ECS frameworks; there is seemingly little work exploring other aspects of entity component systems, such as component storage data structures or ways to efficiently iterate over entities.
Although Martin’s blog does go over some ideas for component storage, it only provides anecdotal evidence.
This dissertation contributes to this area by analysing the performance of various approaches to component storage and entity iteration.

2.3 Existing ECS implementations

A number of open source ECS implementations have been developed in recent years. Notable examples include EntityX and Artemis-odb, which have been sources of inspiration for this project.

Artemis-odb in particular, which has been used in a number of projects (https://github.com/junkdog/artemis-odb/wiki/Game-Gallery), has lent a lot of its design to this project.

Artemis-odb is a continuation of the Artemis project, written in Java.
A core concept behind Artemis is the Component Mapper. As the name suggests, Component Mappers map entities to components. There is a single Component Mapper per component type, used by all systems to retrieve an entity’s component of that type. Component Mappers are also component stores: rather than components being stored in entities, they are stored by the Component Mapper, which uses an array internally.
Because Artemis stores components in component stores rather than in the entity, the entity is just an integer index shared between component stores.
Artemis uses Aspects to match entity compositions. An Aspect contains three sets of components: components the entity must have, components the entity must not have, and components the entity must have at least one of in order to match. Internally, bitsets are used to represent these component sets; each component type is given a unique integer id, which is used when building up the bitset. Simple bitwise operators are then used to compare aspects against entity compositions.
Subscriptions keep track of what entities match a certain Aspect, and are used by systems to obtain the list of matching entities. Systems can then iterate over the returned entities.
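The listing below is a minimal sketch of how such a three-set match could be expressed with std::bitset. The names and the fixed component-type limit are illustrative and do not reflect Artemis-odb’s actual API; they only show the bitwise checks described above.

#include <bitset>

constexpr int MAX_COMPONENT_TYPES = 64;
using ComponentBits = std::bitset<MAX_COMPONENT_TYPES>;

// Illustrative aspect with the three component sets described above.
struct Aspect {
    ComponentBits all;   // the entity must have every one of these
    ComponentBits none;  // the entity must have none of these
    ComponentBits one;   // the entity must have at least one of these (if non-empty)

    bool matches(const ComponentBits& entity) const {
        return (entity & all) == all
            && (entity & none).none()
            && (one.none() || (entity & one).any());
    }
};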

EntityX is a C++ ECS implementation, which shares many similarities with Artemis-odb.
For instance, it also represents entities as a shared index amongst component stores; however, it additionally includes a version integer to invalidate Entity objects after an entity has been destroyed.
Rather than being accessed through component mappers, components are accessed directly using the entity object; however, they are still stored in a common data structure. Instead of arrays, EntityX uses semi-contiguous pools of components.
Rather than requiring Artemis-style subscriptions, EntityX allows entities matching an aspect to be iterated over in situ. This requires less boilerplate code but does involve some extra computation internally.

This project borrowed a lot of its design from Artemis-odb, including the component mappers and subscriptions.


3 Implementation

This section focuses on the chosen implementation of the ECS pattern.

3.1 Requirements

This section focuses on the requirements set when designing the ECS.

The C++ programming language was chosen for its high performance, template metaprogramming and low-level memory management capabilities. However, many of the concepts explored can also translate to other languages such as Java or C#.

3.2 Overall Structure

3.4 Entities and the Entity Manager

The entity is the simplest element of the ECS implementation, as it does not contain any data. Instead, it is only used as a handle to reference a set of components. For that reason, entities are represented as simple integers, which can easily be stored and passed between functions. As a result, the entity and its entity ID are one and the same.

The entity manager’s role is to hand out entity IDs when new entities are created. It must therefore keep track of existing entities and have a strategy for selecting new IDs.
We decided to use a counter together with an integer queue of available IDs. When an entity ID is requested, the queue is first checked for an available ID. If the queue is empty, the counter’s current value is returned as the new ID and the counter is incremented.
When removing an entity from the game world, the entity manager places its ID onto the queue, ready for reuse.

int nextID = 0;
queue<int> available;

// Hand out a new ID, reusing a previously freed one if available.
int create() {
    if (available.empty()) {
        return nextID++;
    }
    int id = available.front();
    available.pop();
    return id;
}

// Mark the entity's ID as available for reuse.
void destroy(int entity) {
    available.push(entity);
}

Figure 3 - simple entity manager pseudocode

Reusing entity IDs ensures we do not run out of IDs during long simulation sessions and allows allocated memory to be reused, reducing memory usage and allocations.

3.5 Components and component storage

3.5.1 Components

Components are represented by C++ structs. Component storage containers may reallocate memory when grown, which means components are not guaranteed to keep the same memory location throughout the lifetime of the entity. This places the restriction that components cannot hold references or pointers to other components, and must instead store the ID of the entity that owns the component they need.
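For example, a hypothetical ‘Follow’ component that tracks another entity would store that entity’s ID rather than a pointer to its transform component:

// Illustrative, data-only components (hypothetical names).
struct Transform {
    float x, y, z;
};

struct Follow {
    int target;      // ID of the entity being followed; never a Transform*,
                     // as the store may reallocate and move that component
    float distance;  // desired follow distance
};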

3.5.2 Component storage

Storing components in an efficient manner is key to ensuring both good performance and memory usage in an ECS. This section of the dissertation looks at how to store components efficiently to reduce both access times and memory usage.
For the purposes of this dissertation, we will be making the following assumptions:

  1. Systems only act on a small subset of components.
  2. Systems iterate over entities in order. (see section 3.6)
  3. Components are relatively small, often containing asset references or simple information like a transform matrix.
  4. Most components are sparsely distributed.
  5. Entities are created and destroyed throughout the simulation.
  6. Components are read more often than they are created or destroyed.

Given assumptions 1 and 2, we know that systems will be reading a small number of component types in order. For that reason, we choose to store components by type, which allows us to retrieve just the components we are interested in. It also means components should preferably be stored contiguously and in order, reducing cache misses.

As entities and components get created and destroyed throughout the simulation, we can assume that there will be holes in our component stores. For instance, we may assign components to Entities 1 through 100, but then later delete entities 20 through 30, which will create a 10-component wide hole in our data structure. This is considered when designing our data structures.
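As a sketch of what storing components by type might look like, the snippet below keeps one store per component type behind a type-indexed map. The names are illustrative, and the storage strategy hidden behind ComponentStore<T> is exactly what sections 3.5.3 to 3.5.5 compare (this is a simplified form of the component mappers described in section 2.3).

#include <memory>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>
#include <vector>

// Placeholder per-type store; vector, tightly packed array or HAT internally.
template <typename T>
struct ComponentStore {
    std::vector<T> components;
};

struct ComponentManager {
    std::unordered_map<std::type_index, std::shared_ptr<void>> stores;

    // Retrieve (and lazily create) the store for component type T.
    template <typename T>
    ComponentStore<T>& store() {
        auto& slot = stores[std::type_index(typeid(T))];
        if (!slot) {
            slot = std::make_shared<ComponentStore<T>>();
        }
        return *std::static_pointer_cast<ComponentStore<T>>(slot);
    }
};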

3.5.3 Component storage – vector

A vector, or dynamic array, is a natural choice for component storage. It is contiguous and naturally maps entity IDs to components: each component is simply placed into the array at the position corresponding to the entity ID it is assigned to.

vector<Type> components;

void addComponent(int entity, Type component) {
    // grow the vector on demand so the entity ID is always a valid index
    if (entity >= components.size()) {
        components.resize(entity + 1);
    }
    components[entity] = component;
}

Type& getComponent(int entity) {
    return components[entity];
}

void removeComponent(int entity) {
    // leave a hole: reset the slot to an empty/default component
    components[entity] = Type();
}

Figure 4 - vector storage pseudocode

Vectors grow by over-allocating a new, larger chunk of memory and copying the data to the new location before freeing the old chunk. The over-allocation and copying of memory are expected to impact performance somewhat, as both are costly operations.

This approach was expected to perform reasonably well, as it only requires a single pointer dereference.
Because entity IDs map directly to the array index, and not all entities will be given all components, we expect to see many holes form in the array. These holes are expected to cause cache misses when accessing the components sequentially.
Given assumption 4, vectors were expected to use a lot of memory, as they still allocate memory for all entities regardless of whether they have the component or not.

Vectors will also limit the maximum number of entities possible, as vectors have a maximum size.
Furthermore, growing the array requires reallocation and copying of memory, which becomes more expensive as the vector gets larger.

3.5.4 Component storage – tightly packed array

A solution to the memory usage issue would be to pack our components tightly.
This means that we now only need to allocate memory for entities that have the component. However, we now need to keep a table of offsets to each component, which introduces a new level of indirection.
In order to keep the elements sorted, we have to be able to insert components in the middle of the array, which requires shifting the rest of the components; this can be an expensive operation.

vector<Type> components;   // tightly packed, sorted by entity ID
vector<int> offsets;       // entity ID -> index into components, -1 if absent

void addComponent(int entity, Type component) {
    // find the nearest preceding entity that already has the component
    int pk;
    for (pk = entity; pk >= 0 && offsets[pk] == -1; pk--);
    int off = (pk == -1) ? 0 : offsets[pk] + 1;

    // insert in the middle of the array, shifting later components up
    components.insert(components.begin() + off, component);

    for (int i = entity + 1; i < offsets.size(); i++) {
        if (offsets[i] != -1) {
            offsets[i]++;
        }
    }

    offsets[entity] = off;
}

Type& getComponent(int entity) {
    return components[offsets[entity]];
}

void removeComponent(int entity) {
    components.erase(components.begin() + offsets[entity]);

    // shift the offsets of all later components down by one
    for (int i = entity + 1; i < offsets.size(); i++) {
        if (offsets[i] != -1) {
            offsets[i]--;
        }
    }

    offsets[entity] = -1;
}

Figure 5 - Tightly Packed Array pseudocode


Although this approach requires an additional pointer dereference, the components are stored perfectly contiguously, which should help when reading from the array due to the extra cache friendliness.

This approach also suffers from the vector’s maximum size, but rather than being tied to the maximum entity count, the limit is tied to the number of entities containing the component.
Growing the array also requires reallocation and copying of memory, but as the arrays are expected to be much smaller, this is less of an issue than with vectors.

3.5.5 Component Storage - Hashed Array Tree

This third approach aims to reach a good middle ground between insertion times, iteration times and memory usage. This semi-contiguous data structure is made up of multiple contiguous chunks of memory. Because the chunks are contiguous, we expect to get only a few more cache misses than when using the sparse vector storage.

vector<array<Type, chunk_size>*> chunks;
vector<int> live;   // number of live components per chunk

void addComponent(int entity, Type component) {
    int chunk_index = entity / chunk_size;
    int item_index  = entity % chunk_size;

    // allocate the chunk on first use
    if (chunks[chunk_index] == nullptr) {
        chunks[chunk_index] = new array<Type, chunk_size>();
    }
    (*chunks[chunk_index])[item_index] = component;
    live[chunk_index]++;
}

Type& getComponent(int entity) {
    int chunk_index = entity / chunk_size;
    int item_index  = entity % chunk_size;

    return (*chunks[chunk_index])[item_index];
}

void removeComponent(int entity) {
    int chunk_index = entity / chunk_size;

    // free the whole chunk once it holds no live components
    if (--live[chunk_index] == 0) {
        delete chunks[chunk_index];
        chunks[chunk_index] = nullptr;
    }
}

Figure 6 - Hashed Array Tree pseudocode


The idea behind this approach is that if a chunk is empty, it does not need to be allocated, potentially reducing memory usage. Given that entities are often created in groups, and that many components will be sparsely distributed, we expect this to happen often, depending on the chunk size.

It is also very cheap to grow, as each newly allocated chunk is relatively small and does not require copying of data; an advantage over both other implementations, which require copying when growing their storage.

This structure does not suffer from the maximum vector size issue, as each chunk is allocated separately.


The size of the chunks can be chosen depending on the component. If a component is expected to appear often, or if it will be assigned to successive entities, then large chunks can be allocated to improve cache prediction.
If, on the other hand, the components are expected to be sparsely distributed, then smaller chunks are used to reduce wasted memory. For very sparse components, a chunk size of 1 can be used. This completely removes any cache-prediction performance gains, but as it is only expected to be used for a small number of component types, this is a non-issue, and the potential memory reduction far outweighs the cost.
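One way to express this per-component choice is to make the chunk size a compile-time parameter of the store, as in the hypothetical sketch below; the component types and the sizes chosen are purely illustrative.

struct TransformComponent { float x, y, z; };   // hypothetical dense component
struct BossComponent { int phase; };            // hypothetical sparse component

// Hypothetical store interface: chunk size chosen per component type.
template <typename Type, int chunk_size>
class HATStore {
    // chunks, addComponent, getComponent, removeComponent as in Figure 6
};

// Dense, frequently iterated component: large chunks improve cache prediction.
HATStore<TransformComponent, 256> transforms;

// Very sparse component: a chunk size of 1 avoids wasted memory.
HATStore<BossComponent, 1> bosses;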

Chunk pooling

Memory allocation from the OS is a costly operation and should therefore be avoided when possible. Because components are created all the time, we end up having to allocate memory for newly required chunks. However, components are destroyed just as often, freeing up old chunks.
By pooling our chunks, we can reuse unused chunks when a new chunk is required.

This small optimization reduces the number of allocations needed for short-lived components at practically no cost. Although memory allocations are not frequent enough to damage average frame times, they can cause small hiccups as a frame waits for memory to be allocated; chunk pooling should help keep frame times steadier.
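A minimal sketch of such a pool, under the assumption that each component store owns its own pool of fixed-size chunks:

#include <array>
#include <vector>

// Minimal chunk pool: released chunks are kept on a free list and reused
// instead of being returned to the OS.
template <typename Type, int chunk_size>
class ChunkPool {
    std::vector<std::array<Type, chunk_size>*> free_chunks;

public:
    std::array<Type, chunk_size>* acquire() {
        if (!free_chunks.empty()) {
            auto* chunk = free_chunks.back();   // reuse a previously released chunk
            free_chunks.pop_back();
            return chunk;
        }
        return new std::array<Type, chunk_size>();  // only allocate when the pool is empty
    }

    void release(std::array<Type, chunk_size>* chunk) {
        free_chunks.push_back(chunk);           // keep it around for the next acquire()
    }

    ~ChunkPool() {
        for (auto* chunk : free_chunks) delete chunk;
    }
};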


3.6 Iterating over entities

Each system will iterate over entities that have a certain composition or aspect.

3.6.1 Naïve Iteration

A naïve approach would be to iterate over all entities and check their composition before acting on them. This approach means Systems have to check all entities, regardless of how many entities actually match the aspect. Entities also have to be checked every update, even if they haven’t changed.

for (int entity = 0; entity < entity_count; entity++) {
    if (entity matches aspect) {
        //perform logic on entity
    }
}

Figure 7 - Naive Iteration Pseudocode

3.6.2 Subscriptions – component listening

Rather than checking entities against an aspect every frame, the check can be performed every time a relevant component changes, and the result cached.
This dramatically reduces the number of times an entity is checked against an Aspect, improving performance as that is the most expensive part of an ECS.

Subscriptions can be implemented using the observer pattern: each subscription observes relevant components being inserted or removed and updates its entity set accordingly.

set<int> entities;

OnEntityChanged(int entity) {
    if (entity matches aspect) {
        entities.add(entity);
    } else {
        entities.remove(entity);
    }
}

for (int entity : entities) {
    //perform logic on entity
}

Figure 8 - Subscriptions pseudocode

3.6.3 Subscriptions – delayed updates

If multiple components are modified, a simple observer pattern will cause the subscription to update multiple times for the same entity. This situation happens quite often as components are often added in groups, and entities get many components added to them on creation.

Instead, we can keep track of which entities have been modified by placing them in a “dirty” set, and update all dirty entities at the end of the update loop (or after each system).

set<int> entities;
set<int> dirty_entities;

OnEntityChanged(int entity) {
    dirty_entities.add(entity);
}

update() {
    for (int entity : dirty_entities) {
        if (entity matches aspect) {
            entities.add(entity);
        } else {
            entities.remove(entity);
        }
    }
    dirty_entities.clear();   // clear the dirty set ready for the next update
}

for (int entity : entities) {
    //perform logic on entity
}

Figure 9 - Delayed Subscriptions pseudocode

This can cut the number of aspect checks several times over, further improving performance, but comes at the cost of having to call the update function.

3.6.4 Active-Passive Sets

The choice of data structure used to store the entities in subscriptions is quite important for both iteration speeds and insertion speeds. It may also be necessary to query the subscription for entities.

A vector allows us to iterate quickly; however, inserting, removing and finding entities is far too slow for real-time applications on large data sets.
Sets are faster at insertion, but are slower when iterating.
Bitsets perform best for insertion, removal and lookup, and on smaller ranges can even be faster at iteration. However, with a large range of entities, as in video games, iteration is again too slow.

A solution is to use an active and passive structure. We can use a bitset as the active structure to quickly insert, remove and find entities, and use a vector for iteration. At the end of each update, we can rebuild the vector from our active bitset.

The bitset represents a fast integer set: the bits at indices corresponding to entities in the set are set to one. This allows us to insert and remove much faster than with a simple integer set, but means rebuilding the vector every update.

bitset entities_active;         // one bit per entity ID
vector<int> entities_passive;

OnEntityChanged(int entity) {
    if (entity matches aspect) {
        entities_active[entity] = true;
    } else {
        entities_active[entity] = false;
    }
}

update() {
    // rebuild the passive vector from the active bitset
    entities_passive.clear();
    for (int i = 0; i < entity_count; i++) {
        if (entities_active[i]) {
            entities_passive.push_back(i);
        }
    }
}

for (int entity : entities_passive) {
    //perform logic on entity
}

Figure 10 - Active-Passive set pseudocode


This does mean having to iterate over our bitset either way, but it only has to be done if it was changed, and our initial assumption is that components are not modified often. The subscription may also be shared between systems, so iterating over our bitset once in order to build a vector for faster iteration wins out in performance.
We should expect insertion to perform in O(1) time and the update (rebuild) to perform in O(N) time, compared with the O(log N) insertion time of tree-based sets.
Therefore, inserting M elements into a tree-based set costs O(M log N) time, whereas our active-passive set costs O(M + N) time.

3.6.5 Aspect vectors

When checking whether an entity has a specific aspect, one method is to check each component manager to see if the entity has each of the required components. However, this requires many memory reads and a lot of comparisons. This is acceptable for smaller aspects but adds up for larger ones.
A potential solution to this problem is to use bitsets to represent the entity’s composition vector.
When comparing the entity to an aspect, only the bitsets need to be compared, which requires fewer comparisons and has better memory locality.

bitset getAspect(const vector<ComponentType>& types) {
    bitset aspect;
    for (ComponentType type : types) {
        aspect.set(type.id);   // one bit per required component type
    }
    return aspect;
}

bool compare(bitset entityBits, bitset aspect) {
    // match if the entity has at least every component in the aspect
    return (entityBits & aspect) == aspect;
}

Figure 11- Aspect Bitset pseudocode


4 Testing Methodology

4.1 Benchmarks

Benchmarks were developed to measure the performance of different implementation details.
Two synthetic benchmarks, Iteration and Add/Remove, measure the costs of individual ECS operations, whilst a particle systems benchmark gauges the overall performance of the ECS under a more typical workload.

4.2 Data collection

The benchmarks were run on an Intel Core i7-4710HQ CPU (2.5GHz base, boosting up to 3.3GHz; 256KB L1, 1.0MB L2 and 6.0MB L3 cache) with 16GB of 1600MHz DDR3 RAM, running 64-bit Windows 10 build 16299.371.

The average and maximum update times were measured using std::chrono’s high resolution timer. Memory usage was measured using Visual Studio’s memory profiler, which makes it easy to take memory snapshots.
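The sketch below shows how per-update timings of this kind can be gathered with std::chrono’s high resolution clock; the benchmark body and the reporting format are placeholders rather than the actual benchmark code.

#include <chrono>
#include <cstdio>

// Placeholder benchmark loop: time each update, track average and maximum.
void runBenchmark(int updates) {
    using clock = std::chrono::high_resolution_clock;
    double total_ms = 0.0, max_ms = 0.0;

    for (int i = 0; i < updates; i++) {
        auto start = clock::now();
        // world.update();  // placeholder for one ECS update of the benchmark scene
        auto end = clock::now();

        double ms = std::chrono::duration<double, std::milli>(end - start).count();
        total_ms += ms;
        if (ms > max_ms) max_ms = ms;
    }
    std::printf("average: %.3f ms, max: %.3f ms\n", total_ms / updates, max_ms);
}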

5 Results

5.1 Entity Iteration

 

The Iteration benchmark shows subscriptions performing 430% better than naïve iteration, and active-passive sets improving on this by a further 350%, with delayed subscriptions having little to no effect on iteration performance. This is an expected result, as iterating over a subscription is just iterating over a set once the initial setup is done. Active-passive sets use a vector for iteration, which explains their performance gain, while delayed subscriptions do not change the iteration method, which is why they did not affect iteration performance.

The Add/Remove benchmark, however, shows the extra cost involved in maintaining a subscription, but also how both delayed subscriptions and active-passive sets can greatly relieve some of that cost.
Delayed subscriptions only improve performance by 15% at 1000 operations but have a greater effect as the number of operations per update increases, reaching a 73% increase in performance at 10000 operations. This is because, at higher operation counts, entities are modified many times per update. With simple subscriptions, this would cause the subscription to be updated multiple times per update; delayed subscriptions ensure the entity is only processed once per update.
The active-passive set increases performance by a further 65% at 1000 operations, and 35% at 10000 operations.

The iteration results suggest subscriptions should perform better than naïve iteration. The particle systems benchmark supports this up to a point: up until 30000 entities, subscriptions do perform better than naïve iteration; however, past 30000 entities, the cost of maintaining the subscription outweighs the gain in iteration performance and we see the cost of subscriptions increasing exponentially. This is likely due to the cost of inserting into the set increasing as the set gets larger.

However, the particle benchmark supports both synthetic benchmarks in that delayed subscriptions greatly improve performance, and shows delayed subscriptions consistently outperforming naïve iteration. This improvement over simple subscriptions comes from fewer operations being performed on the larger ‘entities’ set, with the frequent insertions instead going to the much smaller ‘dirty’ set.
The particle benchmark also supports the other benchmarks in that active-passive sets improve the performance of the ECS.
Whilst simple subscriptions seemed to follow an exponential curve, active-passive set subscriptions appear to scale linearly, as well as reducing the overall cost significantly, by up to 400% in our benchmarks at 40000 entities.
However, the benefit of combining them with delayed subscriptions is unclear.
Although delayed subscriptions are an improvement over simple subscriptions, bringing similar gains to active-passive sets, the fact that they do not improve performance when active-passive sets are also used suggests that the expensive part of subscriptions is not checking an entity’s aspect, but rather adding that entity to a set. Delayed subscriptions reduce the number of checks, but also the number of insertion operations. However, inserting into the active-passive set is very cheap, so the extra cost introduced by delayed subscriptions may not be worth it.
More benchmarks with higher entity counts are likely required to ascertain whether applying delayed subscriptions to active-passive sets makes a difference.

The benchmark results show that delayed subscriptions combined with active-passive sets perform best for high performance game applications, regardless of entity count.


5.2 Component Store

5.2.1 Update times

The Iteration benchmark shows tightly packed arrays have the best iteration performance, closely followed by vectors, with Hashed Array Trees in third. However, all three data structures scale similarly with entity count, and the difference between them is within 20%.
Hashed Array Trees introduce an extra level of indirection, which explains why they perform worse than vectors. However, tightly packed arrays also introduce an extra level of indirection yet perform better than vectors. This can be attributed to their better cache friendliness, as the data is perfectly contiguous, with no gaps.

The Add/Remove benchmark clearly shows tightly packed arrays are an unsuitable data structure for high performance applications, performing up to 30 times slower than the other data structures. This is supported by the particle systems benchmark, where they perform 20 times slower than the other data structures despite their better iteration performance. This is due to the shifting of components on insertion, which requires large amounts of copying; a costly operation.
The Add/Remove benchmark also indicated that vectors may perform slightly better than Hashed Array Trees. This is once again reflected in the particle systems benchmark, with vectors yielding 25% faster update times than Hashed Array Trees. This is due to the extra level of indirection Hashed Array Trees introduce.

5.2.2 Memory usage

The synthetic memory usage benchmark demonstrates the issue with vectors: they have much higher memory usage overall, and the over-allocation is clearly visible. Although vectors do show good memory usage at certain entity counts (just before the array is grown), most of the time they use more memory than the other solutions.
The synthetic benchmark shows tightly packed arrays having the best memory usage; however, this is not supported by the particle systems benchmark, where Hashed Array Trees use up to 20% less memory. This is likely because the synthetic benchmark represents the Hashed Array Tree’s worst case, with components distributed randomly. The particle systems benchmark represents a more realistic load, which is more favourable to Hashed Array Trees: entities end up being clustered according to their components, creating larger gaps in the component stores for which Hashed Array Trees do not need to allocate memory.

The particle systems benchmark indicates that Hashed Array Trees are the ideal component storage data structure, using up to 40% less memory.

5.3 Aspect Vectors

When not using aspect vectors, the cost of checking an entity’s aspect is proportional to the size of the aspect being checked. However, aspect vectors show the opposite trend, greatly reducing the time taken to compare aspects as aspects grow in complexity. This is because the bitwise check can fail early as soon as a component is not matched; since complex aspects contain more components, the check fails early more often.
Aspect vectors outperform the per-component check for all aspect sizes except an aspect size of 1. This, however, does not matter much in real-world applications, as most systems will be interested in aspects with at least two components.

The particle benchmark shows a small average performance increase of 4%, which indicates that aspect checks are far from the most expensive operation. However, this small performance increase comes at little other cost, and so aspect vectors are worth implementing in any ECS.


6 Conclusion

The project was split into three objectives.

6.1 To design benchmarks for entity component systems.
The benchmarks designed succeeded in demonstrating the differences between the chosen implementation details, as well as outlining the overall performance of the ECS.

This allowed us to profile and select different iteration schemes and data structures, as well as identify when each implementation detail may be used most effectively.

However, more situations could have been benchmarked, as well as more data points tracked, such as maximum frame time, to better capture the real-world performance of the ECS.

More advanced measurements to capture low level CPU performance could also have provided insight into the performance numbers produced by the benchmarks. These could include measurements such as cache misses, or memory latency.

6.2 To implement, profile and select component storage data structures.

Data structures were one of the two key aspects of this project, and greatly impacted the overall performance of the ECS. Three main data structures were implemented, and the benchmarks allowed them to be profiled and the best overall performing data structure selected, as well as situations to be identified where another data structure may be preferable.

Vectors were found to be the best performing data structure in terms of CPU time, featuring the lowest insertion costs and performing very close to tightly packed arrays for sequential reads.
However, due to their higher memory usage, they were determined to be unsuitable for most game applications.
Hashed Array Trees used significantly less memory without introducing much cost in insertion and iteration, and are therefore better suited to real-time games applications.

 

6.3 To implement, profile and select Entity iteration schemes.
Entity iteration was the other key aspect of this project, and it too had a considerable impact on performance. Two major iteration schemes were selected, naïve iteration and subscriptions, with the latter implemented with two extensions: active-passive sets and delayed subscriptions.

We found that subscriptions could dramatically improve the performance of our ECS framework compared to naïve iteration, especially when active-passive sets were used. We also found that delayed subscriptions could help a lot when using simple sets, but their benefit was not as clear when active-passive sets were also used.
However, because delayed subscriptions can help with frame-time stability, they are still a good optimisation to use.

The project aim was: “To implement and evaluate ECS optimisations”
This was successfully achieved, as we have outlined several optimizations that can be made to entity component systems and evaluated when each should be considered.
Overall, the optimizations managed to improve the performance of a Naïve implementation by up to 400% and reduced its memory usage considerably.

6.4 Further works

There are still many optimizations that can be made to the ECS pattern.
Some of these include:


References

Adam Martin (2007) Entity Systems are the future of MMOG development – Part 2: What is an Entity System?, Available at: http://t-machine.org/index.php/2007/11/11/entity-systems-are-the-future-of-mmog-development-part-2 (Accessed: 23rd April 2018).

Adam Martin (2014) Data Structures for Entity Systems: Contiguous memory, Available at: http://t-machine.org/index.php/2014/03/08/data-structures-for-entity-systems-contiguous-memory  (Accessed: 22nd April 2018).

Entity Systems Wiki (2014), Available at: http://entity-systems.wikidot.com (Accessed: 22nd April 2018).

Scott Bilas (2002) A Data-Driven Game Object System [GDC Talk Slides], Available at: http://gamedevs.org/uploads/data-driven-game-object-system.pdf (Accessed: 22nd April 2018).

Ferdinand Majerech (2015) A concurrent component-based entity architecture for game development, Available at: http://ics.upjs.sk/~krajci/skola/ine/SVK/pdf/Majerech.pdf (Accessed: 22nd April 2018).

Adrian Papari et al (2018) Artemis-odb [Github Repository], Available at: https://github.com/junkdog/artemis-odb (Accessed: 22nd April 2018).

Adrian Papari (2016) entity-system-benchmarks [Github Repository], Available at: https://github.com/junkdog/entity-system-benchmarks (Accessed: 22nd April 2018)

Colossal Order (2015) Cities: Skylines – Dev Diary 7: Simulation, Available at: https://forum.paradoxplaza.com/forum/index.php?threads/cities-skylines-dev-diary-7-simulation.828492/ (Accessed: 28th April 2018).

Danielsson M., and Bohlin G.P. (2015) A High Performance Data-Driven, Entity-Component Framework For Game Engines With Focus on Data-Oriented Design, Available at: https://autious.net/files/ECS.pdf (Accessed: 1st May 2018)

Garcia, F., and Almeida Neris, V. (2014) A Data-Driven Entity-Component Approach to Develop Universally Accessible Games. In Universal Access in Human-Computer Interaction. Universal Access to Information and Knowledge. 8th International Conference, UAHCI 2014 Proceedings, Part II.

Vittorio Romeo (2016) Analysis of entity encoding techniques, design and implementation of a Multithreaded compile-time Entity-Component-System c++14 library. Available at: https://github.com/SuperV1234/bcs_thesis/blob/master/final/rev1/web_version.pdf (Accessed: 28th April 2018)

Dennis Wiebusch and Marc Erich Latoschik (2015) Decoupling the entity-component-system pattern using semantic traits for reusable realtime interactive systems. Available at: https://ieeexplore.ieee.org/document/7854098/ (Accessed: 3rd May 2018)