How browsers work
Domenico De Felice
DUBLIN, AUGUST 2015
Why?
Web browsers are likely the most widely used software in the world.
They present web resources and create a sandboxed environment in which web applications can run.
The way the browser achieves this is very complex and dictated by many different standards.
Some mechanisms are deceptive and/or counter-intuitive.
Understanding how browsers work gives us useful insights for improving the efficiency of our websites and web applications, and the general structure of our code.
Complexity
Of course it is not possible to explore all the details of how a browser works in a single presentation.
Each browser is implemented in its own way.
General and modular approach
Browsers follow the same standards (IE?)
Most of them share the same overall structure and the same modules/phases.
We will dwell on these shared behaviours and this will give us some insights that we can apply in our everyday work as web developers.
Browsers have two main modules: the rendering engine and the JavaScript interpreter.
It is easy to understand what the job of the JavaScript interpreter is.
The rendering engine is where a great deal of the process takes place.
Mozilla Firefox uses Gecko, made by Mozilla.
Safari uses WebKit. Google Chrome used WebKit until version 27 and has used Blink since version 28.
Components of a web page:
What is the job of the rendering engine?
Starting from these files (HTML, CSS, JS, etc.), render the web page on the user’s screen.
This is done roughly in four phases: parsing the HTML and the CSS, building the render tree, layout, and paint.
Optimizing this critical path will enable us to display the content to the user as soon as possible.
The important point here is that these phases are not strictly sequential.
The engine will display content to the user as soon as possible.
Phase 2 won’t wait for phase 1 to complete before starting. It will start consuming the output of phase 1 as soon as it is available, and so will the other phases.
Since browsers try to show the content as soon as possible, in some situations content may be displayed before the style has been loaded: this is known as the flash of unstyled content (FOUC) problem.
An example straight from Wikipedia:
PHASE 1: Parse the HTML and the CSS
In this phase, the engine will start the parsing of the HTML.
The output will be a tree called the Document Object Model (DOM) or content tree, where each HTML tag is represented by a node of the tree (a DOM node).
The mapping between HTML tags and DOM nodes is almost, but not exactly, one-to-one.
The root element of the tree is the <html> element.
The purpose of the DOM is at least dual:
Parsers
Parsers are heavily used in computer science.
They are software components that take input data, usually text, and build a data structure to be used by the software.
Most of the time the input data doesn’t change during the parsing process: it is static.
The HTML parser is an exception. It is a reentrant parser, meaning that its input can change dynamically while it is being parsed.
This makes this phase more complex and will lead us to our first real insight. Let’s see why.
Why is HTML parsing reentrant?
Because:
The HTML parsing is synchronous
When the parser reaches a <script> tag:
Since <script>s block the parsing, moving them to the bottom of our web page (or just after the above-the-fold content) makes the page render faster.
defer and async
Two attributes of the <script> tag can also help when loading external scripts.
Marking a script with the defer attribute will make the script run only when the DOM has been completely built.
Source: caniuse.com
The async attribute will make the script run as soon as it is available, without blocking the parsing while it is being fetched.
Source: caniuse.com
Curiosity
In both Gecko and WebKit, when the engine is blocked fetching and executing a script, a second thread scans ahead in the document looking for external resources to load: it won’t modify the DOM, it just starts fetching those resources.
What about CSS?
JavaScript blocks the parsing because it could modify the document.
CSS can’t modify the structure of the document, so it would seem we have no reason to block the parsing.
CSS is blocking
CSS itself is a blocking resource, for two reasons.
1- Scripts
JavaScript could ask for style information that has not been parsed yet.
Different browsers behave differently, but roughly:
Mozilla Firefox holds script execution while there are still stylesheets being fetched and parsed.
WebKit pauses scripts only when they try to access style properties that may be affected by stylesheets that still need to be fetched or parsed.
This means that an external <script> in the head of the page, on Firefox:
As before, this can be avoided by moving scripts as close to the bottom as possible.
2- Rendering
The browser will hold the rendering process until the whole CSSOM has been built.
This can be mitigated by:
Network
Is there something we need to know about how the files are fetched?
HTTP/1.1
RFC 2616, released in 1999, defines version 1.1 of the HTTP protocol.
From section 8.1.4, Practical Considerations:
«A single-user client SHOULD NOT maintain more than 2 connections with any server or proxy.»
Imagine having these external files in the head of your web page:
The download of my_style3.css won’t start until the download of my_style1.css has completed.
The limits are per domain. So if two files were on www.example.com and the other two on assets.example.com, all the files could be downloaded at the same time.
It is important to make few HTTP requests during page load and to distribute the files across different domains.
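To make the effect of the per-host limit concrete, here is a minimal sketch that simulates when each download would start. Only my_style1.css, my_style3.css and the two host names come from the slides; the other file names, the durations and the scheduling policy are made up for illustration, and real browsers schedule requests in more sophisticated ways.

```javascript
// Simulate download start times under a per-host connection limit.
// Durations and the "first free connection" policy are illustrative assumptions.
function scheduleDownloads(files, maxPerHost) {
  const slotsByHost = new Map(); // host -> time at which each connection becomes free
  return files.map(({ name, host, duration }) => {
    if (!slotsByHost.has(host)) {
      slotsByHost.set(host, new Array(maxPerHost).fill(0));
    }
    const slots = slotsByHost.get(host);
    // Pick the connection that becomes free first.
    const slot = slots.indexOf(Math.min(...slots));
    const start = slots[slot];
    slots[slot] = start + duration;
    return { name, start, end: start + duration };
  });
}

// All four files on the same host, at most 2 connections.
const sameHost = scheduleDownloads([
  { name: "my_style1.css", host: "www.example.com", duration: 100 },
  { name: "my_style2.css", host: "www.example.com", duration: 150 },
  { name: "my_style3.css", host: "www.example.com", duration: 100 },
  { name: "my_style4.css", host: "www.example.com", duration: 100 },
], 2);

// The same files split across two hosts.
const twoHosts = scheduleDownloads([
  { name: "my_style1.css", host: "www.example.com", duration: 100 },
  { name: "my_style2.css", host: "www.example.com", duration: 150 },
  { name: "my_style3.css", host: "assets.example.com", duration: 100 },
  { name: "my_style4.css", host: "assets.example.com", duration: 100 },
], 2);

console.log(sameHost.map((f) => `${f.name} starts at ${f.start}ms`));
console.log(twoHosts.map((f) => `${f.name} starts at ${f.start}ms`));
```

In the single-host run, my_style3.css has to wait for the first connection to become free; in the two-host run, every file starts immediately.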
Anyway:
This is the connections-per-hostname limit today, according to browserscope.org:
So it is higher, but it is nevertheless good practice to reduce the number of HTTP requests to load the web page faster.
CSS
All the CSS of a page is parsed into a tree called the CSS Object Model (CSSOM).
Why is the CSS represented by a tree?
The answer is in the name itself: Cascading Style Sheets.
Cascading
Every element can be matched by many CSS rules
How can the browser choose which property to apply to the element?
Ordering them by:
CSS origin
Weight
Each property can have a normal weight or an !important weight.
Specificity
Note
The weight (!important) was introduced to let users override page styles. It was not intended for developers’ use!
Specificity
The specificity of a rule is represented by a group of four numbers (a, b, c, d).
Two groups are compared position by position from left to right, as if (a, b, c, d) were the digits of a single number.
HTML
<div id="sidebar">
  <div id="widget" class="class-div">
    <span class="span-class" style="color: red">Hello, world!</span>
  </div>
</div>
CSS
.span-class {            /* Specificity (0, 0, 1, 0) */
  color: green;
}
#sidebar #widget {       /* Specificity (0, 2, 0, 0) */
  color: orange;
}
#sidebar {               /* Specificity (0, 1, 0, 0) */
  color: yellow;
}
#sidebar .class-div {    /* Specificity (0, 1, 1, 0) */
  color: blue;
}
The inline rule will have a specificity of (1, 0, 0, 0).
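One way to picture this comparison in code is the minimal sketch below, which compares specificity groups position by position and ranks the selectors of the example. The helper name compareSpecificity is made up for illustration (browsers expose no such function), and the ranking looks at specificity only, ignoring which element each selector actually matches.

```javascript
// Compare two specificity groups (a, b, c, d) from left to right.
// Returns a positive number if `first` wins, a negative one if `second` wins, 0 on a tie.
function compareSpecificity(first, second) {
  for (let i = 0; i < 4; i++) {
    if (first[i] !== second[i]) {
      return first[i] - second[i];
    }
  }
  return 0;
}

// Selectors from the example, with the specificities given in the slides.
const declarations = [
  { selector: ".span-class", specificity: [0, 0, 1, 0], color: "green" },
  { selector: "#sidebar #widget", specificity: [0, 2, 0, 0], color: "orange" },
  { selector: "#sidebar", specificity: [0, 1, 0, 0], color: "yellow" },
  { selector: "#sidebar .class-div", specificity: [0, 1, 1, 0], color: "blue" },
  { selector: "(inline style)", specificity: [1, 0, 0, 0], color: "red" },
];

// Sort from least to most specific; the last entry has the highest specificity.
const sorted = [...declarations].sort((x, y) =>
  compareSpecificity(x.specificity, y.specificity)
);
console.log(sorted.map((d) => d.selector));
```

Note that a single ID outranks any number of classes: (0, 1, 0, 0) beats (0, 0, 12, 0) because the comparison stops at the first position that differs.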
When we find ourselves needing !important to get our style applied, we usually just need to increase the specificity of our rule, for example by adding a class to the selector.
Cascading summary
The rules that apply to an element are sorted by origin, weight and specificity.
The specificity is taken into consideration only if the rules have the same origin and weight.
If the specificity is also the same, the rule that was defined last is applied.
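The whole ordering can be sketched as a single comparator. This is a simplified model: it assumes the CSS 2.1 precedence of origins (user agent, then user, then author for normal declarations, with author and user swapped for !important ones), it ignores !important in user agent stylesheets, and the field names (origin, important, specificity, order) are hypothetical.

```javascript
// Rank by origin and weight, lowest precedence first (simplified CSS 2.1 cascade).
function rank({ origin, important }) {
  if (origin === "user-agent") return 0; // simplification: !important is ignored here
  if (origin === "user") return important ? 4 : 1;
  return important ? 3 : 2; // author
}

function compareDeclarations(x, y) {
  // 1. Origin and weight.
  if (rank(x) !== rank(y)) return rank(x) - rank(y);
  // 2. Specificity, compared position by position.
  for (let i = 0; i < 4; i++) {
    if (x.specificity[i] !== y.specificity[i]) {
      return x.specificity[i] - y.specificity[i];
    }
  }
  // 3. Source order: the declaration defined last wins.
  return x.order - y.order;
}

function winningDeclaration(declarations) {
  return [...declarations].sort(compareDeclarations).pop();
}

const winner = winningDeclaration([
  { origin: "author", important: false, specificity: [0, 1, 0, 0], order: 0, value: "yellow" },
  { origin: "author", important: true, specificity: [0, 0, 1, 0], order: 1, value: "green" },
  { origin: "author", important: false, specificity: [1, 0, 0, 0], order: 2, value: "red" },
  { origin: "user-agent", important: false, specificity: [0, 0, 0, 1], order: 3, value: "black" },
]);
console.log(winner.value);
```

Here the !important declaration wins even against the inline style, because weight is checked before specificity.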
The CSSOM is not the only data structure built from the CSS
The CSS rule matching can be a heavy job.
Each rule is added to one of several hash tables, according to the most specific part of its rightmost selector.
There are hash tables for IDs, class names and tag names, plus a general hash table for rules that don’t fit in any of the others.
When the browser needs to find the rules that apply to an element, it doesn’t need to look at every rule, only at the relevant hash table entries.
This leads us to another important insight: the browser always matches rules starting from their rightmost selector, which for this reason is also called the key selector.
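A rough sketch of the idea, not of any engine’s actual data structures: rules are bucketed by their rightmost selector, so finding candidates for an element only touches a handful of buckets. The parsing is deliberately naive (whitespace-separated selectors only, no attribute selectors or pseudo-classes), and all function names are invented for the example.

```javascript
// The rightmost part of a selector such as "#container2 a" is "a".
function keySelector(selector) {
  const parts = selector.trim().split(/\s+/);
  return parts[parts.length - 1];
}

// Bucket each rule by the most specific part of its key selector: id, then class, then tag.
function buildRuleIndex(selectors) {
  const index = { ids: new Map(), classes: new Map(), tags: new Map(), general: [] };
  const add = (map, key, selector) => {
    if (!map.has(key)) map.set(key, []);
    map.get(key).push(selector);
  };
  for (const selector of selectors) {
    const key = keySelector(selector);
    const id = key.match(/#([\w-]+)/);
    const cls = key.match(/\.([\w-]+)/);
    const tag = key.match(/^[a-zA-Z][\w-]*/);
    if (id) add(index.ids, id[1], selector);
    else if (cls) add(index.classes, cls[1], selector);
    else if (tag) add(index.tags, tag[0].toLowerCase(), selector);
    else index.general.push(selector);
  }
  return index;
}

// Candidate rules for an element: only the buckets for its id, classes and tag.
function candidateRules(index, element) {
  return [
    ...(index.ids.get(element.id) || []),
    ...element.classes.flatMap((name) => index.classes.get(name) || []),
    ...(index.tags.get(element.tag) || []),
    ...index.general,
  ];
}

const index = buildRuleIndex(["#container2 a", "#container2 .a-class", "#sidebar", "*"]);
const link = { tag: "a", id: "", classes: ["a-class"] };
console.log(candidateRules(index, link));
```

The candidates are only a shortlist: each one still has to be checked against the element’s ancestors, which is where the right-to-left walk up the DOM comes in.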
<div id="container1">
  … thousands of <a> elements here …
  <a> … </a>
  … thousands of <a> elements here …
</div>
<div id="container2">
  <a class="a-class">...</a>
</div>
Let’s say we want to select only the <a> in the container2.
#container2 a {
…
}
This could be quite expensive. Reading the rule left to right, it says: take the #container2 subtree and find all the <a>s.
But the browser reads it right to left, so it says: take all the <a>s and, for each one, go up the DOM until you find a #container2.
We can write a more efficient rule:
#container2 .a-class { … }
or, if no other elements have that class, just
.a-class { … }
This can also be leveraged from JavaScript / jQuery.
Instead of selecting an element with
$('#container .class-name');
we can achieve better performance by using:
$('.class-name', '#container');
In the first case, all the .class-name elements of the page are grabbed, and then the DOM is traversed upwards looking for the #container element.
In the second case, the #container element is grabbed (constant time), and then .class-name elements are looked for only in its subtree.
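When all the HTML has been parsed, the document is marked as interactive (document.readyState becomes "interactive"). Its state is set to complete only later, once sub-resources such as images and stylesheets have finished loading.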
2) DOM+CSSOM → Render tree
The nodes from the DOM and the CSSOM trees will be combined together in a new tree, the render tree.
Its structure is similar to the DOM tree but only visual elements will be there.
Also, some DOM elements could be represented by more than one node in the render tree (like multi-line text nodes).
It is a tree of the visual elements in the same order in which they will be rendered.
Each node of the render tree stores the CSS properties that apply to it.
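As a rough sketch of this step, the following builds a render tree from a simplified DOM in which each node already carries its computed style. The node shape ({ tag, style, children }) is invented for the example, and it models only one reason for a node to be non-visual: display: none.

```javascript
// Build a render tree from a simplified DOM node that carries its computed style.
function buildRenderTree(domNode) {
  if (domNode.style.display === "none") {
    return null; // not visual: the node and its whole subtree are skipped
  }
  return {
    tag: domNode.tag,
    style: domNode.style, // each render node keeps the CSS properties that apply to it
    children: domNode.children
      .map((child) => buildRenderTree(child))
      .filter((child) => child !== null),
  };
}

const dom = {
  tag: "body",
  style: { display: "block" },
  children: [
    { tag: "script", style: { display: "none" }, children: [] },
    {
      tag: "p",
      style: { display: "block", color: "red" },
      children: [{ tag: "span", style: { display: "inline" }, children: [] }],
    },
  ],
};

const renderTree = buildRenderTree(dom);
console.log(JSON.stringify(renderTree, null, 2));
```

The order of the children is preserved, so the resulting tree can be walked in rendering order.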
These nodes have different names in different browsers.
WebKit calls them renderers (hence render tree).
Firefox calls them frames, and the tree is called the frame tree.
Sometimes they are simply called boxes, after the CSS box concept, since each node represents a rectangular area that usually corresponds to the node’s CSS box.
3) Layout phase
The layout phase is also called flow (or reflow, as we’ll see later).
What else is missing to be able to render the page?
The position and dimensions of the nodes, together called geometry.
Sometimes the geometry is specified in the stylesheet, sometimes not.
In either case, it needs to be computed for every node in order to render the nodes correctly.
The browser traverses the render tree once again, starting from the root, and computes the geometry of each node.
All the relative units used in the stylesheet are converted to pixels.
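A minimal sketch of such a traversal, limited to widths: each node’s width is either missing (the node fills its parent), a percentage of its parent, or a length in em or px. The node shape, the fixed 16px font size and the handling of missing widths are simplifying assumptions; real layout is far more involved.

```javascript
const FONT_SIZE_PX = 16; // assumed font size used to resolve em units

// Convert a declared width into pixels, given the parent's width in pixels.
function resolveWidth(width, parentWidthPx) {
  if (width === undefined) return parentWidthPx; // no width declared: fill the parent
  if (width.endsWith("%")) return (parseFloat(width) / 100) * parentWidthPx;
  if (width.endsWith("em")) return parseFloat(width) * FONT_SIZE_PX;
  return parseFloat(width); // already expressed in px
}

// Walk the tree from the root, computing each node's width in pixels.
function layout(node, parentWidthPx) {
  const widthPx = resolveWidth(node.width, parentWidthPx);
  return {
    name: node.name,
    widthPx,
    children: (node.children || []).map((child) => layout(child, widthPx)),
  };
}

const tree = {
  name: "body",
  children: [
    { name: "main", width: "50%", children: [{ name: "aside", width: "10em" }] },
    { name: "footer", width: "300px" },
  ],
};

const result = layout(tree, 1000); // assume a viewport 1000px wide
console.log(JSON.stringify(result, null, 2));
```

Because a percentage depends on the parent’s resolved width, the traversal has to start from the root, which is also why a change near the root is more expensive than a change in a small subtree.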
4) Paint
This phase is also called painting (or repaint, as we’ll see later).
Important points:
Reflows and repaints are unavoidable, but we can reduce their frequency and cost.
What triggers reflows and repaints?
Javascript manipulation of the page:
User interactions:
There’s not much we can do to avoid the reflows and repaints triggered by the user, but we can try to make them as inexpensive as possible:
We have more control over the reflows and repaints triggered by our code.
Browsers are smart (1)
Browsers will try to do the minimum amount of work in response to a change.
Changing a property of an element may just trigger a local repaint
Hence, reflows and repaints can be global or incremental.
Applying changes to subtrees that are as small as possible will make the reflow/repaint faster.
In other words: strive to apply your changes to elements that are deep in the DOM and have small subtrees.
Reflows and repaints don’t have to happen together.
For example, if you change only the color of an element then only a repaint is triggered.
If you change the position of an element, both reflow and repaint are triggered.
What is more expensive, reflow or repaint?
The browser will use heavy caching to avoid recalculations.
A repaint requires the browser to search through all the elements to determine what is visible and what should be displayed.
A reflow recalculates the geometry of an element. It also recursively reflows its children, and sometimes its siblings too.
A reflow of course will trigger a repaint to update the webpage.
Optimizing CSS and HTML for minimal reflow
Optimizing javascript code is more important.
Let’s see some javascript code:
var foo = document.getElementById('foobar');
foo.style.color = 'blue';
foo.style.marginTop = '30px';
How many reflows and repaints do we trigger?
Browsers are smart (2)
The browser accumulates the DOM manipulations made within a time frame into a queue and performs them in a batch.
Let’s see some other code
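var foo = document.getElementById('foobar');
foo.style.color = 'blue';
// Reading the computed style needs up-to-date values, so pending changes are flushed first
var margin = parseInt(window.getComputedStyle(foo).marginTop, 10);
foo.style.marginTop = (margin + 10) + 'px';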
How many reflows and repaints do we trigger?
Why?
After the first change to the color property, the manipulation is added to the accumulating queue.
Then we ask for the current style of that element.
To give us an up-to-date answer, the browser has to execute the queued batch immediately.
We are forcing the browser to do this work early, instead of letting it batch the changes.
How does this work?
When we change an element of the DOM, that element is flagged as dirty. Sometimes it may also get the “children are dirty” flag, meaning that at least one of its children needs a reflow.
Once the interval runs out, all the dirty elements are reflowed and repainted.
Asking for certain properties of an element that has been marked as dirty forces the browser to perform the reflow early.
Forced synchronous layout
or
layout thrashing
Interleaving a lot of reads and writes to the DOM can lead to what is called layout thrashing.
How can we avoid layout thrashing?
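One common answer is to do all the reads first and all the writes afterwards. The sketch below uses a tiny stand-in for the DOM (FakeElement, layoutCount and the other names are invented for the example, not browser APIs) so that forced layouts can be counted: a write marks the document as dirty, and a read on a dirty document forces a layout.

```javascript
let layoutCount = 0; // number of layouts forced by reads
let dirty = false;   // true when a write is pending

class FakeElement {
  constructor(height) {
    this._height = height;
  }
  // A read on a dirty document forces a synchronous layout.
  get clientHeight() {
    if (dirty) {
      layoutCount++;
      dirty = false;
    }
    return this._height;
  }
  // A write only flags the document as dirty.
  setHeight(px) {
    this._height = px;
    dirty = true;
  }
}

const makeElements = () => [10, 20, 30].map((h) => new FakeElement(h));

// Interleaved: read, write, read, write...
function doubleInterleaved(elements) {
  elements.forEach((el) => el.setHeight(el.clientHeight * 2));
}

// Batched: all the reads first, then all the writes.
function doubleBatched(elements) {
  const heights = elements.map((el) => el.clientHeight);
  elements.forEach((el, i) => el.setHeight(heights[i] * 2));
}

function countForcedLayouts(fn) {
  layoutCount = 0;
  dirty = false;
  fn(makeElements());
  return layoutCount;
}

const interleaved = countForcedLayouts(doubleInterleaved);
const batched = countForcedLayouts(doubleBatched);
console.log({ interleaved, batched });
```

In this toy model the interleaved version forces a layout on every read after the first write, while the batched version forces none; the single layout that the browser performs later, on its own schedule, is not counted.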
requestAnimationFrame
window.requestAnimationFrame() allows us to schedule some code to run just before the next repaint.
This is very useful because it lets us keep reads and writes together in our code while still executing them at the right moment.
function doubleHeight(element) {
  // Reading clientHeight right after the previous element's write forces a layout
  var currentHeight = element.clientHeight;
  element.style.height = (currentHeight * 2) + 'px';
}
all_my_elements.forEach(doubleHeight);
function doubleHeight(element) {
  var currentHeight = element.clientHeight;
  // Defer the write to the next frame, so that all the reads happen first
  window.requestAnimationFrame(function () {
    element.style.height = (currentHeight * 2) + 'px';
  });
}
all_my_elements.forEach(doubleHeight);
RAF support
Anyway, JavaScript developers have long used a trick to achieve roughly the same result:
window.setTimeout(function () {...}, 0);
Setting a timeout with a zero delay defers the callback until after the current script has finished, which is usually close to the next reflow, although unlike requestAnimationFrame it is not synchronized with rendering.
Source: caniuse.com
Virtual DOM libraries
Virtual DOM libraries are gaining a lot of momentum lately, thanks to Facebook’s React library.
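How does a virtual DOM work?
The developer performs the manipulations on a virtual DOM, which the library then compares with the previous version in order to apply only the necessary changes to the real DOM.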
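What a virtual DOM library typically does with those manipulations can be sketched as a diff between two virtual trees. This is a heavily simplified illustration: virtual nodes are plain objects ({ tag, text, children }), children are compared by position only, and the result is a list of patches that a library would then apply to the real DOM in one batch. It is not how React or any other particular library implements it.

```javascript
// Compare two virtual trees and return the list of changes needed on the real DOM.
function diff(oldNode, newNode, path = "root") {
  if (oldNode === undefined) return [{ type: "create", path, node: newNode }];
  if (newNode === undefined) return [{ type: "remove", path }];
  if (oldNode.tag !== newNode.tag) return [{ type: "replace", path, node: newNode }];

  const patches = [];
  if (oldNode.text !== newNode.text) {
    patches.push({ type: "setText", path, text: newNode.text });
  }
  // Compare children position by position (no keys, no reordering detection).
  const oldChildren = oldNode.children || [];
  const newChildren = newNode.children || [];
  const length = Math.max(oldChildren.length, newChildren.length);
  for (let i = 0; i < length; i++) {
    patches.push(...diff(oldChildren[i], newChildren[i], `${path}/${i}`));
  }
  return patches;
}

const before = {
  tag: "ul",
  children: [
    { tag: "li", text: "one" },
    { tag: "li", text: "two" },
  ],
};
const after = {
  tag: "ul",
  children: [
    { tag: "li", text: "one" },
    { tag: "li", text: "2" },
    { tag: "li", text: "three" },
  ],
};

const patches = diff(before, after);
console.log(patches);
```

Only the changed text and the new item end up in the patch list; the untouched first item produces no DOM work at all.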
Thank you for your attention :-)