Critical Rendering Path (Web Performance)

Know how the browser converts data bytes to actual pixels on the screen.

When it comes to user experience, speed matters. Poorly performing sites and applications can pose real costs for the people who use them.

Performance optimization has always been important for web apps; it is how developers keep web applications efficient.

Before we look at performance optimization techniques such as minification, gzip compression, caching, service workers, CSS splitting by media query, image optimization, preload, prefetch, requestAnimationFrame, web workers, code splitting, tree shaking, OCSP stapling (which speeds up the TLS handshake), scope hoisting, deferred rendering, partial hydration, lazy loading, reducing selector complexity, avoiding layout thrashing (forced synchronous layout), compositing layers, domain sharding (splitting resources across different hosts), and async JavaScript, we must understand the critical rendering path. Once we understand it, most of these optimizations will feel obvious.

Critical Rendering Path

The critical rendering path is the sequence of steps a browser goes through to convert HTML, CSS, and JavaScript into actual pixels on the screen. If we optimize that path, our pages render faster.

In order to render content the browser has to go through a series of steps:

  1. Document Object Model (DOM)
  2. CSS Object Model (CSSOM)
  3. Render Tree
  4. Layout
  5. Paint

Document Object Model (DOM)

When we request a resource from a server by URL, we receive the response as an HTTP message, which consists of three parts: the start line, the headers, and the body. In HTTP/1.x the start line and headers are text, while the body can contain arbitrary binary data (images, video, audio) as well as text.

Once the browser receives the response (the HTML markup), it must convert all of that markup into what we see on our screens.

The browser follows a well-defined set of steps, starting with processing the HTML and building the DOM.
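
As a rough illustration of that three-part structure, the sketch below splits a raw HTTP/1.1 response string into its start line, headers, and body. The function name parseHttpResponse and the sample response are made up for this example, and it assumes a text body with CRLF line endings. Real browsers deal with far more (binary bodies, chunked transfer, HTTP/2 framing).

```javascript
// Hypothetical helper: split a raw HTTP/1.1 response into its three parts.
// Assumes a text body and CRLF line endings.
function parseHttpResponse(raw) {
  // A blank line (CRLF CRLF) separates the start line and headers from the body.
  const separator = raw.indexOf("\r\n\r\n");
  const head = separator === -1 ? raw : raw.slice(0, separator);
  const body = separator === -1 ? "" : raw.slice(separator + 4);

  const [startLine, ...headerLines] = head.split("\r\n");

  const headers = {};
  for (const line of headerLines) {
    const colon = line.indexOf(":");
    if (colon === -1) continue; // skip malformed header lines
    const name = line.slice(0, colon).trim().toLowerCase();
    headers[name] = line.slice(colon + 1).trim();
  }
  return { startLine, headers, body };
}

// Made-up sample response for illustration.
const sampleResponse =
  "HTTP/1.1 200 OK\r\n" +
  "Content-Type: text/html; charset=utf-8\r\n" +
  "Content-Length: 28\r\n" +
  "\r\n" +
  "<html><body>Hi</body></html>";

const parsedResponse = parseHttpResponse(sampleResponse);
console.log(parsedResponse.startLine);               // HTTP/1.1 200 OK
console.log(parsedResponse.headers["content-type"]); // text/html; charset=utf-8
console.log(parsedResponse.body);                    // <html><body>Hi</body></html>
```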

  1. Convert bytes to characters
  2. Identify tokens
  3. Convert tokens to nodes
  4. Build DOM Tree

First, the tokenizer converts the characters (<html><head><meta name="viewport" content="width=device-width"><link href="styles.css"......) into tokens (StartTag:head, Tag:meta, Tag:link, EndTag:head, Hello, ...).

While the tokenizer is doing this work, another process consumes the tokens and converts them into node objects. Once all the tokens have been consumed, we arrive at the Document Object Model, or DOM: a tree structure that captures the content and properties of the HTML and all the relationships between the nodes.

The browser constructs the DOM incrementally, i.e. it does not have to wait for all the HTML to arrive from the server before it starts processing. We can take advantage of this to get content on screen sooner.
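
To make the token-to-node step more concrete, here is a toy sketch of a tokenizer and a tree builder. It is not how a real browser parser is written: it ignores attributes, comments, entities, and error recovery, and the names tokenize, buildTree, and VOID_ELEMENTS are invented for this example.

```javascript
// Toy tokenizer: recognizes only simple start tags, end tags, and text.
function tokenize(html) {
  const tokens = [];
  const pattern = /<\/([a-z0-9]+)>|<([a-z0-9]+)[^>]*>|([^<]+)/gi;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    if (match[1]) {
      tokens.push({ type: "EndTag", name: match[1].toLowerCase() });
    } else if (match[2]) {
      tokens.push({ type: "StartTag", name: match[2].toLowerCase() });
    } else if (match[3].trim()) {
      tokens.push({ type: "Text", data: match[3].trim() });
    }
  }
  return tokens;
}

// Elements that never have children or an end tag (small subset, for illustration).
const VOID_ELEMENTS = new Set(["meta", "link", "img", "br", "hr", "input"]);

// Toy tree builder: consumes tokens and produces nested node objects.
function buildTree(tokens) {
  const root = { name: "#document", children: [] };
  const stack = [root]; // open elements; the last entry is the current parent
  for (const token of tokens) {
    const parent = stack[stack.length - 1];
    if (token.type === "StartTag") {
      const node = { name: token.name, children: [] };
      parent.children.push(node);
      if (!VOID_ELEMENTS.has(token.name)) stack.push(node);
    } else if (token.type === "EndTag") {
      // Simplification: assume well-formed markup and close the current element.
      if (stack.length > 1) stack.pop();
    } else {
      parent.children.push({ name: "#text", data: token.data, children: [] });
    }
  }
  return root;
}

const sampleTokens = tokenize(
  '<html><head><meta name="viewport"></head><body><p>Hello</p></body></html>'
);
const sampleDom = buildTree(sampleTokens);
console.log(sampleTokens.length);           // 10
console.log(sampleDom.children[0].name);    // html
```

Because buildTree only ever looks at the next token, the same loop could be fed tokens as they arrive, which is the idea behind incremental DOM construction.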

CSS Object Model (CSSOM)

The DOM captures the content of the page but not the CSS associated with it. To include the CSS, the browser has to build the CSS Object Model (CSSOM), which is constructed in much the same way as the DOM.

But here we cannot use the incremental trick (rendering from a partially constructed tree) that worked for DOM construction. Suppose we rendered our page with partial CSS, e.g. p { background: red }

And suppose a later part of the stylesheet (which our browser has not yet received) contains p { background: blue }, which overrides the earlier p { background: red }

If we used the partial CSSOM tree to render the page, we would display paragraphs with a red background instead of blue, which is incorrect. So the browser blocks page rendering until it has received and processed all of the CSS.
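
The override problem can be sketched in a few lines. The example below is a deliberately simplified model of the cascade: it assumes every rule has equal specificity, so the later declaration wins, and the names computeStyle, fullStylesheet, and partialStylesheet are invented for illustration.

```javascript
// Each rule is { selector, property, value }, listed in stylesheet order.
// Simplified cascade: equal specificity assumed, so later declarations win.
function computeStyle(rules, selector) {
  const style = {};
  for (const rule of rules) {
    if (rule.selector === selector) style[rule.property] = rule.value;
  }
  return style;
}

const fullStylesheet = [
  { selector: "p", property: "background", value: "red" },
  { selector: "p", property: "font-size", value: "16px" },
  { selector: "p", property: "background", value: "blue" }, // arrives last
];

// Pretend only the first two rules have arrived over the network so far.
const partialStylesheet = fullStylesheet.slice(0, 2);

console.log(computeStyle(partialStylesheet, "p").background); // red  (stale answer)
console.log(computeStyle(fullStylesheet, "p").background);    // blue (correct answer)
```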

CSS IS RENDER BLOCKING

It is important to note that,

JAVASCRIPT IS PARSER BLOCKING

because the parser stops building the DOM when it encounters a script tag in the HTML markup and resumes only after the script has run. The script itself can run only after CSSOM construction, since JavaScript may try to read or change the styles of the page. CSS therefore blocks both rendering and JavaScript execution.

Some scripts don't modify the DOM or the CSSOM, and they should not block rendering. For those scripts we add the async attribute to the script tag, so that the script neither blocks DOM construction nor gets blocked by the CSSOM.

Render Tree

After the DOM and CSSOM are constructed, they are combined; this step shows up in dev tools as Recalculate Style. Together the DOM and CSSOM form a render tree that contains the content and the styles associated with it. The render tree captures only visible content (i.e. it ignores elements with properties like display: none).
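
A minimal sketch of that combination step, under heavy simplification: the DOM is a plain object tree, the CSSOM is stood in for by a lookup function, and any node styled display: none is dropped together with its descendants. The names buildRenderTree, samplePage, and lookupStyle are hypothetical.

```javascript
// Build a render tree from a DOM-like tree and a style lookup.
// Nodes with display: none are skipped along with everything inside them.
function buildRenderTree(node, styleFor) {
  const style = styleFor(node);
  if (style.display === "none") return null;
  const children = (node.children || [])
    .map((child) => buildRenderTree(child, styleFor))
    .filter((child) => child !== null);
  return { name: node.name, style, children };
}

const samplePage = {
  name: "body",
  children: [
    { name: "p", children: [] },
    { name: "div", className: "hidden", children: [{ name: "span", children: [] }] },
  ],
};

// Stand-in for the CSSOM with a single rule: .hidden { display: none }
const lookupStyle = (node) =>
  node.className === "hidden" ? { display: "none" } : { display: "block" };

const sampleRenderTree = buildRenderTree(samplePage, lookupStyle);
console.log(sampleRenderTree.children.map((n) => n.name).join(", ")); // p
```

The hidden div and its span are still in the DOM; they are simply absent from the render tree.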

Layout

Now that our render tree is formed, we need to figure out where and how all the elements are positioned on the page. This is the layout step.

Every time we change the geometry (width, height, position) of elements, the browser has to run the layout step again.

Paint

Finally, in the paint step, the visible content of the page is converted to pixels to be displayed on the screen. This process includes converting vectors (the boxes and shapes produced in the layout step) into a raster (the individual pixels to be displayed on screen), which is done by the rasterizer. The rasterizer uses draw calls like save, translate, drawRectangle, drawText, clipPath, etc. to fill in pixels.

Paint is generally done into a single surface. However, the browser sometimes creates separate surfaces called layers and paints into those individually. Once painting is complete, the browser combines all the layers into one, in the correct order, and displays the result on screen. This process is referred to as Composite Layers.
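
The idea of combining layers in order can be sketched as follows. This is only a toy model: each layer is a single row of named pixels with null meaning transparent, ordering is by a made-up zIndex field, and real compositors also handle alpha blending, transforms, and tiling. The name compositeLayers is invented for this example.

```javascript
// Each layer is a row of pixels; null means "transparent here".
function compositeLayers(layers) {
  // Draw from the bottom layer to the top layer.
  const ordered = [...layers].sort((a, b) => a.zIndex - b.zIndex);
  const width = Math.max(0, ...ordered.map((layer) => layer.pixels.length));
  const frame = new Array(width).fill(null);
  for (const layer of ordered) {
    layer.pixels.forEach((pixel, x) => {
      if (pixel !== null) frame[x] = pixel; // higher layers overwrite lower ones
    });
  }
  return frame;
}

const compositedRow = compositeLayers([
  { zIndex: 2, pixels: [null, "menu", "menu", null] },
  { zIndex: 1, pixels: ["page", "page", "page", "page"] },
]);
console.log(compositedRow.join(" ")); // page menu menu page
```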

All of this happens on the CPU. The layers are then uploaded to the GPU, and the GPU puts the pictures up on the screen.

Whenever there is a visual change on screen, from scrolling to animation, the device puts up a new picture, or frame, for the user to see. Most devices refresh the screen 60 times a second; this refresh rate is measured in hertz (60 Hz), so we aim to produce 60 frames per second (60 fps).

So if we have 1000 ms for 60 frames, we have only about 16 ms to render a single frame. In practice we usually have around 10 ms, because the browser has other work to do in the rest of that time.

If the browser takes too long to produce a frame, the frame is missed, the frame rate drops, and content judders on screen. This is often referred to as jank or lag.
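
The arithmetic behind that budget is simply one second divided by the refresh rate. The helper name frameBudgetMs below is made up for illustration.

```javascript
// Time available per frame, in milliseconds, for a given refresh rate.
function frameBudgetMs(refreshRateHz) {
  return 1000 / refreshRateHz;
}

console.log(frameBudgetMs(60).toFixed(2));  // 16.67
console.log(frameBudgetMs(120).toFixed(2)); // 8.33
```

On a higher refresh rate display the budget shrinks accordingly.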
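
As a rough way to reason about missed frames, the sketch below takes a list of per-frame work times and estimates how many screen refreshes went by without a new frame. It assumes a fixed 60 Hz display and a simplified model in which a frame that needs k refresh intervals causes k - 1 missed refreshes; countDroppedFrames and the sample timings are invented for this example.

```javascript
// Estimate missed refreshes from per-frame work times (in milliseconds).
function countDroppedFrames(frameTimesMs, budgetMs = 1000 / 60) {
  let dropped = 0;
  for (const frameTime of frameTimesMs) {
    // A frame spanning k refresh intervals means k - 1 refreshes showed nothing new.
    dropped += Math.max(0, Math.ceil(frameTime / budgetMs) - 1);
  }
  return dropped;
}

// Made-up measurements: the 40 ms frame misses two refreshes, the 17 ms frame misses one.
const sampleFrameTimes = [8, 12, 40, 15, 17];
console.log(countDroppedFrames(sampleFrameTimes)); // 3
```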

[Figure: the pixel-to-screen pipeline. Caption: Areas we have most control over in the pixel-to-screen pipeline.]

Each of these parts of the pipeline represents an opportunity to introduce jank, so it's important to understand exactly what parts of the pipeline our code triggers.

I hope I was able to describe the critical rendering path clearly. In upcoming posts we will discuss the common performance optimization techniques in detail.

Resources: developers.google.com/web