One overlooked aspect of optimizing page speed involves knowing a bit about browser internals. Browsers make certain optimizations to improve performance in ways that we as developers can’t—but only so long as we don’t thwart those optimizations unintentionally.
One internal browser optimization to understand is the browser preload scanner. In this post, we’ll talk a bit about how the preload scanner works—and more importantly, how you can avoid getting in its way.
What’s a preload scanner? #
Every browser has a primary HTML parser that tokenizes raw markup and processes it into an object model. This all merrily goes on until the parser pauses when it finds a blocking resource, such as a stylesheet loaded with a <link> element, or a script loaded with a <script> element without an async or defer attribute.
In the case of CSS files, both parsing and rendering are blocked in order to prevent a flash of unstyled content (FOUC), which is when an unstyled version of a page can be seen briefly before styles are applied to it.
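The browser also blocks parsing and rendering of the page when it encounters <script> elements without a defer or async attribute, because it can’t know whether a given script will modify the DOM while the primary HTML parser is still working.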
These are good reasons for why the browser should block both parsing and rendering. Yet, blocking either of these important steps is undesirable, as they can hold up the show by delaying the discovery of other important resources. Thankfully, browsers do their best to mitigate these problems by way of a secondary HTML parser called a preload scanner.
A preload scanner’s role is speculative, meaning that it examines raw markup in order to find resources to opportunistically fetch before the primary HTML parser would otherwise discover them.
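To make the idea concrete, here is a toy sketch of a speculative scan. It is only an illustration: real preload scanners work on the browser's token stream rather than on regular expressions, and the function name `scanForResources` and the sample URLs are assumptions made up for this example.

```javascript
// Toy model of a preload scanner: look through raw markup for URLs that
// could be fetched early. The regular expressions are a simplification.
function scanForResources(html) {
  // Drop comments and inline script bodies so that markup mentioned inside
  // JavaScript or comments isn't mistaken for real elements.
  const markup = html
    .replace(/<!--[\s\S]*?-->/g, '')
    .replace(/<script\b([^>]*)>[\s\S]*?<\/script>/gi, '<script$1></script>');

  const found = [];
  const tagPattern = /<(img|script|link)\b([^>]*)>/gi;
  for (const [, tag, attrs] of markup.matchAll(tagPattern)) {
    const name = tag.toLowerCase();
    // Only stylesheet and preload links are treated as fetchable here.
    if (name === 'link' && !/\brel\s*=\s*["']?(?:stylesheet|preload)\b/i.test(attrs)) continue;
    const urlAttr = name === 'link' ? 'href' : 'src';
    const match = new RegExp(`(?:^|\\s)${urlAttr}\\s*=\\s*["']([^"']+)["']`, 'i').exec(attrs);
    if (match) found.push({ tag: name, url: match[1] });
  }
  return found;
}

const sampleMarkup = `
<head>
  <link rel="stylesheet" href="/styles.css">
  <script>
    const scriptEl = document.createElement('script');
    scriptEl.src = '/yall.min.js';
    document.head.append(scriptEl);
  </script>
</head>
<body>
  <img src="/hero.jpg" alt="Hero">
  <img data-src="/lazy.jpg" alt="Lazy">
</body>`;

console.log(scanForResources(sampleMarkup));
// → [ { tag: 'link', url: '/styles.css' }, { tag: 'img', url: '/hero.jpg' } ]
```

Only the stylesheet and the image with a plain src attribute are found. URLs that exist only inside JavaScript or in non-standard attributes are invisible to a scan of the markup.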
How to tell when the preload scanner is working #
The preload scanner exists because of blocked rendering and parsing. If these two performance issues never existed, the preload scanner wouldn’t be very useful. Figuring out whether a web page benefits from the preload scanner therefore depends on these blocking phenomena, and we can introduce an artificial delay for requests to find out where the preload scanner is working.
Let’s take a page of basic text and images with a stylesheet. Because CSS files block both rendering and parsing, we can introduce an artificial delay of two seconds for the stylesheet through a proxy service. This delay makes it easier to see in the network waterfall where the preload scanner is working.
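If a proxy service isn't at hand, a similar delay can be approximated locally. The following is a minimal sketch, not the setup used for the demos: `delayFor` and `withDelay` are hypothetical helpers, and the wrapped handler is assumed to be any Node-style `(req, res)` function that serves your files.

```javascript
// Hypothetical local stand-in for the two-second proxy delay.
const STYLESHEET_DELAY_MS = 2000;

// Decide how long to hold a response, based on the requested path.
function delayFor(path) {
  return path.split('?')[0].endsWith('.css') ? STYLESHEET_DELAY_MS : 0;
}

// Wrap a Node-style (req, res) handler so stylesheet requests are held back
// while every other request is answered immediately.
function withDelay(handler) {
  return (req, res) => {
    const ms = delayFor(req.url);
    if (ms === 0) return handler(req, res);
    setTimeout(() => handler(req, res), ms);
  };
}
```

A wrapped handler could then be passed to Node's `http.createServer()`. With the delay in place, the stylesheet's bar in the network waterfall stretches to roughly two seconds, which makes any requests that start during that window easy to spot.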
As you can see in the waterfall, the preload scanner discovers the <img> element even while rendering and document parsing are blocked. Without this optimization, the browser can’t fetch things opportunistically during the blocking period, and more resource requests would be consecutive rather than concurrent.
With that toy example out of the way, let’s take a look at some real-world patterns where the preload scanner can be defeated—and what can be done to fix them.
Injected async scripts #
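Let’s say you’ve got HTML in your <head> that includes some inline JavaScript like this:
const scriptEl = document.createElement('script');
scriptEl.src = '/yall.min.js';
// Appending the element is what injects the script and triggers the fetch.
document.head.append(scriptEl);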
Injected scripts are async by default, so when this script is injected, it will behave as if the async attribute was applied to it. That means it will run as soon as possible and not block rendering. Sounds optimal, right? Yet, if we presume that this inline <script> comes after a <link> element that loads an external CSS file, we get a suboptimal result:
Let’s break down what happened here:
- At 0 seconds, the main document is requested.
- At 1.4 seconds, the first byte of the navigation request arrives.
- At 2.0 seconds, the CSS and image are requested.
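- Because the parser is blocked while the stylesheet loads, and the inline JavaScript that injects the async script comes after that stylesheet at 2.6 seconds, the functionality that script provides isn’t available as soon as it could be.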
This is suboptimal because the request for the script occurs only after the stylesheet has finished downloading, which delays the script from running as soon as possible. This can affect a page’s Time to Interactive (TTI). By contrast, because the <img> element is discoverable in the server-provided markup, it’s discovered by the preload scanner.
So, what happens if we use a regular <script> tag with the async attribute as opposed to injecting the script into the DOM?
<script src="/yall.min.js" async></script>
This is the result:
There may be some temptation to suggest that these issues could be remedied by using rel=preload. This would certainly work, but it may carry some side effects. After all, why use rel=preload to fix a problem that can be avoided by not injecting a <script> element into the DOM?
Preloading “fixes” the problem here, but it introduces a new problem: the async script in the first two demos, despite being loaded in the <head>, is loaded at “Low” priority, whereas the stylesheet is loaded at “Highest” priority. In the last demo, where the async script is preloaded, the stylesheet is still loaded at “Highest” priority, but the script’s priority has been promoted to “High”.
When a resource’s priority is raised, the browser allocates more bandwidth to it. This means that—even though the stylesheet has the highest priority—the script’s raised priority may cause bandwidth contention. That could be a factor on slow connections, or in cases where resources are quite large.
The answer here is straightforward: if a script is needed during startup, don’t defeat the preload scanner by injecting it into the DOM. Experiment as needed with <script> element placement, as well as with attributes such as defer and async.
Lazy loading with JavaScript #
Lazy loading is a great method of conserving data, one that’s often applied to images. However, sometimes lazy loading is incorrectly applied to images that are “above the fold”, so to speak.
This introduces potential issues with resource discoverability where the preload scanner is concerned, and can unnecessarily delay how long it takes to discover a reference to an image, download it, decode it, and present it. Let’s take this image markup for example:
<img data-src="/sand-wasp.jpg" alt="Sand Wasp" width="384" height="255">
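The use of a data- prefix is a common pattern in JavaScript-powered lazy loaders. When the image scrolls into the viewport, the lazy loader strips the data- prefix, meaning that in the preceding example, data-src becomes src. This update prompts the browser to fetch the resource.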
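This pattern isn’t problematic until it’s applied to images that are in the viewport during startup. Because the preload scanner doesn’t read the data-src attribute in the same way that it would an src attribute, the image reference isn’t discovered early, and the fetch has to wait until the lazy loader’s JavaScript has run.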
Depending on the size of the image—which may depend on the size of the viewport—it may be a candidate element for Largest Contentful Paint (LCP). When the preload scanner cannot speculatively fetch the image resource ahead of time—possibly during the point at which the page’s stylesheet(s) block rendering—LCP suffers.
The solution is to change the image markup:
<img src="/sand-wasp.jpg" alt="Sand Wasp" width="384" height="255">
This is the optimal pattern for images that are in the viewport during startup, as the preload scanner will discover and fetch the image resource more quickly.
The result in this simplified example is a 100-millisecond improvement in LCP on a slow connection. This may not seem like a huge improvement, but it is when you consider that the solution is a quick markup fix, and that most web pages are more complex than this set of examples. That means that LCP candidates may have to contend for bandwidth with many other resources, so optimizations like this become increasingly important.
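For images further down the page, a JavaScript lazy loader can stay in place as long as it only touches images that opt in with data-src. The following is a minimal sketch rather than the code of any particular library: `activate` and `lazyLoadDeferredImages` are hypothetical names, and the 200px margin is an arbitrary assumption.

```javascript
// Move the deferred URL into `src`, which prompts the browser to fetch it.
function activate(img) {
  if (img.dataset.src) {
    img.src = img.dataset.src;
    delete img.dataset.src;
  }
}

// Browser-only: lazy load just the images that opt in with `data-src`.
// Images expected in the initial viewport should keep a plain `src` in the
// server-provided markup, so the preload scanner can find them.
function lazyLoadDeferredImages(root = document) {
  const observer = new IntersectionObserver((entries) => {
    for (const entry of entries) {
      if (!entry.isIntersecting) continue;
      activate(entry.target);
      observer.unobserve(entry.target);
    }
  }, { rootMargin: '200px' }); // start fetching shortly before the image scrolls in

  root.querySelectorAll('img[data-src]').forEach((img) => observer.observe(img));
  return observer;
}
```

Because the selector matches only img[data-src], an image written with a regular src attribute is never handled by this code and remains discoverable from the start.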
CSS background images #
Remember that the browser preload scanner scans markup. It doesn’t scan other resource types, such as CSS, which may involve fetches for images referenced by the background-image property.
Like HTML, browsers process CSS into its own object model, known as the CSSOM. If external resources are discovered as the CSSOM is constructed, those resources are requested at the time of discovery, and not by the preload scanner.
Let’s say your page’s LCP candidate is an element with a CSS background-image property. The following is what happens as resources load:
In this case, the preload scanner isn’t so much defeated as it is uninvolved. Even so, if an LCP candidate on the page is from a background-image CSS property, you’re going to want to preload that image:
<!-- Make sure this is in the <head> below any
stylesheets, so as not to block them from loading -->
<link rel="preload" as="image" href="lcp-image.jpg">
This rel=preload hint is small, but it helps the browser discover the image sooner than it otherwise would:
With the rel=preload hint, the LCP candidate is discovered sooner, lowering the LCP time. While that hint helps fix this issue, the better option may be to assess whether or not your image LCP candidate has to be loaded from CSS. With an <img> tag, you’ll have more control over loading an image that’s appropriate for the viewport while allowing the preload scanner to discover it.
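One rough way to audit a page for this situation is to compare the images referenced from CSS against the image preloads present in the markup. The sketch below is an assumption-laden illustration: the function names are made up, it relies on simple regular expressions rather than real HTML and CSS parsers, and it compares URLs as literal strings without resolving relative paths.

```javascript
// Collect url(...) values from background / background-image declarations.
function backgroundImageUrls(css) {
  const urls = [];
  const pattern = /background(?:-image)?\s*:[^;}]*?url\(\s*["']?([^"')]+)["']?\s*\)/gi;
  for (const [, url] of css.matchAll(pattern)) urls.push(url.trim());
  return urls;
}

// Collect href values from <link rel="preload" as="image"> elements.
function preloadedImageUrls(html) {
  const urls = [];
  for (const [tag] of html.matchAll(/<link\b[^>]*>/gi)) {
    const isImagePreload =
      /\brel\s*=\s*["']?preload\b/i.test(tag) && /\bas\s*=\s*["']?image\b/i.test(tag);
    const href = /\bhref\s*=\s*["']([^"']+)["']/i.exec(tag);
    if (isImagePreload && href) urls.push(href[1]);
  }
  return urls;
}

// CSS background images that the markup does not preload.
function backgroundsWithoutPreload(html, css) {
  const preloaded = new Set(preloadedImageUrls(html));
  return backgroundImageUrls(css).filter((url) => !preloaded.has(url));
}
```

Not every image this reports needs a preload. Only a background image that is likely to be the LCP candidate is worth the hint.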
Markup that is rendered on the client with JavaScript is another case: it isn’t part of the server’s response, so the preload scanner can’t see it. The remedy for this scenario depends on the answer to this question: Is there a reason why your page’s markup can’t be provided by the server as opposed to being rendered on the client? If the answer to this is “no”, server-side rendering (SSR) or statically generated markup should be considered where possible, as it will help the preload scanner to discover and opportunistically fetch important resources ahead of time.
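As a minimal sketch of what server-provided markup can look like, the function below builds image markup as a string on the server instead of creating elements in the browser. `renderGallery` and `escapeHtml` are hypothetical helpers, and a real project would more likely rely on its framework's SSR or static generation features.

```javascript
// Escape text for safe use inside HTML attribute values.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build the markup on the server so every <img> is present in the initial
// HTML response, where the preload scanner can discover it.
function renderGallery(images) {
  const items = images.map(({ src, alt, width, height }) =>
    `  <img src="${escapeHtml(src)}" alt="${escapeHtml(alt)}" width="${Number(width)}" height="${Number(height)}">`
  );
  return `<main>\n${items.join('\n')}\n</main>`;
}

console.log(renderGallery([
  { src: '/sand-wasp.jpg', alt: 'Sand Wasp', width: 384, height: 255 },
]));
```

The string this returns would be sent as part of the navigation response, so the image URLs are visible before any JavaScript runs.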
Help the preload scanner help you #
The preload scanner is a highly effective browser optimization that helps pages load faster during startup. By avoiding patterns which defeat its ability to discover important resources ahead of time, you’re not just making development simpler for yourself, you’re creating better user experiences that will deliver better results in many metrics, including some web vitals.
To recap, here are the things you’ll want to take away from this post:
- The browser preload scanner is a secondary HTML parser that scans ahead of the primary one if it’s blocked to opportunistically discover resources it can fetch sooner.
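- Resources that aren’t present in markup provided by the server on the initial navigation request can’t be discovered by the preload scanner. Ways the preload scanner can be defeated may include (but are not limited to):
  - Injecting resources, such as scripts, into the DOM with JavaScript.
  - Lazy loading images that are in the viewport during startup with a JavaScript solution.
  - Rendering markup on the client with JavaScript.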
- The preload scanner only scans HTML. It does not examine the contents of other resources—particularly CSS—that may include references to important assets, including LCP candidates.
If, for whatever reason, you can’t avoid a pattern that negatively affects the preload scanner’s ability to speed up loading performance, consider the rel=preload resource hint. If you do use rel=preload, test in lab tools to ensure that it’s giving you the desired effect. Finally, don’t preload too many resources, because when you prioritize everything, nothing will be.
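Alongside lab tools, one quick in-page check is to log LCP candidates as the browser reports them. This sketch assumes a browser that supports the Largest Contentful Paint API through PerformanceObserver, and `describeLcpEntry` and `watchLcp` are hypothetical helper names.

```javascript
// Summarize a largest-contentful-paint entry in a log-friendly form.
function describeLcpEntry(entry) {
  return {
    timeMs: Math.round(entry.startTime),
    url: entry.url || '(no resource URL, for example a text block)',
    tagName: entry.element ? entry.element.tagName : '(element no longer in the DOM)',
  };
}

// Browser-only: report each LCP candidate, including ones recorded before
// this code ran (that is what `buffered: true` asks for).
function watchLcp(onCandidate = console.log) {
  const observer = new PerformanceObserver((list) => {
    for (const entry of list.getEntries()) onCandidate(describeLcpEntry(entry));
  });
  observer.observe({ type: 'largest-contentful-paint', buffered: true });
  return observer;
}
```

Calling watchLcp() before and after adding a hint gives a rough comparison of when the LCP candidate is painted, though repeated lab runs remain the more reliable measurement.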
Hero image from Unsplash, by Mohammad Rahmani.