The Performance Cost of Server Side Rendered React on Node.js

I like React as a templating engine, not only on the client side but on the server as well. Over the last year or two, rendering templates with React.js on the server has become commonplace. Services ranging from rather static content driven sites to Universal JavaScript applications built on frameworks like Next.js are serving dynamic, server side rendered views using React.

While I like the guarantees of structure and validity that this approach gives, I do recognise that there’s a lot of overhead in how it works for constructing just a single view server side. This is why I decided to examine just how much overhead there is compared to more traditional templating engines that work on strings and don’t guarantee the structure of the generated HTML markup.

I created a simple Node.js application (source on GitHub) with TypeScript that renders an HTML table of 100 rows of employee data from a JSON file with a number of different templating methods: ES6 template literals, Nunjucks, Pug and React.

For good measure I threw in a simple Node HTTP server that serves the same content as a static HTML dump. For Nunjucks and Pug I precompiled the templates to reflect performance in production environments.
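To give an idea of what the simplest of these methods looks like, here is a minimal sketch of rendering such a table with ES6 template literals. The `Employee` shape, its field names and the `escapeHtml` helper are assumptions made for this example and are not taken from the benchmark source.

```typescript
// Hypothetical record shape; the real JSON file may use different fields.
interface Employee {
  name: string;
  title: string;
}

// Template literals do no escaping of their own, so it is done by hand.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Builds the whole table as one string, one <tr> per employee.
function renderTable(employees: Employee[]): string {
  const rows = employees
    .map((e) => `<tr><td>${escapeHtml(e.name)}</td><td>${escapeHtml(e.title)}</td></tr>`)
    .join("");
  return `<table><tbody>${rows}</tbody></table>`;
}
```

This is plain string concatenation: nothing checks that the tags are balanced, which is exactly the guarantee that string based engines give up in exchange for speed.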

I placed the app on a single core VPS on UpCloud with Node.js (v9.2.0) and ran a set of benchmarks to get an idea of how the alternatives compare. For benchmarking I used another VPS with hey, a contemporary Apache Bench / Siege alternative written in Go, to put high load on single routes.

Note: After the initial results I received feedback that I should define the environment variable NODE_ENV=production, as this greatly improves the performance of server side rendered React. I did so, and the article has been updated with the results of a new benchmark run.

Benchmark results

I ran the benchmarks with concurrencies of 1, 5, 50, 100 and 250 and tracked throughput (requests per second) and the average response time (milliseconds spent waiting for a response). The benchmarks were run three times with 10 seconds of waiting time in between, and the average values of the three runs were used. Results are from Node in production mode, except for React, where I have results for both dev and prod settings.

Throughput

Node.js templating throughput for ES6 template literals, Nunjucks, Pug, React.js and Static

For throughput you can see that React is in a different league compared to the other templating engines when running without production settings, being close to ten times slower than the fastest traditional templating engine. It seems that React hits CPU limits at some 300 requests a second in dev mode.

With production optimisations in place, React rises significantly and manages to be on par with the Nunjucks engine. This is an impressive feat. It shows that while the throughput of React in development mode will be enough for many low to medium traffic web services, there is a significant margin to be gained simply by defining an environment variable.

Native ECMAScript template literals and the Pug templating engine are a close match in performance. Pug manages to stay ahead of ES6 by a margin of some 5% on average, and has a significant performance advantage over optimised React, which in turn enjoys a fivefold advantage over the unoptimised version.

The static HTML dump is clearly in a league of its own, with results being very stable from a concurrency of 5 all the way up to 250. This indicates that performance is likely limited by network throughput or other infrastructure.

Response times

Node.js templating response times for ES6 template literals, Nunjucks, Pug, React.js and Static

The average response times for all templating methods are reasonable numbers at low concurrency. From fifty concurrent requests onwards response time for React in dev mode shoots up, first to 153ms and ending up at a crippling 787ms at 250 concurrent requests to a single CPU VPS.
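These response times tie in with the throughput figures. Assuming the load generator keeps a fixed number of requests in flight at all times (a closed loop), throughput is roughly the concurrency divided by the average response time. The function name below is made up for this sketch.

```typescript
// Little's law for a closed-loop benchmark:
// throughput (req/s) is roughly concurrency / average response time (s).
function estimateThroughput(concurrency: number, avgResponseMs: number): number {
  return concurrency / (avgResponseMs / 1000);
}

// React in dev mode, using the response times quoted above.
const at50 = estimateThroughput(50, 153);   // roughly 327 req/s
const at250 = estimateThroughput(250, 787); // roughly 318 req/s
```

Both estimates land close to the ceiling of some 300 requests a second seen in the throughput results: once the CPU is saturated, adding concurrency only makes each request wait longer.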

In production mode React again performs similarly to Nunjucks, but its average response time is noticeably longer, with the difference being some 70ms at a concurrency of 250. As in the throughput results, Pug performs close to ES6 template literals, but the JavaScript native templating has a consistent advantage in response times, some 12 percent at peak concurrency.

Response times for delivering the static HTML dump stay at a very reasonable level through the range.

Nunjucks template compilation

The Nunjucks template engine is a clone of the Jinja2 templating engine, and its syntax will be familiar to PHP developers who use the Twig templating engine:

{% for employee in employees %}
  <div>Hello {{ employee.name }}</div>
{% endfor %}

This format needs to be translated to raw JavaScript, which naturally takes some time. Nunjucks is able to precompile these templates to reduce overhead for repeated use of the same template. Below are the results when using precompilation and, alternatively, when compiling the template code on each request.

Throughput for non-compiled vs. compiled Nunjucks templates on Node.js

Response time for non-compiled vs. compiled Nunjucks templates on Node.js

Results for uncompiled templates are noticeably worse. Throughput stabilises for both options from a concurrency of 5 upwards, but at high concurrencies the response time for uncompiled templates grows proportionally more. If you are using a compiling template system, be sure you are taking advantage of the precompilation mechanism offered by your library.
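The principle behind precompilation can be sketched without any library. The toy compiler below is not the Nunjucks API: it only understands `{{ key }}` placeholders, does no escaping, and every name in it is made up for this example. What it shows is the pattern of parsing a template once and reusing the resulting render function on later requests.

```typescript
type Context = Record<string, string | undefined>;
type RenderFn = (ctx: Context) => string;

// Counts how often the (comparatively expensive) parsing step runs.
let compileCount = 0;

// Toy "compiler": parses the template once and returns a render function.
function compile(source: string): RenderFn {
  compileCount++;
  // split() with a capture group yields [literal, key, literal, key, ...]
  const parts = source.split(/\{\{\s*(\w+)\s*\}\}/);
  return (ctx) =>
    parts.map((part, i) => (i % 2 === 0 ? part : ctx[part] ?? "")).join("");
}

const compiled = new Map<string, RenderFn>();

// Compile on first use, then reuse the cached render function.
function getTemplate(source: string): RenderFn {
  let render = compiled.get(source);
  if (!render) {
    render = compile(source);
    compiled.set(source, render);
  }
  return render;
}

const source = "<div>Hello {{ name }}</div>";
const first = getTemplate(source)({ name: "Ada" });
const second = getTemplate(source)({ name: "Grace" });
// compileCount is 1 here: the second request skipped the parsing step.
```

Compiling inside the request handler instead would pay the parsing cost on every request, which is the situation the uncompiled results above reflect.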

React streaming templates

Since React-DOM 16.0 the library offers the capability to stream HTML output. This means that output is sent as it is completed, enabling the browser to already start reading and processing the markup. This is a standard browser feature, which enables faster rendering for users.
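The idea can be sketched without React. The function below hands markup to a `write` callback piece by piece instead of returning one finished string; in a Node HTTP handler that callback would typically wrap `res.write()`. The names and the row data are assumptions for illustration, and this is not how ReactDOMServer is implemented.

```typescript
type Write = (chunk: string) => void;

// Emits the table in pieces so the receiver can start work before the end.
// Values are assumed to be HTML-safe already.
function streamTable(names: string[], write: Write): void {
  write("<table><tbody>");
  for (let i = 0; i < names.length; i++) {
    write(`<tr><td>${names[i]}</td></tr>`);
  }
  write("</tbody></table>");
}

// Collecting the chunks in an array stands in for a network socket here.
const chunks: string[] = [];
streamTable(["Ada", "Grace"], (chunk) => chunks.push(chunk));
```

The total amount of work is the same as when building one string, but the first bytes can leave the server before the last row has been rendered.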

Out of curiosity I benchmarked the results between the streaming rendering versus the standard method that sends the complete markup only when it is completed. Results are shown below:

Throughput for React.js SSR non-streamed vs. streamed on Node.js

Response times for React.js SSR non-streamed vs. streamed on Node.js

Throughput for both methods is neck and neck at higher concurrencies. Throughput for the streaming renderer is somewhat lower, but not by a significant margin. In the response times the difference is also noticeable, but quite small.

Given that streaming should provide an improved real-life user experience, I would recommend using streaming if possible, even at the expense of maximum throughput. Performance should also increase as React enables asynchronous rendering in future versions.

The significance of production environment

As React enjoyed a massive boost in performance from running with the production environment setup, I decided to take a look at what the effect would be for the other libraries. Below you can find results for all libraries, both with and without NODE_ENV=production in effect:

Node.js templating throughput for ES6 template literals, Nunjucks, Pug, React.js and Static

React is the only library that benefits very significantly from this setting, with throughput going from 300 to over 1500 requests per second depending on the env. There are significant optimisations baked into the codebase. Running React SSR without them is comparable to running PHP without OPcache.

For the other libraries there are some improvements. ES6 template literals in particular seem to get a boost of over 20%. Pug also improves somewhat in the prod env, but Nunjucks curiously drops somewhat. The server set up to deliver the static HTML dump is unaffected by the setting, even at low concurrency.

Conclusion

All the results clearly show that there is a price to pay for server side rendered React.js. This is no surprise, as vDOM template rendering is more complex than string based processing. In this light the performance of ReactDOMServer with optimisations is downright impressive.

It is easy to miss that React SSR is running without the optimal configuration. But as the server performs template generation for multiple clients, this can be a significant issue for scalability. Maybe future versions could show a message to developers when running the client, similar to the notices now in place in the browser console reminding developers about setting key attributes on looped elements, etc.
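Until something like that exists, a check of your own at server startup is cheap. The following is a minimal sketch; the function name and the wording of the message are made up for this example, and the environment is passed in as a parameter so the function can be tried out without touching the real process environment.

```typescript
type Env = Record<string, string | undefined>;

// Returns a warning when NODE_ENV is anything other than "production",
// and null when the production setting is in effect.
function productionWarning(env: Env): string | null {
  if (env["NODE_ENV"] === "production") {
    return null;
  }
  const current = env["NODE_ENV"] ?? "unset";
  return `NODE_ENV is "${current}"; server side React rendering will be slow.`;
}

// At server startup one might call:
//   const warning = productionWarning(process.env);
//   if (warning) console.warn(warning);
```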

For high load services where React.js is primarily used for rendering content feeds or similar non-dynamic content to HTML, setting up traditional HTTP caching is advisable. Scaling stateless rendering nodes horizontally is not complicated, but the fastest work is the work that need not be done. In addition, technologies like lit-html that build on native JS templating may be in a position to benefit.
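As a small illustration of HTTP caching, the helper below builds a `Cache-Control` header that allows browsers and shared caches to reuse a rendered page for a given number of seconds. The helper name and the 60 second lifetime are assumptions for this sketch; a suitable lifetime depends entirely on how often the content changes.

```typescript
// Builds response headers that let browsers and shared caches
// (a CDN or reverse proxy) reuse a rendered page for a while.
function cacheHeaders(maxAgeSeconds: number): Record<string, string> {
  // Guard against negative or fractional lifetimes.
  const maxAge = Math.max(0, Math.floor(maxAgeSeconds));
  return { "Cache-Control": `public, max-age=${maxAge}` };
}

// In a Node HTTP handler this could be passed along with the status code:
//   res.writeHead(200, cacheHeaders(60));
const headers = cacheHeaders(60);
```

With a cache in front of the application, only the first request within each lifetime window has to reach the renderer at all.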

At high volumes this can also be a cost issue. After verifying that you're running with production settings, instead of spending resources on rendering, consider a static HTML export, a CDN or a simple reverse proxy setup to provide a lower maintenance cost. When working with API driven front ends, it is easy to miss that your server render is CPU bound while focusing on improving client user experience.

-- Jani Tarvainen, 22/11/2017