Causal is a multidimensional spreadsheet that’s capable of handling everything from basic arithmetic all the way up to billion-calculation financial models. The Causal frontend was built with Create React App (CRA) in 2019, and it served us well - it required minimal initial setup and allowed us to iterate quickly. As our customers grew in size and complexity, and performance became more of a concern, we reached the limits of what CRA was designed to support. Most importantly, CRA doesn't natively support route splitting across multi-page apps, so our page load times were frustratingly slow. To solve these problems, we switched to Next.js, reducing our initial page load times by as much as 70% and unlocking a new level of developer experience.
What is Next.js?
Next.js is a framework that comes with build tools and runtime libraries for creating rich React applications. It has the same functionality as CRA, but also includes built-in support for key functionality that CRA is missing: page routing, intelligent pre-loading based on page contents, and hybrid static & server-side rendering.
Migrating from CRA to Next.js
In mid 2022, we determined that the benefits of migrating from CRA to Next.js would be worth the time investment. We were particularly excited about having its built-in page routing primitive, so we wouldn't have to manually configure our routing and Webpack builds.
We'd previously needed to set up our own routing in CRA using react-loadable and react-router + react-router-dom, including a large routes.tsx file that explicitly set up a routed component for each page in the app.
One of the advantages of Next.js over CRA is that Next.js ships with its own integrated linking & routing solution, next/router. Placing a file at pages/model/[id]/edit.tsx with a default-exported React component is all Next.js needs to know to render a page at that path, with the URL's id available to the page as the id route parameter.
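To make the file-to-route mapping concrete, here is a minimal TypeScript sketch of how a file path with a bracketed segment can be turned into a URL matcher. It is an illustration of the idea only, not how Next.js implements routing internally; `filePathToRoute` and `matchRoute` are hypothetical helper names.

```typescript
// Illustrative only: a simplified model of file-based routing.
// `filePathToRoute` and `matchRoute` are hypothetical helpers, not Next.js APIs.

type RouteParams = Record<string, string>;

// Turn "pages/model/[id]/edit.tsx" into ["model", "[id]", "edit"].
function filePathToRoute(filePath: string): string[] {
  return filePath
    .replace(/^pages\//, "")
    .replace(/\.(tsx|ts|jsx|js)$/, "")
    .split("/")
    .filter((segment) => segment !== "" && segment !== "index");
}

// Match a URL path against the route segments; returns the params or null.
function matchRoute(routeSegments: string[], urlPath: string): RouteParams | null {
  const urlSegments = urlPath.split("/").filter((s) => s !== "");
  if (urlSegments.length !== routeSegments.length) return null;

  const params: RouteParams = {};
  for (let i = 0; i < routeSegments.length; i++) {
    // A segment like "[id]" is dynamic and captures the URL segment by name.
    const dynamic = /^\[(\w+)\]$/.exec(routeSegments[i]);
    if (dynamic) {
      params[dynamic[1]] = decodeURIComponent(urlSegments[i]);
    } else if (routeSegments[i] !== urlSegments[i]) {
      return null;
    }
  }
  return params;
}

const editRoute = filePathToRoute("pages/model/[id]/edit.tsx");
console.log(matchRoute(editRoute, "/model/abc123/edit")); // { id: 'abc123' }
console.log(matchRoute(editRoute, "/model/abc123/view")); // null
```

The point of the sketch is that the directory layout itself carries the routing information, so no separate route table has to be maintained by hand.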
Furthermore, the built-in Next.js Webpack configuration automatically splits pages out into their own bundles. That means visiting a page during local development only needs to build the bundle contents required for that page. Although CRA does support code splitting, in our experience the Next.js configuration is much faster out of the box for local rebuilds.
Many of the older CSS files in the Causal codebase had been written prior to the team standardizing on CSS modules best practices. A number of those files utilized "impure" CSS selectors, meaning they could impact elements rendered by components elsewhere on the page.
For example, the styles for our previous Button component unintentionally targeted all buttons on the page.
We switched global CSS styles to CSS modules wherever possible. This allowed components to be more explicit about which styles they're taking in.
For example, we switched our Button component to a scoped class name in its CSS module.
Note: switching to "pure" CSS modules also significantly improved build times in the CRA app before the switch to Next.js was finalized. Many .scss files had also been using @use and @extend SCSS directives to build up styles using other, shared .scss files. Those directives caused the shared files to be rebuilt as a part of each file that included them, adding multiple seconds of build time each for some of the bigger files!
See this Next.js discussion answer on pure modules for more information.
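As a rough illustration of the rule, a selector in a CSS module is generally treated as "pure" when it contains at least one local class or id. The TypeScript sketch below approximates that check with regular expressions; it is a simplification for explanation, not the check that Next.js or css-loader actually runs, and `isPureSelector` is a hypothetical helper name.

```typescript
// Simplified, illustrative check; real CSS tooling parses selectors properly.
// `isPureSelector` is a hypothetical helper, not part of Next.js or css-loader.
function isPureSelector(selector: string): boolean {
  // Drop anything explicitly marked global, e.g. ":global(.legacy-button)".
  const withoutGlobals = selector.replace(/:global\([^)]*\)/g, "");
  // Treat the selector as pure if a class (.name) or id (#name) remains.
  return /[.#][A-Za-z_-][\w-]*/.test(withoutGlobals);
}

console.log(isPureSelector("button"));             // false: hits every <button>
console.log(isPureSelector(".button"));            // true: scoped class
console.log(isPureSelector(".toolbar button"));    // true: anchored to a local class
console.log(isPureSelector(":global(.legacy) a")); // false: nothing local remains
```

A bare element selector fails the check because nothing ties it to the component that owns the stylesheet, which is exactly how styles leak across the page.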
Once we had Next.js working locally, the next step was to change our deploy strategy.
Here, CRA and Next.js have fundamental differences. The build output from CRA is just static files, so serving it is relatively straightforward. The build output from Next.js does include some static files, but it can also include code to run a separate server. This server is responsible for serving redirects, server-side rendering dynamic pages, and also serving static pages.
When evaluating the options for deploying our new Next.js frontend, we settled on three possibilities:
- Don’t use any server-side rendering for Next.js, build using next export, and treat the output exactly the same as CRA’s static output.
- Host the entire frontend on Vercel, pointed at our backend (hosted in GCP).
- Write a custom Docker image for the Next.js server, and host it in GCP alongside our backend and other services.
Each option had pros and cons:
Static export
- Pro: almost 0 work to set up (identical to CRA output)
- Con: does not support server-side rendering
Host on Vercel
- Pro: minimal setup required
- Con: no official support for Yarn 2
- Con: cannot easily connect to database for faster server-side rendering
Custom Docker image
- Pro: maximum flexibility for server-side rendering (direct DB connection is possible, backend API calls will be very fast because of colocation on GCP)
- Pro: most granular control over resources required/used
- Con: maximum setup required: Vercel provides examples but they don’t work out of the box; Kubernetes routing/networking, scaling, etc. all need custom setup
Given our desire for maximum flexibility, we chose option 3: writing a custom Docker image. (We do still deploy to Vercel, though; more on that later!) We do a small amount of server-side rendering on a few pages, and we have found the performance to be excellent so far, in large part thanks to the minimal network distance required to talk to our other services.
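For readers unfamiliar with how server-side rendering is wired up in Next.js, a page can load its data on the server through getServerSideProps, which returns either props for the page or a not-found result. The following is a minimal sketch of that shape, not our actual code: `Dashboard` and `fetchDashboard` are hypothetical, and the small local types stand in for the ones exported by the "next" package so the example is self-contained.

```typescript
// A sketch only. In a real Next.js page file this function would be exported
// and typed with `GetServerSideProps` from "next"; the minimal types below
// stand in for those. `Dashboard` and `fetchDashboard` are hypothetical.

type Dashboard = { id: string; title: string };

type Context = { params?: { id?: string } };

type Result =
  | { props: { dashboard: Dashboard } }
  | { notFound: true };

// Stand-in for a call to a backend service running close to the Next.js server.
async function fetchDashboard(id: string): Promise<Dashboard | null> {
  const fakeStore: Record<string, Dashboard> = {
    demo: { id: "demo", title: "Demo dashboard" },
  };
  return fakeStore[id] ?? null;
}

// Runs on the server for each request, before the page component renders.
async function getServerSideProps(context: Context): Promise<Result> {
  const id = context.params?.id;
  if (!id) return { notFound: true };

  const dashboard = await fetchDashboard(id);
  if (!dashboard) return { notFound: true };

  // Whatever is returned under `props` is passed to the page component.
  return { props: { dashboard } };
}
```

Because this function runs on the server, the time spent in the data fetch depends on the distance between the Next.js server and the backend, which is why colocating the two matters.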
Although Vercel wasn’t feasible for our production deploys (as explained above), it is still quite useful for its preview apps. Although setting the Vercel build process up required a couple of workarounds (for the aforementioned lack of Yarn 2 support, and to build a common package used in our frontend), the benefit is immense: every commit pushed to our GitHub repository now gets built and deployed on Vercel as a preview app, pointed at our staging backend.
This has enabled an order of magnitude improvement in the code review experience for frontend changes. Instead of requiring pulling branches locally to test, reviewers can just click a link in the PR they’re reviewing and preview exactly what the branch will look like in production.
Although this change did not require Next.js, it was a breeze because of Vercel’s native support for its own framework.
Switching to Next.js unlocked significant improvements in both the end-user experience and our developer experience:
Causal models are typically created by a few people but viewed by dozens of others; these viewers look at the model dashboard. Without any effort put into advanced server-side rendering (using e.g. getServerSideProps), load times for these dashboards decreased by 42% (2.6s → 1.5s)! [1]
Simpler pages have even more dramatic speed-ups. For example, our home page (my.causal.app) loads 71% faster (1.7s → 0.5s), with no layout jump except the necessary transition from the loading state to the loaded state.
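For reference, the percentage figures here are the reduction in load time relative to the original time. A small TypeScript sketch of that arithmetic, using the home page numbers above (`percentFaster` is just an illustrative helper name):

```typescript
// Percentage reduction between two load times, rounded to the nearest integer.
function percentFaster(beforeSeconds: number, afterSeconds: number): number {
  if (beforeSeconds <= 0) throw new RangeError("beforeSeconds must be positive");
  return Math.round(((beforeSeconds - afterSeconds) / beforeSeconds) * 100);
}

console.log(percentFaster(1.7, 0.5)); // 71
```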
The performance benefits extend beyond just the user experience. Next.js has a significantly faster development experience than CRA; developers benefit from 30% (or more!) faster start-up times, and the fast refresh experience is a game changer for quickly iterating on small UI tweaks. By far the largest improvement came from pull request preview apps, which significantly improved the code review experience. It takes seconds instead of minutes to preview frontend code changes, which has allowed us to give more frequent reviews on smaller pull requests, and also allowed our customer success team to provide feedback earlier in the development process.
We're thrilled to see the app running in production on Next.js. Our page loads are significantly faster, our local builds take seconds instead of minutes to get started, and the amount of Webpack configuration we need to maintain is dozens of lines instead of hundreds.
We plan on implementing even more server-side rendering soon, starting with embedded charts and tables, which are typically viewed by anonymous visitors. We expect to see a significantly improved experience for these users as a result of faster load times.
Of course, performance in modern web apps is about much more than first load times. Even more important is the performance of user interactions, which are particularly difficult to optimize in Causal because we are a data-heavy application rendering complex grids, charts, and tables. In a future blog post, we’ll share more about how we solve those performance problems.
Thank you to Joshua Goldberg for instrumental help with this migration!
[1] We’re measuring First Contentful Paint.