This post was originally published on InVision’s Engineering Medium.
Faster experiences, faster teams, better features, oh my!
Over the last few quarters, we at InVision have completed a big evolution of our web architecture that addressed many technical limitations, web performance issues, and slow iteration velocity. This change took us from an origin-delivered multi-page + multi-single-page-app (SPA) architecture to an edge-delivered and federated SPA experience. In doing so, …
- InVision experiences have seen massive performance improvements (30–60% P95 TTFB improvement, 20–40% P95 initial render improvement, and more).
- Engineers have a shared framework that propagates improvements as we iterate. This improves velocity for all teams and, for some teams, makes us orders of magnitude faster at propagating changes.
- New product capabilities have been unblocked to provide next-level product features.
In this article, we want to share how we went about solving our core challenges, what our solutions look like, the benefits we captured, and what comes next.
Where we started
A key bit of context to have here is that over the last few years, we rebuilt our full product platform from top to bottom. Within that stream of work, we were pushing to move fast. However, not all teams move at the same pace, which naturally resulted in decisions that left us with siloed solutions. We shipped web features that very much looked like the shape of our org chart; if a particular team owned a sub-feature, it had its own standalone app.
This put us in a state where we had 20+ teams building features and mounting their SPAs under unique HTML paths at the origin with little consistency due to a lack of shared web tech standards. Users felt this disjointed experience when navigating across our features, as they would suffer from a full page reload to fetch a new SPA, hence, removing the primary benefit of the SPA model. This is what we mean by “origin delivered multi-page + multi-single-page-app (SPA)” architecture; this is the worst of both worlds. This pattern is not uncommon within organizations that take on massive overhauls or look to scale quickly — but we can do better.
Given that context, here are the problems we were facing:
- Slow performance (time to first byte and initial render)
- Reduced capabilities due to navigations between SPAs requiring full page reloads
- Slow velocity from having to build every new feature from the ground up
- Heavy page weight due to a lack of code sharing
Although the challenges outweighed the benefits, it is still worth calling out that a benefit of this approach was high team autonomy to build, test, and deploy in ways that worked best for them.
The reality was that our new web experience was not just suffering from poor web architecture, but also from being a collection of isolated designs which did not fit well together. At InVision, we obsess over providing great experiences to users so this was very important for us to change.
The next few sections describe how we identified these issues, created an architectural plan, and followed it through to turn this reality around.
The new architecture
We took a critical look at every facet of this problem and the path to deliver a better user experience. Although our list was exhaustive, we will not cover every option that we considered here. Instead, we will call out those that were important to our organization.
Team Autonomy: As a fully remote organization, we highly benefit from team autonomy, but complete autonomy was a mistake. We had to find the right balance so that it allowed teams to focus on providing their unique value.
Iterative: We were already in the midst of a platform and product rebuild and did not want to stop to rewrite what was already available. The new architecture was to enable gradual adoption and support iteration.
Shared platform: Subscribing to the “build once, use many” philosophy, shared problems like HTML delivery, CDN, caching, etc. should be solved once. Any technology that spanned multiple features should be added to the shared foundational layers and not repeatedly added to individual features. This is also how we established best practices by default.
Given our current architecture, the above core considerations, and the mountain of research that we conducted, we decided to build a true Single Page App that composed our experiences using a feature federation strategy. Our tenets were to provide our teams with autonomy, with better boundaries, and to deliver a performant, unified, and seamless user navigation experience.
What that looks like (very high-level):
[Figure: Layers of the InVision web architecture. Top layer: data and user assets; second layer: features; third layer: App Shell; fourth layer: UI Gateway and Global Static Pipeline. Each layer maintains a separation of concerns and does not worry about the layer above.]
- Consistent Web Artifacts (GSP). Every feature leverages a shared build process to enforce common rules and to publish the immutable artifacts and assets to our CDN layer called the Global Static Pipeline (GSP).
  - Each team builds, tests, and deploys its features via the GSP.
  - The build step generates a manifest that describes a feature. Most importantly, it identifies the critical path files needed to load a feature, i.e. the main JS code and the initial CSS. This manifest is used throughout the other layers.
  - The deployment step aggregates these manifests per environment and promotes them.
- Serving HTML (UI Gateway). All of our HTML is delivered via globally distributed Cloudflare Workers that we call the UI Gateway. This gateway is in charge of collating the routes for each feature (aka a “feature configuration”) and mapping them to the manifest that was generated previously in the GSP. Once built, the gateway delivers the HTML to the browser which launches the next step.
- Rendering Features (App Shell). Using the feature configurations that detail route/navigation ownership and the manifests that detail how to load any of the features, our App Shell JavaScript client library is in charge of managing navigation across features and leveraging the details from the manifest to load the features.
- Feature Experiences (Client Features). Our teams build features that only worry about the experience they need to deliver. They do not worry about how the experience reaches users. They focus on loading their app-specific data and presenting users with their assets and experiences.
Our teams develop their piece of the overall web experience, deploy it when they are ready, and those changes are delivered to users alongside the other features owned/deployed by other teams. The architecture has a geo-distributed server component that aggregates our feature configs and manifests and delivers them to users as HTML. That HTML starts our App Shell client library which mounts the current feature and watches for URL changes as users navigate across different features. As users navigate, the App Shell determines which features should be mounted or unmounted and provides a smooth transition experience between them.
This new architecture allows us to lean into developer autonomy with better boundaries that support team focus and innovation on their distinct feature experiences, not on the common layers that are already solved problems. We pushed the generation of experiences to the edge by leveraging Cloudflare Workers, which provide fast delivery times for the initial HTML. Then, with App Shell’s management of feature lifecycles, users do not experience full page reloads after their first landing. On the same note, because the same page and its memory persist across features, we can share resources across feature boundaries, which allows us to provide a more connected experience.
Diving Deeper
Here is how all of the technical components come together.
Feature Setup
This is the “step 0” part of registering a federated feature within the architecture. The UI Gateway and App Shell code repositories maintain a list of features and some lightweight configurations about those features. This configuration describes critical information like the name of the feature, the CDN namespace where we can find its manifest (described below), and what routes the feature owns. This information allows the UI Gateway and App Shell to know what to do or where to go to get the information they need to serve a request.
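As a rough illustration of what such a configuration and route lookup could look like, here is a minimal sketch. The field names, the feature names, and the longest-prefix matching rule are all assumptions made for the example; the article only says that a configuration holds a name, a CDN namespace, and the routes a feature owns.

```typescript
// Hypothetical shape of a feature configuration; field names are illustrative.
interface FeatureConfig {
  name: string;         // feature name
  cdnNamespace: string; // CDN namespace where the feature's manifest lives
  routes: string[];     // path prefixes the feature owns
}

// Hypothetical features, used only for this example.
const features: FeatureConfig[] = [
  { name: "home", cdnNamespace: "home-ui", routes: ["/"] },
  { name: "docs", cdnNamespace: "docs-ui", routes: ["/docs", "/d"] },
];

// True when `path` equals the prefix or continues it at a "/" boundary.
function ownsPath(prefix: string, path: string): boolean {
  if (prefix === "/") return true; // the root prefix acts as a catch-all
  return path === prefix || path.startsWith(prefix + "/");
}

// Pick the feature whose matching route prefix is the longest.
function findFeatureForPath(
  path: string,
  configs: FeatureConfig[],
): FeatureConfig | undefined {
  let best: FeatureConfig | undefined;
  let bestLength = -1;
  for (const config of configs) {
    for (const prefix of config.routes) {
      if (ownsPath(prefix, path) && prefix.length > bestLength) {
        best = config;
        bestLength = prefix.length;
      }
    }
  }
  return best;
}
```

With a list like this, both the server and the client can answer "which feature owns this URL?" from the same data.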
Feature Build
Our teams leverage a shared build step that enforces rules and processes on the build artifacts. The most important of these are enforcing file immutability, standardizing the CDN usage, and providing a new feature-specific manifest file. This build → CDN pipeline is internally called the Global Static Pipeline (GSP).
By enforcing immutability, we can make safe assumptions downstream, such as making all static assets cacheable for very long periods of time and providing safe invalidation at the CDN or Service Worker layers. This shared build step also provides a foundation for us to improve our build tooling across many features in a single location. For example, we rolled out page weight budgets, branch-based synthetic perf testing, and the generation of branch preview links all in one place.
The manifest is a very simple JSON blob that annotates key runtime information. The most important part is the criticalPathFiles field, which is the basis for our feature federation. It lists the HTML, JS, and CSS files that need to be loaded, in the order they must be loaded so that dependencies are met. Because of this, the App Shell or any program can take a feature manifest and have the context for how to load it correctly.
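To make the idea concrete, here is a minimal sketch of a manifest and of a consumer that turns the ordered critical path into tags. Only the `criticalPathFiles` field comes from the article; the other fields, the file names, and the CDN URL are assumptions for illustration.

```typescript
// Hypothetical manifest shape; only `criticalPathFiles` is named in the article.
interface FeatureManifest {
  name: string;
  version: string;             // assumed here to be the build's commit SHA
  criticalPathFiles: string[]; // ordered so that dependencies come first
}

// Turn the ordered critical path into HTML tags, preserving manifest order.
// File types other than .js and .css are ignored in this sketch.
function criticalPathTags(manifest: FeatureManifest, cdnBase: string): string[] {
  const tags: string[] = [];
  for (const file of manifest.criticalPathFiles) {
    const url = `${cdnBase}/${file}`;
    if (file.endsWith(".css")) {
      tags.push(`<link rel="stylesheet" href="${url}">`);
    } else if (file.endsWith(".js")) {
      // Deferred scripts execute in document order, so the order is kept.
      tags.push(`<script defer src="${url}"></script>`);
    }
  }
  return tags;
}

const exampleManifest: FeatureManifest = {
  name: "docs",
  version: "abc123",
  criticalPathFiles: ["vendor.js", "main.js", "main.css"],
};
const exampleTags = criticalPathTags(exampleManifest, "https://cdn.example.com/docs-ui");
```

Because the consumer only relies on the ordered list, any program that can read the manifest can load the feature without knowing how it was built.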
Feature Deployment
Separate from this specific architecture, our infrastructure team provides generic deployment tools and a release pipeline. This pipeline promotes resources to different tiers like testing, preview, multi-tenant, etc. We integrated a new interface behind this deployment pipeline that validates a feature’s readiness to deploy, then promotes the manifest generated in their build phase to the appropriate tier. All features have validated manifests that are promoted to the tier for which they are deploying.
When our infrastructure initiates a deployment, it completes the promotion of the manifest to the next tier, then triggers the UI Gateway to synchronize all of the feature manifests for all of the tiers. That synchronization simply takes all of the configured features, looks up their manifests for each tier based on a standardized naming convention along with a CDN namespace, and stores them in fast-access locations. This is the federation piece: features are built and deployed separately, then we synchronize the server and client components to have the latest information for that tier.
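The synchronization step can be sketched roughly as follows. The naming convention (`<namespace>/<tier>/manifest.json`), the tier names as a type, and the injected `fetchManifest` and `store` are assumptions for the example; the article does not spell out the real convention or storage.

```typescript
type Tier = "testing" | "preview" | "multi-tenant";

interface FeatureConfig { name: string; cdnNamespace: string; }
interface FeatureManifest { name: string; criticalPathFiles: string[]; }

// Hypothetical naming convention: <namespace>/<tier>/manifest.json
function manifestKey(config: FeatureConfig, tier: Tier): string {
  return `${config.cdnNamespace}/${tier}/manifest.json`;
}

// Look up every configured feature's manifest for every tier and store it.
// Returns the keys that could not be found so the caller can report them.
async function syncManifests(
  configs: FeatureConfig[],
  tiers: Tier[],
  fetchManifest: (key: string) => Promise<FeatureManifest | undefined>,
  store: Map<string, FeatureManifest>,
): Promise<string[]> {
  const missing: string[] = [];
  for (const tier of tiers) {
    for (const config of configs) {
      const key = manifestKey(config, tier);
      const manifest = await fetchManifest(key);
      if (manifest) {
        store.set(key, manifest);
      } else {
        missing.push(key);
      }
    }
  }
  return missing;
}
```

Since every key is derived from configuration alone, the sync needs no knowledge of when or how each feature was deployed.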
Feature Delivery
We centralized the ownership of the HTML to geographically distributed Cloudflare Workers that we call the UI Gateway. The two most important jobs of the UI Gateway are to aggregate the manifests for each feature and to provide the HTML to users.
For the manifest aggregation, the UI Gateway has several layers of redundancy and optimizations to make sure all features are up to date for each tier. As mentioned in the Feature Deployment section, feature manifest synchronization is asynchronously triggered after every release. But if the UI Gateway finds itself in a situation where it is missing manifests or they are not valid for some reason, it takes the feature configurations and looks up the manifests directly from the CDN namespace + tier for that feature. Leveraging the asynchronously synchronized manifests collection is extremely fast (sub-millisecond) but these redundancies are in place to make sure we can always load manifests on demand for anomalous situations.
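The fast path and its fallback could be sketched like this. The validity check, the `synced` map, and the injected `fetchFromCdn` function are hypothetical stand-ins for illustration, not the UI Gateway's actual code.

```typescript
interface FeatureManifest { name: string; criticalPathFiles: string[]; }

// Minimal validity check; a real check would likely be stricter.
function isValidManifest(value: unknown): value is FeatureManifest {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as { name?: unknown; criticalPathFiles?: unknown };
  return (
    typeof candidate.name === "string" &&
    Array.isArray(candidate.criticalPathFiles) &&
    candidate.criticalPathFiles.length > 0 &&
    candidate.criticalPathFiles.every((f) => typeof f === "string")
  );
}

// Fast path: the synchronized collection. Slow path: a direct CDN lookup.
async function getManifest(
  key: string,
  synced: Map<string, unknown>,
  fetchFromCdn: (key: string) => Promise<unknown>,
): Promise<FeatureManifest> {
  const cached = synced.get(key);
  if (isValidManifest(cached)) return cached;

  const fresh = await fetchFromCdn(key);
  if (!isValidManifest(fresh)) {
    throw new Error(`No valid manifest found for ${key}`);
  }
  synced.set(key, fresh); // repair the collection for later requests
  return fresh;
}
```

The on-demand lookup is slower than the synchronized collection, which is why it only runs when the collection is missing an entry or holds an invalid one.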
The base HTML includes our App Shell client library, the feature configurations (the routes they own, name, etc.), and the manifest for each feature. With the configurations and manifests in hand, the App Shell can completely manage the lifecycle of all of our features as the user navigates across the product. With this system on Cloudflare Workers, we saw a 30–60% improvement on time to first byte metrics compared to when individual features provided their own HTML delivery at the origin.
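A minimal sketch of how such a base document might be assembled is shown below. The element ids, the embedding of data as a JSON script block, and the function names are assumptions for illustration; in a Cloudflare Worker, the returned string would be sent as the body of the response.

```typescript
interface FeatureConfig { name: string; routes: string[]; }
interface FeatureManifest { name: string; criticalPathFiles: string[]; }

// Escape "<" so embedded JSON cannot close the surrounding <script> element.
function safeJson(value: unknown): string {
  return JSON.stringify(value).replace(/</g, "\\u003c");
}

// Build the base HTML: feature data first, then the App Shell script.
function renderBaseHtml(
  appShellUrl: string,
  configs: FeatureConfig[],
  manifests: FeatureManifest[],
): string {
  return [
    "<!doctype html>",
    "<html>",
    "<head>",
    `<script id="features" type="application/json">${safeJson({ configs, manifests })}</script>`,
    `<script defer src="${appShellUrl}"></script>`,
    "</head>",
    '<body><div id="root"></div></body>',
    "</html>",
  ].join("\n");
}
```

Nothing in this document is user-specific, which is what makes it safe to generate and serve from the edge.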
Because deployments are immutable and delivery is managed by the manifest promotion, our deployment times are extremely fast.
Feature Orchestration
App Shell is a client-side orchestration library that is in charge of common client needs and of managing feature-to-feature navigation and interactions. Once the HTML is delivered, App Shell bootstraps itself, determines the current feature to mount, puts the feature’s critical path files on the page, and asks the feature to run its initialization process. As users navigate across different pages, the App Shell detects navigations to routes owned by other features, instructs the current feature to run its unmount process, loads the next feature’s critical path files if they are not already on the page, and then mounts the next feature. This process repeats with each navigation away from a page.
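The lifecycle described above can be sketched as a small orchestrator. This is a simplified illustration, not the actual App Shell: the `Feature` interface, the prefix-based route matching, and the synchronous, injected `loadFiles` callback are assumptions (a real loader would add script and stylesheet elements to the page and wait for them to load).

```typescript
interface Feature {
  name: string;
  routes: string[];            // path prefixes the feature owns
  criticalPathFiles: string[];
  mount(): void;
  unmount(): void;
}

class AppShell {
  private current: Feature | undefined;
  private loaded = new Set<string>();

  constructor(
    private features: Feature[],
    // Injected so the sketch runs outside a browser.
    private loadFiles: (files: string[]) => void,
  ) {}

  private ownerOf(path: string): Feature | undefined {
    return this.features.find((f) =>
      f.routes.some((r) => path === r || path.startsWith(r + "/")),
    );
  }

  navigate(path: string): void {
    const next = this.ownerOf(path);
    // Unknown route, or a route inside the current feature: nothing to swap.
    if (!next || next === this.current) return;
    if (this.current) this.current.unmount();
    if (!this.loaded.has(next.name)) {
      this.loadFiles(next.criticalPathFiles); // only on first visit
      this.loaded.add(next.name);
    }
    next.mount();
    this.current = next;
  }
}
```

Returning to a previously visited feature skips the load step entirely, which is where the single-page feel across features comes from.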
The App Shell, in essence, ties features together to achieve the Single Page App user experience. Because it manages feature life cycles, it can also provide shared resources such as libraries, common sub-features, etc. which reduces page weight and memory consumption for common tools.
What about…
Memory Consumption
As more and more features are loaded on a page, a valid concern arises around memory consumption. In theory, this is a very real issue but, in reality, our users generally don’t use our product in that way. They generally have a “job to be done”, and they use our product to achieve that job and leave. The only segment of the population that visits every feature of our product within one page load is the people building our products. To ensure this does not bite us, we trigger a full page load after a set number of feature-to-feature navigations. This number is low enough to avoid memory accumulation concerns but large enough to give 99% of users a consistently seamless experience.
Updating Features
If we no longer have full page reloads, how do we ensure users are using the latest features? Similar to the answer in the section titled Memory Consumption, we add logic to trigger a full page load after so many navigations (which triggers a fetch of the latest manifest). This has been successful enough that we haven’t had to add auto-update detection. But auto-update is another approach we have on the roadmap.
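The navigation-count approach used here and in the Memory Consumption section can be sketched as a small counter. The limit of 20 is a made-up value; the article does not state the real number.

```typescript
// Hypothetical budget; the article does not state the real number.
const MAX_SOFT_NAVIGATIONS = 20;

class NavigationBudget {
  private count = 0;

  constructor(private limit: number = MAX_SOFT_NAVIGATIONS) {}

  // Returns "soft" for an in-page feature swap, or "hard" when the next
  // navigation should be a full page load (fresh memory, latest manifests).
  next(): "soft" | "hard" {
    this.count += 1;
    if (this.count > this.limit) {
      this.count = 0; // a full load starts a fresh page, so the count restarts
      return "hard";
    }
    return "soft";
  }
}
```

In a browser, a "hard" result could be handled with `window.location.assign(url)`, while a "soft" result would go through the App Shell's normal mount and unmount flow.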
Webpack Federated Modules
Webpack Federated Modules were in their very early days when we built our new architecture (this post is reflecting on our work after complete propagation of the arch + iterations). Since then, this technology has come a long way to maturity and it serves as validation to see that its approach is very similar to our federation approach. The basic principles are the same, but with our architecture, we have the ability to be more deeply integrated up and down the stack to improve the performance of federating features that we cannot achieve with Webpack Federated Modules. That said, we will continue supporting that initiative and keep an eye on opportunities where it makes sense to leverage it. The work they are doing to provide module federation on the server is especially interesting.
Authenticating HTML Delivery
When delivering HTML from the edge, a natural question is “when do you authenticate the user?” Typically, the pattern web servers follow is to receive an HTML request, check for a user session, and redirect to the login page if they are unauthenticated. This verification step usually takes place at an origin server and is not free from a performance perspective. This check at the origin is common even for public sites. Furthermore, web apps that dynamically load data via API requests also validate a user session again after this initial HTML auth check.
When reviewing that full flow, it is apparent that there are redundant auth checks and that this flow is optimized for an unauthenticated user redirection experience. That accommodation for a less common unauthenticated user scenario incurs a cost for every user. Therefore, we want to optimize for the authenticated user case. To do that, our UI Gateway delivers the HTML optimistically and does not check for authentication, letting a downstream data fetch request made by the features handle detection and redirection if a user is unauthenticated. Our features do this as part of the baseline security measures and do not rely on a previous step to have done this for them. Given our HTML is never sensitive (it contains no user/company data), there is no reason to go all the way to our origin for authentication to download common HTML.
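A feature-side data fetch that handles the unauthenticated case might look roughly like this. The login path, the `returnTo` query parameter, the use of HTTP 401 as the signal, and the injected `request` and `redirect` functions are assumptions for the sketch.

```typescript
// Hypothetical login path and query parameter name.
const LOGIN_PATH = "/login";

// Decide where to send the user based on a data request's HTTP status.
// Returns a redirect target for 401, or null when no redirect is needed.
function loginRedirectFor(status: number, currentPath: string): string | null {
  if (status !== 401) return null;
  return `${LOGIN_PATH}?returnTo=${encodeURIComponent(currentPath)}`;
}

interface HttpResult { status: number; body: unknown; }

// Wrap a feature's data fetch so an unauthenticated response triggers a
// redirect instead of relying on an earlier HTML-level auth check.
async function fetchFeatureData(
  request: () => Promise<HttpResult>,
  currentPath: string,
  redirect: (url: string) => void,
): Promise<unknown> {
  const result = await request();
  const target = loginRedirectFor(result.status, currentPath);
  if (target !== null) {
    redirect(target);
    return undefined;
  }
  return result.body;
}
```

The authenticated majority pays nothing extra in this flow, because the data request had to validate the session anyway.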
Feature Development
When leveraging Cloudflare Workers as the delivery mechanism, how do you handle development when a Cloudflare Worker is not available in a “local” environment? The web architecture offers two options to make front-end development fast and easy for developers:
The primary option is to do feature development on special tiers where all of the infrastructure is hosted similar to Production and with Cloudflare Workers available. In this tier, we have a “developer mode” where the UI Gateway provides all of the manifests like it normally would but also instructs the App Shell to load the manifest of the feature under development from a locally hosted location rather than the CDN. This local manifest points to critical path files that are on a local machine. The critical path files in this developer case can be the output from local bundles and live-reloading steps as feature development happens. Developers only need to run locally the features they are actively working on; everything else is what would be available in production.
The secondary option is to develop in a local environment where all of the infrastructure is recreated but Cloudflare Workers are not in front of every request. In this situation, we want to have the UI Gateway function the same as the former option (providing all manifests and requesting App Shell to fetch the local development manifests). To accomplish this, we provide a local proxy service that handles all incoming requests and calls a local UI Gateway for HTML requests. This UI Gateway logic is a locally hosted Cloudflare Dev server that emulates the Cloudflare Worker runtime.
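Both options rely on the same idea: swap the manifest location for the features under local development and leave the rest untouched. A minimal sketch of that override, with hypothetical URLs and function names, could look like this:

```typescript
// Maps a feature name to the URL of its manifest.
type ManifestLocations = Record<string, string>;

// Replace the CDN manifest URL for each feature under local development.
// Overrides for unknown features are ignored.
function applyDevOverrides(
  cdnManifestUrls: ManifestLocations,
  localOverrides: ManifestLocations,
): ManifestLocations {
  const resolved: ManifestLocations = { ...cdnManifestUrls };
  for (const name of Object.keys(localOverrides)) {
    const url = localOverrides[name];
    if (url !== undefined && name in cdnManifestUrls) {
      resolved[name] = url;
    }
  }
  return resolved;
}

// Example: only "docs" is served from the developer's machine.
const devLocations = applyDevOverrides(
  {
    home: "https://cdn.example.com/home-ui/manifest.json",
    docs: "https://cdn.example.com/docs-ui/manifest.json",
  },
  { docs: "http://localhost:8080/manifest.json" },
);
```

Every feature without an override keeps loading exactly what the tier would normally serve.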
Deploying tightly coupled feature versions
Generally speaking, it’s good practice to avoid having two features be tightly coupled to the point that they have to go out at the same exact time. But when that has to happen, how do we solve that when our features are federated?
One option is to try to deploy them at the “same time” which almost never works out at scale and some users will run into issues. We have a more deterministic option.
We solve this by letting features “pin” the SHA version of the manifest that they want to be deployed across all tiers. This is set in the configuration that UI Gateway owns. So two features can specify that they need to be deployed as these exact versions. Once all tiers have received that version, the pinned versions can be removed and future deploys go back to not having tight couplings.
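One way to picture pinning is as an optional override consulted before the normally promoted version. The field names and helper functions below are hypothetical; the article only states that a pinned SHA is set in the configuration the UI Gateway owns and removed once all tiers have received that version.

```typescript
interface FeatureRelease {
  name: string;
  promotedSha: string; // version most recently promoted to this tier
  pinnedSha?: string;  // optional pin set in the gateway-owned configuration
}

// A pin, when present, wins over whatever was last promoted.
function resolveSha(release: FeatureRelease): string {
  return release.pinnedSha ?? release.promotedSha;
}

// The pin can be dropped once every tier has been promoted to the pinned SHA.
function pinCanBeRemoved(
  pinnedSha: string,
  promotedShaByTier: Record<string, string>,
): boolean {
  return Object.keys(promotedShaByTier).every(
    (tier) => promotedShaByTier[tier] === pinnedSha,
  );
}
```

Two coupled features would each carry a pin, so users get the matching pair regardless of the order in which the individual deployments land.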
Where are we taking this next?
We have a lot of great opportunities to leverage this architecture to provide additional capabilities across all of our features:
- The architecture packages up our features in a way that directly fits into the Progressive Web App approach (PWA). We plan to continue executing on that path to provide a PWA this year.
- Having the App Shell and individual features as two distinct layers on the client-side enables us to create additional systems that are “feature aware” but are not tied to the features. Systems like monitoring, plugins, etc. are really interesting opportunities.
- The UI Gateway and App Shell are in a central repository to provide better tooling for security, privacy, observability, etc. We leverage this to be able to propagate changes across our architecture in a matter of seconds with one PR.