Cloud, Load, and Modular Code: What 2022 Looks Like for Shopify

You may have heard that 2021 was Shopify’s biggest Black Friday Cyber Monday (BFCM) ever. This four-day period was monumental for both Shopify’s merchants and our engineering teams.

Last year’s numbers capture a moment in time but can also help us predict what’s to come in the year ahead. On our cloud in 2021, our peak BFCM traffic surpassed 32 million app server requests per minute (RPM). In the same time period our load balancers peaked at more than 34 million RPM. To put that in perspective, this means that the equivalent of Texas’s total population hit our load balancers in a given minute. One flash sale—a short-lived sale that exceeds our checkout per minute threshold—even generated enough load to use over 20% of our total computing capacity at its peak.

During BFCM 2021, we also:

  • sent nearly 145 million emails
  • averaged 30 TB per minute of egress network traffic
  • handled 42 billion API calls and delivered 13 billion webhooks
  • wrote 3.18 PB and read 15 PB of data from our storefront caching infrastructure
  • performed over 11 Million queries per second and delivered 11 terabytes per second read I/O with our MySQL database fleet

The year ahead poses even bigger challenges for our engineers, data scientists, user experience designers, and product managers. More BFCM sales are happening on mobile devices. More people are shopping on social media. Commerce is happening across a growing array of platforms and buyers expect a fast and consistent experience. If the metaverse becomes a reality, there will be commerce opportunities within that world that need to be explored. What does a flash sale look like in the metaverse and how does that play out?

Infographic of Shopify's BFCM 2021 technical stats
Shopify's technical stats from BFCM 2021

If the data and trends above tell us anything, it's that there’s no getting around the fact that flash sales, huge floods of web traffic, and many different buying environments are a big part of the future of commerce. The questions for me are: What are the enduring challenges for the engineering teams working to enable this incredible growth in the next five to ten years? How do we build scalable products and infrastructure so millions of merchants can go from zero to IPO—and beyond? Engineering at Shopify is about solving challenges and building resilient systems so merchants can focus on their business instead of technology. 

Here are a few things we’re planning on doing in 2022 to work quickly in a world that’s growing rapidly, becoming more global, and at the same time moving closer to where merchants do business and where buyers are shopping.

We are building more modular code. Shopify is famously one of the world’s largest Rails monolith codebases. We’ve been actively changing the architecture of the monolith to a majestic, modular monolith for several years. And more recently, we’ve been changing our architectural patterns as we deconstruct parts of the monolith for better developer productivity

As an example, we split out our storefront rendering process from the modular monolith repo to make sure merchants (and their customers) get the fastest online shopping experience possible. When we were done with the split and some code refactoring work, the results were four times faster cache fill rates and five times faster page render times. Also, pulling the storefront renderer out means it can now be deployed in geographies around the planet without having to deploy our full Rails monolith. The closer we can render the storefront to the buyer, the fewer round-trips between the store and the browser need to be made, again improving overall storefront performance. In 2022, we’re going to continue exploring majestic monoliths. We see that engineers working on repos that directly improve merchant performance, like storefront rendering, iterate and deploy quickly. This model also allows us to put our developer experience first and provide a simpler setup with tighter coupling with our debugging and resiliency tools. 

We are leveraging new cloud development platforms to work more efficiently on a global scale. This year, we’ll spend a lot of time making sure developers can create impact fast—in minutes not hours. We’re moving the majority of our developers into our cloud development environment, called Spin. Devs can spin up (pun intended) a full development environment in seconds as opposed to minutes. You can even have multiple environments for experimentation to share work-in-progress with teammates. (We plan to share more about Spin in the future.)

Another big part of this year will be about building on this cloud development platform foundation to make our developer workflow faster and even smoother. We also moved all of our engineering to working on Apple M1 Macbook Pro laptops and these powerful devices, combined with Spin, are already making developers much more productive. Spin creates opportunities for us to build much improved IDE and browser extensions for enhanced productivity and delight, and an exciting opportunity for us to explore new ways to solve developer problems at scale that just weren’t possible in our previous local development environment paradigm. 

We are making load testing a more natural part of the development process. To prepare for BFCM 2021, we began load testing in July and ran the highest load test in Shopify’s history: a load balancer peak of 50.7 million RPM. But, flash sales that spike in minutes are not as predictable in their load requirements as a seasonal growth pattern like BFCM. To help prepare our infrastructure and products to handle larger and spikier scale, we’re continuing to improve our load testing. These load tests, built in-house, help our teams understand how products handle the larger platform-wide surge scenarios. Our load testing helps test product sales regardless if they are exclusively online, in-person using our retail POS products, or a combination of both. Automating and combining load tests as part of our product development processes is absolutely critical to avoid performance issues as we scale alongside our merchants.

These are a few ways we’re making it as easy as possible for developers to do the best work of their lives. We want to have the right tools so we can be creative about commerce—not “How do I set up my environment?” or “How does my code get built?” Engineers want to work at scale, ship impactful changes on a regular cadence, and work with a great team.

Speaking of great teams, a team of engineers from Shopify and Github built YJIT, a new Just-in-time (JIT) compiler that merged with Ruby 3.1. It’s 31% faster than interpreted CRuby and 26% faster than MJIT, reaching near-peak performance after a single iteration of any benchmark. It’s having a huge impact on the Ruby community inside and outside of Shopify and accelerating lots of production code execution times.

What isn’t changing in 2022: We remain opinionated about our tech stack. We’re all in on Rails and doubling down on React Native for mobile. We are going to continue to make big bets on our infrastructure, on building delightful developer environments, and making sure that we’re building for the success of all of our merchants. BFCM 2022? Bring it on.

Allan Leinwand is Chief Technology Officer at Shopify leading the engineering and data teams. Allan was previously SVP of Engineering at Slack and CTO at ServiceNow. He co-founded and held senior leadership positions at multiple companies, has authored books, and ventured to the dark side as a venture capital investor for seven years. He’s passionate about helping Shopify be the best commerce platform for everyone!

Wherever you are, your next journey starts here! If building systems from the ground up to solve real-world problems interests you, our Engineering blog has stories about other challenges we have encountered. Intrigued? Visit our Engineering career page to find out about our open positions and learn about Digital by Design.