At Shopify, we recognize the positive impact data-informed decisions have on the growth of a business. But we also recognize that data exploration is often out of reach for those without a data science or coding background. To make it easier for our merchants to inform their decisions with data, we built an accessible, commerce-focused querying language. We call it ShopifyQL. ShopifyQL enables Shopify Plus merchants to explore their data with powerful features like easy-to-learn syntax, one-step data visualization, built-in period comparisons, and commerce-specific date functions.
I’ll discuss how ShopifyQL makes data exploration more accessible, then dive into the commerce-specific features we built into the language, and walk you through some query examples.
Why We Built ShopifyQL
As data scientists, engineers, and developers, we know that data is a key factor in business decisions across all industries. This is especially true for businesses that have achieved product-market fit, where optimization decisions are more frequent. Commerce is a broad industry, and the application of data is deeply personal to the context of an individual business, which is why we know it’s important that our merchants be able to explore their data in an accessible way.
Standard dashboards offer a good solution for monitoring key metrics, while interactive reports with drill-down options allow deeper dives into understanding how those key metrics move. Reports and dashboards help merchants understand what happened, but not why it happened. Often, merchants require custom data exploration to understand the why of a problem, or to investigate how different parts of the business were impacted by a set of decisions. For this, they turn to their data teams (if they have them) and the underlying data.
Historically, our Shopify Plus merchants with data teams have employed a centralized approach in which data teams support multiple teams across the business. This strategy helps them maximize their data capability and consistently prioritize data stakeholders in the business. Unfortunately, it also leaves teams competing constantly to have their data needs met. Financial deep dives are prioritized over operational decision support. This leaves marketing, merchandising, fulfillment, inventory, and operations to fend for themselves. They’re then forced to either make decisions with the standard reports and dashboards available to them, or do their own custom data exploration (often in spreadsheets). Most often they end up in the worst-case scenario: relying on their gut and leaving data out of the decision-making process.
Access to the underlying datasets that drive those reports and dashboards is guarded by complex data engineering concepts and languages like SQL. The basics of traditional data querying languages are easy to learn. However, applying querying languages to datasets requires experience with, and knowledge of, the entire data lifecycle (from data capture to data modeling). In some cases, simple commerce-specific data explorations like year-over-year sales require a more complicated query than the basic pattern of selecting data from some table with some filter. This isn’t a core competency of our average merchant. As a result, they get shut out of the data exploration process and lose the ability to inform their decisions with insights gleaned from custom data explorations. That’s why we built ShopifyQL.
A Data Querying Language Built for Commerce
We understand that merchants know their business best, and we want to put the power of their data into their hands. Data-informed decision making is at the heart of every successful business, and with ShopifyQL we’re empowering Shopify Plus merchants to gain insights at every level of data analysis.
With our new data querying language, ShopifyQL, Shopify Plus merchants can easily query their online store data. ShopifyQL makes commerce data exploration accessible to non-technical users by simplifying traditional aspects of data querying like:
- Building visualizations directly from the query, without having to manipulate data with additional tools.
- Creating year-over-year analysis with one simple statement, instead of writing complicated SQL joins.
- Referencing known commerce date ranges (for example, Black Friday), without having to remember the exact dates.
- Accessing data specifically modeled for commerce exploration purposes, without having to connect the dots across different data sources.
Intuitive Syntax That Makes Data Exploration Easy
The ShopifyQL syntax is designed to simplify the traditional complexities of data querying languages like SQL. The general syntax tree follows a familiar querying structure:
FROM {table_name}
SHOW|VISUALIZE {column1, column2,...}
TYPE {visualization_type}
AS {alias1,alias2,...}
BY {dimension|date}
WHERE {condition}
SINCE {date_offset}
UNTIL {date_offset}
ORDER BY {column} ASC|DESC
COMPARE TO {date_offset}
LIMIT {number}
We kept some of the fundamentals of the traditional querying concepts because we believe these are the bedrock of any querying language:
- FROM: choose the data table you want to query
- SELECT: we changed the wording to SHOW because we believe that data needs to be seen to be understood. The behavior of the function remains the same: choose the fields you want to include in your query
- GROUP BY: shortened to BY. Choose how you want to aggregate your metrics
- WHERE: filter the query results
- ORDER BY: customize the sorting of the query results
- LIMIT: specify the number of rows returned by the query.
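To see how these fundamentals fit together, here’s a minimal sketch of a query that uses only the keywords listed above. The `sales` table, the `total_sales` and `product_category` fields, and the `month` grouping are borrowed from the example later in this post. Sorting on a shown metric and the limit of 3 are illustrative assumptions rather than documented behavior.

```shopifyql
FROM sales
SHOW total_sales
BY month
WHERE product_category = 'Shoes'
ORDER BY total_sales DESC
LIMIT 3
```

Read top to bottom, the query picks the dataset, chooses the metric, groups it by month, filters to one product category, and returns the three months with the highest totals.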
On top of these foundations, we wanted to bring a commerce-centric view to querying data. Here’s what we are making available via Shopify today.
1. Start with the context of the dataset before selecting dimensions or metrics
We moved FROM to precede SHOW. It’s more intuitive for users to select the dataset they care about first and then the fields. When you want to know conversion rates, it’s natural to think about the product first and then its conversion rates. That’s why we swapped the order of FROM and SHOW compared to traditional querying languages.
2. Visualize the results directly from the query
Charts are one of the most effective ways of exploring data, and VISUALIZE aims to simplify this process. Most query languages and querying interfaces return data in tabular format and place the burden of visualizing that data on the end user. This means using multiple tools, manual steps, and copying and pasting. The VISUALIZE keyword allows Shopify Plus merchants to display their data in a chart or graph visualization directly from a query. For example, if you’re looking to identify trends in multiple sales metrics for a particular product category:
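As a sketch, a fully spelled-out version of such a query might look like the following. The table, fields, filter, and date offset match the shorter query shown further down. The explicit `TYPE line` clause is an assumption about which chart type applies when none is given.

```shopifyql
FROM sales
VISUALIZE total_sales, gross_sales
TYPE line
BY month
WHERE product_category = 'Shoes'
SINCE -13m
```

The clause order follows the general syntax tree above: the dataset first, then the metrics to chart, the chart type, the grouping, the filter, and the time window.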
We’ve made the querying process simpler by introducing smart defaults that allow you to get the same output with fewer lines of code. The query from above can also be written as:
FROM sales
VISUALIZE total_sales, gross_sales
BY month
WHERE product_category = 'Shoes'
SINCE -13m
The relationship between the query and its output remains explicit, but the user is able to get to the result much faster.
The following language features are currently being worked on, and will be available later this year:
3. Period comparisons are native to the ShopifyQL experience
Whether it’s year-over-year, month-over-month, or a custom date range, period comparison analyses are a staple in commerce analytics. With traditional querying languages, you either have to model a dataset to contain these comparisons as their own entries or write more complex queries that include window functions, common table expressions, or self joins. We’ve simplified that to a single statement. The COMPARE TO keyword allows ShopifyQL users to effortlessly perform period-over-period analysis. For instance, comparing this week’s sales data to last week:
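A query along these lines might look like the sketch below. The `sales` table and `total_sales` field come from the earlier example. The `day` grouping and the `this_week` and `last_week` named date offsets are assumptions for illustration, so check the language docs for the values that SINCE and COMPARE TO actually accept.

```shopifyql
FROM sales
VISUALIZE total_sales
BY day
SINCE this_week
COMPARE TO last_week
```

The comparison is a single clause on an otherwise ordinary query. The equivalent in SQL would typically need a self join or a window function to line up the two periods.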
This powerful feature makes period-over-period exploration simpler and faster; no need to learn joins or window functions. Future development will enable multiple comparison periods for added functionality.
4. Commerce-specific date ranges simplify time period filtering
Commerce-specific date ranges (for example, Black Friday Cyber Monday (BFCM), Christmas Holidays, or Easter) typically involve a manual lookup or a join to some holiday dataset. With ShopifyQL, we take care of the manual aspects of filtering for these date ranges and let the user focus on the analysis.
The DURING statement, in conjunction with Shopify-provided date ranges, allows ShopifyQL users to filter their query results by commerce-specific date ranges. For example, finding out what the top five selling products were during BFCM in 2021 versus 2020:
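Here’s a rough sketch of how such a query could be written. Everything beyond the keywords themselves is hypothetical: the `products` table, the `product_title` dimension, and the `bfcm2021` and `bfcm2020` range names are placeholders, and placing DURING where SINCE and UNTIL would normally go is an assumption, since DURING doesn’t appear in the syntax tree above.

```shopifyql
FROM products
SHOW total_sales
BY product_title
DURING bfcm2021
ORDER BY total_sales DESC
COMPARE TO bfcm2020
LIMIT 5
```

The point of the sketch is that the date range is referenced by name, so the user never has to look up which calendar days BFCM fell on in either year.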
Future development will allow users to save their own date ranges unique to their business, giving them even more flexibility when exploring data for specific time periods.
Check out our full list of current ShopifyQL features and language docs at shopify.dev.
Data Models That Simplify Commerce-Specific Analysis and Explorations
ShopifyQL allows us to access data models that address commerce-specific use cases and abstract the complexities of data transformation. Traditionally, businesses trade off SQL query simplicity for functionality, which limits users’ ability to perform deep dives and explorations. Since they can’t customize the functionality of SQL, their only lever is data modeling. For example, if you want to make data exploration more accessible to business users via simple SQL, you have to either create one flat table that aggregates across all data sources, or a number of use case specific tables. While this approach is useful in answering simple business questions, users looking to dig deeper would have to write more complex queries to either join across multiple tables, leverage window functions and common table expressions, or use the raw data and SQL to create their own models.
Alongside ShopifyQL we’re building exploration data models that are able to answer questions across the entire spectrum of commerce: products, orders, and customers. Each model focuses on the necessary dimensions and metrics to enable data exploration associated with that domain. For example, our product exploration dataset allows users to explore all aspects of product sales such as conversion, returns, inventory, etc. The following characteristics allow us to keep these data model designs simple while maximizing the functionality of ShopifyQL:
- Single flat tables aggregated to a lowest domain dimension grain and time attribute. There is no need for complicated joins, common table expressions, or window functions. Each table contains the necessary metrics that describe that domain’s interaction across the entire business, regardless of where the data is coming from (for example, product pageviews and inventory are product concerns from different business processes).
- All metrics are fully additive across all dimensions. Users are able to leverage the ShopifyQL aggregation functions without worrying about which dimensions are conformed. This also makes table schemas relatable to spreadsheets, and easy to understand for business users with no experience in data modeling practices.
- Datasets support overlapping use cases. Users can calculate key metrics like total sales in multiple exploration datasets, whether the focus is on products, orders, or customers. This allows users to reconcile their work and gives them confidence in the queries they write.
Without the leverage of creating our own querying language, the characteristics above would require complex queries which would limit data exploration and analysis.
ShopifyQL Is a Foundational Piece of Our Platform
We built ShopifyQL for our Shopify Plus merchants, third-party developer partners, and ourselves as a way to serve merchant-facing commerce analytics.
Merchants can access ShopifyQL via our new first-party app ShopifyQL Notebooks
We used the ShopifyQL APIs to build an app that allows our Shopify Plus merchants to write ShopifyQL queries inside a traditional notebook experience. The notebooks app gives users the ultimate freedom of exploring their data, performing deep dives, and creating comprehensive data stories.
ShopifyQL APIs enable our partners to easily develop analytics apps
The Shopify platform allows third-party developers to build apps that enable merchants to fully customize their Shopify experience. We’ve built GraphQL endpoints for access to ShopifyQL and the underlying datasets. Developers can leverage these APIs to submit ShopifyQL queries and return the resulting data in the API response. This allows our developer partners to save time and resources by querying modeled data. For more information about our GraphQL API, check out our API documentation.
ShopifyQL will power all analytical experiences on the Shopify platform
We believe ShopifyQL can address all commerce analytics use cases. Our internal teams are going to leverage ShopifyQL to power the analytical experiences we create in the Shopify Admin—the online backend where merchants manage their stores. This helps us standardize our merchant-facing analytics interfaces across the business. Since we’re also the users of the language, we’re acutely aware of its gaps, and can make changes more quickly.
Looking ahead
We’re planning new language features designed to make querying with ShopifyQL even simpler and more powerful:
- More visualizations: Line and bar charts are great, but we want to provide more visualization options that help users discover different insights. New visualizations on the roadmap include dual-axis charts, funnels, annotations, scatter plots, and donut charts.
- Pivoting: Pivoting data with a traditional SQL query is a complicated endeavor. We will simplify this with the capability to break down a metric by dimensional attributes in a columnar fashion. This will allow for charting trends of dimensional attributes across time for specific metrics with one simple query.
- Aggregate conditions: Akin to a HAVING statement in SQL, we are building the capability for users to filter their queries on an aggregate condition. Unlike SQL, we’re going to allow for this pattern in the WHERE clause, removing the need for additional language syntax and keyword ordering complexity.
As we continue to evolve ShopifyQL, our focus will remain on making commerce analytics more accessible to those looking to inform their decisions with data. We’ll continue to empower our developer partners to build comprehensive analytics apps, enable our merchants to make the most out of their data, and support our internal teams with powering their merchant-facing analytical use cases.
Ranko is a product manager working on ShopifyQL and data products at Shopify. He’s passionate about making data-informed decisions more accessible to merchants.
If you’re passionate about solving data problems and eager to learn more, we’re always hiring! Reach out to us or apply on our careers page.