4 Tips for Shipping Data Products Fast

4 Tips for Shipping Data Products Fast

Shipping a new product is hard. Doing so under tight time constraints is even harder. It’s no different for data-centric products. Whether it’s a forecast, a classification tool, or a dashboard, you may find yourself in a situation where you need to ship a new data product in a seemingly impossible timeline. 

Shopify’s Data Science team has certainly found itself in this situation more than a few times over the years. Our team focuses on creating data products that support our merchants’ entrepreneurial journeys, from their initial interaction with Shopify, to their first sale, and throughout their growth journey on the platform. Commerce is a fast changing industry, which means we have to build and ship fast to ensure we’re providing our merchants with the best tools to help them succeed.

Along the way, our team learned a few key lessons for shipping data products quickly, while maintaining focus and getting things done efficiently—but also done right. Below are four tips that are proven to help you ship new products fast. 

1. Utilize Design Sprints 

Investing time in a design sprint pays off down the line as you approach deadlines. The design sprint (created by Google Ventures) is “a process for answering critical business questions through design, prototyping and testing ideas with customers.” Sprints are great for getting a new data product off the ground quickly because they carve out specific time blocks and resources for you and your team to work on a problem. The Shopify Data Science teams make sprints a common practice, especially when we’re under a tight deadline. When setting up new sprints, here are the steps we like to take:

  1. Choose an impactful problem to tackle. We focus on solving problems for our merchants, but in order to do that, we first have to uncover what those problems are by asking questions. What is the problem we’re trying to solve? Why are we solving this problem? Asking questions empowers you to find a problem worth tackling, identify the right technical solution and ultimately drive impact.
  2. Assemble a small sprint team: Critical to the success of any sprint is assembling a small team (no more than 6 or 7) of highly motivated individuals. Why a small team? It’s easier to stay aligned in a smaller group due to better communication and transparency, which means it’s easier to move fast.
  3. Choose a sprint Champion: This individual should be responsible for driving the direction of the project and making decisions when needed (should we use solution A or B?). Assigning a Champion helps remove ambiguity and allow the rest of the team to focus their energy on solving the problem in front of them.
  4. Set your sprint dates: Timeboxing is one of the main reasons why sprints are so effective. By setting fixed dates, you're committing your team to focus on shipping on a precise timeline. Typically, a sprint lasts up to five days. However, the timeline can be shortened based on the size of the project (for example, three days is likely enough time for creating the first version of a dashboard that follows the impact of COVID-19 on the business’s acquisition funnel).

With your problem identified, your team set up, and your dates blocked off, it’s now time to sprint. Keep in mind while exploring solutions that solving a data-centric problem with a non-data focused approach can sometimes be simple and time efficient. For instance, asking a user for its preferred location rather than inferring it using a complex heuristic.

2. Don’t Skip Out on Prototyping

Speed is critical! The first iterations of a brand new product often go through many changes. Prototypes allow for quick and cheap learning cycles. They also help prevent the sunk cost fallacy (when a past investment becomes a rationale for continuing). 

In the data world, a good rule of thumb is to leverage spreadsheets for building a prototype. Spreadsheets are a versatile tool that help accelerate build times, yet are often underutilized by data scientists and engineers. By design, spreadsheets allow the user to make sense of data in messy contexts, with just a few clicks. The built-in functions cover most basic use cases: 

  • cleaning data by hand rapidly
  • displaying graphs
  • computing basic ranking indices
  • formatting output data.

While creating a robust system is desirable, it often comes at the expense of longer building times. When releasing a brand new data product under a tight timeline, the focus should be on developing prototypes fast. 

A sample Google Sheet dashboard evaluating Inbound Leads.  The dashboard consists of 6 charts.  The 3 line charts on the left measure Lead Count, Qualification Rate %, and Time to Qualification in Minutes.  The 3 bar charts on the right  measure Leads by Channel, Leads by Country, and Leads by Last Source Touched.
An example of a dashboard prototype created within Google Sheets.

Despite a strong emphasis on speed, a prototype should still look and feel professional. For example, the first iteration of the Marketing attribution tool developed for Shopify’s Revenue team was a collection of SQL queries automated by a bash script. The output was then formatted in a spreadsheet. This allowed us to quickly make changes to the prototype and compare it to out-of-the-box tools. We avoided any wasted effort spinning up dashboards, production code, as well as any sentimental attachment to the work, which made it easier for the best solution to win.

3. Avoid Machine Learning (on First Iterations)

When building a new data product, it’s tempting to spend lots of time on a flashy machine learning algorithm. This is especially true if the product is supposed to be “smart”. Building a machine learning model for your first iteration can cost a lot of time. For example, when sprinting to build a lead scoring system for our Sales Representatives, our team spent 80% of the sprint gathering features and training a model. This left little time to integrate the product with the existing customer relationship management (CRM) infrastructure, polish it, and ask for feedback. A simple ranking using a proxy metric would be much faster to implement for the first iteration. The time gained would allow for more conversations with the users about the impact, use and engagement with the tool. 

We took that lesson to heart in our next project when we built a sales forecasting tool. We started with a linear regression using only two input variables that allowed us to have a prototype ready in a couple of hours. Using a simple model allowed us to ship fast and quickly learn whether it solved our user’s problem. Knowing we were on the right track, we then built a more complex model using machine learning.

Focus on building models that solve problems and can be shipped quickly. Once you’ve proven that your product is effective and delivers impact, then you can focus your time and resources on building more complex models. 

4. Talk to Your Users

Shipping fast also means shipping the right product. In order to stay on track, gathering feedback from users is invaluable! It allows you to build the right solution for the problem you’re tackling. Take the time to talk to your users, before, during, and after each build iteration. Shadowing them, or even doing the task yourself is a great return on investment.

Gathering feedback is an art. Entire books and research papers are dedicated to it. Here are the two tips we use at Shopify that increased the value of feedback we’ve received:

  • Ask specific questions. Asking, “Do you have any feedback?” doesn’t help the user direct their thoughts. Questions like, “How do you feel about the speed at which the dashboard loads?” or “Are you able to locate the insights you need on this dashboard to report on top of funnel performance?” are more targeted and will yield richer feedback.
  • Select a diverse group of users for feedback. Let’s suppose that you are building a dashboard that’s going to be used by three regional teams. It’s more effective to send a request for feedback to one person in each team rather than five people in a single team.
A sample Google Form that measures Prototype A's Scoring.  The form consists of 2 questions. The first question is "Is the score easy to parse and interpret? It is scored using a ranking from 1 - 5 with 1 = Very Hard and 5 = Very Easy. The 2nd question is "Additional Comments" and has a text field for the answer.
Feedback our team asked for the scoring system we created. When asking for feedback, you want to ask specific questions so you can yield better feedback.

We implemented the two tips when requesting feedback from users of the sales forecasting tool highlighted in the previous section. We asked a diverse group specific questions about our product, and learned that displaying a numerical score (0 - 100) was confusing. The difference between scores wasn’t clear to the users. Instead, it was suggested to display grades (A, B, C) which turned out to be much quicker to interpret and led to a better user experience.

At Shopify, following these tips has provided the team with a clearer path for launching brand new data products under tight time constraints. More importantly, it helped us avoid common pitfalls like getting stuck during neverending design phases, overengineering complex machine learning systems, or building data products that users don’t use. 

Next time you’re under a tight timeline to ship a new data product, remember to:

  1. Utilize design sprints to help focus your team’s efforts and remove the stress of the ticking clock
  2. Don’t skip on prototyping, it’s a great way to fail early
  3. Avoid machine learning (for first iterations) to avoid being slowed down by unnecessary complexity
  4. Talk to your users so you can get a better sense of what problem they’re facing and what they need in a product

If you’d like to read more about shipping new products fast, we recommend checking out The Design Sprint book, by Jake Knapp et al. which provides a complete framework for testing new ideas.

If you’re interested in helping us ship great data products, quickly, we’re looking for talented data scientists to join our team.