by Commoncog

Getting Started with Data Collection

by Bennett Clement

Let’s say that you’re running a business, and you wonder if having a good handle on data would help you run it better. So you start thinking about where you’d get your business data from. You think about the various tools you use to run your business: you have a sales tool (perhaps HubSpot?), some product analytics, an email marketing tool, plus some customer onboarding tracked inside Amplitude. You know that each of them exposes a handful of metrics.

But that is a lot of tools and even more metrics! It’s overwhelming. There’s so much to do. Where do you even start?

You remember reading about how big companies use data, and they appear to have it all figured out. Large companies have data warehouses, with fancy data teams that pull in reams of data from their software using expensive, custom-built data pipelines. They then run all these fancy data analyses on it. You often see their employees bragging about all their sophistication on LinkedIn.

All of this seems complicated and far removed from your current business reality.

Is there something simple you can do right now to start making informed decisions?

The answer is, of course, yes. Here’s a story from the early days of what is now a massively successful e-commerce business.

Back in 1997, Amazon had the Analytics Package that executives would read every month. They considered the Analytics Package a serious tool to make Amazon’s business legible.

The Analytics Package had 100 pages of graphs on every aspect of Amazon’s business! Revenue. Editorial. Marketing. Operations. Customer Service. Headcount. G&A. Customer sentiment. Market penetration. The lifetime value of a customer. Inventory turns.

And how did they create this Analytics Package, you ask? They didn’t use fancy tools. They manually entered the data for all of these pages of graphs into Excel. Every. Single. Month.

Eugene Wei, who prepared the monthly Analytics Package, tells the story:

This was back in the days before entire companies focused on building internal dashboards and analytical tools, so the Analytics Package was done with what we might today consider as comparable to twigs and dirt in sophistication. I entered the data by hand into Excel tables, generated and laid out the charts in Excel, and printed the paper copies.

After hours of laying out charts in Excel, Eugene still needed to print 100 copies of this 100-page document. It was a lot of work. And printers are not the most reliable pieces of technology.

… A sheet will jam somewhere. The ink cartridge will go dry. How many collated copies do you risk printing at once? Too few and you have to go through the setup process again. Too many and you risk a mid-job error, which then might cascade into a series of ever more complex tasks, like trying to collate just the pages still remaining and then merging them with the pages that were already completed. [If you wondered why I had to insert page numbers by hand, it wasn’t just for ease of referencing particular graphs in discussion; it was also so I could figure out which pages were missing from which copies when the copy machine crapped out.]

You could try just resuming the task after clearing the paper jam, but in practice it never really worked. I learned that copy machine jams on jobs of this magnitude were, for all practical purposes, failures from which the machine could not recover.

I became a shaman to all the copy machines in our headquarters at the Columbia building. I knew which ones were capable of this heavy duty task, how reliable each one was. Each machine’s reliability fluctuated through some elusive alchemy of time and usage and date of the last service visit. Since I generally worked late into every night, I’d save the mass copy tasks for the end of my day, when I had the run of all the building’s copy machines.

Sometimes I could sense a paper jam coming just by the sound of machine’s internal rollers and gears. An unhealthy machine would wheeze, like a smoker, and sometimes I’d put my hands on a machine as it performed its service for me, like a healer laying hands on a sick patient. I would call myself a copy machine whisperer, but when I addressed them it was always a slew of expletives, never whispered. Late in my tenure as analyst, I got budget to hire a temp to help with the actual printing of the monthly Analytics Package, and we keep in touch to this date, bonded by having endured that Sisyphean labor.

Despite all the manual work, preparing these charts helped Eugene understand how money flowed through Amazon’s business. He could even predict the next quarter’s revenue with high certainty. In his words:

For all the painful memories that cling to the Analytics Package, I consider it one of the formative experiences of my career. In producing it, I felt the entire organism of our business laid bare before me, its complexity and inner working made legible. The same way I imagine programmers visualizing data moving through tables in three dimensional space, I could trace the entire ripple out from a customer’s desire to purchase a book, how a dollar of cash flowed through the entire anatomy of our business. I knew the salary of every employee, and could sense the cost of their time from each order as the book worked its way from a distributor to our warehouse, from a shelf to a conveyor belt, into a box, then into a delivery truck. I could predict, like a blackjack player counting cards in the shoe, what % of customers from every hundred orders would reach out to us with an issue, and what % of those would be about what types of issues.

I knew, if we gained a customer one month, how many of their friends and family would become new customers the next month, through word of mouth. I knew if a hundred customers made their first order in January of 1998, what % of them would order again in February, and March, and so on, and what the average basket size of each order would be. As we grew, and as we gained some leverage, I could see the impact on our cash flow from negotiating longer payable days with publishers and distributors, and I’d see our gross margins inch upwards every time we negotiated better discounts off of list prices.


At Amazon, I could see our revenue next quarter to within a few percentage points of accuracy, and beyond. The only decision was how much to tell Wall Street we anticipated our revenue being. Back then, we always underpromised on revenue; we knew we’d overdeliver, the only question was how much we should do so and still maintain a credible sense of surprise on the next earnings call.

We now live in the digital world, so you no longer have to print out charts. But initially, a little manual work is still the easiest way to get started. Here’s a solid method we recommend:

  1. Make a copy of this Google Sheet.
  2. Choose two or three input metrics and two or three output metrics that you care about.
  3. Every Monday morning, copy and paste the data for these metrics from the various tools you use into the Google Sheet you just copied. This Google Sheet will be the single source of truth for your business data.
  4. The first time you do this, also backfill some historical data; three months is a good starting point.
  5. Once you have enough data points, you can start analyzing. Our favorite way to get started is to paste the data into the Xmrit tool and see if any of our metrics have recently changed. In a future article, we will walk through an example analysis using Xmrit.
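If you are curious what "has this metric changed?" looks like in numbers, here is a minimal sketch of the standard XmR chart calculation in Python. It is not Xmrit's own code, and Xmrit may apply additional detection rules; this only checks whether a new value falls outside the natural process limits. The function names and the weekly signup figures are made up for illustration.

```python
from statistics import mean


def xmr_limits(baseline):
    """Compute XmR chart lines from a list of baseline values.

    Uses the standard XmR scaling constants: 2.66 for the natural
    process limits and 3.268 for the upper range limit.
    """
    if len(baseline) < 2:
        raise ValueError("need at least two data points")
    # Moving range: absolute difference between consecutive values.
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    centre = mean(baseline)
    avg_mr = mean(moving_ranges)
    return {
        "centre": centre,
        "upper": centre + 2.66 * avg_mr,
        "lower": centre - 2.66 * avg_mr,
        "range_upper": 3.268 * avg_mr,
    }


def is_outside_limits(value, limits):
    """True if a value falls outside the natural process limits."""
    return value > limits["upper"] or value < limits["lower"]


# Hypothetical data: about three months of weekly signups copied
# from the spreadsheet, followed by this Monday's new number.
baseline = [52, 48, 55, 50, 47, 53, 51, 49, 54, 50, 52, 48]
this_week = 80

limits = xmr_limits(baseline)
print(limits)
print("Possible change:", is_outside_limits(this_week, limits))
```

A value inside the limits is most likely routine week-to-week variation; a value outside them is worth investigating.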

You can start small. And if this seems repetitive, remember: Amazon started the same way.

Yes, copying and pasting data every week is a pain. But a lack of budget or tools is no excuse: if understanding your business well enough to grow it is such a great prize, then the pain of doing this manually is well worth it.

Amazon built a billion-dollar business on manual Excel input. If they could do it, so can you.

Last Updated: 21 May 2024
