Welcome to another installment of Dog People Dig Data! Last time, we shared our philosophy toward pedagogy, the method and practice of teaching. This week, we'll dive into some specific challenges we've been facing with data democratization and how we address them.
Challenge: source of truth
When multiple people are writing queries and creating reports, multiple ways of looking at the data will arise—and conflicts do too. People ask similar questions in slightly different ways and define business metrics differently. This can have benefits. For one, it forces us to examine our assumptions. In the best-case scenario, it makes everyone better, since there is a wealth of approaches from which the best can be selected. However, the lack of a single Source of Truth presents an ugly side if it's left to grow unchecked. A feeling of uncertainty about the data can seep into the discourse. Carried to its worst conclusion, it leads to a dysfunctional atmosphere where meetings are spent arguing over which numbers are the real ones. This is counter to the open, collaborative culture we want to build.
To address this challenge, the Data Engineering team takes documentation and standardization very seriously.
- Together with the Analyst Team, we maintain a set of Standard Reports, abstractions that are commonly used throughout the organization. We promote these abstractions in our documentation and recommendations.
- We maintain a Data Warehouse Data Dictionary, meticulously documenting the meaning of each field and exactly how each metric is derived. When new abstractions are added, or new data sources are onboarded, adding documentation is just part of the task. These living documents are a cornerstone of our data ecosystem.
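To make the Source-of-Truth idea concrete, here is a minimal sketch of one way a standard metric can be defined once and shared, rather than re-derived in every analyst's query. The schema, table, and view names are invented for illustration; they are not Rover's actual warehouse objects.

```python
import sqlite3

# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bookings (
        booking_id INTEGER PRIMARY KEY,
        owner_id   INTEGER NOT NULL,
        sitter_id  INTEGER NOT NULL,
        status     TEXT    NOT NULL  -- 'completed' or 'cancelled'
    );

    -- The shared abstraction: 'completed bookings' is defined once,
    -- here, instead of inside each analyst's individual query.
    CREATE VIEW completed_bookings AS
        SELECT * FROM bookings WHERE status = 'completed';
""")

conn.executemany(
    "INSERT INTO bookings VALUES (?, ?, ?, ?)",
    [(1, 10, 20, "completed"),
     (2, 11, 20, "cancelled"),
     (3, 10, 21, "completed")],
)

# Every downstream report queries the shared view, so the numbers agree.
(count,) = conn.execute("SELECT COUNT(*) FROM completed_bookings").fetchone()
print(count)  # → 2
```

Because every report reads from the same view, a change to the metric's definition happens in one place—and the Data Dictionary entry for `completed_bookings` is the natural spot to document why.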
Challenge: quality assurance
With folks learning SQL, errors can and will pop up. In fact, we intentionally want to create an atmosphere where people are not afraid to try something new, and not afraid to make a mistake along the way. Users will generate inaccurate results, and they will misinterpret data. Our approach to this challenge isn’t about preventing mistakes, but about identifying them and turning them into valuable learning opportunities that make everyone better.
In our highly collaborative, data-driven culture, peer review is standard practice. The more high-profile the work, the more pairs of eyes scrutinize it. Since the overall level of data literacy in the organization is high, there are plenty of qualified reviewers. Our team leads by example, demonstrating a standard procedure for responding to data validation questions and publishing our exploratory analysis, even if it's messy and imperfect.
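As one illustrative example of the kind of sanity check a reviewer might run during validation, a quick reconciliation confirms that a breakdown sums to its reported total. The figures and city names here are made up for the sketch:

```python
# Hypothetical check: city-level booking counts should reconcile
# exactly with the overall total reported elsewhere in the analysis.
city_bookings = {"Seattle": 120, "Portland": 85, "Spokane": 40}
reported_total = 245

breakdown_total = sum(city_bookings.values())
if breakdown_total != reported_total:
    raise ValueError(
        f"Totals disagree: breakdown={breakdown_total}, "
        f"reported={reported_total}"
    )
print("reconciled:", breakdown_total)  # → reconciled: 245
```

Checks like this are cheap to write, and when one fails it becomes exactly the kind of learning opportunity described above: a concrete, blame-free starting point for asking where two queries diverged.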
Currently, we have more people leveraging more data from more sources more effectively than we ever have before, but there’s always so much more to do! We’re constantly exploring and evaluating.
- When and how do we best collaborate vs. specialize on a wide variety of tasks?
- Where can we automate, and how high of a priority is automation?
- When do we escalate a request to a data engineer or data scientist?
- How do we provide opportunities for people to grow their skills continuously?
We’re developing procedures organically as we go. One way we establish these norms as they crystallize is via our intranet site, which serves as a hub and starting point for all things Data. There, we carefully catalogue and archive helpful guides, walkthroughs, resources, procedure documents, and answers to FAQs. Our growth brings a constant press of new challenges. Together, we will continue to care for the culture of autonomy and accountability that we value.
Sarah Johnson is a data engineer at Rover. Prior to joining Rover, she was a high school math teacher, actuary and business intelligence analyst at a leading pet insurance company. 📈📇🐶🤓✨