Getting Started with Data Products Series: Part One

September 7, 2023

Data practitioners are now at the point where they need data to start pulling its weight. Considering the cost of storage, processing, and—most of all—querying, managing data as just assets simply won’t do anymore. A “data asset” isn’t valuable unless it becomes information; at rest, it has no value.

Enter the data product, an actionable approach that turns data assets into information, with all the elements of trust, management, metadata, and access built in. Data product creation and use is transforming the very nature of infrastructure today, because it not only saves time and money but also enables data to be used quickly for decisions and learning.

In this series, we’ll cover the basics of data products and how you can get started using data products to drive value for your business.

More data—and more data tools—don’t mean better results

Merely having data doesn’t solve anything. Without some way to unlock its value, your data becomes just another asset to manage. Many companies have begun deploying data tools to make their data usable, which has led the global data preparation tools market to expand at a compound annual growth rate (CAGR) of 18.6% from 2021 to 2028.
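To put that growth rate in perspective, here is a minimal sketch of what an 18.6% CAGR compounds to. It assumes seven annual compounding periods between 2021 and 2028, which is our reading of the range and may differ from how the underlying market report counts; the function name is ours.

```python
def growth_factor(cagr: float, years: int) -> float:
    """Return the total growth multiple implied by a compound annual growth rate."""
    return (1 + cagr) ** years


# Assumption: 2021 -> 2028 is treated as seven compounding periods.
factor = growth_factor(0.186, 2028 - 2021)
print(f"Implied market growth: about {factor:.1f}x")
```

Under that assumption the market would end the period at roughly 3.3 times its starting size.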

There are thousands of data tools on the market, and the average company uses 15 tools in its stack. You would think that this sophisticated tooling means businesses are getting greater value from their data, but if the data is trapped in systems that don’t maximize its potential, it is back to being a burdensome and costly asset. Without a new approach, companies struggle to benefit from the data they have access to because they are unable to understand, find, and take action on it. As a result, a whopping 73% of available data is wasted and left unused.

The problem: Data tools can be limiting

First and foremost, most tools available today are point solutions, which may be well suited to a single function such as ETL or cataloguing data. Because the majority of these tools cannot perform multiple functions, companies tend to add to their stack one tool at a time as performance needs increase. This problem has grown worse as more and more data tools, many of them ultra-specialized, are released into the market. Now, even small companies find themselves using countless tools to handle their data, which dramatically increases the complexity of data workflows and makes it difficult to assemble a comprehensive and accurate dataset. It also requires someone on the team to manage the integration of the entire stack.

Second, having many data tools in the mix makes it difficult for users to find data, especially if they’re new data users in a company. Users don’t know which tool to use to find certain data, there’s little indication of whether it’s the right or most up-to-date data, and it’s difficult to tell if data from a certain tool is even appropriate for their given need. And as companies continue to add more and more tools, nesting an ever-increasing number of data tools together, it becomes exponentially more challenging to hunt down the right information.

Third, using a suite of disparate data tools means that each tool has its own protocols and its own data quality levels. Consequently, users don’t know whether the data in one tool is accurate or whether another tool is presenting a different picture of the same data.

Finally, at their core, data tools are designed for data professionals, not for the average user. They are often used by data analysts or data scientists to manipulate data in order to gain insights or create reports, and they may require coding knowledge or prior experience using other data tools for data cleansing, data transformation, data visualization, and data analysis. While this is fine when only data professionals are using data tools, it makes it difficult to achieve more widespread adoption by others in an organization, which in turn results in lower ROI.

The solution: Data products fill in the gaps

Faced with these compounding issues and increasing complexity, organizations can struggle to find solutions, and that’s where data products come in. A data product takes data from disparate primary sources and brings it together to make it searchable, findable, accessible, and usable by an average business user. While data products can take many forms, such as software applications or online platforms, the key attribute is that they help transform distributed raw data into valuable information, knowledge, or actionable insights for users.
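As a rough illustration of the idea, the sketch below combines two source extracts into a single record set and attaches the metadata (owner, description, tags) that makes it findable. Everything here is hypothetical: the DataProduct class, the customer_360 name, the field names, and the sample rows are invented for the example and do not describe any particular platform.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """Hypothetical data product: combined records plus descriptive metadata."""
    name: str
    owner: str
    description: str
    tags: list
    records: list = field(default_factory=list)

    def matches(self, term: str) -> bool:
        """True if the search term appears in the name, description, or tags."""
        term = term.lower()
        return (
            term in self.name.lower()
            or term in self.description.lower()
            or any(term in tag.lower() for tag in self.tags)
        )


def build_customer_product(crm_rows, billing_rows):
    """Join two hypothetical source extracts on customer_id."""
    billing_by_id = {row["customer_id"]: row for row in billing_rows}
    combined = []
    for row in crm_rows:
        billing = billing_by_id.get(row["customer_id"], {})
        # Customers with no billing row keep a None spend rather than being dropped.
        combined.append({**row, "monthly_spend": billing.get("monthly_spend")})
    return DataProduct(
        name="customer_360",
        owner="data-team@example.com",
        description="Customer profile joined with billing spend",
        tags=["customer", "billing"],
        records=combined,
    )


def search(catalog, term):
    """Return every data product in the catalog whose metadata matches the term."""
    return [product for product in catalog if product.matches(term)]


# Invented sample data standing in for two separate source systems.
crm = [{"customer_id": 1, "name": "Acme"}, {"customer_id": 2, "name": "Globex"}]
billing = [{"customer_id": 1, "monthly_spend": 1200}]

product = build_customer_product(crm, billing)
catalog = [product]
print([p.name for p in search(catalog, "billing")])
```

The point of the sketch is that a business user searches by plain terms against the metadata and gets one combined, described dataset back, rather than needing to know which source tool holds which piece.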

Next: Data products take you from “analysis-ready data” to “business-ready data” and the “four factors”...