Data Products – summary

Data products are the end result of moving files or data to the cloud, followed by ETL, processing, de-duplication, curation, and storage in a consumable layer. There is no hard definition of what a data product is. A data product is composed of one or more data sets, code, and metadata, and it is self-standing.

TOGAF’s architectural domains (Business, Data, Application, and Technology) are used to set up the layout. Let’s quickly examine the structural components on the diagram.

Business Architecture: Motivation

In a data mesh or data fabric, data is treated as a domain-owned product that has intrinsic value and purpose for the organisation.

In most cases, this purpose is tied to particular business objectives, especially for consumer-oriented data products. The data product serves pre-defined (business) analytics needs.

This does not mean that the data product is restricted to that one analytical requirement. It can flexibly serve other analytics demands, much like traditional data or information marts. However, there is a primary analytical scenario that motivates its development.

Business Architecture: Organization

There are two main actors: the product owner and the data product engineering team.

A product owner is a person who leverages the business value of the data product, can drive its development, and takes care of evangelizing it across the organization.

An engineering team is a team of people who are closest to both the business and the data, and who are capable of implementing the end-to-end flow within a business domain. Ideally, the team is dedicated to that business domain. This allows for decentralized ownership, eliminating the need for cross-team synchronization.

Data Architecture

Business Domain & Corresponding Business Glossary: A data product always belongs to a specific business domain, which owns it.

A business glossary is a dictionary of terms used to describe the purpose of the data product and all its data attributes.
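As a minimal sketch of the glossary idea, the snippet below models a glossary as a plain dictionary and reports which data attributes still lack a definition. The class name `BusinessGlossary`, the method `undefined_attributes`, and the sample terms are hypothetical and chosen only for illustration; a real glossary would normally live in a catalog tool.

```python
from dataclasses import dataclass, field


@dataclass
class BusinessGlossary:
    """Hypothetical glossary: maps a term name to its business definition."""
    terms: dict[str, str] = field(default_factory=dict)

    def undefined_attributes(self, attributes: list[str]) -> list[str]:
        """Return the attributes that have no glossary entry, in input order."""
        return [name for name in attributes if name not in self.terms]


# Illustrative sample data, not taken from any real data product.
glossary = BusinessGlossary(terms={
    "customer_id": "Unique identifier of a customer.",
    "order_total": "Gross value of an order, including tax.",
})

missing = glossary.undefined_attributes(["customer_id", "order_total", "churn_score"])
print(missing)  # ['churn_score']
```

A check like this could run whenever the attribute list of a data product changes, so that every attribute stays described in business terms.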

External Data Source: an external source of data used to populate the data product. Not all data products use external data sources, but source-oriented ones always do.

Sibling Data Product: if a data product is not source-oriented but aggregate-oriented or consumer-oriented, you need to determine which other data products it ingests data from.
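The input dependencies described above can be sketched as a small descriptor with a consistency check. This is one possible reading of the rule, under the assumption that a source-oriented product must list at least one external source and any other product must list at least one sibling data product. All names here (`Orientation`, `DataProductInputs`, `problems`, and the sample product names) are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class Orientation(Enum):
    SOURCE = "source-oriented"
    AGGREGATE = "aggregate-oriented"
    CONSUMER = "consumer-oriented"


@dataclass
class DataProductInputs:
    """Hypothetical descriptor of where a data product gets its data."""
    name: str
    orientation: Orientation
    external_sources: list[str] = field(default_factory=list)
    sibling_products: list[str] = field(default_factory=list)

    def problems(self) -> list[str]:
        """Return human-readable issues; an empty list means the inputs look consistent."""
        issues = []
        if self.orientation is Orientation.SOURCE:
            if not self.external_sources:
                issues.append(
                    "source-oriented product needs at least one external data source"
                )
        elif not self.sibling_products:
            issues.append(
                f"{self.orientation.value} product needs at least one sibling data product"
            )
        return issues


# Illustrative products only.
raw_orders = DataProductInputs(
    "raw_orders", Orientation.SOURCE, external_sources=["erp_orders_db"]
)
customer_360 = DataProductInputs("customer_360", Orientation.CONSUMER)

print(raw_orders.problems())    # []
print(customer_360.problems())  # ['consumer-oriented product needs at least one sibling data product']
```

Returning a list of issues rather than raising an exception keeps the check usable in a review step, where all problems are reported at once.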

Application Architecture

Consumption Interfaces: the channels or methods through which users interact with and use the data product.

Consumption Applications: software applications or tools through which users consume or interact with data products. These applications give users access to the data product’s features, functionality, and insights.

Technology Architecture

Technology Profile: the technology stack required to implement the data product.

Platform and Technology services define a set of interfaces to provision required resources.
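To make the relationship between a technology profile and platform services more concrete, here is a rough sketch in which the platform exposes a single provisioning interface and the data product team calls it with its profile. Everything in it is an assumption for illustration: `TechnologyProfile`, `PlatformService`, `provision`, `InMemoryPlatform`, and the resource naming scheme do not come from any specific platform.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class TechnologyProfile:
    """Hypothetical technology stack a data product asks for."""
    storage: str
    compute: str


class PlatformService(Protocol):
    """Hypothetical interface the platform offers for provisioning resources."""

    def provision(self, product_name: str, profile: TechnologyProfile) -> dict[str, str]:
        ...


class InMemoryPlatform:
    """Stand-in platform that only records requests and returns made-up resource names."""

    def __init__(self) -> None:
        self.requests: list[tuple[str, TechnologyProfile]] = []

    def provision(self, product_name: str, profile: TechnologyProfile) -> dict[str, str]:
        self.requests.append((product_name, profile))
        return {
            "storage": f"{profile.storage}://{product_name}",
            "compute": f"{profile.compute}:{product_name}",
        }


def set_up(product_name: str, profile: TechnologyProfile,
           platform: PlatformService) -> dict[str, str]:
    """Ask the platform for the resources named in the technology profile."""
    return platform.provision(product_name, profile)


profile = TechnologyProfile(storage="object-store", compute="sql-engine")
resources = set_up("customer_360", profile, InMemoryPlatform())
print(resources["storage"])  # object-store://customer_360
```

Depending on an interface rather than a concrete platform lets the engineering team declare what it needs while the platform team decides how the resources are actually created.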

Data Product Components

Once you’ve defined the external dependencies, the next step involves defining the data product components. These are the main parts that enable the data product to serve its purpose.