Rethink Metadata … Mind Map [Business, Capabilities & Products]

Deepak Chandramouli
Published in StellarSense
4 min read · Nov 14, 2021


Background Photo by Ricardo L on Unsplash

Previously …

We discussed the Relevance of Metadata earlier in this blog series.

In this part, let’s explore various Metadata Products and Capabilities, and also see how these connect to deliver value for the top level Business Functions.

We will navigate bottom-up through the three levels: Products, Capabilities and Business Functions.

1. Specialized metadata products are necessary to operate with high efficiency in various facets of an Enterprise.

  • App Lifecycle Management: key to managing the end-to-end lifecycle of applications. A few key features include app onboarding, orchestration, integration with the data ecosystem, and upstream and downstream management.
  • Data Catalogs: help manage the lifecycle of data assets, providing features such as schema, dictionary, ownership, usage, health metrics, links to applications and data classifications.
  • ML Catalogs: becoming prominent in companies that are increasingly ML-driven. ML registration, versioning, release management, ownership, lineage and serving are a few prominent features.
  • API Catalogs: have been around for a long time and are critical for managing and operating microservices at scale in almost all tech-driven industries.
  • Data Loss Prevention (DLP): critical to identifying sensitive data and to monitoring, detecting and preventing data breaches.
  • Inventory Catalog: helps track the lifecycle of physical assets such as database servers, hardware and many other data-center-related details.
  • Identity & Access Metadata: provides a single source of truth for all roles, policies, target systems, databases, role owners and role subscribers.
  • Org Graphs: provide user metadata connecting employees, organizations, locations and various forms of org hierarchies.
  • Ownership Tools: map data and technical stacks to owners and manage the lifecycle of ownership.

2. Capabilities are delivered by collating features from one or more metadata products.

There are many infrastructure products in a technical ecosystem. Core capabilities are created by combining features from several of these products.

In the above diagram, I’ve generalized various products and capabilities. Now let’s look at a few example capabilities to understand how an array of products helps deliver each one.

  • Data Lineage: to create a holistic data lineage capability, metadata from various systems has to be collated. Such systems include the API Catalog, App Catalog, Data Catalog and ML Catalog. Credit to another blog post that explains data lineage in layman's terms.
  • Sensitive Data Handling: a combination of DLP (Data Loss Prevention), Data Catalog and App Catalog is key to effectively tracking, managing and securing sensitive data assets.
  • Data Risk & Exposure: key problems through the lenses of Information Security and Data Risk management. By collating details from facets such as Infra, Data, DLP, and Identity and Access, the Chief Technology and Security offices could track and measure security- and risk-related KPIs and KRIs across the tech landscape.
  • Recommendations: a capability that is useful for Efficiency. By connecting users, access policies and assets, we can generate connected components and surface key insights and recommendations for onboarding new users and reducing their learning curve.
  • Knowledge Graphs: by connecting various facets, we may be able to generate many graph-centric insights such as connected components and centrality. These could be very useful for identifying high-risk components in the tech stack, catering to the needs of Data Security & Risk.
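
As a rough illustration of how collated metadata can turn into a lineage capability and a graph to analyze, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the edge lists, the asset names and the helper functions are hypothetical, and real catalogs expose their metadata through their own APIs and formats. The degree count is a simple, unnormalized stand-in for centrality.

```python
from collections import defaultdict, deque

# Hypothetical edge exports (source -> target), one list per metadata product.
API_CATALOG_EDGES = [("orders-api", "orders_db.orders")]
APP_CATALOG_EDGES = [
    ("orders_db.orders", "etl-orders"),
    ("etl-orders", "warehouse.orders_daily"),
]
DATA_CATALOG_EDGES = [("warehouse.orders_daily", "warehouse.revenue_report")]
ML_CATALOG_EDGES = [
    ("warehouse.orders_daily", "churn-model-v2"),
    ("hr_db.employees", "attrition-model"),
]


def collate(*edge_lists):
    """Merge per-product edge lists into one directed lineage graph."""
    graph = defaultdict(set)
    for edges in edge_lists:
        for source, target in edges:
            graph[source].add(target)
            graph.setdefault(target, set())  # sinks must appear as nodes too
    return graph


def downstream(graph, start):
    """Return every node reachable from `start` (breadth-first search)."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def connected_components(graph):
    """Group nodes into components, ignoring edge direction."""
    undirected = {node: set() for node in graph}
    for source, targets in graph.items():
        for target in targets:
            undirected[source].add(target)
            undirected[target].add(source)
    components, seen = [], set()
    for node in undirected:
        if node in seen:
            continue
        component = downstream(undirected, node) | {node}
        seen |= component
        components.append(component)
    return components


def degree_counts(graph):
    """Count incoming plus outgoing edges per node."""
    degree = {node: len(targets) for node, targets in graph.items()}
    for targets in graph.values():
        for target in targets:
            degree[target] += 1
    return degree


lineage = collate(
    API_CATALOG_EDGES, APP_CATALOG_EDGES, DATA_CATALOG_EDGES, ML_CATALOG_EDGES
)
impacted = downstream(lineage, "orders_db.orders")
components = connected_components(lineage)
degrees = degree_counts(lineage)
most_connected = max(degrees, key=degrees.get)

print("Downstream of orders_db.orders:", sorted(impacted))
print("Number of components:", len(components))
print("Most connected asset:", most_connected)
```

In this toy graph, the most connected asset is the kind of node a Data Security & Risk team might want to review first, since a problem there touches the most neighbours.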

Along similar lines …

  • Data Rationalization is directly linkable to both Efficiency and Governance. By combining details from Lineage, Access, Stats and Similarities, we can distinguish the top, relevant datasets from the ones that can be deprecated. Such insights enable optimization, significantly reducing the cost of operation and ownership.
  • Data Risk, Security & Compliance needs are well served when we are able to link Data assets, Ownership, Access Stats and Data Class. By correlating these dimensions, an organization can assess risk and take appropriate actions to secure data and stay compliant.
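
To make the two points above a little more concrete, here is a minimal Python sketch of correlating those dimensions. The record fields, the thresholds, the class labels and the sample datasets are all hypothetical; they only stand in for whatever a real Data Catalog, access log and classification tool would provide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DatasetRecord:
    """One dataset with metadata collated from several hypothetical products."""
    name: str
    owner: Optional[str]   # from an ownership tool; None means unowned
    data_class: str        # from data classification, e.g. "pii"
    reads_last_90d: int    # from access stats
    downstream_count: int  # from lineage


def deprecation_candidates(records, max_reads=0):
    """Datasets nobody reads and nothing depends on (rationalization)."""
    return [
        r.name
        for r in records
        if r.reads_last_90d <= max_reads and r.downstream_count == 0
    ]


def unowned_sensitive(records, sensitive_classes=frozenset({"pii", "financial"})):
    """Sensitive datasets with no accountable owner (risk and compliance)."""
    return [
        r.name
        for r in records
        if r.data_class in sensitive_classes and r.owner is None
    ]


# Sample records, invented for illustration only.
records = [
    DatasetRecord("warehouse.orders_daily", "sales-data-team", "internal", 1200, 2),
    DatasetRecord("staging.orders_tmp_2019", "sales-data-team", "internal", 0, 0),
    DatasetRecord("hr_db.employees", None, "pii", 40, 1),
    DatasetRecord("archive.payments_old", None, "financial", 0, 0),
]

to_deprecate = deprecation_candidates(records)
to_secure = unowned_sensitive(records)

print("Deprecation candidates:", to_deprecate)
print("Sensitive and unowned:", to_secure)
```

Neither check is possible from a single product: each one needs fields that, in this sketch, come from different sources.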

3. At the top level, an array of capabilities helps drive business functions.

Big Picture

  • Metadata-centric products are necessary to operate with high efficiency in individual facets of an enterprise.
  • However, a combination of products is often necessary to create higher-level capabilities that serve top-level business functions.

To quote a couple of examples —

  • Capabilities to handle Data Subject Rights, Data Classification, Lineage, Ownership and Accountability help meet many of the Privacy, Governance, Security and Risk requirements.
  • Operational Visibility, Landscape Visibility and Data Rationalization help achieve Efficiency.

Take away

  • A variety of metadata-centric products in the technical landscape produce and consume metadata.
  • Collating metadata from various products is often necessary to deliver capabilities for top-level business functions.
  • While the individual products serve their own domains operationally, this siloed nature has produced many inefficiencies, gaps and opportunities at the enterprise level.

Up next…

Background Image by Yeshi Kangrang on Unsplash
