Data Catalogs
description.
This one-day class looks in detail at what a data catalog is and why it is important to implement one. We will look at:
- The data complexity challenges that companies are dealing with
- How data catalogs can be used to deal with this problem
- Why data catalogs are critical to making a unified approach to enterprise data governance possible
- How data catalogs can accelerate data engineering to produce ‘business-ready’ reusable data products and publish them in a data marketplace for analytical use
- Why data catalogs are central to creating an enterprise knowledge graph and semantic layer for AI Agents
Why attend
Attendees will learn:
- How data catalogs work and what their capabilities are
- How to use data catalogs to discover, classify and catalog data in multiple data stores on premises, across multiple clouds, and at the edge
- How to use data catalogs in an enterprise data governance program to systematically classify data and set policies to govern classified data and content across a distributed data estate from a single place. This includes automatic discovery and classification of sensitive data, governance of data access security, data privacy, data loss prevention, data sharing, data usage, data retention, and data quality
- How to use data catalogs to discover data that can be engineered in data integration pipelines to produce data products that can be published in a data marketplace
- How data catalogs can be used to create an enterprise knowledge graph / semantic layer with one or more domain-specific ontologies to provide context for AI agents
Who should attend
This course is intended for business and IT professionals responsible for data engineering, data product provisioning, enterprise data governance (including data access security, data privacy, data sharing, data usage, data retention, and data quality) of structured and unstructured data, and also for those responsible for AI who need to implement an enterprise ontology and semantic layer for AI Agents. This includes Chief Data Officers, Heads of AI, Citizen and professional IT Data Engineers, Data Architects, Data Scientists, Heads of Data Governance, Data Stewards, Solution Architects, and Enterprise Architects.
Prerequisites
This course assumes a basic understanding of data governance, data management, metadata, data warehousing, data cleansing, data integration, enterprise ontologies, and how context is provided to AI agents.
outline.
Introduction to Data Catalogs
This module examines the typical setups and challenges companies face today to explain why data catalogs are needed.
- The increasingly complex distributed data landscape
- The growth in new data sources
- Disparate operational transaction systems
- The emergence of data mesh and its impact on data engineering and data architecture
- The impact of ungoverned data
- Major requirements facing companies with respect to data
- What is a data catalog and why have one?
- What is a data catalog?
- Why have a data catalog? - data catalog use cases
- The data catalog software marketplace
- Core data catalog capabilities
The Importance of a Business Glossary
This session looks at the need to understand your data landscape from a business perspective. The key is to establish a common business vocabulary in the business glossary of a data catalog, creating common data names and definitions for your data. This enables you to understand the meaning of data, to search for and govern data across your data estate from a business perspective, and to use this business metadata to help create an enterprise semantic layer for AI agents.
- Data standardisation using a common business vocabulary
- The purpose of a common vocabulary in data governance and in semantic layers for AI
- Business glossary software – a data catalog capability (e.g. Alation, Amazon Glue, Collibra, Informatica IDMC, IBM Watson Knowledge Catalog, Microsoft Azure Purview, Qlik (Talend), SAS Business Data Network, TopQuadrant TopBraid EDG)
- Planning for a business glossary
- Glossary roles and responsibilities
- Glossary term submission, voting, approval, and dispute resolution processes
- Approaches to creating a common vocabulary
- Organising data definitions in a business glossary
- Business glossaries, taxonomies, hierarchies
- The role of a data concept model in establishing semantic meaning
- Utilising a common vocabulary in data modelling, BI, AI, ETL, ESB, APIs, & MDM
Automated Data Discovery, Cataloguing and Business Glossary Mapping
Having defined your data, this session looks at discovering what data you have, where it is, and how it maps to your business glossary to provide a business understanding of the meaning of data in your data estate.
- The critical role of data catalog software in understanding your data estate
- The automated data discovery process
- Registering data sources for discovery
- Automated data discovery and data quality profiling using a data catalog
- Mapping data assets to a business glossary
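To make the mapping step concrete, here is a minimal sketch of how discovered physical columns can be matched to business glossary terms by normalised-name comparison. All column and term names here are hypothetical; real data catalogs also use data profiling, lineage, and ML-based semantic matching rather than name matching alone.

```python
# Minimal sketch: map discovered physical columns to business glossary terms
# by normalised-name matching. Names are illustrative, not from any product.

def normalise(name: str) -> str:
    """Lower-case a name and strip common separators for comparison."""
    return name.lower().replace("_", "").replace(" ", "").replace("-", "")

# Business glossary: common business term -> definition
glossary = {
    "Customer Name": "The full legal name of a customer",
    "Order Date": "The date on which an order was placed",
}

# Columns found by scanning registered data sources
discovered_columns = ["customer_name", "CUST_ADDR", "order-date"]

# Lookup of normalised glossary terms back to their display names
lookup = {normalise(term): term for term in glossary}

mappings = {}
unmapped = []
for col in discovered_columns:
    term = lookup.get(normalise(col))
    if term:
        mappings[col] = term      # physical column -> business term
    else:
        unmapped.append(col)      # left for manual stewardship

print(mappings)   # {'customer_name': 'Customer Name', 'order-date': 'Order Date'}
print(unmapped)   # ['CUST_ADDR']
```

Columns that cannot be matched automatically (here `CUST_ADDR`) are exactly the ones a data steward would map by hand in the catalog.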
Classifying Data and Content to Know How to Govern It
This session looks at manually and automatically labelling data using a data catalog to know how to govern it using predefined classifiers, user-defined classification schemes, and trainable classifiers. It then looks at how classified data shows up in a data catalog and how policies can be assigned to labelled data to govern it across your data estate.
- What is data classification?
- Automated sensitive data type detection and classification using pre-defined trained classifiers
- Creating your own data classification schemes for data confidentiality and retention
- Manually classifying content using your own classification scheme
- Training classifiers to automatically label unstructured content
- Using trained classifiers to auto-label unstructured content in the cloud and on-premises
- Using your own classification schemes to find data within a data catalog
- Automatically classifying sensitive structured data using a data catalog
- Using classification insights to understand sensitive data proliferation and data redundancy across your data estate
- Setting policies in a data catalog to govern data across your data estate
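The simplest form of a pre-defined classifier is a pattern rule applied to sampled column values. The sketch below illustrates the idea with two illustrative patterns and a match threshold; vendor products ship far richer rule sets plus trainable ML classifiers, and none of the labels or thresholds here come from any specific product.

```python
import re

# Minimal sketch of rule-based sensitive-data classification, the simplest
# form of the pre-defined classifiers a data catalog applies automatically.
# Patterns, labels, and the threshold are illustrative assumptions.
CLASSIFIERS = {
    "Email Address": re.compile(r"^[\w.+-]+@[\w-]+\.[\w.]+$"),
    "US SSN": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
}

def classify_column(sample_values, threshold=0.8):
    """Return the sensitivity label whose pattern matches enough samples."""
    for label, pattern in CLASSIFIERS.items():
        hits = sum(1 for v in sample_values if pattern.match(v))
        if sample_values and hits / len(sample_values) >= threshold:
            return label
    return None  # unclassified; a data steward can label it manually

print(classify_column(["ann@example.com", "bob@example.org"]))  # Email Address
print(classify_column(["123-45-6789", "987-65-4321"]))          # US SSN
print(classify_column(["red", "blue"]))                         # None
```

Once a column is labelled this way, the label is what governance policies attach to, which is why classification must come before policy enforcement.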
Data Governance and Stewardship
This session looks at the requirements for governing data in a modern enterprise and how they can be met using a data catalog.
- Key requirements for governing data and content across a distributed data landscape
- What do you need to know to govern data?
- Introducing a data governance framework to help meet the challenge
- People – key roles, responsibilities and data governance operating model
- Core processes needed to establish and govern commonly understood data
- The role of the data catalog in governing data for use in analytics and AI
- Core data governance capabilities needed
- Tasks involved in governing a distributed data estate
Accelerating Data Engineering Using a Data Catalog
This session looks at the requirements for accelerating data engineering in a modern enterprise and how they can be met using a data catalog and data fabric software.
- The role of the data catalog and data fabric in data engineering, data provisioning, and data sharing
- Defining data products in a business glossary
- Automatically discovering, mapping and classifying data in a data catalog
- Data catalog integration with Data Fabric
- Building data engineering pipelines to produce data products
- Creating a data marketplace as a data catalog application to share business-ready data products
- Publishing and consuming data products using a data marketplace, data contracts, and data catalog metadata
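A data contract is the agreement published alongside a data product in the marketplace. The sketch below shows one possible minimal shape as a Python dataclass; every field name here is an illustrative assumption, and real contract formats (e.g. the Open Data Contract Standard) cover much more, such as quality rules, SLAs, and usage terms.

```python
from dataclasses import dataclass

# Minimal sketch of a data contract for a data product published in a
# data marketplace. Field names and defaults are illustrative assumptions.

@dataclass
class DataContract:
    product_name: str
    owner: str
    version: str
    schema: dict                      # column name -> logical type
    classification: str = "internal"  # governance label from the catalog
    refresh_sla: str = "daily"        # how often the product is refreshed

contract = DataContract(
    product_name="orders_curated",
    owner="sales-data-team",
    version="1.0.0",
    schema={"order_id": "string", "order_date": "date", "amount": "decimal"},
)

print(contract.product_name, contract.version, contract.refresh_sla)
```

Consumers discover the product via the catalog's metadata and rely on the contract, not on inspecting the underlying pipeline, which is what makes the product reusable.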
Building an Enterprise Semantic Layer for AI Agents Using a Data Catalog
This session looks at how a data catalog can be used to create an enterprise knowledge graph and common business meaning in a semantic layer to provide context for AI Agents.
- The role of the data catalog as a knowledge graph for AI
- Using a data catalog to capture business and technical metadata about your data and data relationships, and store it in a graph
- Capturing and inferring lineage to understand dependencies
- Provisioning business context metadata to AI agents, via an MCP server and GraphRAG queries, so that they understand meaning
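The idea of GraphRAG-style retrieval over catalog metadata can be sketched with a tiny in-memory triple store: starting from a dataset node, collect the reachable facts (columns, glossary mappings, definitions, lineage) and hand them to an agent as context. All node names and predicates below are hypothetical; a real catalog would expose its knowledge graph through a graph store and, for agents, an MCP server endpoint.

```python
# Minimal sketch of a catalog metadata graph used to provide business
# context to an AI agent. Triples and predicate names are illustrative.

# (subject, predicate, object) triples: datasets, glossary terms, lineage
triples = [
    ("orders_dataset", "has_column", "order_date"),
    ("order_date", "maps_to_term", "Order Date"),
    ("Order Date", "defined_as", "The date on which an order was placed"),
    ("orders_dataset", "derived_from", "orders_raw"),
]

def context_for(node: str, depth: int = 3) -> list:
    """GraphRAG-style retrieval: collect triples reachable from a node."""
    frontier, seen, result = {node}, set(), []
    for _ in range(depth):
        next_frontier = set()
        for s, p, o in triples:
            if s in frontier and (s, p, o) not in seen:
                seen.add((s, p, o))
                result.append((s, p, o))
                next_frontier.add(o)   # follow the edge outward
        frontier = next_frontier
    return result

for t in context_for("orders_dataset"):
    print(t)
```

Note that the business definition of "Order Date" is only reached by walking the graph through the glossary mapping, which is why the catalog's graph structure, not just its flat metadata, is what gives an agent usable semantic context.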
instructor.
Mike Ferguson
Mike Ferguson is the Managing Director of Intelligent Business Strategies Limited. As an independent IT industry analyst and consultant, he specializes in BI/Analytics and data management. With over 40 years of IT experience, Mike has consulted for dozens of companies on BI/analytics, data strategy, technology selection, data architecture and data management.
Mike is also conference chairman of Big Data LDN, the fastest-growing data and analytics conference in Europe, and a member of the EDM Council CDMC Executive Advisory Board. He has spoken at events all over the world and written numerous articles.
Formerly he was a principal and co-founder of Codd and Date Europe Limited, the company of the inventors of the relational model, and a Chief Architect at Teradata working on the Teradata DBMS.
He teaches popular master classes in Data Warehouse Modernization, Big Data Architecture & Technology, How to Govern Data Across a Distributed Data Landscape, Practical Guidelines for Implementing a Data Mesh (Data Catalog, Data Fabric, Data Products, Data Marketplace), Real-Time Analytics, Embedded Analytics, Intelligent Apps & AI Automation, Migrating your Data Warehouse to the Cloud, Modern Data Architecture and Data Virtualisation & the Logical Data Warehouse.
dates & price.
pricing.
The fee for this course is EUR 725,00 (+VAT) per person.
We offer the following discounts:
- 10% discount for groups of 2 or more students from the same company registering at the same time
- 20% discount for groups of 4 or more students from the same company registering at the same time
Note: Groups that register at a discounted rate must retain the minimum group size, or the discount will be revoked. Discounts cannot be combined.