Description

Introduction

The extract, transform, and load (ETL) phase of the data warehouse development lifecycle is the most difficult, time-consuming, and labor-intensive phase of building a data warehouse. A solid ETL system is reliable, accurate, and highly performant.

Joy Mundy, co-author with Ralph Kimball of The Data Warehouse Lifecycle Toolkit and The Kimball Group Reader, shows you how a properly designed ETL system extracts the data from the source systems, enforces data quality and consistency standards, conforms the data so that separate sources can be used together, and finally delivers the data in a presentation-ready format.

Joy will lead you through a rapid-fire one-day course on how to:

  • Choose the appropriate architecture for your ETL system
  • Plan and design your ETL system
  • Build the suite of ETL processes
  • Build a comprehensive data cleaning subsystem
  • Tune the overall ETL process for optimum performance
  • Determine the role of Big Data in your DW architecture
  • And much more 

Why attend

This vendor-neutral course helps you understand all the factors necessary for effectively designing the back room ETL system of your Kimball DW/BI environment. The course focuses on the overall architecture and design of the ETL system. We drill into the critical processes within the ETL system that are often overlooked. By the end of this course, you will understand how your data warehouse ETL system can be built to respond to ever-changing business requirements.

This is not a code-oriented implementation course; it is a vendor-neutral architecture course for the designer who must keep a broad perspective.

Who should attend

This course is designed for those responsible for building the back room ETL system of a data warehouse environment, including data warehouse team leads, ETL architects, ETL designers and developers, and data warehouse operational staff.

Prerequisites

This course assumes familiarity with the Kimball Approach to dimensional data warehousing. Students must have:

  • Attended the Lifecycle and Dimensional Modeling courses, or
  • Read The Data Warehouse Toolkit book, or
  • A lot of work experience!

Code: ETL2024
Price: EUR 725

Outline

Understanding the Requirements of ETL

  • Business needs
  • Defining the project
  • ETL development process flow
  • Technical requirements
  • The 34 subsystems of ETL
  • Available IT skills and licenses
  • Hand coding vs. ETL tool

Extract the Data

  • Design review exercise
  • Profile data
  • Extract data
  • Capture changed data

Clean the Data

  • Data cleansing system
  • Audit dimension
  • Compliance tracking
  • Track error events
  • Deduplicate dimension data

Deliver Dimension Tables

  • Manage slowly changing dimensions
  • Populate mini-dimensions
  • Create and manage bridge tables
  • Manage hierarchies
  • Maintain special dimensions
  • Distribute dimension data

Deliver Fact Tables

  • Maintain transactional fact tables
  • Maintain periodic snapshots
  • Maintain accumulating snapshots
  • Calculate derived or consolidated fact tables
  • Look up fact table surrogate keys
  • Handle late arriving data
  • Extend the dimensional model
  • Manage performance aggregations
  • Feed data to downstream systems

Manage the ETL Process

  • Schedule jobs and handle exceptions
  • Backup, recovery, and restartability
  • Monitor ETL workflow

Instructor

Joy Mundy

This course gives you the opportunity to learn directly from Joy Mundy. She co-authored, with Ralph Kimball and other members of Kimball Group, many of the popular “Toolkit” books including The Data Warehouse Lifecycle Toolkit (Second Edition), The Microsoft Data Warehouse Toolkit, and The Kimball Group Reader (Second Edition). Joy teaches the full course portfolio previously taught by Kimball University, for one simple reason: the methodology proves its value over and over in practice.

Joy Mundy has worked with business managers and IT professionals to prioritize, justify and implement large-scale business intelligence and data warehousing systems since 1992. She leverages these consulting experiences when teaching DW/BI courses. 

Joy began her career as a financial analyst, but soon decided that she enjoyed working with a wide variety of data. She learned the fundamentals of data warehousing by building a system at Stanford University, and then started a data warehouse consultancy in 1994. She worked at WebTV and on Microsoft’s SQL Server product development team for a few years before returning to consulting with Kimball Group in 2004, where she remained until Kimball Group’s dissolution in 2016. Joy is now semi-retired, but loves teaching and the occasional consulting engagement. She graduated from Tufts University with a BS in Economics, and from Stanford University with an MS in Engineering-Economic Systems.

Dates

This course is available only as Customer Specific Training: we deliver private courses at a location (or virtually) and at a time that suit you, covering the content that addresses your specific learning needs. Contact us by e-mail at info@q4k.com.

Copyright ©2023-2024 quest for knowledge