Data Warehouse ETL: The Kimball Approach

Joy Mundy, co-author of Ralph Kimball’s best-selling books and former partner of the Kimball Group, will show you how a properly designed ETL system extracts the data from the source systems, enforces data quality and consistency standards, conforms the data so that separate sources can be used together, and finally delivers the data in a presentation-ready format.

description.

The extract, transform, and load (ETL) phase of the data warehouse development lifecycle is the most difficult, time-consuming, and labor-intensive phase of building a data warehouse. A solid ETL system is reliable, accurate, and high-performing.

Joy Mundy, co-author with Ralph Kimball of The Data Warehouse Lifecycle Toolkit and The Kimball Group Reader, shows you how a properly designed ETL system extracts the data from the source systems, enforces data quality and consistency standards, conforms the data so that separate sources can be used together, and finally delivers the data in a presentation-ready format.

Joy will lead you through a rapid-fire one-day course on how to:

  • Choose the appropriate architecture for your ETL system
  • Plan and design your ETL system
  • Build the suite of ETL processes
  • Build a comprehensive data cleaning subsystem
  • Tune the overall ETL process for optimum performance
  • Determine the role of Big Data in your DW architecture
  • And much more

Why attend

This vendor-neutral course helps you understand all the factors necessary for effectively designing the back room ETL system of your Kimball DW/BI environment. The course focuses on the overall architecture and design of the ETL system. We drill into the critical processes within the ETL system that are often overlooked. By the end of this course, you will understand how your data warehouse ETL system can be built to respond to ever-changing business requirements.

This is not a code-oriented implementation course; it is a vendor-neutral architecture course for the designer who must keep a broad perspective.

Who should attend

This course is designed for those responsible for building the back room ETL system of a data warehouse environment, including data warehouse team leads, ETL architects, ETL designers and developers, and data warehouse operational staff.

Prerequisites

This course assumes familiarity with the Kimball Approach to dimensional data warehousing. Students must have:

  • Attended the Lifecycle and Dimensional Modeling courses, or
  • Read The Data Warehouse Toolkit book, or
  • A lot of work experience!

outline.

Understanding the Requirements of ETL

  • Business needs
  • Defining the project
  • ETL development process flow
  • Technical requirements
  • The 34 subsystems of ETL
  • Available IT skills and licenses
  • Coding vs. tool

Extract the Data

  • Design review exercise
  • Profile data
  • Extract data
  • Capture changed data

Clean the Data

  • Data cleansing system
  • Audit dimension
  • Compliance tracking
  • Track error events
  • Deduplicate dimension data

Deliver Dimension Tables

  • Manage slowly changing dimensions
  • Populate mini-dimensions
  • Create and manage bridge tables
  • Manage hierarchies
  • Maintain special dimensions
  • Distribute dimension data

Deliver Fact Tables

  • Maintain transactional fact tables
  • Maintain periodic snapshots
  • Maintain accumulating snapshots
  • Calculate derived or consolidated fact tables
  • Look up fact table surrogate keys
  • Handle late arriving data
  • Extend the dimensional model
  • Manage performance aggregations
  • Feed data to downstream systems

Manage the ETL Process

  • Schedule jobs and handle exceptions
  • Backup, recovery, and restartability
  • Monitor ETL workflow

instructor.

Joy Mundy

This course gives you the opportunity to learn directly from Joy Mundy. She co-authored, with Ralph Kimball and other members of Kimball Group, many of the popular “Toolkit” books, including The Data Warehouse Lifecycle Toolkit (Second Edition), The Microsoft Data Warehouse Toolkit, and The Kimball Group Reader (Second Edition). Joy teaches the full course portfolio previously taught by Kimball University, for one simple reason: the methodology proves its value over and over in practice.

Joy Mundy has worked with business managers and IT professionals to prioritize, justify and implement large-scale business intelligence and data warehousing systems since 1992. She leverages these consulting experiences when teaching DW/BI courses.

Joy began her career as a financial analyst, but soon decided that she enjoyed working with a wide variety of data. She learned the fundamentals of data warehousing by building a system at Stanford University, and then started a data warehouse consultancy in 1994. She worked at WebTV and on Microsoft’s SQL Server product development team for a few years before returning to consulting with Kimball Group in 2004, where she remained until Kimball Group’s dissolution in 2016. Joy is now semi-retired, but loves teaching and the occasional consulting engagement. She graduated from Tufts University with a BS in Economics, and from Stanford University with an MS in Engineering-Economic Systems.

dates & price.

This course is offered exclusively as Customer Specific Training: we deliver private courses, on-site or virtually, at a time that works best for you, with content tailored to your team’s specific learning needs.

Need more information?

Simply leave your details in our contact form, and a member of our team will be in touch shortly to discuss your requirements.
