Data science projects start with exploring a business problem and result in new insights and analytical models. Attempting to deliver a data science project without a clear understanding of the key components of the data science lifecycle, and of how they depend on one another, will most likely lead to failure.

This practical course gives you a full overview of the data science lifecycle, from initial project planning and requirements gathering, through sourcing and preparing data, to creating machine learning models and publishing results. You will learn how to apply machine learning to business problems, how to choose the right machine learning techniques, how to build, train, test and validate your model, and how to deploy it.

A Practical Hands-On Course

The emphasis of this course is on practical applications and implementations. Concepts are taught through a combination of lectures, workshops and hands-on exercises. You will work with data science tools such as Python/R, TensorFlow, Scikit-learn, MLlib, Knime and RapidMiner.

Why attend

You will learn:

  • The definitions, concepts and terminology used in data science
  • How to identify, qualify and prioritize data science projects that deliver business value
  • How to develop a data science project team
  • How to use agile development techniques for data science projects
  • The foundations of machine learning
  • How to prepare data for machine learning
  • How to apply mathematical models to business problems
  • How to visualize the results of machine learning

Who should attend

  • Business analysts, data analysts and functional analysts who need to frame analytical problems and opportunities to present solutions
  • Business and technical managers who need to understand the specialized nature of data science work
  • Data professionals, including Business Intelligence and analytics professionals, who work with data scientists and need to support the growing demand for data science
  • Project sponsors and managers who need a deeper understanding of data science and machine learning
  • Data scientists who have just started working on a data science project and want to get acquainted with the overall project structure
  • Anyone who aspires to become a data scientist

Please note that this class is NOT for data science developers or IT developers looking to write analytics programs in Java, Python, R or Scala. 


You should have some coding experience and basic knowledge of statistics.

Data Science: Myths vs Realities

  • A short history of Data Science
  • Types of Analytics
    • Descriptive
    • Diagnostic
    • Predictive
    • Prescriptive
  • The relation between Analytics, Data Science, Artificial Intelligence, Natural Language Processing, Machine Learning and Deep Learning
  • Supervised, Semi-Supervised, Unsupervised and Reinforcement Learning
  • Characteristics of Data Science
    • Statistics
    • Data Scientist’s skills
    • Exploratory Analysis
  • The Data Scientist’s Toolbox
    • Platform & Tools
    • Python & R
    • Hadoop & Spark
    • Pros and cons of Open Source
    • Cloud Computing
    • Hand coding vs tools
    • Self-service Data Science
  • Workshop: Data Science vs Business Intelligence
  • Modern Data Architecture for Data Science

Data Science Lifecycle

  • Why Data Science projects succeed or fail
  • Using a methodology (CRISP-DM, SEMMA, KDD-process)
  • Data Science Process
    • Business Understanding
    • Data Understanding
    • Data Preparation
    • Machine Learning Modelling & Evaluation
    • Deployment
  • How to start a Data Science Project?
    • Project Planning
    • Readiness Checklist
    • The Human Factor
    • Technical Architecture
    • Technology Capabilities
    • Budget
    • Governance
  • Workshop: From Proof of Concept to Implementation

Data Engineering

  • Data Pipelines
  • Working with Modern Data Sources
  • The Process for Data Preparation and Feature Engineering
    • Data Understanding
      • Data Profiling
      • Data Usability
      • Quality Analysis
      • Exploratory Data Analysis (EDA)
    • Data Preparation
      • Data Integration
      • Data Cleaning
      • Data Selection
      • Data Transformation
  • Feature Selection
  • Choosing the Correct Analytic Model
  • Hand coding vs ETL tools vs Self-service Data Preparation tools
  • Workshop: Data Engineering with Python and Knime
    • Using the Jupyter notebook
    • Python: Load, clean, transform data: numpy and pandas libraries
    • Create a data preparation pipeline in Knime
    • Data exploration and visualisations: matplotlib and seaborn libraries vs PowerBI or Tableau
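
The load, clean and transform steps named in the workshop outline above can be illustrated with a minimal sketch. It is not course material: the sample data, column names and function names are invented for illustration, and it uses only the Python standard library so that it runs without extra installs. In the workshop itself these steps would typically be done with pandas (for example `read_csv`, `drop_duplicates` and `fillna`).

```python
import csv
import io
import statistics

# Hypothetical sample data, invented purely for illustration.
RAW_CSV = """customer_id,age,monthly_spend
1,34,120.50
2,,80.00
3,45,
4,29,60.25
4,29,60.25
"""

def load_rows(text):
    """Load: parse CSV text into a list of dicts (all values are strings)."""
    return list(csv.DictReader(io.StringIO(text)))

def clean_rows(rows, numeric_columns):
    """Clean: drop exact duplicate rows, then fill missing numbers with the column median."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row.items())
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))
    for col in numeric_columns:
        present = [float(r[col]) for r in unique if r[col] != ""]
        median = statistics.median(present)
        for r in unique:
            r[col] = float(r[col]) if r[col] != "" else median
    return unique

def min_max_scale(rows, column):
    """Transform: add a '<column>_scaled' field rescaled to the range [0, 1]."""
    values = [r[column] for r in rows]
    low, high = min(values), max(values)
    span = high - low
    for r in rows:
        r[column + "_scaled"] = (r[column] - low) / span if span else 0.0
    return rows

rows = load_rows(RAW_CSV)
rows = clean_rows(rows, ["age", "monthly_spend"])
rows = min_max_scale(rows, "age")

for r in rows:
    print(r)
```

Median imputation and min-max scaling are only two of many possible choices; which cleaning and transformation steps are appropriate depends on the data and on the model that will consume it.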

Machine Learning

  • The theory and maths behind machine learning models
  • Machine Learning Model Development Process
  • Machine learning models by example
    • Classification
      • Naive Bayes
      • K-Nearest Neighbours
      • Support Vector Machine
      • Decision Trees
      • Neural Networks
    • Prediction
      • Regression
      • Linear Discriminant Analysis
      • Principal Component Analysis
    • Clustering
      • K-means clustering
    • Model validation
      • Fitting a Model
      • Bias/Variance trade-off
      • Cross-validation
      • Accuracy, precision, recall, F1-score
  • Application of Models
    • What modelling techniques align with the business problem
  • Workshop 4: Modelling and model evaluation with Python or Knime
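
The model validation topics listed above (cross-validation, accuracy, precision, recall and F1-score) can be sketched in a few lines of plain Python. This is an illustrative sketch rather than the course's own code: the function names and the example labels are hypothetical, and in practice a library such as Scikit-learn provides equivalent, more complete utilities.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for a binary classifier."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # true positive
        elif pred == positive:
            fp += 1  # false positive
        elif truth == positive:
            fn += 1  # false negative
        else:
            tn += 1  # true negative
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous folds, without shuffling."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop

# Hypothetical labels and predictions, invented for illustration.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

metrics = classification_metrics(y_true, y_pred)
folds = list(k_fold_indices(5, 2))

print(metrics)
print(folds)
```

In cross-validation, a model is trained on each `train` split and scored on the matching `test` split, and the scores are then averaged. The folds here are contiguous for simplicity; real workflows usually shuffle or stratify the data first.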

Deploying Machine Learning Models

  • Using models in a production environment
  • Agile, DevOps, CI/CD and Data Science
  • Governance, Compliance & Performance
  • Deployment, Monitoring and Maintenance Plan


David de Roos

David de Roos is a Senior Solution Architect in the area of Business Intelligence, (Big) Data Engineering and Data Science. He works on all aspects of Business Intelligence and Big Data Analytics projects, including data mining and predictive modelling. He has experienced and helped implement the shift from simple analytical statistics to complex data science models. In addition to designing data models and ETL processes and visualizing data, he has implemented churn prediction, fraud detection and credit risk prediction models in Finance and Energy.

David obtained his Master's degree in Sociology and Statistics at the Erasmus University Rotterdam. He co-authored articles about Political Preference Prediction and has experience in teaching Statistics. In addition to his regular work, he helps students who need statistical and methodological support as they work toward graduation.


10 Dec


The fee for this two-day course is EUR 1.450 per person. This includes two days of instruction, lunch, morning and afternoon snacks, and course materials.

We offer the following discounts:

  • 10% discount for groups of 2 or more students from the same company registering at the same time.
  • 20% discount for groups of 5 or more students from the same company registering at the same time.

Note: Groups that register at a discounted rate must retain the minimum group size or the discount will be revoked. Discounts cannot be combined.

Copyright ©2019 Quest for Knowledge