Lecture "Data Warehousing and Data Mining Techniques"

Information
Classification: Master Informatik, Master Wirtschaftsinformatik
Credits: 4 or 5 (depending on course of study and exam regulations)
Exam: Oral
Regular Dates: Every Tuesday, starting 16th October, 09:45-12:15, Room IZ 161
Contents

In this course, we examine how data warehouses are built, maintained, and operated, and give an overview of the main knowledge discovery techniques. The course covers fundamental issues such as data storage, the execution of analytical queries, and data mining procedures.

This course will be in English.

The general structure of the course is as follows:

  • Typical DW use case scenarios
  • Basic architecture of DW
  • Data modelling on conceptual, logical and physical levels
  • Multidimensional E/R modelling
  • Cubes, dimensions, measures
  • Query processing, OLAP queries (OLAP vs. OLTP), roll-up, drill-down, slice, dice, pivot
  • MOLAP, ROLAP, HOLAP
  • SQL99 OLAP operators, MDX
  • Snowflake, star and starflake schemas for relational storage
  • Multidimensional physical storage (linearization)
  • DW indexing as a means of search optimization: R-trees, UB-trees, bitmap indexes
  • Other optimization procedures: data partitioning, star join optimization, materialized views
  • ETL
  • Association rule mining, sequence patterns, time series
  • Classification: decision trees, naive Bayes classification, SVM
  • Cluster analysis: K-means, hierarchical clustering, agglomerative clustering, outlier analysis
  • Deep Learning
Materials
No.  Date        Topic                                       Slides                                Exercises  Videos
1    16.10.2018  Introduction                                Slides                                           Video
2    23.10.2018  Architecture                                Slides                                           Video
3    30.10.2018  Modeling                                    Slides                                           Video
4    06.11.2018  Indexing                                    Slides, B-Trees (from RDB 2 Lecture)             Video
5    13.11.2018  Optimization                                Slides                                           Video
6    20.11.2018  OLAP Operations & Queries                   Slides                                           Video
7    27.11.2018  Build the DW, ETL                           Slides                                           Video, Google Refine
8    04.12.2018  Real-Time DW                                Slides                                           Video
9    11.12.2018  DM Overview & Association Rule Mining       Slides                                           Video
10   18.12.2018  Sequence Patterns & Time Series Durability  Slides                                           Video
11   08.01.2019  Classification                              Slides                                           Video
12   15.01.2019  Clustering                                  Slides                                           Video
13   22.01.2019  Meta-Algorithms for Classification          Slides                                           Video
14   29.01.2019  Deep Learning                               Slides                                           Video