Winter Semester 2024/2025
Mining Massive Datasets (IMMD)
Lecture (Prof. Dr. Artur Andrzejak)
Mondays 14:00-16:00, INF 230 COS (Centre for Organismal Studies), large lecture hall (großer Hörsaal).
Tutorials
Tutorial 1: Wednesdays 16:00-18:00, INF 205 Mathematikon, SR B.
Tutorial 2: Thursdays 16:00-18:00, INF 205 Mathematikon, SR B.
Registration for exercises: Müsli (mandatory)
Further information: Moodle, Müsli, module description
Please register via Müsli before the first lecture. The tutorial day can be changed later.
Course Content Overview
- Programming paradigms for massively parallel data processing, especially Apache Spark and Google's JAX.
- Algorithms and application cases for scalable data analysis:
  - Recommendation systems
  - Search for similar objects (locality sensitive hashing)
  - Large graph mining and link analysis
  - Mining of data streams
  - Finding frequent itemsets
  - Advertising on the Web
- Graph neural networks.
- Deep Learning Transformer models, Large Language Models (LLMs), and their interpretability.
Check the accompanying books for more details on most of these topics:
- Jure Leskovec, Anand Rajaraman, Jeff Ullman, Mining of Massive Datasets, 3rd ed., 2020 (PDF version is free)
- Dzejla Medjedovic, Emin Tahirovic, Algorithms and Data Structures for Massive Datasets, 2022, free online via HEIDI
- Sebastian Raschka, Build a Large Language Model (From Scratch), September 2024, free video via HEIDI
Softwarepraktikum "AI Methods and Tools for Programming"
Softwarepraktikum for beginners / for advanced students (Softwarepraktikum für Anfänger / für Fortgeschrittene)
Mondays 16:00-18:00, INF 205 Mathematikon, SR 1.
Topics and details of the application process will be discussed at the first meeting on 14 October 2024 at 16:00 CEST. If you want to participate, please register in Müsli before the first meeting and attend it in person or online. A Zoom link will be emailed to all registered participants on 14 October.
Further information: Moodle, Müsli, heiCO (beginners, TBU), heiCO (advanced, TBU), module handbook (Modulhandbuch): IAP, IFP