Winter Semester 2024/2025


Mining Massive Datasets (IMMD)

Lecture (Prof. Dr. Artur Andrzejak)

Mondays 14:00-16:00, INF 230 COS (Centre for Organismal Studies), large lecture hall (großer Hörsaal).

Tutorials

Tutorial 1: Wednesdays 16:00-18:00, INF 205 Mathematikon, SR B.

Tutorial 2: Thursdays 16:00-18:00, INF 205 Mathematikon, SR B.

Registration for exercises: Müsli (mandatory)

Further information: Moodle, Müsli, module description

Please register via Müsli before the first lecture. The tutorial day can be changed later.

Course Content Overview
  • Programming paradigms for massively parallel data processing, especially Apache Spark and Google's JAX.
  • Algorithms and application cases for scalable data analysis:
    • Recommendation systems
    • Search for similar objects (locality sensitive hashing)
    • Large graph mining and link analysis
    • Mining of data streams
    • Finding frequent itemsets
    • Advertising on the Web
    • Graph neural networks.
  • Deep Learning Transformer models, Large Language Models (LLMs), and their interpretability.

See the accompanying books for more details on most of these topics.


Softwarepraktikum "AI Methods and Tools for Programming"

Softwarepraktikum for beginners / for advanced students (Softwarepraktikum für Anfänger / für Fortgeschrittene)

Mondays 16:00-18:00, INF 205 Mathematikon, SR 1.

Topics and details of the application process will be discussed at the first meeting on 14 October 2024 at 16:00 CEST. To participate, register in Müsli before the first meeting and attend it in person or online. A Zoom link will be emailed to all registered participants on 14 October.

Further information: Moodle, Müsli, heiCO (beginners, TBU), heiCO (advanced, TBU), module handbook (Modulhandbuch): IAP, IFP