TU Wien:Recommender Systems VU (Sacharidis)

From VoWi

Data

This course is no longer offered by this lecturer, has been discontinued, or is being phased out, and is therefore kept in VoWi for historical purposes only.
Lecturer: Dimitrios Sacharidis
ECTS: 3
Last held: 2020S
Language: English
Mattermost: recommender-systems
Links: tiss:194035, eLearning
Curricula:
Master's programme Data Science
Master's programme Business Informatics


Content

Various recommender systems are introduced in a clearly structured and understandable manner. Students learn the basic principles of recommending items to users, with a focus on rating-based data (items rated e.g. on a scale of 1-10, rather than simply liked).

The lecture starts by explaining user-user collaborative filtering (CF), goes on to item-item CF, then introduces matrix factorisation methods, explores more advanced recommendation methods such as factorisation machines, and finally describes evaluation and performance metrics.
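To make the first of these methods concrete, here is a minimal sketch of user-user CF on a small dense rating matrix. The toy data, the function name `predict_user_user`, and the choice of cosine similarity on mean-centred ratings with a top-k neighbourhood are assumptions made for illustration; the lecture may use a different variant.

```python
import numpy as np

def predict_user_user(R, user, item, k=2):
    """Predict R[user, item] from the k most similar users who rated `item`.

    R is a dense (users x items) array with np.nan marking missing ratings.
    Assumes every user has rated at least one item.
    """
    rated = ~np.isnan(R)
    # Mean rating per user over the items that user actually rated.
    means = np.nanmean(R, axis=1)
    # Mean-centred ratings; missing entries contribute 0 to dot products.
    C = np.where(rated, R - means[:, None], 0.0)

    # Cosine similarity between the target user and every user.
    norms = np.linalg.norm(C, axis=1)
    denom = norms * norms[user]
    sims = np.divide(C @ C[user], denom, out=np.zeros_like(denom), where=denom > 0)

    # Candidate neighbours: other users who rated the item.
    candidates = np.flatnonzero(rated[:, item])
    candidates = candidates[candidates != user]
    if candidates.size == 0:
        return means[user]

    # Keep the k candidates with the highest similarity.
    top = candidates[np.argsort(sims[candidates])[::-1][:k]]
    weight = np.abs(sims[top]).sum()
    if weight == 0:
        return means[user]
    # Weighted average of the neighbours' deviations from their own mean.
    return means[user] + (sims[top] @ C[top, item]) / weight


# Hypothetical toy data: 4 users x 4 items, ratings on a 1-5 scale.
R = np.array([
    [5.0, 4.0, np.nan, 1.0],
    [4.0, 5.0, 2.0, 1.0],
    [1.0, 2.0, 5.0, 4.0],
    [2.0, 1.0, 4.0, np.nan],
])
prediction = predict_user_user(R, user=0, item=2, k=2)
```

In this toy data, user 0 rates like user 1, who rated item 2 below their own average, so the prediction ends up below user 0's mean rating.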

Course Format

Weekly lecture, 4 assignments to be done individually, a group project, and a final exam.

Required/Recommended Prior Knowledge

Simple matrix operations and mathematical optimisation.

Experience with Python and the NumPy and pandas modules.

Lecture

Still open

Exercises

4 assignments to be solved individually, 1 project in groups of 5+ people.

  • Assignment 1: Implement a user-user recommender system. The important supporting code is predefined, so students only need to care about the algorithmic part of the problem.
  • Assignment 2: Implement a matrix factorisation recommender system.
  • Assignment 3: Implement a content-based recommender.
  • Assignment 4: Prepare the previous algorithms as modules and benchmark them.
  • Project [SS 2020]: Twitter RecSys Challenge. Can be partially combined with the Data-intensive Computing class.

Depending on previous experience with matrix calculations and Python, expect around 2-6 hours per assignment when working alone.

Exam, Grading

Exam: About 4-5 open questions, to be answered in 1-2 sentences or with a mathematical formula. These aim mostly at understanding, i.e. they ask about some of the concepts covered (e.g. similarity between items, ...) but apply them to different examples. Additionally, for about 2-3 different topics, there are about 2-5 True/False statements each. The exam is open book, i.e. all material from the lecture and self-prepared material can be used. The material was not checked. Even computers were allowed "as long as they were used for checking the slides".

The exam lasts 1 hour [physical retake exam of SS 2020], which is not a lot but still OK. The main exam of SS 2020 was held via TUWEL because of the Covid-19 situation.


Time Until Certificate Is Issued

Still open

Time Required

Still open

Materials

Slides are provided on TUWEL. In SS 2020 there was a recorded video for each slide set because of the Covid-19 situation (no physical lectures), which was a plus compared to other lectures.

Tips

For the exam, it may help to summarize the most important points and bring some notes and the slides along. Also, try to think critically about the concepts and ask yourself different questions about the statements given in the slides, as the exam asks more for an understanding of the concepts, with examples a little different from what is covered in the slides and exercises (but in a constructive way imho).

Assignment 1, bonus points for vectorization:

  • use user_avgs
  • check the indptr attribute of a sparse matrix
  • use it in combination with np.diff
  • check np.repeat
  • use and understand what the .data attribute is for
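The hints above fit together roughly as follows. This is a sketch under assumptions: the ratings are stored in a SciPy CSR matrix with one row per user, `user_avgs` is taken to be the per-user mean over the stored ratings, and the goal is taken to be mean-centring the ratings without a Python loop. The toy data is made up, and the actual assignment skeleton may define things differently.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Toy ratings: 3 users x 4 items; 0 means "not rated" and is not stored.
R = csr_matrix(np.array([
    [5.0, 3.0, 0.0, 1.0],
    [0.0, 0.0, 4.0, 0.0],
    [2.0, 0.0, 0.0, 4.0],
]))

# R.data holds only the stored ratings, row after row.
# indptr[i]:indptr[i + 1] delimits row i inside R.data, so the
# differences give the number of stored ratings per user.
counts = np.diff(R.indptr)

# Per-user mean over stored ratings only
# (assumes every user has at least one rating).
user_avgs = np.asarray(R.sum(axis=1)).ravel() / counts

# Repeat each user's mean once per stored rating so that it lines up
# element by element with R.data, then subtract in one vectorised step.
centred = R.copy()
centred.data = R.data - np.repeat(user_avgs, counts)
```

Working on `.data` directly keeps the unrated entries out of the computation, which a dense subtraction of the means would not do.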

Assignment 2, bonus points for vectorization: There is an easy way and a hard way to get the bonus points here.

  • Easy way:
    • Use only matrix operations to get the batch updates
    • Use the X_U and X_I matrices efficiently to compute the updates. Think about what a dot product with a binary matrix does
  • Hard way:
    • Use numpy functionality
    • Look up how to use NumPy's reduceat (a ufunc method, e.g. np.add.reduceat) in this context
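One possible reading of these hints, sketched on toy data. The assumptions here are not from the assignment text: `X_U` and `X_I` are taken to be binary one-hot matrices with one row per rating, marking the user and the item of that rating, and the batch update is taken to be a gradient step on half the squared error of a factorisation with user factors `P` and item factors `Q`. The names `P`, `Q`, `lr`, and all data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, n_factors = 3, 4, 2

# One entry per observed rating, sorted by user id (needed for reduceat).
users = np.array([0, 0, 1, 2, 2, 2])
items = np.array([1, 3, 0, 0, 2, 3])
ratings = np.array([4.0, 1.0, 5.0, 2.0, 3.0, 4.0])

# Binary one-hot indicators: row n marks the user / item of rating n.
X_U = np.eye(n_users)[users]        # shape (n_ratings, n_users)
X_I = np.eye(n_items)[items]        # shape (n_ratings, n_items)

P = rng.normal(scale=0.1, size=(n_users, n_factors))   # user factors
Q = rng.normal(scale=0.1, size=(n_items, n_factors))   # item factors

# X_U @ P picks the factor row of each rating's user (same as P[users]).
P_n, Q_n = X_U @ P, X_I @ Q
errors = ratings - np.sum(P_n * Q_n, axis=1)

# Per-rating gradient contribution of 0.5 * error**2 w.r.t. the user factors.
contrib = -errors[:, None] * Q_n

# Easy way: X_U.T sums the contributions of all ratings of each user.
grad_P_matrix = X_U.T @ contrib

# Hard way: sum contiguous segments; each segment starts at a user's first
# rating. This matches the result above only because the ratings are sorted
# by user and every user has at least one rating.
starts = np.flatnonzero(np.r_[True, users[1:] != users[:-1]])
grad_P_reduceat = np.add.reduceat(contrib, starts, axis=0)

# One batch gradient step on the user factors (item factors are analogous).
lr = 0.01
P_new = P - lr * grad_P_matrix
```

The product with the transposed binary matrix and the `reduceat` call compute the same per-user sums. The dense one-hot matrices are only practical for toy sizes; a sparse representation would be needed for real data.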

Suggestions for Improvement / Criticism

The slides are partly quite minimalistic. A more exhaustive and detailed set of lecture notes would be very much appreciated.