TU Wien:Intelligent Audio and Music Analysis VU (Knees)



Lecturers Alexander Schindler, Richard Vogl, Peter Knees
ECTS 4.5
Alias Intelligent Audio and Music Analysis (en)
Department Information Systems Engineering
When winter semester
Last iteration 2020WS
Language English
Links tiss:194039
Master Data Science: elective module MLS/EX - Machine Learning and Statistics - Extension
Master Business Informatics: elective module DA/EXT - Data Analytics Extension
Master Visual Computing: elective module Media Understanding
Master Media and Human-Centered Computing: elective module Media Understanding

Mattermost: Channel "intelligent-audio-and-music-analysis"

Subject Matter

Copied from TISS:

Selected topics from the areas of acoustic signal processing, auditory scene analysis, and music information retrieval are presented and discussed, comprising the following:

- Fundamentals of audio processing, analysis, and description

- Audio event detection and classification

- Detection and tracking of musical events

- Tracking of musical concepts (e.g., beats, meter, key)

- Instrument detection and transcription

- Real-time tracking of audio events

- Music genre classification and tagging

- Multi-modality in semantic music description

- Music retrieval and recommendation

Topics are contextualized historically, giving an overview of the development from hand-crafted features to recent deep-learning-based methods, including convolutional and recurrent neural networks. Emphasis is given to aspects of evaluation, such as the metrics used and ground-truth construction. Understanding of theoretical concepts is deepened through accompanying applied lab exercises.

Course Structure


Every two weeks, an exercise sheet is published in the form of a Jupyter notebook. Each sheet deals with a specific topic, usually from the previous lecture(s). See the Exercises section for more info.

At the end of the semester there was an online test focused on theory, though it did include some small exercises. See the Exam section for more info.

Required/Recommended Previous Knowledge


In my opinion, an intermediate level of Python knowledge is needed for this course. If you don't have it, taking a course like Data-Oriented Programming Paradigms, even in parallel with this class, will force you to become much more proficient in Python.

I'd also say that some previous knowledge of machine learning, and especially of neural networks, is needed. If you lack it, you'll either have to invest more time in this class or take a course like Machine Learning before starting this one.

Finally, I came into this course with some informal knowledge of audio processing from my hobby of music production. It definitely helped with some concepts, though if you are generally interested in this subject matter, you'll pick them up pretty quickly.

Lecture



We had weekly Zoom lectures on the topics listed above, given by three different lecturers: Peter Knees, Richard Vogl and Alexander Schindler. There was supposed to be a guest lecture, but I believe it got cancelled.

I definitely recommend attending the lectures, as the slides alone might not be enough to understand the exercises. Since the lectures are not recorded officially, you could record them yourself or have a colleague do it and share the recordings.

Exercises



The sheets contain cells with code skeletons and comments on how each part should be implemented, plus some text-input cells that ask open questions about implementation decisions and tie back to theory from the lecture. Some cells are already implemented and act as simple tests that make sure your code isn't completely broken, though the final grading is done by hand. In my opinion the grading is generous, which makes up for the somewhat vague tasks that will have you searching the forums and Mattermost. As far as I remember, there was also an option to turn in assignments late for a reduced number of points.

The exercises were split into three topic blocks with about three sheets each. A minimum number of points had to be reached in each of those blocks in order to pass the course. While you can skip some exercises, I advise against it, as you will be re-using your own code a lot of the time. The tasks often reference previous exercises, so understanding those will always prove useful as you go along.

Exam, Grading


At the end of the lecture we had a one-hour online exam in TUWEL, which included multiple-choice, single-choice and open questions on the theory from the lectures, in addition to a task where you had to calculate some performance metrics such as precision, recall and F1.
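
For anyone preparing for the metrics task, here is a minimal sketch of how precision, recall and F1 follow from counts of true positives, false positives and false negatives. The function name and the example counts are made up for illustration and are not taken from the actual exam.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall and F1 from raw detection counts.

    tp: true positives, fp: false positives, fn: false negatives.
    Returns 0.0 for a metric whose denominator would be zero.
    """
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1


# Illustrative (made-up) counts: 8 correct detections, 2 spurious, 4 missed.
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=4)
print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
# prints: precision=0.800 recall=0.667 F1=0.727
```

Working a few such examples by hand is probably the quickest way to practise, since the exam task has to be solved without running code.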

The grading is done in four blocks: three from the exercises and one consisting of the exam. They are all equally weighted, and while you do have to pass all four, your grade is calculated from only the best three. That means you can focus on the exercises, barely pass the exam, and still get a 1.

As mentioned in the Exercises section, I believe the grading is generous for both the exercises and the exam, though we had no information on what the exam was going to look like, which made it hard to prepare. There was a retake, however, a month after the first slot.

The exam setup was also fairly strict, and in my opinion unnecessarily so (two cameras, breakout rooms, the whole deal), since this is a niche course taken by about 20 students. By the time you get to the end, the lecturers know who you are, and you will have met almost all of your colleagues in the forum or on Mattermost.

I will say that the organizers of this course did accommodate us students by pushing back two exercise deadlines, once because of technical issues and once because we asked nicely and had a good reason for it. As with most sensible lecturers at TU, if you are realistic, make a good case for yourself and ask nicely, concessions will be made.

Time taken to give out certificates

WS2020: The certificates were given out the same week as the second exam slot.

Time Requirement


As someone with perhaps sub-par Python knowledge but strong fundamentals in machine learning and audio processing, I needed about two full days to complete each sheet.




Still open

Improvement suggestions / Criticism

Still open