TU Wien:Intelligent Audio and Music Analysis VU (Knees)
|Lecturers||Alexander Schindler • Richard Vogl • Peter Knees|
|Alias||Intelligent Audio and Music Analysis (en)|
|Department||Information Systems Engineering|
Copied from TISS:
Selected topics from the areas of acoustic signal processing, auditory scene analysis, and music information retrieval are presented and discussed, comprising the following:
- Fundamentals of audio processing, analysis, and description
- Audio event detection and classification
- Detection and tracking of musical events
- Tracking of musical concepts (e.g., beats, meter, key)
- Instrument detection and transcription
- Real-time tracking of audio events
- Music genre classification and tagging
- Multi-modality in semantic music description
- Music retrieval and recommendation
Topics are contextualized historically, giving an overview of the development from hand-crafted features to recent deep-learning-based methods, including convolutional and recurrent neural networks. Emphasis is given to aspects of evaluation, such as the metrics used and ground truth construction. Understanding of theoretical concepts is deepened through accompanying applied lab exercises.
Every two weeks an exercise sheet is published in the form of a Jupyter notebook. Each sheet deals with a specific topic, usually from the previous lecture(s). See the Exercises section for more info.
At the end there was an online test focused on theory, though it did include some small exercises. See the Exam section for more info.
Required/Recommended Previous Knowledge
In my opinion, an intermediate level of Python knowledge is needed for this course. If you don't have it, taking a course like Data-Oriented Programming Paradigms, even in parallel with this class, will force you to become much better versed in Python.
I'd also say that some previous knowledge of machine learning, and especially neural networks, is needed. If you lack it, you'll either have to invest more time in this class or take a course like Machine Learning before starting this one.
Finally, I came into this course with some informal knowledge of audio processing from my hobby of music production. It definitely helped with some concepts, though if you are generally interested in this subject matter, you'll pick them up pretty quickly.
We had weekly Zoom lectures on the topics listed above, given by three different lecturers: Peter Knees, Richard Vogl and Alexander Schindler. There was supposed to be a guest lecture, but I believe it got cancelled.
I definitely recommend attending the lectures, as the slides alone might not be enough to understand the exercises. Since the lectures are not recorded officially, you could record them yourself or have a colleague do it and share the recordings.
The sheets contain cells with code skeletons and comments on how each part should be implemented, plus some text-input cells that ask open questions about implementation decisions and tie back to theory from the lecture. Some cells are already implemented and act as simple tests that make sure your code isn't completely broken, though the final grading is done by hand. In my opinion the grading is generous, which makes up for the somewhat vague tasks that will have you searching the forums and Mattermost. As far as I remember, there was also an option to turn in assignments late for a reduced number of points.
The exercises were split into three topic blocks with about three sheets each. A minimum number of points had to be earned in each of those blocks in order to pass the course. While you can skip some exercises, I advise against it, as you will be re-using your own code a lot of the time. The tasks often reference previous exercises, so understanding those will always prove useful as you go along.
At the end of the lecture we had a one-hour online exam in TUWEL, which included multiple-choice, single-choice and open questions on the theory from the lectures, in addition to a task where you had to calculate some performance metrics like precision, recall, F1 and so on.
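For anyone preparing for that kind of task, here is a minimal sketch of how precision, recall and F1 are computed from raw counts. The function name and the example counts are made up for illustration and are not taken from the actual exam.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Compute precision, recall and F1 from raw detection counts.

    tp: true positives (e.g., correctly detected onsets)
    fp: false positives (detections with no matching ground-truth event)
    fn: false negatives (ground-truth events that were missed)
    """
    # Guard against division by zero when there are no detections / no events.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall > 0:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical example: 8 correct detections, 2 spurious ones, 8 missed events.
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=8)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

With these counts, precision is 8/10, recall is 8/16, and F1 works out to 8/13.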
The grading is done in four blocks: three from the exercises and one consisting of the exam. They are all equally weighted, and while you do have to pass all four, your grade is calculated by taking into account only the best three. That means that you can focus on the exercises, barely pass the exam, and still get a 1.
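As a rough sketch of how I understand the scheme, the snippet below checks that all four blocks are passed and then averages the best three. The function name, the 50% pass mark and the example scores are my own assumptions; the official threshold and the mapping from score to grade are not stated here.

```python
from typing import Sequence, Tuple


def course_score(block_scores: Sequence[float], pass_mark: float = 0.5) -> Tuple[bool, float]:
    """Return (passed, score) for four equally weighted blocks.

    block_scores: fractions in [0, 1] for the three exercise blocks and the exam.
    pass_mark: hypothetical per-block threshold (an assumption, not an official value).
    """
    if len(block_scores) != 4:
        raise ValueError("expected exactly four block scores")
    # Every block has to be passed on its own ...
    passed = all(score >= pass_mark for score in block_scores)
    # ... but only the best three blocks count towards the final score.
    best_three = sorted(block_scores, reverse=True)[:3]
    return passed, sum(best_three) / 3


# Hypothetical student: strong exercise blocks, exam barely passed.
passed, score = course_score([0.95, 0.90, 1.00, 0.50])
print(passed, round(score, 3))
```

In this example the weak exam result is the block that gets dropped, so the final score is the mean of the three exercise blocks.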
As mentioned in the exercise section, I believe the grading is generous for both the exercises and the exam, though we had no information on what the exam was going to look like, making it hard to prepare. There was a retake, however, a month after the first slot.
The exam setup was also fairly strict, in my opinion unnecessarily so (two cameras, breakout rooms, the whole deal), since this is a niche course taken by about 20 students. By the time you get to the end, the lecturers know who you are, and you will have met almost all of your colleagues in the forum/Mattermost.
I will say that the organizers of this course did accommodate us students by pushing back two exercise deadlines, once because of technical issues, and once because we asked nicely and had a good reason for it. As with most sensible lecturers at TU, if you are realistic, make a good case for yourself and ask nicely, concessions will be made.
Time taken to give out certificates
WS2020: The certificates were given out the same week as the second exam slot.
As someone with perhaps sub-par Python knowledge, but strong fundamentals in machine learning and audio processing, I needed about two full days to complete each sheet.
Improvement suggestions / Criticism