TU Wien:Data Stewardship VO (Rauber)

From VoWi

Data

Lecturers: Tomasz Miksa, Andreas Rauber, Martin Weise
ECTS: 3.0
Replaces: Digital Preservation VO (Rauber)
Last held: 2024S
Language: English
Mattermost: data-stewardship
Links: tiss:194044, eLearning
Master's programme Data Science: module FDS/CO - Fundamentals of Data Science - Core
Master's programme Business Informatics: module DA/EXT - Data Analytics Extension (restricted elective)
Master's programme Medical Informatics: module Informationsverarbeitung (restricted elective)
Master's programme Software Engineering & Internet Computing: module Advanced Security (restricted elective)

Content

Still open. Please do not copy from TISS or the homepage; describe it from a student's perspective.

Course format

Still open.

Required/recommended prior knowledge

It helps to know a bit about ontologies and linked data, the semantic web, and metadata standards, and to have a basic understanding of experiment design and related processes with respect to reproducibility. Classes that may help a bit (and possibly even cover some of the material presented here) are "Experiment Design for Data Science" and, to a lesser degree, "Introduction to Semantic Systems".

Lecture

Still open.

Exercises

Still open.

Exam, grading

SS22 - Exam 13.06.2022:

  • 20 MC questions (+2 points for right answer, 0 points for no answer, -1 for wrong answer)
  • 4 Open questions (15 points for each)
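The SS22 point scheme above can be worked through as a small calculation. This is only a sketch: the function names are made up for illustration, and it assumes that the multiple choice total has no floor at zero and that open questions are graded from 0 to 15 points each, neither of which is confirmed here.

```python
MC_QUESTIONS = 20      # number of multiple choice questions (SS22 report)
OPEN_QUESTIONS = 4     # number of open questions (SS22 report)
OPEN_MAX = 15          # maximum points per open question


def exam_score(mc_right, mc_wrong, open_points):
    """Total points: +2 per right MC answer, -1 per wrong one, 0 if unanswered."""
    if mc_right < 0 or mc_wrong < 0 or mc_right + mc_wrong > MC_QUESTIONS:
        raise ValueError("invalid multiple choice counts")
    if len(open_points) != OPEN_QUESTIONS:
        raise ValueError("expected points for exactly 4 open questions")
    if any(not 0 <= p <= OPEN_MAX for p in open_points):
        raise ValueError("open question points must be between 0 and 15")
    return 2 * mc_right - mc_wrong + sum(open_points)


def expected_guess_value(p_correct):
    """Expected points for answering one MC question with success chance p_correct."""
    return 2 * p_correct - (1 - p_correct)


# Maximum reachable score: 20 * 2 + 4 * 15 = 100
MAX_SCORE = exam_score(20, 0, [15, 15, 15, 15])
```

Under these assumptions, `expected_guess_value(p)` simplifies to `3p - 1`, so answering only pays off on average if you think your chance of being right is above one third; otherwise leaving the question blank is the safer choice.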

SS20 - Exam 30.09.2020:

  • Multiple choice questions, only 1 open question.
  • The exam was apparently copied or randomly assembled from Moodle or the like: I was asked the same multiple choice question 6-7 times. Sometimes the true/false answers were mixed, or true/true or false/false were offered as answer options. Overall, it was very unclear what to tick and what not. :/
  • Other student: SS20 - I agree, one of the most unfriendly exams I have taken in 9 semesters at TU Wien. Sometimes you could really argue a question to be both true and false because not enough context was given.
  • Other student: I really did prepare well for this exam, but I was lost. I confirm the above. It was really not clear whether some of the answers were true or false. Everything could be argued and interpreted one way or the other. And as you said, no context was given, just some random sentence. AND: if you did not tick the right combination of answers, you lost all the points on most questions.

SS20 - Exam 15.12.2020:

  • Oral exam online (Covid restrictions), 2-3 candidates at once, at least 2 rounds of questions. Questions or parts of questions were sometimes handed over to other candidates if enough or too little info was given. Grading seemed fair to me.

Time until the grade is issued

(OUTDATED, the format of the exam etc. has changed: I was the only one who took the exam on 16.12.19 and got the grade the same evening.)

Time required

Lots of material. Aiming for a top grade may require going through the whole material twice, plus some extra material (see below). I spent a month ingesting it in smaller portions.

Materials

Still open.

Tips

SS20: Do NOT take this course!!! The new exam design is horrible and unfair; even if you prepare well for the exam, it is hard to pass the course. A good grade is impossible if you don't have a lot of prior knowledge in this field, and even then some single choice questions are simply outrageous.

[Experience from oral exam Dec. 2020, no written exams due to Corona]:

For a grade in the top range, I needed to go through the slides completely 1-2 times and additionally read some of the existing summaries and the few documented exam questions, to get an idea of what kind of questions are asked.

Attending the lecture helped insofar as it showed which topics and parts of the slides are more important and which are less. The slides are quite verbose and there is a lot of material.

This means one should start preparing quite early. I stretched all the reading out over about a month, which made it easier to ingest the material in smaller pieces.

For some concepts, I resorted to skimming some of the papers that they cite in the slides; this may not be really necessary, but helped me understand some of the material better. Imho worth a look are (find the sections that seem important, not necessarily read the whole paper) (this is certainly not an exhaustive list, you should check on your own...):

  • Miksa et al (2019) - Ten principles for machine-actionable DMP
  • Miksa et al (2018) - Defining requirements for maDMP

Interesting was also this one:

  • Bajpai et al (2019) - The Dagstuhl Beginners Guide to the Reproducibility for Experimental Networking Research

A common theme that runs through some of the material is "machine-actionability". My understanding is that this is mostly achieved by using controlled vocabularies and (possibly community-specific) metadata standards, as well as persistent identifiers and possibly standard communication protocols (like OAI-PMH). I'm not sure whether I missed the place where this is clearly defined, but it only dawned on me at some point that this is what it means. [and, as always, correct me if I'm wrong]
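To make that idea a bit more tangible, here is a minimal sketch of what "machine-actionable" could look like for a single metadata record. It is my own illustration, not something from the slides: the set of required fields, the function name `check_record`, and the example DOI are all assumptions, and the DOI check is a simplified pattern match rather than real resolution. The type list is the DCMI Type Vocabulary.

```python
import re

# DCMI Type Vocabulary: a controlled vocabulary for the Dublin Core "type" element.
DCMI_TYPES = {
    "Collection", "Dataset", "Event", "Image", "InteractiveResource",
    "MovingImage", "PhysicalObject", "Service", "Software", "Sound",
    "StillImage", "Text",
}

# Simplified shape check for a DOI (assumption: good enough for illustration only).
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Hypothetical minimal set of fields we require; not prescribed by any standard here.
REQUIRED_FIELDS = ("identifier", "title", "creator", "type")


def check_record(record):
    """Return the problems a program can detect without a human reading the record."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            problems.append(f"missing field: {field}")
    if record.get("type") and record["type"] not in DCMI_TYPES:
        problems.append(f"type not in controlled vocabulary: {record['type']}")
    if record.get("identifier") and not DOI_PATTERN.match(record["identifier"]):
        problems.append(f"identifier is not a DOI: {record['identifier']}")
    return problems


# A record a machine can act on: persistent identifier plus controlled terms.
good = {
    "identifier": "10.1234/example-dataset",  # hypothetical DOI
    "title": "Example measurement series",
    "creator": "Jane Doe",
    "type": "Dataset",
}

# A record only a human can interpret: free text instead of PID and vocabulary.
bad = {
    "identifier": "see my homepage",
    "title": "Some data",
    "type": "spreadsheet thing",
}

good_problems = check_record(good)
bad_problems = check_record(bad)
```

The point is that `good` can be validated, linked, and harvested automatically, while `bad` carries roughly the same information for a human reader but gives software nothing reliable to work with.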

The slides are full of acronyms. A collection was started here https://vowi.fsinf.at/wiki/TU_Wien:Data_Stewardship_VO_(Rauber)/Questions_from_Slides (far from complete), but note: most of them don't seem to be so important. The top contenders of acronyms and the things behind them that one should know (I think):

  • FAIR (!)
  • DMP
  • maDMP
  • DC (Dublin Core)
  • DCAT
  • PID
  • DOI
  • ARK
  • OAIS
  • SIP
  • AIP
  • DIP
  • PDI
  • RDF

This is all my (subjective) opinion, your mileage may vary.

Highlights / Praise

Still open.

Suggestions for improvement / Criticism

The slides are very verbose in some places and not very detailed in others. Slides alone are not enough for some topics. That is why I don't understand why no recordings of the lectures were produced. I think that would be a tremendous help (yes, yes, there are some concerns about creating videos; imho they are all outweighed by the benefits).

A more concrete official collection of possible exam questions would probably help a lot with preparing for the exam, to get a better sense of what the focus is among the huge amount of topics and material - there are some questions in the slide sets, but not in all of them.

This is clearly a quite dynamic field, with new stuff being added to the lecture often, it seems, based on practical experience of the lecturers. I guess this makes it not so easy to settle on more concrete forms of material; again, showing focus by providing some questions for each topic could be helpful.