TU Wien: Semistrukturierte Daten VU (Woltran) / Summary
Semi-structured data
- Semi-structured data (SSD) can be represented as a labeled tree.
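A minimal sketch of the labeled-tree view, using Python's standard `xml.etree.ElementTree`. The sample document and the helper name `to_labeled_tree` are made up for illustration and do not come from the course material.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, chosen only for illustration.
DOC = "<person><name>Ada</name><tel>123</tel><email>ada@example.com</email></person>"

def to_labeled_tree(elem):
    """Return (label, children); a leaf's only child is its text value."""
    children = [to_labeled_tree(child) for child in elem]
    if not children and elem.text:
        children = [elem.text]
    return (elem.tag, children)

tree = to_labeled_tree(ET.fromstring(DOC))
print(tree)
```

Each inner node carries an element name as its label, and the leaves hold the text values.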
XML
- An element consists of:
  - start tag: <element-name>
  - content, which is one of:
    - empty
    - simple content: text
    - element content: one or more elements
    - mixed content: text and elements
  - end tag: </element-name>
- Empty elements can be abbreviated as <element-name/>.
- XML is case-sensitive.
- XML predefines five entity references: &lt; (<), &amp; (&), &gt; (>), &quot; ("), &apos; (')
- Comment: <!-- this is a comment -->
- Processing instruction: <?target this is a processing instruction?>
- The default namespace is declared as xmlns="name", where the name is a URI.
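The points in this list can be checked with a short script. This is a sketch using only Python's standard library; the element names and the namespace URI `http://example.com/ns` are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# <br/> is shorthand for <br></br>: both parse to an element without content.
short = ET.fromstring("<br/>")
long_ = ET.fromstring("<br></br>")
same_empty = (short.tag, short.text or "", len(short)) == (long_.tag, long_.text or "", len(long_))

# XML is case-sensitive: <Name> cannot be closed by </name>.
try:
    ET.fromstring("<Name>x</name>")
    case_sensitive = False
except ET.ParseError:
    case_sensitive = True

# The parser resolves the five predefined entity references.
text = ET.fromstring("<t>&lt;a&gt; &amp; &quot;b&quot; &apos;c&apos;</t>").text

# escape() replaces &, < and > when writing text into a document.
escaped = escape("a < b & c")

# A default namespace applies to the element and its unprefixed descendants.
doc = ET.fromstring('<doc xmlns="http://example.com/ns"><item/></doc>')
qualified_tags = [doc.tag, doc[0].tag]
```

ElementTree reports namespaced names in `{uri}local-name` form, which is why `qualified_tags` contains the URI in braces.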
Document Type Definition (DTD)
A DTD lists all the elements and attributes the document uses. The order of declarations is not significant.
<!ELEMENT person (name, tel, fax, email+)>
<!ELEMENT name (#PCDATA)>
<!ATTLIST person id_number ID #REQUIRED>
If a document conforms to the DTD, it is valid; otherwise it is invalid. Applications may choose to ignore validity errors.
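A content model such as `(name, tel, fax, email+)` is a regular expression over the names of the child elements. The sketch below illustrates only that idea and is not a DTD validator: it ignores attributes and `#PCDATA`, and the function name `matches_person_model` is hypothetical. The expat-based parsers in Python's standard library are non-validating, so real validation needs a validating parser.

```python
import re
import xml.etree.ElementTree as ET

# Content model (name, tel, fax, email+) rewritten as a regex over child names.
# Each child name is followed by one space so that names cannot run together.
PERSON_MODEL = re.compile(r"name tel fax (email )+")

def matches_person_model(xml_text):
    """Check only the order and count of the children of <person>."""
    person = ET.fromstring(xml_text)
    child_names = "".join(child.tag + " " for child in person)
    return PERSON_MODEL.fullmatch(child_names) is not None

ok = matches_person_model(
    "<person><name>A</name><tel>1</tel><fax>2</fax><email>a@example.com</email></person>")
missing_email = matches_person_model(
    "<person><name>A</name><tel>1</tel><fax>2</fax></person>")
wrong_order = matches_person_model(
    "<person><tel>1</tel><name>A</name><fax>2</fax><email>a@example.com</email></person>")
```

Only the first document satisfies the model: the second has no `email` child, and the third has `name` and `tel` in the wrong order.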
Document Type Declaration:
<!DOCTYPE person SYSTEM "http://www.example.com/dtds/person.dtd">
The location can also be relative.
XML Schema Definition (XSD)
- Simple elements
- Contain only text. Restrictions can be added to narrow the allowed values.
Built-in types:
- xsd:boolean, xsd:string, xsd:decimal, xsd:integer, xsd:date, xsd:time, etc.
- Restrictions on Values
- xsd:minInclusive
- xsd:maxInclusive
- xsd:minExclusive
- xsd:maxExclusive
- xsd:enumeration
- xsd:pattern (regex)
- xsd:whiteSpace
- xsd:length
- xsd:minLength
- xsd:maxLength
- Complex elements
- Contain other elements and/or attributes.
Order Indicators
- all
- choice
- sequence
Occurrence Indicators
- minOccurs
- maxOccurs
Keys and References
XPath
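XPath defines 13 axes: ancestor, ancestor-or-self, attribute, child, descendant, descendant-or-self, following, following-sibling, namespace, parent, preceding, preceding-sibling, self.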
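A few of the axes can be tried from Python. Note the assumption: `xml.etree.ElementTree` implements only a small subset of XPath (abbreviated syntax for the child, descendant, parent and attribute axes), so this is a sketch rather than a full XPath engine. The sample document is invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
DOC = """<library>
  <book id="b1"><title>XML Basics</title><author>Ann</author></book>
  <book id="b2"><title>XPath in Depth</title><author>Bob</author></book>
</library>"""
root = ET.fromstring(DOC)

# child axis: book/title, evaluated relative to the <library> root element
child_titles = [t.text for t in root.findall("book/title")]

# descendant axis, abbreviated as //
all_authors = [a.text for a in root.findall(".//author")]

# attribute axis inside a predicate, abbreviated as @
b2_title = root.find("book[@id='b2']/title").text

# parent axis, abbreviated as ..
title_parents = [p.tag for p in root.findall(".//title/..")]
```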
Extensible Stylesheet Language Transformations (XSLT)
Templates match nodes of the input document and define the output.
Templates are applied to a node's subtree only when <xsl:apply-templates/> is called.
XSLT has the following default templates:
- for the root and element nodes: apply templates to the child nodes
- for text nodes: copy the text to the output
- for attributes: copy the value to the output
For each node, exactly one template is executed; if several templates match, the one with the more specific pattern takes priority.
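The effect of the default templates can be imitated in plain Python. This is only an analogue, not an XSLT processor: the function name `apply_builtin_rules` and the sample document are assumptions, and ElementTree stores text in `.text` and `.tail` instead of separate text nodes.

```python
import xml.etree.ElementTree as ET

def apply_builtin_rules(elem):
    """Mimic the default rules: elements recurse into children, text is copied."""
    out = [elem.text or ""]                     # text before the first child
    for child in elem:
        out.append(apply_builtin_rules(child))  # default rule for elements
        out.append(child.tail or "")            # text following this child
    return "".join(out)

# Hypothetical input document with one attribute and mixed content.
DOC = '<note lang="en"><to>Ann</to><body>Hello <b>world</b>!</body></note>'
result = apply_builtin_rules(ET.fromstring(DOC))
print(result)
```

The attribute value `en` does not appear in the result. Attributes are not children, so the default rule for elements never reaches them; the attribute rule fires only when a template selects attributes explicitly.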
XQuery
- FLWOR
- for ... let ... where ... order by ... return ...
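The clauses of a FLWOR expression map fairly directly onto ordinary list processing. The sketch below is a rough Python analogue, not XQuery itself. The query shown in the comment, the file name `library.xml` and the sample data are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical query being imitated:
#   for $b in doc("library.xml")//book
#   let $p := xs:decimal($b/price)
#   where $p < 40
#   order by $p
#   return $b/title/text()
DOC = """<library>
  <book><title>XQuery Primer</title><price>35.00</price></book>
  <book><title>Big Data</title><price>45.00</price></book>
  <book><title>XML Basics</title><price>20.00</price></book>
</library>"""
root = ET.fromstring(DOC)

# for + let: iterate over the books and bind each price to a variable
bindings = ((b, float(b.findtext("price"))) for b in root.iter("book"))
# where: keep only the tuples that satisfy the condition
cheap = [(b, p) for b, p in bindings if p < 40]
# order by: sort the remaining tuples by price
cheap.sort(key=lambda pair: pair[1])
# return: build one result item per tuple
titles = [b.findtext("title") for b, _ in cheap]
print(titles)
```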
Parsing
- event-based: SAX (Simple API for XML)
- tree-based: DOM (Document Object Model)
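The two styles can be compared with Python's standard `xml.sax` and `xml.dom.minidom` modules. This is a small sketch; the sample document and the handler class `PersonCounter` are invented for the example.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample document.
DOC = "<people><person>Ann</person><person>Bob</person></people>"

class PersonCounter(xml.sax.ContentHandler):
    """SAX: the parser calls back on each event; no tree is kept in memory."""
    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "person":
            self.count += 1

handler = PersonCounter()
xml.sax.parseString(DOC.encode("utf-8"), handler)
sax_count = handler.count

# DOM: the whole document is loaded into a tree that can be navigated freely.
dom = minidom.parseString(DOC)
dom_names = [p.firstChild.data for p in dom.getElementsByTagName("person")]
```

SAX suits large documents processed in a single pass, while DOM is more convenient when the program needs random access to the tree.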