1 of 117

Hierarchical data formats: XML, XML Schema, XPath, XSLT

Jakub Klímek

2 of 117

XML

3 of 117

Example of a human readable message

Dear John Doe,

the balance on your bank account 111333444/1123 as of 3rd of January 2021 is 25000 CZK.

Best regards,

Your bank

1234 5th Avenue

+420123456789

4 of 117

Example of a tagged message - 2/3

Dear <customer><firstName>John</firstName> <lastName>Doe</lastName></customer>,

the balance on your bank account <accountNumber>111333444/1123</accountNumber> as of <balanceDate>3rd of January 2021</balanceDate> is <balance><value>25000</value> <currency>CZK</currency></balance>.

Best regards,

<bankName>Your bank</bankName>

<streetAddress>1234 5th Avenue</streetAddress>

<phone>+420123456789</phone>

5 of 117

Example of a tagged message - 3/3

<?xml version="1.0" encoding="UTF-8"?>

<message>

Dear <customer><firstName>John</firstName> <lastName>Doe</lastName></customer>,

the balance on your bank account <accountNumber>111333444/1123</accountNumber> as of <balanceDate>3rd of January 2021</balanceDate> is <balance><value>25000</value> <currency>CZK</currency></balance>.

Best regards,

<bankName>Your bank</bankName>

<streetAddress>1234 5th Avenue</streetAddress>

<phone>+420123456789</phone>

</message>

Document-centric or Document-oriented XML

6 of 117

We “hang” the data from the Catalog entity

7 of 117

Example of data-centric XML data

<?xml version="1.0" encoding="UTF-8"?>

<catalog>

<title>My catalog</title>

<description>This is my dummy catalog</description>

<contact-point>

<name>John Doe</name>

<e-mail>mailto:john@doe.org</e-mail>

</contact-point>

<datasets>

<dataset>

<title>My first dataset</title>

</dataset>

<dataset>

<title>My second dataset</title>

</dataset>

</datasets>

</catalog>

Data-centric or Data-oriented XML

8 of 117

XML - eXtensible Markup Language, 1.0 and 1.1

Extensible Markup Language (XML) 1.0

  • W3C Recommendation, First edition, 1998
  • widely adopted
  • element and attribute names are defined against Unicode 2.0 by listing the permitted characters

Extensible Markup Language (XML) 1.1

  • W3C Recommendation, Second edition, 2006
  • not so widely adopted
  • element and attribute names are defined by listing the forbidden characters
    • therefore independent of the Unicode version

Extensible Markup Language (XML) 1.0 (Fifth Edition)

  • W3C Recommendation, Fifth edition, 2008
  • relaxes element and attribute naming restrictions (towards XML 1.1)

<?xml version="1.0" encoding="UTF-8"?>

<!-- XML declaration -->

<!-- root element -->

<root-element>

<!-- an empty element -->

<element/>

<!-- attributes of an element -->

<element attribute1="value"

attribute2="another value">...</element>

<!-- an element with subelements -->

<element attribute1="value" attribute2="value">

<subelement>TEXT CONTENT</subelement>

<subelement>...</subelement>

</element>

</root-element>

9 of 117

XML - eXtensible Markup Language

<?xml version="1.0" encoding="UTF-8"?>

<!-- XML declaration -->

<!-- root element -->

<root-element>

<!-- an empty element -->

<element/>

<!-- attributes of an element -->

<element attribute1="value"

attribute2="another value">...</element>

<!-- an element with subelements -->

<element attribute1="value" attribute2="value">

<subelement>TEXT CONTENT</subelement>

<subelement>...</subelement>

</element>

</root-element>

Prolog (XML declaration)

  • version (1.0 or 1.1)
  • encoding (utf-8, … - case-insensitive)

One root element

Empty element: “/” at the end

Attributes: unordered, case-sensitive, unique within a tag

Nested elements (subelements): ordered

Start tag: case-sensitive

End tag: “/” at the beginning

Comments: <!-- multiline -->

Encodings also found in Czech legacy systems: iso-8859-2, windows-1250, utf-16

10 of 117

XML - Mixed content

<?xml version="1.0" encoding="UTF-8"?>

<message>

Dear <customer><firstName>John</firstName> <lastName>Doe</lastName></customer>,

the balance on your bank account <accountNumber>111333444/1123</accountNumber> as of <balanceDate>3rd of January 2021</balanceDate> is <balance><value>25000</value> <currency>CZK</currency></balance>.

Best regards,

<bankName>Your bank</bankName>

<streetAddress>1234 5th Avenue</streetAddress>

<phone>+420123456789</phone>

</message>

Mixed content

Elements containing both text and nested elements; the order of text and elements is significant

11 of 117

Basic syntax errors in XML documents

<?xml version="1.0" encoding="UTF-8"?>

<elementA>

<!-- a bad enclosing symbol -->

<elementB>...</elementb>

<!-- an element without an enclosing symbol -->

<elementD>

<!-- a bad element nesting -->

<elementE>

<elementF>

</elementE>

</elementF>

</elementA>

<!-- another root element -->

<elementG />

Case mismatch: <elementB> is closed by </elementb> (B vs. b)

elementD is started but never ended: missing </elementD>

elementE is started first but also ended first: the nested elementF must be closed before elementE

There can be only one root element

12 of 117

XML document well-formedness

An XML document is well-formed iff it complies with the XML syntax rules.
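As a rough illustration of the definition, well-formedness can be checked by simply trying to parse the document. The sketch below assumes Python's standard xml.etree.ElementTree parser; the helper name is_well_formed and the sample strings are made up for this example and condense the error cases from the previous slide.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as a well-formed XML document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        # The parser reports any violation of the XML syntax rules.
        return False


# Hypothetical samples mirroring the basic syntax errors listed earlier.
GOOD = "<elementA><elementB>...</elementB></elementA>"
CASE_MISMATCH = "<elementA><elementB>...</elementb></elementA>"
BAD_NESTING = "<elementA><elementE><elementF></elementE></elementF></elementA>"
TWO_ROOTS = "<elementA/><elementG/>"

for label, doc in [("good", GOOD), ("case mismatch", CASE_MISMATCH),
                   ("bad nesting", BAD_NESTING), ("two roots", TWO_ROOTS)]:
    print(label, is_well_formed(doc))
```

Note that this only tests well-formedness; it says nothing about validity against a schema.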

13 of 117

XML namespaces

<table>
  <tr>
    <td>Apples</td>
    <td>Bananas</td>
  </tr>
</table>

<table>
  <name>African Coffee Table</name>
  <width>80</width>
  <length>120</length>
</table>

14 of 117

XML namespaces

<h:table>
  <h:tr>
    <h:td>Apples</h:td>
    <h:td>Bananas</h:td>
  </h:tr>
</h:table>

<f:table>
  <f:name>African Coffee Table</f:name>
  <f:width>80</f:width>
  <f:length>120</f:length>
</f:table>

15 of 117

XML namespaces, Qualified name (QName)

<root>
  <h:table xmlns:h="http://www.w3.org/TR/html4/">
    <h:tr>
      <h:td>Apples</h:td>
      <h:td>Bananas</h:td>
    </h:tr>
  </h:table>

  <f:table xmlns:f="https://www.w3schools.com/furniture">
    <f:name>African Coffee Table</f:name>
    <f:width>80</f:width>
    <f:length>120</f:length>
  </f:table>
</root>

16 of 117

XML namespaces

<root xmlns:h="http://www.w3.org/TR/html4/"
      xmlns:f="https://www.w3schools.com/furniture">
  <h:table>
    <h:tr>
      <h:td>Apples</h:td>
      <h:td>Bananas</h:td>
    </h:tr>
  </h:table>
  <f:table>
    <f:name>African Coffee Table</f:name>
    <f:width>80</f:width>
    <f:length>120</f:length>
  </f:table>
</root>

17 of 117

XML namespaces - default namespace

<root xmlns="http://www.w3.org/TR/html4/">
  <table>
    <tr>
      <td>Apples</td>
      <td>Bananas</td>
    </tr>
  </table>
</root>
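To see what a namespace-aware parser makes of prefixes and of the default namespace, the following sketch may help. It assumes Python's xml.etree.ElementTree, which reports each element name in the form {namespace-URI}local-name; the variable names and the shortened documents are made up for this example.

```python
import xml.etree.ElementTree as ET

# Shortened versions of the two namespace examples above.
PREFIXED = """
<root xmlns:h="http://www.w3.org/TR/html4/"
      xmlns:f="https://www.w3schools.com/furniture">
  <h:table><h:tr><h:td>Apples</h:td></h:tr></h:table>
  <f:table><f:name>African Coffee Table</f:name></f:table>
</root>
"""

DEFAULT = """
<root xmlns="http://www.w3.org/TR/html4/">
  <table><tr><td>Apples</td></tr></table>
</root>
"""

# Both children are called "table", but they live in different namespaces.
prefixed_tags = [child.tag for child in ET.fromstring(PREFIXED)]

# With a default namespace, unprefixed elements (root included) are qualified.
default_root = ET.fromstring(DEFAULT)
default_tags = [default_root.tag] + [child.tag for child in default_root]

print(prefixed_tags)
print(default_tags)
```

The prefixes h and f themselves do not appear in the parsed names; only the namespace URIs they are bound to matter.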

18 of 117

XML CDATA sections

<?xml version="1.0" encoding="UTF-8"?>

<elementA>

<![CDATA[<greeting>Hello, world!</greeting>]]>

</elementA>

Everything between <![CDATA[ and ]]> is treated as character data, not as markup
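A small sketch of the effect, assuming Python's xml.etree.ElementTree: the CDATA section ends up as the text of elementA, whereas the same characters written as markup produce a child element. The variable names are made up for this example.

```python
import xml.etree.ElementTree as ET

WITH_CDATA = "<elementA><![CDATA[<greeting>Hello, world!</greeting>]]></elementA>"
WITH_MARKUP = "<elementA><greeting>Hello, world!</greeting></elementA>"

cdata_root = ET.fromstring(WITH_CDATA)
markup_root = ET.fromstring(WITH_MARKUP)

# The CDATA content is plain text of elementA; no child element is created.
print(cdata_root.text)
print(len(cdata_root))

# Without CDATA the same characters are parsed as a <greeting> subelement.
print(len(markup_root), markup_root[0].tag)
```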

19 of 117

Natural language specification - xml:lang attribute

<?xml version="1.1" encoding="UTF-8"?>

<document>

<p xml:lang="en">The quick brown fox jumps over the lazy dog.</p>

<p xml:lang="en-GB">What colour is it?</p>

<p xml:lang="en-US">What color is it?</p>

<sp who="Faust" desc='leise' xml:lang="de">

<l>Habe nun, ach! Philosophie,</l>

<l>Juristerei, und Medizin</l>

<l>und leider auch Theologie</l>

<l>durchaus studiert mit heißem Bemüh'n.</l>

</sp>

</document>

Language tags are specified by IETF BCP 47

consisting of RFC 5646: Tags for Identifying Languages (which obsoleted RFC 4646), and RFC 4647: Matching of Language Tags

(same as in RDF, HTTP headers, …)
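The sketch below shows one way an application might read xml:lang and select paragraphs by language. It assumes Python's xml.etree.ElementTree, where the reserved xml prefix is bound to http://www.w3.org/XML/1998/namespace. The helper matches is a hypothetical, simplified take on the basic filtering idea of RFC 4647 (a range matches a tag that is equal to it or starts with it followed by "-"), not a complete implementation.

```python
import xml.etree.ElementTree as ET

# The xml: prefix is always bound to this namespace.
XML_NS = "http://www.w3.org/XML/1998/namespace"
LANG = f"{{{XML_NS}}}lang"

DOC = """<document>
  <p xml:lang="en">The quick brown fox jumps over the lazy dog.</p>
  <p xml:lang="en-GB">What colour is it?</p>
  <p xml:lang="en-US">What color is it?</p>
  <p xml:lang="de">Habe nun, ach! Philosophie,</p>
</document>"""


def matches(language_range: str, tag: str) -> bool:
    """Simplified basic filtering: exact match or prefix followed by '-'."""
    language_range, tag = language_range.lower(), tag.lower()
    return tag == language_range or tag.startswith(language_range + "-")


root = ET.fromstring(DOC)
by_lang = {p.get(LANG): p.text for p in root.iter("p")}
english = [tag for tag in by_lang if matches("en", tag)]

print(by_lang["en-GB"])
print(english)
```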

20 of 117

XML processing instruction (PI)

<?xml-stylesheet type="text/xsl" href="style.xsl"?>

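A PI is not part of the XML data

It must be passed to the targeted application by the XML parser

The PI target (here xml-stylesheet) identifies the targeted application

The syntax is similar to the XML declaration in the prolog, but the XML declaration is not a PI.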

21 of 117

XML entities, entity references, character references

Predefined XML entities

  • &lt; for <
  • &gt; for >
  • &amp; for &
  • &apos; for '
  • &quot; for "

There are also other types of entities (omitted in this lecture)

Character references

  • &#60; for < (decimal)
  • &#x3C; for < (hexadecimal)

<?xml version="1.0" encoding="UTF-8"?>

<elementA>

This is content of &lt;elementA&gt; with

entity references.

</elementA>

<?xml version="1.0" encoding="UTF-8"?>

<elementB>

This is content of &#60;elementB&#62; with

character references.

</elementB>
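The following sketch suggests that entity references and character references are just different spellings of the same characters once the document is parsed. It assumes Python's xml.etree.ElementTree for parsing and xml.sax.saxutils.escape for the reverse direction; the variable names are made up for this example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

ENTITY_DOC = "<elementA>This is content of &lt;elementA&gt;</elementA>"
DECIMAL_DOC = "<elementA>This is content of &#60;elementA&#62;</elementA>"
HEX_DOC = "<elementA>This is content of &#x3C;elementA&#x3E;</elementA>"

# All three spellings decode to the same text, so the set has one member.
texts = {ET.fromstring(doc).text for doc in (ENTITY_DOC, DECIMAL_DOC, HEX_DOC)}
print(texts)

# When writing XML, reserved characters in text must be escaped again.
escaped = escape("a < b & c")
print(escaped)
```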

22 of 117

XML constructs not covered here

DTD - Document Type Definition

<!DOCTYPE TVSCHEDULE [

<!ELEMENT TVSCHEDULE (CHANNEL+)>

<!ELEMENT CHANNEL (BANNER,DAY+)>

<!ELEMENT BANNER (#PCDATA)>

<!ELEMENT DAY (DATE,(HOLIDAY|PROGRAMSLOT+)+)>

<!ELEMENT HOLIDAY (#PCDATA)>

<!ELEMENT DATE (#PCDATA)>

<!ELEMENT PROGRAMSLOT (TIME,TITLE,DESCRIPTION?)>

<!ELEMENT TIME (#PCDATA)>

<!ELEMENT TITLE (#PCDATA)>

<!ELEMENT DESCRIPTION (#PCDATA)>

<!ATTLIST TVSCHEDULE NAME CDATA #REQUIRED>

<!ATTLIST CHANNEL CHAN CDATA #REQUIRED>

<!ATTLIST PROGRAMSLOT VTR CDATA #IMPLIED>

<!ATTLIST TITLE RATING CDATA #IMPLIED>

<!ATTLIST TITLE LANGUAGE CDATA #IMPLIED>

]>

XML entities

  • declared in the DTD
  • used to reference text or data defined elsewhere from within an XML document

<?xml version="1.0" encoding="utf-8"?>

<!DOCTYPE TVSCHEDULE SYSTEM "tvschedule.dtd">

<TVSCHEDULE>

23 of 117

Bad example - XML (XHTML) for tabular data

<?xml version="1.0" encoding="UTF-8"?>

<table>

<tr>

<th>Name</th>

<th>Age</th>

<th>Coffees per day</th>

</tr>

<tr>

<td>John</td>

<td>20</td>

<td>2</td>

</tr>

<tr>

<td>Jane</td>

<td>18</td>

<td>1</td>

</tr>

<tr>

<td>Steve</td>

<td>31</td>

<td>5</td>

</tr>

</table>

Obviously, it CAN be done. But is it right?

  • XML elements do not reflect the meaning of data
    • they describe the generic structure of a table here
    • good maybe for printing and human reading

Name | Age | Coffees per day
John | 20 | 2
Jane | 18 | 1
Steve | 31 | 5

24 of 117

Bad example - XML for tabular data

<?xml version="1.0" encoding="UTF-8"?>

<consumers>

<consumer>

<name>John</name>

<age>20</age>

<coffees-per-day>2</coffees-per-day>

</consumer>

<consumer>

<name>Jane</name>

<age>18</age>

<coffees-per-day>1</coffees-per-day>

</consumer>

<consumer>

<name>Steve</name>

<age>31</age>

<coffees-per-day>5</coffees-per-day>

</consumer>

</consumers>

Obviously, it CAN be done. But is it right?

  • Better usage of XML elements, but still
    • XML’s hierarchical nature is not used
      • it is not necessary here
    • We unnecessarily force data consumers to use more complex tools => CSV would fit better here
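To illustrate the point about CSV, the sketch below flattens the consumer records into CSV text using only Python's standard library. The function name flat_xml_to_csv and the FIELDS list are assumptions made for this example; the conversion is only this simple because every record has the same flat set of fields.

```python
import csv
import io
import xml.etree.ElementTree as ET

FLAT_XML = """<consumers>
  <consumer><name>John</name><age>20</age><coffees-per-day>2</coffees-per-day></consumer>
  <consumer><name>Jane</name><age>18</age><coffees-per-day>1</coffees-per-day></consumer>
  <consumer><name>Steve</name><age>31</age><coffees-per-day>5</coffees-per-day></consumer>
</consumers>"""

# Column names are taken from the element names of the example.
FIELDS = ["name", "age", "coffees-per-day"]


def flat_xml_to_csv(xml_text: str) -> str:
    """Write one CSV row per <consumer>, one column per child element."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(FIELDS)
    for consumer in root.findall("consumer"):
        writer.writerow([consumer.findtext(field) for field in FIELDS])
    return out.getvalue()


print(flat_xml_to_csv(FLAT_XML))
```

Once a record contains a repeating or nested part, such a one-row-per-record mapping no longer works without redundancy or loss.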

Name | Age | Coffees per day
John | 20 | 2
Jane | 18 | 1
Steve | 31 | 5

25 of 117

Example - XML

<?xml version="1.0" encoding="UTF-8"?>

<consumers>

<consumer>

<name>John</name>

<drinks>

<coffee-type when="morning">V60</coffee-type>

<coffee-type when="after lunch">Batch brew</coffee-type>

<coffee-type when="afternoon">Flat white</coffee-type>

</drinks>

<age>20</age>

<coffees-per-day>2</coffees-per-day>

</consumer>

<consumer>

<name>Jane</name>

<drinks>

<coffee-type when="morning">Aeropress</coffee-type>

<coffee-type when="after lunch">Cappuccino</coffee-type>

<coffee-type when="afternoon">Double espresso</coffee-type>

</drinks>

<age>18</age>

<coffees-per-day>1</coffees-per-day>

</consumer>

</consumers>

Name | Drinks | Age | Coffees per day
John | V60, Batch brew, Flat white | 20 | 2
Jane | Aeropress, Cappuccino, Double espresso | 18 | 1

❌ - violates 1NF and still misses information

Name | Drinks | When | Age | Coffees per day
John | V60 | morning | 20 | 2
John | Batch brew | after lunch | 20 | 2
John | Flat white | afternoon | 20 | 2

“flattening” - starts being redundant ❌

26 of 117

Example - XML

<?xml version="1.0" encoding="UTF-8"?>

<consumers>

<consumer>

<name>John</name>

<drinks>

<coffee-type when="morning">V60</coffee-type>

<coffee-type when="after lunch">Batch brew</coffee-type>

<coffee-type when="afternoon">Flat white</coffee-type>

</drinks>

<age>20</age>

<coffees-per-day>2</coffees-per-day>

</consumer>

<consumer>

<name>Jane</name>

<drinks>

<coffee-type when="morning">Aeropress</coffee-type>

<coffee-type when="after lunch">Cappuccino</coffee-type>

<coffee-type when="afternoon">Double espresso</coffee-type>

</drinks>

<age>18</age>

<coffees-per-day>1</coffees-per-day>

</consumer>

</consumers>

Name | Age | Coffees per day
John | 20 | 2
Jane | 18 | 1

Name | Drinks | When
John | V60 | morning
John | Batch brew | after lunch
John | Flat white | afternoon

Starts getting complex for sharing ❌

27 of 117

Principal ways of XML processing in apps - DOM

How can we process XML data in an application?

1. Document object model (DOM)

Loads the entire XML document into memory

  • does not work for streams
  • does not work for large files (the whole document must fit in memory)
  • supports arbitrary querying (a.k.a. random access)

e.g. XPath /consumers/consumer[1]/name

<?xml version="1.0" encoding="UTF-8"?>

<consumers>

<consumer>

<name>John</name>

<drinks>

<coffee-type when="morning">V60</coffee-type>

<coffee-type when="after lunch">Batch brew</coffee-type>

<coffee-type when="afternoon">Flat white</coffee-type>

</drinks>

<age>20</age>

<coffees-per-day>2</coffees-per-day>

</consumer>

<consumer>

<name>Jane</name>

<drinks>

<coffee-type when="morning">Aeropress</coffee-type>

<coffee-type when="after lunch">Cappuccino</coffee-type>

<coffee-type when="afternoon">Double espresso</coffee-type>

</drinks>

<age>18</age>

<coffees-per-day>1</coffees-per-day>

</consumer>

</consumers>
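A minimal sketch of the in-memory approach, assuming Python's standard library: xml.dom.minidom gives a W3C-style DOM, and xml.etree.ElementTree gives a lighter tree with a limited XPath subset. The document is shortened and the variable names are made up for this example. Because the ElementTree root is the consumers element itself, the slide's path /consumers/consumer[1]/name becomes consumer[1]/name relative to it.

```python
import xml.etree.ElementTree as ET
from xml.dom.minidom import parseString

DOC = """<consumers>
  <consumer>
    <name>John</name>
    <drinks>
      <coffee-type when="morning">V60</coffee-type>
      <coffee-type when="after lunch">Batch brew</coffee-type>
    </drinks>
  </consumer>
  <consumer>
    <name>Jane</name>
    <drinks>
      <coffee-type when="morning">Aeropress</coffee-type>
    </drinks>
  </consumer>
</consumers>"""

# W3C-style DOM: the whole document is in memory, so any node can be
# reached in any order (random access).
dom = parseString(DOC)
second_consumer = dom.getElementsByTagName("consumer")[1]
jane = second_consumer.getElementsByTagName("name")[0].firstChild.data

# ElementTree: the same idea with XPath-like paths relative to the root.
root = ET.fromstring(DOC)
first_name = root.find("consumer[1]/name").text
morning = [e.text for e in root.findall(".//coffee-type[@when='morning']")]

print(jane, first_name, morning)
```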

28 of 117

Principal ways of XML processing in apps - SAX

How can we process XML data in an application?

2. Simple API for XML (SAX)

Processes the XML file as a stream of events

  • works for streams
  • works for large files
  • does not support efficient arbitrary querying (a.k.a. random access)

The SAX parser pushes events to your code

  • until it has read the whole file

<?xml version="1.0" encoding="UTF-8"?>

<consumers>

<consumer>

<name>John</name>

<drinks>

<coffee-type when="morning">V60</coffee-type>

<coffee-type when="after lunch">Batch brew</coffee-type>

<coffee-type when="afternoon">Flat white</coffee-type>

</drinks>

<age>20</age>

<coffees-per-day>2</coffees-per-day>

</consumer>

<consumer>

<name>Jane</name>

<drinks>

<coffee-type when="morning">Aeropress</coffee-type>

<coffee-type when="after lunch">Cappuccino</coffee-type>

<coffee-type when="afternoon">Double espresso</coffee-type>

</drinks>

<age>18</age>

<coffees-per-day>1</coffees-per-day>

</consumer>

</consumers>

29 of 117

Principal ways of XML processing in apps - SAX

SAX parser pushes events to your code:

  1. Element start (name = "consumers")
  2. Element start (name = "consumer")
  3. Element start (name = "name")
  4. Text value (value = "John")
  5. Element end (name = "name")
  6. Element start (name="drinks")
  7. Element start (name="coffee-type", attributes: when="morning")
  8. Text value (value="V60")
  9. Element end (name="coffee-type")
  10. ...

<?xml version="1.0" encoding="UTF-8"?>

<consumers>

<consumer>

<name>John</name>

<drinks>

<coffee-type when="morning">V60</coffee-type>

<coffee-type when="after lunch">Batch brew</coffee-type>

<coffee-type when="afternoon">Flat white</coffee-type>

</drinks>

<age>20</age>

<coffees-per-day>2</coffees-per-day>

</consumer>

<consumer>

<name>Jane</name>

<drinks>

<coffee-type when="morning">Aeropress</coffee-type>

<coffee-type when="after lunch">Cappuccino</coffee-type>

<coffee-type when="afternoon">Double espresso</coffee-type>

</drinks>

<age>18</age>

<coffees-per-day>1</coffees-per-day>

</consumer>

</consumers>
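The event stream can be observed with a small handler. This is a sketch assuming Python's xml.sax module; the class name EventLogger and the shortened document are made up for this example. In this API the attributes of an element arrive together with its start event rather than as a separate event.

```python
import xml.sax


class EventLogger(xml.sax.ContentHandler):
    """Records the events that the SAX parser pushes to our code."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # Attributes are delivered as part of the element start event.
        self.events.append(("start", name, dict(attrs.items())))

    def characters(self, content):
        # Skip whitespace-only text between elements.
        if content.strip():
            self.events.append(("text", content))

    def endElement(self, name):
        self.events.append(("end", name))


DOC = (b"<consumers><consumer><name>John</name><drinks>"
       b'<coffee-type when="morning">V60</coffee-type>'
       b"</drinks></consumer></consumers>")

handler = EventLogger()
xml.sax.parseString(DOC, handler)  # the parser drives; callbacks just react

for event in handler.events:
    print(event)
```

The handler never sees the document as a whole, which is why arbitrary querying would require the application to keep its own state.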

30 of 117

Principal ways of XML processing in apps - StAX

How can we process XML data in an application?

3. Streaming API for XML (StAX)

Processes the XML file as a stream of events

  • works for streams
  • works for large files
  • does not support efficient arbitrary querying (a.k.a. random access)

Your code pulls events from the StAX parser

  • you can stop processing when done

<?xml version="1.0" encoding="UTF-8"?>

<consumers>

<consumer>

<name>John</name>

<drinks>

<coffee-type when="morning">V60</coffee-type>

<coffee-type when="after lunch">Batch brew</coffee-type>

<coffee-type when="afternoon">Flat white</coffee-type>

</drinks>

<age>20</age>

<coffees-per-day>2</coffees-per-day>

</consumer>

<consumer>

<name>Jane</name>

<drinks>

<coffee-type when="morning">Aeropress</coffee-type>

<coffee-type when="after lunch">Cappuccino</coffee-type>

<coffee-type when="afternoon">Double espresso</coffee-type>

</drinks>

<age>18</age>

<coffees-per-day>1</coffees-per-day>

</consumer>

</consumers>
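StAX itself is a Java API. As a rough analogue of the pull style, the sketch below uses Python's xml.etree.ElementTree.iterparse, where the application's loop asks for the next event and can stop as soon as it has what it needs. The function name first_consumer_name and the shortened document are assumptions made for this example.

```python
import io
import xml.etree.ElementTree as ET

DOC = (b"<consumers>"
       b"<consumer><name>John</name><age>20</age></consumer>"
       b"<consumer><name>Jane</name><age>18</age></consumer>"
       b"</consumers>")


def first_consumer_name(stream):
    """Pull events until the first <name> has ended, then stop.

    Returns the name and the number of events pulled from the parser.
    """
    pulled = 0
    for event, elem in ET.iterparse(stream, events=("start", "end")):
        pulled += 1
        if event == "end" and elem.tag == "name":
            return elem.text, pulled  # done: no further events are requested
    return None, pulled


name, pulled = first_consumer_name(io.BytesIO(DOC))
print(name, pulled)
```

Here the loop stops after the fourth event, while a push-style parser would keep delivering events for the rest of the document.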

31 of 117

What can be done with XML?

Exploitation

Parsing

  • DOM, SAX, StAX, LINQ

Validation

  • XML Schema (XSD)
    • most wide-spread
  • RelaxNG (out of scope)
    • XSD alternative
  • DTD (out of scope)
    • basically deprecated
  • Schematron (out of scope)
    • rule-based validation

Querying

  • XPath
  • XQuery (out of scope)

Transformation

  • XSLT

Persistence

  • XML databases (out of scope)
  • SQL/XML (out of scope)

Message transfer

  • Web services (SOAP, WSDL)

32 of 117

Specific XML formats example - SVG

Scalable Vector Graphics (SVG) 1.1 (Second Edition), W3C Recommendation 2011

Scalable Vector Graphics (SVG) 2, W3C Candidate Recommendation 2018

Explained later in the course

<svg height="210" width="500">

<polygon points="200,10 250,190 160,210" style="fill:lime;stroke:purple;stroke-width:1" />

</svg>

33 of 117

XML Schema

34 of 117

XML Schema - Example

<?xml version="1.0" encoding="UTF-8"?>

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"

xmlns:vc="http://www.w3.org/2007/XMLSchema-versioning"

elementFormDefault="qualified" attributeFormDefault="unqualified"

vc:minVersion="1.1">

<xs:complexType name="TypeAddress">

<!-- specification of content -->

<xs:sequence>

<xs:element name="Street" type="xs:string"/>

<xs:element name="Number" type="xs:integer"/>

<xs:element name="City" type="xs:string"/>

</xs:sequence>

<!-- specification of attributes -->

<xs:attribute name="Country" type="xs:string" default="CZ"/>

</xs:complexType>

<xs:element name="Address" type="TypeAddress"/>

</xs:schema>

35 of 117

XSD and XML documents / instances

<?xml version="1.0" encoding="utf-8"?>

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

... <!-- XML schema definition -->

</xs:schema>

<?xml version="1.0" encoding="utf-8"?>

<root_element_of_XML_document

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xsi:noNamespaceSchemaLocation="schema2.xsd">

... <!-- XML document -->

</root_element_of_XML_document>

Link to XML Schema

36 of 117

XML document validity

An XML document is valid iff it is well-formed and conforms to a given schema.

(the schema can be written in XML Schema, but also in DTD, Relax NG, Schematron, ...)

37 of 117

XML Schema

  • W3C Recommendations
  • Currently XML Schema 1.1, 2012
    • as with XML 1.0/1.1, lots of parsers and validators still only support XML Schema 1.0

38 of 117

XML Schema 1.1 Simple types

W3C XML Schema Definition Language (XSD) 1.1 Part 2: Datatypes

Most common built-in simple data types:

  • boolean
  • anyURI
  • date, time, dateTime, dateTimeStamp
  • decimal, integer, double
  • hexBinary, base64Binary
  • gYear

39 of 117

XML Schema - Basic Principles

Definitions of:

  • Data types
    • Simple (simpleType)
    • Complex (complexType)
  • Elements (element)
    • Groups of elements (group)
  • Attributes (attribute)
    • Groups of attributes (attributeGroup)

40 of 117

XSD - Element definition

<?xml version="1.0" encoding="UTF-8"?>

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

<xs:element name="Catalog"/>

</xs:schema>

<?xml version="1.0" encoding="UTF-8"?>

<Catalog

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xsi:noNamespaceSchemaLocation="Schema01.xsd"

/>

<?xml version="1.0" encoding="UTF-8"?>

<Catalog1

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xsi:noNamespaceSchemaLocation="Schema01.xsd"

/>

<?xml version="1.0" encoding="UTF-8"?>

<Catalog

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xsi:noNamespaceSchemaLocation="Schema01.xsd"

>

<Dataset>

<test/>

</Dataset>

</Catalog>

41 of 117

XSD - Element definition

<?xml version="1.0" encoding="UTF-8"?>

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

<xs:element name="Catalog"/>

<xs:element name="Catalog1"/>

</xs:schema>

<?xml version="1.0" encoding="UTF-8"?>

<Catalog

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xsi:noNamespaceSchemaLocation="Schema02.xsd"

/>

<?xml version="1.0" encoding="UTF-8"?>

<Catalog1

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xsi:noNamespaceSchemaLocation="Schema02.xsd"

/>

42 of 117

Does the element have subelements or attributes?

  • No: simpleType, e.g. <e>aaaa</e>
    • restriction
    • list
    • union
    • enumeration

43 of 117

XSD - Element simple type definition

<?xml version="1.0" encoding="UTF-8"?>

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

<xs:element name="Catalog" type="xs:boolean"/>

</xs:schema>

<?xml version="1.0" encoding="UTF-8"?>

<Catalog

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xsi:noNamespaceSchemaLocation="Schema03.xsd"

/>

<?xml version="1.0" encoding="UTF-8"?>

<Catalog

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xsi:noNamespaceSchemaLocation="Schema03.xsd"

>

<Dataset>

<test/>

</Dataset>

</Catalog>

<?xml version="1.0" encoding="UTF-8"?>

<Catalog

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xsi:noNamespaceSchemaLocation="Schema03.xsd"

>false</Catalog>

44 of 117

XSD - Simple type restriction

<?xml version="1.0" encoding="UTF-8"?>

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

<xs:element name="Catalog">

<xs:simpleType>

<xs:restriction base="xs:integer">

<xs:minInclusive value="42"/>

</xs:restriction>

</xs:simpleType>

</xs:element>

</xs:schema>

<?xml version="1.0" encoding="UTF-8"?>

<Catalog

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xsi:noNamespaceSchemaLocation="Schema04.xsd"

>42</Catalog>

<?xml version="1.0" encoding="UTF-8"?>

<Catalog

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xsi:noNamespaceSchemaLocation="Schema04.xsd"

>22</Catalog>

45 of 117

XSD - Named type definition and usage

<?xml version="1.0" encoding="UTF-8"?>

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

<xs:simpleType name="atLeast42">

<xs:restriction base="xs:integer">

<xs:minInclusive value="42"/>

</xs:restriction>

</xs:simpleType>

<xs:element name="Catalog" type="atLeast42"/>

</xs:schema>

<?xml version="1.0" encoding="UTF-8"?>

<Catalog

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xsi:noNamespaceSchemaLocation="Schema05.xsd"

>42</Catalog>

<?xml version="1.0" encoding="UTF-8"?>

<Catalog

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xsi:noNamespaceSchemaLocation="Schema05.xsd"

>22</Catalog>

46 of 117

XSD - Enumeration

<?xml version="1.0" encoding="UTF-8"?>

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

<xs:simpleType name="number135">

<xs:restriction base="xs:integer">

<xs:enumeration value="1"/>

<xs:enumeration value="3"/>

<xs:enumeration value="5"/>

</xs:restriction>

</xs:simpleType>

<xs:element name="Catalog" type="number135"/>

</xs:schema>

<?xml version="1.0" encoding="UTF-8"?>

<Catalog

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xsi:noNamespaceSchemaLocation="Schema05-3.xsd">

42

</Catalog>

<?xml version="1.0" encoding="UTF-8"?>

<Catalog

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xsi:noNamespaceSchemaLocation="Schema05-4.xsd">

3

</Catalog>
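The restriction examples above can be mimicked by hand to see which instance contents fall inside the restricted value space. The sketch below is not an XSD validator: the helpers parse_xs_integer, at_least_42 and number135 are hypothetical and only approximate xs:integer (optional sign plus decimal digits, surrounding whitespace collapsed) together with the minInclusive and enumeration facets.

```python
import re

# Lexical form of xs:integer: optional sign followed by decimal digits.
_INTEGER = re.compile(r"[+-]?[0-9]+")


def parse_xs_integer(lexical: str) -> int:
    """Collapse surrounding whitespace and parse an xs:integer literal."""
    collapsed = lexical.strip(" \t\r\n")
    if not _INTEGER.fullmatch(collapsed):
        raise ValueError(f"not an xs:integer: {lexical!r}")
    return int(collapsed)


def at_least_42(lexical: str) -> bool:
    """Mimics restriction base=xs:integer with minInclusive value=42."""
    try:
        return parse_xs_integer(lexical) >= 42
    except ValueError:
        return False


def number135(lexical: str) -> bool:
    """Mimics restriction base=xs:integer with enumeration 1, 3, 5."""
    try:
        return parse_xs_integer(lexical) in {1, 3, 5}
    except ValueError:
        return False


print(at_least_42("42"), at_least_42("22"))
print(number135("\n42\n"), number135("\n3\n"))
```

Under these assumptions the contents 42 and 3 are accepted by their respective types, while 22 (below the minimum) and 42 (not among the enumerated values) are rejected.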

47 of 117

Does the element have subelements or attributes?

  • No: simpleType, e.g. <e>aaaa</e>
    • restriction
    • list
    • union
    • enumeration
  • Yes: complexType. Does it have subelements?
    • No: simpleContent (text content, attributes, no subelements), e.g. <e a="##">aaaa</e>
      • restriction
      • extension

48 of 117

XSD - Complex type - simple content + attribute

<?xml version="1.0" encoding="UTF-8"?>

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

<xs:element name="Catalog">

<xs:complexType>

<xs:simpleContent>

<xs:extension base="xs:string">

<xs:attribute name="numberOfPublishers"

type="xs:integer"

use="required"/>

</xs:extension>

</xs:simpleContent>

</xs:complexType>

</xs:element>

</xs:schema>

<?xml version="1.0" encoding="UTF-8"?>

<Catalog

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xsi:noNamespaceSchemaLocation="Schema06.xsd"

>22</Catalog>

<?xml version="1.0" encoding="UTF-8"?>

<Catalog

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xsi:noNamespaceSchemaLocation="Schema06.xsd"

numberOfPublishers="42"

>someText</Catalog>

49 of 117

Does the element have subelements or attributes?

  • No: simpleType, e.g. <e>aaaa</e>
    • restriction
    • list
    • union
    • enumeration
  • Yes: complexType. Does it have subelements?
    • No: simpleContent (text content, attributes, no subelements), e.g. <e a="##">aaaa</e>
      • restriction
      • extension
    • Yes: complexContent (subelements, attributes), e.g. <e a="##"><e2></e2><e3></e3></e>
      • restriction
      • extension

50 of 117

XSD - Complex type - sequence

<?xml version="1.0" encoding="UTF-8"?>

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

<xs:element name="Catalog">

<xs:complexType>

<xs:sequence>

<xs:element

name="Dataset"

minOccurs="1"

maxOccurs="unbounded"/>

</xs:sequence>

</xs:complexType>

</xs:element>

</xs:schema>

<?xml version="1.0" encoding="UTF-8"?>

<Catalog

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xsi:noNamespaceSchemaLocation="Schema07.xsd"

>22</Catalog>

<?xml version="1.0" encoding="UTF-8"?>

<Catalog

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xsi:noNamespaceSchemaLocation="Schema07.xsd"

>

<Dataset>test</Dataset>

<Dataset><Distribution/></Dataset>

<Dataset></Dataset>

<Dataset/>

</Catalog>

51 of 117

XSD - Complex type - sequence & element order

<?xml version="1.0" encoding="UTF-8"?>

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

<xs:element name="Catalog">

<xs:complexType>

<xs:sequence>

<xs:element

name="Dataset"

minOccurs="1"

maxOccurs="unbounded"/>

<xs:element

name="DataService"/>

</xs:sequence>

</xs:complexType>

</xs:element>

</xs:schema>

<?xml version="1.0" encoding="UTF-8"?>

<Catalog

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xsi:noNamespaceSchemaLocation="Schema08.xsd"

>

<Dataset>test</Dataset>

<Dataset></Dataset>

<DataService/>

<Dataset/>

</Catalog>

52 of 117

XSD - Complex type - sequence & element order

<?xml version="1.0" encoding="UTF-8"?>

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

<xs:element name="Catalog">

<xs:complexType>

<xs:sequence>

<xs:element

name="Dataset"

minOccurs="1"

maxOccurs="unbounded"/>

<xs:element

name="DataService"/>

</xs:sequence>

</xs:complexType>

</xs:element>

</xs:schema>

<?xml version="1.0" encoding="UTF-8"?>

<Catalog

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xsi:noNamespaceSchemaLocation="Schema08.xsd"

>

<Dataset>test</Dataset>

<Dataset></Dataset>

<DataService/>

</Catalog>

53 of 117

XSD - Complex type - sequence & element order

<?xml version="1.0" encoding="UTF-8"?>

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

<xs:element name="Catalog">

<xs:complexType>

<xs:sequence maxOccurs="unbounded">

<xs:element

name="Dataset"

minOccurs="1"

maxOccurs="unbounded"/>

<xs:element

name="DataService"/>

</xs:sequence>

</xs:complexType>

</xs:element>

</xs:schema>

<?xml version="1.0" encoding="UTF-8"?>

<Catalog

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xsi:noNamespaceSchemaLocation="Schema09.xsd"

>

<Dataset>test</Dataset>

<Dataset></Dataset>

<DataService/>

<Dataset/>

</Catalog>

54 of 117

XSD - Complex type - sequence & element order

<?xml version="1.0" encoding="UTF-8"?>

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

<xs:element name="Catalog">

<xs:complexType>

<xs:sequence maxOccurs="unbounded">

<xs:element

name="Dataset"

minOccurs="1"

maxOccurs="unbounded"/>

<xs:element

name="DataService"/>

</xs:sequence>

</xs:complexType>

</xs:element>

</xs:schema>

<?xml version="1.0" encoding="UTF-8"?>

<Catalog

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xsi:noNamespaceSchemaLocation="Schema09.xsd"

>

<Dataset>test</Dataset>

<Dataset></Dataset>

<DataService/>

<Dataset/>

<DataService/>

</Catalog>
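One way to reason about the sequence examples is to read each content model as a regular expression over the names of the child elements. The sketch below is an illustration only, not an XSD validator: SCHEMA08 and SCHEMA09 are hand-written stand-ins for the two content models above, and the helpers child_names and conforms are made up for this example. It ignores text content, attributes and namespaces.

```python
import re
import xml.etree.ElementTree as ET

# sequence(Dataset 1..unbounded, DataService), occurring exactly once
SCHEMA08 = re.compile(r"(Dataset )+DataService ")
# the same sequence with maxOccurs="unbounded" on the sequence itself
SCHEMA09 = re.compile(r"((Dataset )+DataService )+")


def child_names(xml_text: str) -> str:
    """Return the child element names of the root, each followed by a space."""
    return "".join(child.tag + " " for child in ET.fromstring(xml_text))


def conforms(model: re.Pattern, xml_text: str) -> bool:
    """Check only the order and number of children against the model."""
    return model.fullmatch(child_names(xml_text)) is not None


TRAILING_DATASET = ("<Catalog><Dataset>test</Dataset><Dataset></Dataset>"
                    "<DataService/><Dataset/></Catalog>")
ONE_GROUP = ("<Catalog><Dataset>test</Dataset><Dataset></Dataset>"
             "<DataService/></Catalog>")
TWO_GROUPS = ("<Catalog><Dataset>test</Dataset><Dataset></Dataset>"
              "<DataService/><Dataset/><DataService/></Catalog>")

for label, doc in [("trailing Dataset", TRAILING_DATASET),
                   ("one group", ONE_GROUP), ("two groups", TWO_GROUPS)]:
    print(label, conforms(SCHEMA08, doc), conforms(SCHEMA09, doc))
```

Read this way, a trailing Dataset without a following DataService fits neither model, and repeating the whole group requires maxOccurs on the sequence.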

55 of 117

XSD - Complex type - sequence & element order

<xs:element name="Catalog">

<xs:complexType>

<xs:sequence>

<xs:element name="Datasets" minOccurs="1">

<xs:complexType>

<xs:sequence>

<xs:element

name="Dataset"

minOccurs="1"

maxOccurs="unbounded"/>

</xs:sequence>

</xs:complexType>

</xs:element>

<xs:element name="DataServices" minOccurs="1">

<xs:complexType>

<xs:sequence>

<xs:element

name="DataService"

minOccurs="0"

maxOccurs="unbounded"/>

</xs:sequence>

</xs:complexType>

</xs:element>

</xs:sequence>

</xs:complexType>

</xs:element>

<?xml version="1.0" encoding="UTF-8"?>

<Catalog

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xsi:noNamespaceSchemaLocation="Schema10.xsd">

<Datasets>

<Dataset>test</Dataset>

<Dataset/>

<Dataset/>

</Datasets>

<DataServices>

<DataService/>

<DataService/>

</DataServices>

</Catalog>

56 of 117

XSD - Complex type - sequence & element order

<xs:element name="Catalog">

<xs:complexType>

<xs:sequence>

<xs:element name="Datasets" minOccurs="1">

<xs:complexType>

<xs:sequence>

<xs:element

name="Dataset"

minOccurs="1"

maxOccurs="unbounded"/>

</xs:sequence>

</xs:complexType>

</xs:element>

<xs:element name="DataServices" minOccurs="1">

<xs:complexType>

<xs:sequence>

<xs:element

name="DataService"

minOccurs="0"

maxOccurs="unbounded"/>

</xs:sequence>

</xs:complexType>

</xs:element>

</xs:sequence>

</xs:complexType>

</xs:element>

<?xml version="1.0" encoding="UTF-8"?>

<Catalog

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xsi:noNamespaceSchemaLocation="Schema10.xsd">

<Datasets>

<Dataset>test</Dataset>

<Dataset/>

<Dataset/>

</Datasets>

<DataServices/>

</Catalog>

57 of 117

XSD - Complex type - choice

<?xml version="1.0" encoding="UTF-8"?>

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

<xs:element name="Catalog">

<xs:complexType>

<xs:choice>

<xs:element name="Dataset"/>

<xs:element name="DataService"/>

</xs:choice>

</xs:complexType>

</xs:element>

</xs:schema>

<?xml version="1.0" encoding="UTF-8"?>

<Catalog

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xsi:noNamespaceSchemaLocation="Schema11.xsd">

<Dataset/>

</Catalog>

<?xml version="1.0" encoding="UTF-8"?>

<Catalog

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xsi:noNamespaceSchemaLocation="Schema11.xsd">

<DataService>test</DataService>

</Catalog>

58 of 117

XSD - Element references

<?xml version="1.0" encoding="UTF-8"?>

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

<xs:element name="Dataset">

<xs:complexType>

<xs:sequence>

<xs:element name="name"/>

<xs:element name="description"/>

</xs:sequence>

</xs:complexType>

</xs:element>

<xs:element name="Catalog">

<xs:complexType>

<xs:sequence>

<xs:element ref="Dataset"/>

</xs:sequence>

</xs:complexType>

</xs:element>

</xs:schema>

<?xml version="1.0" encoding="UTF-8"?>

<Catalog

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xsi:noNamespaceSchemaLocation="Schema13.xsd">

<Dataset>

<name>My dataset</name>

<description>My dataset description</description>

</Dataset>

</Catalog>

59 of 117

XSD - namespaces

<xs:schema

xmlns:xs="http://www.w3.org/2001/XMLSchema"

targetNamespace="http://tempuri.org/">

<xs:element name="Add">

<xs:complexType>

<xs:sequence>

<xs:element name="intA" type="xs:int"

form="qualified"/>

<xs:element name="intB" type="xs:int"

form="qualified"/>

</xs:sequence>

</xs:complexType>

</xs:element>

</xs:schema>

<?xml version="1.0" encoding="UTF-8"?>

<n1:Add

xmlns:n1="http://tempuri.org/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://tempuri.org/ Untitled1.xsd">

<n1:intA>1</n1:intA>

<n1:intB>3</n1:intB>

</n1:Add>

Untitled1.xsd

document.xml

60 of 117

XSD - namespaces

<xs:schema

xmlns:xs="http://www.w3.org/2001/XMLSchema"

targetNamespace="http://tempuri.org/"

elementFormDefault="qualified">

<xs:element name="Add">

<xs:complexType>

<xs:sequence>

<xs:element name="intA"

type="xs:int"/>

<xs:element name="intB"

type="xs:int"/>

</xs:sequence>

</xs:complexType>

</xs:element>

</xs:schema>

<?xml version="1.0" encoding="UTF-8"?>

<n1:Add

xmlns:n1="http://tempuri.org/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://tempuri.org/ Untitled1.xsd">

<n1:intA>1</n1:intA>

<n1:intB>3</n1:intB>

</n1:Add>

Untitled1.xsd

document.xml

61 of 117

XSD - namespaces

61

<xs:schema

xmlns:xs="http://www.w3.org/2001/XMLSchema"

targetNamespace="http://tempuri.org/">

<xs:element name="Add">

<xs:complexType>

<xs:sequence>

<xs:element name="intA" type="xs:int"

form="unqualified"/>

<xs:element name="intB" type="xs:int"

form="qualified"/>

</xs:sequence>

</xs:complexType>

</xs:element>

</xs:schema>

<?xml version="1.0" encoding="UTF-8"?>

<n1:Add

xmlns:n1="http://tempuri.org/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://tempuri.org/ Untitled1.xsd">

<intA>1</intA>

<n1:intB>3</n1:intB>

</n1:Add>

Untitled1.xsd

document.xml
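
The effect of form="unqualified" versus form="qualified" can be observed in how a parser reports element names. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; it does not validate against the schema, it only prints the expanded names of the instance document above. The helper name expanded_names is hypothetical.

```python
import xml.etree.ElementTree as ET

# Instance document from the slide (schema location attributes omitted):
# intA is unqualified (no namespace), intB is qualified (in the target namespace).
DOCUMENT = """<n1:Add xmlns:n1="http://tempuri.org/">
  <intA>1</intA>
  <n1:intB>3</n1:intB>
</n1:Add>"""


def expanded_names(element):
    """Return the names of the element and its children as {namespace}local."""
    return [element.tag] + [child.tag for child in element]


root = ET.fromstring(DOCUMENT)
names = expanded_names(root)
for name in names:
    print(name)
```

An unqualified local element appears without a namespace even though its parent is in the target namespace; this is why the instance must write intA without the n1 prefix.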

62 of 117

More XML Schema - not covered here

  • additional simple types: union, list
  • combining multiple schemas: import, include, override
  • re-using existing types: complexContent extensions and restrictions
  • any - allowing content unspecified by the XSD
  • substitution groups
  • mixed content (text and elements)
  • (element) groups, attributeGroups
  • assertions - integrity constraints
  • ID/IDREFs, key, keyref
  • ...

62

63 of 117

XML Path Language - XPath

63

64 of 117

XPath - example

<?xml version="1.0" encoding="UTF-8"?>

<catalog>

<title xml:lang="en">My catalog</title>

<title xml:lang="cs">Můj katalog</title>

<description xml:lang="en">This is my dummy catalog</description>

<description xml:lang="cs">Toto je můj falešný katalog</description>

<contact-point>

<name xml:lang="en">John Doe</name>

<e-mail>mailto:john@doe.org</e-mail>

</contact-point>

<datasets>

<dataset>

<title xml:lang="en">Bikesharing in Brno</title>

<title xml:lang="cs">Sdílení kol v Brně</title>

<distributions>

<distribution>

<media-type>application/xml</media-type>

<downloadURL>http://brno.cz/myfile.xml</downloadURL>

</distribution>

<distribution>

<accessService>

<endpointURL>https://brno.cz/myAPI</endpointURL>

<title xml:lang="en">My API</title>

</accessService>

</distribution>

</distributions>

</dataset>

<dataset>

<title xml:lang="en">Bikesharing in Prague</title>

<title xml:lang="cs">Sdílení kol v Praze</title>

<distributions>

<distribution>

<title xml:lang="en">CSV</title>

<media-type>text/csv</media-type>

<downloadURL>http://praha.eu/myfile.csv</downloadURL>

</distribution>

</distributions>

</dataset>

</datasets>

</catalog>

/catalog/datasets/dataset/title

  • Element: title
  • Element: title
  • Element: title
  • Element: title


/catalog/datasets/dataset/title/text()

  • Text: Bikesharing in Brno
  • Text: Sdílení kol v Brně
  • Text: Bikesharing in Prague
  • Text: Sdílení kol v Praze

/catalog/datasets/dataset/title[@xml:lang="en"]/text()

  • Text: Bikesharing in Brno
  • Text: Bikesharing in Prague
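
The three queries above can be tried out with a small script. This is a sketch under stated assumptions: it uses Python's standard xml.etree.ElementTree, which implements only a subset of XPath (no text() node test, paths are relative to the root element), and the catalog is trimmed to the parts the queries touch.

```python
import xml.etree.ElementTree as ET

# Trimmed version of the catalog from the slide.
CATALOG = """<catalog>
  <title xml:lang="en">My catalog</title>
  <title xml:lang="cs">Můj katalog</title>
  <datasets>
    <dataset>
      <title xml:lang="en">Bikesharing in Brno</title>
      <title xml:lang="cs">Sdílení kol v Brně</title>
    </dataset>
    <dataset>
      <title xml:lang="en">Bikesharing in Prague</title>
      <title xml:lang="cs">Sdílení kol v Praze</title>
    </dataset>
  </datasets>
</catalog>"""

# ElementTree needs the xml prefix in its prefix map to resolve @xml:lang.
NS = {"xml": "http://www.w3.org/XML/1998/namespace"}

root = ET.fromstring(CATALOG)  # root is the <catalog> element

# /catalog/datasets/dataset/title, written relative to <catalog>
titles = root.findall("datasets/dataset/title")

# .../title/text(): there is no text() step here, so read the .text property
all_texts = [title.text for title in titles]

# .../title[@xml:lang="en"]/text()
english_titles = root.findall("datasets/dataset/title[@xml:lang='en']", NS)
english_texts = [title.text for title in english_titles]

print(all_texts)
print(english_texts)
```

A full XPath processor evaluates the expressions exactly as written on the slide; the relative paths and the .text access are workarounds specific to ElementTree.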

65 of 117

XPath - specifications

  • XML Path Language (XPath) 1.0
    • W3C Recommendation 1999
    • what we will cover mostly
  • XML Path Language (XPath) 2.0 (Second Edition)
    • W3C Recommendation 2010
    • most widely implemented
  • XML Path Language (XPath) 3.1
    • W3C Recommendation 2017

65

66 of 117

XPath Data Model

<?xml version="1.0" encoding="UTF-8"?>

<catalog>

<title xml:lang="en">My catalog</title>

<title xml:lang="cs">Můj katalog</title>

<description xml:lang="en">This is my dummy catalog</description>

<description xml:lang="cs">Toto je můj falešný katalog</description>

<contact-point>

<name xml:lang="en">John Doe</name>

<e-mail>mailto:john@doe.org</e-mail>

</contact-point>

<datasets>

<dataset>

<title xml:lang="en">Bikesharing in Brno</title>

<title xml:lang="cs">Sdílení kol v Brně</title>

<distributions>

<distribution>

<media-type>application/xml</media-type>

<downloadURL>http://brno.cz/myfile.xml</downloadURL>

</distribution>

<distribution>

<accessService>

<endpointURL>https://brno.cz/myAPI</endpointURL>

<title xml:lang="en">My API</title>

</accessService>

</distribution>

</distributions>

</dataset>

<dataset>

<title xml:lang="en">Bikesharing in Prague</title>

<title xml:lang="cs">Sdílení kol v Praze</title>

<distributions>

<distribution>

<title xml:lang="en">CSV</title>

<media-type>text/csv</media-type>

<downloadURL>http://praha.eu/myfile.csv</downloadURL>

</distribution>

</distributions>

</dataset>

</datasets>

</catalog>

66

document (root)
  element <catalog>
    element <title>
      attribute "xml:lang" = "en"
      text "My catalog"
    element <title>
      attribute "xml:lang" = "cs"
      text "Můj katalog"
    element <description>
    element <datasets>
      element <dataset>
        element <title>
          attribute "xml:lang" = "en"
          text "Bikesharing in Brno"
        element <title>
          attribute "xml:lang" = "cs"
          text "Sdílení kol v Brně"
        element <distributions>
          element <distribution>
      element <dataset>

67 of 117

XPath Data Model

XPath types of nodes

  • document (root)
  • element
  • attribute
  • text
  • comment
  • namespace
  • processing instruction

67


68 of 117

XPath basic examples

68

69 of 117

XPath - example

/

69


70 of 117

XPath - example

/catalog

70


71 of 117

XPath - absolute path

/catalog/datasets

71

Absolute path: /step1/.../stepN

72 of 117

XPath - result set of nodes

/catalog/datasets/dataset

72

Result is a set of nodes (in no explicit order)

73 of 117

XPath - result set of nodes

/catalog/datasets/dataset/title

73


74 of 117

XPath - access function

/catalog/datasets/dataset/title/text()

74

Access function: text() (formally a node test that selects text nodes)

75 of 117

XPath - attribute

/catalog/datasets/dataset/title/@xml:lang

75

@attribute selects an attribute by name (abbreviation of the attribute:: axis)

76 of 117

XPath - predicate

/catalog/datasets/dataset/title[@xml:lang="en"]/text()

76

[predicate]: a logical expression; only nodes for which it is true are kept

77 of 117

XPath - relative path

title[@xml:lang="en"]/text()

77

Relative path: step1/.../stepN

Needs a starting point (the context node)

78 of 117

XPath - axes - child

/catalog/datasets

/child::catalog/child::datasets

78

XPath path step: axis::node-test [predicate1] ... [predicateN]

The default axis, child, can be omitted

79 of 117

XPath - axes - child

/catalog/child::*

/catalog/*

79

child::* selects only element children. Attributes are not children of an element, and text children are selected with text() or node(), not with *.

80 of 117

XPath - axes - descendant

/catalog/descendant::title

80


81 of 117

XPath - axes - descendant

/catalog/descendant::*

81


82 of 117

XPath - axes - attribute

/catalog/descendant::*/attribute::*

/catalog/descendant::*/@*

82


83 of 117

XPath - axes - preceding-sibling

/catalog/title/preceding-sibling::title/text()

83


84 of 117

XPath - axes - descendant-or-self

/catalog/descendant-or-self::title

/catalog//title

84


85 of 117

XPath - axes - self, parent

Starting in dataset

.

self::node()

..

parent::node()

85


86 of 117

XPath - all axes

86

ancestor

descendant

following

preceding

following-sibling

preceding-sibling

child

attribute

namespace

self

parent

87 of 117

XPath - document order

<?xml version="1.0" encoding="UTF-8"?>

<catalog>

<title xml:lang="en">My catalog</title>

<title xml:lang="cs">Můj katalog</title>

<description xml:lang="en">This is my dummy catalog</description>

<description xml:lang="cs">Toto je můj falešný katalog</description>

<contact-point>

<name xml:lang="en">John Doe</name>

<e-mail>mailto:john@doe.org</e-mail>

</contact-point>

<datasets>

<dataset>

<title xml:lang="en">Bikesharing in Brno</title>

<title xml:lang="cs">Sdílení kol v Brně</title>

<distributions>

<distribution>

<media-type>application/xml</media-type>

<downloadURL>http://brno.cz/myfile.xml</downloadURL>

</distribution>

<distribution>

<accessService>

<endpointURL>https://brno.cz/myAPI</endpointURL>

<title xml:lang="en">My API</title>

</accessService>

</distribution>

</distributions>

</dataset>

<dataset>

<title xml:lang="en">Bikesharing in Prague</title>

<title xml:lang="cs">Sdílení kol v Praze</title>

<distributions>

<distribution>

<title xml:lang="en">CSV</title>

<media-type>text/csv</media-type>

<downloadURL>http://praha.eu/myfile.csv</downloadURL>

</distribution>

</distributions>

</dataset>

</datasets>

</catalog>

87

Document order: nodes are ordered according to the position of the start tags of their elements.
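
Document order can be illustrated by numbering elements as a parser meets their start tags. This is a minimal sketch, assuming Python's standard xml.etree.ElementTree and a trimmed catalog; it numbers elements only, whereas XPath document order also covers the document node, attributes and text nodes.

```python
import xml.etree.ElementTree as ET

# Trimmed catalog; only the element structure matters here.
CATALOG = """<catalog>
  <title>My catalog</title>
  <datasets>
    <dataset>
      <title>Bikesharing in Brno</title>
    </dataset>
    <dataset>
      <title>Bikesharing in Prague</title>
    </dataset>
  </datasets>
</catalog>"""

root = ET.fromstring(CATALOG)

# iter() walks depth-first, so elements come out in the order of their start tags.
order = [(index, element.tag) for index, element in enumerate(root.iter())]

for index, tag in order:
    print(index, tag)
```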

88 of 117

XPath - name()

/catalog/datasets/name()

88


89 of 117

XPath - position()

/catalog/datasets/dataset/position()

  • 1 (xs:integer)
  • 2 (xs:integer)

89


90 of 117

XPath - position()

/catalog/datasets/dataset[position() = 2]

/catalog/datasets/dataset[2]

90


91 of 117

XPath - last()

/catalog/datasets/dataset[last()]

/catalog/datasets/dataset[position() = last()]
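
Positional predicates can be checked on a two-dataset catalog, where the second dataset is also the last one. This is a sketch assuming Python's standard xml.etree.ElementTree, which supports the abbreviated forms [2] and [last()] but not the position() function itself; the catalog is trimmed for brevity.

```python
import xml.etree.ElementTree as ET

# Trimmed catalog with two datasets.
CATALOG = """<catalog>
  <datasets>
    <dataset><title>Bikesharing in Brno</title></dataset>
    <dataset><title>Bikesharing in Prague</title></dataset>
  </datasets>
</catalog>"""

root = ET.fromstring(CATALOG)  # root is the <catalog> element

# /catalog/datasets/dataset[2], relative to <catalog>; positions start at 1
second = root.findall("datasets/dataset[2]")

# /catalog/datasets/dataset[last()]
last = root.findall("datasets/dataset[last()]")

second_title = second[0].find("title").text
last_title = last[0].find("title").text
print(second_title, "|", last_title)
```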

91


92 of 117

92

Axes

  • self
  • child
  • descendant
  • parent
  • ancestor
  • attribute
  • following(-sibling)
  • preceding(-sibling)

Node tests

  • name - node with a particular name
  • * - node with an arbitrary name
  • node() - any node
  • text() - any text node

Abbreviations

  • / stands for /child::
  • /@ stands for /attribute::
  • /. stands for /self::node()
  • /.. stands for /parent::node()
  • // stands for /descendant-or-self::node()/

Functions

  • position() - position of the node in the result
  • last() - position of the last node in the result
  • count() - number of nodes in the node set passed as argument
  • normalize-space() - normalization of white space
  • name() - name of the node

93 of 117

XPath - common errors

“Select car rental companies in Hawaii which offer at least one cabrio”

Wrong: returns cars

/rental[state="Hawaii"]/offer/car[type="cabrio"]

Correct:

/rental[state="Hawaii" and offer/car[type="cabrio"]]
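
The difference between the two queries can be reproduced on sample data. This is a sketch under stated assumptions: the rentals wrapper element, the name elements and the three companies are hypothetical (the slide does not show the data), and Python's standard xml.etree.ElementTree is used, which does not support "and" or nested paths in predicates, so the correct query is emulated with a filter.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: two companies in Hawaii, one of them offering a cabrio.
RENTALS = """<rentals>
  <rental><name>Aloha Cars</name><state>Hawaii</state>
    <offer><car><type>cabrio</type></car><car><type>suv</type></car></offer>
  </rental>
  <rental><name>Island Vans</name><state>Hawaii</state>
    <offer><car><type>van</type></car></offer>
  </rental>
  <rental><name>Mainland Cabrios</name><state>Texas</state>
    <offer><car><type>cabrio</type></car></offer>
  </rental>
</rentals>"""

root = ET.fromstring(RENTALS)

# Wrong: the last step of the path is car, so cars are returned.
wrong = root.findall("rental[state='Hawaii']/offer/car[type='cabrio']")
wrong_tags = [element.tag for element in wrong]

# Correct: keep rental as the last step and test for a cabrio as a condition.
correct = [
    rental.find("name").text
    for rental in root.findall("rental[state='Hawaii']")
    if rental.find("offer/car[type='cabrio']") is not None
]

print(wrong_tags)
print(correct)
```

The last step of a path decides what is returned; everything that only restricts the result belongs in a predicate.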

94 of 117

XPath - common errors

“Select the last section in the book.”

Wrong: returns the last section in each chapter / section

//section[last()]

which expands to

/descendant-or-self::node()/section[last()]

Correct: /descendant::section[last()]

<book>
  <chapter>
    <section></section>
  </chapter>
  <chapter>
    <section>
      <section></section>
      <section>
        <section></section>
      </section>
    </section>
    <section></section>
  </chapter>
</book>
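
The per-parent behaviour of last() can be observed directly. This is a minimal sketch, assuming Python's standard xml.etree.ElementTree; the id attributes are added here only to tell the sections apart, and since ElementTree has no descendant axis, the correct query is emulated by taking the last item of the list of all sections.

```python
import xml.etree.ElementTree as ET

# The book from the slide, with hypothetical id attributes for identification.
BOOK = """<book>
  <chapter>
    <section id="s1"/>
  </chapter>
  <chapter>
    <section id="s2">
      <section id="s3"/>
      <section id="s4">
        <section id="s5"/>
      </section>
    </section>
    <section id="s6"/>
  </chapter>
</book>"""

root = ET.fromstring(BOOK)

# //section[last()]: last() is evaluated among the section children of each parent.
per_parent = [section.get("id") for section in root.findall(".//section[last()]")]

# /descendant::section[last()]: the last of all sections in document order.
last_in_book = root.findall(".//section")[-1].get("id")

print(per_parent)
print(last_in_book)
```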

95 of 117

Some XPath 2.0 features

  • result is a sequence (ordered)
    • ("a", 2, "c")[3] results in "c"
    • (1 to 10)[7] results in 7
  • conditional expressions
    • if (count(//dataset) > 1) then "Datasets" else "Dataset"
  • quantified expressions
    • //dataset[some $title in title satisfies $title/@xml:lang="en"]
  • for expressions (loops)
    • for $dataset in //dataset return count($dataset//distribution)
  • comments
    • (:comment:)

95

96 of 117

  • mapping operator !
    • //dataset ! count(descendant::distribution)
    • results (one count per dataset, in document order): 2, 1
  • string concatenation operator ||
    • //dataset[1]/title[@xml:lang="en"] || //dataset[1]/title[@xml:lang="cs"]
    • equivalent to concat(//dataset[1]/title[@xml:lang="en"], //dataset[1]/title[@xml:lang="cs"])
  • function chaining with => (to avoid deep nesting)
    • upper-case(/descendant::dataset[1]/title[1])
    • /descendant::dataset[1]/title[1] => upper-case()

96

97 of 117

XSL Transformations - XSLT

97

98 of 117

XSLT - example

<?xml version="1.0" encoding="UTF-8"?>

<catalog>

<title xml:lang="en">My catalog</title>

<title xml:lang="cs">Můj katalog</title>

<description xml:lang="en">This is my dummy catalog</description>

<description xml:lang="cs">Toto je můj falešný katalog</description>

<contact-point>

<name xml:lang="en">John Doe</name>

<e-mail>mailto:john@doe.org</e-mail>

</contact-point>

<datasets>

<dataset>

<title xml:lang="en">Bikesharing in Brno</title>

<title xml:lang="cs">Sdílení kol v Brně</title>

<distributions>

<distribution>

<media-type>application/xml</media-type>

<downloadURL>http://brno.cz/myfile.xml</downloadURL>

</distribution>

<distribution>

<accessService>

<endpointURL>https://brno.cz/myAPI</endpointURL>

<title xml:lang="en">My API</title>

</accessService>

</distribution>

</distributions>

</dataset>

<dataset>

<title xml:lang="en">Bikesharing in Prague</title>

<title xml:lang="cs">Sdílení kol v Praze</title>

<distributions>

<distribution>

<title xml:lang="en">CSV</title>

<media-type>text/csv</media-type>

<downloadURL>http://praha.eu/myfile.csv</downloadURL>

</distribution>

</distributions>

</dataset>

</datasets>

</catalog>

<html xmlns:fn="http://www.w3.org/2005/xpath-functions" xmlns:xs="http://www.w3.org/2001/XMLSchema">

<head>

<title>My catalog</title>

</head>

<body>

<h1>My catalog</h1>

<h2>Bikesharing in Brno</h2>

<p>Number of distributions: 2</p>

<h2>Bikesharing in Prague</h2>

<p>Number of distributions: 1</p>

</body>

</html>

98

XSLT

99 of 117

XSLT - example

<?xml version="1.0" encoding="UTF-8"?>

<catalog>

<title xml:lang="en">My catalog</title>

<title xml:lang="cs">Můj katalog</title>

<description xml:lang="en">This is my dummy catalog</description>

<description xml:lang="cs">Toto je můj falešný katalog</description>

<contact-point>

<name xml:lang="en">John Doe</name>

<e-mail>mailto:john@doe.org</e-mail>

</contact-point>

<datasets>

<dataset>

<title xml:lang="en">Bikesharing in Brno</title>

<title xml:lang="cs">Sdílení kol v Brně</title>

<distributions>

<distribution>

<media-type>application/xml</media-type>

<downloadURL>http://brno.cz/myfile.xml</downloadURL>

</distribution>

<distribution>

<accessService>

<endpointURL>https://brno.cz/myAPI</endpointURL>

<title xml:lang="en">My API</title>

</accessService>

</distribution>

</distributions>

</dataset>

<dataset>

<title xml:lang="en">Bikesharing in Prague</title>

<title xml:lang="cs">Sdílení kol v Praze</title>

<distributions>

<distribution>

<title xml:lang="en">CSV</title>

<media-type>text/csv</media-type>

<downloadURL>http://praha.eu/myfile.csv</downloadURL>

</distribution>

</distributions>

</dataset>

</datasets>

</catalog>

<?xml version="1.0" encoding="UTF-8"?>

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:fn="http://www.w3.org/2005/xpath-functions">

<xsl:output method="html" encoding="UTF-8" indent="yes"/>

<xsl:template match="catalog">

<html>

<head>

<title>

<xsl:value-of select="title[@xml:lang='en']"/>

</title>

</head>

<body>

<h1>

<xsl:value-of select="title[@xml:lang='en']"/>

</h1>

<xsl:apply-templates/>

</body>

</html>

</xsl:template>

<xsl:template match="dataset">

<h2>

<xsl:value-of select="title[@xml:lang='en']"/>

</h2>

<p>Number of distributions: <xsl:value-of select="count(descendant::distribution)"/>

</p>

</xsl:template>

<xsl:template match="text()">

<xsl:apply-templates/>

</xsl:template>

</xsl:stylesheet>

99

100 of 117

XSLT - Specifications

  • XSL Transformations (XSLT) Version 1.0
    • W3C Recommendation, 1999
    • what we will cover mostly
  • XSL Transformations (XSLT) Version 2.0
    • W3C Recommendation, 2007
    • most widely implemented
  • XSL Transformations (XSLT) Version 3.0
    • W3C Recommendation, 2017

100

101 of 117

XSLT principles - stylesheet, template, processor

Input

  • one or more XML documents

Output

  • one or more text files
    • XML, HTML
    • RDF Turtle
    • TXT
    • ...

XSLT stylesheet

  • is an XML document
  • stylesheet root element
  • set of templates

XSLT template

  • matches part of input XML document using XPath expressions
  • produces output text

XSLT processor

  • goes through an input XML document
  • tries to match templates

101

102 of 117

XSLT - empty stylesheet

<?xml version="1.0" encoding="UTF-8"?>

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:fn="http://www.w3.org/2005/xpath-functions">

<xsl:output method="html" encoding="UTF-8" indent="yes" />

</xsl:stylesheet>

version attribute - version of XSLT used

xsl:output - specifies the output behavior of the XSLT processor

  • method
    • html, xhtml, xml - produces well-formed documents
    • text - pure text output
  • indent
    • yes - generates correct indentation for xml, html
    • no - only explicitly generated whitespace included in output

102

103 of 117

XSLT - first template

<?xml version="1.0" encoding="UTF-8"?>

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:fn="http://www.w3.org/2005/xpath-functions">

<xsl:output method="html" encoding="UTF-8" indent="yes" />

<xsl:template match="catalog">

<html>

<head>

<title>

<xsl:value-of select="title[@xml:lang='en']"/>

</title>

</head>

<body>

<h1>

<xsl:value-of select="title[@xml:lang='en']"/>

</h1>

</body>

</html>

</xsl:template>

</xsl:stylesheet>

xsl:template

  • match attribute - contains an XPath pattern; the template applies to nodes that match it
  • template content goes to the output
    • here, we generate the HTML stub
  • xsl: elements in the content get processed

e.g. xsl:value-of

  • select attribute contains an XPath expression
  • result of the expression replaces the xsl:value-of element in the output

103

104 of 117

XSLT - first template

<?xml version="1.0" encoding="UTF-8"?>

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:fn="http://www.w3.org/2005/xpath-functions">

<xsl:output method="html" encoding="UTF-8" indent="yes" />

<xsl:template match="catalog">

<html>

<head>

<title>

<xsl:value-of select="title[@xml:lang='en']"/>

</title>

</head>

<body>

<h1>

<xsl:value-of select="title[@xml:lang='en']"/>

</h1>

</body>

</html>

</xsl:template>

</xsl:stylesheet>

104

<html xmlns:fn="http://www.w3.org/2005/xpath-functions" xmlns:xs="http://www.w3.org/2001/XMLSchema">

<head>

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

<title>My catalog</title>

</head>

<body>

<h1>My catalog</h1>

</body>

</html>

  • output is indented
  • whitespace is normalized
  • encoding indicated in the head

105 of 117

XSLT - second template

<?xml version="1.0" encoding="UTF-8"?>

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:fn="http://www.w3.org/2005/xpath-functions">

<xsl:output method="html" encoding="UTF-8" indent="yes" />

<xsl:template match="catalog">...</xsl:template>

<xsl:template match="dataset">

<h2>

<xsl:value-of select="title[@xml:lang='en']"/>

</h2>

<p>

Number of distributions: <xsl:value-of select="count(descendant::distribution)"/>

</p>

</xsl:template>

</xsl:stylesheet>

105

<html xmlns:fn="http://www.w3.org/2005/xpath-functions" xmlns:xs="http://www.w3.org/2001/XMLSchema">

<head>

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

<title>My catalog</title>

</head>

<body>

<h1>My catalog</h1>

</body>

</html>

Nothing new in the output… why?

106 of 117

XSLT - apply templates

<?xml version="1.0" encoding="UTF-8"?>

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:fn="http://www.w3.org/2005/xpath-functions">

<xsl:output method="html" encoding="UTF-8" indent="yes" />

<xsl:template match="catalog">

<html>

<head>

<title>

<xsl:value-of select="title[@xml:lang='en']"/>

</title>

</head>

<body>

<h1>

<xsl:value-of select="title[@xml:lang='en']"/>

</h1>

<xsl:apply-templates/>

</body>

</html>

</xsl:template>

<xsl:template match="dataset">...</xsl:template>

</xsl:stylesheet>

106

<html xmlns:fn="http://www.w3.org/2005/xpath-functions" xmlns:xs="http://www.w3.org/2001/XMLSchema">

<head>

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

<title>My catalog</title>

</head>

<body>

<h1>My catalog</h1>

My catalog

Můj katalog

This is my dummy catalog

Toto je můj falešný katalog

John Doe

mailto:john@doe.org

<h2>Bikesharing in Brno</h2><p>Number of distributions: 2</p>

<h2>Bikesharing in Prague</h2><p>Number of distributions: 1</p>

</body>

</html>

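The dataset template is now applied, but unwanted text from the other elements appears in the output as well.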

107 of 117

XSLT - implicit templates

<?xml version="1.0" encoding="UTF-8"?>

<xsl:stylesheet version="2.0"

xmlns:xsl="http://www.w3.org/1999/XSL/Transform"

xmlns:xs="http://www.w3.org/2001/XMLSchema"

xmlns:fn="http://www.w3.org/2005/xpath-functions">

<xsl:template match="*|/">

<xsl:apply-templates/>

</xsl:template>

<xsl:template match="text()|@*">

<xsl:value-of select="."/>

</xsl:template>

<xsl:template match="processing-instruction()|comment()"/>

</xsl:stylesheet>

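  • Present implicitly in every stylesheet - override them to change the default behavior
  • Effect: the text content of elements is copied to the output; attribute values are copied only when attributes are selected explicitly, because the default xsl:apply-templates selects child::node()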
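
Why the built-in rules leak text into the output can be shown with a small simulation. This is only a sketch of the idea in Python, assuming the standard xml.etree.ElementTree module; it is not an XSLT processor. The function names are hypothetical, templates are matched by tag name only, and the source has no whitespace between tags to keep the output short.

```python
import xml.etree.ElementTree as ET

SOURCE = (
    "<catalog><title>My catalog</title><datasets>"
    "<dataset><title>Bikesharing in Brno</title></dataset>"
    "<dataset><title>Bikesharing in Prague</title></dataset>"
    "</datasets></catalog>"
)


def builtin_text(text):
    """Built-in rule for text nodes: copy the text to the output."""
    return text


def apply_templates(element, templates):
    """Process the children of element (text and elements) in document order."""
    text_rule = templates.get("text()", builtin_text)
    parts = []
    if element.text:
        parts.append(text_rule(element.text))
    for child in element:
        parts.append(process(child, templates))
        if child.tail:
            parts.append(text_rule(child.tail))
    return "".join(parts)


def process(element, templates):
    """Use the template registered for the tag, else the built-in element rule."""
    rule = templates.get(element.tag)
    if rule is not None:
        return rule(element, templates)
    return apply_templates(element, templates)  # built-in rule: just recurse


def dataset_template(element, templates):
    """Stand-in for the dataset template: emit the title as a heading."""
    return "<h2>" + element.find("title").text + "</h2>"


root = ET.fromstring(SOURCE)

# Only a dataset template: the catalog title leaks through the built-in rules.
leaky = process(root, {"dataset": dataset_template})

# Overriding the text rule with an empty template suppresses the leak.
clean = process(root, {"dataset": dataset_template, "text()": lambda text: ""})

print(leaky)
print(clean)
```

Registering an empty rule for text corresponds to adding an empty template for text() to the stylesheet.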

107

108 of 117

XSLT - implicit templates

<?xml version="1.0" encoding="UTF-8"?>

<xsl:stylesheet version="2.0"

xmlns:xsl="http://www.w3.org/1999/XSL/Transform"

xmlns:xs="http://www.w3.org/2001/XMLSchema"

xmlns:fn="http://www.w3.org/2005/xpath-functions">

<xsl:output method="html" encoding="UTF-8" indent="yes" />

<xsl:template match="catalog">

<html>

<head>

<title>

<xsl:value-of select="title[@xml:lang='en']"/>

</title>

</head>

<body>

<h1>

<xsl:value-of select="title[@xml:lang='en']"/>

</h1>

<xsl:apply-templates/>

</body>

</html>

</xsl:template>

<xsl:template match="dataset">

<h2>

<xsl:value-of select="title[@xml:lang='en']"/>

</h2>

<p>Number of distributions: <xsl:value-of select="count(descendant::distribution)"/>

</p>

</xsl:template>

<xsl:template match="text()"/>

</xsl:stylesheet>

108

<html xmlns:fn="http://www.w3.org/2005/xpath-functions" xmlns:xs="http://www.w3.org/2001/XMLSchema">

<head>

<title>My catalog</title>

</head>

<body>

<h1>My catalog</h1>

<h2>Bikesharing in Brno</h2>

<p>Number of distributions: 2</p>

<h2>Bikesharing in Prague</h2>

<p>Number of distributions: 1</p>

</body>

</html>

override the implicit template

109 of 117

XSLT - apply templates - select which

<?xml version="1.0" encoding="UTF-8"?>

<xsl:stylesheet version="2.0"

xmlns:xsl="http://www.w3.org/1999/XSL/Transform"

xmlns:xs="http://www.w3.org/2001/XMLSchema"

xmlns:fn="http://www.w3.org/2005/xpath-functions">

<xsl:output method="html" encoding="UTF-8" indent="yes" />

<xsl:template match="catalog">

<html>

<head>

<title>

<xsl:value-of select="title[@xml:lang='en']"/>

</title>

</head>

<body>

<h1>

<xsl:value-of select="title[@xml:lang='en']"/>

</h1>

<xsl:apply-templates select="datasets/dataset"/>

</body>

</html>

</xsl:template>

<xsl:template match="dataset">

<h2>

<xsl:value-of select="title[@xml:lang='en']"/>

</h2>

<p>Number of distributions: <xsl:value-of select="count(descendant::distribution)"/>

</p>

</xsl:template>

<xsl:template match="text()"/>

</xsl:stylesheet>


<html xmlns:fn="http://www.w3.org/2005/xpath-functions" xmlns:xs="http://www.w3.org/2001/XMLSchema">

<head>

<title>My catalog</title>

</head>

<body>

<h1>My catalog</h1>

<h2>Bikesharing in Brno</h2>

<p>Number of distributions: 2</p>

<h2>Bikesharing in Prague</h2>

<p>Number of distributions: 1</p>

</body>

</html>

The select attribute holds an XPath expression selecting the nodes to which the templates will be applied next.

Default: child::node()
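
The selected nodes can also be reordered before the templates are applied. The following is a minimal sketch, assuming the catalog structure used in the earlier slides; the sort key and the text output are illustrative choices, not part of the original stylesheet.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" encoding="UTF-8"/>

  <xsl:template match="catalog">
    <!-- Only dataset elements are selected, ordered by their English title.
         Omitting select would be the same as select="child::node()". -->
    <xsl:apply-templates select="datasets/dataset">
      <xsl:sort select="title[@xml:lang='en']"/>
    </xsl:apply-templates>
  </xsl:template>

  <xsl:template match="dataset">
    <!-- One English title per output line -->
    <xsl:value-of select="title[@xml:lang='en']"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Because only dataset elements are selected, no template suppressing text() is needed here: the other children of catalog are never visited.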

110 of 117

XSLT - named templates and parameters

<?xml version="1.0" encoding="UTF-8"?>

<xsl:stylesheet version="2.0"

xmlns:xsl="http://www.w3.org/1999/XSL/Transform"

xmlns:xs="http://www.w3.org/2001/XMLSchema"

xmlns:fn="http://www.w3.org/2005/xpath-functions">

<xsl:output method="html" encoding="UTF-8" indent="yes" />

<xsl:template match="catalog">

<html>

<head>

<xsl:call-template name="processTitle">

<xsl:with-param name="element">title</xsl:with-param>

<xsl:with-param name="lang">cs</xsl:with-param>

</xsl:call-template>

</head>

<body>

<xsl:call-template name="processTitle">

<xsl:with-param name="element">h1</xsl:with-param>

<xsl:with-param name="lang">en</xsl:with-param>

</xsl:call-template>

<xsl:apply-templates select="datasets/dataset"/>

</body>

</html>

</xsl:template>

<xsl:template match="dataset">

<xsl:call-template name="processTitle">

<xsl:with-param name="element">h2</xsl:with-param>

<xsl:with-param name="lang">en</xsl:with-param>

</xsl:call-template>

<p>Number of distributions: <xsl:value-of select="count(descendant::distribution)"/>

</p>

</xsl:template>

<xsl:template match="text()"/>

<xsl:template name="processTitle">

<xsl:param name="element" required="yes"/>

<xsl:param name="lang" required="yes"/>

<xsl:element name="{$element}">

<xsl:value-of select="title[@xml:lang=$lang]"/>

</xsl:element>

</xsl:template>

</xsl:stylesheet>

Named templates

  • name attribute instead of match attribute
  • accept parameters
    • xsl:param - definition in named template
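    • $variable - access to the parameter or variable value in XPath expressions
    • {$variable} - access to the value in attribute value templates, e.g. name="{$element}"
  • called using xsl:call-template
    • does not change the current node, so relative XPath expressions in the named template are still evaluated from it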
    • xsl:with-param - values passed when calling

xsl:element

  • creates an element on the output
  • name can be constant or {$variable}
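
A parameter can also carry a default value instead of being required. The sketch below is a variation on the processTitle template from this slide, under the assumption that English should be the fallback language; the default is not part of the original stylesheet.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" encoding="UTF-8" indent="yes"/>

  <xsl:template match="dataset">
    <!-- lang is not passed here, so the default 'en' is used -->
    <xsl:call-template name="processTitle">
      <xsl:with-param name="element" select="'h2'"/>
    </xsl:call-template>
  </xsl:template>

  <xsl:template match="text()"/>

  <xsl:template name="processTitle">
    <xsl:param name="element" required="yes"/>
    <!-- Assumption: a default value replaces required="yes" from the slide -->
    <xsl:param name="lang" select="'en'"/>
    <!-- The current node is still the dataset, so title is its child -->
    <xsl:element name="{$element}">
      <xsl:value-of select="title[@xml:lang=$lang]"/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```

Passing the values through the select attribute with a quoted string literal, as done here, is equivalent to putting the text into the body of xsl:with-param.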


111 of 117

XSLT - global variables, modes, if

<?xml version="1.0" encoding="UTF-8"?>

<xsl:stylesheet version="2.0"

xmlns:xsl="http://www.w3.org/1999/XSL/Transform"

xmlns:xs="http://www.w3.org/2001/XMLSchema"

xmlns:fn="http://www.w3.org/2005/xpath-functions">

<xsl:output method="html" encoding="UTF-8" indent="yes" />

<xsl:variable name="lang">en</xsl:variable>

<xsl:template match="catalog">

<html>

<head>

<xsl:apply-templates mode="head"/>

</head>

<body>

<xsl:apply-templates mode="catalog"/>

</body>

</html>

</xsl:template>

<xsl:template match="dataset" mode="head"/>

<xsl:template match="dataset" mode="catalog">

<xsl:apply-templates mode="dataset"/>

<p>Number of distributions: <xsl:value-of select="count(descendant::distribution)"/>

</p>

</xsl:template>

<xsl:template match="title" mode="head">

<xsl:if test="@xml:lang=$lang">

<xsl:element name="title">

<xsl:value-of select="text()"/>

</xsl:element>

</xsl:if>

</xsl:template>

<xsl:template match="title" mode="catalog">

<xsl:if test="@xml:lang=$lang">

<xsl:element name="h1">

<xsl:value-of select="text()"/>

</xsl:element>

</xsl:if>

</xsl:template>

<xsl:template match="text()" mode="#all"/>

</xsl:stylesheet>

Global variable

  • defined using xsl:variable as a direct child of the xsl:stylesheet root element
  • accessible in the whole stylesheet
    • e.g. $lang

Mode

  • ability to process the same nodes in different ways
    • different templates with the same match
  • selected with the mode attribute of xsl:apply-templates
  • declared with the mode attribute of xsl:template rules that have a match attribute
    • #all matches all modes
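
The stylesheet on this slide switches to mode "dataset" below each dataset but shows no template in that mode. The following is a reduced sketch of how such a template might look; the title rule in mode "dataset" and its h2 output are assumptions, not part of the original.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" encoding="UTF-8" indent="yes"/>
  <xsl:variable name="lang">en</xsl:variable>

  <xsl:template match="catalog">
    <xsl:apply-templates mode="catalog"/>
  </xsl:template>

  <!-- Catalog titles become h1 -->
  <xsl:template match="title" mode="catalog">
    <xsl:if test="@xml:lang=$lang">
      <h1><xsl:value-of select="."/></h1>
    </xsl:if>
  </xsl:template>

  <!-- Switch the mode below each dataset -->
  <xsl:template match="dataset" mode="catalog">
    <xsl:apply-templates mode="dataset"/>
  </xsl:template>

  <!-- Hypothetical rule: the same title match, handled differently in mode "dataset" -->
  <xsl:template match="title" mode="dataset">
    <xsl:if test="@xml:lang=$lang">
      <h2><xsl:value-of select="."/></h2>
    </xsl:if>
  </xsl:template>

  <xsl:template match="text()" mode="#all"/>
</xsl:stylesheet>
```

The built-in rules keep the current mode while descending, which is why the datasets wrapper element needs no rule of its own.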


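If

  • xsl:if processes its content only when the XPath expression in its test attribute evaluates to true
  • has no else branch; use xsl:choose for alternatives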

112 of 117

Some remaining XSLT 1.0 features

"Switch"

<xsl:choose>
  <xsl:when test='$level=1'>
    <xsl:number format="i"/>
  </xsl:when>
  <xsl:when test='$level=2'>
    <xsl:number format="a"/>
  </xsl:when>
  <xsl:otherwise>
    <xsl:number format="1"/>
  </xsl:otherwise>
</xsl:choose>

For each

<xsl:for-each select="item">
  <xsl:sort select="."/>
  <p>
    <xsl:number value="position()" format="1. "/>
    <xsl:value-of select="."/>
  </p>
</xsl:for-each>

Include - inclusion at the XML level; included templates have the same precedence as those of the including stylesheet

<xsl:stylesheet version="1.0"

xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:include href="article.xsl"/>

Import - templates in the importing stylesheet take precedence over the imported templates

<xsl:stylesheet version="1.0"

xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:import href="article.xsl"/>
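
An overriding template can still reuse the rule it overrides. The sketch below completes the import fragment above under one assumption: that article.xsl contains a rule matching title, which is hypothetical, as is the div wrapper.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- xsl:import must come before all other top-level elements -->
  <xsl:import href="article.xsl"/>

  <!-- Hypothetical: assumes article.xsl also has a rule matching title.
       This rule wins because it has higher import precedence. -->
  <xsl:template match="title">
    <div class="title">
      <!-- Invoke the overridden rule from the imported stylesheet -->
      <xsl:apply-imports/>
    </div>
  </xsl:template>
</xsl:stylesheet>
```

With xsl:include instead of xsl:import, two rules matching title with the same priority would conflict, and xsl:apply-imports would have no imported rule to fall back to.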


113 of 117

Some XSLT 2.0 and 3.0 features

Grouping of data

<xsl:for-each-group select="cities/city" group-by="@country">
  <tr>
    <td><xsl:value-of select="@country"/></td>
    <td>
      <xsl:value-of select="current-group()/@name" separator=", "/>
    </td>
    <td><xsl:value-of select="sum(current-group()/@pop)"/></td>
  </tr>
</xsl:for-each-group>

Multiple output documents

<xsl:result-document href="foo.html">
  <!-- add instructions to generate document content here -->
</xsl:result-document>

Regular expressions

<!-- This example transforms dates of the form "12/8/2003" into ISO 8601 standard form: "2003-12-08". -->
<xsl:analyze-string select="$date" regex="([0-9]+)/([0-9]+)/([0-9]{{4}})">
  <xsl:matching-substring>
    <xsl:number value="regex-group(3)" format="0001"/>
    <xsl:text>-</xsl:text>
    <xsl:number value="regex-group(1)" format="01"/>
    <xsl:text>-</xsl:text>
    <xsl:number value="regex-group(2)" format="01"/>
  </xsl:matching-substring>
</xsl:analyze-string>

Streaming

<xsl:template match="/">
  <xsl:source-document streamable="yes" href="books.xml">
    <xsl:iterate select="/books/book">
      <xsl:result-document href="{concat('book', position(),'.xml')}">
        <xsl:copy-of select="."/>
      </xsl:result-document>
      <xsl:next-iteration/>
    </xsl:iterate>
  </xsl:source-document>
</xsl:template>

Higher-order functions

<xsl:value-of select="$f1(2)"/>
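
The slide does not show where $f1 comes from. One possibility, given here purely as an assumption, is an XPath 3.0 inline function bound to a global variable. The sketch needs a processor that supports the optional higher-order functions feature of XSLT 3.0.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">
  <xsl:output method="text"/>

  <!-- Hypothetical definition of $f1: an inline function doubling its argument -->
  <xsl:variable name="f1"
      select="function($x as xs:integer) as xs:integer { $x * 2 }"/>

  <xsl:template match="/">
    <!-- Dynamic function call through the variable -->
    <xsl:value-of select="$f1(2)"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Function items can also be passed to other functions -->
    <xsl:value-of select="for-each(1 to 3, $f1)" separator=" "/>
  </xsl:template>
</xsl:stylesheet>
```

Applied to any input document, this should print 4 on the first line and 2 4 6 on the second.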

Text processing: CSV, JSON, … on input

<xsl:variable name="header" select="tokenize(unparsed-text-lines($csv)[1], $sep)"/>
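
To show how the header line might be used, here is a naive sketch that turns a CSV file into XML. The file name data.csv, the comma separator, and the rows/row/cell element names are all assumptions. Quoted fields containing the separator are not handled.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Assumptions: file location and separator are illustrative parameters.
       The separator is interpreted as a regular expression by tokenize(). -->
  <xsl:param name="csv" select="'data.csv'"/>
  <xsl:param name="sep" select="','"/>

  <xsl:template name="xsl:initial-template">
    <xsl:variable name="lines" select="unparsed-text-lines($csv)"/>
    <xsl:variable name="header" select="tokenize($lines[1], $sep)"/>
    <rows>
      <!-- Every line after the header becomes one row element -->
      <xsl:for-each select="tail($lines)">
        <xsl:variable name="cells" select="tokenize(., $sep)"/>
        <row>
          <!-- Pair each header name with the cell at the same position -->
          <xsl:for-each select="$header">
            <xsl:variable name="i" select="position()"/>
            <cell name="{.}">
              <xsl:value-of select="$cells[$i]"/>
            </cell>
          </xsl:for-each>
        </row>
      </xsl:for-each>
    </rows>
  </xsl:template>
</xsl:stylesheet>
```

The stylesheet has no source document; it is meant to be started at the named template xsl:initial-template, which XSLT 3.0 processors can use as an entry point.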


114 of 117

Examples


115 of 117

XSLT Example - IANA registry - generating HTML

<?xml version='1.0' encoding='UTF-8'?>

<?xml-stylesheet type="text/xsl" href="media-types.xsl"?>

<?oxygen RNGSchema="media-types.rng" type="xml"?>

<registry xmlns="http://www.iana.org/assignments" id="media-types">

<title>Media Types</title>

<category>Multipurpose Internet Mail Extensions (MIME) and Media Types</category>

<updated>2021-03-10</updated>

<registration_rule>Expert Review for Vendor and Personal Trees.</registration_rule>

<expert>Ned Freed, Alexey Melnikov, Murray Kucherawy (backup)</expert>

<xref type="rfc" data="rfc6838"/>

<xref type="rfc" data="rfc4855"/>

...


The xml-stylesheet processing instruction links to the XSLT stylesheet that transforms the XML to HTML

116 of 117

XSLT Example - Generating RDF Turtle

<?xml version="1.0" encoding="UTF-8"?>

<xsl:stylesheet version="2.0"

xmlns:xsl="http://www.w3.org/1999/XSL/Transform"

xmlns:xs="http://www.w3.org/2001/XMLSchema"

xmlns:fn="http://www.w3.org/2005/xpath-functions">

<xsl:output method="text" encoding="UTF-8" />

<xsl:variable name="prefix">https://ex.org/resource/</xsl:variable>

<xsl:variable name="catalogIRI" select="concat($prefix, 'Catalog')"/>

<xsl:template match="catalog">

@prefix dcat: &lt;http://www.w3.org/ns/dcat#&gt; .

@prefix dcterms: &lt;http://purl.org/dc/terms/&gt; .

&lt;<xsl:value-of select="$catalogIRI"/>&gt; a dcat:Catalog .

<xsl:apply-templates>

<xsl:with-param name="currentIRI" select="$catalogIRI"/>

</xsl:apply-templates>

</xsl:template>

<xsl:template match="dataset">

<xsl:variable name="datasetIRI" select="concat($prefix, 'dataset/', fn:position())"/>

&lt;<xsl:value-of select="$catalogIRI"/>&gt; dcat:dataset &lt;<xsl:value-of select="$datasetIRI"/>&gt; .

&lt;<xsl:value-of select="$datasetIRI"/>&gt; a dcat:Dataset .

<xsl:apply-templates select="title">

<xsl:with-param name="currentIRI" select="$datasetIRI"/>

</xsl:apply-templates>

</xsl:template>

<xsl:template match="title">

<xsl:param name="currentIRI"/>

&lt;<xsl:value-of select="$currentIRI"/>&gt; dcterms:title &quot;<xsl:value-of select="text()"/>&quot;@<xsl:value-of select="@xml:lang"/> .

</xsl:template>

<xsl:template match="text()" mode="#all"/>

</xsl:stylesheet>

@prefix dcat: <http://www.w3.org/ns/dcat#> .

@prefix dcterms: <http://purl.org/dc/terms/> .

<https://ex.org/resource/Catalog> a dcat:Catalog .

<https://ex.org/resource/Catalog> dcterms:title "My catalog"@en .

<https://ex.org/resource/Catalog> dcterms:title "Můj katalog"@cs .

<https://ex.org/resource/Catalog> dcat:dataset <https://ex.org/resource/dataset/2> .

<https://ex.org/resource/dataset/2> a dcat:Dataset .

<https://ex.org/resource/dataset/2> dcterms:title "Bikesharing in Brno"@en .

<https://ex.org/resource/dataset/2> dcterms:title "Sdílení kol v Brně"@cs .

<https://ex.org/resource/Catalog> dcat:dataset <https://ex.org/resource/dataset/4> .

<https://ex.org/resource/dataset/4> a dcat:Dataset .

<https://ex.org/resource/dataset/4> dcterms:title "Bikesharing in Prague"@en .

<https://ex.org/resource/dataset/4> dcterms:title "Sdílení kol v Praze"@cs .
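
The dataset IRIs in the output end in 2 and 4 rather than 1 and 2. A likely explanation is that templates are applied to all child nodes of datasets, including the whitespace-only text nodes between the dataset elements, and position() counts those as well. One possible remedy, sketched here as an assumption and reduced to the dataset rule, is to strip such whitespace from the input tree.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Drop whitespace-only text nodes from the input tree,
       so position() counts only the element siblings -->
  <xsl:strip-space elements="*"/>

  <xsl:variable name="prefix">https://ex.org/resource/</xsl:variable>

  <xsl:template match="dataset">
    <xsl:text>&lt;</xsl:text>
    <xsl:value-of select="concat($prefix, 'dataset/', position())"/>
    <xsl:text>&gt; a dcat:Dataset .&#10;</xsl:text>
  </xsl:template>

  <xsl:template match="text()"/>
</xsl:stylesheet>
```

An alternative that does not depend on how templates were applied is to compute the number from the tree itself, for example with count(preceding-sibling::dataset) + 1.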


117 of 117

Literature

Jiří Kosek - XML pro každého (2004!) - https://www.kosek.cz/xml/index.html (in Czech)
