XML and Databases (MSc/6+)

Dear Students,

This lecture will start on Thursday, February 25th, 2021, 08:15-09:45.

It will be held via Zoom: https://ethz.zoom.us/j/99856254757

The OLAT link is https://lms.uzh.ch/auth/RepositoryEntry/16979231153

Please find below the entire material (slides and demos of all chapters) of this lecture as well as a list of tips on how to prepare for the examination. More details on the online examination and the test run will be provided in due course.

Upfront tips for the examination preparation:

  1. The examination will be held online via EPIS. It takes 60 minutes and consists of 12 questions and 9 tasks, so that everybody gets the chance to show what (s)he has learned. However, you have to be very quick to finish ALL questions/tasks of the examination. So you should be really well prepared and should not rely on the fact that this is an open-book examination in which you can look things up. If you have no idea how to answer a given question/task, it might be better to skip it and proceed with the next one. Note that there is NO way to return to a question/task after having proceeded to the next one.
  2. The questions/tasks are related to the lecture chapters 1 to 9. Chapter 10 (Systems) can be ignored for the examination.
  3. Usually, there is at least one question/task for each of the chapters 1 to 9.
  4. The following topics are certain to appear in every examination of this lecture:
     • Chapter 1 (Introduction). You should be able to explain the differences between HTML, XML, and traditional semantic data models. It is also relevant that you have understood how the descriptor-oriented model works and why it represents one end of the spectrum of semi-structured data models. What are the gains and limits of such a data model? How could XML be used to improve information retrieval?
     • Chapter 2 (XML). You should know how XML documents are structured as well as the underlying data model and syntax of XML documents, XML document types, and XML schema documents. You should be able to write down a new XML schema data type or XML-DTD given an informal description of a data type. That is, you have to know the concepts of XML-DTD and XML schema in detail.
     • Chapter 3 (XML Processors). Here you do not have to look at the syntax of the different approaches. For the examination, it is only relevant to know how the different approaches work in principle and what their pros and cons are.
     • Chapter 4 (XML Query Languages). Given an XML document or XML-DTD, you should be able to write XQuery queries! So you should definitely take a very close look at the XPath/XQuery part of chapter 4 (slides 4-10 to 4-51). Your queries should be written in such a way that I can see which concept you are trying to apply. Small typos in keywords will therefore not cost you any points. For the section "Extending Query Languages with IR Functionality", there will be no concrete queries to write, so you can fully neglect the syntax of this part. However, there could be a question regarding the concepts for extending XQuery with Information Retrieval functionality (weighting, ranking, relevance-oriented search, etc.).
     • Chapter 5 (XML Updates). Given an XML document, you should be able to write an XQuery using the commands of the Update Facility to perform updates on the given XML document.
     • Chapter 6/7 (Mapping XML to DBMS and DBMS to XML). Given an XML schema document or XML-DTD, you should be able to map it to an SQL table definition! Given an SQL table definition, you should be able to map it onto an XML schema document or XML-DTD.
     • Chapter 8 (XML Indexing). Here I want to see that you know the different index structures and how they work in principle on XML values. So, the syntax of index creation is not relevant here.
     • Chapter 9 (SQL/XML). This chapter intends to give you all necessary information about the SQL/XML standard, so some slides are included only for completeness. Relevant for the examination are slides 1-24 and 29-31. Given an SQL table, you should be able to map the table definition to a corresponding XML schema document as well as the table content to a corresponding XML document, as shown on slide 9-24! Also, there will be a task where you have to apply the XQuery and XMLTable commands based on a given SQL table. That is, you have to be able to write an SQL query using these XML functions.
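To give a rough feeling for the kind of mapping meant in chapters 6/7 and 9, here is a minimal sketch that is not taken from the lecture material. It uses Python only for illustration; the examination itself asks for XQuery, XML-DTD/XML schema, and SQL. The sample document, the element and attribute names, and the table layout are all assumptions made up for this example. Note also that Python's ElementTree supports only a small subset of XPath, so the filter is written in plain Python.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document; all names and values are illustrative only.
XML_DOC = """
<lectures>
  <lecture id="1" credits="3">
    <title>XML and Databases</title>
    <lecturer>A. Example</lecturer>
  </lecture>
  <lecture id="2" credits="6">
    <title>Information Retrieval</title>
    <lecturer>B. Example</lecturer>
  </lecture>
</lectures>
"""


def titles_with_min_credits(root, min_credits):
    """Path-style selection, comparable in spirit to the XPath
    /lectures/lecture[@credits >= $min]/title."""
    return [
        lecture.findtext("title")
        for lecture in root.findall("lecture")
        if int(lecture.get("credits")) >= min_credits
    ]


def shred_to_table(root, conn):
    """XML -> SQL: each <lecture> element becomes one row; attributes and
    simple child elements become columns."""
    conn.execute(
        "CREATE TABLE lecture ("
        " id INTEGER PRIMARY KEY,"
        " credits INTEGER NOT NULL,"
        " title TEXT NOT NULL,"
        " lecturer TEXT)"
    )
    rows = [
        (
            int(lecture.get("id")),
            int(lecture.get("credits")),
            lecture.findtext("title"),
            lecture.findtext("lecturer"),
        )
        for lecture in root.findall("lecture")
    ]
    conn.executemany("INSERT INTO lecture VALUES (?, ?, ?, ?)", rows)


def table_to_xml(conn):
    """SQL -> XML: each row becomes one <lecture> element again."""
    root = ET.Element("lectures")
    query = "SELECT id, credits, title, lecturer FROM lecture ORDER BY id"
    for id_, credits, title, lecturer in conn.execute(query):
        lecture = ET.SubElement(root, "lecture", id=str(id_), credits=str(credits))
        ET.SubElement(lecture, "title").text = title
        ET.SubElement(lecture, "lecturer").text = lecturer
    return root


root = ET.fromstring(XML_DOC)
conn = sqlite3.connect(":memory:")
shred_to_table(root, conn)
print(titles_with_min_credits(root, 6))
print(ET.tostring(table_to_xml(conn), encoding="unicode"))
```

The design choice here is the simplest possible one: one table per repeating element, with attributes and single-valued child elements as columns. Nested or repeating child elements would need additional tables with foreign keys, which this sketch deliberately leaves out.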

Can Türker, 24.02.2021

Lecturer: Dr. Can Türker

Summary: Today, the W3C standard XML is widely used as a document format for exchanging data over the Internet. While the generation of XML data is easy, the management of XML data requires systems that can efficiently store, query, and process XML data. In other words, database technology is required for handling XML data. The goal of this lecture is to teach the interplay between XML and databases. The following aspects are studied in detail: the semi-structured data model of XML, query languages (XPath, XQuery) for declarative access to XML data, XML processor technologies, and the mapping between XML and databases, including efficient storage and index structures for XML data. A further central concern of this lecture is to show the practical relevance of all presented concepts by demonstrating how they are realized in major database management systems such as Oracle, IBM DB2, Microsoft SQL Server, and PostgreSQL.

Goal: Achieve deep understanding of XML and its interplay with database technology

Prerequisite: Databases (Bachelor level), i.e., basic knowledge of databases

Recommended: MSc Studies

Examination: 17.06.2021, 08:15-09:15

Credit Points: 3

Date: Thursday 08:15-09:45

Start: 25.02.2021

End: 03.06.2021

Schedule:

Week  Date        Topic                              Slides               Demos
8     25.02.2021  Introduction and Motivation        Slides (ZIP, 11 MB)  Demos (ZIP, 246 KB)
9     04.03.2021  XML & XML-DTD
10    11.03.2021  XML Schema & XML Processors
11    18.03.2021  XML Processors
12    25.03.2021  XML Query Languages
13    01.04.2021  XML Query Languages
14    08.04.2021  No lecture!
15    15.04.2021  XML Query Languages & XML Update
16    22.04.2021  Mapping between XML and Databases
17    29.04.2021  Mapping between XML and Databases
18    06.05.2021  XML Indexing & SQL/XML
19    13.05.2021  No lecture!
20    20.05.2021  SQL/XML
21    28.05.2021  XML Support in Database Systems
22    03.06.2021  XML Support in Database Systems
24    17.06.2021  Examination

Tools:

  • BaseX (open-source XML tool supporting XPath/XQuery 3.0 and XQuery Update Facility 1.0): http://basex.org

 

Student Assistant Positions Available! Student Assistant Job Announcement (PDF, 255 KB)

  • Web Application Programming on the J2EE Platform
  • Become a member of the B-Fabric Development Team: http://fgcz-bfabric.uzh.ch/wiki
  • Flexible working hours; full-time possible during semester holidays

If interested, send your CV to tuerker@fgcz.ethz.ch.