XML and Databases (MSc/6+)

Lecturer: Dr. Can Türker

Location: BIN 2.A.10

Date: Thursday 08:15-09:45 Start: 24.02.2022  End: 02.06.2022  Examination: 16.06.2022 (BIN 0.K.02)

Course Material via OLAT: https://lms.uzh.ch/auth/RepositoryEntry/17190519595

Tips for the examination preparation:

  1. The examination will be a classic written (paper) one. It takes 90 minutes, and the questions and tasks are prepared in such a way that everybody gets the chance to show what (s)he has learned. However, you have to be very quick to finish ALL questions/tasks of the examination. So you should be really well prepared and should not rely on the fact that you are allowed to bring a sheet with notes with you, BUT NO electronic devices.
  2. The questions/tasks are related to the lecture chapters 1 to 9. Chapter 10 (Systems) can be ignored for the examination.
  3. Usually, there is at least one question/task for each of the chapters 1 to 9.
  4. The following topics will surely come up in every examination of this lecture:
  5. Chapter 1 (Introduction). You should be able to explain the differences between HTML, XML, and traditional semantic data models. It is also relevant that you have understood how the descriptor-oriented model works and why it represents one end of the spectrum of semi-structured data models. What are the gains and limits of such a data model? How could XML be used to improve information retrieval?
  6. Chapter 2 (XML). You should know how XML documents are structured, as well as the underlying data model and the syntax of XML documents, XML document types, and XML schema documents. You should be able to write down a new XML schema data type or XML-DTD given an informal description of a data type. That is, you have to know the concepts of XML-DTD and XML schema in detail.
  7. Chapter 3 (XML Processors). Here you do not have to look at the syntax of the different approaches. For the examination, it is only relevant to know how the different approaches work in principle and what their pros and cons are.
  8. Chapter 4 (XML Query Languages). Given an XML document or XML-DTD, you should be able to write XQueries! So you should definitely have a very close look at the XPath/XQuery part (foils 4-10 to 4-51). Your queries should be written in such a way that I can see which concept you are trying to apply. So for a little typo in a keyword, there will be no drop in points. For the section "Extending Query Languages with IR Functionality", there will be no concrete queries to write. So you can fully neglect the syntax of this part. However, there could be a question regarding the concepts for extending XQuery with Information Retrieval functionality (weighting, ranking, relevance-oriented search, etc.).
  9. Chapter 5 (XML Updates). Given an XML document, you should be able to write an XQuery using the commands of the Update Facility to perform updates on the given XML document.
  10. Chapter 6/7 (Mapping XML to DBMS and DBMS to XML). Given an XML schema document or XML-DTD, you should be able to map it to an SQL table definition! Given an SQL table definition, you should be able to map it onto an XML schema document or XML-DTD.
  11. Chapter 8 (XML Indexing). Here I want to see that you know the different index structures and how they work in principle on XML values. So, the syntax of index creation is not relevant here.
  12. Chapter 9 (SQL/XML) intends to give you all necessary information about the SQL/XML standard. So there are some foils that are only there for completeness. Relevant for the examination are foils 1-24 and 29-31. Given an SQL table, you should be able to map the table definition to a corresponding XML schema document as well as the table content to a corresponding XML document, as shown in foil 9-24! Also, there will be a task where you have to apply the XQuery and XMLTable commands based on a given SQL table. That is, you have to be able to write an SQL query using these XML functions.

Can Türker, 09.06.2022
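
The mapping task from Chapter 6/7 in the tips above (XML to an SQL table definition) can be illustrated with a minimal sketch. It is not taken from the lecture material: the sample document, the table layout, and the function names shred_books and titles_after are assumptions made for illustration, and Python with its standard sqlite3 and xml.etree.ElementTree modules is used only to make the example runnable. Each book element becomes one row, attributes and child elements become columns.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document (an assumption, not from the lecture).
XML_DOC = """
<library>
  <book id="b1" year="2001">
    <title>XML Basics</title>
    <author>Ann Example</author>
  </book>
  <book id="b2" year="2015">
    <title>Databases in Practice</title>
    <author>Bob Sample</author>
  </book>
</library>
"""


def shred_books(xml_text, conn):
    """Map every <book> element to one row of a relational table."""
    # Attributes (id, year) and child elements (title, author)
    # are both mapped to columns of the same table.
    conn.execute(
        "CREATE TABLE book ("
        " id TEXT PRIMARY KEY,"
        " year INTEGER NOT NULL,"
        " title TEXT NOT NULL,"
        " author TEXT NOT NULL)"
    )
    root = ET.fromstring(xml_text)
    # "./book" uses ElementTree's limited XPath subset.
    for book in root.findall("./book"):
        conn.execute(
            "INSERT INTO book VALUES (?, ?, ?, ?)",
            (
                book.get("id"),
                int(book.get("year")),
                book.findtext("title"),
                book.findtext("author"),
            ),
        )
    conn.commit()


def titles_after(conn, year):
    """Return the titles of all books published after the given year."""
    rows = conn.execute(
        "SELECT title FROM book WHERE year > ? ORDER BY title", (year,)
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    shred_books(XML_DOC, connection)
    print(titles_after(connection, 2010))
```

Once the document is stored this way, the selection that a path expression with a predicate on the year attribute would express in XQuery becomes an ordinary SQL WHERE clause. Nested or repeating child elements would need additional tables with foreign keys, which this sketch deliberately leaves out.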

 

Summary: Today, the W3C standard XML is widely used as a document format for exchanging data over the Internet. While the generation of XML data is easy, the management of XML data requires systems that can efficiently store, query, and process XML data. In other words, database technology is required for handling XML data. The goal of this lecture is to teach the interplay between XML and databases. The following aspects are studied in detail: the semi-structured data model of XML, query languages (XPath, XQuery) for declarative access to XML data, XML processor technologies, and the mapping between XML and databases, including efficient storage and index structures for XML data. A further central concern of this lecture is to show the practical relevance of all presented concepts by demonstrating how they are realized in major database management systems such as Oracle, IBM DB2, Microsoft SQL Server, and PostgreSQL.

Goal: Achieve a deep understanding of XML and its interplay with database technology

Prerequisite: Databases (Bachelor level), i.e., basic knowledge of databases

Recommended: MSc Studies

Credit Points: 3