Department of Informatics – Database Technology

The Swiss National Science Foundation, University of Zurich, Agroscope Liebefeld-Posieux

Student Projects

This page contains information on student work in the context of:

New projects are:

The running projects are:

The completed projects are:


Short Projects


Visualization of the Animal Density in the Swiss Feed Database

The goal of this project is to extend the online Swiss Feed Database with the visualization of polygons, where each polygon encloses one municipality (Gemeinde) or a group of adjacent municipalities and is colored according to the animal density within the enclosed area. The crucial aspect of this work is to determine experimentally a grouping of the municipalities that ensures fast performance and, at the same time, is sufficiently informative.
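
As a rough illustration of the coloring step, the following sketch computes the animal density of each group of municipalities and maps it to a color class. The input rows, the group ids, the class bounds, and the colors are hypothetical; the project description does not fix any of them.

```python
from collections import defaultdict

# Hypothetical input: (municipality, group id, animal count, area in km^2).
MUNICIPALITIES = [
    ("A", "g1", 1200, 10.0),
    ("B", "g1", 800, 10.0),
    ("C", "g2", 300, 15.0),
]

# Hypothetical density classes: (upper bound in animals per km^2, color).
PALETTE = [(50.0, "#ffffb2"), (150.0, "#fd8d3c"), (float("inf"), "#bd0026")]


def group_density(rows):
    """Return animals per km^2 for each group of adjacent municipalities."""
    animals = defaultdict(float)
    area = defaultdict(float)
    for _name, group, count, km2 in rows:
        animals[group] += count
        area[group] += km2
    return {g: animals[g] / area[g] for g in animals}


def color_for(density, palette=PALETTE):
    """Pick the color of the first class whose upper bound exceeds the density."""
    for upper, color in palette:
        if density < upper:
            return color
    return palette[-1][1]


densities = group_density(MUNICIPALITIES)
colors = {group: color_for(d) for group, d in densities.items()}
```

Coarser groups mean fewer polygons to draw but less spatial detail, which is the trade-off the project sets out to evaluate experimentally.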

Resources:

Students:

Supervisors:

Andrej Taliun, Prof. Dr. Michael Böhlen


Dynamic Data Summary Structures and Computation of the Color Plots for the Temporal Analyses of Nutritive Containment

The color plots are computed with the help of kernel density estimation and kernel regression. These methods take into account all measurements of the query result and provide the expected nutritive value even in areas that lie between the origins of the feed samples. On a technical level, the efficient computation of the color plots requires maintaining a data summary structure. Specifically, we compute the density on the grid points of a sparse regular grid and then interpolate the density between the grid points. Such an approach is efficient for the analysis of static data but does not scale to temporal analyses of the nutritive content: when computing multiple color plots for the history of nutrient measurements, we must recompute the entire summary structure for each distinct time stamp in the database.
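
The grid-based summary structure can be sketched as follows. This is a minimal illustration that assumes a Gaussian kernel with a fixed bandwidth and bilinear interpolation; the project text names neither, and all function names are hypothetical.

```python
import math


def gaussian_kde_at(x, y, samples, bandwidth):
    """Gaussian kernel density estimate at (x, y) from 2-D sample points."""
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2 * len(samples))
    total = 0.0
    for sx, sy in samples:
        d2 = (x - sx) ** 2 + (y - sy) ** 2
        total += math.exp(-d2 / (2.0 * bandwidth ** 2))
    return norm * total


def build_grid(samples, bandwidth, x0, y0, step, nx, ny):
    """Summary structure: density values on an nx-by-ny regular grid."""
    return [[gaussian_kde_at(x0 + i * step, y0 + j * step, samples, bandwidth)
             for j in range(ny)] for i in range(nx)]


def interpolate(grid, x0, y0, step, x, y):
    """Bilinear interpolation of the density between the surrounding grid points.

    Points outside the grid are extrapolated from the nearest grid cell.
    """
    fx, fy = (x - x0) / step, (y - y0) / step
    i = min(max(int(fx), 0), len(grid) - 2)
    j = min(max(int(fy), 0), len(grid[0]) - 2)
    tx, ty = fx - i, fy - j
    return ((1 - tx) * (1 - ty) * grid[i][j] + tx * (1 - ty) * grid[i + 1][j]
            + (1 - tx) * ty * grid[i][j + 1] + tx * ty * grid[i + 1][j + 1])
```

In this sketch a temporal analysis would call `build_grid` once per distinct time stamp, which is exactly the cost that the paragraph above identifies as the scalability problem.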

Resources:

Students:

Supervisors:

Andrej Taliun, Prof. Dr. Michael Böhlen


Embedding animal density information into the Swiss Feed Database

Data on animal density are available per municipality, whereas data on hay nutrients are given per postal code. A municipality is not always identical to a postal code area, and vice versa. One challenge is to integrate the data on animal density into the Swiss Feed Database. The spatial distribution of animal density classes can be visualized by embedding color plots into Google Maps. The color plots are computed with the help of kernel regression.
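
Kernel regression can be illustrated with the Nadaraya-Watson estimator, which returns a weighted average of the observed values where closer observations receive larger weights. The sketch below assumes a Gaussian kernel and planar coordinates; both are illustrative choices, not taken from the project description.

```python
import math


def kernel_regression(x, y, observations, bandwidth):
    """Nadaraya-Watson estimate at (x, y) from (x_i, y_i, value_i) observations."""
    weighted_sum = 0.0
    weight_total = 0.0
    for ox, oy, value in observations:
        d2 = (x - ox) ** 2 + (y - oy) ** 2
        weight = math.exp(-d2 / (2.0 * bandwidth ** 2))
        weighted_sum += weight * value
        weight_total += weight
    # Far away from every observation all weights can underflow to zero.
    return weighted_sum / weight_total if weight_total > 0.0 else None
```

Because the estimate is defined at every map position, it does not matter whether the observations were recorded per municipality or per postal code, as long as each one carries a representative coordinate.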

Resources:

Students:

Supervisors:

Andrej Taliun, Prof. Dr. Michael Böhlen


Implementation of the Animated Color Plots in the On-line Swiss Feed Database

The goal of this project is to implement animated color plots in the on-line web application of the Swiss Feed Database. The approach consists of the following steps. First, the time line is split into small intervals, and for each interval a static color plot is computed with the help of kernel regression on a sparse regular grid. Next, the static color plots are combined by linear interpolation into a smooth animation that is displayed on the map. The students will work with the Postgres relational database, PHP, JavaScript, and the 'canvas' element of the HTML5 standard.
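
The interpolation step can be sketched as below. The project itself names PHP and JavaScript; Python is used here only for illustration, and the representation of a static color plot as a grid of numbers as well as the function names are assumptions.

```python
def blend_frames(frame_a, frame_b, t):
    """Linearly interpolate two equally sized grids; t=0 gives frame_a, t=1 frame_b."""
    return [[(1.0 - t) * a + t * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)]


def animation(key_frames, steps_between):
    """Yield the key frames with `steps_between` blended frames between neighbours.

    Assumes a non-empty list of key frames, one per time interval.
    """
    for frame_a, frame_b in zip(key_frames, key_frames[1:]):
        for k in range(steps_between + 1):
            yield blend_frames(frame_a, frame_b, k / (steps_between + 1))
    yield key_frames[-1]
```

Only the key frames require a kernel regression; the blended frames cost one multiplication and addition per grid point.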

Resources:

Supervisors:

Andrej Taliun, Prof. Dr. Michael Böhlen


Multiple constraint search in the Swiss Feed Database

Feed rations for animals are optimized to meet nutrient requirements. An efficient search for the feed types that best match these requirements has to rely on query options that allow multiple user-defined constraints in the nutrient selection step. So far, the search function is restricted to feed type selection and nutrient selection. The query results can be sorted in ascending or descending order, but only by one column at a time. A search that is additionally based on user-defined nutrient ranges would help to solve many real-world situations efficiently.
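
A search with several range constraints and multi-column sorting might look like the following sketch. It works on in-memory dictionaries with hypothetical nutrient names; in the actual application the same logic would presumably be expressed as a database query.

```python
def search(feeds, ranges, order_by=()):
    """Return the feeds whose nutrients all fall into the given inclusive ranges.

    ranges:   {nutrient: (low, high)}
    order_by: sequence of (nutrient, descending) pairs, most significant first;
              every returned feed must contain the nutrients used for sorting.
    """
    hits = [feed for feed in feeds
            if all(n in feed and low <= feed[n] <= high
                   for n, (low, high) in ranges.items())]
    # Python's sort is stable, so sorting by the least significant key first
    # yields a correct multi-column ordering.
    for nutrient, descending in reversed(order_by):
        hits.sort(key=lambda feed: feed[nutrient], reverse=descending)
    return hits
```

For example, `search(feeds, {"protein": (100, 200)}, [("protein", False), ("fiber", True)])` keeps feeds with a protein value between 100 and 200 and orders them by ascending protein, then by descending fiber.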

Resources:

Supervisors:

Andrej Taliun, Prof. Dr. Michael Böhlen


Implementation of a 1NN-Join technique

Derived facts must be computed on the fly because their values change with each query. Depending on the user's selections, the same derived fact is computed on a different partition of the dataset, and a different result value is returned. Because of this, and because of the large size of the data, a fast and scalable solution for computing Nearest Neighbor Joins must be developed.
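
A nearest neighbor join on a single numeric attribute can be sketched with a sorted list and binary search. This only illustrates the operator; it is not the technique developed in the project, and the row layout `(key, payload)` is an assumption.

```python
from bisect import bisect_left


def nn_join(outer, inner):
    """Attach to each outer (key, payload) row the inner row with the nearest key.

    Assumes `inner` is non-empty; ties are resolved in favour of the smaller key.
    """
    inner_sorted = sorted(inner, key=lambda row: row[0])
    keys = [row[0] for row in inner_sorted]
    result = []
    for key, payload in outer:
        pos = bisect_left(keys, key)
        # Only the rows just before and at the insertion point can be nearest.
        candidates = inner_sorted[max(pos - 1, 0):pos + 1]
        nearest = min(candidates, key=lambda row: abs(row[0] - key))
        result.append((key, payload, nearest))
    return result
```

Sorting costs O(n log n) once per partition and every lookup O(log n); since the partition changes with the user's selections, the sort would have to be repeated or replaced by an index in a real system.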

Resources:

Students:

Supervisors:

Francesco Cafagna, Prof. Dr. Michael Böhlen


Embedding of customized information as background to Google maps – case study with animal density information

There is strong evidence for regional influences on hay quality, which are explained by altitude, botanical composition, production intensity, and fertilizer intensity. In forage production, including hay, the level of available fertilizer in the form of manure is directly linked with animal density. It is expected that high animal density results in high phosphorus content. The interesting questions are: Which nutrients in hay samples correlate with animal density? How can we visualize animal density and hay quality simultaneously? How are hay samples distributed over the different animal density classes?
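
The first question can be approached with a correlation coefficient per nutrient. The sketch below uses the Pearson coefficient, which is one possible choice that the project text does not prescribe; it assumes that every hay sample has already been paired with the animal density of its origin.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long, non-constant lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def rank_nutrients(density, nutrients):
    """Sort nutrient names by the absolute correlation with animal density."""
    scores = {name: pearson(density, values) for name, values in nutrients.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)
```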

Resources:

Supervisors:

Andrej Taliun, Prof. Dr. Michael Böhlen


Online calculation of the energy and nutritive value of feedstuffs for pigs

It is common practice to estimate the energy value of feedstuffs by regression. The multivariate equations rely on the proximate analysis of feedstuffs, alone or in combination with digestibility coefficients that are stored in the feed database. The goal of this project is to implement a web application for the online computation of energy values.

Resources:

Supervisors:

Andrej Taliun, Prof. Dr. Michael Böhlen


KNN-Join to compute derived attributes in the Swiss Feed Data Warehouse (completed)

Derived attributes can be defined as functions that depend on other nutrients and on time. With respect to time, the data are sparse: not all the required information is available for a given time point, so derived nutrients cannot always be computed. We want to provide the value of a derived attribute through an estimate that uses nearest neighbor search whenever a required value is not available for a given time point.
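
One way to picture the estimation is to replace each missing input by the average of its k measurements that are closest in time. Averaging, the default `k=3`, and the function names below are assumptions made for illustration; the project description only states that nearest neighbor search is used.

```python
import heapq


def knn_estimate(measurements, t, k=3):
    """Average the values of the k (time, value) measurements closest in time to t."""
    nearest = heapq.nsmallest(k, measurements, key=lambda m: abs(m[0] - t))
    if not nearest:
        return None
    return sum(value for _time, value in nearest) / len(nearest)


def derived_value(inputs, t, formula, k=3):
    """Evaluate `formula` at time t on kNN estimates of all input nutrients.

    inputs: {nutrient name: [(time, value), ...]}; the names must match the
    keyword parameters of `formula`.
    """
    estimates = {name: knn_estimate(series, t, k) for name, series in inputs.items()}
    if any(value is None for value in estimates.values()):
        return None
    return formula(**estimates)
```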

Resources:

Students:

Supervisors: Francesco Cafagna, Prof. Dr. Michael Böhlen


Queries on local feed quality (completed)

The goal of this project is to study and apply the F-test for the statistical comparison of regions based on the content of the nutrients that are found in the feed samples of these regions. We will pursue the following functionality. First, the user selects two locations on the map with the mouse. Next, the system automatically gathers all feed samples that are found in the surroundings of these two locations and computes the F-test.
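
The two steps can be sketched as follows. The circular neighbourhood with a fixed radius and planar coordinates are assumptions; the text does not define "surroundings". The sketch returns the F statistic and its degrees of freedom only, since the p-value requires the F distribution, which is not part of the Python standard library.

```python
import math
from statistics import variance


def samples_near(samples, center, radius):
    """Values of the (x, y, value) samples that lie within `radius` of `center`."""
    cx, cy = center
    return [v for x, y, v in samples if math.hypot(x - cx, y - cy) <= radius]


def f_statistic(a, b):
    """Ratio of the sample variances (larger over smaller) and its degrees of freedom.

    Assumes at least two values per sample and a non-zero smaller variance.
    """
    va, vb = variance(a), variance(b)
    if va >= vb:
        return va / vb, len(a) - 1, len(b) - 1
    return vb / va, len(b) - 1, len(a) - 1
```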

Resources:

Students:

Supervisor: Andrej Taliun


Clustering of Amino Acids Profiles (completed)

In this project we will apply cluster analysis techniques to amino acid profiles in order to detect outliers and correctly classify the feed data. In particular, we will use DBSCAN and OPTICS, which are efficient and accurate density-based clustering techniques for high-dimensional data.
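
For readers unfamiliar with DBSCAN, a compact sketch follows. It uses a brute-force neighbourhood search and Euclidean distance (`math.dist`, Python 3.8 or later), which keeps the code short but makes it quadratic in the number of profiles; points labeled -1 are the outlier candidates.

```python
import math


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    def neighbors(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)      # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1             # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster    # noise reachable from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:   # j is a core point, keep expanding
                queue.extend(reachable)
    return labels
```

The parameters `eps` and `min_pts` have to be chosen per dataset; OPTICS, the second method named above, avoids fixing a single `eps` in advance.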

Resources:

Students:

Supervisor: Andrej Taliun


Visualization of the Spatial Feed Data using Color Plots (completed)

The goal of this project is to extend the visualization of the query result with color plots, which color the map with different palettes to show the density of the collected feed samples and the content of the selected nutrients. We will pursue our goal with two techniques. First, we will use kernel density estimation to compute the density of the feed samples at any point of the map. Next, we will use the JavaScript language and the 'canvas' element of HTML to dynamically color the map based on the query results.

Resources:

Students:

Supervisor: Andrej Taliun


Online Statistical Computation (completed)

Measures of nutrients and minerals for the feed types in the Feed Database are incomplete. Missing measures can be restored with statistical regression methods. The goal of this project is to investigate how parametric and non-parametric regression methods can be efficiently integrated into the Feed Database.
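
As a small parametric example, the sketch below fits a straight line between two nutrients on the complete rows and uses it to fill the gaps. A linear relation between exactly two nutrients is an assumption made for illustration; the column names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def restore_missing(rows, known, missing):
    """Fill None values of column `missing` from column `known`.

    The line is fitted on the rows where both columns are present; at least
    two such rows with different `known` values are assumed.
    """
    complete = [r for r in rows if r[known] is not None and r[missing] is not None]
    intercept, slope = fit_line([r[known] for r in complete],
                                [r[missing] for r in complete])
    return [dict(r, **{missing: intercept + slope * r[known]})
            if r[missing] is None and r[known] is not None else r
            for r in rows]
```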

Resources:

Students:

Supervisor:

Andrej Taliun


Temporal ER Model (completed)

The Feed Database stores aggregated measures of the nutrients and minerals that compose different types of animal feed and does not contain historical information about changes in the measure values. This significantly limits the analysis of the data. The goal of this project is to extend the design of the Feed Database with a temporal model.

Resources:

Students:

Supervisors:

Andrej Taliun, Prof. Dr. Michael Böhlen


Bachelor thesis

Visualization of the Varying Spatial Density Information in the Swiss Feed Database (completed)

The current on-line interface for querying the feed data illustrates feed samples graphically with flags on the map, and for each feed sample the nutrient content is given as text in a list. Visualizing feed samples with flags has two essential drawbacks. First, when there are many feed samples from the same location, all of them are visualized by a single flag, so it is hard for the user to compare different locations based on the number of feed samples. Second, flags do not reveal the content of the selected nutrients, so it is not possible to visually compare the feed quality across different locations. The goal of this Bachelor thesis is to enrich the visualization of the query results with color plots, which color the map with different palettes to show the density of the collected feed samples and the content of the selected nutrients.

Resources:

Students:

Supervisors:

Andrej Taliun, Prof. Dr. Michael Böhlen


Statistical Comparison of Regions in the Swiss Feed Database (completed)

The current on-line interface for querying feed data linked to geographic information is so far limited to the selection criteria canton and altitude. Feedback from potential end users shows the need to extend the filter criteria to allow queries on local feed data that can be compared with other regions of similar altitude and with the national average. The goal of this thesis is to study and apply the F-test and t-test for the statistical comparison of regions based on the content of the nutrients that are found in the feed samples of these regions. The procedure involves several steps. In a first step, the F-test is applied to test for equality of the variances of the two populations. Depending on this result, the formula for either equal or unequal variances must be chosen in the subsequent Student's t-test. In general, unequal sample sizes must be assumed. The t-test is suited to univariate problems. A generalization of Student's t statistic, called Hotelling's T-square statistic, allows for the testing of hypotheses on multiple (multivariate), often correlated, measures, which is characteristic of feed data. Both the F-test and the t-test are based on the assumption of a normal distribution. This may not always be the case, particularly with respect to minerals and trace elements. Normality can be tested with the Shapiro-Wilk or Kolmogorov-Smirnov test.
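
The choice between the two t-test formulas can be sketched as follows. The function returns the test statistic and the degrees of freedom; the decision `equal_variances` is passed in by the caller, for example after comparing the F statistic with a tabulated critical value, because the Python standard library offers neither the F nor the t distribution. The unequal-variance branch uses Welch's formula with the Welch-Satterthwaite degrees of freedom.

```python
import math
from statistics import mean, variance


def t_statistic(a, b, equal_variances):
    """Two-sample t statistic and degrees of freedom for samples of any sizes."""
    na, nb = len(a), len(b)
    va, vb = variance(a), variance(b)
    diff = mean(a) - mean(b)
    if equal_variances:
        # Student's t-test with a pooled variance estimate.
        pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
        return diff / math.sqrt(pooled * (1 / na + 1 / nb)), na + nb - 2
    # Welch's t-test for unequal variances.
    se2 = va / na + vb / nb
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return diff / math.sqrt(se2), df
```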

Resources:

Students:

Supervisors:

Andrej Taliun, Prof. Dr. Michael Böhlen


On-Line Computation of Up-To-Date Summaries of Nutrient Measurements in the Swiss Feed Database (completed)

The main question that arises in the computation of up-to-date summaries is how to select the time period and the relevant measurements for the aggregation. Measurements exhibit a number of unique properties that make answering this question an interesting challenge. First, measurements are often not single-valued but sets of values. Second, measurements are sparse in time: they are taken irregularly in order to reduce the cost of the analyses. Therefore, the choice of the time period and of the measurements for the aggregation is data driven and depends on the number and quality of the collected measurements.
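
One possible data-driven policy, shown purely as an illustration, widens the time window backwards from the present until it covers enough values. The thresholds `min_count` and `max_age` and the use of the mean as the summary are assumptions; the thesis may select periods differently.

```python
def up_to_date_summary(measurements, now, min_count, max_age):
    """Mean over the most recent measurements.

    measurements: [(time, [values, ...]), ...], each measurement being a set of values.
    The window grows backwards from `now` until it holds at least `min_count`
    values or `max_age` is reached. Returns (window start, mean) or None.
    """
    recent = sorted((m for m in measurements if now - max_age <= m[0] <= now),
                    key=lambda m: m[0], reverse=True)
    values = []
    window_start = now
    for time, value_set in recent:
        if len(values) >= min_count:
            break
        values.extend(value_set)
        window_start = time
    if not values:
        return None
    return window_start, sum(values) / len(values)
```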

Resources:

Students:

Supervisors:

Andrej Taliun, Prof. Dr. Michael Böhlen


Managing and Querying Derived Nutrient Parameters in the Swiss Feed Database (completed)

The computation of derived nutrient parameters is based on known dependencies that are formalized with the help of algebraic expressions, also known as regressions. The complexity of the regressions varies with the number of nutrients involved. In one case, a regression involves measurements of only one nutrient; in another case, measurements of many nutrients are required to compute the regression. Furthermore, regressions might be defined recursively, i.e., based on the output of other regressions. In all cases, the large number of available regressions makes it hard to update the Feed Database manually as new data become available.
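
Recursively defined regressions can be evaluated by resolving each input on demand, as in the sketch below. The dictionary layout and the cycle check are illustrative assumptions, not the design chosen in the thesis.

```python
def evaluate(name, measured, regressions, _active=None):
    """Return the value of `name`, computing regressions recursively.

    measured:    {nutrient: value}
    regressions: {derived parameter: (function, [input names])}
    """
    if name in measured:
        return measured[name]
    if name not in regressions:
        raise KeyError(f"no measurement or regression for {name!r}")
    active = _active if _active is not None else set()
    if name in active:
        raise ValueError(f"cyclic regression definition at {name!r}")
    active.add(name)
    function, inputs = regressions[name]
    # Inputs may themselves be derived parameters, hence the recursive call.
    value = function(*[evaluate(i, measured, regressions, active) for i in inputs])
    active.discard(name)
    return value
```

When a new measurement arrives, every derived parameter that depends on it, directly or through other regressions, can be recomputed with the same call instead of being updated by hand.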

Resources:

Students:

Supervisors:

Francesco Cafagna, Prof. Dr. Michael Böhlen


Development of a Database System based on geographical information (completed)

The goal of this project is to integrate geographical information (postal code, area, canton, city, etc.) about the provenance of feed samples into the current version of the Swiss Feed Data Warehouse. This information should be used to query the database depending on the user's selections. In particular, it should be possible to aggregate the feed parameters by city, region, and so on.
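
The aggregation by geographic level amounts to a group-by over the chosen attribute. The sketch below uses the mean and hypothetical attribute names such as `canton` and `city`; in the data warehouse this would presumably be a grouped database query.

```python
from collections import defaultdict


def aggregate(samples, level, parameter):
    """Average `parameter` over the samples grouped by the geographic `level`."""
    groups = defaultdict(list)
    for sample in samples:
        groups[sample[level]].append(sample[parameter])
    return {key: sum(values) / len(values) for key, values in groups.items()}
```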

Resources:

Students:

Supervisors:

Francesco Cafagna, Prof. Dr. Michael Böhlen