Colloquium Spring 2014

06.03.2014 - Learning to Construct and Reason with a Large Knowledge Base of Extracted Information

Speaker: Prof. William Cohen, Ph.D.
Host: Abraham Bernstein

Abstract

Carnegie Mellon University's "Never Ending Language Learner" (NELL) has been running for over three years, and has automatically extracted from the web millions of facts concerning hundreds of thousands of entities and thousands of concepts. NELL works by coupling together many interrelated large-scale semi-supervised learning problems. In this talk I will discuss some of the technical problems we encountered in building NELL, and some of the issues involved in reasoning with this sort of large, diverse, and imperfect knowledge base. This is joint work with Tom Mitchell, Ni Lao, William Wang, and many other colleagues.

Bio

Prof. William Cohen, Ph.D., received his bachelor's degree in Computer Science from Duke University in 1984, and a PhD in Computer Science from Rutgers University in 1990. From 1990 to 2000 Dr. Cohen worked at AT&T Bell Labs and later AT&T Labs-Research, and from April 2000 to May 2002 Dr. Cohen worked at Whizbang Labs, a company specializing in extracting information from the web. Dr. Cohen is President of the International Machine Learning Society, an Action Editor for the Journal of Machine Learning Research, and an Action Editor for the journal ACM Transactions on Knowledge Discovery from Data. He is also an editor, with Ron Brachman, of the AI and Machine Learning series of books published by Morgan & Claypool. In the past he has also served as an action editor for the journal Machine Learning, the journal Artificial Intelligence, and the Journal of Artificial Intelligence Research. He was General Chair of the 2008 International Machine Learning Conference, held July 6-9 at the University of Helsinki, in Finland; Program Co-Chair of the 2006 International Machine Learning Conference; and Co-Chair of the 1994 International Machine Learning Conference. Dr. Cohen was also Co-Chair of the 3rd Int'l AAAI Conference on Weblogs and Social Media, which was held May 17-20, 2009 in San Jose, and Program Co-Chair of the 4th Int'l AAAI Conference on Weblogs and Social Media, which was held May 23-26 at George Washington University in Washington, D.C. He is an AAAI Fellow, and in 2008 he won the SIGMOD "Test of Time" Award for the most influential SIGMOD paper of 1998. Dr. Cohen's research interests include information integration and machine learning, particularly information extraction, text categorization, and learning from large datasets. He holds seven patents related to learning, discovery, information retrieval, and data integration, and is the author of more than 200 publications.

20.03.2014 - Digitalization of the Energy System: Interdisciplinary Research Questions in Smart Grids

Speaker: Prof. Dr. Jens Strüker
Host: Sven Seuken

Abstract

Electricity systems are increasingly characterized by distributed and intermittent generation sources. A promising way to address the resulting grid challenges is to actively coordinate both the demand and the supply side of energy systems. However, so-called demand side management and virtual power plants require detailed consumption, weather, and generation data. The talk (a) gives an overview of different approaches to gathering this data and interacting with market actors, (b) discusses emerging trade-offs between economic efficiency, data privacy, and security of supply, and (c) presents possible approaches for resolving them.

Bio

Prof. Dr. Jens Strüker has held a professorship in energy management at the Fresenius University of Applied Sciences, Germany, since 2013, where he is Managing Director of the Institute of Energy Economics (INEWI). Since 2012 he has also held a lectureship in energy informatics at the Albert-Ludwig University of Freiburg, Germany, from which he received his venia legendi (habilitation) in information systems and business administration in April 2012. Between 2008 and 2009 he was a visiting scientist at SAP Labs, Palo Alto, U.S.A. He currently leads the EU-funded project iUrban (Optimising Energy Systems in Smart Cities), and between 2011 and 2013 he led the project TORERO, funded by the German Research Foundation (DFG). In 2006 he received the Friedrich-August-von-Hayek Thesis Award, sponsored by Deutsche Bank, for the best economics thesis at the University of Freiburg.

27.03.2014 - Automatic Scoring of Free-text Answers using Text Similarity and Textual Entailment

Speaker: Dr. Torsten Zesch
Host: Martin Volk

Abstract

Automated scoring of free-text answers is the key technology to (i) enable formative assessment in settings where no teacher is readily available, and (ii) reduce the costs of large-scale summative assessments (e.g., the PISA study). I will give an overview of the state of the art in scoring free-text answers based on computing the similarity between a reference answer and the student answers. From the discussion of various semantic text similarity measures it will become obvious that deeper semantic analysis is required in order to properly score the answers. We will look into textual entailment, and especially partial textual entailment, as promising tools in that direction.
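The similarity-based scoring paradigm the abstract describes can be sketched with a deliberately simple lexical baseline: compare bag-of-words vectors of the reference and student answers and apply a threshold. This is only an illustration of the setup, not the semantic measures discussed in the talk; the function names and the threshold are hypothetical.

```python
from collections import Counter
import math

def cosine_similarity(text_a, text_b):
    """Cosine similarity between bag-of-words vectors of two texts."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def score_answer(reference, student, threshold=0.6):
    """Accept a student answer if it is lexically close to the reference."""
    return "correct" if cosine_similarity(reference, student) >= threshold else "review"
```

The weakness of such purely lexical measures is exactly the talk's point: a paraphrased correct answer scores low while an answer reusing the reference's words scores high, which is what motivates moving to textual entailment.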

Bio

Dr.-Ing. Torsten Zesch is a substitute professor for "Language Technology" at the University of Duisburg-Essen and an associated researcher at the German Institute for Educational Research (DIPF). His primary research focus is scalable natural language processing for educational purposes, including automated testing, error detection, and exercise generation. He received his Ph.D. in computer science from Technische Universität Darmstadt in 2009, then headed the "Intelligent Language Systems" research group at UKP Lab, Technische Universität Darmstadt, for two years before spending a year as a substitute professor for "Knowledge Mining & Assessment" at the German Institute for International Educational Research.

10.04.2014 - The Many Uses of Rules in Ontology-Based Data Access

Speaker: Prof. Dr. Harold Boley
Host: Abraham Bernstein

Abstract

Ontology-Based Data Access (OBDA) enables reasoning over an ontology as a generalized global schema for the data in local (e.g., relational or graph) databases. The term 'ontology' refers to a shared formal specification of a domain (e.g., its classes and properties), usually in a subset of first-order logic (e.g., a description logic). Realizing a broad notion of 'access', OBDA can, e.g., semantically validate, enrich, and integrate heterogeneous data sources. Motivated by rule-ontology synergies, this talk will discuss key concepts of OBDA and their foundation in four main groups of (logical) 'if-then' rules, using geospatial examples. (1) OBDA data integration is centered on Global-As-View (GAV) mappings, which are safe Datalog rules defining each global head predicate as a conjunction of local body predicates. These (heterogeneous) conjunctive queries can be further mapped to the database languages of the sources (e.g., to SQL or SPARQL). The GAV rules are employed for a kind of 'partial deduction' unfolding each global query to a union of local queries. (2) A query is itself a special Datalog rule whose conjunctive body becomes GAV-mapped and whose n-ary head predicate instantiates the distinguished answer variables of the body predicates. (3) The OBDA ontology supports query rewriting through global-schema-level reasoning. It usually includes the expressivity of RDF Schema (RDFS), whose class and property subsumptions can be seen as single-premise Datalog rules with, respectively, unary and binary predicates, and whose remaining axioms are also definable by rules. OBDA ontologies often extend RDFS to the description logic DL-Lite (as in OWL 2 QL), including subsumption axioms that correspond to (head-)existential rules. Recent work has also explored 'Rule-Based Data Access', e.g., via Description Logic Programs (as in OWL 2 RL, definable in RIF-Core), Datalog+/-, and Disjunctive Datalog. (4) OBDA ontologies beyond RDFS expressivity usually permit negative constraints for data validation, which are translated to Boolean conjunctive queries corresponding to integrity rules.
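The GAV unfolding described in (1) and (2) can be illustrated with a toy sketch: each global predicate is mapped to the conjunction of local source atoms that defines it, and a global conjunctive query is rewritten by substituting every global atom with its definition. This is a deliberately minimal illustration, not an OBDA engine; all predicate and source names here are hypothetical, and variable handling is kept trivial by reusing the same variable names on both levels.

```python
# Hypothetical GAV mappings: global predicate -> conjunction of local atoms.
GAV_RULES = {
    "City(x)":        ["src1.places(x, type='city')"],
    "locatedIn(x,y)": ["src2.regions(y)", "src2.geo_links(x, y, rel='in')"],
}

def unfold(global_query):
    """Rewrite a global conjunctive query (a list of global atoms)
    into a conjunction of local source-level atoms."""
    local_atoms = []
    for atom in global_query:
        if atom not in GAV_RULES:
            raise KeyError(f"no GAV mapping for global atom {atom!r}")
        local_atoms.extend(GAV_RULES[atom])
    return local_atoms
```

In a real system the resulting local conjunction would then be translated into the source languages (e.g., SQL or SPARQL), and ontology-based query rewriting as in (3) would happen before this unfolding step.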

Bio

Prof. Dr. Harold Boley is an adjunct professor at the Faculty of Computer Science, University of New Brunswick, and chair of RuleML Inc. He is currently a visiting researcher at WSL Birmensdorf. His specification of Web rules through RuleML has found broad uptake and is now being developed beyond Version 1.0. It has been combined with OWL into SWRL, has become the main input to the W3C Recommendation RIF, and has provided the foundation for OASIS LegalRuleML. His Rule Responder projects have enabled deployed distributed applications for the Social Semantic Web. His recent innovations in data-plus-knowledge representation include the object-relational PSOA RuleML and the visualization framework Grailog. Together with colleagues from Binarypark, Athan Services, and RuleML Inc. he has started the RuleML Blog & Social Mediazine (http://blog.ruleml.org).

08.05.2014 - A Unified Rolling Shutter and Motion Model for Dense 3D Visual Odometry

Speaker: Dr. Andrew Comport
Host: Davide Scaramuzza

Abstract

Motion blur and rolling shutter deformations both inhibit visual motion registration, whether due to a moving sensor or a moving target. Whilst both deformations exist simultaneously, no models have been proposed to handle them together. Furthermore, neither deformation has been considered previously in the context of monocular full-image 6 degrees of freedom registration or RGB-D structure and motion. As will be shown, rolling shutter deformation is observed when a camera moves faster than a single pixel in parallax between subsequent scan-lines. Blur is a function of the pixel exposure time and the motion vector. In this talk a complete dense 3D registration model will be derived to account for both motion blur and rolling shutter deformations simultaneously. Various approaches will be compared with respect to ground truth, and live real-time performance will be demonstrated for complex scenarios where both blur and shutter deformations are dominant.
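The scan-line argument in the abstract can be made concrete with a toy kinematic sketch: because a rolling shutter reads the frame out row by row, each row is exposed slightly later than the previous one, so content moving at a constant apparent image velocity is shifted by an amount that grows linearly with the row index. This is only an illustration of the geometry, not the dense 3D registration model derived in the talk; the function and parameter names are my own.

```python
def rolling_shutter_shift(num_rows, line_readout_s, image_velocity_px_s):
    """Horizontal shift, in pixels, of each scan line for image content
    moving at a constant apparent velocity (pixels/second) while the
    rolling shutter reads rows out one after another, each taking
    line_readout_s seconds."""
    return [row * line_readout_s * image_velocity_px_s
            for row in range(num_rows)]
```

For example, with a per-line readout of 1 s and an apparent velocity of 2 px/s, rows 0, 1, 2 are shifted by 0, 2, and 4 pixels; the deformation becomes visible exactly when the inter-line shift exceeds one pixel, matching the threshold stated in the abstract.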

Bio

Andrew Comport is a "Chargé de recherche" (permanent researcher) with the "Centre National de la Recherche Scientifique" (CNRS). Since 2009 he has been based at the I3S laboratory, a joint lab between the CNRS and the University of Nice Sophia-Antipolis (UNS), where he leads the research on dense real-time visual localisation and mapping. He is also associate director of the "Signal, Images, Systems" department of the I3S Laboratory at the UNS. From 2005 to 2007 he carried out research at INRIA Sophia-Antipolis as a researcher in the Arobas team. In 2005 he completed a Ph.D. on robust real-time 3D tracking of rigid and articulated objects in the Lagadic project at INRIA-IRISA in Rennes. In 2001 he worked as a Research Assistant in the Intelligent Robotics Research Centre (IRRC) at Monash University, Australia. In 2000 he graduated with a Bachelor of Engineering (BE), majoring in Electrical and Computer Systems Engineering with Honours, from Monash University, Australia. In 1997 he graduated with a Bachelor of Science (BSc), majoring in Computer Science, also from Monash University, Australia. Key interests include: dense visual odometry, multi-view tracking and mapping, dense rolling shutter and motion blur models, high-dynamic range dense modelling, real-time reflectance and shadowing for augmented reality, real-time interaction, dynamic environment modelling (both non-rigid and illumination variation), visual servoing and navigation, and robot vision (ground robots, aerial robots and space robots).

15.05.2014 - From Model-Based to Model-Integrating Software

Speaker: Prof. Dr. Jürgen Ebert
Host: Prof. Martin Glinz and Prof. Gerhard Schwabe

Abstract

A strong focus of Software Engineering research in modeling is on software for models (e.g., modeling tools) as well as on models for software (e.g., in tool construction or reverse engineering). This talk gives a personal overview of models in Software Engineering, which started with early visual notations, continued with a plethora of modeling languages and editors, and led to unifying approaches like UML (on the language side) and generic metaCASE software (on the tool side). In this era, foundational work on model representations, metamodeling, constraint descriptions, and semantics, as well as on classification of modeling languages into a few modeling paradigms, provided a deeper understanding of the world of modeling in general. (model-based software development) Adding the ability for code generation and model transformation, the process of software development was automated further by environments that even provide additional services like model evolution, model querying, model execution, or model comparison. (model-driven software development) Integrating these capabilities into a cross-platform and cross-language infrastructure may now lead to software components which contain code and models as equal-level and cooperating parts at runtime, making software evolution easier to handle and leveraging, e.g., the development of adaptive software components or dynamic product lines. (model-integrating software)

Bio

Jürgen Ebert has been a professor of Software Engineering at the University of Koblenz-Landau in Koblenz since 1982. He received his PhD in Mathematics from the University of Münster and his habilitation in Computer Science from the University of Osnabrück, both in Germany. He retired in 2014. He has been one of the pioneers in the field of modeling in a Software Engineering context. His research is focused on the design and construction of generic tools, especially using graph-based approaches. In the last decade, he worked primarily in the areas of modeling, software reengineering, and software architecture.

22.05.2014 - Big Data Integration

Speaker: Divesh Srivastava, Ph.D.
Host: Michael Böhlen

Abstract

The Big Data era is upon us: data is being generated, collected, and analyzed at an unprecedented scale, and data-driven decision making is sweeping through all aspects of society. Since the value of data explodes when it can be linked and fused with other data, addressing the big data integration (BDI) challenge is critical to realizing the promise of Big Data. BDI differs from traditional data integration in many dimensions: (i) the number of data sources, even for a single domain, has grown to be in the tens of thousands, (ii) many of the data sources are very dynamic, as huge amounts of newly collected data are continuously made available, (iii) the data sources are extremely heterogeneous in their structure, with considerable variety even for substantially similar entities, and (iv) the data sources are of widely differing qualities, with significant differences in the coverage, accuracy, and timeliness of the data provided. This talk explores the progress that has been made by the data integration community in addressing these novel challenges faced by big data integration, and identifies a range of open problems for the community.

Bio

Divesh Srivastava is the head of Database Research at AT&T Labs-Research. He received his Ph.D. from the University of Wisconsin, Madison, and his B.Tech. from the Indian Institute of Technology, Bombay. He is an ACM Fellow, a member of the board of trustees of the VLDB Endowment, and an associate editor of ACM Transactions on Database Systems. His research interests and publications span a variety of topics in data management.