Department of Informatics

Details Colloquium Spring 2012

08.03.2012 - Automated Reasoning for Ontology Engineering

Speaker: Prof. Dr. Uli Sattler
Host: Avi Bernstein

Abstract

Over the last decade, ontologies have been developed as logical theories capturing domain knowledge and are used in a variety of applications, most prominently in the clinical and life sciences. They are used to design and maintain terminologies, to serve as the foundation of information systems, and to provide flexible access to data. The recent standardization of computer-processable ontology languages, in particular OWL, has led to increased interest in ontologies and to considerable tool development around them. OWL is based, on the one hand, on web standards and, on the other hand, on Description Logics: decidable fragments of first-order logic and close relatives of modal logics that have been developed for knowledge representation and reasoning. I will briefly introduce OWL and then describe some recent developments, in particular around automated reasoning for ontology engineering. On the one hand, progress has been made on so-called standard reasoning tasks: for example, we have seen new reasoning techniques that cope with extremely large, modestly complex theories and that answer queries against databases w.r.t. ontologies. On the other hand, the "serious" usage of ontologies requires ever more, and more powerful, non-standard reasoning services, such as the extraction of modules, the logically sound partitioning and diff-ing of ontologies, and the explanation of entailments from an ontology.

Bio

Uli Sattler is a professor in the Information Management Group in the School of Computer Science of the University of Manchester. Her general research interests are in logics for knowledge representation and automated reasoning. She has worked in the field of knowledge representation and reasoning since 1994, and an important part of this work has been devoted to the design of practical reasoning algorithms for expressive description logics (DLs). Jointly with Ian Horrocks and others, she has designed the SHIQ family of DLs, determined their complexity, and designed practical algorithms for their key inference problems. The SHIQ family forms the logical basis of ontology languages such as OWL DL and OWL 2. She is interested in the computational complexity of the reasoning problems that form the basis of system services of knowledge representation systems, and in novel reasoning problems as requested by applications in molecular biology and the clinical sciences. Recently, she has been involved in the development of logic-based notions of modularity for ontologies, e.g., for the re-use and comprehension of ontologies; in the design of explanation services for logical entailments; and in the design of ontology diffing services.

15.03.2012 - Economic Information Acquisition for Improved Predictive Modeling and Decision Making

Speaker: Prof. Dr. Maytal Saar-Tsechansky
Host: Avi Bernstein

Abstract

To better cope with business environments that are information intensive, complex, and highly dynamic, companies increasingly rely on intelligent, data-driven methods to automatically “learn” from experiences over time and to improve future decision-making. However, reliable data-driven induction depends on two critical ingredients: an effective induction (learning) method, and informative, relevant data. The capacity of even the most effective induction technique to extract a reliable model of any real-world phenomenon is bounded by what information is available about prior experiences. In practice, organizations often acquire information only passively, such as through routine business transactions. However, the opportunity costs of acquiring information only passively can be substantial. This challenge poses some fundamental and fascinating scientific questions: How does one enable predictive modeling techniques to reason intelligently about opportunities to actively acquire additional information to improve modeling and the business decisions the models inform? How should knowledge (or uncertainty) about the particular decisions that predictive models will inform, influence what information is best to acquire? And how can we incorporate prior knowledge into data-driven learning, so as to mitigate the high cost of learning from scratch? I will present an overview of my past and current research on economic data mining addressing these and related questions. I will also discuss applications of the methods I develop for a broad range of problems, including personalized marketing, recommender systems, insurance fraud detection, and market mechanism design.
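(The talk does not commit to a specific algorithm; purely as a rough illustration of what "actively acquiring information" for a predictive model can look like, here is a minimal uncertainty-sampling sketch. The data, model, and acquisition budget are invented for illustration and are not from the talk.)

```python
# Minimal illustrative sketch (not from the talk): uncertainty sampling, one way to
# actively choose which costly labels to acquire next for a predictive model.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical pool of examples whose labels are assumed costly to acquire.
X_pool = rng.normal(size=(1000, 5))
true_w = rng.normal(size=5)
y_pool = (X_pool @ true_w + 0.5 * rng.normal(size=1000) > 0).astype(int)

# Small labeled seed set containing both classes.
labeled = list(np.where(y_pool == 0)[0][:5]) + list(np.where(y_pool == 1)[0][:5])
unlabeled = [i for i in range(1000) if i not in set(labeled)]

model = LogisticRegression()
for _ in range(20):                                   # acquisition budget: 20 labels
    model.fit(X_pool[labeled], y_pool[labeled])
    proba = model.predict_proba(X_pool[unlabeled])[:, 1]
    # Acquire the label of the pool example the current model is least certain about.
    pick = unlabeled[int(np.argmin(np.abs(proba - 0.5)))]
    labeled.append(pick)
    unlabeled.remove(pick)

print("accuracy on the full pool:", model.score(X_pool, y_pool))
```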

Bio

Maytal Saar-Tsechansky is an Associate Professor of Information, Risk and Operations Management at the McCombs School of Business, The University of Texas at Austin. She is currently also a Visiting Faculty at the Judge Business School at the University of Cambridge. Maytal’s research interests include machine learning and data mining methods for data-driven intelligence and decision making. Her work in these areas has been published in the Journal of Finance, Management Science, Information Systems Research, Journal of Machine Learning Research, and Machine Learning Journal, among other venues. She serves on the editorial board of the Machine Learning Journal, and is an Associate Editor for the Information Systems Research (ISR) journal and the INFORMS Journal on Computing. She was a guest editor of the Special Issue on Utility-Based Data Mining in the Journal of Data Mining and Knowledge Discovery, and is a frequent Program Committee member at the premier machine learning and data mining conferences. At McCombs, Maytal has developed and teaches courses on business intelligence with data mining in the Executive MBA, full-time MBA, and undergraduate business programs. Maytal received her Ph.D. from New York University’s Stern School of Business and obtained a B.S. and M.S. in Industrial Engineering from Ben Gurion University in Israel.

22.03.2012 - Web N-Grams as a Resource for Corpus Linguistics

Speaker: Prof. Dr. Stefan Evert
Host: Martin Volk

Abstract

In recent years, the rapid growth of the World Wide Web has enabled research in computational linguistics to scale up to Web-derived corpora a thousand times the size of the British National Corpus (BNC) and more. These huge text collections open up entirely new possibilities for training statistical models and unsupervised learning algorithms. With the release of Google's Web 1T 5-gram database (Brants & Franz 2006), a corpus on the teraword scale came within reach of the general research community for the first time, in the form of n-gram frequency tables. Since then, the Web1T5 database has been applied to a wide range of natural language processing tasks. In addition to the obvious use as training data for broad-coverage n-gram models (e.g. as part of a machine translation or speech recognition system), the database has been used for spelling correction, as a convenient replacement for online Web queries e.g. in knowledge mining, and even for the prediction of fMRI neural activation associated with concrete nouns (Mitchell et al. 2008). Computer scientists have also developed specialized indexing engines that allow fast interactive queries to the database, impressively demonstrated e.g. by http://www.netspeak.org/ (Stein et al. 2010).

In my talk, I explore the usefulness of Web1T5 and similar n-gram databases as a resource for corpus linguistic studies, despite their well-known shortcomings: the inevitable frequency thresholds, a genre composition dominated by computer science, porn and advertising, an abundance of text duplicates and boilerplate, as well as a complete lack of linguistic annotation (lemmatization and part-of-speech tagging). As an example, I show how three essential types of corpus analysis -- word and phrase frequencies, collocational profiles, and distributional semantics -- can be carried out on Web1T5. A prerequisite for more widespread adoption of n-gram databases in corpus linguistics is the availability of open-source indexing software that is flexible enough to support these types of corpus analysis, fast enough for interactive exploration of the database, and that runs on off-the-shelf desktop hardware. I present a simple and convenient solution building on SQLite (an embedded relational database engine), Perl and the statistical software package R (Evert 2010).

The last part of my talk attempts an evaluation of Web1T5 as a linguistic resource. For this purpose, frequency counts for words and n-grams are compared with the BNC and other standard corpora, and Web1T5 is applied to several collocation extraction and semantic similarity tasks. A closer look at the evaluation results reveals some fundamental differences between a Web-based n-gram database and traditional corpora. In this way, I hope to shed new light on the question whether more data are really always better data (Church & Mercer 1993).
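(Evert's own toolchain builds on SQLite, Perl and R; purely as an illustration of the kind of indexing and querying involved, here is a small sketch using Python's built-in sqlite3 module. The table layout and the counts are invented and do not reflect the actual Web1T5 schema.)

```python
# Rough sketch (hypothetical schema and counts, not the real Web1T5 layout):
# storing n-gram frequency counts in SQLite and asking simple corpus questions.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE bigrams (
        w1 TEXT, w2 TEXT, freq INTEGER,
        PRIMARY KEY (w1, w2)
    )
""")

# A few invented counts standing in for the (much larger) real frequency tables.
con.executemany(
    "INSERT INTO bigrams VALUES (?, ?, ?)",
    [("strong", "tea", 1200), ("powerful", "tea", 80),
     ("strong", "computer", 90), ("powerful", "computer", 1100)],
)

# Word frequency of "tea" as the second element of a bigram.
(tea_freq,) = con.execute(
    "SELECT SUM(freq) FROM bigrams WHERE w2 = ?", ("tea",)
).fetchone()

# A crude collocational profile: which first words co-occur with "tea" most often?
profile = con.execute(
    "SELECT w1, freq FROM bigrams WHERE w2 = ? ORDER BY freq DESC", ("tea",)
).fetchall()

print(tea_freq)   # 1280
print(profile)    # [('strong', 1200), ('powerful', 80)]
```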

Bio

Prof. Dr. Stefan Evert is professor of English Computational Corpus Linguistics at Technische Universität Darmstadt, Germany. He earned a PhD in Computational Linguistics from the University of Stuttgart in 2004 and held a position as assistant professor for Computational Linguistics at the Institute of Cognitive Science, University of Osnabrück from 2005 to 2011. His main interests lie at the boundary between linguistic research, statistical corpus analysis and natural language processing. Current research topics include the methodological foundations of corpus linguistics, collocations and multiword expressions, distributional semantics and multi-dimensional analysis of language variation. 

05.04.2012 - Your Noise is My Signal
This colloquium talk has been cancelled!

Speaker: Prof. Dr. Shwetak N. Patel
Host: Elaine Huang

Abstract

Professor Patel will describe work on a new generation of electricity, water, and natural gas measurement systems that are low-cost, easy-to-install, and, most importantly, capable of providing disaggregated data on consumption down to the individual appliance or device from single sensing points. The vision is to provide high granularity resource sensing systems for homes and businesses that will fundamentally transform how end uses of electricity, water, and natural gas are understood, studied and, ultimately, consumed. All three systems share a common approach: they monitor side-effects of resource usage that manifest throughout a home's internal electricity, plumbing, or gas infrastructure. He will also describe a new approach to low-power wireless sensing in the home that uses the power lines as a receiving antenna and new approaches to natural user interfaces. Most of the work follows the theme of using the problems from one project as a solution for another.

Bio

Shwetak N. Patel is an Assistant Professor in the departments of Computer Science and Engineering and Electrical Engineering at the University of Washington. His research interests are in the areas of Human-Computer Interaction, Ubiquitous Computing, and User Interface Software and Technology. He is particularly interested in developing easy-to-deploy sensing technologies and approaches for activity recognition and energy monitoring applications. He is also interested in exploring novel interaction techniques for mobile devices, mobile sensing systems, and wireless sensor platforms. Dr. Patel was also a founder of Zensi, Inc., a demand-side energy monitoring solutions provider, which was acquired by Belkin, Inc. in 2010. He received his Ph.D. in Computer Science from the Georgia Institute of Technology in 2008 and a B.S. in Computer Science in 2003. Dr. Patel received the MacArthur Foundation Fellowship (colloquially known as the "Genius Award") in 2011, received the TR-35 award in 2009, was named top innovator of the year by Seattle Business Magazine, was named Newsmaker of the Year by the Seattle Business Journal, and was a recipient of the Microsoft Research Faculty Fellowship in 2011. His past work was also honored by the New York Times as a top technology of the year in 2005.

19.04.2012 - Towards Energy-aware Distributed Systems

Speaker: Prof. Dr. Jean-Marc Pierson
Host: Lorenz Hilty

Abstract

In this talk, I will survey the state of the art in research on the energy-aware management of data centers, with a particular focus on monitoring and metrics. In the second part, I will examine how the collected data can help reduce the environmental impact of these data centers, which gather hundreds of machines in dedicated environments: task placement, scheduling and optimization will be discussed in more detail, covering both theoretical and practical aspects and targeting mainly Cloud and High Performance Computing.
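(The abstract stays at a high level; purely as an illustration of the kind of placement problem involved, and not an algorithm from the talk, the sketch below consolidates tasks onto as few servers as possible with a greedy first-fit-decreasing heuristic, so that the remaining machines could be powered down. All names and numbers are invented.)

```python
# Illustrative sketch only (not from the talk): energy-aware task placement seen as
# bin packing, solved greedily with first-fit decreasing. Consolidating load onto
# fewer servers lets the unused machines be switched off.
def place_tasks(task_loads, server_capacity):
    servers = []                               # each entry: remaining capacity
    placement = {}                             # task index -> server index
    order = sorted(range(len(task_loads)), key=lambda i: -task_loads[i])
    for i in order:
        load = task_loads[i]
        for s, free in enumerate(servers):
            if free >= load:                   # first server with enough room
                servers[s] -= load
                placement[i] = s
                break
        else:                                  # no existing server fits: open a new one
            servers.append(server_capacity - load)
            placement[i] = len(servers) - 1
    return placement, len(servers)

placement, n_servers = place_tasks([0.6, 0.3, 0.5, 0.2, 0.4], server_capacity=1.0)
print(n_servers, placement)                    # 2 servers instead of 5
```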

Bio

Since September 2006, Jean-Marc Pierson has served as a Full Professor in Computer Science at the University of Toulouse (France). He received his PhD from the ENS-Lyon, France, in 1996. He was an Associate Professor at the University Littoral Cote-d'Opale (1997-2001) in Calais, then at INSA-Lyon (2001-2006). He is a member of the IRIT Laboratory. His main interests are related to large-scale distributed systems. He serves on several PCs and editorial boards in the areas of Grid, Pervasive, and Energy-aware computing. His research focuses on trust and reputation systems, cache and replica management, and energy-aware distributed systems (in particular sensing, job placement and scheduling, green networking, autonomic computing, and mathematical modeling). He chairs the EU-funded COST Action IC804 on “Energy Efficiency in Large Scale Distributed Systems” and participates in several national and European projects on energy efficiency.

03.05.2012 - On Supporting Location in Web Querying

Speaker: Prof. Dr. Christian S. Jensen
Host: Michael Böhlen

Abstract

Due in part to the increasingly mobile use of the web and the proliferation of geo-positioning technologies, the web is rapidly acquiring a spatial aspect. Content and users are being augmented with locations that are being used increasingly by a variety of services. The research community is hard at work inventing means of efficiently supporting new spatial querying functionality. Points of interest with a web presence, called spatial web objects, have locations as well as textual descriptions. Spatio-textual queries return such objects that are near a location argument and are relevant to a text argument. An important element in enabling such queries is to be able to rank spatial web objects. Another is to be able to determine the relevance of an object to a query. Yet another is to enable the efficient processing of such queries. The talk covers recent results on spatial web object ranking and spatio-textual querying obtained by the speaker and his colleagues.
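(The abstract describes spatio-textual ranking only in general terms; the following sketch, which is not the speaker's model, shows one common textbook-style way to rank spatial web objects by combining spatial proximity with a crude text-relevance score. All objects, weights and names are invented.)

```python
# Illustrative sketch (not the speaker's actual ranking model): score spatial web
# objects by combining spatial proximity to the query location with a simple
# text-relevance measure, then return the top-k objects.
import math

def spatio_textual_rank(objects, query_loc, query_terms, alpha=0.5, k=3):
    qx, qy = query_loc
    max_dist = max(math.hypot(x - qx, y - qy) for x, y, _ in objects) or 1.0
    scored = []
    for x, y, text in objects:
        spatial = 1.0 - math.hypot(x - qx, y - qy) / max_dist   # 1 = at the query
        terms = text.lower().split()
        textual = sum(t in terms for t in query_terms) / len(query_terms)
        scored.append((alpha * spatial + (1 - alpha) * textual, text))
    return sorted(scored, reverse=True)[:k]

pois = [(0.1, 0.2, "italian restaurant pizza"),
        (2.0, 1.5, "pizza takeaway"),
        (0.2, 0.1, "shoe shop")]
print(spatio_textual_rank(pois, query_loc=(0.0, 0.0), query_terms=["pizza"]))
```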

Bio

Prof. Dr. Christian S. Jensen is a Professor of Computer Science at Aarhus University, Denmark, and he was previously with Aalborg University for two decades. His research concerns data management and data-intensive systems, and it concerns primarily temporal and spatio-temporal data management. He is an ACM and an IEEE fellow, he is a member of the Royal Danish Academy of Sciences and Letters, the Danish Academy of Technical Sciences, and the EDBT Endowment, and he is a trustee emeritus of the VLDB Endowment. He has received several national and international awards for his research. He is vice-chair of ACM SIGMOD and an editor-in-chief of The VLDB Journal, and he has served on the editorial boards of ACM TODS, IEEE TKDE, and the IEEE Data Engineering Bulletin. He was PC chair or co-chair for SSTD 2001, EDBT 2002, VLDB 2005, MobiDE 2006, MDM 2007, DMSN 2008, TIME 2008, and ACM SIGSPATIAL GIS 2011. He is PC co-chair for APWeb 2012 and IEEE ICDE 2013. 

24.05.2012 - A Random Graph Model of Multi-Hospital Kidney Exchanges

Speaker: Prof. Dr. David C. Parkes
Host: Sven Seuken

Abstract

In kidney exchanges, hospitals share patient lists and patient-donor pairs are matched into cycles or chains. Properties of interest include efficiency, incentive compatibility (reporting all pairs is in the interest of a hospital) and participation constraints. Adopting a random graph model and considering 2-way and 3-way cycles, we characterize the structure of the efficient outcome, providing a "square root" law to describe system benefits of a central exchange. A matching mechanism is described that is efficient and Bayes-Nash incentive compatible under idealized assumptions, while also providing robust efficiency and incentive alignment in realistic models. 
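(The formal model is in the underlying paper; as a toy illustration only, the sketch below draws a random directed compatibility graph over patient-donor pairs and greedily matches 2-way exchange cycles. The pool size and compatibility probability are invented and carry no clinical meaning.)

```python
# Toy illustration (not the paper's model): a random directed compatibility graph
# over patient-donor pairs, with 2-way cycles matched greedily (pair i's donor fits
# pair j's patient and vice versa).
import random

def simulate(n_pairs, p_compat, seed=0):
    rng = random.Random(seed)
    # compat[i][j] is True if pair i's donor is compatible with pair j's patient.
    compat = [[i != j and rng.random() < p_compat for j in range(n_pairs)]
              for i in range(n_pairs)]
    matched = set()
    for i in range(n_pairs):
        if i in matched:
            continue
        for j in range(i + 1, n_pairs):
            if j not in matched and compat[i][j] and compat[j][i]:
                matched.update((i, j))         # one 2-way exchange cycle
                break
    return len(matched)                        # pairs transplanted via 2-way cycles

print(simulate(n_pairs=200, p_compat=0.05))
```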

Bio

Prof. Dr. David C. Parkes is Gordon McKay Professor of Computer Science in the School of Engineering and Applied Sciences at Harvard University and currently a Distinguished Visiting Scholar at Christ's College, Cambridge. He was the recipient of the NSF CAREER Award, the Alfred P. Sloan Fellowship, the Thouron Scholarship, and the Harvard University Roslyn Abramson Award for Teaching, and was named one of the Harvard Class of 2010's Favorite Professors. D. Parkes received his Ph.D. degree in Computer and Information Science from the University of Pennsylvania in 2001, and an M.Eng. (First class) in Engineering and Computing Science from Oxford University in 1995. At Harvard, D. Parkes founded the Economics and Computer Science research group and teaches classes in artificial intelligence, machine learning, optimization, multi-agent systems, and topics at the intersection between computer science and economics. D. Parkes chairs the ACM SIG on Electronic Commerce and has served as Program Chair of ACM EC'07 and AAMAS'08 and General Chair of ACM EC'10, served on the editorial board of the Journal of Artificial Intelligence Research, and currently serves as an Editor of Games and Economic Behavior and on the boards of the Journal of Autonomous Agents and Multi-agent Systems, ACM Transactions on Economics and Computation, and the INFORMS Journal on Computing. His research interests include computational mechanism design, electronic commerce, stochastic optimization, preference elicitation, market design, bounded rationality, computational social choice, networks and incentives, multi-agent systems, crowd-sourcing and social computing.

31.05.2012 - IT-enabled Collective Intelligence in Innovation Communities

Speaker: Prof. Dr. Jan Marco Leimeister
Host: Gerd Schwabe

Abstract

The increasing popularity of open innovation approaches has led to the rise of various innovation platforms on the Internet, which may contain tens of thousands of user-generated ideas. However, a company’s absorptive capacity for such an amount of ideas is limited, so there is a strong need for mechanisms to identify the best ideas. Extending previous decision management research, the talk focuses on analyzing effective idea rating and selection mechanisms in online innovation communities and the explanations underlying them. Open innovation research also shows that most innovations are the result of intensive collaboration. Thus, fostering collaboration among idea contributors might be a fruitful approach for unleashing the community’s entire creative potential and making innovation communities even more successful. This talk also shows how collaboration in Innovation Communities can be supported by IT applications and reports on the results of various pilot studies in real-world settings. The results show that user collaboration can enhance idea quality and that supporting and inducing user collaboration is a viable design element for making innovation communities more effective. This can contribute to a more successful design, implementation and operation of idea competitions, as well as to better outcomes. In closing, I outline future avenues for leveraging collective action and collaborative intelligence in Innovation Communities.

Bio

Prof. Dr. Jan Marco Leimeister is a Full Professor and holds the Chair for Information Systems at Kassel University, Germany. He is director of the IS research center ITeG at Kassel University where he heads research groups on service, collaboration and IT innovation engineering and management, and manages several publicly funded research projects. His teaching and research areas include IT innovation management, service science, ubiquitous and mobile computing, collaboration engineering, and strategic IT management.

21.06.2012 - A Popperian Platform for Programming and Teaching the Global Brain

Speaker: Prof. Dr. Karl Lieberherr
Hosts: Avi Bernstein and Thomas Fritz

Abstract

In a recent article in the Communications of the ACM (May 2012) on Programming the Global Brain, Bernstein et al. make the point that developers of global brain systems need to be societal engineers coordinating societies of many diverse workers. Bederson et al. (2011) address the "remote person call" issue of global brain systems, which concerns exploitation and ethical issues. Over the last five years we have been experimenting with the Scientific Community Game (SCG), which tries to foster innovation in technological domains through good societal engineering, but also learning, so that workers benefit from the interaction with requesters and other workers (reducing exploitation; workers become both students and teachers). In the terminology of Bernstein et al., the SCG is a constraint-based programming platform parameterized by playgrounds that define the detailed constraints for a domain. In the SCG we build knowledge bases of claims that are defended by members of the community against attackers. We use a critical rationalism (= Popperian) approach where each claim is disputable. The refutation protocol, in its simplest form, is: if you produce an x in X and I produce a y in Y, property p(x,y) holds. The successful defenders of claims have good technological know-how (for the given playground and relative to the quality of the other workers). It is this technological know-how which is of interest to the requesters, and it is transferred to them as software, as heuristic descriptions, or by hiring the successful workers. I will introduce the rules of the SCG and its playgrounds, which are inhabited by workers or avatars (produced by workers). Interesting specializations of the SCG include the Quantifier Game from logic and the Renaissance Game from the 16th century. I will report on our successes and failures in creating games that produce innovations and learning (I have written a playground designer's guide that helps designers to be good societal engineers). Our most successful instantiation of the Scientific Community Game was through the learning tool piazza.com in an Algorithms class with 35 undergraduates. Supported by Novartis. Joint work with Ahmed Abdelmeged. SCG-Publications.
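(The abstract states the refutation protocol only in its simplest form: "if you produce an x in X and I produce a y in Y, property p(x,y) holds". The toy sketch below plays one round of such a protocol; the concrete claim, sets and player strategies are invented for illustration and are not part of the SCG itself.)

```python
# Toy sketch of the quoted refutation protocol (claim and strategies are invented):
# the claim is "for every x the refuter picks from X, the defender can produce a y
# such that p(x, y) holds".
import random

X = range(1, 100)                  # the refuter's set of possible challenges

def p(x, y):                       # the claimed property p(x, y)
    return x + y == 100

def defender(x):                   # the defender's strategy: produce a witness y
    return 100 - x

def play_round(seed=0):
    x = random.Random(seed).choice(list(X))   # refuter challenges with some x in X
    y = defender(x)                           # defender answers with a y
    return "claim defended" if p(x, y) else "claim refuted"

print(play_round())
```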

Bio

Karl Lieberherr started his research career in computer science as a theoretical computer scientist, focusing on the theory of P-optimal algorithms for the generalized maximum satisfiability problem (MAX-CSP), still an active area of research. This work has motivated the development of a game platform for refutation-based, constructive scientific domains, called the Scientific Community Game (SCG), also known as the Specker Challenge Game, named after former ETH Professor Ernst Specker. He also invented, independently and simultaneously on the other side of the Atlantic (at ETH Zurich), an early form of non-chronological backtracking based on learned clauses (superresolution), which has become a key feature of most state-of-the-art SAT and CSP solvers. In the mid 1980s, he switched to his current research area, Object-Oriented and Aspect-Oriented Software Development, and focused on issues of software design and modularity. He founded the Demeter research team, which studied the then-novel idea of Adaptive Programming, also known as structure-shy programming, and produced the Law of Demeter ("talk only to your friends": an explicit form of coupling control) and several systems for separating concerns in an object-oriented and functional programming context, from Demeter/Flavors to DemeterF.