LFC: LogFormatChecker

Introduction

Logs are used to drive analytics and problem diagnosis in complex distributed systems such as cloud infrastructures [1]. Runtime metrics are often extracted from log statements in software systems. A whole industry has grown up around logs (log aggregators, log analysers, log storage, etc.), with prominent open source software like Logstash and commercial software like Splunk, Loggly, and so on. However, to turn log statements into data for analytics tools (Logentries, IBM Operations Analytics) and graphing frontends (Kibana, Graphing), these statements must conform to a certain (string) format that conveys a certain data structure. Some log analytics tools rely on conventions to define these formats, while others provide their own configuration language to extract key-value pairs or other data structures for analytics.
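To make the idea of a log statement that "conveys a data structure" concrete, the following is a minimal sketch in Python. The convention it assumes (a level, an event name, then key=value pairs) and the names KV_PATTERN and parse_log_line are hypothetical and chosen only for illustration; they are not taken from any of the tools mentioned above.

```python
import re

# Hypothetical convention: "<LEVEL> <event> key=value key=value ..."
# Values may be bare tokens or double-quoted strings.
KV_PATTERN = re.compile(r'(\w+)=("[^"]*"|\S+)')
LEVELS = {"DEBUG", "INFO", "WARN", "ERROR"}


def parse_log_line(line):
    """Return (level, event, fields), or None if the line breaks the convention."""
    parts = line.strip().split(None, 2)
    if len(parts) < 2 or parts[0] not in LEVELS:
        return None
    level, event = parts[0], parts[1]
    rest = parts[2] if len(parts) > 2 else ""
    fields = {key: value.strip('"') for key, value in KV_PATTERN.findall(rest)}
    # Anything left over after removing key=value pairs is free text,
    # which this assumed convention does not allow.
    if KV_PATTERN.sub("", rest).strip():
        return None
    return level, event, fields
```

Under this assumed convention, a line such as `INFO request_done status=200 duration_ms=35` yields structured fields, whereas a free-text line such as `INFO request done in 35 ms` is rejected and would not be usable by a downstream analytics tool.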

Goals of this master project

Within the master project, students should design, build, and evaluate tooling that mitigates the issues arising when log statements do not conform to the expected format, in order to support the analytics process.

LFC (LogFormatChecker) is a tool that enforces log statement formats in accordance with the conventions and configuration of a central logging solution. There are different ways this kind of enforcement can take place. In the past, software engineers have used hooks in their workflow (e.g., within their IDEs, in Continuous Integration (Jenkins, Codeship, Travis-CI, etc.), or Git Hooks) to enforce rules (i.e., QA) around their code. Evaluating and deciding which option makes the most sense in the context of log format recommendations is part of the master project.

Task description

The main tasks of this project are:

  • Ability to pull configuration from standard log management solutions (at least Logstash)
  • Learn common log formats from the codebase through mining software repositories
  • CI Plugin that makes the build fail or sends out warnings when log statements are not properly formatted
  • Git Pre-Commit Hook that warns developers when their code still contains log statements that are intended for development rather than production
  • Writing a report summarizing the results from the described work

(*) The scope depends on the number of students
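As a rough illustration of what the checking step shared by the CI plugin and the pre-commit hook could look like, here is a minimal sketch in Python. Everything in it is an assumption made for the example: the shape of the logging calls (logger.info("...") or log.debug('...')), the message format (an event name followed by key=value pairs), and the names LOG_CALL, MESSAGE_FORMAT, and check_source. In the actual project, the format would come from the logging solution's configuration or be learned from the codebase.

```python
import re

# Hypothetical: matches calls such as logger.info("...") or log.debug('...').
LOG_CALL = re.compile(
    r'\b(?:logger|log)\.(debug|info|warn|warning|error)\(\s*(["\'])(.*?)\2'
)
# Hypothetical format: an event name followed by zero or more key=value pairs.
MESSAGE_FORMAT = re.compile(r'^\w+(?: \w+=\S+)*$')


def check_source(lines):
    """Return a list of (line_number, problem) tuples for the given source lines."""
    problems = []
    for number, line in enumerate(lines, start=1):
        match = LOG_CALL.search(line)
        if match is None:
            continue
        level, message = match.group(1), match.group(3)
        if level == "debug":
            # Assumption: debug-level statements count as development-only.
            problems.append((number, "development-only log statement"))
        if not MESSAGE_FORMAT.match(message):
            problems.append((number, "message does not follow the assumed format"))
    return problems
```

A hook or CI step built on this sketch could run check_source over each changed file and print warnings, or exit with a non-zero status to fail the build, whenever the returned list is not empty. A regex over source lines is only one possible approach; it would miss multi-line calls and messages built from variables.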

References

  • [1] "Ditch the Debugger and Use Log Analysis Instead", [Link]

Posted: 04.11.2015

Contact: Jürgen Cito