I am tracing software execution; am I tracing too much?
Supervisor: Yvan Labiche
Team size: Minimum 3, Maximum 5
| CSE | SE | Comm | Biomed | EE | Aero | Special |
|---|---|---|---|---|---|---|
| No | Yes | No | No | No | No | No |
Description
One mechanism often used to improve our understanding of an existing piece of software is tracing. To trace, one instruments the code (for instance, the source code) by adding statements that, when executed, generate trace entries indicating where in the program a specific execution went. With multiple traces, one can better understand the behaviour of the software and estimate its performance.
There are, however, limits to what we can instrument. If we instrument too much, i.e., add too many tracing statements, we risk changing the behaviour of the traced software to the extent that the traces are no longer valid: the trace does not represent what the software would have executed without instrumentation. Suppose, for instance, that you wish to better understand a producer-consumer implementation in which two threads communicate through a buffer and are expected to produce and consume under certain timing conditions, i.e., deadlines. If there is too much instrumentation, the consumer may be slowed down by the tracing statements added to its code (instrumentation overhead) and end up missing its deadline: it does not consume what the producer produces fast enough. The same execution without tracing statements, however, may not miss the deadline. As a consequence, what we observe with instrumentation is wrong.
It is therefore important to understand precisely how the different elements of the tracing approach contribute to execution time. Does collecting execution information take too long? Does processing this information into a format from which a trace can be produced take too long? Should trace data be collected on the same machine as the instrumented software, or is it better to trace in a distributed way?
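The producer-consumer scenario above can be made concrete with a small simulated-time model. This is only a sketch under stated assumptions, not a measurement and not part of the project material: the class `DeadlineModel`, its parameters, and all numbers are hypothetical, time is counted in abstract units, and the producer is assumed to emit one item per fixed period.

```java
// Hypothetical simulated-time model; names and numbers are illustrative assumptions.
class DeadlineModel {

    /**
     * Counts the items whose consumption completes more than `deadline`
     * time units after they were produced.
     */
    static int missedDeadlines(int items, long period, long work, long traceCost, long deadline) {
        long consumerFreeAt = 0; // simulated time at which the consumer becomes idle
        int missed = 0;
        for (int i = 0; i < items; i++) {
            long producedAt = i * period;                      // producer emits one item per period
            long start = Math.max(producedAt, consumerFreeAt); // wait for the item or for the consumer
            long finish = start + work + traceCost;            // tracing adds to the service time
            if (finish - producedAt > deadline) {
                missed++;
            }
            consumerFreeAt = finish;
        }
        return missed;
    }

    public static void main(String[] args) {
        int items = 10;
        long period = 100, work = 80, deadline = 100;
        // Same workload, three assumed per-item tracing costs.
        for (long traceCost : new long[] {0, 15, 30}) {
            System.out.println("traceCost=" + traceCost
                    + " missed=" + missedDeadlines(items, period, work, traceCost, deadline));
        }
    }
}
```

In this model, no deadline is missed with a tracing cost of 0 or 15 units, while a cost of 30 units pushes the service time past the production period and all ten items are late. The instrumented run then reports a failure that the uninstrumented run would not exhibit.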
The project is to measure alternative instrumentation strategies for Java, using Aspect-Oriented Programming (with AspectJ), and to use statistical analysis to understand which elements of tracing need to be optimized. Some elements of a solution for experimental evaluation will be available.
What you will learn/discover:
- alternative designs for a given objective, some more optimized than others, with differing performance results.
- aspect-oriented programming and its realization in Java with AspectJ.
- experimental setup, measurement, and control.
- statistical analysis of data (data science) in the context of an experiment.
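As a rough illustration of the statistical side, the sketch below compares two sets of execution times using the sample mean, the unbiased sample variance, and Welch's t statistic. It is an assumption about one possible analysis, not the method prescribed by the project: the class `OverheadStats` is hypothetical and the timing values are made up.

```java
import java.util.Arrays;

// Hypothetical helper for comparing timings; not part of the project material.
class OverheadStats {

    static double mean(double[] xs) {
        return Arrays.stream(xs).average().orElse(Double.NaN);
    }

    // Unbiased sample variance (divides by n - 1).
    static double variance(double[] xs) {
        double m = mean(xs);
        double sumSquares = 0.0;
        for (double x : xs) {
            sumSquares += (x - m) * (x - m);
        }
        return sumSquares / (xs.length - 1);
    }

    // Welch's t statistic for two independent samples with possibly unequal variances.
    static double welchT(double[] a, double[] b) {
        double standardError = Math.sqrt(variance(a) / a.length + variance(b) / b.length);
        return (mean(a) - mean(b)) / standardError;
    }

    public static void main(String[] args) {
        // Made-up execution times in milliseconds, for illustration only.
        double[] untraced = {10.0, 11.0, 9.0, 10.0, 10.0};
        double[] traced = {12.0, 13.0, 11.0, 12.0, 12.0};
        System.out.println("mean overhead (ms) = " + (mean(traced) - mean(untraced)));
        System.out.println("Welch t = " + welchT(traced, untraced));
    }
}
```

A large absolute t value suggests that the difference between the two configurations is unlikely to be noise alone. A real experiment would also need enough repetitions, JVM warm-up, and a controlled environment before drawing conclusions.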
Prerequisites:
None, other than being registered in the course.
Keywords:
software instrumentation, aspect-oriented programming, experimentation