This is an overview of some recent additions to the SimGrid code related to actor synchronisation. It might be interesting for people using SimGrid, working on SimGrid or for people interested in generic C++ code for synchronisation or asynchronicity.
Debugging use-after-free with RR reverse execution
Published:
RR is a very useful tool for debugging. It can record the execution of a program and then replay the exact same execution at will inside a debugger. One very useful extra power available since 4.0 is the support for efficient reverse execution which can be used to find the root cause of a bug in your program by rewinding time. In this example, we reverse-execute a program from a case of use-after-free in order to find where the block of memory was freed.
In my previous SimGrid post, I talked about different solutions for a better isolation between the model-checked application and the model-checker. We chose to avoid the (hackery) solution based multiple dynamic-linker namespaces in the same process and use a more conventional process-based isolation.
In an attempt to simplify the development around the SimGrid model-checker, we were thinking about moving the model-checker out in a different process. Another different approach would be to use a dynamic-linker isolation of the different components of the process. Here is a summary of the goals, problems and design issues surrounding these topics.
In two previous posts, I looked into cleaning the stack frame of a function before using it by adding assembly at the beginning of each function. This was done either by modifying LLVM with a custom codegen pass or by rewriting the assembly between the compiler and the assembler. The current implementation adds a loop at the beginning of every function. We look at the impact of this modification on the performance on the application.
In order to help the SimGridMC state comparison code, I wrote a proof-of-concept LLVM pass which cleans each stack frame before using it. However, SimGridMC currently does not work properly when compiled with clang/LLVM. We can do the same thing by pre-processing the assembly generated by the compiler before passing it to the linker: this is done by inserting a script between the compiler and the assembler. This script will rewrite the generated assembly by prepending stack-cleaning code at the beginning of each function.
In the previous episode, we implemented a LLVM pass which does nothing. Now we are trying to modify this to create a (proof-of-concept) LLVM pass which fills the current stack frame with zero before using it.
The SimGrid model checker uses memory introspection (of the heap, stack and global variables) in order to detect the equality of the state of a distributed application at the different nodes of its execution graph. One difficulty is to deal with uninitialised variables. The uninitialised global variables are usually not a big problem as their initial value is 0. The heap variables are dealt with by memseting to 0 the content of the buffers returned by malloc and friends. The case of uninitialised stack variables is more problematic as their value is whatever was at this place on the stack before. In order to evaluate the impact of those uninitialised variables, we would like to clean each stack frame before using them. This could be done with a LLVM plugin. Here is my first attempt to write a LLVM pass to modify the code of a function.
In the previous episode, I talked about the implementation of a same-page-merging page store. On top of this, we can build same-page-merging snapshots for the SimGrid model checker.