Cleaning the stack by filtering the assembly

Published: Oct 6 2014

Updated: Oct 6 2014

In order to help the SimGridMC state comparison code, I wrote a proof-of-concept LLVM pass which cleans each stack frame before using it. However, SimGridMC currently does not work properly when compiled with clang/LLVM. We can do the same thing by pre-processing the assembly generated by the compiler before passing it to the assembler: this is done by inserting a script between the compiler and the assembler. This script rewrites the generated assembly by prepending stack-cleaning code at the beginning of each function.

Table of contents

Summary
Assembly rewriting script
Assembler wrapper
Compiler wrapper
Result

Summary

In a typical compilation process, the compiler (here cc1) reads the input source file and generates assembly. This assembly is then passed to the assembler (as), which generates native binary code:

cat foo.c | cc1  | as      > foo.o
#         ↑      ↑         ↑
#         Source Assembly  Native

We can achieve our goal without depending on LLVM by adding a simple assembly-rewriting script to this pipeline, between the compiler and the assembler:

cat foo.c | cc1  | clean-stack-filter | as     > foo.o
#         ↑      ↑                    ↑        ↑
#         Source Assembly             Assembly Native

By doing this, our modification can be used with any compiler, as long as it sends assembly to an external assembler instead of generating the native binary code directly.

This will be done in three components:

the assembly rewriting script (clean-stack-filter);
an assembler (as) wrapper which calls the assembly rewriting script before delegating to the real assembler;
a compiler wrapper (cc) which calls the real compiler program and configures it to call our assembler wrapper.

Assembly rewriting script

The first step is to write a simple UNIX filter which takes the assembly code of a source file as input and outputs the same assembly with a stack-cleaning pre-prolog added to each function.

Here is the generated assembly for the test function of the previous episode (compiled with GCC):

main:
.LFB0:
	.cfi_startproc
	subq	$40, %rsp
	.cfi_def_cfa_offset 48
	movl	%edi, 12(%rsp)
	movq	%rsi, (%rsp)
	movl	$42, 28(%rsp)
	movl	$0, %eax
	call	f
	movl	$0, %eax
	addq	$40, %rsp
	.cfi_def_cfa_offset 8
	ret
	.cfi_endproc

We can use .cfi_startproc to find the beginning of a function, and each pushq and subq $x, %rsp instruction to estimate the stack size used by this function (excluding the red zone and alloca(), as previously). Each time we see the beginning of a function, we need to buffer its lines until we know the stack size and are ready to emit the stack-cleaning code. The script below stops scanning at the first subq on %rsp, or at .cfi_endproc if there is none.

#!/usr/bin/perl -w
# Transform assembly in order to clean each stack frame for X86_64.

use strict;
$SIG{__WARN__} = sub { die @_ };

# Whether we are still scanning the content of a function:
our $scanproc = 0;

# Save lines of the function:
our $lines = "";

# Size of the stack for this function:
our $size = 0;

# Counter for assigning unique ids to labels:
our $id=0;

sub emit_code {
    my $qsize = $size / 8;
    my $offset = - $size - 8;

    if($size != 0) {
      print("\tmovabsq \$$qsize, %r11\n");
      print(".Lstack_cleaner_loop$id:\n");
      print("\tmovq    \$0, $offset(%rsp,%r11,8)\n");
      print("\tsubq    \$1, %r11\n");
      print("\tjne     .Lstack_cleaner_loop$id\n");
    }

    print $lines;

    $id = $id + 1;
    $size = 0;
    $lines = "";
    $scanproc = 0;
}

while (<>) {
  if ($scanproc) {
      # Inside a function: buffer the line until the frame size is known.
      $lines = $lines . $_;
      if (m/^[ \t]*\.cfi_endproc$/) {
          # No subq found: emit with the size accumulated from pushq only.
          emit_code();
      } elsif (m/^[ \t]*pushq/) {
          $size += 8;
      } elsif (m/^[ \t]*subq[ \t]+\$([0-9]+),[ \t]*%rsp$/) {
          my $val = $1;
          # A leading 0 denotes an octal literal:
          $val = oct($val) if $val =~ /^0/;
          $size += $val;
          emit_code();
      }
  } elsif (m/^[ \t]*\.cfi_startproc$/) {
      print $_;

      $scanproc = 1;
  } else {
      print $_;
  }
}

This is used as:

# Use either of:
clean-stack-filter < helloworld.s
gcc -o- -S helloworld.c | clean-stack-filter | gcc -x assembler -r -o helloworld -

And this produces:

main:
.LFB0:
	.cfi_startproc
	movabsq $5, %r11
.Lstack_cleaner_loop0:
	movq    $0, -48(%rsp,%r11,8)
	subq    $1, %r11
	jne     .Lstack_cleaner_loop0
	subq	$40, %rsp
	.cfi_def_cfa_offset 48
	movl	%edi, 12(%rsp)
	movq	%rsi, (%rsp)
	movl	$42, 28(%rsp)
	movl	$0, %eax
	call	f
	movl	$0, %eax
	addq	$40, %rsp
	.cfi_def_cfa_offset 8
	ret
	.cfi_endproc
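
As a sanity check on the constants above, here is a small Ruby sketch (not part of the actual tooling) which redoes the computation by hand. It mirrors the filter's scan on a shortened copy of the listing, then recomputes the values printed by emit_code. The helper names estimate_frame_size and cleaner_plan are made up for this illustration, and the sketch only handles decimal immediates.

```ruby
# Hypothetical helper mirroring the filter's scan: estimate the frame size
# of the first function found in an assembly listing (decimal values only).
def estimate_frame_size(asm)
  size = 0
  in_proc = false
  asm.each_line do |line|
    if !in_proc
      in_proc = true if line =~ /^[ \t]*\.cfi_startproc$/
    elsif line =~ /^[ \t]*pushq/
      size += 8
    elsif line =~ /^[ \t]*subq[ \t]+\$([0-9]+),[ \t]*%rsp$/
      # Like the filter, stop at the first stack adjustment.
      return size + $1.to_i
    elsif line =~ /^[ \t]*\.cfi_endproc$/
      break
    end
  end
  size
end

# Hypothetical helper recomputing the constants emitted by emit_code and
# the %rsp-relative offsets written by the cleaning loop.
def cleaner_plan(size)
  qsize = size / 8      # number of 8-byte words to clear
  offset = -size - 8    # displacement used by the movq instruction
  # %r11 counts down from qsize to 1; each iteration writes one word at
  # offset + 8 * r11 relative to %rsp.
  cleared = qsize.downto(1).map { |r11| offset + 8 * r11 }
  { qsize: qsize, offset: offset, cleared: cleared }
end

# Shortened copy of the listing for main shown earlier.
listing = [
  "main:",
  "\t.cfi_startproc",
  "\tsubq\t$40, %rsp",
  "\tmovl\t$42, 28(%rsp)",
  "\tret",
  "\t.cfi_endproc",
].join("\n") + "\n"

size = estimate_frame_size(listing)
plan = cleaner_plan(size)
puts "size=#{size} qsize=#{plan[:qsize]} offset=#{plan[:offset]}"
puts "cleared offsets: #{plan[:cleared].join(', ')}"
```

For the 40-byte frame of main, this gives a count of 5 words and a displacement of -48, so the loop zeroes the five words from -8(%rsp) down to -40(%rsp): exactly the area the function then claims with subq $40, %rsp.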

Assembler wrapper

The second step is to write a wrapper around the as assembler which accepts an extra argument, --filter my_shell_command. We could hardcode the filtering script in this wrapper, but a generic assembler wrapper might be reused elsewhere.

We need to:

interpret a part of the as command-line arguments and our extra argument;
apply the specified filter on the input assembly;
pass the resulting assembly to the real assembler.

#!/usr/bin/ruby
# Wrapper around the real `as` which adds filtering capabilities.

require "tempfile"
require "fileutils"

def wrapped_as(argv)

  args=[]
  input=nil
  as="as"
  filter="cat"

  i = 0
  while i<argv.size
    case argv[i]
    
    when "--as"
      as = argv[i+1]
      i = i + 1
    when "--filter"
      filter = argv[i+1]
      i = i + 1

    when "-o", "-I"
      args.push(argv[i])
      args.push(argv[i+1])
      i = i + 1
    when /^-/
      args.push(argv[i])
    else
      if input
        exit 1
      else
        input = argv[i]
      end
    end
    i = i + 1
  end

  if input==nil
    # We don't handle input from a pipe yet:
    exit 1
  end

  # Generate temp file
  tempfile = Tempfile.new("as-filter")
  unless system(filter, 0 => input, 1 => tempfile)
    status=$?.exitstatus
    FileUtils.rm tempfile
    exit status
  end
  args.push(tempfile.path)

  # Call the real assembler:
  res = system(as, *args)
  status = if res != nil
             $?.exitstatus
           else
             1
           end
  FileUtils.rm tempfile
  exit status
  
end

wrapped_as(ARGV)

This is used like this:

tools/as --filter "sed s/world/abcde/" helloworld.s

We now can ask the compiler to use our assembler wrapper instead of the real system assembler:

the -B switch prepends a directory to the list of directories used to find subprograms such as as;
for clang, the -no-integrated-as flag forces the compiler to pass the generated assembly to an external assembler instead of generating native binary code directly.

gcc -B tools/ -Wa,--filter,'sed s/world/abcde/' \
  helloworld.c -o helloworld-modified-gcc

clang -no-integrated-as -B tools/ -Wa,--filter,'sed s/world/abcde/' \
  helloworld.c -o helloworld-modified-clang

Which produces:

$ ./helloworld
Hello world!
$ ./helloworld-modified-gcc
Hello abcde!
$ ./helloworld-modified-clang
Hello abcde!
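
To see why this works, it helps to follow the arguments. According to the GCC documentation, a -Wa option is split at the commas and each piece is passed to the assembler as a separate argument. The Ruby sketch below imitates that split and then a simplified version of the wrapper's argument loop. The function names and the sample command line (--64, helloworld.o, helloworld.s) are illustrative assumptions: the real driver passes its own options and temporary file names.

```ruby
# Imitate how the compiler driver splits a -Wa option at the commas.
def split_wa(option)
  option.sub(/\A-Wa,/, "").split(",")
end

# Hypothetical, simplified version of the wrapper's argument loop: separate
# the wrapper-only options from what is forwarded to the real assembler.
def partition_as_args(argv)
  opts = { as: "as", filter: "cat", input: nil }
  forwarded = []
  i = 0
  while i < argv.size
    case argv[i]
    when "--as"
      opts[:as] = argv[i + 1]
      i += 1
    when "--filter"
      opts[:filter] = argv[i + 1]
      i += 1
    when "-o", "-I"
      # Options taking a value are forwarded together with that value.
      forwarded.push(argv[i], argv[i + 1])
      i += 1
    when /^-/
      forwarded.push(argv[i])
    else
      opts[:input] = argv[i]
    end
    i += 1
  end
  [opts, forwarded]
end

# Made-up example of what the wrapper might receive from the driver.
sample_argv = ["--64"] +
              split_wa("-Wa,--filter,sed s/world/abcde/") +
              ["-o", "helloworld.o", "helloworld.s"]

opts, forwarded = partition_as_args(sample_argv)
puts "filter:    #{opts[:filter]}"
puts "input:     #{opts[:input]}"
puts "forwarded: #{forwarded.inspect}"
```

Only the --filter pair is consumed by the wrapper; everything else reaches the real assembler untouched. One consequence of the comma splitting is that a filter command which itself contains a comma would be split as well, so such a filter is probably better wrapped in a small script.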

By combining the two tools, we can get a compiler with stack-cleaning enabled:

gcc -B tools/ -Wa,--filter,'clean-stack-filter' \
  helloworld.c -o helloworld

Compiler wrapper

Now we can write compiler wrappers which do this job automatically:

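#!/bin/sh
# C compiler wrapper: use our as wrapper with the stack-cleaning filter.
path=$(dirname "$0")
exec gcc -B "$path" -Wa,--filter,"$path"/clean-stack-filter "$@"

#!/bin/sh
# Same wrapper for the C++ compiler.
path=$(dirname "$0")
exec g++ -B "$path" -Wa,--filter,"$path"/clean-stack-filter "$@"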

Warning

As the assembly modification is implemented in the as wrapper, this compiler wrapper will output the unmodified assembly when using cc -S, which may be surprising. You need to objdump the .o file in order to see the effect of the filter.

Result

The whole test suite of SimGrid with model-checking works with this implementation. The next step is to see the impact of this modification on the state comparison of SimGridMC.