{"version": "https://jsonfeed.org/version/1", "title": "/dev/posts/ - Tag index - cms", "home_page_url": "https://www.gabriel.urdhr.fr", "feed_url": "/tags/cms/feed.json", "items": [{"id": "http://www.gabriel.urdhr.fr/2014/09/25/filtering-the-clipboard/", "title": "Filtering the clipboard using UNIX filters", "url": "https://www.gabriel.urdhr.fr/2014/09/25/filtering-the-clipboard/", "date_published": "2014-09-25T00:00:00+02:00", "date_modified": "2014-09-25T00:00:00+02:00", "tags": ["computer", "x11", "unix", "cms", "html"], "content_html": "
I had a few Joomla posts that I wanted to clean up semi-automatically.\nHere are a few scripts to pass the content of the clipboard (or the\ncurrent selection) through a UNIX filter.
\nCleaning up the (HTML) content of the posts was quite time-consuming\nand very repetitive:
\nremoving style attributes (hardcoded fonts, etc.);
\nsplitting <p> containing <br/> into different <p>s;
\nremoving empty <p>s.
\nMost of the job could be done by a script\n(cleanup_html):
#!/usr/bin/env ruby\n# Remove some crap from HTML snippets.\n\nrequire \"nokogiri\"\n\nif (ARGV[0])\n html = File.read(ARGV[0])\nelse\n html = $stdin.read\nend\ndoc = Nokogiri::HTML::DocumentFragment.parse html\n\n# Remove 'style':\ndoc.css(\"*[style]\").each do |node|\n style = node.attribute(\"style\")\n node.remove_attribute(\"style\")\n $stderr.puts \"Removed style: #{style}\\n\"\nend\n\n# Remove useless span:\ndoc.css(\"span\").each do |span|\n $stderr.puts \"Unwrapping span: #{span}\\n\"\n span.children.each do |x|\n span.before(x)\n end\n span.remove\nend\n\n# Split paragraphs on <br/>:\ndoc.css(\"p > br\").each do |br|\n p = br.parent\n\n # Move the nodes preceding the <br/> into a new paragraph:\n new_p = p.document.create_element(\"p\")\n p.children.take_while{ |x| x!=br }.each do |x|\n new_p.add_child x\n end\n p.before(new_p)\n\n br.remove\nend\n\n# Remove empty paragraphs:\ndoc.css(\"p\").each do |node|\n if node.element_children.empty? && /\\A *\\z/.match(node.inner_text)\n node.remove\n end\nend\n\nprint doc.to_html\n
\nI wanted to do a semi-automatic update in order to have feedback on\nwhat was happening and fix the remaining issues straightaway. To do\nthis, the filter can be applied to the X11 clipboard:
\n#!/bin/sh\nxclip -out -selection clipboard | cleanup_html | xclip -in -selection clipboard\n
\nIt is even possible to do it on the current selection:
\n#!/bin/sh\nsleep 0.1\nxdotool key control+c\nsleep 0.1\nxclip -out -selection clipboard | cleanup_html | xclip -in -selection clipboard\nxdotool key control+v\n
\nThis second script is quite hackish but it kind of works:
\nit only works with applications using Control-c and Control-v for copy/paste;
\nthe sleep calls are needed.
\nThis can be generalized with this script (gui_filter):
#!/bin/sh\n\nmode=\"$1\"\nshift\n\ncase \"$mode\" in\n primary | secondary | clipboard)\n xclip -out -selection \"$mode\" | command \"$@\" | xclip -in -selection \"$mode\"\n ;;\n selection)\n # This is a horrible hack.\n # It only works for C-c/C-v keybindings.\n sleep 0.1\n xdotool key control+c\n sleep 0.1\n xclip -out -selection clipboard | command \"$@\" | xclip -in -selection clipboard\n xdotool key control+v\n ;;\nesac\n
\nCalled with:
\n# Clean the HTML markup in the clipboard:\ngui_filter clipboard cleanup_html\n\n# Base-64 encode the current selection:\ngui_filter selection base64\n\n# Base-64 decode the current selection:\ngui_filter selection base64 -d\n
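\nThe core of gui_filter is simply feeding some text to the standard input of a command and collecting its standard output. As a minimal sketch of that idea only (it does not touch the X11 clipboard), the same step could be written in Python. The names run_filter and UPPERCASE_FILTER are hypothetical and only used for illustration; the stand-in filter reuses the running Python interpreter so that the example does not depend on any external tool.

```python
import subprocess
import sys

# Hypothetical stand-in for a UNIX filter: a tiny Python one-liner that
# upper-cases whatever it reads on its standard input.
UPPERCASE_FILTER = [
    sys.executable,
    '-c',
    'import sys; sys.stdout.write(sys.stdin.read().upper())',
]


def run_filter(text, command):
    # Send the text to the standard input of the command and
    # return what the command wrote on its standard output.
    result = subprocess.run(
        command,
        input=text,
        capture_output=True,
        text=True,
        check=True,  # raise an error if the filter exits with a failure
    )
    return result.stdout


if __name__ == '__main__':
    print(run_filter('hello', UPPERCASE_FILTER))
```

Any real filter (base64, sort, and so on) could be passed as the command list instead, which is roughly what gui_filter does with the xclip output.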
\nNow we can bind this command to a temporary global hotkey with this\nscript based on the keybinder library:
\n#!/usr/bin/env python\n# Bind a global hotkey to a given command.\n# Examples:\n# keybinder '<Ctrl>e' gui_filter selection base64\n# keybinder '<Ctrl>X' xterm\n\nimport sys\nimport gi\nimport os\nimport signal\n\ngi.require_version('Gtk', '3.0')\ngi.require_version('Keybinder', '3.0')\nfrom gi.repository import Keybinder\nfrom gi.repository import Gtk\n\ndef callback(x):\n os.spawnvp(os.P_NOWAIT, sys.argv[2], sys.argv[2:])\n\nsignal.signal(signal.SIGINT, signal.SIG_DFL)\nGtk.init()\nKeybinder.init()\nKeybinder.bind(sys.argv[1], callback)\nGtk.main()\n
\nThe hotkey is active as long as the keybinder
process is not killed.
keybinder '<Ctrl>e' gui_filter selection cleanup_html\nkeybinder '<Ctrl>e' gui_filter selection kramdown\nkeybinder '<Ctrl>e' gui_filter selection cowsay\nkeybinder '<Ctrl>e' gui_filter selection sort\n\n# More dangerous:\nkeybinder '<Ctrl>e' gui_filter clipboard bash\nkeybinder '<Ctrl>e' gui_filter clipboard ruby\nkeybinder '<Ctrl>e' gui_filter clipboard python\n
\nIn Emacs, the shell-command-on-region command (bound to\nM-|) can be used to pass the current selection to a given\ncommand: by default the output of the command is displayed in the\necho area or in a separate buffer. Alternatively, C-u M-| can be used to replace\nthe selection with the output.
\nIn Vim, the ! command can be used to pass a given part of the\ncurrent buffer through a shell filter.
\nAtom can filter the current selection through a pipe\nwith the pipe
package.
There are some good\nplugins to\nexport Joomla content to WordPress. However, the free version does not\nrewrite the URIs. It is quite simple to read the Joomla database and\ngenerate a bunch of Apache Redirect
directives.
As far as I understood, the URI of a post in Joomla is\n$pathOfMenu/$slugOfPost
where $pathOfMenu
is the path of the\nmost specific menu associated with a category of the post.
The relevant SQL tables are:
\njoomla_categories for the categories;
\njoomla_menu for the menus;
\njoomla_content for the posts.
\nFor each post, we can generate an Apache Redirect directive as:
joomla_uri = \"/\" + menu_path + slug\nwordpress_uri = row[:pub].strftime(\"/%Y/%m/%d/\") + row[:alias] + \"/\"\nputs(\"RedirectPermanent \" + joomla_uri + \" \" + wordpress_uri)\n
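\nTo make the mapping concrete, here is a small self-contained sketch of the same string construction in Python, with made-up values. The function name redirect_line and the sample menu path, slug and alias are hypothetical; the sketch also assumes that the menu path already ends with a slash, as the concatenation in the snippet above suggests, and it does not read the Joomla database.

```python
from datetime import date


def redirect_line(menu_path, slug, published, alias):
    # Old Joomla URI: path of the menu followed by the slug of the post.
    # Assumption: menu_path already ends with a slash.
    joomla_uri = '/' + menu_path + slug
    # New WordPress URI: /year/month/day/alias/
    wordpress_uri = published.strftime('/%Y/%m/%d/') + alias + '/'
    return 'RedirectPermanent ' + joomla_uri + ' ' + wordpress_uri


if __name__ == '__main__':
    # Hypothetical post, for illustration only.
    print(redirect_line('blog/', 'my-post', date(2014, 9, 25), 'my-post'))
```

Each generated line can then be appended to the Apache configuration, one directive per post.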
\n\n"}]}