2010-12-19

How to write a C++ program without libstdc++

This blog post explains how to write a C++ program and compile it with GCC (g++) so that the resulting binary doesn't depend on libstdc++. Reasons for avoiding libstdc++ include purity (not depending on libraries you don't really need) and keeping statically linked tools small (e.g. Unix or Win32 console applications built with MinGW).

The limitations are:

  • The standard C++ STL (e.g. #include <string> and #include <vector>) cannot be used. This functionality has to be reimplemented in the program.
  • The standard C++ streams (e.g. #include <iostream>) cannot be used. The standard C I/O library (e.g. #include <stdio.h>) is a smaller and faster replacement. A disadvantage: there is no polymorphic operator<<(ostream&, ...) method for convenient, type-agnostic output.
  • Exceptions cannot be used (i.e. try, catch and throw are disallowed).
  • RTTI (run-time type information) cannot be used.
  • dynamic_cast<...>(...) cannot be used (because it requires RTTI).

Here is how to do it:

  • Add the following C++ code to your program (can be in a separate source file):
    #include <stdlib.h>
    #include <unistd.h>  /* for write(), also available on Windows */
    extern "C" void* emulate_cc_new(unsigned len) { \
      void *p = malloc(len);
      if (p == 0) {
        /* Don't use stdio (e.g. fputs), because that may want to allocate more
         * memory.
         */
        (void)!write(2, "out of memory\n", 14);
        abort();
      }
      return p;
    }
    extern "C" void emulate_cc_delete(void* p) {
      if (p != 0)
        free(p);
    }
    void* operator new  (unsigned len) __attribute__((alias("emulate_cc_new")));
    void* operator new[](unsigned len) __attribute__((alias("emulate_cc_new")));   
    void  operator delete  (void* p)   __attribute__((alias("emulate_cc_delete")));
    void  operator delete[](void* p)   __attribute__((alias("emulate_cc_delete")));
    void* __cxa_pure_virtual = 0;
  • Compile your program with g++ -c to create individual .o files. Add flags -fno-rtti -fno-exceptions to get compile errors for disabled features (exceptions and RTTI).
  • Link your executable with gcc -o prog code1.o code2.o ... It's important that you use gcc here instead of g++ because g++ would link against libstdc++.

This method has been tested and found working with GCC 3.2, GCC 4.2.1 and GCC 4.4.1 (both native Linux compilation and MinGW compilation), and it probably works with other GCC versions as well.
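
As a quick end-to-end check, here is a minimal sketch. It assumes the snippet above is saved as emulate_new.cc (the file names are hypothetical), and that you are building for a 32-bit target, where unsigned matches size_t (for a 64-bit target, change the operator new parameter type to size_t):

$ cat >nostd_test.cc <<'EOF'
#include <stdio.h>
struct Point { int x, y; };
int main() {
  Point *p = new Point();  /* allocated via emulate_cc_new */
  int *arr = new int[10];  /* also via emulate_cc_new */
  p->x = 3; p->y = 4; arr[0] = 42;
  printf("point=(%d,%d) arr[0]=%d\n", p->x, p->y, arr[0]);
  delete p;                /* freed via emulate_cc_delete */
  delete[] arr;
  return 0;
}
EOF
$ g++ -c -fno-rtti -fno-exceptions nostd_test.cc
$ g++ -c -fno-rtti -fno-exceptions emulate_new.cc
$ gcc -o nostd_test nostd_test.o emulate_new.o
$ ./nostd_test
point=(3,4) arr[0]=42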

If you need dynamic_cast, add void* __gxx_personality_v0 = 0;, don't use -fno-rtti, and add dyncast.a to the gcc -o ... command-line. Here is how to create dyncast.a (about 55kB) for GCC 4.4.1:

$ ar x /usr/lib/gcc/i486-linux-gnu/4.4/libstdc++.a \
  dyncast.o class_type_info.o si_class_type_info.o pointer_type_info.o \
  pbase_type_info.o tinfo.o fundamental_type_info.o
$ ar crs dyncast.a \
  dyncast.o class_type_info.o si_class_type_info.o pointer_type_info.o \
  pbase_type_info.o tinfo.o fundamental_type_info.o

How to boot GRUB from SysLinux

This blog post explains how to boot GRUB from SysLinux. Only GRUB1 is covered; the solution explained here doesn't support GRUB2.

Older versions of SysLinux (such as 3.83) don't support booting GRUB, i.e. they cannot load and boot the stage2 file of GRUB. The newest version of SysLinux contains chain.c32, which can boot many operating systems and bootloaders, including GRUB. The solution explained here doesn't use this feature, so it works with old versions of SysLinux as well.

The fundamental idea is to convert the GRUB stage2 file to a format which SysLinux can boot directly. This format is bzimage, the big variant of the Linux kernel image. grub.exe, part of GRUB4DOS, is already in this format, so adding the following lines to syslinux.cfg works:

label grub4dos
menu label GRUB4DOS
kernel grub.exe

However, one might wish to use the original GRUB instead, because GRUB4DOS has some features missing, e.g. it doesn't support GPT (GUID Partition Table) or UUIDs on reiserfs partitions. Converting the original GRUB stage2 file to bzimage format is easy: just append it to the first 20480 bytes of grub.exe. SysLinux can boot this hybrid, but then GRUB wouldn't be able to find its configfile (menu.lst). To fix that, the full pathname of the configuration file has to be embedded into the boot image. The Perl script grub2bzimage.pl automates this.
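
For illustration, the raw concatenation step by itself is just a couple of shell commands (a minimal sketch; note that this alone doesn't embed the configfile path, which is exactly what grub2bzimage.pl adds on top of it):

$ head -c 20480 grub.exe >grubzimg.raw
$ cat stage2 >>grubzimg.raw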

grub2bzimage is a Perl script which converts the GRUB bootloader code (stage2) to bzimage format, which can be booted directly by SysLinux and other bootloaders. grub2bzimage.pl doesn't support GRUB2.

Example use to create grubzimg:

$ perl ./grub2bzimage.pl stage2 grubzimg '(hd0,0)/boot/grub/menu.lst'

Write this into syslinux.cfg:

label grub
menu label GRUB
kernel grubzimg

Please note that it's not possible to specify the name of the configfile (menu.lst) in syslinux.cfg (using append). The configfile has to be specified when grubzimg is created.

grub2bzimage.pl has been tested with SysLinux 3.83. Please note that newer versions of SysLinux contain chain.c32 which supports loading GRUB stage2 files directly.

2010-12-14

How to get multiple clickable desktop notifications on Ubuntu Lucid

This blog post explains how to get multiple clickable desktop notifications (i.e. those that were present on Ubuntu Hardy) on the default GNOME desktop of Ubuntu Lucid.

Desktop notifications are short messages displayed by programs for a short time in one of the corners of the screen. In Ubuntu Lucid, by default, they are displayed in the top right corner, their background is black, they are not clickable (i.e. it's not possible to click them away), they disappear while the mouse is over them, and at most one of them is visible at a time. In Ubuntu Hardy, they are displayed in the bottom right corner of the screen, their background is a light grey, they are clickable (i.e. they disappear for good if the user clicks on them), they don't disappear while the mouse is over them, and more than one of them can be on screen at the same time (without overlapping).

Ubuntu Lucid uses the notify-osd backend for displaying notifications. It's not possible to configure notify-osd to make it behave like Ubuntu Hardy. However, it's possible to install notification-daemon, which was the default backend for displaying notifications in Ubuntu Hardy. Here is how to make it work in Ubuntu Lucid:

$ sudo apt-get install notification-daemon
$ sudo perl -pi -e 's@^Exec=.*@Exec=/usr/lib/notification-daemon/notification-daemon@' /usr/share/dbus-1/services/org.freedesktop.Notifications.service
$ sudo killall notify-osd

Try it with:

$ notify-send foo; notify-send bar

Optional change to disable notify-osd completely (it may break volume notifications etc., so use it only if the setup doesn't work without it):

$ sudo rm -f /usr/share/dbus-1/services/org.freedesktop.Notifications.service.*

See this discussion with some links for other notify-osd improvements and alternatives.

2010-12-11

A dramatic colored picture of a tiger's head

This blog post documents my unsuccessful software archeology attempt to find the origin and the author of the famous PostScript tiger colorful vector graphics. Get the (mostly unchanged) EPS file from the Ghostscript SVN repository, here.

The earliest Ghostscript version containing the tiger I could dig up is version 2.6.1 (5/28/93) [download] in Slackware Linux 1.1.2. FYI The earliest Ghostscript version in Debian is 2.6.1 as well [download].

The image itself contains the comment %%CreationDate: 4/12/90 3:20 AM, so the earliest possible Ghostscript version that can contain it is 2.0 (released on 9/12/90) — but I wasn't able to find a download link for that Ghostscript. The author and the copyright are not indicated. The only description is found in the Ghostscript history.doc file, saying tiger.ps - A dramatic colored picture of a tiger's head.

Even Wikipedia doesn't specify the origin of the tiger graphics. All I could find is this question asking where it comes from.

Here is the relevant part of the EPS header in the tiger.ps and tiger.eps file:

%%Creator: Adobe Illustrator(TM) 1.2d4
%%For: OpenWindows Version 2
%%Title: tiger.eps
%%CreationDate: 4/12/90 3:20 AM
%%DocumentProcSets: Adobe_Illustrator_1.2d1 0 0
%%DocumentSuppliedProcSets: Adobe_Illustrator_1.2d1 0 0

What I've learned: Linux distributions (especially Debian and Slackware) are very useful sources of the source code of ancient versions of some free software.

The origin of the tiger thus remains unsolved.

2010-12-07

It is a misconception that C++ is a superset of C

This blog post shows (with a counterexample) that C++ is not a superset of C, i.e. there is a valid C program which is not a valid C++ program.

It's easy to find a counterexample: just use a C++ keyword as a variable name in C. The program is int class;. The proof:

$ gcc -v
gcc version 4.4.1 (Ubuntu 4.4.1-4ubuntu9)
$ g++ -v
gcc version 4.4.1 (Ubuntu 4.4.1-4ubuntu9)
$ echo 'int class;' | tee t.c t.cc >/dev/null
$ gcc -c -pedantic -ansi -W -Wall t.c
$ g++ -c t.cc
t.cc:1: error: expected identifier before ‘;’ token
t.cc:1: error: multiple types in one declaration
t.cc:1: error: declaration does not declare anything

There is a counterexample which doesn't contain any C++ keywords. The program is int i=&i;. The code:

$ echo 'int i=&i;' | tee t.c t.cc >/dev/null
$ gcc -c -pedantic -ansi -W -Wall t.c
t.c:1: warning: initialization makes integer from pointer without a cast
$ g++ -c t.cc
t.cc:1: error: invalid conversion from ‘int*’ to ‘int’

There is a counterexample which doesn't contain any C++ keywords, and it compiles without a warning as C code. The program is char *p=(void*)0;

$ echo 'char *p=(void*)0;' | tee t.c t.cc >/dev/null
$ gcc -c -pedantic -ansi -W -Wall t.c
$ gcc -c -pedantic -std=c89 -W -Wall t.c
$ gcc -c -pedantic -std=c99 -W -Wall t.c
$ gcc -c -pedantic -std=c9x -W -Wall t.c
$ gcc -c -pedantic -std=gnu89 -W -Wall t.c
$ gcc -c -pedantic -std=gnu99 -W -Wall t.c
$ gcc -c -pedantic -std=gnu9x -W -Wall t.c
$ tcc -W -Wall -c t.c
$ 8c -c -w t.c
$ g++ -c t.cc
t.cc:1: error: invalid conversion from ‘void*’ to ‘char*’

As a bonus: there is a program source which compiles in both C and C++, but does something different in each. The following example is based on the fact that sizeof('x') is 1 in C++, but it equals sizeof(int) (at least 2 on common platforms) in C.

#include <stdio.h>
int main(){return!printf("Hello, C%s!\n", "++"+1%sizeof('x')*2);}

Another way to make the same program source do something different in C and in C++ is to take advantage of the fact that union t { ... } (and similar declarations) creates a type named t in C++, but not in C (where it has to be referred to as union t).

#include <stdio.h>
char t; int main(){union t{int u;};
return!printf("Hello, C%s!\n", "++"+(sizeof(t)<2)*2);}

So the true statement is: C and C++ are similar languages with a large common subset, and C++ adds much more on top of that common subset than C does.

2010-12-05

On browser compatibility issues

This blog post lists the browser compatibility issues (with solutions) I've encountered when creating a simple HTML(4) + JavaScript + CSS web page containing presentation slides with a little bit of user interaction.

My goal was to make the web page compatible with Google Chrome 8.0 or later, Firefox 3.6 or later, Safari in Leopard or later, Opera 10.63 or later, Konqueror 4.4.2 or later, Internet Explorer 8.0 or later. (I'll call them main browsers from now on.) Please note that anything stated in this blog post may not be true for earlier web browser versions. There was no plan to make the web page work in any earlier web browser version, but it turned out that it was possible and easy to make the page work with Internet Explorer 7.0 with some small rendering quality degradation, so I happened to add support for that as well.

My workflow:

  • Edit the web page directly in a text editor, as a single HTML file (containing HTML, JavaScript and CSS). Split it to different files only after the prototype is found working and all browser compatibility issues have been registered (and most of them resolved).
  • Make sure the browser renders the page in standards compliant mode, not in quirks mode (see below how and why).
  • View the local HTML file in Google Chrome, reload the page after each file save. Try it after major modifications in Firefox. As soon as the web page works in these two browsers, fix it for other browsers as well.
  • When porting to other browsers, especially Internet Explorer, search in the jQuery source code for the feature (class or method name) that works in one browser. Most probably the jQuery source will contain alternative implementations of it for other browsers.
  • Use the Developer Tools feature of Google Chrome (activate it with Ctrl-Shift-J) to debug the DOM, the CSS and the JavaScript on the current page. With Firefox, use the Firebug extension and/or the shell bookmarklet for debugging. Should you need it, use the Tools / Developer Tools (activate it with F12; it is available as an extension called Developer Toolbar for IE7 and earlier) in Internet Explorer 8 for debugging.
  • With Internet Explorer Developer Tools you can verify that the web page is indeed rendered in standards compliant mode, and you can also force a rendering which Internet Explorer 7 would do.
  • After each major change, validate that the web page is valid HTML 4.01 Transitional. You can use the HTML Validator Google Chrome extension.

My general knowledge and advice on cross-browser compatibility:

  • All main browsers (Google Chrome, Firefox, Safari, Opera and Konqueror) are very similar to each other; Internet Explorer is the odd one out.
  • Google Chrome, Safari and Konqueror are even more similar to each other, because they share the same original code base (KHTML in Konqueror, on which Safari's WebKit is based, on which Google Chrome's WebKit is based).
  • Rendering (application of CSS) in the main browsers is much more similar across browsers when they render the web page in standards compliant mode rather than in quirks mode. So to get rid of many incompatibilities, just make sure that the browser renders the page in standards compliant mode.
  • To enable standards compliant mode, just start your HTML with a doctype (right at the beginning of the file). To get HTML 4.01 transitional, prepend this to your HTML file:
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
        "http://www.w3.org/TR/html4/loose.dtd">
  • Make sure your editor saves the HTML file in UTF-8 encoding (character set), without the byte order mark (BOM); see the quick command-line check after this list.
  • Make sure the character set is indicated in the <head>. This is needed and used when the web page is loaded from a file:/// URL or imported into a word processor such as OpenOffice Writer or LibreOffice Writer. Example:
    <meta http-equiv="content-type" content="text/html; charset=utf-8">
  • Make sure your webserver returns the proper character set for the web page in the HTTP response headers (e.g. Content-Type: text/html; charset=utf-8). This setting is used for http:// and https:// URLs. The HTTP response header takes precedence over <meta http-equiv=. If you use the Apache webserver, you may be able to force the UTF-8 charset by creating a file named .htaccess (filename starting with a dot) next to the HTML file, containing
    AddDefaultCharset UTF-8
    . If it doesn't seem to take effect, then put an error (temporarily) into the .htaccess file (e.g. nosuchdirective 42) and do a Shift-reload on your web page. If that causes an Internal Server Error, then .htaccess is indeed honored by Apache. If you don't get such an error, talk to your webmaster or system administrator (e.g. copy-paste this paragraph into an e-mail along with the URL).
  • The corresponding SVN command (for Google Code) for the character set is:
    svn propset 'svn:mime-type' 'text/html; charset=utf-8' webpage.html
  • Use jQuery, and make it take care of most browser compatibility issues. This document assumes, however, that no JavaScript framework is used, so all issues have to be resolved manually.
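
Regarding the character set items above, here is a quick command-line check (page.html and the URL are placeholders for your own file and site): the first three bytes of the file must not be ef bb bf (the BOM), and the HTTP response header should carry the charset.

$ head -c 3 page.html | hexdump -C
$ curl -sI http://example.com/page.html | grep -i '^Content-Type:'
Content-Type: text/html; charset=utf-8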

Some specific cross-browser compatibility issues I have encountered:

  • See a cross-browser comparison of all JavaScript browser events here.
  • Internet Explorer has a different API for installing and handling events.
    • Event propagation and bubbling semantics are different. You can avoid these issues by installing all event handlers for document.body and window. (Sorry, I don't know more about this.)
    • For Internet Explorer, change obj.addEventListener(eventName, handler, false) to obj.attachEvent('on' + eventName, handler).
    • For Internet Explorer, change obj.removeEventListener(eventName, handler, false) to obj.detachEvent('on' + eventName, handler).
    • Internet Explorer supports the DOMContentLoaded event under a different name (onreadystatechange) and different semantics, see below.
    • Internet Explorer needs event.preventDefault() to be specified differently. Here is the cross-browser solution:
      event.preventDefault ? event.preventDefault() : (event.returnValue = false)
    • Internet Explorer needs event.stopPropagation() to be specified differently. Here is the cross-browser solution:
      event.stopPropagation ? event.stopPropagation() : (event.cancelBubble = true)
    • Internet Explorer doesn't pass the event object to the event handler function. To make it cross browser, write it like this:
      function handler(event) { if (!event) event = window.event; ... }
  • Internet Explorer doesn't have window.getComputedStyle, so the current (computed) CSS properties have to be queried differently. An example, specific cross-browser solution:
    function getBackgroundColor(element) {
      return element.currentStyle ? element.currentStyle.backgroundColor :
             window.getComputedStyle(element, null).
             getPropertyValue('background-color')
    }
  • Internet Explorer 8 doesn't have elementOrDocument.getElementsByClassName, so it has to be emulated using a loop and e.g. elementOrDocument.getElementsByTagName and some manual checking. The emulation is a bit slower.
  • The string.substr method in Internet Explorer doesn't accept negative values in its first argument, so e.g. myString.substr(-2) has to be replaced by myString.substr(myString.length - 2).
  • The DOMContentLoaded event is fired as soon as the main HTML and the JavaScript and CSS files it references have finished loading — but before images and other objects on the page have finished. By that time, the rendering is done and measurements can be made, so e.g. document.body.offsetWidth is valid. In contrast, the onload event fires only once images etc. have also finished loading. Here is how to handle DOMContentLoaded in a cross-browser way:
    function onDomReady() {
      ...
    }
    if (navigator.userAgent.indexOf(' KHTML/') >= 0) {
      document.addEventListener('load', onDomReady, false) 
    } else if (document.addEventListener) {
      document.addEventListener('DOMContentLoaded', onDomReady, false)
    } else if (document.attachEvent) {
      var onReadyStateChange = function() {
        if (document.readyState == 'complete') {
          document.detachEvent('onreadystatechange', onReadyStateChange)
          onDomReady()
        }
      }
      document.attachEvent('onreadystatechange', onReadyStateChange)
    }
    Here, the load event is used instead of DOMContentLoaded, because Konqueror 4.4.2 doesn't compute element dimensions (e.g. document.body.offsetWidth and CSS properties) at DOMContentLoaded time, but returns bogus default values instead. For Internet Explorer, onreadystatechange has to be used.
  • There are two keyboard events when a key is pressed or auto-repeated: keydown and keypress.
    • Use keypress for Google Chrome and Safari (i.e. navigator.userAgent.indexOf(' AppleWebKit/') >= 0), because they don't send the arrow key (and any other non-letter) events for keypress.
    • Use keydown for Firefox (for 3.6), because it doesn't send auto-repeats for keydown (see more here).
    • Use onkeydown on Internet Explorer.
    • Some browsers have a meaningless event.keyCode (such as 0 for Firefox 3.6 or the same as the event.charCode for Konqueror 4.4.2) for some keys. In this case, use the event.charCode instead.
    • The event.charCode depends on whether Shift is down (e.g. 65 for A and 97 for Shift-A), but it doesn't depend on whether Ctrl or other modifiers are down. event.charCode is usually a Unicode code point (character code).
    • See event.keyCode constants here.
    • See more here about keyboard event compatibility.
  • Many browsers (such as Firefox 3.6) send multiple resize events (as part of an animation) when the window is maximized.
  • To get the vertical scroll position, use document.body.scrollTop || document.documentElement.scrollTop, because Konqueror always returns 0 for just document.body.scrollTop. Do this respectively for the horizontal scroll position (scrollLeft).
  • To scroll the page in a cross-browser way (which works in Konqueror as well), use window.scroll(newScrollLeft, newScrollTop).
  • Internet Explorer doesn't support window.innerWidth, use window.innerWidth || document.documentElement.clientWidth for cross-browser compatibility. Do the same with window.innerHeight and document.documentElement.clientHeight as well.
  • Internet Explorer 7 doesn't support an extra comma right before ] and } in array and object constructors.

2010-11-28

Announcing uevalrun: self-contained computation sandbox for Linux

This blog post is to announce uevalrun.

uevalrun is a self-contained computation sandbox for Linux, using User-mode Linux for both compilation and execution of the program to be sandboxed. The program can be written in C, C++, Python, Ruby, Perl or PHP. uevalrun enforces memory limits, timeouts and output size limits in the sandbox. The primary use case for uevalrun is evaluation of solution programs submitted by contestants of programming contests: uevalrun compiles the solution, runs it with the test input, compares its output against the expected output, and writes a status report.

For your convenience, here is a (non-updated) copy of the documentation of uevalrun. See the project home page for the most up-to-date version.

Installation

A 32-bit (x86, i386) or 64-bit (x86_64, amd64) Linux system is required with enough RAM for both compilation and execution, plus 6 MB memory overhead. The Linux distribution doesn't matter, because uevalrun uses statically linked binaries, and it's self-contained: it contains all the tools it needs, including the C and C++ compilers (from the uClibc gcc-4.1.2 toolchain).

uevalrun doesn't need root privileges: it runs as a simple user.

uevalrun needs about 160MB of disk space, most of which is consumed by the C compiler (31MB extracted + 17MB compressed), the scripting language interpreters (9MB compressed), and the virtual disk images (77MB, uncompressed). After removing all temporary files, 80MB will be enough.

Download and compile:

$ svn co http://pts-mini-gpl.googlecode.com/svn/trunk/uevalrun uevalrun
$ cd uevalrun
$ ./make  # This downloads some more files during compilation.

Try it (the output lines, e.g. @ result: pass, are displayed by the program):

$ (echo '#! perl'; echo 'print "Hello"x2, "\n"') >foo.pl
$ echo HelloHello >expected.txt
$ ./uevalrun -M 10 -T 9 -E 9 -s foo.pl -t /dev/null -e expected.txt
...
@ result: pass
$ echo '/**/ int main() { return!printf("HelloHello\n"); }' >foo.c
$ ./uevalrun -M 10 -T 9 -E 9 -U 19 -N 32 -C 9 \
  -s foo.c -t /dev/null -e expected.txt
...
@ result: pass

How to use

This section has not been written yet. If you have questions, don't hesitate to ask!

Requirements

Security is the most important requirement of uevalrun, followed by high performance and usability.

The sandboxing must be secure, more specifically:

  • Programs inside the sandbox must not be able to communicate with the outside world, except for their stdin and stdout. So they don't have access to the filesystem (except for some read-only access to some whitelisted system libraries and binaries), and they can't do network I/O. If uevalrun is used for programming contest submission evaluation, this restriction prevents the program from finding and reading the file containing the expected output.
  • Sandboxed programs must not be able to use more system resources (memory, disk, CPU) than what was allocated for them.
  • Sandboxed programs must not be running for a longer period of time than what was allocated.
  • Even the compilation of programs to be run (executed) inside the sandbox must be sandboxed (possibly with different system resource allocation), to prevent the attacker from submitting a program source code which exploits a bug in the compiler.
  • The sandbox must be reverted to its initial state before each compilation and execution, so sandboxed programs won't be able to gather information about previous compilations and executions. For programming contests, this restriction prevents the program from reading other contestants' submissions or their output.
  • Timeouts (both real time and user time) must be enforced outside the sandbox, so the program will be reliably killed if it runs for too long time, even if it tampers with time measurement inside the sandbox.
  • Sandboxed programs must not be able to lie about their success, their performance or the correctness of their output by writing special marker characters to their stdout or stderr.

The sandbox must be fast and waste few resources, more specifically:

  • Reverting the sandbox to its initial, empty state must be very fast (preferably faster than 1 second).
  • Starting a program inside the sandbox must be very fast (preferably faster than 1 second).
  • Running a program inside the sandbox must not be much slower than running the same program outside the sandbox. If the program is CPU-intensive, it may run 1.1 times slower (i.e. 10% slower) inside than outside. If the program is I/O-intensive, it may run 10 times slower inside than outside. This requirement is intentionally quite loose on I/O virtualization performance.
  • Running a program inside the sandbox must not need more than 10MB more memory than running the same program outside.
  • Running a program inside the sandbox must not need any disk space, except for the disk space needed by the program binary, the test input and the expected output, all stored as files outside the sandbox.
  • Compilation inside the sandbox must require only a reasonable amount of temporary disk space (for the file to be compiled, the temporary files, and the output binary).

The sandbox must be easy to use and versatile, more specifically:

  • Sandboxed programs must be able to accept any 8-bit binary input (stdin).
  • Sandboxed programs must be able to write any 8-bit binary output to their stdout.
  • Multiple sandboxed programs must be able to run at the same time on the same host system, without affecting each other.
  • If a sandboxed program fails (e.g. because it writes something different to its stdout than what was expected, causes a memory access violation, reaches a timeout, exceeds its allocated memory etc.), the proper failure reason has to be reported (e.g. "wrong answer" must not be reported if the program times out, or vice versa).
  • The sandboxing software must have as few system dependencies as possible. It must be able to run in a restricted (e.g. chroot) environment, and it must not depend on system libraries or configuration. It must work on 32-bit (x86, i386) and 64-bit (x86_64, amd64) Linux systems, on any Linux distribution.
  • Sandboxed programs can be written in C, C++, Python, Ruby, Perl and PHP, or in any language whose compiler can produce a 32-bit (i386) Linux statically linked executable binary.

Design

To fulfill the requirements, the following fundamental building blocks are used.

User-mode Linux (UML) is used for virtualization: both compilation and execution are performed in a UML guest, reading data from the host system using virtual block devices (ubd), and writing its output to its /dev/tty1 (con0), which can be read by a process on the host. A UML guest kernel (currently 2.6.31.1) tailored to the sandboxing requirements (security and performance) is configured and compiled. Some kernel patches are applied to increase security, reliability and performance. (Please note that these patches apply to the UML guest kernel only, so the host remains unpatched, and rebooting is not needed either.) Networking and the virtual file system are disabled in the UML guest kernel for increased security. All unnecessary drivers and features are removed from the UML guest kernel to get fast boot times and to reduce the overhead.

The underlying virtualization technology used by UML is ptrace(), which doesn't need root privileges, a kernel module or kernel modifications on the host. As an alternative to UML, Seccomp could be used for sandboxing, but that's quite cumbersome because the sandboxed process cannot allocate memory for itself (see the Google Chrome Seccomp sandbox for a generic solution), and it would be prohibitively cumbersome to sandbox GCC this way (with its requirements for forking and creating temporary files). Most sandboxing approaches on Linux require root privileges for a short amount of time (for chroot, PID namespaces (see clone(2)) and other namespaces). Most virtualization approaches (like KVM, Xen and VirtualBox, except possibly QEMU) need loading of kernel modules or patching the kernel, and have slow guest boot times.

With UML, it's possible to boot the UML guest, run a hello-world binary, and halt the UML guest in less than 0.02 second (sustained). For that speed, one needs a (guest) kernel patch (included and automatically applied in uevalrun) which skips the Calibrating delay loop kernel startup step, which usually takes 0.33 second. The memory overhead of UML is also quite low (6 MB with a stripped-down kernel).

All software running inside and outside UML is written in C (for high performance), compiled with uClibc and linked statically to avoid depending on the libc on the host, and to avoid the need for installing a libc in the guest. (The total size of these custom-built binaries is less than 100kB.) There is no Linux distribution installed in the guest – only a few custom binaries are copied, like a custom /sbin/init, which mounts /proc, sets up file descriptors and resource limits, starts the sandboxed program, and then halts the guest. The /lib directory in the guest is empty, except for compilation, where /lib contains libc.a, libstdc++.a etc.

BusyBox (statically linked with uClibc) is used as a Swiss Army Knife tool for scripting and automation (of the build and the setup process), both inside and outside UML. Please note, however, that once all the binaries are built and the filesystem images are created, BusyBox is not used anymore for running sandboxed programs and evaluating their output (except for temporary filesystem creation) because of performance reasons, but work is done by the small and fast custom C tools just compiled.

Sandboxed program sources can be written in C (using a large subset of uClibc as the system library), C++ (using uClibc and libstdc++), Python (using all the built-in Python modules), Ruby (using just the built-in Ruby modules and classes which are implemented in C), Perl (using only the few dozen most common modules), or PHP (using just the built-in PHP functions which are implemented in C).

For C and C++ compilation, a GCC 4.1.2 toolchain is used which is based on uClibc and produces statically linked executables.

Interpreters for Python, Ruby, Perl and PHP are provided as precompiled, self-contained, single, statically linked, 32-bit (i386) Linux executables. It's easy to copy them to the /bin directory of the UML guest.

All tools used for building the software are either shipped with the software, or the software downloads them during compilation. All binary tools are statically linked, 32-bit (i386) Linux executables. This provides maximum host system compatibility and reproducible behavior across systems.

In our speed measurements, CPU-intensive programs running inside UML run 1% slower than outside UML, but I/O-intensive programs (such as C++ compilation with GCC) can be about 6 times slower.

The Minix filesystem is used in the UML guests for both read-only filesystems (e.g. the root filesystem with the scripting language interpreters) and read-write filesystems (such as /tmp for the temporary assembly files created by GCC). The Minix filesystem was chosen instead of ext2 and the other usual Linux filesystems because it has less size overhead, its codebase is smaller, and it has equally good performance for the small and mostly read-only filesystems it is used for.

License

uevalrun has been released under the GNU GPL v2.

Bugs? Problems? Contact the author. Send feedback!

Please add your feedback to the issue tracker. All feedback is welcome.

2010-11-19

How to download Skype without having to register

Recently Skype has introduced the requirement of registration or logging in on its web page before downloading the Windows version of their Skype software. This blog post explains how to download Skype without registration or logging in.

Follow the link relevant to your operating system:

2010-11-15

Remote desktop sharing for Linux without software installation and firewall configuration

This blog post explains how to share two desktops (screen, keyboard and mouse) without software installation (on Linux) and firewall configuration. The instructions given have been verified on Ubuntu Lucid running on 32-bit (i386) and 64-bit (amd64) architecture -- but they should work on any modern Linux desktop.

TeamViewer is an excellent cross-platform application which provides desktop sharing, file transfer, chat and video conferencing, and it's free for personal use. Its use is straightforward. Installers for Linux, Mac OS X, Windows and iPhone are downloadable here.

If you don't want to install anything, you can try the QuickSupport edition of TeamViewer on Windows and the Mac OS X. (Get it from the same download page.) The QuickSupport edition doesn't let the user initiate a connection (to become a client), but it listens as a server waiting for connections.

As of now, there is no official Linux QuickSupport edition of TeamViewer, so I've created one; it downloads and starts the regular TeamViewer for Linux application, without installing it. To use it on a vanilla Ubuntu Lucid box, just open the Run Application dialog (Alt-F2), type (or copy-paste) the following command, and press Enter:

bash -c 'wget -O- goo.gl/k9imm | bash'

Please note that the O between the dashes is a capital O. All other characters are lowercase.

2010-11-10

How to disable PulseAudio on Ubuntu Lucid and Oneiric without uninstalling it

This blog post explains how to disable PulseAudio on Ubuntu Lucid and Oneiric without uninstalling it. The solution described here may work on other Linux distributions as well. Please note that you may lose the volume controls integrated into GNOME (e.g. the volume control hotkeys), and you'll have to adjust the volume with alsamixer or its GNOME equivalent, gnome-alsamixer.

Install the volume control programs:

$ sudo apt-get install alsa-utils gnome-alsamixer
$ sudo apt-get remove gamix  # Applications / Sound & Video / ALSA Mixer

Stop the PulseAudio server (pulseaudio) and remove temporary files:

$ pkill -STOP '^pulseaudio$'
$ rm -rf ~/.pulse

Make sure that any application's attempt to connect to the PulseAudio server and/or to start a new PulseAudio server will fail:

$ mkdir -p ~/.pulse
$ (if grep -q \=10 /etc/lsb-release; then
     echo 'default-server = 0.0.0.1'
   else
     echo 'daemon-binary = /dev/null/no-daemon'
   fi
   echo 'autospawn = no') >~/.pulse/client.conf

Kill the PulseAudio server:

$ pkill -KILL '^pulseaudio$'
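
To verify that no PulseAudio server is running anymore and that plain ALSA output still works, you can run the following (the sample .wav file ships with alsa-utils; the pgrep command should print nothing):

$ pgrep -l pulseaudio
$ aplay /usr/share/sounds/alsa/Front_Center.wav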

Optionally, to disable PulseAudio for all users, add the following lines to /etc/pulse/client.conf for Ubuntu Lucid:

default-server = 0.0.0.1
autospawn = no

The corresponding lines for Ubuntu Oneiric are:

daemon-binary = /dev/null/no-daemon
autospawn = no

Restart your web browsers (for the Flash Player) and your Skype, and possibly other applications which use sound. To do so, it's best to log out and log in again.

2010-11-03

How to get rid of the GNOME panel menu delay on Ubuntu Lucid

This blog post gives instructions on how to get rid of the GNOME panel menu delay, i.e. the small delay before submenus such as Preferences and Administration open when you move the mouse pointer over the System menu. The solution shown has been tested on Ubuntu Lucid, but it might work on other systems as well.

Run the following command to configure newly started applications to omit the delay:

$ echo "gtk-menu-popup-delay = 0" >> ~/.gtkrc-2.0

Run this command to restart the GNOME panel currently running:

$ killall gnome-panel

If you are interested in speeding up the loading of icons, see more information at http://www.ubuntugeek.com/how-to-make-gnome-menus-faster-in-ubuntu.html.

2010-10-30

pysshsftp: proof-of-concept SFTP client for Unix which uses the OpenSSH ssh(1) command-line tool

This blog post is an announcement for pysshsftp, a proof-of-concept, educational SFTP client for Unix, which uses the OpenSSH ssh(1) command-line tool (for establishing the secure connection), but it doesn't use the sftp(1) command-line tool (so it can e.g. upload files without truncating them first).

Only a very small part of the SFTP protocol has been implemented so far (initialization, uploading, and stat()ting files). The SFTP protocol was reverse-engineered from sftp-client.c in the OpenSSH 5.1 source code.

The motivation behind writing pysshsftp was to have an SFTP client which

  • supports uploading files without truncating them (the OpenSSH sftp(1) always truncates the file before uploading data bytes);
  • can be easily scripted from Python;
  • uses the OpenSSH ssh(1) command-line tool for establishing the secure connection (other Python libraries like pysftp and paramiko can't use the user's public key configuration properly by default: they don't support passphrase reading for passphrase-protected keys, they don't support reading keys from ssh-agent, and they don't support reading ~/.ssh/id_rsa and ~/.ssh/id_dsa exactly the same way as OpenSSH uses them).

2010-10-29

Multilingual programs in Haskell, Prolog, Perl and Python

This blog post shows program sources that work in multiple languages (including Haskell, Prolog, Perl and Python).

Here is a program which works in Haskell, Prolog and Perl:

$ cat >m1.prg <<'END'
foo( {-1/*_} ) if!'*/%'; print "Hello, Perl!\n" if<<'-- */'
}). :- write('Hello, Prolog!'), nl, halt.
/* -} x) = x
main = putStrLn "Hello, Haskell!"
-- */
END
$ perl m1.prg
Hello, Perl!
$ swipl -f m1.prg
Hello, Prolog!
$ cp m1.prg m1.hs
$ ghc m1.hs -o m1
$ ./m1
Hello, Haskell!

Here is a program which works in Haskell, Prolog and Python:

$ cat >m2.prg <<'END'
len( {-1%2:3} ); print "Hello, Python!"; """
}). :- write('Hello, Prolog!'), nl, halt.
/* -} x) = x
main = putStrLn "Hello, Haskell!"
-- """ # */
END
$ python m2.prg
Hello, Python!
$ swipl -f m2.prg
Hello, Prolog!
$ cp m2.prg m2.hs
$ ghc m2.hs -o m2
$ ./m2
Hello, Haskell!

It's possible to have polyglots of 8, 10 and even 22 programming languages, see them here.

2010-10-26

How to upload a video to Picasa from Linux, Mac OS X and Windows

This blog post explains how to upload a video to Google Picasa from Linux, Mac OS X and Windows.

It is not possible to upload a video to Picasa Web Albums from your web browser. (However, it is possible to upload images this way.)

Please note that the video you upload will be transcoded, most probably to lower quality than the original, and the original video file won't be available for download or watching on Picasa Web Albums.

On Windows, use the latest Picasa desktop application ([download]). Version 3.8 or newer works. Alternatively, you can use GoogleCL ([project page]).

On Mac OS X, use the Picasa Web Albums Uploader. ([download])

On Linux, use the Google command line tools (GoogleCL). [project page] Example command to upload photos and videos:

$ google picasa create --title album_name
$ google picasa post   --title album_name ~/Photos_and_videos/*

It might take a few minutes for the video to be transcoded and made available. In the meantime, Picasa Web Albums shows a placeholder image instead.

On Linux, you may have to install the newest googlecl (0.9.11 works) and the newest python-gdata (2.0.12 or newer works), and remove your ~/.googlecl directory to make it work.

Installation instructions for Ubuntu Lucid, before you can run the google command above:

$ sudo apt-get install python-setuptools
$ sudo easy_install gdata
$ sudo easy_install http://googlecl.googlecode.com/files/googlecl-0.9.11.tar.gz
$ mv -f ~/.googlecl ~/.googlecl.old

2010-09-23

Panasonic Lumix DMC-FS30 battery information

This blog post contains information about some digital cameras and their corresponding batteries and battery chargers.

The Panasonic Lumix DMC-FS30, DMC-FS11, DMC-FS10 and DMC-FS9 digital cameras come with the battery pack model CGA-S/106C, which is a 3.6 V, 740 mAh, 2.7 Wh Li-ion battery. The corresponding charger is the Panasonic DE-A60 battery charger.

2010-08-29

How to emulate a WebSocket client in JavaScript using the Java plugin

This blog post shows a proof of concept implementation which can be extended to emulate a WebSocket client in JavaScript in browsers where WebSocket support is not available, but there is a working Java plugin installed.

The motivation behind this blog post is to design drop-in replacements for WebSocket for older browsers (e.g. Internet Explorer 9 and earlier, Opera 9 and earlier, Safari 4 and earlier, Firefox 3.7 and earlier), where WebSocket client support is not available, but some browser plugin (like Java, Flash or Silverlight) could be used to emulate it. This blog post is a proof-of-concept demonstrating how the Java plugin can be used for such an emulation.

The trick is to use the java symbol exported to JavaScript. Here is the relevant HTML page with JavaScript code, which implements a simple, proof-of-concept HTTP client in JavaScript using Java:

<html><head>
<title>Java WebSocket proof-of-concept test</title>
<script type="text/javascript">
function output(str) {
  var output = document.getElementById("output")
  var text = document.createTextNode(str)
  var br = document.createElement('br')
  output.insertBefore(br, null)  
  output.insertBefore(text, null)
}
function init() {
  output("find java")
  java.lang
  var hostItems = window.location.host.split(':')
  var hostName = hostItems[0]
  var port = 80
  if (hostItems.length > 1) {
    port = parseInt(hostItems[1], 10)
    if (port == 443 && window.location.protocol == 'https:')
      port = 80
  }
  output("connecting to host " + hostName + ", port " + port)
  var socket = new java.net.Socket(hostName, port)
  output("connected")
  var osw = new java.io.OutputStreamWriter(socket.getOutputStream(), "UTF-8")
  var isr = new java.io.InputStreamReader(socket.getInputStream(), "UTF-8")
  var bis = new java.io.BufferedReader(isr)   
  output("sending request (GET /answer.html)")  
  osw.write("GET /answer.html HTTP/1.0\r\n\r\n")
  osw.flush()
  output("reading response")
  // SUXX: This blocks the whole browser (Firefox) until the response is
  // ready.
  var line
  while (true) {
    line = bis.readLine()    
    if (line == null) break
    if (line == '' || line == '\r') break
    output("got header line: " + line)
  }
  output("done with header")
  if (line != null) {
    while (true) {
      line = bis.readLine()  
      if (line == null) break
      output("got body line: " + line)
    }
  }
  output("done with body")
  output("closing")
  isr.close()
  osw.close()
  socket.close()
  output("done")
}
</script>
</head><body onload="init()">
<div id="output" style="border:1px solid black;padding: 2px">Welcome!</div>
</body></html>

The code above could be easily upgraded to use the WebSocket protocol.

Please note that the implementation above has a show-stopper bug: it locks the whole browser UI (all functionality in all tabs, in Firefox 3.6) until the server returns the HTTP response. It might be possible to solve it by using java.nio.SocketChannel, but we haven't investigated that yet.

Please note that it is possible to do WebSocket emulation with Flash instead of Java, see web-socket-js for the client-side code (JavaScript and Flash) and web-socket-ruby for the server-side code.

Please note that we haven't investigated if it is possible to do WebSocket emulation with Silverlight instead of Java.

2010-08-11

How to try Stackless Python on Linux without installing it?

This blog post gives instructions on trying Stackless Python without installing it on Linux systems (32-bit and 64-bit).

Use the Stackless version of the StaticPython binary executable, which is a portable version of Stackless Python 2.7 for Linux systems. It is linked statically, with most standard Python modules and C extensions included in the executable binary, so no installation is required. Here is how to try it:

$ wget -O stackless2.7-static \
  https://raw.githubusercontent.com/pts/staticpython/master/release/stackless2.7-static
$ chmod +x stackless2.7-static
$ ./stackless2.7-static
Python 2.7 Stackless 3.1b3 060516 (release27-maint, Aug 11 2010, 13:55:35) 
[GCC 4.1.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import stackless
>>> 

See more information about features, suggested uses and limitations on the StaticPython home page.

StaticPython released

This blog post is an announcement of the StaticPython software distribution and its first release.

What is StaticPython?

StaticPython is a statically linked version of the Python 2.x (currently 2.7) interpreter and its standard modules for 32-bit Linux systems (i686, i386). It is distributed as a single, statically linked 32-bit Linux executable binary, which contains the Python scripting engine, the interactive interpreter with command editing (readline), the Python debugger (pdb), most standard Python modules (including pure Python modules and C extensions), coroutine support using greenlet and multithreading support. The binary contains both the pure Python modules and the C extensions, so no additional .py or .so files are needed to run it. It also works in a chroot environment. The binary uses uClibc, so it supports username lookups and DNS lookups as well (without NSS).

Download and run

Download the latest StaticPython 2.7 executable binary.

Here is how to use it:

$ wget -O python2.7-static \
  https://raw.githubusercontent.com/pts/staticpython/master/release/python2.7-static
$ chmod +x python2.7-static
$ ./python2.7-static
Python 2.7 (r27:82500, Aug 11 2010, 10:51:33) 
[GCC 4.1.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>

For what purpose is StaticPython useful?

  • for running Python scripts in chroot and other restricted environments (e.g. for running untrusted standard Python code)
  • for running CGI scripts on some web hosting providers where Python is not preinstalled (this is slow)
  • for trying the newest Python and running scripts on a very old Linux system or on a system where package installation is not possible or cumbersome
  • for deploying and running Tornado, Eventlet or Twisted-based networking applications on a machine without a working Python installation (please note that the framework itself has also to be deployed)

When isn't StaticPython recommended?

  • if dependencies need Python package installation (e.g. distutils, setuptools, setup.py, Pyrex, Cython) -- distutils is not included, there is no site-packages directory, loading .so files is not supported
  • if dependencies are or need shared libraries (.so files) -- StaticPython doesn't support loading .so files, not even with the dl or ctypes modules
  • if startup and module import has to be fast -- since StaticPython stores .py files (no .pyc) in a ZIP file at the end of the binary
  • for GUI programming -- no GUI or graphics library is included, loading .so files is not supported

How to extend, customize or recompile StaticPython?

Compiling python2.7-static was a one-off manual process. Automating this process has not been implemented yet. Please contact the author if you need this.

Feature details

Features provided

  • command-line editing with the built-in readline module
  • almost all standard Python 2.7 modules built-in
  • almost all standard C extensions built-in: _bisect, _codecs, _codecs_cn, _codecs_hk, _codecs_iso2022, _codecs_jp, _codecs_kr, _codecs_tw, _collections, _csv, _curses, _elementtree, _functools, _heapq, _hotshot, _io, _json, _locale, _lsprof, _md5, _multibytecodec, _multiprocessing, _random, _sha, _sha256, _sha512, _socket, _sockobject, _sqlite3 (added 1004307 bytes to the executable binary size), _sre, _struct, _symtable, _weakref, array, audioop, binascii, bz2, cPickle, cStringIO, cmath, crypt, datetime, errno, fcntl, fpectl, future_builtins, gc, grp, imageop, itertools, math, mmap, operator, parser, posix, pwd, pyexpat, readline, resource, select, signal, spwd, strop, syslog, termios, thread, time, timing, zipimport, zlib
  • greenlet integrated for coroutine support
  • multithreading (using thread and threading as usual)
  • line editing even without a terminfo definition or inputrc file (useful in chroot, provided by libreadline/libncurses by default)
  • the usual help and license commands are available in interactive mode

Executable binary layout

  • Python 2.7 (r27:82500) on Linux i386, statically linked: compiled and linked with uClibc, so it supports username lookups and DNS lookups as well (without NSS)
  • pure Python and C extensions integrated to a single, statically linked, i386 (i686) Linux executable
  • compiled with uClibc so it can do DNS lookups without /lib/libnss_*.so*
  • can run from any directory, even in chroot containing only the binary, even in interactive mode, even without /proc mounted

Features missing

  • missing: loading .so files (C shared libraries, Python C extensions)
  • missing: the ctypes module and the dl module (since no .so file loading support)
  • missing: the distutils module, custom extension installation
  • missing: OpenSSL, SSL sockets
  • missing: IPv6 support
  • missing: GUI bindings (tkinter, GTK, Qt etc.)
  • missing: Stackless Python

2010-08-09

How to try the latest MariaDB on Linux

This blog post explains how to download and start the bleeding edge MariaDB on a fairly recent 32-bit or 64-bit Linux system, for trial and development, the most straightforward way, without overwriting an existing MySQL installation or its data.

MariaDB is an improved, backward compatible, drop-in replacement of the MySQL Database Server, by Michael "Monty" Widenius, the original author of MySQL.

Please note that parts of this blog post are obsolete. See also the new blog post.

The simplest download and startup instructions for MariaDB 5.2.1 for 32-bit (i386, i686) Linux systems are the following:

# Stop any MySQL server currently running.
$ wget -O /tmp/mariadb-compact.tbz2 \
  http://pts-mini-gpl.googlecode.com/files/mariadb-compact-5.2.1-beta-linux-i386.tbz2
$ cd /tmp
$ tar xjvf mariadb-compact.tbz2
$ cd mariadb-compact
$ ./bin/mysqld --datadir=$PWD/data --pid-file=$PWD/mysqld.pid \
  --socket=$PWD/mysqld.sock --language=$PWD/share/mysql/english \
  --log-error=/proc/self/stderr
...

To stop the server, press Ctrl-Backslash and then Enter in the window in which the server is running, and wait 15 seconds for the process to exit.

To connect to the server, install the MySQL client (e.g. with $ sudo apt-get install mysql-client on an Ubuntu system), and then run

$ mysql --socket=/tmp/mariadb-compact/mysqld.sock --user=root --database=test

or

$ mysql --host=127.0.0.1 --user=root --database=test

to connect. Please note that mariadb-compact comes with insecure default settings: it lets anyone connect as user root (on TCP 127.0.0.1:3306 (but not on other IP addresses) and on the Unix socket as well) without a password. Please use the appropriate mysqladmin commands (or modify the tables mysql.user, mysql.host and mysql.db directly) to set up access restrictions. Use the --skip-networking flag of mysqld to prevent it from listening on any TCP port (even 127.0.0.1:3306).
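
For example, to set a password for the root account you are connecting as, you can use mysqladmin from the MySQL client package installed above (this is only a sketch; the other entries in mysql.user, including the anonymous ones, still have to be adjusted separately):

$ mysqladmin --socket=/tmp/mariadb-compact/mysqld.sock --user=root \
  password 'use-a-strong-password-here'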

FYI, here is how mariadb-compact.tbz2 was created from the official MariaDB Linux binaries. The official download was located at http://askmonty.org/wiki/MariaDB:Download (more specifically, http://askmonty.org/wiki/MariaDB:Download:MariaDB_5.2.1-beta), and then the following commands were executed:

$ wget -O /tmp/mariadb-5.2.1-beta-Linux-i686.tar.gz \
  http://ftp.rediris.es/mirror/MariaDB/mariadb-5.2.1-beta/\
  kvm-bintar-hardy-x86/mariadb-5.2.1-beta-Linux-i686.tar.gz
$ mkdir -p /tmp/mariadb-preinst
$ cd /tmp/mariadb-preinst
$ tar xzvf /tmp/mariadb-5.2.1-beta-Linux-i686.tar.gz mariadb-5.2.1-beta-Linux-i686/\
  {bin/mysqld{,_safe},bin/my_print_defaults,scripts/mysql_install_db,\
  share/mysql/english/errmsg.sys,share/fill_help_tables.sql,\
  share/mysql_fix_privilege_tables.sql,share/mysql_system_tables.sql,\
  share/mysql_system_tables_data.sql,share/mysql_test_data_timezone.sql}
$ cd mariadb-5.2.1-beta-Linux-i686
$ strip -s bin/mysqld
$ ./scripts/mysql_install_db --basedir="$PWD" --force --datadir="$PWD"/data
$ rm -rf scripts
$ rm -f share/*.sql
$ cd ..
$ mv mariadb-5.2.1-beta-Linux-i686 mariadb-compact
$ tar cjvf /tmp/mariadb-compact.tbz2 mariadb-compact
$ rm -rf /tmp/mariadb-5.2.1-beta-Linux-i686.tar.gz
$ rm -rf /tmp/mariadb-preinst
$ ls -l /tmp/mariadb-compact.tbz2 
-rw-r--r-- 1 pts pts 4194393 Aug  8 18:10 /tmp/mariadb-compact.tbz2

The script mysql_install_db creates the initial databases and tables (e.g. the mysql.user table used for authentication) in the data directory.

All the other commands above just extract the absolute minimum necessary files from the official binary distribution, strip the binaries, and create a small .tbz2 archive.

Please note that the binaries in the official binary distributions are dynamically linked:

$ ldd bin/mysqld
        linux-gate.so.1 =>  (0xb7727000)
        libnsl.so.1 => /lib/tls/i686/cmov/libnsl.so.1 (0xb76f9000)
        librt.so.1 => /lib/tls/i686/cmov/librt.so.1 (0xb76f0000)
        libpthread.so.0 => /lib/tls/i686/cmov/libpthread.so.0 (0xb76d6000)
        libwrap.so.0 => /lib/libwrap.so.0 (0xb76cd000)
        libdl.so.2 => /lib/tls/i686/cmov/libdl.so.2 (0xb76c9000)
        libresolv.so.2 => /lib/tls/i686/cmov/libresolv.so.2 (0xb76b5000)
        libcrypt.so.1 => /lib/tls/i686/cmov/libcrypt.so.1 (0xb7683000)
        libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0xb7590000)
        libm.so.6 => /lib/tls/i686/cmov/libm.so.6 (0xb756a000)
        libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb7425000)
        /lib/ld-linux.so.2 (0xb7728000)
        libgcc_s.so.1 => /lib/libgcc_s.so.1 (0xb7407000)

It would be straightforward to create the 64-bit version of mariadb-compact.tbz2 by starting from a different official binary distribution (http://ftp.rediris.es/mirror/MariaDB/mariadb-5.2.1-beta/kvm-bintar-hardy-amd64/mariadb-5.2.1-beta-Linux-x86_64.tar.gz) and modifying the commands above appropriately.

A newer version (5.2.9) of this precompiled 32-bit Linux MariaDB is now available: http://pts-mini-gpl.googlecode.com/svn/trunk/portable-mariadb.release/portable-mariadb.tbz2. Here is the shell script which generated it from the official MariaDB 5.2.9 sources: http://code.google.com/p/pts-mini-gpl/source/browse/trunk/portable-mariadb/build-mariadb.sh. Here is the newer blog post: http://ptspts.blogspot.com/2011/11/announcing-portable-mariadb-small.html.

2010-07-29

How to create a bootable Ubuntu Lucid USB pen drive with persistent storage

This blog post explains how to create a bootable Ubuntu Lucid Lynx (10.04) USB pen drive with persistent storage. Persistent storage means that all changes made to the system settings and all files in the user's home directory are saved to the pen drive and are available after system reboot.

Connect your USB pen drive.

The Ubuntu Lucid live system supports persistent storage by default. On an existing Ubuntu Lucid system, run System / Administration / Startup Disk Creator. If you have an existing Ubuntu Karmic system, run sudo apt-get install usb-creator to install the Startup Disk Creator program, and then run System / Administration / USB Startup Disk Creator.

Make sure that the Stored in reserved extra space option is selected (it's selected by default).

Download the live CD ISO image to your hard drive from http://releases.ubuntu.com/lucid/ubuntu-10.04-desktop-i386.iso. If you want to download from a faster mirror, please find the ISO image link on http://releases.ubuntu.com/lucid/. For maximum compatibility, select the Intel x86 (or i386, or 32-bit) version. The download may take 30 minutes or more.

In the startup disk creator window, specify the downloaded .iso file by clicking on the Other... button.

For maximum compatibility, format your pen drive by clicking on the Format button in the startup disk creator window.

Click on the Make Startup Disk button.

Eject (remove) the USB pen drive in the file manager. Disconnect the pen drive.

Boot the computer from your USB stick as usual. There is no configuration necessary. All files you create in the user's home directory in the live system, and all settings you change, are retained across reboots.

2010-07-27

How to add (generate) locales on Debian and Ubuntu

This blog post gives instructions to add (and generate) locales on Debian and Ubuntu systems. The instructions given here work on Debian Etch and newer, and Ubuntu Hardy and newer (including Intrepid, Jaunty and Karmic).

To add the locales hu_HU.ISO8859-2 and hu_HU.UTF-8, run

$ echo 'hu_HU.ISO8859-2 ISO-8859-2' | sudo tee -a /var/lib/locales/supported.d/hu
$ echo 'hu_HU.UTF-8 UTF-8' | sudo tee -a /var/lib/locales/supported.d/hu
$ sudo dpkg-reconfigure locales 

To verify that the locales are installed correctly, run the following commands and verify that they don't print anything:

$ LC_ALL=hu_HU.ISO8859-2 perl -e0
$ LC_ALL=hu_HU.UTF-8 perl -e0
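
As an extra check (not part of the original instructions), the same verification can be done from Python; locale.setlocale() raises locale.Error for a locale that hasn't been generated:

import locale

for name in ('hu_HU.ISO8859-2', 'hu_HU.UTF-8'):
  locale.setlocale(locale.LC_ALL, name)  # raises locale.Error if missing
  print(name + ' is available')
locale.setlocale(locale.LC_ALL, '')  # restore the environment default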

2010-07-17

An approximation of pi in bc

This blog post shows an approximation of the constant π (3.14...) implemented as a bc program using Machin's formula.

Machin's formula is documented on http://en.wikipedia.org/wiki/Numerical_approximations_of_%CF%80#Machin-like_formulae. The implementation below computes acot using its Taylor series expansion, always rounding down for divisions. The implementation knows the number of good digits by computing an upper bound of the total rounding error.

#! /usr/bin/bc -q
# by pts@fazekas.hu at Sat Jul 17 17:38:25 CEST 2010

/** Return an approximation and lower bound for acot(x) * 10 ^ u. */
define acot(x, u) {
  auto sum, xpower, xx, n, term
  sum = xpower = u / x
  xx = x * x
  n = 3
  for (;;) {
    xpower /= xx
    term = xpower / n
    if (term == 0)
      return sum
    sum -= term
    n += 2
    xpower /= xx
    term = xpower / n
    if (term == 0)
      return sum
    sum += term
    n += 2
  }
}

/* Return an integer upper bound for the >= 0 error value pi * u - f(u),
 * where u = 10 ^ nd, and f(u) is the integer 4 * (4 * acot(5, u) - acot(239, u)).
 *
 * The magic constants in the maxerr implementation were derived from analyzing
 * the acot implementation, taking into account the rounding (truncation) done
 * in each division.
 */
define maxerr(nd) {
  return (286135312 * nd + 41739380) / 10000000
}

/* Return a string of at most nd characters, prefix of pi,
 * assuming nd >= 4.
 */
define pi(nd) {
  auto u, y
  u = 10 ^ nd
  y = 4 * (4 * acot(5, u) - acot(239, u))
  y /= 10 ^ (length(maxerr(nd)) - 1)
  while (y % 10 == 0) {
    y /= 10
  }
  return y / 10
}

/* Print pi with increasing precision infinitely (until aborted). */
define pinfinite() {
  auto b, nb, nd, a, na
  print "3."
  b = 3
  nb = 1
  nd = 8
  for (;;) {
    a = pi(nd)
    na = length(a)
    /* Print digits not printed yet in previous iterations. */
    print a % (10 ^ (na - nb))
    b = a
    nb = na
    nd *= 3
  }
}

pinfinite()  /* Infinite loop. */

Here is the equivalent program in dc:

[lulx/dspsslxd*sw3snlDx]sC[ls2Q]sS[lplw/dspln/dst0=Slslt-ssln2+snlplw/dsp
ln/dst0=Slslt+ssln2+snlDx]sD[ly10/dsy10%0=Q]sQ[10ld^sz5sxlzsulCx4*239sxlz
sulCx-4*10ld286135312*41739380+10000000/Z1-^/dsy10%0=Qly10/]sP[10lksdlPxd
sgZdsilj-^lgr%nlgshlisjlk3*sklIx]sI[3.]n3sh1sj8sklIx
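
For comparison, here is a rough Python sketch (not part of the original post) of the same fixed-point idea: every value is scaled by a power of ten, every division truncates, and a handful of guard digits is used instead of the exact maxerr() bound of the bc version:

def acot(x, u):
  # Scaled acot(x) * u using the Taylor series, truncating on every division,
  # just like the bc version above.
  total = term = u // x
  xx = x * x
  n = 3
  sign = -1
  while term:
    term //= xx
    total += sign * (term // n)
    sign = -sign
    n += 2
  return total

def pi_digits(nd):
  # Return "3" followed by nd decimal digits of pi. Instead of the exact
  # maxerr() bound above, 10 extra guard digits are computed and discarded,
  # which more than covers the accumulated truncation error for practical nd.
  u = 10 ** (nd + 10)
  p = 4 * (4 * acot(5, u) - acot(239, u))
  return str(p)[:nd + 1]

print(pi_digits(50))  # 314159265358979323846264338327950288419716939937510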

2010-06-27

How to enable bitmap fonts on Ubuntu Karmic and Ubuntu Lucid?

This blog post gives instructions to enable bitmap fonts for GNOME and other Xft X11 applications. The instructions given have been tested on Ubuntu Karmic and Ubuntu Lucid, but they might work on other Unix systems as well with minor modifications. The reason for enabling bitmap fonts is to get crisp font rendering in dialog boxes, menus and window titles.

Run these commands to install some bitmap fonts from packages:

$ sudo apt-get install xfonts-mplus xfonts-bitmap-mule xfonts-base
$ sudo apt-get install xfonts-75dpi{,-transcoded} xfonts-100dpi{,-transcoded}

Run this command to install some Microsoft vector fonts (e.g. Times New Roman, Arial, Courier New, Verdana). Installation might take a few minutes, because the font files have to be downloaded from an external site.

$ sudo apt-get install msttcorefonts

If you are using Ubuntu Karmic or earlier (not Ubuntu Lucid or later), then run these commands to install some QT configuration tools (needed for Skype):

$ sudo apt-get install qt3-qtconfig qt4-qtconfig

Run these commands to install some useful fonts (fixedsc is the classic 6x13 xterm monospaced bitmap font helpfully renamed to FixedSC, and helxetica is the Helvetica bitmap font helpfully renamed to Helxetica, for QT):

$ wget -qO- https://raw.githubusercontent.com/pts/fonts/master/helxetica.tgz |
       (cd / && sudo tar xzv)
$ wget -qO- https://raw.githubusercontent.com/pts/fonts/master/fixedsc.tgz  |
       (cd / && sudo tar xzv)

Enable bitmap fonts (this works in Ubuntu Hardy, Ubuntu Intrepid, Ubuntu Jaunty as well as in Ubuntu Karmic and Ubuntu Lucid):

$ sudo rm -f /etc/fonts/conf.d/70-{yes,no,force}-bitmaps.conf
$ if test -f /usr/share/fontconfig/conf.avail/70-force-bitmaps.conf
  then sudo ln -s /usr/share/fontconfig/conf.avail/70-force-bitmaps.conf /etc/fonts/conf.d/70-force-bitmaps.conf
  elif test -f /etc/fonts/conf.avail/70-force-bitmaps.conf
  then sudo ln -s {../conf.avail,/etc/fonts/conf.d}/70-force-bitmaps.conf
  else sudo ln -s {../conf.avail,/etc/fonts/conf.d}/70-yes-bitmaps.conf
  fi

The if above is needed, because /etc/fonts/conf.avail/70-force-bitmaps.conf has been introduced and /etc/fonts/conf.avail/70-yes-bitmaps.conf was made almost empty in Ubuntu Lucid. The correct (non-empty) contents are:

<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>
<!-- Accept bitmap fonts -->
 <selectfont>
  <acceptfont>
   <pattern>
     <patelt name="scalable"><bool>false</bool></patelt>
   </pattern>
  </acceptfont>
 </selectfont>
</fontconfig>

Run these commands to rebuild the font filename cache (so it will find all bitmap fonts, including the newly installed fonts):

$ sudo rm -f /var/cache/fontconfig/*
$ rm -rf "$HOME/.fontconfig"
$ sudo fc-cache
$ fc-cache
$ fc-list | grep -E 'Helxetica|FixedSC' | sort
FixedSC:style=Bold
FixedSC:style=Regular
Helxetica:style=Bold
Helxetica:style=Bold Oblique
Helxetica:style=Oblique
Helxetica:style=Regular

To have crisp fonts in dialog boxes, menus and window titles, set the following fonts in System / Preferences / Appearance / Fonts (on Ubuntu Lucid, use Helxetica instead of Helvetica):

  • Application font: Helvetica | 9
  • Document font: Arial | 10
  • Desktop font: Helvetica | 9
  • Window title font: Helvetica Bold | 9
  • Fixed width font: FixedSC | 10

To make the GNOME Panel reload its menu font, you have to restart it. Run

$ killall gnome-panel

In Applications / Accessories / Terminal / Edit / Profile preferences, make sure the Use the system fixed width font checkbox is ticked.

On Ubuntu Karmic and earlier, press Alt-F2, type qtconfig-qt3, press Enter, then wait for the Qt Configuration window to appear. In the tab Fonts, change the Family: to Helxetica (sic, it's not Helvetica), and the Point Size: to 9. In the menu, choose File / Save, then File / Exit.

On Ubuntu Karmic and earlier, repeat the previous paragraph with qtconfig-qt4 instead of qtconfig-qt3.

Notes:

  • There is no need to log out, restart Skype, restart Firefox, restart Nautilus, restart the GNOME Terminal or restart the GNOME Panel. All these applications pick up the font changes immediately.
  • The Fixed (6x13) font had to be renamed to FixedSC, because the original 6x13 font has the tag semicondensed, which makes it impossible to select in the GNOME font selection dialog (or in any fontconfig-based application).
  • The Helvetica bitmap font had to be renamed to Helxetica, because QT4 applications (such as Skype 2.1) cannot render non-latin-1 characters properly (they will find a substitution font for those characters, even though the characters are available in the Helvetica font).

2010-06-20

Does Btrfs survive silent disk data corruption in RAID1 mode?

Some of my experimenting with Btrfs in Linux 2.6.34 yielded the following results:

  • If only one disk (out of two) contains corrupt data, then Btrfs detects some checksum failures, and then recovers (overwriting the corrupt data with the corresponding good data from the other disk). This works even if the corruption happened while the filesystem was mounted.
  • If both disks contain corrupt data (but not at the same location), then Btrfs detects some checksum failures, then recovers some data, but it won't recover all of it: some files continue to have I/O errors when reading, and the syslog will contain checksum failures again and again. The explanation for this behavior may be that in Btrfs RAID 1 mode the two copies of a block of data might be at a different offset on the two disks.

Here is how I did the experiment:

  • I had a Linux 2.6.34 system.
  • I had 2 partitions of size 2000061 KB each. (1 KB == 1 << 10 bytes.)
  • # mkfs.btrfs -m raid1 -d raid1 /dev/sdc1 /dev/sdb1
  • # mount /dev/sdb1 /mnt/p
  • I copied /var/lib/dpkg (7248 small files of 36.93 MB) recursively to /mnt/p.
  • I copied 4 large files of 1517.98 MB in total to /mnt/p. (So the filesystem became >75% full.)
  • I created 10000 empty files.
  • I calculated the checksum of all files in /mnt/p with a userspace tool (see the checksumming sketch after this list).
  • I introduced the single-disk corruption by running dd if=/dev/zero of=/dev/sdb1 bs=1M count=1600 seek=200
  • I calculated the checksum of all files again. At this point the kernel reported some block checksum mismatches in the syslog, but eventually Btrfs recovered all the data, and the file checksums matched those of the previous run.
  • I introduced the non-overlapping multi-disk corruption by running dd if=/dev/zero of=/dev/sdb1 bs=1M count=800 seek=200 && dd if=/dev/zero of=/dev/sdc1 bs=1M count=800 seek=1000
  • I calculated the checksum of all files again. At this point the kernel reported some block checksum mismatches in the syslog, and it could recover some blocks, but not all; some files yielded an I/O error, but the checksums of the non-erroneous files matched those of the previous run.
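
For reference, here is a rough Python sketch of the kind of userspace checksumming tool mentioned above (the original tool isn't named, so this is only an illustration): it walks a directory tree, prints the SHA-1 of every regular file, and reports files which can't be read, so the outputs of two runs can be diffed.

import hashlib
import os
import sys

def file_sha1(path):
  digest = hashlib.sha1()
  f = open(path, 'rb')
  try:
    while True:
      block = f.read(1 << 20)
      if not block:
        break
      digest.update(block)
  finally:
    f.close()
  return digest.hexdigest()

def checksum_tree(root):
  # Print "<sha1>  <path>" for every regular file under root, and report
  # files which can't be read (e.g. because of unrecoverable checksum errors).
  for dirpath, dirnames, filenames in os.walk(root):
    for name in sorted(filenames):
      path = os.path.join(dirpath, name)
      if not os.path.isfile(path):
        continue
      try:
        print('%s  %s' % (file_sha1(path), path))
      except (IOError, OSError):
        print('READ ERROR  %s' % path)

if __name__ == '__main__':
  checksum_tree(sys.argv[1] if len(sys.argv) > 1 else '/mnt/p')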

2010-06-18

How to use the ssh-agent programmatically for RSA signing

This blog post explains what an SSH agent does, and then gives initial hints and presents some example code using the ssh-agent wire protocol (the SSH2 version of it, as implemented by OpenSSH) for listing the public keys added to the agent, and for signing a message with an RSA key. The motivation for this blog post is to teach the reader how to use ssh-agent to sign a message with RSA. Currently there is no command-line tool for that.

The SSH agent (started as the ssh-agent command in Unix, or usually as Pageant in Windows) is a background application which can store some SSH key pairs in its process memory in unencrypted form, for the convenience of the user. When logging in, ssh-agent is usually started for the user; the user then calls ssh-add (or a similar GUI application) to add his key pairs (e.g. $HOME/.ssh/id*) to the agent, typing the key passphrases if necessary. Afterwards, the user initiates SSH connections, for which the keys used for authentication are taken from the currently running agent. The SSH agent provides the convenience that the user doesn't have to type the key passphrases multiple times, plus that if agent forwarding is enabled (ForwardAgent yes is present in $HOME/.ssh/config), then the agent is available in SSH connections initiated from SSH connections (of arbitrary depth). The agent forwarding feature of ssh-agent is unique, because other in-memory key stores such as gpg-agent don't have that feature, so the keys stored there are available only locally, and not within SSH sessions.

The SSH agent stores both the public and the private keys of a key pair (and the comment as well), but it only ever discloses the public keys to applications connecting to it. The public keys can be queried (displayed) with the ssh-add -L command. But the ssh-agent can prove to the external world that it knows the private keys (without revealing the keys themselves), because it offers a service to sign the SHA-1 checksum (hash) of any string. SSH uses public-key cryptography, which has the basic assumption that a party can sign a message only if it knows the private key, but anyone who knows the public key can verify the signature.

ssh-add uses the SSH_AUTH_SOCK environment variable (containing the pathname of a Unix domain socket) to figure out which ssh-agent to connect to. The permissions for the socket pathname are set up so that only the rightful owner (or root) can connect, other users get a Permission denied.

For more information about how SSH uses public-key cryptography and agents, please read this excellent article.

Below is an example wire dump of an application using the SSH agent protocol to request the list of public keys added to an ssh-agent (just like ssh-add -L), and to ask the ssh-agent to sign a message with an RSA key added to it. The sign request is usually sent when an SSH client (ssh(1)) is authenticating itself to an SSH server, when establishing a connection.

To understand the wire dump below, one has to know that in the RSA public-key cryptography system the public key consists of a modulus and a public exponent, the modulus being a very large integer (2048 bits or longer), the exponent being a positive integer smaller than the modulus. Verifying a signature consists of interpreting the signature as a large integer, raising it to the public exponent, taking the result modulo the modulus, and comparing the result with the original message (or, in the case of SSH, with the SHA-1 checksum of the original message). Please read the article linked above for a full definition and the operation of RSA.
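
As a toy illustration of that verification step (with textbook-sized numbers, not taken from the post), the following Python snippet signs a value with the private exponent and verifies it with the public one; real SSH RSA keys use a modulus of 2048 bits or more and sign a padded SHA-1 digest rather than the raw message:

p, q = 61, 53
n = p * q  # modulus (3233); real moduli are 2048 bits or longer
e = 17     # public exponent
d = 2753   # private exponent: e * d == 1 (mod (p - 1) * (q - 1))

message = 65
signature = pow(message, d, n)  # signing needs the private exponent d
assert pow(signature, e, n) == message  # verification needs only (n, e)
print('signature %d verifies' % signature)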

# The client connects to the agent.
client.connect_to_agent()

# The clients lists the key pairs added to the agent.
client.write("\0\0\0\1")  # request_size == 1
  client.write("\v")  # 11 == SSH2_AGENTC_REQUEST_IDENTITIES
agent.write("\0\0\3\5")  # response_size == 773
  agent.write("\f")  # 12 == SSH2_AGENT_IDENTITIES_ANSWER
  agent.write("\0\0\0\2")  # num_entries == 2
    agent.write("\0\0\1\261")  # entry[0].key_size == 433
      agent.write("\0\0\0\7")  # key_type_size == 7
        agent.write("ssh-dss")
      agent.write("...")  # 443-4-7 bytes of the DSA key
    agent.write("\0\0\0\25")  # entry[0].comment_size == 21
    agent.write("/home/foo/.ssh/id_dsa")
    agent.write("\0\0\1\25")  # entry[1].key_size == 275
      agent.write("\0\0\0\7")  # key_type_size == 7
        agent.write("ssh-rsa")
      agent.write("\0\0\0\1")  # public_exponent_size == 1
        agent.write("#")  # public_exponent == 35
      agent.write("\0\0\1\1")  # modulus_size == 257
        agent.write("\0...")  # p * q in MSBFirst order
    agent.write("\0\0\0\25")  # entry[1].comment_size == 21
    agent.write("/home/foo/.ssh/id_rsa")

# The client gets the agent sign some data.
data = "..."  # 356 arbitrary bytes to get signed.
client.write("\0\0\2\206")  # request_size == 646
  client.write("\r")  # 13 == SSH2_AGENTC_SIGN_REQUEST
  client.write("\0\0\1\25")  # key_size == 277
    client.write("\0\0\0\7")  # key_type_size == 7
      client.write("ssh-rsa")
    client.write("\0\0\0\1")  # public_exponent_size == 1
      client.write("#")  # public_exponent == 35
    client.write("\0\0\1\1")  # modulus_size == 257
      client.write("\0...")  # p * q in MSBFirst order
  client.write("\0\0\1d")  # data_size == 356
    client.write(data)  # arbitrary bytes to sign
  client.write("\0\0\0\0")  # flags == 0
agent.write("\0\0\1\24")  # response_size == 276
  agent.write("\16")  # 14 == SSH2_AGENT_SIGN_RESPONSE  (could be 5 == SSH_AGENT_FAILURE)
  agent.write("\0\0\1\17")  # signature_size == 271
    agent.write("\0\0\0\7")  # key_type_size == 7
      agent.write("ssh-rsa")
    agent.write("\0\0\1\0")  # signed_value_size == 256
      agent.write("...")  # MSBFirst order

# The client closes the connection to the agent.
client.close()
agent.read("")  # EOF

Below is an example Python script which acts as a client to ssh-agent, listing the keys added (similarly to ssh-add -L), selecting a key, asking ssh-agent to sign a message with an RSA key, and finally verifying the signature. View and download the latest version of the script here.

#! /usr/bin/python2.4
import cStringIO
import os
import re
import sha
import socket
import struct
import sys

SSH2_AGENTC_REQUEST_IDENTITIES = 11
SSH2_AGENT_IDENTITIES_ANSWER = 12
SSH2_AGENTC_SIGN_REQUEST = 13
SSH2_AGENT_SIGN_RESPONSE = 14
SSH_AGENT_FAILURE = 5

def RecvAll(sock, size):
  if size == 0:
    return ''
  assert size >= 0
  if hasattr(sock, 'recv'):
    recv = sock.recv
  else:
    recv = sock.read
  data = recv(size)
  if len(data) >= size:
    return data
  assert data, 'unexpected EOF'
  output = [data]
  size -= len(data)
  while size > 0:
    output.append(recv(size))
    assert output[-1], 'unexpected EOF'
    size -= len(output[-1])
  return ''.join(output)

def RecvU32(sock):
  return struct.unpack('>L', RecvAll(sock, 4))[0]

def RecvStr(sock):
  return RecvAll(sock, RecvU32(sock))

def AppendStr(ary, data):
  assert isinstance(data, str)
  ary.append(struct.pack('>L', len(data)))
  ary.append(data)

if __name__ == '__main__':
  if len(sys.argv) > 1 and sys.argv[1]:
    ssh_key_comment = sys.argv[1]
  else:
    # We won't open this file, but we will use the file name to select the key
    # added to the SSH agent.
    ssh_key_comment = '%s/.ssh/id_rsa' % os.getenv('HOME')

  if len(sys.argv) > 2:
    # There is no limitation on the message size (because ssh-agent will
    # SHA-1 it before signing anyway).
    msg_to_sign = sys.argv[2]
  else:
    msg_to_sign = 'Hello, World! Test message to sign.'

  # Connect to ssh-agent.
  assert 'SSH_AUTH_SOCK' in os.environ, (
      'ssh-agent not found, set SSH_AUTH_SOCK')
  sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM, 0)
  sock.connect(os.getenv('SSH_AUTH_SOCK'))

  # Get list of public keys, and find our key.
  sock.sendall('\0\0\0\1\v') # SSH2_AGENTC_REQUEST_IDENTITIES
  response = RecvStr(sock)
  resf = cStringIO.StringIO(response)
  assert RecvAll(resf, 1) == chr(SSH2_AGENT_IDENTITIES_ANSWER)
  num_keys = RecvU32(resf)
  assert num_keys < 2000  # A quick sanity check.
  assert num_keys, 'no keys have been added to ssh-agent'
  matching_keys = []
  for i in xrange(num_keys):
    key = RecvStr(resf)
    comment = RecvStr(resf)
    if comment == ssh_key_comment:
      matching_keys.append(key)
  assert '' == resf.read(1), 'EOF expected in resf'
  assert matching_keys, 'no keys in ssh-agent with comment %r' % ssh_key_comment
  assert len(matching_keys) == 1, (
      'multiple keys in ssh-agent with comment %r' % ssh_key_comment)
  assert matching_keys[0].startswith('\x00\x00\x00\x07ssh-rsa\x00\x00'), (
      'non-RSA key in ssh-agent with comment %r' % ssh_key_comment)
  keyf = cStringIO.StringIO(matching_keys[0][11:])
  public_exponent = int(RecvStr(keyf).encode('hex'), 16)
  modulus_str = RecvStr(keyf)
  modulus = int(modulus_str.encode('hex'), 16)
  assert '' == keyf.read(1), 'EOF expected in keyf'

  # Ask ssh-agent to sign with our key.
  request_output = [chr(SSH2_AGENTC_SIGN_REQUEST)]
  AppendStr(request_output, matching_keys[0])
  AppendStr(request_output, msg_to_sign)
  request_output.append(struct.pack('>L', 0))  # flags == 0
  full_request_output = []
  AppendStr(full_request_output, ''.join(request_output))
  full_request_str = ''.join(full_request_output)
  sock.sendall(full_request_str)
  response = RecvStr(sock)
  resf = cStringIO.StringIO(response)
  assert RecvAll(resf, 1) == chr(SSH2_AGENT_SIGN_RESPONSE)
  signature = RecvStr(resf)
  assert '' == resf.read(1), 'EOF expected in resf'
  assert signature.startswith('\0\0\0\7ssh-rsa\0\0')
  sigf = cStringIO.StringIO(signature[11:])
  signed_value = int(RecvStr(sigf).encode('hex'), 16)
  assert '' == sigf.read(1), 'EOF expected in sigf'

  # Verify the signature.
  decoded_value = pow(signed_value, public_exponent, modulus)
  decoded_hex = '%x' % decoded_value
  if len(decoded_hex) & 1:
    decoded_hex = '0' + decoded_hex
  decoded_str = decoded_hex.decode('hex')
  assert len(decoded_str) == len(modulus_str) - 2  # e.g. (255, 257)
  assert re.match(r'\x01\xFF+\Z', decoded_str[:-36]), 'bad padding found'
  expected_sha1_hex = decoded_hex[-40:]
  msg_sha1_hex = sha.sha(msg_to_sign).hexdigest()
  assert expected_sha1_hex == msg_sha1_hex, 'bad signature (SHA1 mismatch)'
  print >>sys.stderr, 'info: good signature'

The wire dump above was created by attaching to a running ssh-agent with strace. The Python example code was written by understanding ssh-agent.c in the OpenSSH codebase.