Look out Martians
Nein!
My friend posix4e has today the most awesome description evah of the nature of actual programming:
Basically programming is often like playing golf in alice in wonderland with a referee who was trained by the SS.
e-bookin’
Ooh – I have a blog. Maybe I should use it to post things?
On that theory, I present British author Richard Herley. Mr. Herley – whose novel The Penal Colony was the basis for the film No Escape – is offering his books for free download. They are provided under a CC A-NC-ND licence, but he requests that “honourable” readers pay him a small fee if they enjoy the books. I really hope this sort of business model can work.
Ode to building, part 2
In the first part of this series I introduced the idea of what I termed abstract build policies — higher-level methods for describing how to produce the DAG of a build from a description of the desired inputs (sources) and results (targets). An abstract build policy allows developers to specify their sources and targets in abstract terms and let the build system sort out the details. For example, the autotools allow a developer to specify that they want to generate a shared library from a set of C source files:
lib_LTLIBRARIES = libexample.la
libexample_la_SOURCES = example.c example.h ...
But what's really going on under the hood?
Instead of requiring developers to describe the complete DAG of all sources-to-targets steps in a build, the autotools conspire to allow the developer to operate in terms of a set of higher-order abstractions:
- systems:: The build is occurring on a particular kernel/OS/hardware combination (the build-system) to run on another (the host-system) and may — if producing a compiler, etc. — generate system-specific entities for another (the target-system).
- variants:: The user invoking the build may choose to add optional parts of the build, remove other parts, and cause yet other parts to occur in an alternative fashion.
- sources:: The inputs to the build, some of which may themselves need to be generated from predecessor sources.
- targets:: The kinds from a restricted set (e.g., shared library), names (libexample.so), and other properties (somewhat abstract installation location or lack thereof) of entities to result from the build.
The key aspect of the abstraction is that the developer describes her build only in terms of the details relevant for the particular build. "This set of source files produces a shared library." The autotools handle the nitty-gritty of populating the DAG with all the nasty little details only real toolchain-weenies actually care about.
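As a rough illustration of what "populating the DAG" means, here is a minimal Python sketch that expands one abstract declaration ("these sources produce a shared library") into concrete build edges. The function name `shared_library_policy`, the rule layout, and the gcc-style commands are assumptions made for the example; what libtool really generates is far more involved and varies by platform.

```python
from pathlib import PurePosixPath


def shared_library_policy(name, sources):
    """Expand an abstract 'shared library' target into concrete DAG edges.

    Returns a list of (target, prerequisites, command) tuples. The commands
    are hypothetical gcc-style invocations for an ELF system.
    """
    headers = [s for s in sources if s.endswith(".h")]
    rules = []
    objects = []
    for src in sources:
        if not src.endswith(".c"):
            continue
        obj = str(PurePosixPath(src).with_suffix(".o"))
        objects.append(obj)
        # Each object depends on its own source plus every listed header.
        rules.append((obj, [src] + headers, f"cc -fPIC -c {src} -o {obj}"))
    lib = f"lib{name}.so"
    # The library node depends on all of the object nodes.
    rules.append((lib, objects, f"cc -shared -o {lib} " + " ".join(objects)))
    return rules


rules = shared_library_policy("example", ["example.c", "example.h"])
for target, prereqs, command in rules:
    print(f"{target}: {' '.join(prereqs)}\n\t{command}")
```

The developer supplies only the name and the source list; every other node and edge comes from the policy.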
In a perfect world (or for sufficiently simple/well-written code), the
developer doesn't care whether the code is being built with gcc or the Sun
Workshop compiler, for OpenBSD or AIX. If the code depends upon OS features
which vary from system to system, then the developer needs to account for those
variations, but only those variations. On Linux accessing /proc involves
text-file parsing, on Solaris using a library interface, but the developer
still doesn't care which compiler the build invokes or which flags the linker
requires to coax it into producing a shared library.
The key limitation of the autotools is that they provide no facility for
generating new abstractions. As I said in part 1, they provide a set of
abstract build policies, but no real build policy abstraction. The beauty
and downfall of the autotools is that they produce truly portable build
descriptions — build descriptions which depend only on the POSIX shell, POSIX
utilities, POSIX make, and a toolchain for the source language. But providing
even as simple a new abstract policy as one for producing a new target type or
compiling a new source language is impossible without wholesale modification of
the autotools core. And even then a new policy has little power to leverage
intermediate abstractions, ultimately needing to itself generate portable POSIX
make rules.
As originally promised for this part, part 3 will look at build tools which allow definition of new abstractions and what that implies.
Ode to building, part 1
[This is an only slightly edited repost from an earlier incarnation of my blog. The archives from that blog didn't make the transition, but I'm planning to continue the series, so here's this post again.]
How does one build a piece of software from source? The superficial answer is to run the tools1 on the source and intermediate files in the proper order. I'm sure we've all slapped something in a single C source file and just run the compiler on it from the shell. The next logical step beyond this would be a batch mechanism like a shell script running all the necessary commands. Perhaps worthy of a post to worsethanfailure.com if I still had access to the original, I've seen production binaries built with a file like:
#! /bin/sh
cc -c source1.c
cc -c source2.c
# ...
cc -o product source1.o source2.o
# ...
In practice most of us use higher-level tools (thank god). The make tool
represents the obvious next step "Unix philosophy"-wise. It assembles a
directed acyclic graph of files in the build and performs the steps necessary
to build the parts of the graph not yet present, leveraging the POSIX shell to
represent and execute each build step. Some projects — it seems to be
especially popular for kernels — use what I'll call Plain Old Make (POM),
applying make "directly."
I put "directly" in quotation marks because most large POM projects put a fair
amount of engineering effort into constructing higher-level abstractions
applied consistently and automatically throughout the project. In standard
Unix fashion make provides "mechanism not policy." In this case make
implements the "mechanism" of bringing all the DAG nodes up to date, but
requires make users to specify the "policy" of how to create each of those
nodes from their dependencies2.
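The split between mechanism and policy can be sketched in a few lines of Python. This is a toy model, not how any real make is implemented: `bring_up_to_date`, the `rules` and `mtimes` dictionaries, and the fake `clock` are all names invented for the example, and timestamps live in a dictionary rather than on disk so the sketch stays self-contained.

```python
def bring_up_to_date(target, rules, mtimes, clock):
    """Rebuild `target` if it is missing or older than any prerequisite.

    rules:  {target: (prerequisites, action)}, the user-supplied "policy"
    mtimes: {filename: timestamp}, a stand-in for the filesystem
    clock:  one-element list used as a fake, always-increasing clock
    Returns the list of targets rebuilt, in build order.
    """
    rebuilt = []
    prereqs, action = rules.get(target, ([], None))
    # Mechanism: visit prerequisites first (depth-first walk of the DAG).
    for p in prereqs:
        rebuilt += bring_up_to_date(p, rules, mtimes, clock)
    if action is None:
        # A leaf with no rule must already exist (a plain source file).
        if target not in mtimes:
            raise FileNotFoundError(f"no rule to make {target!r}")
        return rebuilt
    stale = target not in mtimes or any(mtimes[p] > mtimes[target] for p in prereqs)
    if stale:
        action(target, prereqs)  # Policy: how to create this node.
        clock[0] += 1
        mtimes[target] = clock[0]
        rebuilt.append(target)
    return rebuilt


log = []
record = lambda target, prereqs: log.append(target)
rules = {
    "source1.o": (["source1.c"], record),
    "source2.o": (["source2.c"], record),
    "product": (["source1.o", "source2.o"], record),
}
mtimes = {"source1.c": 1, "source2.c": 1}
clock = [1]

first = bring_up_to_date("product", rules, mtimes, clock)

# Simulate editing source2.c, then ask for the product again.
clock[0] += 1
mtimes["source2.c"] = clock[0]
second = bring_up_to_date("product", rules, mtimes, clock)

print(first)
print(second)
```

The first call builds everything; the second rebuilds only the nodes downstream of the touched source, which is the whole point of the graph.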
This leads to two interesting observations:
- Most projects — especially large ones — want factored build policies (node patterns) they can apply consistently throughout the project.
- Most projects share an awful lot of policy in common.
The GNU version of make includes extensions to help with the first3, but for
the second we move on to yet higher-level tools.
Most open source projects these days describe their builds using the GNU autotools: autoconf, automake, and libtool. The autotools provide abstract build policies which factor reusable build logic. They provide these policies in two ways. First, the autotools "know about" certain kinds of common build targets — they include the logic necessary to build a "program" or a "shared library" on supported platforms, requiring the developer only to specify the kind of target desired, not all the steps necessary to achieve it. Second, the autotools provide a policy for producing certain kinds of variant builds — they probe for features of the target platform and implement a policy for reacting to those features4.
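The second kind of policy, reacting to probed features, ends in a generated header of feature macros. The Python sketch below covers only that last step, under stated assumptions: the probe results are handed in as a dictionary (real autoconf obtains them by compiling small test programs), and `have_macro` and `render_config_h` are hypothetical helper names.

```python
import re


def have_macro(header):
    """Map a header name to an autoconf-style HAVE_* macro name."""
    return "HAVE_" + re.sub(r"[^A-Za-z0-9]", "_", header).upper()


def render_config_h(probe_results):
    """Turn {header: found?} probe results into config.h-style text."""
    lines = []
    for header, found in sorted(probe_results.items()):
        macro = have_macro(header)
        if found:
            lines.append(f"#define {macro} 1")
        else:
            # Absent features are left as commented-out #undef lines.
            lines.append(f"/* #undef {macro} */")
    return "\n".join(lines) + "\n"


config_h = render_config_h({"foo.h": True, "sys/bar.h": False})
print(config_h, end="")
```

Source code then tests `HAVE_FOO_H` with the preprocessor instead of asking which operating system it is being built on.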
Unfortunately, the autotools limit their abstract policy capabilities in an
important way: the inability to implement new abstract policies. The autotools
make it possible to descend to the mechanism level of make, m4, and the POSIX
shell, but only in the same concrete way those tools do on their own. The
autotools provide a collection of abstract policies, but no real policy
abstraction.
Build tools which implement real build policy abstractions move us out of the realm of current popular use and on into part 2 of this series...
1 e.g., compiler, linker, Whiz-bang Source Frobnicator Enterprise Edition
2 The pedantic will note make support for "pattern rules," but I consider
the fact that POSIX make pattern rules don't account for e.g. the
different ways an object file needs to be generated for inclusion in an
executable vs. a shared library enough for me to ignore them in this
discussion.
3 $(include) and friends, although they're pretty clunky in practice.
4 This would be your config.h and its HAVE_FOO_H etc. macros.
A compendium of awesomeness
Miro rocks. Due to its awesomeness I've been catching up on all those Google
Tech Talks I haven't found the time to stare at in a Web browser. And it gives
me a reason to stop cursing how wmii/stable1 considers "fullscreen" for
"chumps": miro in "fullscreen" mode fits quite nicely in one corner of my
laptop display.
Among the Google Tech Talks, the one on 7 Habits For Effective Text Editing 2.0
included a mention of a vim command which (in Emacs-speak) starts an
incremental search on the token at point. I rather liked that, which yields:
;; I-search with initial contents
(defvar isearch-initial-string nil)

(defun isearch-set-initial-string ()
  (remove-hook 'isearch-mode-hook 'isearch-set-initial-string)
  (setq isearch-string isearch-initial-string)
  (isearch-search-and-update))

(defun isearch-forward-at-point (&optional regexp-p no-recursive-edit)
  "Interactive search forward for the symbol at point."
  (interactive "P\np")
  (if regexp-p (isearch-forward regexp-p no-recursive-edit)
    (let* ((end (progn (skip-syntax-forward "w_") (point)))
           (begin (progn (skip-syntax-backward "w_") (point))))
      (if (eq begin end)
          (isearch-forward regexp-p no-recursive-edit)
        (setq isearch-initial-string (buffer-substring begin end))
        (add-hook 'isearch-mode-hook 'isearch-set-initial-string)
        (isearch-forward regexp-p no-recursive-edit)))))
But then the real awesomeness I can't believe I've only just recently discovered — what language do you think this is?:
cdef extern from "awesomelib.h":
    cdef int completely_awesome_function(char *nice)

cdef class CoolThing:
    cdef int totally

    def __init__(self, nice):
        self.totally = completely_awesome_function(nice)
Python? Not quite. C? Closer, in a sense. It's Pyrex, a Python-like language which encapsulates (but distinguishes) both Python and C constructs, then compiles to C code which uses the Python/C API to provide a Python module. Pyrex does the standard simple type conversions SWIG or a given <Foo>Inline will handle, but also allows the easy creation of extension types (Python objects with C data members) and by-god-simple Python callback routines passed to C callback APIs.
This might just be the straw which pushes me back into the squeezy embrace of Python as my LoC.
1 Debian's wmii/testing is 3.5, which breaks my ruby-wmii setup.
Will the real hack please stand up
I think I've changed my mind re: SVK and svn:externals, because it turns out
this works just like you'd hope it would1:
svk mirror svn://project1/ //mirror/project1
svk mirror svn://project2/ //mirror/project2
svk sync --all
svk cp //mirror/project1 //local/project1
svk cp //mirror/project2 //local/project1/vendor/project2
svk push //local/project1
Then later do:
svk pull //local/project1/vendor/project2
svk push //local/project1
I've decided that svn:externals is the real hack. I've generally seen it used
to solve two different problems: (a) exporting common modules as part of
different projects stored in the same repository; and (b) importing "vendor
branches" from other repositories. The problem is that it doesn't really do
either well.
First, svn:externals requires a fully-qualified URI. This causes it to break
if either the repository moves or the repository is accessed by different
protocols. Both of those problems can be "solved" with "don't do that," but
I've personally had cause for both. Obviously either of these will break
people vendoring from your repo, but a URI-change shouldn't break internal
module sharing. The fact of the matter is that the URI for a repo is a
non-canonical name — and Subversion even accounts for this fact by generating
a UUID for each repository.
Second, directly incorporating some chunk of repo into your project is rarely
what you want to do. I've been guilty of this before — and am rather
embarrassed thinking about it. I might be going out on a limb, but I think
most vendor branches need to (a) be pinned to a known-stable version; and (b)
include local modifications. For the former, svn:externals depends on either
good tagging in the source project or (shiver) pinning to a specific revision.
For the latter, svn:externals provides no help whatsoever. In fact, the
Subversion manual chapter on vendor branches barely mentions svn:externals,
preferring instead careful branch-management and merge practices2.
So I think SVK's way of doing things is quite a bit closer to a solution, although still not quite. It solves the problems discussed above, but still requires the local mirror exist (i.e., external configuration), and breaks when switching the actual source (tag, etc.) of the vendor branch3. "Here's 2 cents — go get yourself a real distributed version control system," I hear some of you saying...
1 For the SVK-unfamiliar, this is almost exactly what Piston lets you do, only in both directions.
2 And a Perl script, apparently.
3 Breaks pull and the UUID-tracking, that is. A generic smerge will still
work just fine.
Home sweet version-controlled home
I've been meaning to do this for quite a while, but I finally have my entire home directory under revision control. I've been keeping most of my important rcfiles in a Subversion repo — and of course no source code goes unversioned — but everything else has just never made it into anything other than my... um... "infrequent"1 off-site back-ups. Until now!
I looked briefly over several version control systems before finally deciding to stick with Subversion. It's stable, the most widely used by open source projects (after CVS), hacked on by someone I know, and only lacks one feature I'd miss. Work-use of SCM might have affected my decision, but my current day job is using StarTeam2 so... no.
The one feature I'd miss is distributed development. All the cool kids are doing it — not to mention the more recent change-set oriented OSS version control systems. So I'm layering that on with SVK. If you haven't heard of it, SVK is... well... a giant hack. And written in Perl none the less. It essentially uses the Subversion libraries to implement an alternate interface for working with Subversion repositories. The main intent is something like distributed development3, but the way SVK goes about it has some interesting effects.
The big one is how working copies are maintained. The SVN tools maintain with each working copy a pristine copy of the revision checked out from the remote repository. This allows for fast revert, diff, etc., and is also probably the easiest way to allow preparation of the change-set for commit. SVK, instead of maintaining that pristine revision, mirrors the remote repository. This allows diffing not only against the checked out revision, but against any revision in the mirror, or between any revisions in the mirror. Want to checkpoint some experimental code? Commit it locally and only push it out after it's finished. Need to switch from your experimental code to mainline development? Create local branches and switch your working copy between them at will.
And this is kind of petty, but not having those .svn directories in each
checkout means I can stop using this monstrosity:
export SFIND_PRUNE_PATS='*~:*/CVS:*/.svn'
sfind() {
    local dirs=''
    while [ $# != 0 ]; do
        echo "$1" | grep '^-' >/dev/null 2>&1 && break
        dirs="$dirs '$1'"; shift
    done
    local opts="$(squote "$@")"
    local prune_args="-path '$(\
        echo "$SFIND_PRUNE_PATS" | sed -e "s,:,' -o -path ',g")'"
    eval "find $dirs \
        '(' '(' $prune_args ')' -prune ')' \
        -o $opts -print"
}
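For readers who find the quoting above hard to follow, the same pruning idea can be sketched in Python. This is an approximation under assumptions, not a port: it handles a single starting directory, ignores the extra find options the shell function passes through, assumes POSIX path separators, and yields entries in sorted order rather than find's directory order. The names `PRUNE_PATS` and `pruned` are invented here.

```python
import fnmatch
import os

# Same colon-separated patterns as SFIND_PRUNE_PATS in the shell version.
PRUNE_PATS = "*~:*/CVS:*/.svn".split(":")


def pruned(path):
    """True if the whole path matches any prune pattern (like find -path)."""
    return any(fnmatch.fnmatchcase(path, pat) for pat in PRUNE_PATS)


def sfind(top):
    """Yield paths under `top`, skipping pruned files and directory trees."""
    for dirpath, dirnames, filenames in os.walk(top):
        # Editing dirnames in place stops os.walk from descending into them.
        dirnames[:] = sorted(
            d for d in dirnames if not pruned(os.path.join(dirpath, d))
        )
        yield dirpath
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if not pruned(path):
                yield path
```

Calling `list(sfind("."))` in a Subversion checkout would list the tree without any `.svn` directories, `CVS` directories, or editor backup files.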
SVK used to have a reputation for being a bit unstable. So far, the only problem I've had there is trying to use version 2.0's "views" feature, which is clearly tagged as "very beta." And even if it does fall over, my "real" repository is still just plain old Subversion.
Lest I mislead you into believing SVK to be all kittens and sunshine, I have
had some problems. SVK doesn't allow overlapping working copies, so keeping
all of ~ in a repo involves symlinks. SVK keeps all pending non-file
changes4 in one big YAML file — prepare a commit with a lot of deletes, and
you can feel the parse time. SVK doesn't support svn:externals, so WYSIWYG
when it comes to your repo's directory hierarchy. SVK's "views" are supposed
to provide similar functionality, but in addition to not being there yet
implementation-wise, they seem a bit off design-wise — and still aren't
svn:externals. And did I mention that it's written in Perl?
But it's mostly kittens and sunshine, and I don't have to worry about letting cups of coffee near my laptop anymore. For next blog post: where I'm keeping this homedir Subversion repo of mine...
1 i.e., usually made shortly after some sort of laptop + liquid incident.
2 I can vaguely understand using a proprietary version control system, but one which didn't have atomic commits until its most recent release? Come on, people.
3 Projects are still tied to a central repository, which is either impure or just practical, depending on how you look at it.
4 Directory hierarchy changes, property modifications, etc.
Fun with screen
Despite being such an Emacs zealot, I've never been able to love any of the
Emacs shell solutions. Emacs' terminal emulation is just off enough to bug me,
and eshell drives me nuts1. So that leaves me running a shell in a dedicated
terminal (emulator). Because I do most of my work in Emacs — including using
tramp to edit files on remote systems — I rarely need to have more than one
terminal visible at any one time, however many I have running. Solution: the
GNU screen "terminal multiplexer."
Check it out if you haven't heard of it, but this post is mostly about a screen
configuration trick I've found useful. Screen has the ability to have a status
line showing the open virtual terminals, each designated by a title. The title
of each window can either be static, or dynamically updated based upon a
watched-for pattern in the terminal output. The pattern-watching mechanism is
geared toward titling each virtual terminal with the currently running
command2. My configuration hack is to use that mechanism to display instead
the name of the host I've remotely logged into in that terminal.
Here's the relevant bit of my .screenrc:
shelltitle '@|'
hardstatus alwayslastline '%-Lw%{= BW}%50>%n%f* %t%{-}%+Lw%<'
The hardstatus directive is taken pretty much directly from the screen
documentation. The shelltitle directive tells screen that after it sees the
(built-in hard-coded) escape sequence "ESC k ESC \" everything between the
next '@' and the end of the line should be the terminal title.
For "normal" use of this feature, one would embed the escape sequence and pattern in one's shell prompt. I do the same thing, only tricky:
# 'screen' title escape
PS1="\033k\033\134@\h\n\033[1A"
I build my $PS1 incrementally in a prompt_command shell function, so that's
only the first part, but it's where the magic happens. Each time the shell
displays its prompt, it will print the "ESC k ESC \" escape sequence, then the
'@' screen is searching for, the hostname (which screen will pick up as the new
title), a newline (ending the title), then finally the terminal control escape
sequence to go back up a line.
End result: screen picks up the hostname as the title, but the rest of the
prompt overwrites the hostname output in the visible terminal text.
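The prompt prefix is easier to read when taken apart piece by piece. The Python sketch below only assembles the same characters the PS1 fragment above would emit, with the hostname passed in by hand; `screen_title_prefix` is a name made up for the example, and `example-host` is a placeholder.

```python
ESC = "\033"  # The escape character, written \033 in the PS1 string.


def screen_title_prefix(hostname):
    """Build the character sequence the prompt prints before the visible part."""
    return (
        ESC + "k" + ESC + "\\"  # "ESC k ESC \": the title escape screen watches for (\134 is a backslash)
        + "@" + hostname        # '@' is the marker from shelltitle; what follows becomes the title
        + "\n"                  # The newline ends the title
        + ESC + "[1A"           # Cursor up one line, so the rest of the prompt overwrites the hostname
    )


prefix = screen_title_prefix("example-host")
print(repr(prefix))
```

Printing the `repr` rather than the raw string keeps the control characters visible instead of letting the terminal act on them.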
1 I could probably hack the way command history and tab completion work, but scrolling behavior is so annoying I've just assumed it must be a hard problem to fix.
2 screen expects the new terminal title to end with a newline character.
Printing to e-ink
I bought myself a Sony Reader a few weeks ago. I'd been lusting after an e-ink based e-book reader since I first read an article about the technology in Popular Science almost 10 years ago. And now they exist! And can display PDFs!
I've been hacking on various randomness for the Reader since buying it. I've gotten familiar with the various PDF/PostScript-manipulation/-generation tools1; figured out the PDF viewer's exact usable screen area; wrote a small LaTeX class for Reader content and a usable-for-me Project Gutenberg TXT --> LaTeX converter2 (more on those after polishing); and figured out how to "print" directly to the Reader from Firefox (er, "Iceweasel") on GNU/Linux.
The following instructions form a rough guide to the last of those:
- Grab my lpr-reader.rb Ruby script and stick it in your PATH. Modify the script if you want "printjobs" to end up elsewhere than on your SD card. It depends on facets, Ghostscript, pdftk, Xdialog3, and Kovid Goyal's libprs500 command-line tools, so install all of those and their dependencies.
- Wasn't that fun? Nod your head "yes."
- In Firefox, go to the infamous URI about:config and find/create the key print.printer_list. It's a space-separated list of printer names, so add to it 'reader-portrait reader-landscape'.
- Find your Firefox profile directory (~/.mozilla/firefox/<profile>/), find/create user.js there, and copy into it the contents of lpr-reader.js (not just copy lpr-reader.js into the directory).
- Restart Firefox.
- Plug in your Reader and use the new printers to print a Web page! As the names might suggest, reader-portrait produces portrait-orientation PDFs while reader-landscape produces landscape-orientation PDFs.
Enjoy!, and please submit any patches to code or process via comments or e-mail.
1 Mostly the Ghostscript suite, html2ps, and pdftk.
2 I do know about GutenMark, but it has some pretty significant limitations — I'm getting much better results for LaTeX generation from a 200 line Ruby script.
3 Yes, yes — it should actually use a widget library. Feel free to submit a patch.