I've been doing a fair bit more reading on my Android phone, although recently I've switched from FBReaderJ to Aldiko. Each new release of FBReaderJ has gotten better, but Aldiko includes an actual CSS-based renderer1 and presently provides a much smoother reading experience2. I feel a little guilty using a piece of commercial software when a free-as-in-freedom solution exists, but not yet guilty enough to try hacking on FBReaderJ.

Guilt aside, reading more on a smaller screen has had me thinking again about the tension between book creator and reader in e-book formatting and layout. EPUB on the mobile phone highlights this tension more than e-ink devices if for no other reason than that a phone screen even less resembles a book page than an e-ink screen does. One conclusion I've come to is that it's somewhat unfortunate that Adobe has thus far been the biggest contributor to EPUB as a commercial e-book format.

On the one hand, someone had to do it, and it's good that someone has done it. Even the DRM thing, to some degree — most publishers aren't ready to do without it, and at least Adobe had the good grace to use an easily circumventable system. The major concern I have is with some aspects of the apparent mindset behind Adobe Digital Editions.

At present, Digital Editions is the EPUB viewer to beat — I have no idea what the actual usage figures look like, but it's the only viewer one can legally use for all those commercially-sold Adobe-DRMed EPUB books, so one has to imagine that DE commands the lion's share of the market. Adobe represents Digital Editions as being more than just an EPUB viewer. The advertising copy on the DE web site touts it as "offer[ing] an engaging way to view and manage eBooks and other digital publications." To this end, DE supports not only EPUB, but also the document format much more central to Adobe's business — PDF. And because PDF is so much more important to Adobe, DE caters to PDF at the expense of EPUB.

It's no surprise to the average e-book enthusiast that PDF's fixed-page nature makes it a poor e-book format. The most obvious reason is that PDF files can't be cleanly reflowed to alternative page sizes. A perhaps less obvious corollary is that PDF leaves little room for user control of basic formatting parameters such as font size, line height, text alignment, and paragraph marking. But via one or more chains of causality, because PDF rendering does not allow user-control of these properties, Adobe DE doesn't allow setting them for EPUB either. This yields an EPUB viewer where the reader has control over only the font size and page size. And even then only partial control! — DE will not rescale any size a book specifies as an absolute size, and DE allows books to provide "page template" files which control how the available screen area is divided into text regions and body columns.

The most generous interpretation of this decision is interface consistency. As long as Adobe is attempting to present PDF as an e-book format and provide a viewer which handles "digital publications" regardless of format, it only complicates that viewer's interface to provide options which apply to some formats but not others. Somewhat less charitably, Adobe's focus on PDF may have led to an unconscious bias toward creator control of document formatting, to the extent of perhaps not even considering placing control beyond font size (a.k.a. "zoom") in the readers' hands. And way over on the conspiracy-theory side of the fence, perhaps it represents a conscious decision on the part of Adobe to limit the usefulness (and adoption) of EPUB by making it only a small improvement over PDF even for the documents most suited to reflowable formatting.

So for whatever reason, the most popular EPUB viewer on the market has limitations which severely restrict the usefulness of the format. But EPUB is an open standard, which means other, better viewers are free to compete with Digital Editions and supplant it, right? Except they aren't really, because of DRM.

As I mentioned above, it seems that most publishers are not yet ready to do without DRM, despite the lesson of the music industry. The overwhelming majority of commercially-sold books are encumbered with DRM, and all of the DRM-encumbered EPUB books sold use Adobe's ADEPT DRM. It would be technically possible for a competitor to begin offering a different EPUB DRM scheme, but I can only imagine the degree of confusion mutually incompatible "EPUB" books would cause among average consumers. So any successful EPUB viewer device or application needs to license the ADEPT DRM technology from Adobe.

To facilitate this, Adobe has begun offering the "Adobe Reader Mobile 9 SDK." The SDK is available by license agreement only4, so we can but speculate from the marketing copy on the capabilities and interfaces it provides. The "features" list in the SDK FAQ focuses entirely on features of the "Reader Mobile document rendering engine," suggesting that the SDK primarily/only provides a rendering engine. If this is the case, then Adobe is not expecting — or potentially allowing — other vendors to write competing ADEPT-compatible EPUB renderers. Instead, all the available and announced EPUB reader apps/devices using the Adobe SDK5 will simply repackage the Adobe renderer and the paucity of options for user control it provides.

One example in support of this theory is Amazon's PDF support in the Kindle DX. Although not widely advertised, Amazon is apparently using the Adobe SDK, just integrating only the PDF renderer. Notably missing in the DX's PDF support vs. all the Kindles' Mobipocket support is the ability to add annotations to documents. It seems to me that this would be a "must have" feature, not only for parity with Mobipocket support, but also for the target market as a device for textbooks and technical documents. To me the most obvious explanation for this feature's lack is that there isn't an easy way to add it while still using the SDK to allow rendering of DRMed PDFs. There are plenty of F/OSS PDF renderers to which Amazon could have (comparatively) easily added annotation support, which suggests that the Adobe SDK's DRM support is an implicit part of the included renderer, and that SDK licensees cannot use the SDK to read DRMed documents independently of the renderer.

Another interesting angle is to compare the Adobe approach with how other book formats/viewers handle the creator-reader tension.

The desktop version of MSReader allows setting only the font size, but a significant number of users seem to regard it as still the best desktop e-book viewer available. A major component of this seems to be very well-chosen defaults for properties like line-height, and the infrequency with which books alter those properties. I have seen much more mixed reactions to the Mobile version, perhaps because on the smaller screens of mobile devices tuning the line-height, margins, etc to an individually comfortable size is much more important. Perhaps a commenter could fill me in?

The various Mobipocket viewers support differing assortments of user-controlled properties. Most support setting font-size, line-height, and paragraph alignment. Interestingly, Mobipocket's treatment of these properties demonstrates all three possible resolutions of the creator-reader formatting tension. Line-height may be set by the reader, but not by the book creator — the format simply provides no way to specify it. The font-size may be set by the book creator, but only in terms of a size relative to the reader-selected base size. And paragraph alignment may be specified by either the book creator or reader, but the creator's setting overrides the reader's when specified.
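Those three resolutions can be written down as a tiny Python sketch. The function and key names are made up for illustration and do not come from any Mobipocket tool, and the relative font-size rule is simplified to a plain multiplier.

```python
def effective_format(reader, creator):
    """Combine reader settings with creator markup, Mobipocket-style.

    Hypothetical model: `reader` holds the reader's chosen settings and
    `creator` holds whatever the book's markup specifies (possibly nothing).
    """
    return {
        # Reader only: the format gives the creator no way to set this,
        # so any creator value is simply never consulted.
        "line_height": reader["line_height"],
        # Creator may only scale relative to the reader-selected base size.
        "font_size": reader["font_size"] * creator.get("font_scale", 1.0),
        # Either side may set it, but the creator wins when present.
        "align": creator.get("align", reader["align"]),
    }


reader = {"line_height": 1.5, "font_size": 12.0, "align": "left"}
print(effective_format(reader, {}))
print(effective_format(reader, {"font_scale": 1.5, "align": "justify"}))
```

The third rule is the one the reader cannot win, which is why the alignment recommendation discussed next matters so much.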

Correspondingly, paragraph alignment is the subject of one of Mobipocket's most forceful formatting recommendations: "alignment must NOT be set if it is not strictly needed." In contrast, the EPUB specification documents contain little resembling formatting guidelines (beyond an admonition against using absolute positioning). The one purely formatting recommendation in Adobe's "EPUB Best Practices Guide" is "use spacing that looks more like a book," suggesting including CSS rules to eliminate default space around block-level elements. It does contain some sensible recommendations, e.g. against using <br/> tags in most contexts, but otherwise the "Guide" documents Adobe's EPUB extensions and DE's quirks more than actual "best practices." Which means that EPUB combines the most extensive e-book formatting capabilities with the fewest guidelines for producing actually readable books. And this is something of a challenge for authors of EPUB viewers.

Coming full circle to the beginning of this post, the Aldiko EPUB viewer does allow the user to set some basic formatting properties, including margins, font, font-size, and line-height6. The first three work seamlessly, but line-height somewhat less so. Aldiko handles books which don't specify their own line-height fine, but any book-specified line-height "wins." Which isn't intended as a slight on Aldiko — this is a difficult problem to solve.

Something like page margin can really only be set via CSS in one "most sensible" way7, and thus is easy to override coherently and consistently in a viewer. Font-size is potentially trickier, but the most obvious ways of specifying font sizes (relative sizes and the CSS named absolute sizes) make a simple solution fairly straightforward. Properties like 'line-height' unfortunately lack such a solution — all the allowed relative values for the CSS 'line-height' property are interpreted as relative to the 'font-size', not the 'line-height' of the parent element. This means that the font-size solution of "change the base and respect subsequent relative changes" doesn't work for line-height. Without doing some sort of layout analysis, all a viewer can do is either ignore all book-specified line-heights or respect all book-specified line heights.
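A few lines of Python arithmetic show the asymmetry. This is only a sketch of the CSS rules as described above, not how Aldiko or any other viewer is implemented, and the function names are mine.

```python
def computed_font_sizes(base_px, scales):
    """Each nested element's relative font-size multiplies its parent's size."""
    sizes = []
    size = base_px
    for scale in scales:
        size = size * scale
        sizes.append(size)
    return sizes


def computed_line_height(font_px, book_factor, reader_factor):
    """A relative line-height multiplies the element's own font-size.

    The parent's (reader-chosen) line-height never enters the formula, so a
    book-specified factor simply replaces the reader's preference.
    """
    factor = book_factor if book_factor is not None else reader_factor
    return font_px * factor


# Font-size: changing the reader's base rescales every nested element.
print(computed_font_sizes(16.0, [1.0, 1.5, 0.5]))
print(computed_font_sizes(20.0, [1.0, 1.5, 0.5]))

# Line-height: a book value of 1.25 gives the same result whether the
# reader asked for 1.5 or 2.0.
print(computed_line_height(16.0, 1.25, 1.5))
print(computed_line_height(16.0, 1.25, 2.0))
```

The reader's choice only shows up in the line-height result when the book says nothing at all, which matches the behaviour I see in Aldiko.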

Solution? I'm not sure. One possible solution would be for book producers and viewer authors to agree on guidelines which allow viewers to consistently override some set of formatting properties. Most of these could be fairly simple, like Mobipocket's "alignment must not be set" rule. Another solution would be for e-book reader apps to do some sort of pre-rendering layout analysis which allows them to automatically produce a per-book user stylesheet. I like hands-off technological solutions, but I'm not sure how feasible that will be on mobile devices.

Other ideas?

1 To my surprise, a good enough one to handle the 'max-width' property on images.

2 Quite literally, in the case of the page-turn animation.

4 Probably to stop people from reverse-engineering the DRM system.

5 The ones I'm currently aware of: the Sony Reader, Lexcycle Stanza, the Bookeen Cybook, and the Elonex ebook.

6 Not paragraph alignment yet, but via e-mail the author has said it'll probably be added soon.

7 One could set left and right margins on every block level element, but hopefully no one actually does that.

Thoughtcrime Experiments EPUB edition

posted on May 01, 2009

This past weekend Leonard Richardson (of BeautifulSoup fame) and Sumana Harihareswara released Thoughtcrime Experiments, a 100% independent, CC BY-NC-SA 3.0 licensed F/SF anthology. It's pretty rockin', in both concept and execution.

Ararche Jerico beat me to it by a couple days with his e-book editions of Thoughtcrime Experiments, but what I lack in speed, I make up in obsessive-compulsive attention to trivial details. Thus I present to you the Thoughtcrime Experiments EPUB pedant edition. To what sorts of trivial details does this edition attend? Glad you asked!:

  • Book-optimized stylesheet. I created a new stylesheet for the book from scratch, roughly following the design of the Thoughtcrime Experiments PDF/print edition. Beyond just looking more "book-like" (paragraphs marked by indentation rather than vertical whitespace, etc), I've also ensured that all body text maintains vertical rhythm, lines appearing at the same positions even following headings, breaks, and so on.
  • High-quality art. Includes higher-quality versions of the artwork than used for the Web edition, providing some degree of future-proofing for larger e-book reading displays. I've also taken pains to display the art nicely, despite the best efforts of the CSS standard to the contrary.
  • A beautiful, fully-scalable SVG cover. Ok, the SVG cover actually looks like crap / doesn't display at all in everything but AdobeDE. But it looks quite nice there, thank you very much.
  • Hand-corrected HTML->XHTML conversion. The source HTML had a handful of missing-tag etc. issues. I individually validated each file and hand-corrected it, ensuring perfect markup.
  • Fully normalized punctuation. The source HTML contained some variation in punctuation conventions and a handful of punctuation errors (a missing opening quotation mark, two hyphens instead of an emdash, etc). I've corrected the errors and normalized to a single set of conventions.
  • Squeaky-clean metadata. Not only passes epubcheck1, but does so with panache! Includes all required metadata, identifies each contributor with a separate metadata entry, and includes a metadata table of contents which mimics the structure of the PDF/print edition TOC.
  • Embeds a font. This is a bit of a pet peeve of mine, but the OPS spec doesn't actually require that EPUB reader systems provide a default set of glyphs for any particular set of Unicode characters in any particular typeface. This means that embedding a font is the only option if you want to be certain all the characters in your book will display properly. I chose TeX Gyre Pagella because it's (a) free-as-in-freedom and (b) looks nice and sassy2 at 10 points.

Enjoy!

1 Actually not completely true at the time of writing, due to what appears to be a bug in NCX validation.

2 By which I mean "legible."

Books on the small screen

posted on April 08, 2009

I've now read two full novels and a novella1 on my G1 using FBReaderJ. FBReaderJ is loosely a Java port of FBReader, which is a somewhat strange piece of software. Although FBReader has a fairly small (if vocal) user base in the English-speaking world, it's the de facto standard in Russia, where newly released books are even sold commercially in the FBReader-specific "Fiction Book 2" (FB2) format. The original FBReader supports most currently-used e-book formats, but FBReaderJ supports only FB2 and EPUB.

EPUB! That means it must have a full XHTML and CSS renderer, probably integrating WebKit in some fashion, right? Actually, no, and the software is probably better for it.

In e-book formats and reading software there exists a tension between book creator and reader control of formatting features. Mobipocket and eReader (and FB2) put almost all the control in the hands of the reader. In Mobipocket, for example, the book creator can specify the paragraph alignment and indentation, but not the line spacing or base font size2. LIT strikes something of a compromise by providing a desktop reader application which respects many creator-specified properties like base font size and line-spacing, but ignores them in the PocketPC version of the software. EPUB swings the pendulum all the way in to the side of the book creator — the creator may specify a particular typeface (and embed the necessary fonts), type size, line spacing, margins, column arrangement, and so on.

In theory this aspect of EPUB is wonderful, but I've become less and less enthusiastic about it. For many reasons, please see my last post on how Mobipocket hits a worse-is-better engineering sweet spot. One I just barely touch on there is usability (in the weak sense). I've spent many many hours writing and tweaking my EPUB-conversion pipeline in order to produce what I find to be optimally-readable files for my Sony Reader. I've had to bring to bear the entirety of my engineering skill to specify the font I want, the line-spacing I want, and the margins I want. In FBReaderJ, these are just settings in the application, and I was able to sort out something pleasing and readable in about 5 minutes.

This isn't without cost, of course. FBReaderJ doesn't have a real CSS parser, and barely understands HTML. In the books I've read with it so far, this has led to completely missing significant whitespace and absent meaningful formatting (such as text emphasized with italics). This is probably pushing the tradeoff too far in the counter-creator direction, but I do have to admit — rendering the text with a legible line-spacing is far more important for basic readability than a handful of blank lines and italics.

1 Richard Herley's Refuge, Cory Doctorow's Little Brother, and Doctorow and Rosenbaum's "True Names."

2 Well, mostly. Apparently there are some "large print" Kindle books out there which use the almost unbelievably bad hack of wrapping all text in <font/> elements and using a default @size of 5.

Worse-is-better in e-book formats

posted on April 03, 2009

It is the year 2009 and the most popular electronic book format in the world consists of HTML 3.2 in a Palm database. Why?

For those of you who aren't total e-book wonks, I'm talking about the Mobipocket format. The eponymous controlling company first debuted in 2000, proved the most successful of the initial crop of PDA-centric e-book format vendors, and was acquired by Amazon in 2005. The format is now not only the most popular "device independent" commercial e-book format, but is also the format used for the vast majority of Kindle e-books. More devices support Mobipocket than any other commercially-sold format, and although we don't have any actual sales data, I don't think there's much doubt that Amazon is selling more devices and books than any other vendor, at least in the US.

A lot of technical detail follows, but here's most of the conclusion: I think that Mobipocket has managed to hit a sort of technology/complexity engineering/usability sweet-spot. It uses (an albeit bastardized) HTML for markup, which gives it an edge over eReader; it uses an extremely simple container format, which gives it an edge over LIT; and it uses an appallingly simple rendering model, which gives it an edge over EPUB. This is kind of difficult to explain without going into loads and loads of detail about the e-book formats, so bear with me if I get long-winded.

Here's a summary of how Mobipocket stacks up technically against the other commercial reflowable formats:

  • Mobipocket: Container is a Palm database, providing primarily a mapping between a numeric identifier and "record" content. Book text is a single stream of HTML 3.2-ish with proprietary extensions (mostly formatting-oriented), no separate style language, and proprietary compression in 4k chunks. Images are GIF and JPEG, which some viewers limit to 64k in size.
  • eReader: Container is a Palm database. Book text is a single stream of "PML," a proprietary, non-SGML, primarily formatting-oriented markup language. Images are PNG, limited to 64k.
  • Microsoft LIT: Container is an MS "ITOL/ITLS" HTML Help 2.0 file, providing all sorts of exciting features like LZX compression, binary representation of markup content, and arbitrary optional auxiliary data associated with each content stream. Book text is an arbitrary number of OEBPS 1.0 markup streams (essentially HTML 4.0 as well-formed XML) plus a subset of OEBPS 1.0 CSS (no contextual selectors). Images are GIF, JPEG, and PNG.
  • EPUB: Container is a ZIP file with some extra JAR-like metadata. Book text is an arbitrary number of XHTML 1.0 streams1 with CSS 2 styling. Images are JPEG, PNG, and SVG.
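For a sense of how approachable the EPUB container is, here is a minimal Python sketch using only the standard library. It relies on two things from the OCF container spec: a `mimetype` entry holding `application/epub+zip`, and a `META-INF/container.xml` entry that points at the package document. The function name is my own, and error handling is left out.

```python
import xml.etree.ElementTree as ET
import zipfile

CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}


def find_package_document(epub_file):
    """Return the archive path of the package (OPF) document.

    `epub_file` may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(epub_file) as zf:
        # The container identifies itself with a plain-text "mimetype" entry.
        mimetype = zf.read("mimetype").decode("ascii").strip()
        if mimetype != "application/epub+zip":
            raise ValueError(f"unexpected mimetype: {mimetype!r}")
        # The JAR-like metadata: a fixed-name XML file under META-INF/.
        root = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = root.find("c:rootfiles/c:rootfile", CONTAINER_NS)
        return rootfile.get("full-path")
```

Calling `find_package_document("book.epub")` on a well-formed file returns something like an `.opf` path inside the archive; everything else in the book hangs off that document.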

Mobipocket and eReader are near the same level technically, but Mobipocket's success — and eReader's continued existence — in the face of LIT and EPUB I think is quite interesting. EPUB is still very new, but Microsoft first released Microsoft Reader in 2000, the same year that Mobipocket incorporated. LIT is a closed format which relies on quite a bit of MS-specific technology, but EPUB is an open standard and composed mostly of other existing and well-supported open standards. Cross-comparison is far from free of contaminating historical/environmental factors, but I think there is something to be learned from Mobipocket — that worse is still better.

Mobipocket and eReader were founded around the same time (eReader in 1998 as Peanut Press) and with essentially the same constraints. Both primarily targeted PDAs running PalmOS and thus needed a format which they could easily render on (by today's standards) woefully under-powered devices. EReader chose to create their own simple, easy-to-parse markup language called PML (Peanut Markup Language). Even after 10 years, the core language remains comparatively elegant, containing only a few dozen tags and no redundancy outside of deprecated features. It simply and directly supports most of the formatting features possible on a Palm PDA with no pretence of semantic connotation.

In contrast, Mobipocket chose to extend the existing HTML 3.2 markup language. It's somewhat difficult to understand all the reasons for this decision without knowing the politics involved or the existing state of HTML rendering on the Palm platform at the time, but certain things are clear. First, rendering HTML 3.2 is more difficult than rendering a PML-like language — HTML is much more complicated, is harder to parse, and contains many redundancies in the formatting-wise interpretation of its elements. HTML 4 and CSS 2 were the cutting-edge standards in 2000, so choosing HTML 3.2 didn't provide much coherence with existing Web standards, a problem furthered by the addition of proprietary features implementing such things as page breaks and paragraph indentation.2

It is unclear to me how much Mobipocket's similarity to HTML aids strict interoperability, but the perception that they are interoperable clearly exists, and for the quick-and-dirty case, the interpretation of texts with no fancy formatting, the correspondence is simply good enough. The correspondence between HTML semantics and Mobipocket formatting is sufficiently weak that when I wrote Mobipocket-generation support for calibre I opted to treat "MobiML" as a completely distinct pure-formatting language. This approach allows for a higher degree of formatting fidelity (although not complete — Mobipocket's formatting limitations are many and baroque), but in 99% of cases, throwing some HTML at the Mobipocket renderer will render well enough to actually read the text.

One of the big simplicity wins a language like PML has over HTML is that it has an explicitly "flat" rendering and markup model. Tags introducing paragraphs completely determine the paragraph-level formatting of their contained text, the renderer either disregarding or disallowing any previously active formatting state. In contrast, all versions of HTML allow some degree of arbitrary block-level nesting in which the active formatting state combines and merges with the new state. This means that to start accurately rendering at some arbitrary point — at the destination of a hyperlink, or just where the user left off the last time they were reading — a reader application needs to be able to figure out the current formatting state at that point. There are a few solutions to this problem.

One is to do what Microsoft LIT does and put lots of extra information in the container. The LIT container is compressed in 64k chunks, which allows full compression with random seeking to anything in the container with minimal decompression overhead. LIT contains indices of all the hyperlink target elements and all page-breaking elements which specify the positions of all their ancestor elements. These combined with LIT's simplified contextless CSS mean a full LIT renderer3 can figure out the formatting state of anywhere in the book with a minimum amount of extraneous processing. Accurate rendering is just a few index lookups away!
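The chunking half of that design is easy to sketch in Python. Two loud assumptions: this uses zlib in place of LZX (the standard library has no LZX codec), and it models the container as a plain list of independently compressed chunks, which is not LIT's actual on-disk layout. The function names are mine.

```python
import zlib

CHUNK = 64 * 1024  # the 64k chunk size mentioned above


def compress_chunks(data, chunk_size=CHUNK):
    """Compress each fixed-size slice independently so any one can be read alone."""
    return [
        zlib.compress(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    ]


def read_at(chunks, offset, length, chunk_size=CHUNK):
    """Return `length` bytes starting at `offset`, decompressing only the chunks touched."""
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    out = bytearray()
    for index in range(first, min(last, len(chunks) - 1) + 1):
        out += zlib.decompress(chunks[index])
    start = offset - first * chunk_size
    return bytes(out[start:start + length])
```

A seek into the middle of a large book touches one chunk, or two if the requested range straddles a boundary, no matter how long the book is. The ancestor indices are what then turn that cheap seek into correct formatting.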

The downside is that all this is very complicated. Microsoft is able to piggy-back off of their HTML Help support libraries, but third-party implementations would need to read the whole multilayered ITOL/ITLS archive goodness from scratch and incorporate the indices into their renderers. There are many third-party reader applications which can handle Mobipocket, but very few which handle LIT, and none that I know which actually use all the extra information in the LIT container format.4

Another solution to the problem is to do what EPUB does and depend on having fast enough hardware that figuring out the appropriate rendering isn't an issue. EPUB is conceptually very simple — XHTML in a ZIP archive. On contemporary hardware with contemporary embeddable HTML renderers like WebKit this is almost trivial to implement — I believe Kovid Goyal put together the first version of the calibre EPUB viewer in about a week.

The downside is that your cellphone or e-ink display reader isn't quite so powerful. The EPUB specification places no limitation on CSS complexity or XHTML flow size, allowing for example a CSS sibling selector applied over a 100MB XHTML file. Not to mention that you have to decompress that 100MB flow all at once and keep the whole thing around in memory while rendering it. Ouch. There are a few ways to keep processing time generally sane while still rendering most markup correctly, but Adobe's implementation on the Sony Reader of simply refusing to render any flow larger than 300k has forced that simple expedient as the most common method.
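If you produce EPUB files, checking for this is cheap. The Python sketch below lists markup entries in the ZIP whose uncompressed size exceeds a limit. I'm assuming "300k" means 300 KiB of uncompressed markup, and I'm guessing at flows by file extension instead of reading the package manifest, so treat it as a rough check; the names are mine.

```python
import zipfile

FLOW_LIMIT = 300 * 1024  # assumption: "300k" read as 300 KiB, uncompressed
MARKUP_SUFFIXES = (".xhtml", ".html", ".htm")  # heuristic, not the OPF manifest


def oversized_flows(epub_file, limit=FLOW_LIMIT):
    """Return (name, uncompressed_size) for each markup entry larger than `limit`."""
    with zipfile.ZipFile(epub_file) as zf:
        return [
            (info.filename, info.file_size)
            for info in zf.infolist()
            if info.filename.lower().endswith(MARKUP_SUFFIXES)
            and info.file_size > limit
        ]
```

Anything this reports is a candidate for splitting into smaller flows before the book goes anywhere near a constrained device.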

And then there's Mobipocket's solution — don't render the markup correctly. Yep, you heard me correctly. The text still displays, but if that hyperlink drops you in the markup right after the italics tag, then the text which was supposed to be in italics won't be. Oh, it has chunked compression like LIT, so it can seek anywhere with minimal decompression overhead, but when it gets there it just starts rendering with what it's got.
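Here is a toy Python model of the difference. It handles only simple `<tag>` and `</tag>` pairs and bears no relation to the real Mobipocket renderer; all names are invented for the illustration. The "correct" approach replays every tag before the jump target to recover the formatting state, while the shortcut starts from an empty state.

```python
import re

TAG = re.compile(r"<(/?)(\w+)>")


def apply_tag(active, closing, name):
    """Update the list of open tags in place for one tag."""
    if not closing:
        active.append(name)
    elif name in active:
        # Close the most recently opened tag with this name.
        for i in range(len(active) - 1, -1, -1):
            if active[i] == name:
                del active[i]
                break


def state_at(markup, offset):
    """Correct approach: scan everything before `offset` to learn what is open."""
    active = []
    for match in TAG.finditer(markup, 0, offset):
        apply_tag(active, *match.groups())
    return active


def render_from(markup, offset, open_tags):
    """Return (text, active_tags) runs from `offset`, given a starting state."""
    runs, active, pos = [], list(open_tags), offset
    for match in TAG.finditer(markup, offset):
        if match.start() > pos:
            runs.append((markup[pos:match.start()], tuple(active)))
        apply_tag(active, *match.groups())
        pos = match.end()
    if pos < len(markup):
        runs.append((markup[pos:], tuple(active)))
    return runs


markup = "<p>Plain <i>slanted text</i> plain again</p>"
target = markup.index("slanted")
print(render_from(markup, target, state_at(markup, target)))  # italics kept
print(render_from(markup, target, []))                        # italics lost, text intact
```

The cost of `state_at` grows with the distance from the start of the flow, while the shortcut costs nothing, and stray closing tags are simply ignored.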

When I first realized this I was completely aghast. "It's wrong! They're letting text be rendered wrong!" But the more I thought about it, the more I came to feel that this was actually a brilliant decision. It means that rendering is always instantaneous, no matter where the user jumps to in the book. Although book-producers have to do some extra work if they want all their links to point to Mobipocket-sensible places in the markup, if they don't then the book is still readable. Pathological cases merely degrade rendering, not disrupt it. Contrast this with the EPUB experience-thus-far, where books which cannot be read on the Sony Reader are an unfortunately common occurrence.

Of course, I still completely detest the Mobipocket format and hope it dies in a fire. Seriously, a Palm database in 2009? What I'd like to see is a new version of the EPUB standard which takes the lessons learned from other formats more to heart. With a reduction in complexity, standardization of certain necessary size limits, and at least reader application guidelines for imposing user stylesheets, I think that EPUB can still be the format to beat going forward.

1 Or technically DTBook, although I've never seen one in the wild and don't know which — if any — viewers support it.

2 Although there is evidence some of these were added later. For example, if the Mobipocket file header indicates one of the earliest versions of the format then the <hr> tag induces an explicit page break.

3 That is to say, Microsoft Reader.

4 In fact, AFAIK I was the first person to even bother reverse-engineering them when I implemented LIT generation for calibre.

Is this thing on?

posted on August 19, 2008

I've been busy and blog-neglecting lately, but I'm still here and hope to start blogging again soon. And I want to test that my bits haven't completely rotted while I wasn't paying attention.

Update: Still fresh!

Look out Martians

posted on March 26, 2008

Thomas Edison has your number.

Nein!

posted on March 25, 2008

My friend posix4e has today the most awesome description evah of the nature of actual programming:

Basically programming is often like playing golf in alice in wonderland with a referee who was trained by the SS.

e-bookin’

posted on March 21, 2008

Ooh – I have a blog. Maybe I should use it to post things?

On that theory, I present British author Richard Herley. Mr. Herley – whose novel The Penal Colony was the basis for the film No Escape – is offering his books for free download. They are provided under a CC A-NC-ND licence, but he requests that “honourable” readers pay him a small fee if they enjoy the books. I really hope this sort of business model can work.

Ode to building, part 2

posted on August 08, 2007

In the first part of this series I introduced the idea of what I termed abstract build policies — higher-level methods for describing how to produce the DAG of a build from a description of the desired inputs (sources) and results (targets). An abstract build policy allows developers to specify their sources and targets in abstract terms and let the build system sort out the details. For example, the autotools allow a developer to specify that they want to generate a shared library from a set of C source files:

lib_LTLIBRARIES = libexample.la
libexample_la_SOURCES = example.c example.h ...

But what's really going on under the hood?

Instead of requiring developers to describe the complete DAG of all sources-to-targets steps in a build, the autotools conspire to allow the developer to operate in terms of a set of higher-order abstractions:

systems:: The build is occurring on a particular kernel/OS/hardware combination (the build-system) to run on another (the host-system) and may — if producing a compiler, etc. — generate system-specific entities for another (the target-system).
variants:: The user invoking the build may choose to add optional parts of the build, remove other parts, and cause yet other parts to occur in an alternative fashion.
sources:: The inputs to the build, some of which may themselves need to be generated from predecessor sources.
targets:: The kinds from a restricted set (e.g., shared library), names (libexample.so), and other properties (somewhat abstract installation location or lack thereof) of entities to result from the build.

The key aspect of the abstraction is that the developer describes her build only in terms of the details relevant for the particular build. "This set of source files produces a shared library." The autotools handle the nitty-gritty of populating the DAG with all the nasty little details only real toolchain-weenies actually care about.
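As a rough illustration of what such a policy does, here is a Python sketch that expands "this set of source files produces a shared library" into explicit DAG edges. It is not how the autotools work internally. The `cc`, `-fPIC`, and `-shared` spellings are GCC-style assumptions hard-coded for the example, and working out the right ones per platform is precisely the job the real tools take off the developer's hands.

```python
def shared_library(name, sources, cc="cc"):
    """Expand an abstract 'shared library from C sources' request into rules.

    Returns {target: (dependencies, command)}. Hypothetical helper: headers
    are skipped instead of being tracked as dependencies, to keep this short.
    """
    rules = {}
    objects = []
    for src in sources:
        if not src.endswith(".c"):
            continue  # e.g. example.h is not a compilation unit
        obj = src[:-2] + ".o"
        rules[obj] = ([src], f"{cc} -fPIC -c {src} -o {obj}")
        objects.append(obj)
    lib = f"lib{name}.so"
    rules[lib] = (objects, f"{cc} -shared -o {lib} " + " ".join(objects))
    return rules


for target, (deps, command) in shared_library("example", ["example.c", "example.h"]).items():
    print(target, "<-", deps, "::", command)
```

The developer wrote two lines; the policy produced every node and edge.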

In a perfect world (or for sufficiently simple/well-written code), the developer doesn't care whether the code is being built with gcc or the Sun Workshop compiler, for OpenBSD or AIX. If the code depends upon OS features which vary from system to system, then the developer needs to account for those variations, but only those variations. On Linux accessing /proc involves text-file parsing, on Solaris using a library interface, but the developer still doesn't care which compiler the build invokes or which flags the linker requires to coax it into producing a shared library.

The key limitation of the autotools is that they provide no facility for generating new abstractions. As I said in part 1, they provide a set of abstract build policies, but not a real build policy abstraction. The beauty and downfall of the autotools is that they produce truly portable build descriptions — build descriptions which depend only on the POSIX shell, POSIX utilities, POSIX make, and a toolchain for the source language. But providing even as simple a new abstract policy as one for producing a new target type or compiling a new source language is impossible without wholesale modification of the autotools core. And even then a new policy has little power to leverage intermediate abstractions, ultimately needing to itself generate portable POSIX make rules.

As originally promised for this part, part 3 will look at build tools which allow definition of new abstractions and what that implies.

Ode to building, part 1

posted on August 08, 2007

[This is an only slightly edited repost from an earlier incarnation of my blog. The archives from that blog didn't make the transition, but I'm planning to continue the series, so here's this post again.]

How does one build a piece of software from source? The superficial answer is to run the tools1 on the source and intermediate files in the proper order. I'm sure we've all slapped something in a single C source file and just run the compiler on it from the shell. The next logical step beyond this would be a batch mechanism like a shell script running all the necessary commands. I've seen production binaries built with a file like the following, which would perhaps be worthy of a post to worsethanfailure.com if I still had access to the original:

#! /bin/sh

cc -c source1.c
cc -c source2.c
# ...
cc -o product source1.o source2.o # ...

In practice most of us use higher-level tools (thank god). The make tool represents the obvious next step "Unix philosophy"-wise. It assembles a directed acyclic graph of files in the build and performs the steps necessary to build the parts of the graph not yet present, leveraging the POSIX shell to represent and execute each build step. Some projects — it seems to be especially popular for kernels — use what I'll call Plain Old Make (POM), applying make "directly."

I put "directly" in quotation marks because most large POM projects put a fair amount of engineering effort into constructing higher-level abstractions applied consistently and automatically throughout the project. In standard Unix fashion make provides "mechanism not policy." In this case make implements the "mechanism" of bringing all the DAG nodes up to date, but requires make users to specify the "policy" of how to create each of those nodes from their dependencies2.
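The split between mechanism and policy can be sketched in a few lines of POSIX shell. This is a simplified illustration, not make's actual implementation: the function names needs_rebuild and update are made up here, and real make also recurses into the dependencies before comparing timestamps. The mechanism decides whether a node is out of date; the policy is whatever command the caller passes in.

```shell
#!/bin/sh
# Sketch of make's "mechanism": a node is out of date when it is missing
# or older than one of its dependencies. Function names are hypothetical.
needs_rebuild() {
    target=$1
    shift
    # A missing target always needs building.
    [ -e "$target" ] || return 0
    for dep in "$@"; do
        # find prints the dependency only if it is newer than the target.
        [ -n "$(find "$dep" -prune -newer "$target")" ] && return 0
    done
    return 1
}

# update TARGET RECIPE DEP...
# The "policy" is the RECIPE string supplied by the caller; it runs only
# when the mechanism above says the target is out of date.
update() {
    target=$1
    recipe=$2
    shift 2
    if needs_rebuild "$target" "$@"; then
        sh -c "$recipe"
    fi
}

# Example (not run here, since it needs a compiler and object files):
#   update product 'cc -o product source1.o source2.o' source1.o source2.o
```

Everything make users write in a rule corresponds to the arguments of update: the target, the recipe, and the dependency list. None of it is shared between rules unless the project builds that sharing itself.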

This leads to two interesting observations:

  • Most projects — especially large ones — want factored build policies (node patterns) they can apply consistently throughout the project.
  • Most projects share an awful lot of policy in common.

The GNU version of make includes extensions to help with the first3, but for the second we move on to yet higher-level tools.

Most open source projects these days describe their builds using the GNU autotools: autoconf, automake, and libtool. The autotools provide abstract build policies which factor reusable build logic. They provide these policies in two ways. First, the autotools "know about" certain kinds of common build targets — they include the logic necessary to build a "program" or a "shared library" on supported platforms, requiring the developer only to specify the kind of target desired, not all the steps necessary to achieve it. Second, the autotools provide a policy for producing certain kinds of variant builds — they probe for features of the target platform and implement a policy for reacting to those features4.
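The second kind of policy, probing for platform features, boils down to "try it and record the result." Below is a minimal sketch in POSIX shell, assuming a header check in the style of HAVE_FOO_H. The function name probe_header and its optional second argument (the compiler command to use) are inventions for this example; the checks autoconf actually generates are far more careful.

```shell
#!/bin/sh
# Sketch of an autoconf-style header probe: try to compile a tiny program
# that includes the header, then emit the matching config.h line.
# probe_header is a hypothetical name; the second argument (compiler
# command, default "cc") exists only to keep the sketch easy to exercise.
probe_header() {
    header=$1
    cc_cmd=${2:-cc}
    # foo/bar.h -> HAVE_FOO_BAR_H
    macro=HAVE_$(printf '%s\n' "$header" | tr 'a-z' 'A-Z' | sed 's/[^A-Z0-9]/_/g')
    dir=$(mktemp -d) || return 1
    printf '#include <%s>\nint main(void) { return 0; }\n' "$header" > "$dir/probe.c"
    # $cc_cmd is deliberately unquoted so it may carry extra flags.
    if $cc_cmd -c "$dir/probe.c" -o "$dir/probe.o" >/dev/null 2>&1; then
        echo "#define $macro 1"
    else
        echo "/* #undef $macro */"
    fi
    rm -rf "$dir"
}
```

Running something like probe_header unistd.h >> config.h once per header of interest would yield a crude config.h. The source code then reacts to the HAVE_ macros rather than to the name of the operating system.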

Unfortunately, the autotools limit their abstract policy capabilities in an important way: they offer no means of implementing new abstract policies. The autotools make it possible to descend to the mechanism level of make, m4, and the POSIX shell, but only in the same concrete way those tools do on their own. The autotools provide a collection of abstract policies, but no real policy abstraction.

Build tools which implement real build policy abstractions move us out of the realm of current popular use and on into part 2 of this series...

1 e.g., compiler, linker, Whiz-bang Source Frobnicator Enterprise Edition

2 The pedantic will note make support for "pattern rules," but I consider the fact that POSIX make pattern rules don't account for e.g. the different ways an object file needs to be generated for inclusion in an executable vs. a shared library enough for me to ignore them in this discussion.

3 include, $(call), $(eval), and friends, although they're pretty clunky in practice.

4 This would be your config.h and its HAVE_FOO_H etc. macros.
