Ted Logan

Fun with PostScript

Saturday, 23 January 2010

One of the side effects of being married to a librarian is a large number of books in the house. One of the side effects of being an engineer with a mild streak of OCD is the desire to catalog and organize the books. This includes printing Library of Congress spine labels for our entire non-fiction collection so I can sort the books by subject.

Last Saturday I decided to label the forty-three unlabeled books in my collection, many of which joined my collection at Christmas. When I first started printing LC labels, using 80-labels-per-sheet return address labels, I was so frustrated by OpenOffice.org's support for labels (especially my inability to import a long list of text to print on individual labels, or even copy-and-paste across multiple labels) that I set out on my own, using CSS to place the text in my HTML at precise positions on the printed page. This worked fine at first; I told my browser to suppress all headers and print with zero-inch margins and all was well. Last weekend it failed horribly; I couldn't get any of the browsers at my disposal to print zero-inch margins that were actually zero inches, so I had to manually adjust the physical positions in my CSS to compensate for the arbitrary offsets my browser added. This felt dirty but it worked; still, I knew I could do better.

Having rejected word processing systems for my label-printing needs (given the manual intervention required and the lack of accessible scriptability), I set out in search of something that I could generate from a Perl script and that would give me precise control over on-page formatting. I looked briefly at TeX, which seemed overly complicated for my needs, and realized I could do everything I needed to do in PostScript.

On evenings during the week, I downloaded and read PostScript Language Tutorial and Cookbook, then scanned PostScript Language Reference, third edition for specific information on picking page sizes and determining font metrics. (It's pretty cool that PostScript allows a single document to be reformatted to fit the media at hand, but my application requires prior knowledge of the page size.) I began banging out PostScript code, starting with a simple box and moving on to a sequence of boxes outlining the labels on the page. (I don't want to outline the labels in practice, but the outlines give me a very useful way to see how well my printed output matches the sheet of labels.) I wrote a for loop in PostScript:

0 1 cols 1 sub {
  /x exch def
  0 1 rows 1 sub {
    /y exch def

    gsave
    left x hpitch mul add pageheight top sub y vpitch mul sub translate

    % Draw the bounding box
    newpath
    0 0 moveto
    width 0 rlineto
    0 height neg rlineto
    width neg 0 rlineto
    0 height rlineto
    closepath
    stroke

    grestore
  } for
} for

For those unfamiliar with PostScript, this may look like gibberish. PostScript is a stack-based language; each operator pops its arguments from the stack and pushes its results back onto it. The for operator takes an initial value, an increment, a limit, and a block of code; before each pass through the block it pushes the value for the current iteration onto the stack, which is why each of my loop bodies starts by storing that value under a name with "exch def". Adapting to this way of thinking was an interesting exercise; I had to visualize what was on the stack at any given time, and I had the opportunity to write more complicated compound expressions than I might have otherwise attempted. (I'm also worried I ended up with under-documented write-only code; I haven't yet figured out the best way to format and document my code. Debugging was also tricky; I did find the operator to dump the current stack to the debugging console, which gave me enough information to figure out that I was drawing my box up from the current position rather than down as I had intended.)
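
To make the stack discipline a little more concrete, here is a minimal sketch that is separate from the label program and can be pasted into an interpreter such as Ghostscript. It uses pstack, one standard operator that prints the stack without disturbing it (stack is another); the name n is made up for the example.

```postscript
%!PS
% Operands are pushed first; the operator comes last.
3 4 add            % pops 3 and 4, pushes 7
pstack             % prints the stack contents (just 7) and leaves them in place
pop                % discard the 7; the stack is empty again

% for takes: initial increment limit procedure
% It pushes the counter value before each run of the procedure.
0 1 3 {
  /n exch def      % pop the counter into the (hypothetical) name n
  n 10 mul ==      % == pops a value and prints it: 0, 10, 20, 30
} for
```

The default coordinate system has its origin at the bottom-left corner of the page with y increasing upward, which is why the box code needs "height neg" to draw downward from the top of each label.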

To write the text, I added an array with the labels to print:

% Content to print on labels
/content [
(BL1130.A4 B472)
(DS406.B56 2009)
(DS406.B76 2008)
(DS407.G75 2008)
(DS436.T17 1999)
...
] def

Then I added code inside my inner loop to index into the array, find the width of the string in the current font, and center the string horizontally and vertically inside the current box:

    % Write the text itself
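    i content length lt {
      0 0 moveto                      % start from the label's origin
      content i get                   % stack: string
      dup stringwidth pop             % stack: string textwidth
      width exch sub 2 div            % stack: string x    (x = (width - textwidth) / 2)
      height fontsize add 2 neg div   % stack: string x y  (y = -(height + fontsize) / 2)
      moveto                          % stack: string
      show
    } if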

    /i i 1 add def

That last line of code is the PostScript equivalent of "i++".
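
The fragments above rely on names that are defined elsewhere in the program. As a rough sketch only, the setup might look something like the following. Every number is a placeholder for a 4-by-20 sheet of 80 labels on US Letter paper, and the inch helper, the pagewidth name, and the Helvetica font are illustrative assumptions rather than the contents of labels.ps.

```postscript
%!PS
% All values below are illustrative assumptions; measure your own label sheet.
/inch { 72 mul } def          % hypothetical helper: the default unit is 1/72 inch

/pagewidth  8.5 inch def      % US Letter, assumed
/pageheight 11 inch def

% Ask for that page size (LanguageLevel 2 and later).
<< /PageSize [ pagewidth pageheight ] >> setpagedevice

/cols   4 def                 % labels across the sheet
/rows   20 def                % labels down the sheet
/left   0.3 inch def          % left edge of the first column
/top    0.5 inch def          % distance from the top of the page to the first row
/width  1.75 inch def         % size of one label
/height 0.5 inch def
/hpitch 2.05 inch def         % distance between the left edges of adjacent columns
/vpitch 0.5 inch def          % distance between the top edges of adjacent rows

/fontsize 8 def
/Helvetica findfont fontsize scalefont setfont

/i 0 def                      % index of the next entry in the content array

% ... the content array and the nested for loops go here ...

showpage
```

With these placeholder values the last column ends 8.2 inches from the left edge and the last row ends 10.5 inches from the top, so the grid fits on the page. Because x is the outer loop variable and y the inner one, the labels are filled column by column.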

Those interested in reading my PostScript program for themselves may find my full code here: labels.ps. Further refinement is needed in the font selection, vertical centering, and automatic generation from the LibraryThing database, but I think it's a pretty impressive result for a few hours of work.