Изменить стиль страницы

Seibel: What’s the biggest system that you’ve worked on that you remember how it was designed?

Deutsch: I’ve been the primary mover on three large systems. Ghostscript—not counting the device drivers, which I didn’t write most of—was probably somewhere on the order of between 50,000 and 100,000 lines of C.

On the ParcPlace Smalltalk virtual machine I worked only on the just-in-time compiler, which was probably 20 percent of it, and that was probably in the low single-digit thousands of lines of C. Maybe 3,000, 5,000—something like that.

And the Interlisp implementation, as much of it as I was concerned with, was probably a couple thousand lines of microcode, and maybe—I’m guessing now—maybe another 5,000 lines of Lisp. So Ghostscript is probably the largest single system I’ve ever been involved with.

Seibel: And other than the device drivers written by other people, you basically wrote that by yourself.

Deutsch: Up to the end of 1999, I basically wrote every line of code. At the beginning I made a few architectural decisions. The first one was to completely separate the language interpreter from the graphics.

Seibel: The language being PostScript?

Deutsch: Right. So the language interpreter knew nothing about the data structures being used to do the graphics. It talked to a graphics library that had an API.

The second decision I made was to structure the graphics library using a driver interface. So the graphics library understood about pixels and it understood about curve rendering and it understood about text rendering but it knew as little as I could manage about how pixels were encoded for a particular device, how pixels were transmitted to a particular device.

The third decision was that the drivers would actually implement the basic drawing commands. Which, at the beginning, were basically draw-pixmap and fill-rectangle.

So the rendering library passed rectangles and pixel arrays to the driver. And the driver could either put together a full-page image if it wanted to or, for display, it could pass them through directly to Xlib or GDI or whatever. So those were the three big architectural decisions that I made up front and they were all good ones. And they were pretty much motherhood. I guess the principle that I was following was if you have a situation where you have something that’s operating in multiple domains and the domains don’t inherently have a lot of coupling or interaction with each other, that’s a good place to put a pretty strong software boundary.

So language interpretation and graphics don’t inherently interact with each other very much. Graphics rendering and pixmap representation interact more, but that seemed like a good place to put an abstraction boundary as well.

In fact I wrote a Level 1 PostScript interpreter with no graphics before I wrote the first line of graphics code. If you open the manual and basically go through all the operators that don’t make any reference to graphics, I implemented all of those before I even started designing the graphics. I had to design the tokenizer; I had to decide on the representation of all the PostScript data types and the things that the PostScript manual says the interpreter has to provide. I had to go back and redo a lot of them when we got to PostScript Level 2, which has garbage collection. But that’s where I started.

Then I just let my experience with language interpreters carry me into the design of the data structures for the interpreter. Between the time that I started and the time that I was able to type in 3 4 add equals and have it come back with 7 was about three weeks. That was very easy. And by the way, the environment in which I was working—MS-DOS. MS-DOS with a stripped-down Emacs and somebody’s C compiler; I don’t remember whose.

Seibel: This was a task that you had done many times before, namely implementing an interpreter for a language. Did you just start in writing C code? Or fill up a notebook with data-structure diagrams?

Deutsch: The task seemed simple enough to me that I didn’t bother to write out diagrams. My recollection was that first I sort of soaked myself in the PostScript manual. Then I might have made some notes on paper but probably I just started writing C header files. Because, as I say, I like to design starting with the data.

Then I had some idea that, well, there’d have to be a file with a main interpreter loop. There’d have to be some initialization. There’d have to be a tokenizer. There’d have to be a memory manager. There’d have to be something to manage the PostScript notion of files. There’d have to be implementation of the individual PostScript operators. So I divided those up into a bunch of files sort of by functionally.

When I took the trouble of registering the copyright in the Ghostscript code I had to send them a complete listing of the earliest Ghostscript implementation. At that point it was like 10 years later—it was interesting to me to look at the original code and the original structure and the original names of various things and to note that probably 70 or 80 percent of the structure and naming conventions were still there, 10 years and 2 major PostScript language revisions later.

So basically that’s what I did—data structures first. Rough division into modules. My belief is still, if you get the data structures and their invariants right, most of the code will just kind of write itself.

Seibel: So when you say you write a header file, is that to get the function signatures or the structs, or both?

Deutsch: The structs. This was 1988, before ANSI C—there weren’t function signatures. Once ANSI C compilers had pretty much become the standard, I took two months and went through and I made function signatures for every function in Ghostscript.

Seibel: How has your way of thinking about programming or your practice of programming, changed from those earliest days to now?

Deutsch: It’s changed enormously because the kinds of programs that I find interesting have changed enormously. I think it’s fair to say that the programs that I wrote for the first few years were just little pieces of code.

Over time I’ve considered the issues of how do you take a program that does something larger and interesting and structure it and think about it, and how do you think about the languages that you use for expressing it in a way that manages to accomplish your goals of utility, reliability, efficiency, transparency?

Now I’m aware of a much larger set of criteria for evaluating software. And I think about those criteria in the context of much larger and more complex programs, programs where architectural or system-level issues is where all the hard work is. Not to say that there isn’t still hard work to be done in individual algorithms, but that’s not what seems most interesting to me any more—hasn’t for a long time.

Seibel: Should all programmers grow up to work at that level?

Deutsch: No. In fact, I was just reading that an old friend of mine from Xerox PARC, Leo Guibas, just received some fairly high award in the field. He has never really been a systems guy in the sense that I’ve been; he’s been an algorithms guy—a brilliant one. He’s found a way to think about certain classes of analysis or optimization algorithms in a way that’s made them applicable to a lot of different problems, and that has yielded new tools for working with those problems. So, it’s wonderful work. Programmers should be able to grow up to be Leo Guibas, too.

There’s a parallel between architectural principles and the kinds of algorithmic design principles that Leo and people like him use to address these hard optimization and analysis problems. The difference is that the principles for dealing with algorithmic problems are based a lot more directly on 5,000 or10,000 years’ worth of history in mathematics. How we go about programming now, we don’t have anything like that foundation to build on. Which is one of the reasons why so much software is crap: we don’t really know what we’re doing yet.