Physics was also less satisfying to me because it has kind of stalled. There’s something not quite right when you have these big inductive theories where people are polishing corners and inventing stuff like dark energy, which is basically unfalsifiable. I was gravitating toward something that was more practical but still had some theoretical strength based in mathematics and logic.
Then I went to University of Illinois Champaign-Urbana to get a master’s degree, at least. I was thinking of going all the way but I got stuck in a project that was basically shanghaied by IBM. They had a strange 68020 machine they had acquired from a company in Danbury, Connecticut, and they ported Xenix to it. It was so buggy they co-opted our research project and had us become like a QA group. Every Monday we’d have the blue suit come out and give us a pep talk. My professors were kind of supine about it. I should’ve probably found somebody new, but I also heard Jim Clark speak on campus and I pretty much decided I wanted to go work at Silicon Graphics.
Seibel: What were you working on at SGI?
Eich: Kernel and networking code mostly. The amount of language background that I used there grew over time because we ended up writing our own network-management and packet-sniffing layer and I wrote the expression language for matching fields and packets, and I wrote the translator that would reduce and optimize that to a short number of mask-and-match filters over the front 36 bytes of the packet.
And I ended up writing another language implementation, a compiler that would generate C code given a protocol description. Somebody wanted us to support AppleTalk in this packet sniffer. It was a huge, complex grab bag of protocol syntax for sequences and fields of various sizes and dependent types of… mostly arrays, things like that. It was fun and challenging to write. I ended up using some of the old Dragon book—Aho and Ullman—compiler skills. But that was it. I think I did a unifdef clone. Dave Yost had done one and it didn’t handle #if expressions and it didn’t do expression minimization based on some of the terms being pound-defined or undefined, so I did that. And that’s still out there. I think it may have made its way into Linux.
I was at SGI from ’85 to ’92. In ’92 somebody I knew at SGI had gone to MicroUnity and I was tired of SGI bloating up and acquiring companies and being overrun with politicians. So I jumped and it was to MicroUnity, which George Gilder wrote about in the ’90s in Forbes ASAP as if it was going to be the next big thing. Then down the memory hole; it turned into a $200 million crater in North Sunnyvale. It was a very good learning experience. I did some work on GCC there, so I got some compiler-language hacking. I did a little editor language for MPEG2 video where you could write this crufty pseudospec language like the ISO spec or the IEC spec, and actually generate test bit streams that have all the right syntax.
Seibel: And then after MicroUnity you ended up at Netscape and the rest is history. Looking back, is there anything you wish you had done differently as far as learning to program?
Eich: I was doing a lot of physics until I switched to math and computer science. I was doing enough math that I was getting some programming but I had already studied some things on my own, so when I was in the classes I was already sitting in the back kind of moving ahead or being bored or doing something else. That was not good for personal self-discipline and I probably missed some things that I could’ve studied.
I’ve talked to people who’ve gone through a PhD curriculum and they obviously have studied certain areas to a greater depth than I have. I feel like that was the chance that I had then. Can’t really go back and do it now. You can study anything on the Internet but do you really get the time with the right professor and the right coursework, do you get the right opportunities to really learn it? But I’ve not had too many regrets about that.
As far as programming goes, I was doing, like I said, low-level coding. I’m not an object-oriented, design-patterns guy. I never bought the Gamma book. Some people at Netscape did, some of Jamie Zawinski’s and my nemeses from another acquisition, they waved it around like the Bible and they were kind of insufferable because they weren’t the best programmers.
I’ve been more low-level than I should’ve been. I think what I’ve learned with Mozilla and Firefox has been about more test-driven development, which I think is valuable. And other things like fuzz testing, which we do a lot of. We have many source languages and big deep rendering pipelines and other kinds of evaluation pipelines that have lots of opportunity for memory safety bugs. So we have found fuzz testing to be more productive than almost any other kind of testing.
I’ve also pushed us to invest in static analysis and that’s been profitable, though it’s fairly esoteric. We have some people we hired who are strong enough to use it.
Seibel: What kind of static analysis?
Eich: Static analysis of C++, which is difficult. Normally in static analysis you’re doing some kind of whole-program analysis and you like to do things like prove facts about memory. So you have to disambiguate memory to find all the aliases, which is an exponential problem, which is generally infeasible in any significant program. But the big breakthrough has been that you don’t really need to worry about memory. If you can build a complete controlflow graph and connect all the virtual methods to their possible implementation, you can do a process of partial evaluation over the code without actually running it. You can find dead code and you can find redundant tests and you can find missing null tests.
And you can actually do more if you go to higher levels of discourse where we all operate, where there’s a proof system in our head about the program we’re writing. But we don’t have a type system in the common languages to express the terms of the proof. That’s a real problem. The Curry-Howard correspondence says there’s a correspondence between logic systems and type systems, and types are terms and programs are proofs, and you should be able to write down these higher-level models that you’re trying to enforce. Like, this array should have some constraint on its length, at least in this early phase, and then after that it maybe has a different or no constraint. Part of the trick is you go through these nursery phases or other phases where you have different rules. Or you’re inside your own abstraction’s firewall and you violate your own invariants for efficiency but you know what you’re doing and from the outside it’s still safe. That’s very hard to implement in a fully type-checked fashion.
When you write Haskell programs you’re forced to decide your proof system in advance of knowing what it is you’re doing. Dynamic languages became popular because people can actually rapidly prototype and keep this latent type system in their head. Then maybe later on, if they have a language that can support it, or if they’re recoding in a static language, they can write down the types. That was one of the reasons why in JavaScript we were interested in optional typing and we still are, though it’s controversial in the committee. There’s still a strong chance we’ll get some kind of hybrid type system into a future version of JavaScript.
So we would like to annotate our C++ with annotations that conservative static analysis could look at. And it would be conservative so it wouldn’t fall into the halting-problem black hole and take forever trying to go exponential. It would help us to prove things about garbage-collector safety or partitioning of functions into which control can flow from a script, functions from which control can flow back out to the script, and things to do with when you have to rematerialize your interpreter stack in order to make security judgments. It would give us some safety properties we can prove. A lot of them are higher-level properties. They aren’t just memory safety. So we’re going to have to keep fighting that battle.