Изменить стиль страницы

Seibel: As a Java guy at Google, do you think it could be used more? Leaving aside the force of history and historical choices, if somehow you could wave a magic wand and replace all of the C++ with Java, could that work?

Bloch: Up to a point. Large parts of the system could be written that way, and over time, things are moving in that direction. But for the absolute core of the system—the inner loops of the index servers, for instance—very small gains in performance are worth an awful lot. When you have that many machines running the same piece of code, if you can make it even a few percent faster, then you’ve done something that has real benefits, financially and environmentally. So there is some code that you want to write in assembly language, and what is C but glorified assembly language?

I’m not religious. If it works, great. I wrote C code for 20 years. But it’s much more efficient, in terms of programmers’ time, to use a more modern language that provides better safety, convenience, and expressiveness. In most cases, programmer time is much more valuable than computer time.

But that isn’t necessarily so if you’re running the same program on many, many thousands of machines. So there are some programs that we write where probably using less-safe languages to extract every ounce of performance is worth it. I think for most programs these days the performance of all modern languages is a wash and if anyone tells you that their language is ten times more efficient, they’re probably lying to you.

But in terms of efficiency, in terms of use of engineers’ time, it’s far from a wash. More modern languages, first of all, are exempt from large classes of errors. Second of all, they have marvelous sets of tools which make engineers more efficient. To some degree it’s cultural; it’s what languages people learned in schools. But to some degree I think it’s actually fundamental engineering at work. For example, if a language has a macro processor it’s much harder to write good tools for it. Parsing C++ is a much trickier business than parsing Java.

Google is writing a lot more of its code in Java now than it used to. I don’t know what the numbers are, but if the lines haven’t already crossed, they will soon. So there’s a big difference between how many lines of code do we have in each language versus how many cycles are getting executed in each language. And I think it would be a fool’s errand and not particularly meritorious, either, to try and get the inner loops of the indexing servers written in Java. If you were starting a company to do this sort of thing today, you might write things largely in Java or in some other modern, safe language, and then escape it when you needed to. But we have this engineering infrastructure. Libraries and monitoring facilities and all of that stuff that makes it go. And finally Java is, if not an equal partner in this, it’s reasonably usable within these systems, which is good. When I arrived that wasn’t the case yet.

Companies establish their DNA very early on. It can make them tremendously successful, but it can also make it hard for them to escape when what served them well in the early days doesn’t serve them so well any more. I remember being an intern at IBM Research in Yorktown Heights around 1982, seeing the culture still dominated by batch processing. Even when they were doing timesharing, they talked in terms of virtual card readers and virtual card punches. Everything was still 80-column records. With DEC, it was the timesharing mentality that they never escaped. And I suppose with Microsoft it’s an open question whether they’ll be able to move beyond the desktop-PC mentality.

Seibel: And 20 years from now people will be talking about how Google can’t get past how to sell ads on the Internet.

Bloch: Absolutely. Anyway, there was this sort of cultural meme at Google that Java is slow and unreliable. And it’s obvious where it came from: Blackdown Java on Linux, around 1999, was slow and unreliable. And old ideas die very hard. Although the truth is, Google uses Java for many sorts of business-critical functions, including, by the way, ads.

So at some level they understand that it’s neither slow nor unreliable. But the actual search pipeline, which is the most intense in terms of machine cycles, that stuff is all basically C++ and there’s an obvious reason having to do with the genesis of the company. And I think that will continue to affect us for quite some time.

Seibel: What are the tools you actually use to program?

Bloch: I knew this was coming; I’m an old fart and I’m not proud of it. The Emacs keystrokes are wired into my brain. And I tend to write smaller programs, libraries and so forth. So I do too much of my coding without modern tools. But I know that modern tools make you a lot more efficient.

I do use IntelliJ for larger stuff, because the rest of my group uses it, but I’m not terribly proficient. It is impressive: I love the static analysis that these tools do for you. I had people from those tools—IntelliJ, Eclipse, NetBeans, and FindBugs—as chapter reviewers on Java Puzzlers, so many of the traps and pitfalls in that book are detected automatically by these tools. I think it’s just great.

Seibel: Do you believe you would really be more productive if you took a month to really learn IntelliJ inside out?

Bloch: I do. Modern IDEs are great for large-scale refactorings. Something that Brian Goetz pointed out is that people write much cleaner code now because they do refactorings that they simply wouldn’t have attempted before. They can pretty much count on these tools to propagate changes without changing the behavior of the code.

Seibel: What about other tools?

Bloch: I’m not good with programming tools. I wish I were. The build and source-control tools change more than I would like, and it’s hard for me to keep up. So I bother my more tool-savvy colleagues each time I set up a new environment. I say, “How do you do it these days?” They roll their eyes and help me and I use the environment until it doesn’t work anymore.

I’m not proud of this. Engineers have things that they’re good at and things they’re not so good at. There are people who would like to pretend that this isn’t so, that engineers are interchangeable, and that everyone can and should be a total generalist. But this ignores the fact that there are people who are stunningly good at certain things and not necessarily so good at other things. If you force them all to do everything, you’ll probably make mediocre products.

In particular there are some people who, in Kevin Bourrillion’s words, “lack the empathy gene.” You aren’t going to be a good API designer or language designer if you can’t put yourself in the shoes of an ordinary programmer trying to use your API or language to get something done. Some people are good API and language designers, though. Then there are people who are stunningly good at the technical aspects of language design where they can say, “Oh, this will make the thing not LALR(1) and you need to tweak it in just such a way.” That’s an incredibly useful skill. But it’s no substitute for having the empathy gene and knowing you have this awful language that’s unusable.

I know other people who are stunningly good at extracting that last percentage of performance. You want to put them in a position where that’s what they’re doing. They’ll be happy and they’ll do good stuff for your company. I think you’ve got to figure out what your engineers are good at and use them for that. So that’s my apologia for why I suck at tools. Lame, I know.

Seibel: Let’s talk about debugging. What’s the worst bug you ever had to track down?

Bloch: One that comes to mind, which was both horrible and amusing, happened when I worked at a company called Transarc, in Pittsburgh, in the early ‘90s. I committed to do a transactional shared-memory implementation on a very tight schedule. I finished the design and implementation on schedule, and even produced a few reusable components in the process. But I had written a lot of new code in a hurry, which made me nervous.