Seibel: Because you have transaction isolation.

Peyton Jones: Because they are put in isolation. So that’s really rather a powerful reasoning principle. Because it says you can use your sequential reasoning about imperative code despite the fact that the program’s concurrent. You’ve got to establish what those top-level invariants are, but that’s good for your soul, too. Because then you know what things you are trying to maintain. If you get an exception thrown in the middle of a transaction, that’s cool, too—that can’t destroy the invariants because the transaction is abandoned without effect. I think this is fabulous. Then reasoning about performance issues is a different level—you’ve guaranteed a certain sort of correctness; now you want to make sure you haven’t got any performance holes. Those are harder to get at—at the moment I don’t know anything better than profiling and directed feedback tools for doing that.
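
To make the exception point concrete, here is a minimal sketch using GHC’s stm library; the balance variable, its non-negativity invariant, and the failing withdrawal are invented for illustration. The provisional write inside the transaction is discarded when the exception propagates, so the invariant survives.

    import Control.Concurrent.STM
    import Control.Exception (SomeException, try)
    import Control.Monad (when)

    main :: IO ()
    main = do
      balance <- newTVarIO (100 :: Int)      -- invariant: never negative
      r <- try $ atomically $ do
             b <- readTVar balance
             writeTVar balance (b - 300)     -- provisional write
             when (b - 300 < 0) $
               throwSTM (userError "insufficient funds")
      print (r :: Either SomeException ())   -- the exception propagates
      readTVarIO balance >>= print           -- still 100: the transaction
                                             -- was abandoned without effect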

Seibel: It strikes me that while optimistic concurrency has been used from time to time in persistent databases, it’s never really gotten a foothold there compared to lock-based concurrency.

Peyton Jones: Of course STM can be implemented in all sorts of ways—optimistic concurrency is only one of them. You can lock as you go, which is more like a pessimistic concurrency model.

Seibel: But there’s also a reason that lock managers are the hairiest part of databases.

Peyton Jones: Right. So the thing about STM is you want to make sure that one person, or one team, gets to implement STM and everybody else gets to use it. You can pay them a lot of money and shut them in a small dark room for a year to try to make sure they do a really good job.

But then that work is usable by everybody through a very simple interface. That’s what I think is nice about it. What I want to avoid is having that level of expertise in everybody’s head. The example that I used at a talk I gave yesterday—this comes from Maurice Herlihy—is a double-ended queue: insert and delete elements.

A sequential implementation of a double-ended queue is a first-year undergraduate programming problem. For a concurrent implementation with a lock per node, it’s a research paper problem. That is too big a step. It’s absurd for something to be so hard. With transactional memory it’s an undergraduate problem again. You simply wrap “atomic” around the insert and delete operations—job done. That’s amazing, I think. It’s a qualitative difference. Now the people who implement the STM, they’d have to make sure they atomically commit a bunch of changes to memory as one. That’s not easy to do, just with compare and swaps. It can be done but you have to be careful about it.
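
For concreteness, here is a sketch of that undergraduate-level solution using GHC’s stm library (the names pushFront, popBack, and so on are invented for illustration). The body of each operation is the ordinary sequential doubly-linked-list manipulation; wrapping it in atomically is the whole concurrency story.

    import Control.Concurrent.STM

    data Node a  = Node { val  :: a
                        , prev :: TVar (Maybe (Node a))
                        , next :: TVar (Maybe (Node a)) }

    data Deque a = Deque { front :: TVar (Maybe (Node a))
                         , back  :: TVar (Maybe (Node a)) }

    newDeque :: IO (Deque a)
    newDeque = Deque <$> newTVarIO Nothing <*> newTVarIO Nothing

    -- The first-year sequential algorithm; no locks in sight.
    pushFront :: Deque a -> a -> STM ()
    pushFront d x = do
      mf   <- readTVar (front d)
      node <- Node x <$> newTVar Nothing <*> newTVar mf
      case mf of
        Nothing -> writeTVar (back d) (Just node)    -- deque was empty
        Just f  -> writeTVar (prev f) (Just node)
      writeTVar (front d) (Just node)

    popBack :: Deque a -> STM (Maybe a)
    popBack d = do
      mb <- readTVar (back d)
      case mb of
        Nothing -> return Nothing                    -- deque is empty
        Just b  -> do
          mp <- readTVar (prev b)
          writeTVar (back d) mp
          case mp of
            Nothing -> writeTVar (front d) Nothing   -- deque is now empty
            Just p  -> writeTVar (next p) Nothing
          return (Just (val b))

Many threads can then call atomically (pushFront d x) and atomically (popBack d) concurrently, and the runtime arbitrates.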

And then if there are performance problems to do with starvation then you may need to do some application-level thinking about how to avoid that. But then you express the results of your application-level thinking, again using STM. I do think for that kind of program it’s a leap forward.

There was one other thing I wanted to mention—this goes back to functional programming. STM, of course, has nothing directly to do with functional programming at all. It’s really about mutating shared state—that doesn’t sound very functional.

But what happened is this: I went to a talk given by Tim Harris about STM in Java. I’d never heard about STM before; I just happened to go to his talk. He was describing an STM in which he had “atomic” but really not much else. You could implement these atomic transactions.

I said, “Wow, that seems really neat. Ah, so you need to log every side effect on memory. Every load and store instruction. Gosh, well there are a lot of those in Java.” But in Haskell there are practically none because they occur in this monadic setting. So loads and stores in Haskell are extremely explicit—programmers think of them as a big deal.
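
What that explicitness looks like in ordinary Haskell code (a trivial, invented example): every load and store on mutable state is a named monadic operation, so a transaction log only has to track those calls rather than every variable access in the program.

    import Data.IORef

    bump :: IORef Int -> IO ()
    bump r = do
      n <- readIORef r         -- an explicit load
      writeIORef r (n + 1)     -- an explicit store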

So I thought, “Oh, we should just try replicating this atomic memory stuff in Haskell because it would be a very cool feature to have.” And off we went—I talked to Tim about how to do this. Before long, because of the kind of framework that we had—this kind of pure, rather sparer framework that we had—we’d invented retry and orElse. Retry is this mechanism that allows you to do blocking within a transaction and orElse is the bit that allows you to do choice within a transaction. Neither of these are things that had occurred to him or his colleagues in developing transactional memory for Java because the rest of their context was rather more complicated.
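
A minimal sketch of the two combinators using GHC’s stm library (the TVar-of-list queue is invented for illustration): retry abandons the current transaction and blocks until one of the TVars it read changes, at which point the transaction is re-run; orElse runs its second argument if its first one retries.

    import Control.Concurrent.STM

    -- Blocks until the queue is non-empty.
    takeFrom :: TVar [a] -> STM a
    takeFrom q = do
      xs <- readTVar q
      case xs of
        []       -> retry                        -- block, then re-run
        (x:rest) -> writeTVar q rest >> return x

    -- Take from the first queue, or from the second if the first is
    -- empty, or block if both are: choice without exposing preconditions.
    takeEither :: TVar [a] -> TVar [a] -> STM a
    takeEither q1 q2 = takeFrom q1 `orElse` takeFrom q2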

So they hadn’t really thought much about blocking. Or maybe they just assumed that the way you do blocking was you say, atomically, “Only run this transaction when this predicate holds.” But that’s very noncompositional—supposing you wanted to get something out of one bank account and put it in another, well, what’s the condition that the transaction can run under? Answer: well, if there’s enough money in the first bank account and there’s enough space in the second—let’s say that they’re limited at both ends. So that’s a rather complicated condition to figure out. And it gets even more complicated if there’s a third bank account involved. It’s very noncompositional—you have to look inside the methods and pull all their preconditions out to the front.
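
Here is how that bank-account example comes out with retry; the representation (a TVar Int per account, with an explicit upper limit) is invented for illustration. Each operation keeps its precondition to itself, and transfer simply composes them; no combined condition ever has to be pulled out to the front.

    import Control.Concurrent.STM
    import Control.Monad (when)

    withdraw :: TVar Int -> Int -> STM ()
    withdraw acct n = do
      b <- readTVar acct
      when (b < n) retry            -- block until there is enough money
      writeTVar acct (b - n)

    deposit :: Int -> TVar Int -> Int -> STM ()
    deposit limit acct n = do
      b <- readTVar acct
      when (b + n > limit) retry    -- block until there is enough space
      writeTVar acct (b + n)

    -- A third account would compose just as easily.
    transfer :: Int -> TVar Int -> TVar Int -> Int -> STM ()
    transfer limit from to n = withdraw from n >> deposit limit to n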

That’s what he had and it kind of worked fine for small programs but clearly wasn’t very satisfactory. So in a Haskell context we came up with this retry, orElse thing which we’ve since transplanted back into the mainstream imperative context and they’re busy doing retry and orElse as well. That’s great.

Seibel: So there’s nothing actually inherent about Haskell that enabled that concept? It was just that you were able to think of it?

Peyton Jones: That’s right. There was less crap, basically, so the cool idea stood out in higher relief. It became more disgusting that there was no way to do blocking without losing the abstraction. That’s what led us to retry and orElse. I think a really good place for functional programming to be, or a role for it to play, is as a kind of laboratory in which to examine the beast. And then ideas can feed back. And this STM was a particularly clear example because there was a transition in both directions. Here there was a loop that actually got closed, which I thought was lovely.

Seibel: What’s your desert-island list of books for programmers?

Peyton Jones: Well, you should definitely read Jon Bentley’s Programming Pearls. Speaking of pearls, Brian Hayes has a lovely chapter in the book Beautiful Code entitled “Writing Programs for ‘The Book,’” where I think by “The Book” he means a program that will have eternal beauty. You’ve got two points defining a line and a third point, and you have to find which side of that line the third point is on. Several solutions don’t work very well. But then there’s a very simple solution that just does it right.
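
The very simple solution is, I believe, the standard sign-of-the-determinant test, roughly as below (the chapter itself is largely about why numerically naive variants go wrong near degenerate inputs).

    -- Which side of the directed line through a and b does c lie on?
    -- Positive: left; negative: right; zero: collinear (up to rounding).
    side :: (Double, Double) -> (Double, Double) -> (Double, Double) -> Double
    side (ax, ay) (bx, by) (cx, cy) =
      signum ((bx - ax) * (cy - ay) - (by - ay) * (cx - ax))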

Of course, Don Knuth’s series, The Art of Computer Programming. I don’t think it was ever anything I read straight through; it’s not that kind of book. I certainly referred to it a lot at one stage. Chris Okasaki’s book Purely Functional Data Structures. Fantastic. It’s like Arthur Norman’s course only spread out to a whole book. It’s about how you can do queues and lookup tables and heaps without any side effects but with good complexity bounds. Really, really nice book. Everyone should read this. It’s also quite short and accessible as well. Structure and Interpretation of Computer Programs. Abelson and Sussman. I loved that. And Compiling with Continuations, Andrew Appel’s book about how to compile a functional program using continuation passing style. Also wonderful.

Books that were important to me but I haven’t read for a long time: A Discipline of Programming by Dijkstra. Dijkstra is very careful about writing beautiful programs. These ones are completely imperative, but they have the “Hoare property”: rather than having no obvious bugs, they obviously have no bugs. And the reasoning used to establish that is very nice and elegant. That’s the book that first introduced me to reasoning about programs in a pretty watertight way. Another book that made a huge impression on me at the time was Per Brinch Hansen’s book about writing concurrent operating systems. I read it lots of times.