>> Awesome! Thank you. That was great. So now... We have Andrew Gwozdziewycz, presenting -- "Understanding Garbage Collection through visualizing a One Pass Real Time Generational Mark-Sweep Garbage Collector!" And that is my description of his talk. >> So I don't have nearly as much energy as... Um... All right. Now I can go like this, maybe. All right! So... Where did my... All right. Maybe not. Um... I don't want to mirror. Because I want the speaker notes. Whatever. Maybe you guys can just... Not... All right. That'll work. Okay. Great. So... Does anybody know what these are? >> Animals? >> They're mallocs, and they should be free. So... Memory management is kind of difficult, and most of us probably don't think a lot about it, because we use Java or Python or Ruby or some other language which does it for us. And there's a reason for this. It's really kind of difficult. It's difficult to get right, and affects the entire program that we have. And in most cases, we just want to get work done and we just don't care. Give me some memory and I'll do something with it. >> Can you move the mic closer to your mouth? >> Yes. Is this better? Is this better? Okay. I will talk like this, then. >> You can move the mic. >> Like... >> Take it off the stand and hold it up to your face. >> Okay. Sorry. So if you look at Firefox, you'll know that this is true, because Firefox is full of memory leaks and all this other stuff, and this has kind of historically been accurate. But there are smart people that are working on things called garbage collectors. And when we talk about garbage collection, we're talking about a program that basically manages our memory for us. Which has usually got a... It's usually scrutinized for being pretty slow, but it doesn't have to be the case. So if we think about how to manually manage memory, you think about it in terms of -- acquire some memory, do something with that memory, and then eventually throw it away. And we're pretty good at acquiring stuff. We're pretty good at processing stuff, but we're not very good at giving stuff back. And so that leads to memory leaks, it leads to segfaults, and various other problems that are kind of hard to debug. So there's a simple algorithm for this. It's called the naive mark sweep algorithm. And it works very simple. So it looks at all the references that you're currently using, so-called roots. These are the things that -- these are the memory references or the pointers, rather, in registers, or on the stack, and it just goes through all of them, recursively, and says -- mark this thing as being used. It's the mark part of this. And then in a separate phase, it goes through the entire stack, or the entire heap, rather, and frees anything that is not marked. Very, very simple. So essentially the mark phase binds all the things that are in use, the sweep phase eliminates all the things that aren't in use. But mark sweep stops the world. And while it's running, your program isn't running. Your program has to wait. And maybe that's not so good. But we can make this better by using a more sophisticated algorithm. Unfortunately, I only have seven minutes left, so we'll try to get there. But let's think of some ideas. So what can we do to make this better? Well, we can combine the mark and sweep phase. If we do that, then we are only going through all of this stuff once. Right? And if we mark everything in a naive mark sweep, we have to... If we mark everything that's in use on the heap, mark sweep is going to have to go through everything, but it's not going to retrieve anything. It's not going to free anything. And in the opposite case, if we don't mark anything, we still have to go through the heap. So if we can do it just once, you know, we save lots of time in all these cases. And another idea. So maybe we can do only part of the work. Some of the time. Right? So if we do part of the work more often, we don't have to pause so much. So that's kind of the idea of incremental garbage collection. So there's another thing we can do. So... Or there's another thing that we get if we can do this sort of shorter pauses by doing more work... Doing less work more. We can get these sort of realtime guarantees. That is to say that we can say... We're going to collect memory... Or do garbage collection for ten nanoseconds. Which might be enough to give us properties of a garbage collector that we can use in a realtime system, or at least a soft realtime system. And then there's this other idea of generational collection, and David Ungar has this thing called the weak generational hypothesis, that states -- most objects die young. If you think about it, if you allocate some temporary thing, you throw it away pretty quickly. So it seems to be kind of true, and empirically, it seems to be true. And all these things exist in a realtime... Or sorry, in a garbage collector near you. But if you're building a toy language, you probably don't want to mess with this, because it's complicated, and it's tricky. But if you do build a successful programming language that you need a real garbage collection engine for, just let someone else do it, because they probably know how to do it better. And they'll want to do it. But there is this really cool paper that Joe Armstrong and Robert Virding produced in 1995, and I'm not going to reproduce that thing. But we can get some of these properties rather simply. The first thing that we can do is we have this heap here. And if we always allocate memory... If we allocate memory from the low end to the high end, saying that we always get it from the top of the heap... Or... Well, sort of. Anyway... If we always get it from there, and we always say that no pointer can point into the future, meaning it always has to point backwards into the lower end of the heap, we can do something really cool. And the algorithm is very similar to mark all this and find in the garbage. It's literally that. While this scavenger, which is this thing on the left, is not at the low end of the heap, if it's marked, just mark the things that this object refers to. Mark the pointers within that object only. So it doesn't do so recursively. And then it just keeps going. It's just an iterative process. But that's not very useful. Because we can't reclaim any of the objects there. Because that would violate our core invariant that we can't have objects point into the future. Only to the past. But we can simulate this heap organization just by adding an extra pointer to all of our objects. Which points to the previous thing that was allocated in the system. So when we get a new object, we just have to do some bookkeeping. And all this is -- we set the previous pointer to the thing that was allocated before us, and then we update the allocated -- the most recently allocated thing to the thing that we just allocated. And then, when we want to garbage collect -- this is the entire algorithm right here -- and the most important thing is the bottom here. Because all that you're doing to free things is you're swapping the pointer of... You're swapping the last allocation, which is basically the thing that you processed before, in this scavenger, this thing, and you're just setting that previous pointer to what you're currently scavenging. And you're throwing away this temporary variable, the thing that the scavenger is pointing to. Super simple. But wait -- there's more! Because this actually turns out to be something that you can make incremental, really quickly. There was that while loop condition. Right? Let's say that you wanted to process only a portion of the heap. You can do that. Like, our invariant protects us from being able to screw ourselves up. As long as we're only pointing to memory that was allocated before us, that's no big deal. This is generational. In much the same way. We can modify the loop conditional to say -- only allocate... Or only go through some part of the heap. And it turns out that the pointer swapping thing, when we free stuff, means that older objects move to the lower end of the heap. Which is exactly this... Which is exactly an encoding of this... Young objects... Or new objects die young. Whatever I was saying before. That I've now totally forgotten. But it's the exact encoding of that. So we get this generational property. But is this practical? Well, I mentioned that Armstrong and Verding wrote this in '95. It served as the garbage collector for Erlang for a while. I don't know exactly how long it did. But it did. And the algorithm does have some practical advantages. Over mark sweep, at least, it's extremely simple. Mark sweep has this property that I didn't get into, that you have to recursively mark things. So once you find the one thing that you're marking, you recursively go and mark the rest of what that points to. Which is tricky, and if you're on the stack, you're allocating stuff on the stack, and you don't have any memory, why are you allocating more stuff? Right? It's kind of weird. There's an ease of extension. The basic algorithm -- I just extended to do incremental and generational collection. And objects can be used much more quickly, because I can go only so far into the heap, and say -- that's it. I'm only collecting that much. Or the fact that I'm just doing one pass means that I'm already going faster than mark sweep. But don't use this. Because there are much better algorithms out there that take into consideration a lot more stuff. Thanks! (applause)