ALLISON:  I think I'm the 15th microphone test today?  How's the 
sound?  Not good.
LEO:  It's surprising how much it actually doesn't help.
ALLISON:  How much the...?
LEO:  How much the mic holder doesn't help the way that you 
want.  Putting it here...
ALLISON:  How is here and I just project?  Still no?
AUDIENCE MEMBER:  No, project.
ALLISON:  Having my lapel and then I project?  That's okay?  
That's okay.  I'll go for the projecting.  Now, for the next 
kind of projecting.  Okay.  Hi.  I'm Alison Kaptur and I work at 
Dropbox in San Francisco.  I used to work at the Recurse Center 
when it was Hacker School.  Today I want to talk about the 
recursion limit in Python without recursion or any limits.  
That's the two pieces.  So recursion, a recursive function is 
one that calls itself.  Here's factorial for n of one, it 
returns one, and otherwise we'll take n and multiply it by the 
result of calling the factorial n - 1.  So it works the way that 
you would expect, here's factorial ten, here's factorial of 500. 
 Here's factorial of 999.  How about a thousand?  No, that does 
not work.  So instead, Python is throwing an exception that the 
maximum recursion depth has been exceeded.  What's going on 
here?  Well, in Python there's a recursion limit that governs 
home levels of recursion that you can have, and that limit is 
thousand and so we tried thousand it doesn't work.  It's 
recursion but hang on a minute, we can set the recursion limit 
in runtime, and here's a screenshot of the doc saying a user may 
have to set the limit higher when she has a program that 
requires deep recursion.  That's me!  Great!  So let's set it to 
be really big in case we needed really big factorials, like 100 
million, let's say.  So we calculate factorial of 100,000 and it 
works, great.  Let's try factorial of a million!
[ Laughter ]
So now, instead of throwing an error, Python crashes.  It's 
insightful and sending information to Apple pops up.  So that's 
not better.  Although I will say, say, segfaulting Python is 
really fun.  And any day we fault Python is a good day in my 
book.  So yeah, what happened?  So if we look back at this 
documentation there's stuff that we ignored like, causing an 
overflow of the C stack and then crashing Python.  So at some 
point, the c stack gets too big and the point at which that 
happens depends on your system, so it might be on my computer 
than on yours.  Fun.  So here's a question:  We know that the 
Python recursion limit is a thousand by default and we tried to 
calculate a factorial of a thousand and this function requires 
one level of recursion per n.  So a thousand should work, right? 
 Why did we get the limit -- why did we hit the limit on 
thousand, instead of a thousand and one?  So the first thing 
that you might think is there's an off by one error and the 
limit is enforcing less than a thousand.  But we can check this 
in the source code and it's definitely not.  This is the 
relevant snippet.  Depth greater than the limit, so that's not 
the error.  So where's this extra level of recursion coming 
from?  So in fact, what's going on is that the recursion limit 
in Python -- I have to put scare quotes around that -- is not a 
limit on recursion per se, it's a limit on the total number of 
stack frames now, we get one stack frame for each recursive call 
we made, but in this case it could have been a rebel, or a 
module, or a function, or somewhere else.  In fact we don't 
actually need to have any recursive calls at all.  Python will 
through a recursion limit, will throw this limit when you hit 
the number of stack frames you specify regardless whether or not 
any of them were recursive.  So someone mentioned it involves 
user dictionary words.  So I experimentally, that you can hit 
the recursion limit this way definitely?  Put so you'll get this 
exception whenever you reach the recursion limit in the number 
of calls in the stack without any recursion.  So that's part 
one.  The recursion without any recursion.  Now, let's do part 
two, the recursion without any limits.  So if you're a fan of 
recursion, you might be familiar with TCO.  Tail call 
optimization.  And tail call optimization is an optimization in 
some languages that says, "if we don't need to keep these lying 
around, we might as well not build the call stack in the first 
place, we'll just toss the frames and we'll just have this one 
frame not using a huge amount of memory and not crashing 
Python."  And so in our example there is work to do while 
unwinding the stack.  We said recurs down to n of one as we 
build up the stack and then as we unwind do that multiplication 
and so we're doing data on that stack, and all these and we need 
that around to do the multiplication.  The formal way to put 
this is that:  The recursion is not in the tail position put we 
can fix this.  And all we have to do is move the multiplication 
to precede the recursion instead of following it.  So here we're 
keeping it an accumulated product and when we get to n (1) we're 
done. so now we've moved into recursion, and there's no extra 
work to be done, we're unwinding the call stack.  Great, so now 
that the factorial function is not recursive, we don't have to 
worry about the recursion anymore, right?  Not exactly.  In 
fact, not at all.  We've written our function with recursion in 
the tail position but the compiler doesn't care about that.  The 
compiler is doing almost exactly the same thing that it was 
doing before and it's going through the entire giant call stack 
and we have a problem again.  Python doesn't support tail call 
optimization.  What follows here is some of the goofiest and fun 
Python that I've ever written and I have to give red credit to 
my co-conspirators to it to Paul Dimonteand Claire Acleve 
(phonetic).  Tail here's the idea, tail call optimization is 
basically turning it into a loop.  What we want is Python when 
it gets to the recursive call at the end, update that -- so we 
want to go too, basically.  But that's silly because there's no 
go-to in Python.
           There's not a go-to in Python but there is a go-to in 
Python byte code.  Byte code is the language that the 
interpreter speaks that the compiler generates and our beef 
today is with the compiler doing the wrong thing.  So the 
interpreter will step through the instructions that it's given 
and if they're well formed then it's great, and if not, then 
it's false.  In Python there are jump instructions and this is a 
really fundamental thing because jump instructions are how you 
implement things like loops and if-statements.  The compiler 
says, now, jump back and do these instructions again or skip 
these instructions under certain circumstances.  Those are loops 
and conditions.
           So this is the existing byte code for the function.  
I notice this is kind of small, the details are not important.  
I want to draw your attention to three things.  The first one is 
that there's a jump instruction in this byte code already, the 
pop, jump, and false, that's the conditional.  So we're 
interested today in two other instructions.  The first is load 
global which is the one that loads up that recursive call and 
then here closer at the bottom of 33 is call function which, you 
know, actually calls the function.  So we can use load global to 
see if we're making the recursive call and then use the call 
function to try to make that jump and in trying to go from here 
to here basically instead of load global don't, and instead of 
call function, jump.  So not too much of a difference.  Okay.  
So how can we do this?  So the first delightful surprise is that 
it's trivial to replace the code object associated with a Python 
function.  You can literally just swap them out.  Please never 
do this in real life.  It totally works.  It's also not hard to 
make a code object given an existing code object.  So all we 
need to do is figure out how to change the byte code.  Here's 
how to write our own byte code.  Again you can ignore the 
details of this.  I've included this to show that the janky 
version of this really requires less than 20 lines and this 
basically says that if the instruction is load global and the 
name that we're loading is our own name then don't.  And if 
we're calling a function, then store the arguments of the 
function, so the arguments of n in the accumulator and then jump 
back to the byte code.  Again, if you like faulting Python I 
highly recommend hacking byte code.  Here we're calculating 
factorial of a thousand, and it totally works.
AUDIENCE MEMBER:  Yay!
ALLISON:  Now, let's try a million.  This also works... 
eventually.  I timed this.  It took 23 minutes.  I didn't 
include the result on the slide because it is 5.5 million digits 
long.  So it didn't fit... so that's part two.  Recursing with 
no limit in Python with hacking byte code.  In the last minute 
that I have here I want to tell you about a really fun corner 
that I painted myself into while preparing for this doc, and 
that's when you set the Python recursion limit to one.  A 
recursion of one means that you can only have one frame on the 
call stack.  So that means... unlike other things the call 
function's -- so here we've succeeded setting the recurs limit 
to one, because we can call functions again but then we can't 
unset it and it's actually even better than that.  Python can't 
print the message associated with this runtime error because 
that takes a frame.  After that Python also can't gave a 
traceback because that also takes frames and so we're in a bit 
of a pickle here.  And it gets even worse than that not only can 
we not call functions we can't do anything at all because 
executing a line of code in the referrer requires a frame and so 
we're totally stuck here.  There's no way out of this without 
putting in Python, and in fact...
[ Laughter ]
[ Applause ]
  Is a recursion of one and thankfully Python is smart enough to 
ignore that when it's hitting the recursion limit and does let 
us quit the program and get back to bash.  Thank you.
LEO:  Now that I know how to get stuck -- we're moving on to I'm 
in ur javaz haxxing ur codez.
>> My voice is kind of dead.  Can you hear me now?  We're good?  
Cool.  So this is I'm in your Javas, hacking your codes.  So the 
first thing that I want to talk about is a public speaking hack. 
 This is the most UK photo of myself that I can find, I'm 13 
years old and I'm showing you this now because nothing say in 
this sentence is going to be more embarrassing than that.  So 
onwards, about a year ago, I've seen this dialogue pop-up.  And 
this is from a program called Little Snitch which is a Mac 
firewall and you set it up and any time your program in your 
computer makes a connection, it says, hey, this program is 
trying to connect here.  And it's trying to connect to 
repo.maven.apache.org... and for some context, maven is a Java 
tool.  And maven, central, which is what repo.maven.apache.org 
is, ask the central for what maven is.  So if you're using STP 
for Scala, you're pulling some stuff from there.  And there's 
this company called Sonotype.  That hosts maven central which is 
an expensive thing to do.  There's, like, 900,000 different Java 
artifacts on there and it's probably a lot of megabytes of 
space.  And they have this policy where, if you wanted to 
download stuff encrypted over SSL, you had to give them $10.  
The $10 would then be donated to the Apache Foundation.  I don't 
know why they had this policy and I like to yell at the 
Internet.  And they said that all features of all products are 
not free.  SSL is offered with Nexus Pro and OSS, for those 
wishing for security, we offer that.  Okay.  That made me sad.  
So I decided to do something about it.  And there's, like, a 
"well actually" here that you might be thinking.  You're 
downloading Java libraries, maybe they're all cryptographically 
signed so even though they're being downloaded over HTTP, we'll 
still check the signature and that's just not true.  Some of 
them are signed.  Some of them are signed.  But then the 
signature's not really checked by maven.  So we're all good.  So 
here's what we want to do.  I want to somehow, like, proxy when 
I download a new library.  I decided to start a new project, I 
downloaded a bunch of new libraries, I want to proxy that.  And 
then I want to put evil stuff in those jars and I want to, like, 
make sure that the evil stuff actually runs on your machine and 
I want it to, like, not break things too much.  I mean, I want 
to break things, but I want to not "really" break things.  So 
first of all, we need to, like, somehow proxy the developer when 
they're downloading a new library.  You could, like, set up a 
router, you know, in a place where there's a lot of programmers. 
 And you use IP tables to forward all the traffic to a program 
that you wrote, which you can do.  There's, like, what I did is 
there's an XML file that maven uses and you can set the proxy.  
So there's this amazing library -- there's this amazing tool 
called MITMProxy.  You point your browser at it, and it takes 
all the requests and you can edit them.  And they have a library 
they released as well which makes it this literally this easy to 
modify traffic.
           You write one method and it gets this flow object and 
that has the request and I look at the request and I say if it's 
coming from repo.maven.apache.org and it's a jar, put something 
evil there.  So then we want to get our code to execute.
           So let's say I wrote some evil Java.  Like, you know, 
I've made a package, I could do something evil.  To get it in a 
jar that I intercepted is pretty easy.  A jar is just a zip 
file.  So all I have to do is just compile that Java into a 
class, take that class, put it in a zip file and send that file 
along.  So the next step is I need to somehow modify the other 
code in that jar in order to get it to run.  So there's this 
library called Krakatau which is a Python/Java decompiler 
disassembler, assembler.  The GitHub author username is 
storyyeller.  But storyyeller.com is the website of Melvil 
Dewey, hip hop superstar.  And that's Melvil Dewey's first 
album, Papercuts.  So this is how you use it.  It's a really 
nice library.  It's really not written to be used as anything 
other than a command line tool.  And so I just copied and pasted 
a bunch of code but what really matters is you just disassemble 
some Java and you put more of Jasmine in and you assemble it 
back and what Jasmine is, it's a way of writing Java byte code 
in a way that looks like assembly syntax.  So this is the evil 
Java that we want.  And how many of you have seen that static 
thing just like that, bare in Java 'cause I never did and it 
took me, like, three days to figure out how to do this because 
what I really want is, I don't know what class you're going to 
load and when you load some class into memory I want it to 
execute my code without, like, you actually having to call a 
function.  And I really wanted a class initializer, which in 
Java are called static blocks which some of you know, which I 
didn't, and it took me forever to find that.  But I figured it 
out and so whenever you import the class and you load it into 
memory, that static block gets executed and in Java, it 
invocation static in that method.  So -- it invokes static in 
that method.  I also inject a little bit of Java byte code into 
that class that you're going to use when you use the library 
that I've backdoored.  Which class?  All of the classes.  So I 
just put that byte code in every single class in the jar for the 
best.  You'll probably use one of them if you're using the jar.  
I don't want to break things too much, though.  To in my evil 
backdoor, I just check if it ran and if it did, I don't do 
anything but I do it at least once.  Is this going to work?  So 
I wrote an awesome "hello world" that uses joda time to print 
the time, and also hello world and I'm typing really slowly for 
some reason.  And so it runs, and you can see on the bottom of 
the screen it says the current local time and it says "hello 
world."  And then I'm going to open a terminal and delete joda 
time from, like, my system as I slowly type out what I'm doing.  
But my apologies.  I recorded this video, like, without a talk 
and I can't really do this anymore as you'll find out soon.  So 
I have to use this video.  And so I delete it, and I go back to 
teleJay -- I do a maven install, so it's going to tell this 
thing.  And you can see I'm backdooring every single class.  One 
of them is the one that I used to, like, figure out what time it 
is, to tell you when I tell you hello world.  When I go and I 
run this code again.  There it is, it's loaded again.  Run 
mistakes!  Run the code!  Stop typing!  They already know!  Why 
are you so slow?  Yes!  So...
[ Applause ]
So we successfully put a LOLCat in Java updates.  So the 
epilogue is, HTTPS/SSL support now!  And that post was July 8th, 
and this post was August 4th, so it was an extremely short 
turnaround time.  So it was a good job.  Thank you.