Archive for the ‘python internals’ Category

python bytecode: loops

August 18th, 2013

New to the series? The previous entry was part 4.

We have seen quite a bit of bytecode already - we've even seen the structure of a bytecode module. But we haven't seen loops yet.

A while loop

We'll start with the simplest kind of loop. It's not really a typical loop given that it breaks after a single execution, but it's enough to give us a look at the looping machinery.

def loop(x):
    while x > 3:
        print x

Disassembly of loop:
  2           0 SETUP_LOOP              22 (to 25)
        >>    3 LOAD_FAST                0 (x)
              6 LOAD_CONST               1 (3)
              9 COMPARE_OP               4 (>)
             12 POP_JUMP_IF_FALSE       24

  3          15 LOAD_FAST                0 (x)
             18 PRINT_ITEM          
             19 PRINT_NEWLINE       

  4          20 BREAK_LOOP          
             21 JUMP_ABSOLUTE            3
        >>   24 POP_BLOCK           
        >>   25 LOAD_CONST               0 (None)
             28 RETURN_VALUE

The control flow is a bit convoluted here. We start off with a SETUP_LOOP which pushes a block onto the stack. It's not really clear to me why that's necessary, given that Python does not use blocks for scopes. But it might be that the interpreter needs to know which level of looping it's at.

We then load the variable x and the constant 3 and run COMPARE_OP. This opcode actually takes a parameter to tell it which operation to perform (greater than in this case). The result of that will be a boolean value on the stack.

Now we need to know whether we're going to execute the loop body or jump past the loop, so that's POP_JUMP_IF_FALSE, which may jump to location 24 where the loop ends.

Assuming we are in the loop body, we simply load the variable x and print it. Interestingly, the print statement requires two opcodes PRINT_ITEM and then PRINT_NEWLINE, which seems a bit over the top.

We now have a BREAK_LOOP instruction. Notice that if we were to ignore and execute the JUMP_ABSOLUTE just behind it that would return us to the loop predicate, and we might continue looping. But that's not supposed to happen after a break: a break ends the loop even if the loop predicate is still true. So this must mean that we won't reach JUMP_ABSOLUTE.

After this we execute POP_BLOCK which will officially end the loop by taking the block off the stack again.

A for loop

A for loop, then, is not very different. The main difference is that we are not looping on a boolean condition - we are looping over an iterable.

def loop(x):
    for i in range(x):

Disassembly of loop:
  2           0 SETUP_LOOP              25 (to 28)
              3 LOAD_GLOBAL              0 (range)
              6 LOAD_FAST                0 (x)
              9 CALL_FUNCTION            1
             12 GET_ITER            
        >>   13 FOR_ITER                11 (to 27)
             16 STORE_FAST               1 (i)

  3          19 LOAD_FAST                1 (i)
             22 PRINT_ITEM          
             23 PRINT_NEWLINE       
             24 JUMP_ABSOLUTE           13
        >>   27 POP_BLOCK           
        >>   28 LOAD_CONST               0 (None)
             31 RETURN_VALUE

To do that we have LOAD_GLOBAL to load the range function on the stack. This is an opcode we haven't seeb before, and it simply means this name comes from somewhere outside this module (the __builtin__ module in this case). We then load x and call the function. This produces a list.

Now, since Python uses iterators so heavily, the loop will use this method to move through the list. It means you could also loop over any other iterable object (tuple, dict, string, your own custom iterators etc). In fact, GET_ITER amounts to calling the iter function on the list (which returns an iterator object). And FOR_ITER calls the iterator's next method to get the next item.

We now have the first int in the list, and we bind it to the name i with STORE_FAST. From there on, we may use i in the loop body.

You will notice that there is something odd about the way i is manipulated. At location 16 the int is sitting on the stack, and gets bound to a name with STORE_FAST. This consumes it on the stack. We then immediately push it on the stack again with LOAD_FAST. These two instructions cancel each other out: we could remove them without changing the meaning of the program.

So why do we have to store and load? Well, imagine i were used again in the loop body - it would have be bound, right? So it could be optimized away in this case, but not in the general case.

python bytecode: object model

August 12th, 2013

New to the series? The previous entry was part 3.

Okay, so we've seen a fair amount of bytecode already. We know how modules, classes and functions work and today we'll study an example where it all comes together. It's more about consolidating our insights rather than learning new material.

A simple module

Right as you see this code you'll be thinking in bytecode already. We have three top level bindings:

  • pencils is bound to the integer 13
  • give_back is bound to a function object
  • Person is bound to a class object

pencils = 13

def give_back(x):
    return x

class Person(object):
    default_age = 7

    def __init__(self, age):
        self.age = age or self.default_age

    def had_birthday(self):
        self.age += 1

    def how_old(self):
        return self.age

Let's disassemble this module so we can complete the picture. Ned Batchelder was kind enough to provide some example code for this purpose.

We won't go through all of the bytecode, as it's a bit long. But this diagram illustrates some of the key components of this module. On display we have:

  • The module's code object. Two of its constants are code objects too:
    • The function give_back's code object.
    • The class Person's code object. it has three constants that are code objects, one of which is:
      • The method how_old's code object.

So yes, a .pyc file simply contains a code object (of type types.CodeType), which has a bunch of attributes. Not all of those attributes will be set - it depends on what kind of functional object (module, class, function, ...) the code operates on.

A function that uses local variables will store them in .co_varnames.

A class or a function will have a .co_name value.

Many code objects will have .co_filename value that tells you where its source code comes from.

And in many cases a code object will have other code objects in its .co_consts tuple, and those will be code objects representing classes or functions or what have you.

python bytecode: classes

August 11th, 2013

New to the series? The previous entry was part 2.

Today we're going to see how classes work. Classes are a bit tricky as they provide the glue that makes functions into methods. They are also created dynamically, as we saw last time, so they must have some compiled bytecode that gets executed when the class actually gets created.

We'll do this in several steps, because unlike functions classes don't come with code objects that we can inspect directly. So we'll use some trickery to see what happens step by step.

Executing a class body

Before we look at a class in constructed form, let's poke around its namespace.

We'll use this simple class as a running example.

import dis

class Person(object):
    default_age = 7

    def __init__(self, age):
        self.age = age or self.default_age

    def had_birthday(self):
        self.age += 1

    def how_old(self):
        return self.age

    # throws NameError if these are not bound
    default_age, __init__, had_birthday, how_old

    # We'll see what this function looks like

# disassemble
 13           0 LOAD_FAST                0 (self)
              3 LOAD_ATTR                0 (age)
              6 RETURN_VALUE

What happens here is that we let all the class members get defined. This is basically like making bindings at the module level. In this scope there is nothing to suggest that we are inside a class body. (Well, except that the name __module__ is also bound to "__main__", which would not be the case at module level.)

We disassemble one of the functions to show that there is nothing special about it - it doesn't contain any special sauce that would reveal it to be a method instead of a function. self is a local variable like any other that just happens to be called self.

The class code

As mentioned there is no way to get at the code given a class object, but what we can do is put the class definition in a function body, retrieve the function's code object - which we know contains all the constants and variables used in that function. One of those is bound to be the code object of the class-to-be-constructed.

import dis

def build_class():
    class Person(object):
        default_age = 7

        def __init__(self, age):
            self.age = age or self.default_age

        def had_birthday(self):
            self.age += 1

        def how_old(self):
            return self.age

cls_code = build_class.func_code.co_consts[2]

# disassemble
  4           0 LOAD_NAME                0 (__name__)
              3 STORE_NAME               1 (__module__)

  5           6 LOAD_CONST               0 (7)
              9 STORE_NAME               2 (default_age)

  7          12 LOAD_CONST               1 (<code object __init__ at 0xb7533ba8, file "", line 7>)
             15 MAKE_FUNCTION            0
             18 STORE_NAME               3 (__init__)

 10          21 LOAD_CONST               2 (<code object had_birthday at 0xb7533c80, file "", line 10>)
             24 MAKE_FUNCTION            0
             27 STORE_NAME               4 (had_birthday)

 13          30 LOAD_CONST               3 (<code object how_old at 0xb7533e30, file "", line 13>)
             33 MAKE_FUNCTION            0
             36 STORE_NAME               5 (how_old)
             39 LOAD_LOCALS         
             40 RETURN_VALUE

So we reach into the function's code, into it's co_consts tuple and grab the code object. It happens to be at index 2, because index 0 is None and index 1 is the string "Person".

So what does the class code do? Just like at module level, it binds all the names in its namespace, and it also binds the name __module__, because a class is supposed to know the module it's defined in.

And then? Once all those bindings have been made, it actually just returns them. So basically the class code builds a dict and returns it.

This helps complete the picture from last time. To recap, at module level the code first a) calls the class as if it were a function with CALL_FUNCTION (to which this "function" returns a dict, as we've just seen), and then b) BUILD_CLASS on that return value (ie. on the dict), which wires everything together and produces an actual class object.


Okay, now let's find out something else. We know that functions and methods are not the same type of thing. What about their code? We saw before how a function defined in a class body has no signs of being a method. Has it changed during the construction of the class? A function object replaced by a method object perhaps?

import dis

class Person(object):
    default_age = 7

    def __init__(self, age):
        self.age = age or self.default_age

    def had_birthday(self):
        self.age += 1

    def how_old(self):
        return self.age


# disassemble
 13           0 LOAD_FAST                0 (self)
              3 LOAD_ATTR                0 (age)
              6 RETURN_VALUE

The answer to that is no. The object is unchanged. In fact, the object stored in the class is a function. That's right, a function. Try reaching into Person.__dict__ to get it and what you get is a function object. It isn't until you do an attribute access on the class object (Person.how_old) that the method object appears, so the method is like a view on the function, it's not "native".

How does that work? You already know: descriptors.

func = Person.__dict__['how_old']
print func
# <function how_old at 0xb749e994>

print Person.how_old
# <unbound method Person.how_old>
print func.__get__(None, Person)
# <unbound method Person.how_old>

person = Person(4)
print person.how_old
# <bound method Person.how_old of <__main__.Person object at 0xb73f5eec>>
print func.__get__(person, Person)
# <bound method Person.how_old of <__main__.Person object at 0xb73f5eec>>

Function objects implement the descriptor protocol. Getting the function through the class (ie. getting the method) is equivalent to calling the function object's __get__ method with that class object as the type. This returns an unbound method (meaning bound to a class, but not to an instance of that class).

If you also give it an instance of the class you get a bound method.

So there you have it, classes and methods. Simple, right? Well, ish. One last thing: is the bound/unbound method object created on the fly? As in: does Python perform an object allocation every time you access a method? Because that would be... bad. Well, it doesn't. At least as far as the user can tell, it's always the same object with the same memory address.

python bytecode: modules

August 8th, 2013

If you're new here I recommend starting with part 1.

Last time we looked at statements inside function bodies. Exciting stuff, but a bit puzzling. I bet you were asking yourself: How did those functions get into that module in the first place?

Statements at module level

For some reason dis.dis does not seem to produce any output when given a module object other than showing you its functions. For a more insightful view we will use python -m dis

Let's try a really simple one first.

a = 7
b = a + a

# disassembly
  1           0 LOAD_CONST               0 (7)
              3 STORE_NAME               0 (a)

  2           6 LOAD_NAME                0 (a)
              9 LOAD_NAME                0 (a)
             12 BINARY_ADD          
             13 STORE_NAME               1 (b)
             16 LOAD_CONST               1 (None)
             19 RETURN_VALUE

The first thing to notice is the use of LOAD_NAME and STORE_NAME. We're at module level here, and LOAD_FAST/STORE_FAST only apply to local variables in a function body. But just like a function object a module stores the names it contains in a tuple which can be indexed into, as shown here.

It's a bit more obscure to get at that storage, because a module does not have a func_code attribute attached to it like a function does. But we can create a module object ourselves and see what it contains:

code = compile("a = 7; b = a + a", "", "exec")
print code.co_names
# ('a', 'b')

And there's our storage. The module has various other co_* attributes which we won't go into right now.

Also worth noting: modules return None like functions do, which seems a bit redundant given that there isn't a way to capture that return value: value = import os is not valid syntax. And module imports feel like statements, not like expressions.


Think about this first: when you import a module which contains two functions, what do you expect its namespace to contain? Those two names bound to their function objects, of course! See, that was not a trick question!

def plus(a, b):
    return a + b

def main():
    print plus(2, 3)

# disassembly
  1           0 LOAD_CONST               0 (<code object plus at 0xb73f4920, file "", line 1>)
              3 MAKE_FUNCTION            0
              6 STORE_NAME               0 (plus)

  4           9 LOAD_CONST               1 (<code object main at 0xb73f47b8, file "", line 4>)
             12 MAKE_FUNCTION            0
             15 STORE_NAME               1 (main)
             18 LOAD_CONST               2 (None)
             21 RETURN_VALUE

And so it is. Python will not give any special treatment to functions over other things like integers. A function definition (in source code) becomes a function object, and its body becomes a code object attached to that function object.

We see in this output that the code object is available. It's loaded onto the stack with LOAD_CONST just like an integer would be. MAKE_FUNCTION will wrap that in a function object. And STORE_NAME simply binds the function object to the name we gave the function.

There is so little going on here that it's almost eerie. But what if the function body makes no sense?? What if it uses names that are not defined?!? Recall that Python is a dynamic language, and that dynamism is expressed by the fact that we don't care about the function body until someone calls the function! It could literally be anything (as long as it can be compiled successfully to bytecode).

It's enough that the function object knows what arguments it expects, and that it has a compiled function body ready to execute. That's all the preparation we need for that function call. No, really!


Classes are much like functions, just a bit more complicated.

class Person(object):
    def how_old(self):
        return 5

# disassembly
  1           0 LOAD_CONST               0 ('Person')
              3 LOAD_NAME                0 (object)
              6 BUILD_TUPLE              1
              9 LOAD_CONST               1 (<code object Person at 0xb74987b8, file "", line 1>)
             12 MAKE_FUNCTION            0
             15 CALL_FUNCTION            0
             18 BUILD_CLASS         
             19 STORE_NAME               1 (Person)
             22 LOAD_CONST               2 (None)
             25 RETURN_VALUE

Let's start in the middle this time, at location 9. The code object for the entire class has been compiled, just like the function we saw before. At module level we have no visibility into this class, all we have is this single code object.

With MAKE_FUNCTION we wrap it in a function object and then call that function using CALL_FUNCTION. This will return something, but we can't tell from the bytecode what kind of object that is. Not to worry, we'll figure this out somehow.

What we know is that we have some object at the top of the stack. Just below that we have a tuple of the base classes for our new class. And below that we have the name we want to give to the class. With all those lined up, we call BUILD_CLASS.

If we peek at the documentation we can find out that BUILD_CLASS takes three arguments, the first being a dictionary of methods. So: methods_dict, bases, name. This looks pretty familiar - it's the same inputs needed for the __new__ method in a metaclass! At this point it would not be outlandish to suspect that the __new__ method is being called behind the scenes when this opcode is being executed.

python bytecode: single scope

August 6th, 2013

Python bytecode, meet the language behind the language! It's a language like any other, with its own syntax, semantics, design patterns and pitfalls. It's even better than Python, because it's executed directly, not compiled to some intermediate easy-enough-to-interpret language. So if those bytecode compilation times are bugging you, you know what to do.

As you know, CPython is a stack based VM. Okay, what does that mean? It means that if you examine a simple Python expression it will generally follow this pattern:

  1. Push (load) the operand on the stack. This can be a constant or a variable.
  2. Perform the operation on the operand.
    This will consume the operand and replace it on the stack with the result of the operation.
  3. Store the value in a variable (if there is an assignment) or discard it (if there isn't).

If the operation were binary you would push two operands on the stack instead, and so on. And each of the steps is a bytecode instruction, so one "atomic" looking Python expression becomes a handful of bytecode instructions. The stack is like a desk - you put something down that you're going to use, perform the work, and clean up the desk afterwards (just like we all do in real life, amirite?).

An assignment statement

Not convinced? import dis and follow me, then.

def main(a):
    # We will disassemble the body of this function
    a = a + 7

import dis

Disassembly of main:
  2           0 LOAD_FAST                0 (a)
              3 LOAD_CONST               1 (7)
              6 BINARY_ADD          
              7 STORE_FAST               0 (a)
             10 LOAD_CONST               0 (None)
             13 RETURN_VALUE

Woah, what's this? The 2 in the left column is the line number of where this function starts. It's actually the first line in the module, so that would be line 0 or line 1 depending on your counting system, but close enough.

Let's dig in. We have 6 instructions just to do a = a + 7, seems a little much.

The first instruction is LOAD_FAST 0. LOAD_FAST operates on local variables, so it will load the zeroth local variable from somewhere onto the stack. You can see dis printing the value of the variables/constants in parentheses in the right column. The value is a.

The next instruction is LOAD_CONST 1, whose value is 7. We are loading the first constant onto the stack. (Which seems odd, as I only spot one constant in this function, what's the zeroth constant then? Patience, Robinson, we'll get there soon.)

Next, we have BINARY_ADD, which does not take an argument. Conveniently, it finds two operands ready and waiting on the stack, however. It pops them both off, performs the operation, and pushes the result onto the stack again.

Finally, we have STORE_FAST 0, which stores whatever is on the stack into the zeroth local variable, whose name is a. So we've accounted for our Python code and everything looks rosy.

But the bytecode compiler has a dilemma. This is a function it's compiling, and functions are supposed to have return values, even if this one doesn't. The caller of the function will be pissed if he doesn't get a return value.

A cunning plan. Why not just return nothing? Push the value None on the stack (by golly, there's our zeroth constant!), then return it.

A branch

Okay, loading variables and literals, performing operations. *shrug* What if you had a branch, tough guy, would you indent your bytecode, huh?

def main(a):
    if condition:
        a = 7
        a = 9

Disassembly of main:
  2           0 LOAD_GLOBAL              0 (condition)
              3 POP_JUMP_IF_FALSE       15

  3           6 LOAD_CONST               1 (7)
              9 STORE_FAST               0 (a)
             12 JUMP_FORWARD             6 (to 21)

  5     >>   15 LOAD_CONST               2 (9)
             18 STORE_FAST               0 (a)
        >>   21 LOAD_CONST               0 (None)
             24 RETURN_VALUE

Well, no. And hang tight, because this is where Python bytecode becomes a little assembly-ish.

We load the condition variable and perform the check. If true we just continue to the next instruction, but if false we jump to location 15, which is the body of the else block.

a) if we're executing the if case, we perform the assignment, then jump forward by 6 locations to location 21, which is the end of the function (and return None).

b) if we're executing the else case, we perform the assignment and just continue on, returning None.

Object lookups

So far we have looked at some very basic operations, and the opcodes are pretty similar to what you would find when compiling the same code in C and looking at the assembly. (Well, except that we are still in the world of dynamic typing and BINARY_ADD has _no idea_ what kind of things it will be trying to add until it actually tries to perform the operation and only then! discovers the types of the operands.)

Let's try something more high level - like operations on objects. Two of the most common things we do in Python are: indexing into objects (for lists and dicts), and looking up attributes on them.

def main(obj):

Disassembly of main:
  2           0 LOAD_FAST                0 (obj)
              3 LOAD_GLOBAL              0 (x)
              6 BINARY_SUBSCR       
              7 POP_TOP             

  3           8 LOAD_FAST                0 (obj)
             11 LOAD_ATTR                0 (x)
             14 POP_TOP             
             15 LOAD_CONST               0 (None)
             18 RETURN_VALUE

Why that seems surprisingly concise! The index operation aside (a single opcode), attribute lookup is a complicated piece of logic that involves looking at obj.__dict__, looking at the class dict (and base class dicts), calling __getattribute__ and possibly __getattr__. All of that in a single opcode? That's value for money in my book.

At this point it should dawn on you that bytecode compilation does not actually compile the program very much. It's basically rewriting the same thing in bytecode, and that's why the compilation is so fast.

Approaches to interpretation

So why bother? Let's think about our options.

  1. Parse and interpret Python source code on-the-fly.
    Impractical, because the code you need to execute determines how many tokens you need to parse. An assignment may only need the rest of the tokens on the line, but a class definition needs the whole class body. Noone actually does this.
  2. Parse the source code to an Abstract Syntax Tree (AST) and write the interpreter as an AST walker.
    Possible, but it makes the interpreter a bit complicated. It has to remain in sync with the structure of the AST at all times. Ruby MRI 1.8 uses this approach.
  3. Parse the source code to an AST, compile that to bytecode, interpret the bytecode.
    Most interpreted languages use this approach. It decouples the interpreter from the AST, so instead of traversing a tree you're executing a linear stream of instructions.