implicit self (part 2 of 4) with eval

2013.May.07

In the last post of this series, we looked at instance, class, and static methods in Python. We looked at explicit self and looked at an example of what we might call implicit self.

1
2
3
4
5
6
7
class Foo(object):
	@apply
	class bar(object):
		def __get__(desc, self, cls):
			def bar():
				return self
			return bar

In this example, bar has self in scope but without having it provided implicitly as part of the instance method descriptor and without having to specific it explicitly in the method signature.

The Python documentation provides a brief primer on descriptors. In short, when an attribute lookup is performed for a class, the interpreter checks whether the attribute being retrieved is a descriptor. If so, instead of returning the value of that attribute, the interpreter returns the result of invoking the descriptor’s __get__ method.

In a later post, I’ll show all the various ways that we can retreive an attribute from an object in Python (including how __slots__ factors into this.)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
class Descriptor(object):
	# the __get__ function is invoked with
	#   the instance that we're doing
	#   the lookup on (`foo'), the owner, and
	#   the type of that instance (`Foo'), the
	#   owner's type
	# the __get__ function looks like an instance
	#   function like any other that takes
	#   an implicit first parameter, self
	# indeed, this first parameter references
	#   the instance of the descriptor object
	#   which is unique to the owner's class but
	#   to the owner's instance
	# unfortunately, it isn't actually the case
	#   that this parameter is an implicitly provided
	#   `self' and that the mechanism for retrieving this
	#   function from the descriptor object follows
	#   the unified attribute retrieval protocol
	# to see this disunity in action, try making __get__
	#   a @classmethod or @staticmethod
	# if __get__ were a regular instance method being
	#   called on a regular class, then we would expect
	#   to be able to make it a static method or class method
	# this doesn't work
	# it only happens to work if __get__ is a regular
	#   instance method
	# and, in fact, the mechanism by which __get__ is invoked
	#   is itself coincidentally similar to but not unified
	#   with the mechanism by which an instance method is invoked
	def __get__(descr, owner, owner_type):
		return descr, owner, owner_type

class Foo(object):
	bar = Descriptor()

foo = Foo()

# when lookup up `bar' in foo, 
#   the interpreter sees that
#   it is an object and that
#   this object has a __get__
#   member function
# therefore, instead of returning
#   this object, the interpreter
#   invokes the __get__ member
#   function and returns that value
foo.bar

In the above, the descriptor allows us to get a reference to the instance (self) and the class (cls.) We make them available to our member function bar by allowing bar to close over them.

I will explain how this mechanism works in the next post.

If we wanted to automatically capture these variables in a closure, we could try something like the following. We’ll write a decorator called instancemethod that will convert a regular instance method into something looking like the above. In order to do this, the decorator will use the inspect module to retrieve the textual source code for the defined method. It will parse this source code to remove @instancemethod, remove the initial indentation, then indent it so it can be inserted into our template. Then, the decorator will paste this transformed source code into a template that defines a new descriptor with a __get__ method that returns our function, having it close around self and cls.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
from inspect import getsource
from textwrap import dedent
from re import split
from itertools import takewhile, imap, dropwhile, chain, ifilter
def instancemethod(f): 
	# retrieve and parse source
	method_source = '\n'.join(                 # build the text block by
	  imap(lambda line: '\t\t{}'.format(line), #   indenting all of the lines
	    (lambda source: chain(                 #   where the lines are the concatenation of
	                                           #   any decorators except @instancemethod
	      takewhile(lambda line: not line.strip().startswith('def'), 
	        ifilter(lambda line: not line.strip().startswith('@instancemethod'), source)),
	                                           #   and the source that follows
	      dropwhile(lambda line: not line.strip().startswith('def'), source)))
                                               #   operating on the original textual source
	  (split('\r|\n', dedent(getsource(f))))))
	method_name = f.__name__

	# paste it into a new template
	method = dedent('''
	@apply
	class {method_name}(object):
		def __get__(descr, self, cls):
	{method_source}
			return {method_name}
	''').format(method_source = method_source, method_name = method_name)

	# execute the new code string in a namespace
	#   and retrieve & return the resulting method
	namespace = globals().copy()
	exec method in namespace 
	return namespace[method_name]

class Foo(object):
	@instancemethod
	def bar():
		return self, cls

foo = Foo()
assert foo.bar() == (foo, type(foo))

Please, don’t use this code!

One particular reason to not use this code is that this source code transformation technique breaks in couple of ways. For example, if we try to use it at the interactive console, the inspect.getsource won’t be able to find the textual source code, so it won’t be able to rewrite our instance methods!

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
>>> from implicitself import instancemethod
>>> class Foo(object):
...   @instancemethod
...   def bar():
...     return self, cls
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in Foo
  File "implicitself.py", line 16, in instancemethod
    (split('\r|\n', dedent(getsource(f))))))
  File "/usr/lib/python2.7/inspect.py", line 701, in getsource
    lines, lnum = getsourcelines(object)
  File "/usr/lib/python2.7/inspect.py", line 690, in getsourcelines
    lines, lnum = findsource(object)
  File "/usr/lib/python2.7/inspect.py", line 538, in findsource
    raise IOError('could not get source code')
IOError: could not get source code

Additionally, if we try to decorate any function that does not have source code, this will fail.

1
2
3
4
5
# doesn't work, since we can't find the
#   source code for a C-function like
#   the builtin len
class Foo(object):
	bar = instancemethod(len)

If one of our decorators wraps the method before instancemethod is invoked, inspect.getsource will get the source of the wrapper function as opposed to the original definition of the member function. This is an opaque wrapping. The wrapped, inner function whose source text we want will show up in the wrapper’s closure. In practice, it will be very difficult for us to identify this case and to pick out the function whose source text we want.

As a result, we probably will not be able to make the following work:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
# we'll accidentally get the source for `wrapped'
#   instead of for `bar'

from functools import wraps
def decorator(f):
	@wraps(f)
	def wrapped(*args, **kwargs):
		return f(*args, **kwargs)
	return wrapped

from implicitself import instancemethod
class Foo(object):
	@instancemethod
	@decorator
	def bar():
		return self, cls

In our parsing, we need to skip the @instancemethod decoration, and we do so by textually removing it. However, we can’t be guaranteed that we can statically identify the decorator to remove. Since we can’t determine what the decorator looks like, our parsing may fail to remove it from our textual source code, instancemethod will call instancemethod and we’ll loop until we hit the recursion limit.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
import implicitself

# doesn't work, since we only look for
#   `@instancemethod'
class Foo(object):
	@implicitself.instancemethod
	def bar():
		return self, cls

# doesn't work, since we have no way
#   to statically determine
#   which decorator is actually
#   the one we need to remove
from implicitself import instancemethod
def decorator(f):
	return f
decorator, instancemethod = instancemethod, decorator
class Foo(object):
	@instancemethod
	@decorator
	def bar():
		return self, cls

When we exec the new method definition, we do so in a new namespace we’ve created. However, there is a possibility that the processing of this definition will try to load names that are not in this name space. We can try to use inspect.getouterframes to climb up to the proper frame and use f_locals and f_globals to provide all the names that should be in scope. However, actually writing this turns out to be very difficult, as it has to handle all the different places the definition could sit:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
# these will break, because we can't find `decorator'
def Foo():
	def decorator(f):
		return f
	def inner():
		class Foo(object):
			@instancemethod
			@decorator
			def bar():
				return self, cls
		return Foo()
	return inner()

def Foo():
	def decorator(f):
		return f
	def inner():
		class OuterFoo(object):
			class Foo(object):
				@instancemethod
				@decorator
				def bar():
					return self, cls
		return OuterFoo.Foo()
	return inner()

The above may seem like corner cases, but they are all things that a regular programmer might try and might expect to work.

It’s fair for us to expect all of these to work. All of these corner cases work with normal, non-source-text-manipulating decorators.

Please, don’t use this code.

Please don’t write code that does source text manipulation in a decorator.

I have seen such code in widespread production use. This code was written fairly sloppily, but managed to work in a lot of cases, as long as users followed a strict pattern. However, slight but entirely reasonable deviations from the pattern would cause it to fail in hard-to-predict and hard-to-debug ways. This lead to increasingly preposterous work-arounds which dramatically worsened the overall quality of the code base.

If you try to write something like this, it’s almost certain that you’ll miss one of these corner cases. If you manage to account for all of them, you’ll probably find there are other corner cases you didn’t account for. If you try to force users to stick to a set pattern which you can guarantee will work, you’ll find that your code cannot compose with many standard tools which will not hold themselves to your new constraints.

Next time, we’ll look at another approach to this problem.