implicit self (part 2 of 4) with eval

2013.May.07

In the last post of this series, we looked at instance, class, and static methods in Python. We looked at explicit self and looked at an example of what we might call implicit self.

1 class Foo(object):
2 	@apply
3 	class bar(object):
4 		def __get__(desc, self, cls):
5 			def bar():
6 				return self
7 			return bar

In this example, bar has self in scope but without having it provided implicitly as part of the instance method descriptor and without having to specific it explicitly in the method signature.

The Python documentation provides a brief primer on descriptors. In short, when an attribute lookup is performed for a class, the interpreter checks whether the attribute being retrieved is a descriptor. If so, instead of returning the value of that attribute, the interpreter returns the result of invoking the descriptor’s __get__ method.

In a later post, I’ll show all the various ways that we can retreive an attribute from an object in Python (including how __slots__ factors into this.)

 1 class Descriptor(object):
 2 	# the __get__ function is invoked with
 3 	#   the instance that we're doing
 4 	#   the lookup on (`foo'), the owner, and
 5 	#   the type of that instance (`Foo'), the
 6 	#   owner's type
 7 	# the __get__ function looks like an instance
 8 	#   function like any other that takes
 9 	#   an implicit first parameter, self
10 	# indeed, this first parameter references
11 	#   the instance of the descriptor object
12 	#   which is unique to the owner's class but
13 	#   to the owner's instance
14 	# unfortunately, it isn't actually the case
15 	#   that this parameter is an implicitly provided
16 	#   `self' and that the mechanism for retrieving this
17 	#   function from the descriptor object follows
18 	#   the unified attribute retrieval protocol
19 	# to see this disunity in action, try making __get__
20 	#   a @classmethod or @staticmethod
21 	# if __get__ were a regular instance method being
22 	#   called on a regular class, then we would expect
23 	#   to be able to make it a static method or class method
24 	# this doesn't work
25 	# it only happens to work if __get__ is a regular
26 	#   instance method
27 	# and, in fact, the mechanism by which __get__ is invoked
28 	#   is itself coincidentally similar to but not unified
29 	#   with the mechanism by which an instance method is invoked
30 	def __get__(descr, owner, owner_type):
31 		return descr, owner, owner_type
32 
33 class Foo(object):
34 	bar = Descriptor()
35 
36 foo = Foo()
37 
38 # when lookup up `bar' in foo, 
39 #   the interpreter sees that
40 #   it is an object and that
41 #   this object has a __get__
42 #   member function
43 # therefore, instead of returning
44 #   this object, the interpreter
45 #   invokes the __get__ member
46 #   function and returns that value
47 foo.bar

In the above, the descriptor allows us to get a reference to the instance (self) and the class (cls.) We make them available to our member function bar by allowing bar to close over them.

I will explain how this mechanism works in the next post.

If we wanted to automatically capture these variables in a closure, we could try something like the following. We’ll write a decorator called instancemethod that will convert a regular instance method into something looking like the above. In order to do this, the decorator will use the inspect module to retrieve the textual source code for the defined method. It will parse this source code to remove @instancemethod, remove the initial indentation, then indent it so it can be inserted into our template. Then, the decorator will paste this transformed source code into a template that defines a new descriptor with a __get__ method that returns our function, having it close around self and cls.

 1 from inspect import getsource
 2 from textwrap import dedent
 3 from re import split
 4 from itertools import takewhile, imap, dropwhile, chain, ifilter
 5 def instancemethod(f): 
 6 	# retrieve and parse source
 7 	method_source = '\n'.join(                 # build the text block by
 8 	  imap(lambda line: '\t\t{}'.format(line), #   indenting all of the lines
 9 	    (lambda source: chain(                 #   where the lines are the concatenation of
10 	                                           #   any decorators except @instancemethod
11 	      takewhile(lambda line: not line.strip().startswith('def'), 
12 	        ifilter(lambda line: not line.strip().startswith('@instancemethod'), source)),
13 	                                           #   and the source that follows
14 	      dropwhile(lambda line: not line.strip().startswith('def'), source)))
15                                                #   operating on the original textual source
16 	  (split('\r|\n', dedent(getsource(f))))))
17 	method_name = f.__name__
18 
19 	# paste it into a new template
20 	method = dedent('''
21 	@apply
22 	class {method_name}(object):
23 		def __get__(descr, self, cls):
24 	{method_source}
25 			return {method_name}
26 	''').format(method_source = method_source, method_name = method_name)
27 
28 	# execute the new code string in a namespace
29 	#   and retrieve & return the resulting method
30 	namespace = globals().copy()
31 	exec method in namespace 
32 	return namespace[method_name]
33 
34 class Foo(object):
35 	@instancemethod
36 	def bar():
37 		return self, cls
38 
39 foo = Foo()
40 assert foo.bar() == (foo, type(foo))

Please, don’t use this code!

One particular reason to not use this code is that this source code transformation technique breaks in couple of ways. For example, if we try to use it at the interactive console, the inspect.getsource won’t be able to find the textual source code, so it won’t be able to rewrite our instance methods!

 1 >>> from implicitself import instancemethod
 2 >>> class Foo(object):
 3 ...   @instancemethod
 4 ...   def bar():
 5 ...     return self, cls
 6 ... 
 7 Traceback (most recent call last):
 8   File "<stdin>", line 1, in <module>
 9   File "<stdin>", line 2, in Foo
10   File "implicitself.py", line 16, in instancemethod
11     (split('\r|\n', dedent(getsource(f))))))
12   File "/usr/lib/python2.7/inspect.py", line 701, in getsource
13     lines, lnum = getsourcelines(object)
14   File "/usr/lib/python2.7/inspect.py", line 690, in getsourcelines
15     lines, lnum = findsource(object)
16   File "/usr/lib/python2.7/inspect.py", line 538, in findsource
17     raise IOError('could not get source code')
18 IOError: could not get source code

Additionally, if we try to decorate any function that does not have source code, this will fail.

1 # doesn't work, since we can't find the
2 #   source code for a C-function like
3 #   the builtin len
4 class Foo(object):
5 	bar = instancemethod(len)

If one of our decorators wraps the method before instancemethod is invoked, inspect.getsource will get the source of the wrapper function as opposed to the original definition of the member function. This is an opaque wrapping. The wrapped, inner function whose source text we want will show up in the wrapper’s closure. In practice, it will be very difficult for us to identify this case and to pick out the function whose source text we want.

As a result, we probably will not be able to make the following work:

 1 # we'll accidentally get the source for `wrapped'
 2 #   instead of for `bar'
 3 
 4 from functools import wraps
 5 def decorator(f):
 6 	@wraps(f)
 7 	def wrapped(*args, **kwargs):
 8 		return f(*args, **kwargs)
 9 	return wrapped
10 
11 from implicitself import instancemethod
12 class Foo(object):
13 	@instancemethod
14 	@decorator
15 	def bar():
16 		return self, cls

In our parsing, we need to skip the @instancemethod decoration, and we do so by textually removing it. However, we can’t be guaranteed that we can statically identify the decorator to remove. Since we can’t determine what the decorator looks like, our parsing may fail to remove it from our textual source code, instancemethod will call instancemethod and we’ll loop until we hit the recursion limit.

 1 import implicitself
 2 
 3 # doesn't work, since we only look for
 4 #   `@instancemethod'
 5 class Foo(object):
 6 	@implicitself.instancemethod
 7 	def bar():
 8 		return self, cls
 9 
10 # doesn't work, since we have no way
11 #   to statically determine
12 #   which decorator is actually
13 #   the one we need to remove
14 from implicitself import instancemethod
15 def decorator(f):
16 	return f
17 decorator, instancemethod = instancemethod, decorator
18 class Foo(object):
19 	@instancemethod
20 	@decorator
21 	def bar():
22 		return self, cls

When we exec the new method definition, we do so in a new namespace we’ve created. However, there is a possibility that the processing of this definition will try to load names that are not in this name space. We can try to use inspect.getouterframes to climb up to the proper frame and use f_locals and f_globals to provide all the names that should be in scope. However, actually writing this turns out to be very difficult, as it has to handle all the different places the definition could sit:

 1 # these will break, because we can't find `decorator'
 2 def Foo():
 3 	def decorator(f):
 4 		return f
 5 	def inner():
 6 		class Foo(object):
 7 			@instancemethod
 8 			@decorator
 9 			def bar():
10 				return self, cls
11 		return Foo()
12 	return inner()
13 
14 def Foo():
15 	def decorator(f):
16 		return f
17 	def inner():
18 		class OuterFoo(object):
19 			class Foo(object):
20 				@instancemethod
21 				@decorator
22 				def bar():
23 					return self, cls
24 		return OuterFoo.Foo()
25 	return inner()

The above may seem like corner cases, but they are all things that a regular programmer might try and might expect to work.

It’s fair for us to expect all of these to work. All of these corner cases work with normal, non-source-text-manipulating decorators.

Please, don’t use this code.

Please don’t write code that does source text manipulation in a decorator.

I have seen such code in widespread production use. This code was written fairly sloppily, but managed to work in a lot of cases, as long as users followed a strict pattern. However, slight but entirely reasonable deviations from the pattern would cause it to fail in hard-to-predict and hard-to-debug ways. This lead to increasingly preposterous work-arounds which dramatically worsened the overall quality of the code base.

If you try to write something like this, it’s almost certain that you’ll miss one of these corner cases. If you manage to account for all of them, you’ll probably find there are other corner cases you didn’t account for. If you try to force users to stick to a set pattern which you can guarantee will work, you’ll find that your code cannot compose with many standard tools which will not hold themselves to your new constraints.

Next time, we’ll look at another approach to this problem.