implicit self (part 1 of 4)

2013.May.03

This is a four-part post about “implicit self” in Python.

When writing a class in Python, we distinguish between instance methods, class methods, and static methods as follows.

An instance method is a method whose argspec (argument signature) contains one implicit parameter, typically named self. This parameter is a reference to the instance upon which the method was called. This is the default method type.

class Foo(object):
	# by default, bar is an instance method (strictly, it's a
	#   function whose __get__ returns a bound instancemethod)
	# it takes one implicit parameter, which references the instance of the class
	#   upon which the method was invoked
	def bar(self):
		print 'Foo.bar({})'.format(self)
		return self

foo = Foo()

# notice the asymmetry in the calling convention
assert foo.bar() is foo # invoking bar on an instance
assert Foo.bar(foo) is foo # invoking bar on the class

In the above, I call self an implicit parameter, because it is passed to bar implicitly when the method is invoked on an instance. However, if bar is invoked on the class, we have to pass this parameter explicitly.

The mechanism behind all of this is a descriptor on function objects. I’ll go into this mechanism in greater depth in a later blog post.
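As a rough preview of that mechanism (a minimal sketch, not the full story): looking up bar on an instance finds the plain function stored on the class and calls its __get__, which hands back the bound method. The class below simply mirrors the example above, and the snippet should behave the same on Python 2.6+ and Python 3.

```python
class Foo(object):
	def bar(self):
		return self

foo = Foo()

# the raw function as stored on the class, before any binding happens
func = Foo.__dict__['bar']

# invoking the descriptor by hand does what foo.bar does for us
bound = func.__get__(foo, Foo)

assert bound() is foo          # self was supplied implicitly
assert bound.__self__ is foo   # the bound method remembers its instance
assert bound == foo.bar        # equivalent to ordinary attribute access
```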

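In Python 2, there is another dimension to this mechanism: bound and unbound methods. (Unbound methods were removed in Python 3.) In short, foo.bar gives a bound method, in that the method is implicitly bound to the instance, whereas Foo.bar gives an unbound method, which has no instance attached. A comment in the CPython 2 source describes both cases: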

/* Instance method objects are used for two purposes:
   (a) as bound instance methods (returned by instancename.methodname)
   (b) as unbound methods (returned by ClassName.methodname)
   In case (b), im_self is NULL
*/
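To see the same distinction from the Python side, here is a small check written so that it should run under either major version. The attributes __self__ (on bound methods) and im_self (Python 2 only) are real; branching on sys.version_info is just my way of covering both cases in one snippet.

```python
import sys
import types

class Foo(object):
	def bar(self):
		return self

foo = Foo()

# accessed on the instance: a bound method that carries the instance
assert foo.bar.__self__ is foo

if sys.version_info[0] >= 3:
	# Python 3: accessed on the class, bar is just a plain function
	assert isinstance(Foo.bar, types.FunctionType)
else:
	# Python 2: accessed on the class, bar is an unbound method
	assert Foo.bar.im_self is None

# either way, the instance must be passed explicitly in this case
assert Foo.bar(foo) is foo
```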

Python also features class methods and static methods.

A class method is a method whose argspec contains one implicit parameter, typically named cls. This parameter is a reference to the class upon which the method was called or the class of the instance upon which the method was called. Class methods require the use of a decorator, classmethod.

class Foo(object):
	# bar is a class method
	# it takes one implicit parameter, which references the class
	#   upon which the method was invoked
	@classmethod
	def bar(cls):
		print 'Foo.bar({})'.format(cls)
		return cls

foo = Foo()
assert foo.bar() is Foo # invoking bar on an instance
assert Foo.bar() is Foo # invoking bar on the class

Note that, in the above, the parameter cls is implicitly bound whether we invoke the function off of the class or off of the instance.
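The phrase “the class upon which the method was called” matters once subclasses are involved. A minimal sketch, using hypothetical names Base, Derived and create that are not part of the examples above:

```python
class Base(object):
	@classmethod
	def create(cls):
		# cls is whichever class the call went through,
		#   not necessarily the class that defines create
		return cls()

class Derived(Base):
	pass

assert type(Base.create()) is Base
assert type(Derived.create()) is Derived    # invoked on the subclass
assert type(Derived().create()) is Derived  # invoked on a subclass instance
```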

A static method is a method whose argspec contains no implicit parameters. Static methods require the use of a decorator, staticmethod.

class Foo(object):
	# bar is a static method
	# it takes no implicit parameters
	@staticmethod
	def bar():
		print 'Foo.bar()'

foo = Foo()
assert foo.bar() is None # invoking bar on an instance
assert Foo.bar() is None # invoking bar on the class

There are a number of ways that we can conceptualise these three types of methods.

For example, we can note that when we invoke these methods on an instance, the calling convention is identical (since the self or cls parameters are bound implicitly). This gives us an escalation pathway. We can view the type of method (static/class/instance) as a marker for the function’s visibility: can it see no additional information (staticmethod), can it see information about the class (classmethod), or can it see information about the instance (instancemethod)? (Note that these cases overlap: an instancemethod can operate on the class by operating on type(self).)

Therefore, we could write our class with staticmethods to start and increase visibility only as necessary. The caller (as long as it operates on an instance) can be ignorant of these changes. We might do this as a mechanism for enforcing correctness: by restricting access to class and instance information, we can informally assert that our methods are unable to view or change class and instance state.
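To make the escalation pathway concrete, here is a sketch with a hypothetical Circle class. The three methods stand in for three successive revisions of one method; I have given them different names only so they can sit side by side in a single snippet.

```python
class Circle(object):
	unit = 'cm'

	def __init__(self, radius):
		self.radius = radius

	# revision 1: sees nothing beyond its own arguments
	@staticmethod
	def describe_v1():
		return 'a circle'

	# revision 2: escalated to see class-level state
	@classmethod
	def describe_v2(cls):
		return 'a circle measured in {}'.format(cls.unit)

	# revision 3: escalated to see instance state
	#   (and, via type(self), class state as well)
	def describe_v3(self):
		return 'a circle of radius {} {}'.format(self.radius, type(self).unit)

c = Circle(2)

# called on an instance, all three look the same to the caller
assert c.describe_v1() == 'a circle'
assert c.describe_v2() == 'a circle measured in cm'
assert c.describe_v3() == 'a circle of radius 2 cm'
```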

Unfortunately, the above approach runs into a few problems in practice, and these problems constrain the way that we must write our code. I’ll blog about them in a later post.

When we talk about explicit self in Python, we generally refer to the need to explicitly specify this implicitly bound parameter. In contrast, in the context of a member function, the this keyword in C++ refers to the instance of the class that function was invoked upon.

The keyword this names a pointer to the object for which a non-static member function is invoked or a non-static data member’s initializer is evaluated. –ISO C++ Standards Committee

Could we modify Python to allow us to implicitly reference self and cls? How might this work and what are the implications of the different approaches?

How can we inject self or cls into a function without having to specify it explicitly? Well, isn’t this precisely what a closure does?

class Foo(object):
	@apply
	class bar(object):
		def __get__(desc, self, cls):

			# the method itself, notice no
			#   explicit reference to self
			#   because method closes around
			#   the instance 

			def bar():
				return self

			return bar

foo = Foo()

# note the calling convention of the below is identical!
assert foo.bar() is foo  # invoked on the instance, get the instance
assert Foo.bar() is None # invoked on the class, get None

In the example above, we wrap our method in a descriptor, which receives a reference to the instance and to the class through which it is accessed. The inner function bar closes over these variables (here it uses only self). As a consequence, we no longer have an asymmetry in the calling convention: both foo.bar() and Foo.bar() are called with no arguments.
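The listing above leans on the apply builtin, which exists only in Python 2. As a hedged sketch of the same idea that should also run on Python 3, we can instantiate the descriptor class by hand. The name _BarDescriptor is my own placeholder and not part of the original example.

```python
class Foo(object):
	class _BarDescriptor(object):
		def __get__(desc, self, cls):
			# self is the instance (or None when accessed on the class),
			#   cls is the class; the inner function closes over them
			def bar():
				return self
			return bar

	# instantiating by hand replaces the @apply trick
	bar = _BarDescriptor()

foo = Foo()

assert foo.bar() is foo   # invoked on the instance, get the instance
assert Foo.bar() is None  # invoked on the class, get None
```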

In the next three posts, we’ll investigate three ways to accomplish the above.