a (monadic?) modelling problem in Python

2013.May.06

Say we have a business that is structured in the following way.

We manage fulfillment for a set of manufacturing groups. Each manufacturing group consists of a set of manufacturers. These sets of manufacturers can overlap (since there are a number of joint ventures). Each individual manufacturer has a warehouse for storing its inventories. Each warehouse contains multiple inventories for the different product lines of the manufacturer. Each inventory contains widgets that the manufacturer sells. These widgets may be common to several manufacturers, since some manufacturers sell white-labelled versions of off-shore, mass-produced widgets.

Here’s an example of how the structure of this data might look:

manufacturing_groups = { 'multiglobal manufacturing group',
                         'intergalactic joint ventures',
                         'transdimensional megacorp', }
manufacturers = { 'multiglobal manufacturing group': { 'red heron manufacturers',
                                                       'blue bass super-corp',
                                                       'yellow dog llc',
                                                       'orange penguin family manufacturers',
                                                       'green hornet industries', },
                 'intergalactic joint ventures': { 'yellow dog llc',
                                                   'purple butterfly inc', },
                 'transdimensional megacorp': { 'maroon baboon manufacturer',
                                                'orange penguin family manufacturers',
                                                'green hornet industries', }, }
warehouses = { 'red heron manufacturers': 'neptune building',
               'blue bass super-corp': 'saturn warehouse', 
               'yellow dog llc': 'mars storage', 
               'orange penguin family manufacturers': 'venus building', 
               'green hornet industries': 'jupiter building', 
               'purple butterfly inc': 'pluto ministorage', 
               'maroon baboon manufacturer': 'mercury warehouse', }
inventories = { 'neptune building': {'rose widgets', 'petunia widgets'},
                'saturn warehouse': {'rose widgets', 'daffodil widgets'},
                'mars storage': {'poppy widgets', 'forget-me-not widgets'},
                'venus building': {'goldenrod widgets'},
                'jupiter building': {'magnolia widgets'},
                'pluto ministorage': {'indian paintbrush widgets', 'daffodil widgets'},
                'mercury warehouse': {'carnation widgets'}, }
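
To make the shape of these lookups concrete, here is a small sketch that walks a trimmed copy of the dictionaries from one group down to the widgets in its warehouses. The helper name widgets_for_group is hypothetical, and the sketch assumes every manufacturer has exactly one warehouse, as in the example data.

```python
# Trimmed copy of the example data, just enough to run on its own.
manufacturers = {
    'intergalactic joint ventures': {'yellow dog llc', 'purple butterfly inc'},
}
warehouses = {
    'yellow dog llc': 'mars storage',
    'purple butterfly inc': 'pluto ministorage',
}
inventories = {
    'mars storage': {'poppy widgets', 'forget-me-not widgets'},
    'pluto ministorage': {'indian paintbrush widgets', 'daffodil widgets'},
}

def widgets_for_group(group):
    """Hypothetical helper: collect everything stored for one group's manufacturers."""
    found = set()
    for manufacturer in manufacturers[group]:
        warehouse = warehouses[manufacturer]  # assumed: one warehouse per manufacturer
        found |= inventories[warehouse]       # the entries held in that warehouse
    return found

print(sorted(widgets_for_group('intergalactic joint ventures')))
```

Even at this size, each hop is a hand-written lookup, which is the repetition the rest of this post is about.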

Let’s presume that we have rich objects representing each of these types: ManufacturingGroup, Manufacturer, Warehouse, Inventory, Widgets.

We’re asked to write functions to navigate through our data model. If we’re asked to write a function, widgets_from_inventory, that gives us all the widgets from an inventory that meet some predicate, we could write it in a number of different ways:

from collections.abc import Callable

def widgets_from_inventory(inventory, predicate):
	if not isinstance(inventory, Inventory):
		raise TypeError('inventory must be of type Inventory')
	if not isinstance(predicate, Callable):
		raise TypeError('predicate must be a Callable')
	for widget in inventory:
		if predicate(widget):
			yield widget

def widgets_from_inventories(inventories, predicate):
	if not isinstance(predicate, Callable):
		raise TypeError('predicate must be a Callable')
	for inventory in inventories:
		# check each inventory as we reach it
		if not isinstance(inventory, Inventory):
			raise TypeError('each inventory must be of type Inventory')
		for widget in inventory:
			if predicate(widget):
				yield widget

Which should we write? Well, widgets_from_inventory could be considered the simpler of the two, and widgets_from_inventories could be constructed from it.

from itertools import chain
widgets_from_inventories = lambda inventories, predicate: chain.from_iterable(widgets_from_inventory(inventory, predicate) for inventory in inventories)

# of course, we could always go the other way instead (one or the other, not both):
widgets_from_inventory = lambda inventory, predicate: widgets_from_inventories([inventory], predicate)
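
As a runnable illustration of the first direction, here is a minimal sketch that builds the plural function from the singular one. It drops the type checks and assumes, purely for the sake of the example, that an inventory is a plain list of widget names.

```python
from itertools import chain

def widgets_from_inventory(inventory, predicate):
    # Single-inventory version: lazily yield the matching widgets.
    return (widget for widget in inventory if predicate(widget))

def widgets_from_inventories(inventories, predicate):
    # Multi-inventory version, built by flattening the single-inventory results.
    return chain.from_iterable(
        widgets_from_inventory(inventory, predicate) for inventory in inventories
    )

# Stand-in data (an assumption): each inventory is a list of widget names.
inventories = [['rose widgets', 'petunia widgets'],
               ['rose widgets', 'daffodil widgets']]
starts_with_r = lambda widget: widget.startswith('r')

result = list(widgets_from_inventories(inventories, starts_with_r))
print(result)  # ['rose widgets', 'rose widgets']
```

Both functions stay lazy: nothing is filtered until the result is iterated.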

Let’s say that we want to write widget_descriptions_from_manufacturing_groups, and we’ve already written each one-step function.

manufacturers_from_manufacturing_group = lambda manufacturing_group: None # ...
warehouses_from_manufacturer = lambda manufacturer: None # ...
inventories_from_warehouse = lambda warehouse: None # ...
widgets_from_inventory = lambda inventory: None # ...

# every widget has a description associated with it
description_for_widget = lambda widget: None # ...

Can we write a simple framework or higher-order function that takes the above five and allows us to create a widget_descriptions_from_manufacturing_groups function?
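
One possible shape for such a higher-order function is a "flat" composition: apply each one-to-many step to every item produced so far and flatten the results. This is only a sketch. The name flat_compose is hypothetical, the one-step functions are replaced by a toy lookup table, and each step is assumed to return an iterable.

```python
from functools import reduce
from itertools import chain

def flat_compose(*steps):
    """Hypothetical helper: chain one-to-many steps into one function over a collection."""
    def run(items):
        # Apply each step to every item so far, flattening as we go.
        return reduce(
            lambda current, step: chain.from_iterable(map(step, current)),
            steps,
            items,
        )
    return run

# Toy stand-ins (assumptions): every one-step function is a lookup in this table.
tree = {
    'group': ['maker a', 'maker b'],
    'maker a': ['warehouse a'], 'maker b': ['warehouse b'],
    'warehouse a': ['inventory a'], 'warehouse b': ['inventory b'],
    'inventory a': ['rose'], 'inventory b': ['poppy', 'daffodil'],
}
manufacturers_from_manufacturing_group = lambda group: tree[group]
warehouses_from_manufacturer = lambda manufacturer: tree[manufacturer]
inventories_from_warehouse = lambda warehouse: tree[warehouse]
widgets_from_inventory = lambda inventory: tree[inventory]
description_for_widget = lambda widget: widget.upper()

widgets_from_manufacturing_groups = flat_compose(
    manufacturers_from_manufacturing_group,
    warehouses_from_manufacturer,
    inventories_from_warehouse,
    widgets_from_inventory,
)
widget_descriptions_from_manufacturing_groups = lambda groups: map(
    description_for_widget, widgets_from_manufacturing_groups(groups)
)

print(list(widget_descriptions_from_manufacturing_groups(['group'])))
# ['ROSE', 'POPPY', 'DAFFODIL']
```

Note that this version throws away every intermediate value, so it cannot say which group a description came from.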

Now, what if we want that function to be able to track the manufacturing group the description came from? Can we write a simple framework, a higher-order function, or a generator tool that gives us a widget_descriptions_from_manufacturing_groups that yields tuples of the form (manufacturing group, manufacturer, warehouse, inventory, widget, description)?

In most cases, we would write the function directly:

def widget_descriptions_from_manufacturing_groups(manufacturing_groups):
	for group in manufacturing_groups:
		for manufacturer in manufacturers_from_manufacturing_group(group):
			for warehouse in warehouses_from_manufacturer(manufacturer):
				for inventory in inventories_from_warehouse(warehouse):
					for widget in widgets_from_inventory(inventory):
						yield group, manufacturer, warehouse, inventory, widget, description_for_widget(widget)

But how could we write it in terms of three helper functions like the following?

fm = lambda gen, func: None # apply func across the results of gen
bd = lambda *gens: None # chain the generators together, collecting each intermediate value
lf = lambda gen: None # turn a generator that operates on a single value into one that operates on a Collection


widget_descriptions_from_manufacturing_groups = lambda manufacturing_groups: fm(bd(lf(manufacturers_from_manufacturing_group), warehouses_from_manufacturer, inventories_from_warehouse, widgets_from_inventory), description_for_widget)(manufacturing_groups)
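
As a rough sketch of what these three helpers could look like, here is one reading of the one-line descriptions above. It is an assumption-laden guess rather than a settled design: each chained value is carried as a growing tuple (a "trail"), fm follows the (gen, func) argument order of the stub, and the one-step functions are replaced by a toy lookup table.

```python
def lf(step):
    # Lift a one-value step onto a collection, pairing each input with each result.
    def lifted(items):
        for item in items:
            for result in step(item):
                yield (item, result)
    return lifted

def bd(first, *rest):
    # Chain the steps; every yielded tuple carries all intermediate values so far.
    def extend(trails, step):
        for trail in trails:
            for result in step(trail[-1]):
                yield trail + (result,)
    def bound(source):
        trails = first(source)
        for step in rest:
            trails = extend(trails, step)
        return trails
    return bound

def fm(gen, func):
    # Apply func to the last value of each trail and append the result.
    def mapped(source):
        for trail in gen(source):
            yield trail + (func(trail[-1]),)
    return mapped

# Toy stand-in (an assumption): every one-step function is a lookup in this table.
tree = {
    'group': ['maker a', 'maker b'],
    'maker a': ['warehouse a'], 'maker b': ['warehouse b'],
    'warehouse a': ['inventory a'], 'warehouse b': ['inventory b'],
    'inventory a': ['rose'], 'inventory b': ['poppy', 'daffodil'],
}
step = lambda key: tree[key]
describe = lambda widget: widget.upper()

# Four lookups: manufacturers, warehouses, inventories, widgets.
rows = list(fm(bd(lf(step), step, step, step), describe)(['group']))
for row in rows:
    print(row)
```

Each row has the form (group, manufacturer, warehouse, inventory, widget, description), matching what the hand-written nested loops yield.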

I think that we often run into problems that look like this. In most cases, we write all of our derived functions directly, since we don’t have nice itertools-like helpers for combining generators or higher-order functions. However, we should be able to write these helpers, and the nature of how we must model these helpers should tell us a lot about the problem and about the tools we’re using.

I’ll share a possible solution for this in a later blog post.