Here are a few ways we can take a nested sequence and flatten it.
In the following examples, I’ll assume we’re looking at a list-of-lists: a structure that needs to be flattened once to produce a list.
I’ll call the list-of-lists
nested and each list within that will be designated
xs and each element within that will be designated
Let’s start with the simplest, most naïve approach.
1 2 3 4 5 6 7 8 9 def flatten(nested): temp =  for xs in nested: for x in xs: temp.append(x) return temp data = [[1,2,3],[4,5,6]] assert flatten(data) == [1,2,3,4,5,6]
The above code will work not only for a list of lists but also for a generator of lists or a generator of generators. (In fact, all of the examples provided here will work for generators of generators or lists of lists or structures in-between.)
If we look through the standard library, we’ll come across
itertools.chain which allows us the following formulations.
1 2 3 4 5 from itertools import chain flatten = lambda nested: chain(*nested) data = [[1,2,3],[4,5,6]] assert list(flatten(data)) == [1,2,3,4,5,6]
chain allows us to provide an iterable, so we don’t have to do explicit unpacking with the asterisk/argument unpacking operator.
1 2 3 4 5 from itertools import chain flatten = lambda nested: chain.from_iterable(nested) data = [[1,2,3],[4,5,6]] assert list(flatten(data)) == [1,2,3,4,5,6]
These two formulations have the advantage that they do not prematurely materialise our flattened structure. This might be useful if we have memory constraints or if we wanted to feed the result into another data structure (a set or some other Container) without first putting it into a list.
Interestingly enough, we can perform this flattening without any libraries.
Comprehension syntax allows us muliple
for clauses, and we can use these to iterate over the elements of the nested structure.
1 2 3 4 5 6 7 8 9 10 11 12 13 from itertools import chain flatten = lambda nested: (x for xs in nested for x in xs) data = [[1,2,3],[4,5,6]] assert list(flatten(data)) == [1,2,3,4,5,6] # or, if we want to materialise eagerly: from itertools import chain flatten = lambda nested: [x for xs in nested for x in xs] data = [[1,2,3],[4,5,6]] assert flatten(data) == [1,2,3,4,5,6]
Now, these are all well and good, but I think a lot of folks came here to see code they seriously shouldn’t ever use.
To that effect, I present the following list flattening function (Python 3.3< only):
1 2 3 4 5 6 7 from sys import version_info assert version_info.major == 3 and version_info.minor >= 3, 'Python 3.3< only!' flatten = lambda nested: (None for xs in nested if (yield from xs) and False) data = [[1,2,3],[4,5,6]] assert list(flatten(data)) == [1,2,3,4,5,6]
Did you know that you can put an explicit
yield in a generator expression?
* why do explicit
yield from in generator expressions work?
* what is
yield from, anyway?
* what does the following do, why, and how does it work?
(((yield x), (yield x)) for x in xrange(10))
* should I write or publish a library function for flattening higher order sequences of sequences? (quick answer: probably not)
* why are strings kind of weird? (quick answer: they are structures with infinite, nested iterability; there is no constituent scalar element type.)