7. Generators and Iterators
Let’s change topics for a moment (we’ll get to context managers shortly, which is what we are building toward). Let’s look into iterators, of which generators are a special form. I’m sure you’ve seen one:
for i in range(4):
    print(i)
0
1
2
3
But what is range(4)? It’s not a list; it’s a custom object:
range(4)
range(0, 4)
Python has the concept of iteration built in. What the various looping structures do is call iter() on the object first, then call next() on the result over and over until a StopIteration exception is raised. Try it:
it = iter(range(0, 4))
next(it)
0
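To make the protocol concrete, here is a rough sketch of what a for loop does, written out by hand. The function name manual_for is made up for this illustration; it is not something Python provides.

```python
def manual_for(iterable):
    """Collect items the way a for loop would, using iter() and next() directly."""
    collected = []
    it = iter(iterable)          # step 1: ask the object for an iterator
    while True:
        try:
            item = next(it)      # step 2: request the next value
        except StopIteration:    # step 3: the iterator signals it is finished
            break
        collected.append(item)   # this is where the loop body would run
    return collected


result = manual_for(range(4))
print(result)  # [0, 1, 2, 3]
```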
7.1. Defining iterators
You could implement __iter__ and __next__ yourself, but Python has a built-in syntax shortcut for making iterators (and generators, which have a send method too):
def range4():
    yield 0
    yield 1
    yield 2
    yield 3
A function that has at least one yield in it becomes a factory function that returns a generator (an iterator).
range4()
<generator object range4 at 0x7fec5e182e30>
for i in range4():
    print(i)
0
1
2
3
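For comparison, here is roughly what the hand-written __iter__ and __next__ route mentioned earlier could look like. The class name Range4 and its attribute are invented for this sketch.

```python
class Range4:
    """Hand-written iterator that produces 0, 1, 2, 3 (hypothetical name)."""

    def __init__(self):
        self._next_value = 0

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        if self._next_value >= 4:
            raise StopIteration   # tells the loop there is nothing left
        value = self._next_value
        self._next_value += 1
        return value


hand_written = list(Range4())
print(hand_written)  # [0, 1, 2, 3]
```

The yield version above says the same thing in five lines, with Python keeping track of the position for you.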
The presence of a single yield anywhere in a function turns it into a generator function. Notice that “calling” the factory function just produces a generator object; it does not run anything yet. Then, when you iterate over it, the function body runs and “pauses” at each yield.
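You can watch the pausing happen by recording what the body does. This is only a small sketch; the names noisy and log are made up for the demonstration.

```python
log = []


def noisy():
    log.append("started")
    yield 0
    log.append("resumed after first yield")
    yield 1
    log.append("finished")


gen = noisy()
after_call = list(log)      # []: calling the function ran none of the body

first = next(gen)           # runs up to the first yield
after_first = list(log)     # ["started"]

second = next(gen)          # resumes and runs up to the second yield
after_second = list(log)    # ["started", "resumed after first yield"]

print(after_call, after_first, after_second)
```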
Many Python functions take iterables, like list and tuple:
list(range4())
[0, 1, 2, 3]
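A few more consumers, as a quick sketch. The range4 generator is repeated here so the snippet runs on its own.

```python
def range4():
    yield 0
    yield 1
    yield 2
    yield 3


as_tuple = tuple(range4())   # (0, 1, 2, 3)
total = sum(range4())        # 6
largest = max(range4())      # 3
first, *rest = range4()      # unpacking iterates too: first is 0, rest is [1, 2, 3]

print(as_tuple, total, largest, first, rest)
```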
If Python were rewritten today, there would likely be a keyword, like iter def, to indicate that a def is making an iterator instead of a normal function; but for historical reasons, you just have to look for yield statements inside the function.
If you like list comprehensions:
[a for a in range(4)]
[0, 1, 2, 3]
Then you’ll be glad to know there is a generator comprehension too:
(a for a in range(4))
<generator object <genexpr> at 0x7fec5e1830d0>
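Like any generator, the comprehension form computes values only when asked. A small sketch (the variable names are just for illustration):

```python
squares = (n * n for n in range(4))

first = next(squares)        # 0: only the first value has been computed
remaining = list(squares)    # [1, 4, 9]: the rest, consumed on demand

# When a generator comprehension is the only argument to a function,
# the extra parentheses can be left off.
total = sum(n * n for n in range(4))   # 14

print(first, remaining, total)
```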
What about restarting?
Generators are often “one shot”, and expected to be recreated if needed again - this is true for the yield-based syntax above. But you can make an iterable that supports multiple passes if you implement it yourself. In fact, range supports this:
r = range(4)
print(list(r), list(r))
[0, 1, 2, 3] [0, 1, 2, 3]
But normally, you use them in place, such as list(range(4)), so it is not often missed if they can’t be restarted.
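Here is one possible way to get a restartable version, assuming you are willing to write a small class: make __iter__ itself a generator, so every pass gets a fresh one. The class name Range4 is made up for this sketch.

```python
def range4():
    yield 0
    yield 1
    yield 2
    yield 3


gen = range4()
first_pass = list(gen)    # [0, 1, 2, 3]
second_pass = list(gen)   # []: the generator is already used up


class Range4:
    """Iterable (not an iterator): each call to __iter__ starts over."""

    def __iter__(self):
        yield 0
        yield 1
        yield 2
        yield 3


r = Range4()
both_passes = (list(r), list(r))

print(first_pass, second_pass, both_passes)
```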
7.2. Factoring iterators and generators
You can also factor out generators, just like you can factor out functions:
def middle_two():
    yield 1
    yield 2

def range4_factored():
    yield 0
    yield from middle_two()
    yield 3
list(range4_factored())
[0, 1, 2, 3]
You might be tempted to place a loop inside the generator with a yield (for item in middle_two(): yield item), but yield from is simpler and also works correctly with two-way generators (next section).
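For plain iteration the two spellings give the same values, as this sketch shows. The function names with_loop, with_yield_from, and from_any_iterable are made up for the comparison.

```python
def middle_two():
    yield 1
    yield 2


def with_loop():
    yield 0
    for item in middle_two():
        yield item
    yield 3


def with_yield_from():
    yield 0
    yield from middle_two()
    yield 3


def from_any_iterable():
    # yield from accepts any iterable, not only generators.
    yield 0
    yield from [1, 2]
    yield 3


loop_result = list(with_loop())
yield_from_result = list(with_yield_from())
iterable_result = list(from_any_iterable())

print(loop_result, yield_from_result, iterable_result)
```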
7.3. General generators
A generator that only produces values is being used purely as an iterator, and that’s mostly what you directly see. Generators can also support two-way communication. You rarely need this, and there really isn’t a nice syntax for sending information to a generator, but here is one way, just as an example:
def generator():
    received = yield 1
    print("Received", received)
    received = yield 2
    print("Received", received)

# Prepare generator
active = iter(generator())
print("Running first send")
print(f"{active.send(None) = }")
print("Running second send")
print(f"{active.send(10) = }")
try:
    active.send(20)
except StopIteration:
    print("Done")
Running first send
active.send(None) = 1
Running second send
Received 10
active.send(10) = 2
Received 20
Done
next(active) is the same thing as active.send(None). The first send must always be None, since the generator hasn’t reached the first = sign yet. The final send does not need to be None: the second yield above also assigns its result, so send(20) prints “Received 20” before the generator finishes and raises StopIteration.
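Two of the points above can be checked with a short sketch: the first send has to be None, and yield from passes sent values through to the inner generator while a plain loop does not. The names echo, delegating, and looping are invented for this illustration.

```python
def echo():
    """Yield back whatever was most recently sent in (hypothetical helper)."""
    received = None
    while True:
        received = yield received


def delegating():
    # yield from passes send() values through to the inner generator.
    yield from echo()


def looping():
    # A plain loop only calls next() on the inner generator,
    # so values sent to this outer generator are dropped.
    for item in echo():
        yield item


# A just-started generator rejects a non-None first send.
fresh = echo()
try:
    fresh.send("too early")
    first_send_failed = False
except TypeError:
    first_send_failed = True

# Prime each generator with next(), which is the same as send(None).
d = delegating()
next(d)
via_yield_from = d.send("hi")   # "hi": the inner generator received it

lp = looping()
next(lp)
via_loop = lp.send("hi")        # None: the inner generator never saw "hi"

print(first_send_failed, via_yield_from, via_loop)
```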