- Understanding generators
Consider the following function:
def specialRange(n):
# seemingly the same as range(n), but omits multiples of 3
result = [ ]
for x in range(n):
if (x % 3 != 0):
result.append(x)
return result
This function appears to work the same as range(n), but
omits the multiples of 3. To wit:
for v in specialRange(10):
print(v) # prints 1 2 4 5 7 8
Yep, this looks like it works just fine. But it doesn't!
Let's try that last example again, only with a much larger
range, up to 10**8 say. And we can't print out so many values,
so we'll break after we print the first value, like so:
# this time we'll see a problem....
for v in specialRange(10**8):
print(v)
break
Did you see the problem? There was a huge delay before anything
printed. Why? Because we are constructing an enormous list of values,
and Python must build the whole list before we can use any of it.
At 10**8 that takes a long time. By 10**10 it may take impossibly long,
plus we will also soon run out of memory. So this is clearly
not working well for large n.
As you may have guessed, there is a better way. We can convert
our specialRange function into a generator. Instead of constructing
a list and returning it, we can yield each value that we would
have placed on the list. Python stops running the function at that
point, uses the yielded value, then starts the function right where
it left off to get the next yielded value. Awesome!
To make this work, we simply change our specialRange function from this:
def specialRange(n):
# seemingly the same as range(n), but omits multiples of 3
result = [ ]
for x in range(n):
if (x % 3 != 0):
result.append(x)
return result
To this:
def specialRange(n):
# this time as a generator!
for x in range(n):
if (x % 3 != 0):
yield x
Then re-run the examples above that use specialRange(n),
and you will see that they work
exactly the same with the generator, only there is no delay
at all, even for huge n. Try this:
for v in specialRange(10**50):
# 10**50 is ridiculously huge, yet this still works just fine with a generator
print(v)
break
The generator never creates the list, it just creates one value of
the list at a time. This is a beautiful solution!
- The allSublists(L) generator
Now you get to actually write your own generator! The goal
is to the write the generator allSublists(L) that takes a possibly-empty
list L, and generates all the sublists of L -- that is, all the lists
made up of some (or all) of the elements of L. If L was a set and
not a list, this is what we would call the powerset of L.
Here are some test cases:
assert(sorted(allSublists([1])) == [ [], [1] ])
assert(sorted(allSublists([3, 5])) == [ [], [3], [3, 5], [5] ])
assert(sorted(allSublists([6,7,8])) == [ [], [6], [6, 7], [6, 7, 8],
[6, 8], [7], [7, 8], [8] ])
We include a call to sorted
in the test cases because
your generator can produce the sublists in any order (so we sort
them and compare the result to the sorted order).
Now, how to do this? We will first observe that a list with N elements
in it has 2**N sublists (think about that). Then, we realize that we
can have a counter k run from 0 to (2**N)-1, and we can use that counter
to generate the kth sublist. How? By considering k in base 2, and
using 0's to exclude values and 1 to include them in the sublist.
You may have to re-read that a couple times. Plus, an example would help.
Say we have the list L = [5,6,7,8]. Here, N=4, and there are 2**4 or 16 sublists. So we run k from 0 to 15 inclusive. What happens when k is,
say, 13? Well, 13 in base 2 is 1101. We use those 4 digits to determine
whether each of the 4 values of the list L are in this sublist:
L = [ 5, 6, 7, 8 ]
k = 1 1 0 1 (13)
Look at the figure above, and see that there is a 1 under 5, 6, and 8,
and a 0 under 7. So when k==13, the sublist is [5, 6, 8].
Use this technique to write the generator allSublists(L)
.
Also, as a hint, you probably should write getKthBinaryDigit(n, k),
which works the same as getKthDigit(n, k) only in binary instead of decimal
(basically, change the 10 to a 2 and you're done).
- Solving subsetSum
Ok, now we can use that generator to solve a real problem!
The problem we will try to solve is called subsetSum.
Given a list L, return a sublist of L that sums to 0, or
return None if no such sublist exists. We will discuss this
problem a lot more, later on in this course, since it has
some fascinating properties. For now, we'll just try to
solve it.
How do we solve it? Quite easily at this point, in fact.
We just use the allSublists generator you just wrote, and
check if each sublist sums to 0 (using sum(sublist)).
This should be quick!
One word of caution: while this does solve the problem,
the solution is very slow (in fact, exponentially slow).
So only test it on very small lists.
In any case, the starter file includes testSolveSubsetSum
to help check if you got this step right.
- The heapsAlgorithmForPermutations(L) generator
Ok, so we now understand generators and how they can be
used to solve some kinds of puzzle problems (basically by
trying every possible answer). We'll move on to a new
kind of generator. This time, instead of allSublists, we
are interested in permutations. That is, for a list L,
every possible way to order the values in L.
There are numerous ways to solve this, but we will use a specific
approach: namely, the iterative (non-recursive)
form of Heap's Algorithm. You can read about it
on the linked Wikipedia page if you wish, or not, as you can
do this problem without fully understanding how Heap's
Algorithm works. You just have to translate it into a Python
generator. This is the pseudocode from that website:
procedure generate(n : integer, A : array of any):
//c is an encoding of the stack state. c[k] encodes the for-loop counter for when generate(k+1, A) is called
c : array of int
for i := 0; i < n; i += 1 do
c[i] := 0
end for
output(A)
//i acts similarly to the stack pointer
i := 0;
while i < n do
if c[i] < i then
if i is even then
swap(A[0], A[i])
else
swap(A[c[i]], A[i])
end if
output(A)
//Swap has occurred ending the for-loop. Simulate the increment of the for-loop counter
c[i] += 1
//Simulate recursive call reaching the base case by bringing the pointer to the base case analog in the array
i := 0
else
//Calling generate(i+1, A) has ended as the for-loop terminated. Reset the state and simulate popping the stack by incrementing the pointer.
c[i] := 0
i += 1
end if
end while
Your task here is to closely translate that pseudocode into
the generator heapsAlgorithmForPermutations(L).
Here are some test cases for you:
assert(sorted(heapsAlgorithmForPermutations([1])) == [[1]])
assert(sorted(heapsAlgorithmForPermutations([1,2])) == [
[1,2], [2,1]
])
assert(sorted(heapsAlgorithmForPermutations([3,1,2])) == [
[1,2,3], [1,3,2], [2,1,3], [2,3,1], [3,1,2], [3,2,1]
])
Important Hint: use yield copy.copy(A)
instead of yield A
. This avoids a challenging bug
to find based on aliasing.
- Solving cryptarithms
Ok, we're on the last step! Here, we will use the permutation generator
you just wrote to solve cryptarithms! First, read the solvesCryptarithm
problem above. Here, instead of testing if a solution
solves a cryptarithm, we will find the solution that
solves a cryptarithm. Sweet!
- solveCryptarithm
We will write the function solveCryptarithm(puzzle) (which is different than solvesCryptarithm)like so:
- extract the 3 words from the puzzle (so go from 'SEND + MORE = MONEY' to the list ['SEND', 'MORE', 'MONEY']).
- generate a string of the unique letters in the 3 words in the puzzle (so for 'SEND + MORE = MONEY', this would be 'SENDMORY'
(in any order)).
- augment that string with dashes so it is of length 10 (so
the string above becomes 'SENDMORY--'
- Now we observe that any possible solution to the puzzle
must be a permutation of the string we just made.
This is key. So...
- We use the generator we just wrote to generate every possible
permutation of the string, and we check each one in turn
to see if it does in fact solve the puzzle. If so, we return
it (actually, as you will see in the test cases, we return
a more-nicely-formatted version of it). If none of the permutations
solves the puzzle, then we return None.
That's it. We're done! Except... This runs SO slowly that it is
almost not even testable. Ugh. So... We will modify the problem
a little bit, constraining it so that we can in fact test it.
- solveCryptarithmWithMaxDigit
For that, we will add a maxDigit
, and require that the
solution not contain any digits larger than maxDigit. Thus,
we change solveCryptarithm(puzzle)
to
become solveCryptarithmWithMaxDigit(puzzle, maxDigit)
.
This way, we only construct permutations of strings of length
(maxDigit+1), and then we append (10 - (maxDigit+1)) dashes to
those strings to make candidate solutions to use as above.
Read this a few times and think about it!
Here is a test function for you:
def testSolveCryptarithmWithMaxDigit():
print(' Testing solveCryptarithmWithMaxDigit()...', end='')
assert(solveCryptarithmWithMaxDigit('RAM + RAT = ANT', 4) == '''\
RAM + RAT = ANT
120 + 123 = 243''')
assert(solveCryptarithmWithMaxDigit('ANT + CAT = EEL', 4) == None)
assert(solveCryptarithmWithMaxDigit('ANT + CAT = EEL', 5) == '''\
ANT + CAT = EEL
125 + 315 = 440''')
print('Passed!')
- countCryptarithmsWithMaxDigit
Ok, we are getting close. To best test these puzzles, it would
be nice to find puzzles that have a single unique solution (can you
see why?). Ok, let's do that. For this, you will write the function
countCryptarithmsWithMaxDigit(puzzle, maxDigit)
.
This requires just a tiny change or two to the function
you just wrote,
solveCryptarithmWithMaxDigit(puzzle, maxDigit)
.
Instead of returning the first match, now you just count all
the matches, and return that count. That's it!
- getAllSingletonCryptarithmsWithMaxDigit
Ok, now we can write the last function for this exercise,
getAllSingletonCryptarithmsWithMaxDigit(words, maxDigit)
.
This function takes a list of words, from which it will generate
possible puzzles. Then it will call your countCryptarithmsWithMaxDigit
function on each such puzzle,
and if the result is 1, then that means that that puzzle has a
single unique solution (with the maxDigit restriction). Great!
We will include that puzzle in our result, which is just a multiline
string of all such puzzles.
Note that your output must list the puzzles in alphabetical order.
Note that 'A + B = C' and 'B + A = C' are considered the same
problem, so only the first would be included in our answer, as
'A' is alphabetically before 'B'.
To do that, our function first sorts the words,
which helped to that end.
We also included 3 nested 'for' loops -- one for each of the 3
words in the puzzle.
Finally, here is a test function you may want to closely study:
def testGetAllSingletonCryptarithmsWithMaxDigit():
print(' Testing getAllSingletonCryptarithmsWithMaxDigit()...', end='')
words = ['EEL', 'RAM', 'CAT', 'BEE', 'FLY',
'HEN', 'RAT', 'DOG', 'ANT']
assert(getAllSingletonCryptarithmsWithMaxDigit(words, 3) == '')
assert(getAllSingletonCryptarithmsWithMaxDigit(words, 4) == '''\
RAM + RAT = ANT
120 + 123 = 243''')
assert(getAllSingletonCryptarithmsWithMaxDigit(words, 5) == '''\
ANT + CAT = EEL
125 + 315 = 440
ANT + CAT = HEN
105 + 315 = 420
ANT + RAT = EEL
125 + 315 = 440
ANT + RAT = HEN
105 + 315 = 420
BEE + EEL = FLY
411 + 112 = 523''')
words = ['DEER', 'BEAR', 'GOAT', 'MULE', 'PUMA',
'COLT', 'ORCA', 'IBEX', 'LION', 'WOLF']
assert(getAllSingletonCryptarithmsWithMaxDigit(words, 5) == '')
assert(getAllSingletonCryptarithmsWithMaxDigit(words, 6) == '''\
BEAR + DEER = IBEX
4203 + 1223 = 5426
COLT + GOAT = ORCA
4635 + 1605 = 6240''')
print('Passed!')
That's it, you made it, you solved two really interesting problems
using two different combinatorial generators. Great job!!!!