Why Not Python?, Part 4
To solve the puzzle presented and discussed in
Part 3 of this
series, as well as similar puzzles, I have to figure
out how to code Step 2.6 in Python. (See Part 3 for an outline of all the
steps in this problem-solving exercise.) The code should be
straightforward, except for saving (Step 2.6.1) and restoring (Step 2.6.4)
values of compound data structures.
In fact, this part of the code turned out to be harder to write than I
expected. A list of my mistakes can be seen below in Appendix B, but
the important lesson is this: simply assigning a list doesn't copy it, as
Learning Python says. Instead, another reference
is created to the same list. Here's where we bump into that issue:
% python Python 2.4 (#1, Mar 22 2005, 21:42:42) [GCC 3.3.5 20050117 (prerelease) (SUSE Linux)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> old = [ 1, 2, 3 ] >>> new = old >>> old.append(4) >>> old [1, 2, 3, 4] >>> new [1, 2, 3, 4] >>>
Here, I saved a copy of "old", but it's useless as a backup, because
modifying old causes "new" to change too. This is true because only was
ever one list, with both old and new pointing to it. One easy way to
prevent this occurrence is to copy a slice of the old list, like this:
>>> old = [ 1, 2, 3 ] >>> new = old[:] # copy a "slice" of "old" >>> old.append(4) # modify "old" >>> old [1, 2, 3, 4] >>> new # "new" is unaffected [1, 2, 3] >>>
Here I modified old, but new kept its original value.
Ultimately I decided to create a few new functions within the
class itself--I think Java and C++ people call them "methods". The
functions deal with "opaque" objects--opaque to the caller, that is--and
are responsible for knowing what information has to go into a copy and what
must be restored from it. The getbackup() and restore() member
functions serve this purpose. Let's add them to cell_class.py:
61 def getpos(self): return self.XYZpos 62 def getbackup(self): 63 return [ self.XYZvalue, self.XYZset_of_possibles[:] ] 64 def restore(self,bkp): 65 self.XYZvalue = bkp[0] 66 self.XYZset_of_possibles = bkp[1][:]
Now cells[N].getbackup() returns a two-element
list. The first element contains the current value of cell N, which
may be 0 for unknown. The second element is itself a list, the
list of cell N's possible values. But because we can modify the
"XYZset_of_possibles" piece of the cell, we need a copy of the list
instead of another reference to it. Thus, we have to copy the slice, as
shown in the example above.
On the restore() side, if bkp is a value returned from
cells[N].getbackup(), restore() puts those
values back in. Until very recently, I had this coded as:
def restore(self,bkp):
self.value = bkp[0]
self.set_of_possibles = bkp[1]
But this code contained a subtle problem. I explain it below in Appendix C.
In case getpos() isn't self-explanatory, let me explain it here. As with
the other accessor routines, getpos() is there so my only access to the
"XYZpos" field of a cell is through this read-only function. I use this
to keep track of what the program is doing, for debug messages. Its
use will be apparent below.
Why am I so hung up on not using field names outside of the class?
The field "pos"--actually "XYZpos", as I've changed it--is something
that shouldn't be written after it's created. I've accidentally changed
fields/variables unintentionally before, and this time I wanted to protect
myself from doing so. I'm not sure it's worth the effort, though;
it's not a complete solution, because getrow(), for example, gives
a reference to a list. And it's harder to shoot yourself in the foot
in Python than it is in C. Think, for example, of the classic mistake
if (x=1) {.
With that under our belts, let's take a look at s3.py, the complete
program. I've snuck in a few more changes, because this old C hacker
still is learning
1 #!/usr/bin/python 2 import cell_class 3 import sys 4 import string 5 6 # Constants 7 YEA = 1 8 NAY = None 9 10 # Rather than using "ord()" all over the place, do this: 11 MyAlphabet = '-123456789' 12 13 # becomes YEA if "-v" specified. 14 verbose = NAY 15 16 # We have to define this somewhere. 17 Inconsistent = "formerly 'Unknown_with_no_hope'" 18 19 cells = [] 20 21 # step 1 22 def loadit(input): 23 global cells 24 for i in range(0,81): cells.insert(i,cell_class.Cell(i)) # 1.1 25 26 # Here, any cell's set_of_possibles should be full 27 28 which_cell = 0 29 for one_input_line in input.readlines(): 30 for char in one_input_line: 31 if char in '\t\n ': continue 32 if char in '-.': 33 which_cell = which_cell + 1 34 elif char in MyAlphabet: 35 cells[which_cell].setvalue(string.index(MyAlphabet,char)) 36 which_cell = which_cell + 1 37 if which_cell > 81: 38 raise too_much_input 39 pass # so the debugger can break here 40 input.close() 41 42 43 def solve_simply(): 44 global cells 45 something_was_set = 1 46 while something_was_set: 47 something_was_set = 0 48 unknown_cells = filter(lambda a: not a.known(), cells) 49 50 # 2.3 quit when all cells are known 51 if len(unknown_cells) == 0: break ## Done! All known. 52 53 for acell in unknown_cells: 54 55 # 2.1 only one member in set_of_possibles? 56 possible_digits = acell.possibles() 57 if len(possible_digits) == 1: 58 acell.setvalue(possible_digits[0]) 59 something_was_set = 1 60 continue 61 62 # 2.4 unknown but no possibilities? 63 if len(possible_digits) == 0: 64 raise Inconsistent 65 66 # 2.2 any digit in only one cell in a row, column or submatrix? 67 global dig 68 for dig in possible_digits: 69 whohas = filter(lambda a: a.could_be(dig), acell.getrow()) 70 if len(whohas) == 1: break # unique in row 71 whohas = filter(lambda a: a.could_be(dig), acell.getcolumn()) 72 if len(whohas) == 1: break # unique in col 73 whohas = filter(lambda a: a.could_be(dig), acell.getsubmatrix()) 74 if len(whohas) == 1: break # unique in sub 75 else: # "acell" has no unique digit (no "break" hit) 76 continue 77 78 # Unique! Set that value 79 acell.setvalue(dig) 80 something_was_set = 1 81 pass # end of "for acell in unknown_cells" 82 pass # end of "while something_was_set" 83 84 # print len(unknown_cells), "unknown cells" 85 return unknown_cells 86 87 88 # Code for step 3 89 def dumpit(): 90 for bor in range(0,81,9): 91 for i in range(bor,bor+9): 92 print MyAlphabet[cells[i].getvalue()], # '-' if unknown 93 print # end of this row 94 pass # end of dumpit() 95 96 97 # return YEA if solved. Stop at first solution. 98 def solve_completely(level): 99 try: 100 unknown_cells = solve_simply() 101 except Inconsistent: # part of 2.6.4 102 return NAY 103 104 if len(unknown_cells) == 0: return YEA # declare victory 105 106 # 2.6, create a guess 107 if verbose: 108 print "level =", level, ";", len(unknown_cells), "cells unknown" 109 dumpit() 110 111 # 2.6.1: Make a copy: a list of objects returned from cell_class.getbackup() 112 mycopy = [] 113 for acell in cells: 114 mycopy.append(acell.getbackup()) 115 116 # unknown_cells.sort( 117 # lambda c1, c2: cmp(len(c1.possibles()), len(c2.possibles()))) 118 mycell = unknown_cells[0] # ... shortest set_of_possibles 119 myset = mycell.possibles() 120 if verbose: print "L=", level, "cell", mycell.getpos(), myset 121 for myval in myset: 122 if verbose: print "L=", level, "; cell[", mycell.getpos(), "] =", myval 123 mycell.setvalue(myval) # 2.6.2 124 if solve_completely(level + 1): # 2.6.3: call myself 125 return YEA # 2.6.5: It worked; we're done! 126 127 # 2.6.4: It didn't work. Restore my backup copy. 128 for i in range(81): 129 cells[i].restore(mycopy[i]) 130 131 pass # Now try the next value in myset 132 return NAY # None of the values worked; tell caller 133 134 135 def doit(argv): 136 global verbose # 'cause we're going to modify it 137 if len(argv) > 1 and argv[1] and argv[1] == '-v': 138 verbose = YEA 139 del argv[1] 140 141 if len(argv) > 1 and argv[1]: 142 input = open(argv[1], 'r') 143 else: 144 input = sys.stdin 145 146 loadit(input) # step 1 147 solve_completely(0) # step 2 148 dumpit() # step 3 149 150 151 # main begins here 152 if __name__ == '__main__': doit(sys.argv)
In lines 7-8 I create YEA and NAY, because Python, like C, doesn't have
intuitive names for "yes" and "no". Python does have "1" and "None", but
their meanings aren't as obvious to me as YEA and NAY are. I suppose I
could have used TRUE and FALSE, but I've run into header file problems in
C with them, so I got out of the habit.
A reader of Part 2 pointed out that I was making a lot of calls to
ord(). This version uses the string MyAlphabet; see lines 11, 34-35, 90.
The change in line 90 will help if we want to enhance the program to
deal with 16x16 Sudoku puzzles, in which they use digits and letters, not
only digits.
In lines 34-35, I want to do the equivalent of
MyAlphabet.index(char), which would work if MyAlphabet were a list.
But I have to use string.index() instead, so I need to import Python's
built-in string package, which is done in line 4. I prefer using a
string here, because Python strings are "immutable". That is, I can't change the
contents in place, as I could with a list.
I added a "verbose" flag (line 14) that can be, you guessed it, YEA
or NAY. If YEA, the program gives a little extra output so you can see
what's going on.
In Step 2.4, I want to raise the "Inconsistent" signal and catch it
in certain cases. But to do that, Python requires me to define it,
which I do in line 17.
Now we come to the major restructuring. Where s2.py had one large
function, doit(), that contained Steps 1, 2 and 3, s3.py breaks this up
into functions. The functions are loadit() for Step 1, solve_simply()
for Steps 2.1-2.5 and dumpit() for Step 3.
Why did I choose to do this breakup, which incidentally wasn't hard to do?
Because we might have to make multiple guesses, and each guess could cause
a recursive call into that same routine in order to (possibly) make another
guess. I explain further below.
The code for loadit() is basically the same as before, except for the
use of string.index, mentioned above. Also, we now read the input lines
using more typical Python idioms.
The code for Steps 2.1-2.5 now is contained in solve_simply(), lines 43-85.
It's basically unchanged, except that I renamed the signal we raise
when we discover that the puzzle is inconsistent, usually because
of an incorrect guess. Also, solve_simply() returns unknown_cells, line
85. That is, it returns a list of Cell instances whose values are not
yet known.
Step 3 now is implemented by dumpit() in lines 89-94.
The next routine, solve_completely(), basically embodies all the parts
of Step 2.6, It takes a single parameter, "level", which basically says
how many guesses we've had to make on this path through the puzzle. The
first thing it does is call solve_simply(). If solve_simply()
runs into an inconsistency, it raises the Inconsistent signal.
This causes execution to resume at line 102, and solve_completely()
returns right there.
Typically, however, execution proceeds to line 104. For many
puzzles, solve_simply() returns an empty list, so solve_completely()
returns YEA. In other words, except for very hard puzzles, s3.py
acts exactly like s2.py.
But for those harder puzzles, we get to line 104 only to discover that
there are 54 unknown cells. This is where the fun starts. In lines
111-114, we execute Step 2.6.1. That means we we create a backup copy
of each cell's current value and set_of_possibles, using getbackup(),
as described above.
In line 118, we choose a cell from unknown_cells[]. We're going to make
a guess at the correct value to assign to "mycell".
Before that, lines 116-117 sorted unknown_cells[], based on the length of
each cell's set of possible values. This seemed like a good idea at the
time I wrote it, but only because it assumes that sorting is free. Doing
the sort makes the program take nearly three times as long to solve the
hardest-known puzzle. This is true even if I change line 117 to use
".XYZset_of_possibles" instead of ".possibles()".
The loop in lines 121-131 embodies the core of the guessing algorithm,
Steps 2.6.2 through 2.6.5. Basically it plows through each value in
myset[] (line 121), assigning it to "mycell" (123) and calling itself
(124) with an incremented value for "level". When that happens, the new
instance of solve_completely() calls solve_simply at line 100. If an
inconsistency is found, line 102 is hit. We resume execution in the old
instance at line 127, where we restore the backup copy in lines 128-129.
We then try the next value in myset[] by running the next pass of the
loop, starting at line 122.
If no inconsistency is found, however, the new instance proceeds as the
old one did. It chooses a cell from unknown_cells[], saves a copy of
each cell's value and set_of_possibles, tries each value in the set
by calling itself and so on.
I should mention here, in case it's not obvious, that each time
solve_completely() calls itself, it gets a new set of local
variables--unknown_cells, mycell, myset and so on. These local variables do
not interact across calls. Each instance or each level of recursion is
independent of the others.
As with all recursive routines, we need to be careful of a couple of
things. First, we need to make sure the "base cases" are covered, which
are those cases where you don't want the routine to call itself at all.
On this score, we're okay because, as mentioned above, for the simple case, s3.py behaves
exactly like s2.py did. It calls solve_simply(), and if that solves the
puzzle, the puzzle is solved.
The second thing to do in recursion is to make sure that you really do get
to a base case. In other words, make sure the recursion terminates so you
don't have an infinite descent. In this aspect too, solve_completely() is safe,
because on each pass, we assign a value to a formerly unknown cell (Step 2.6.2,
line 123) before calling solve_completely() again. Thus, we are guaranteed that
solve_simply (at line 100) returns a strictly shorter list than it did before.
So those aspects of the recursion look reasonable.
If, at any level, we try all the values in myset and never get a
solution, when we get to line 132, it returns NAY. On a predecessor
level, this should result in another value being tried. So the program
eventually should get to a solution, provided that a solution exists--which
it may not, if you mis-type the puzzle.
The doit function begins at line 135, where you can see how this all of
fits together. The new doit() function:
- decides if we're going to run with verbose YEA or
NAY (137-139) - figures out where the puzzle input is coming from
(141-144) - pulls the puzzle into the cells[] array
(146) - calls the complete solver and prints the result
(147-148)
So, does it work?
% cat hardest-known
- - - - 7 - 9 4 -
- 7 - - 9 - - - 5
3 - - - - 5 - 7 -
- 8 7 4 - - 1 - -
4 6 3 - - - - - -
- - - - - 7 - 8 -
8 - - 7 - - - - -
7 - - - - - - 2 8
- 5 - 2 6 8 - - -
% ./s3.py hardest-known
2 1 5 8 7 6 9 4 3
6 7 8 3 9 4 2 1 5
3 4 9 1 2 5 8 7 6
5 8 7 4 3 2 1 6 9
4 6 3 9 8 1 7 5 2
1 9 2 6 5 7 3 8 4
8 2 6 7 4 3 5 9 1
7 3 4 5 1 9 6 2 8
9 5 1 2 6 8 4 3 7
%
Sure enough, it does.
Now this program is quite short and does only a small part of the
Sudoku problem. That is, it only solves puzzles; it doesn't create them.
It assumes that a puzzle has only one solution; it doesn't try to verify
that by looking for multiple solutions. And it takes a "dumb" approach
to the puzzle; it starts with any old cell and assigns any old value.
But it does accomplish what we set out to do--solve any
Sudoku puzzle under the sun.
Although the program isn't long--less than 200 lines, not
counting blank and all-comment lines--we used a number of Python's
features, included:
- defining and using a class
- elements in a class--data structures and member
functions or methods - the __init__ method
- some pitfalls of copying lists
- importing modules
- operations on data structures: strings,
lists - loops: "for" and "while"
- using "global"
- some functional programming features: filter and
lambda - opening, reading and closing a file
- raising and catching an exception
- elementary argument (sys.argv) handling
- recursion
- elementary debugger operations
This experience has been a lot of fun. Thanks for joining me on this
ride.
Appendix B: Debugging Step 2.6
The code here is rather long, and some of my debugging required a lot
of printouts. I'm not going to give you a complete code listing so
you can repeat the exact same mistakes on your Linux box. Instead I'll
give you a summary. My most recent mistake, or "bug" if you will,
is detailed in Appendix C. Also, this part is based on an older version,
before cell_class.py was separated from s3.py. So the line numbers
may not make much sense.
a. When I tried to run my original version, I got
% python s3_orig.py 1109.puz
File "s3_orig.py", line 141
def save_a_copy()
^
SyntaxError: invalid syntax
%
I forgot the : at the end. I had the same problem a few lines later.
D'oh!
b. Next, I hit this:
% python s3_orig.py 1109.puz
Traceback (most recent call last):
File "s3_orig.py", line 197, in ?
doit(sys.argv) ## COMMENT OUT FOR DEBUGGING
File "s3_orig.py", line 193, in doit
solve_completely(0) # step 2
File "s3_orig.py", line 159, in solve_completely
except Inconsistent: # part of 2.6.4
NameError: global name 'Inconsistent' is not defined
%
I should have remembered this from my work on the Coconuts problem. I added
this earlier in the program:
# We have to define this somewhere.
Inconsistent = "formerly 'Unknown_with_no_hope'"
c. With that fixed, I got
% python s3.py 1109.puz
Traceback (most recent call last):
File "s3.py", line 199, in ?
doit(sys.argv) ## COMMENT OUT FOR DEBUGGING
File "s3.py", line 195, in doit
solve_completely(0) # step 2
File "s3.py", line 160, in solve_completely
unknown_cells = solve_simply()
TypeError: solve_simply() takes exactly 1 argument (0 given)
%
That's dumb! At some point I thought solve_simply was going to take
a "lev" parameter, but there's really no point. So, I fixed it like
this:
% diff s3_orig.py s3.py
58a59,60
> # We have to define this somewhere.
> Inconsistent = "formerly 'Unknown_with_no_hope'"
86c88
< def solve_simply(lev):
---
> def solve_simply():
%
d. Then, I got a name problem with cells:
% python s3.py 1109.puz
Traceback (most recent call last):
File "s3.py", line 199, in ?
doit(sys.argv) ## COMMENT OUT FOR DEBUGGING
File "s3.py", line 195, in doit
solve_completely(0) # step 2
File "s3.py", line 160, in solve_completely
unknown_cells = solve_simply()
File "s3.py", line 92, in solve_simply
unknown_cells = filter(lambda a: not a.known(), cells)
NameError: global name 'cells' is not defined
%
What caused that? Formerly, "cells" was defined in the doit()
function. When I split doit() into three parts, I forgot to
declare cells in the global space and to say "global cells" in
every function that modifies it.
e. With that fix, it worked for the simple case.
% python s3.py 1109.puz
1 5 2 3 4 8 7 6 9
8 9 4 5 6 7 3 2 1
3 6 7 2 1 9 5 8 4
6 3 1 7 9 4 2 5 8
4 8 5 6 3 2 1 9 7
7 2 9 1 8 5 6 4 3
5 7 8 4 2 1 9 3 6
2 4 6 9 7 3 8 1 5
9 1 3 8 5 6 4 7 2
%
f. Now what happens when we get to a harder puzzle?
% python s3.py hardest-known
Traceback (most recent call last):
File "s3.py", line 203, in ?
doit(sys.argv) ## COMMENT OUT FOR DEBUGGING
File "s3.py", line 199, in doit
solve_completely(0) # step 2
File "s3.py", line 181, in solve_completely
if solve_completely(level + 1): # 2.6.3: call myself
File "s3.py", line 181, in solve_completely
if solve_completely(level + 1): # 2.6.3: call myself
File "s3.py", line 181, in solve_completely
if solve_completely(level + 1): # 2.6.3: call myself
File "s3.py", line 181, in solve_completely
if solve_completely(level + 1): # 2.6.3: call myself
File "s3.py", line 181, in solve_completely
if solve_completely(level + 1): # 2.6.3: call myself
File "s3.py", line 181, in solve_completely
if solve_completely(level + 1): # 2.6.3: call myself
File "s3.py", line 180, in solve_completely
mycell.setvalue(myval) # 2.6.2
File "s3.py", line 35, in setvalue
if val not in self.set_of_possibles: raise setting_impossible_value
NameError: global name 'setting_impossible_value' is not defined
%
That didn't work so well. I tried running verbose, which didn't help
too much. So I added one more output:
myset = mycell.set_of_possibles[:]
+ if verbose: print "L=", level, "cell", mycell.pos, myset
for myval in myset:
if verbose: print "L=", level, "; cell[", mycell.pos, "] =", myval
mycell.setvalue(myval) # 2.6.2
That didn't help either, so I had to run the debugger. First, I commented out
"doit". Then, in order to catch the signal, I put the "raise" statements on
their own lines, like this:
32 # setvalue is used for 1.2.2 33 def setvalue(self, val): 34 # a couple of sanity checks 35 if val not in range(1,9+1): 36 raise val_must_be_between_1_and_9 37 if val not in self.set_of_possibles: 38 raise setting_impossible_value ### BREAKPOINT HERE ### 39 if self.known(): 40 raise setvalue_called_but_already_known
Let's see how it goes:
% python
Python 2.4 (#1, Mar 22 2005, 21:42:42)
[GCC 3.3.5 20050117 (prerelease) (SUSE Linux)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import s3
>>> import pdb
>>> pdb.run('s3.doit(["whatever", "-v", "hardest-known"])')
> <string>(1)?()
(Pdb) b s3.doit
Breakpoint 1 at /home/collin/sudoku-article/s3.py:194
(Pdb) n
> /home/collin/sudoku-article/s3.py(196)doit()
-> if len(argv) > 1 and argv[1] and argv[1] == '-v':
(Pdb) b 38
Breakpoint 2 at /home/collin/sudoku-article/s3.py:38
(Pdb)
Much debugging output later, I got this:
L= 6 cell 24 [2, 6]
L= 6 ; cell[ 24 ] = 2
L= 6 ; cell[ 24 ] = 6
This is the first time in execution that we are trying to set
the same cell to a second value.
> /home/collin/sudoku-article/s3.py(38)setvalue()
-> raise setting_impossible_value
(Pdb) p self
<s3.Cell instance at 0x4049398c>
(Pdb) p self.pos
24
(Pdb) p self.set_of_possibles
[]
Wait a minute, in that new line I added for the verbose
output, it printed "myset" as containing [2, 6]. Why is
it now saying that cell 24 can't be that?
(Pdb) where
/usr/lib/python2.4/bdb.py(366)run()
-> exec cmd in globals, locals
<string>(1)?()
/home/collin/sudoku-article/s3.py(206)doit()
-> solve_completely(0) # step 2
/home/collin/sudoku-article/s3.py(188)solve_completely()
-> if solve_completely(level + 1): # 2.6.3: call myself
/home/collin/sudoku-article/s3.py(188)solve_completely()
-> if solve_completely(level + 1): # 2.6.3: call myself
/home/collin/sudoku-article/s3.py(188)solve_completely()
-> if solve_completely(level + 1): # 2.6.3: call myself
/home/collin/sudoku-article/s3.py(188)solve_completely()
-> if solve_completely(level + 1): # 2.6.3: call myself
/home/collin/sudoku-article/s3.py(188)solve_completely()
-> if solve_completely(level + 1): # 2.6.3: call myself
/home/collin/sudoku-article/s3.py(188)solve_completely()
-> if solve_completely(level + 1): # 2.6.3: call myself
/home/collin/sudoku-article/s3.py(187)solve_completely()
-> mycell.setvalue(myval) # 2.6.2
> /home/collin/sudoku-article/s3.py(38)setvalue()
-> raise setting_impossible_value
(Pdb) up
> /home/collin/sudoku-article/s3.py(187)solve_completely()
-> mycell.setvalue(myval) # 2.6.2
(Pdb) l
182 mycell = unknown_cells[0] # ... shortest set_of_possibles
183 myset = mycell.set_of_possibles[:]
184 if verbose: print "L=", level, "cell", mycell.pos, myset
185 for myval in myset:
186 if verbose: print "L=", level, "; cell[", mycell.pos, "] =", myval
187 -> mycell.setvalue(myval) # 2.6.2
188 if solve_completely(level + 1): # 2.6.3: call myself
189 return YEA # 2.6.5: Done.
190 # 2.6.4: It didn't work. Restore and try the next value
191 restore_the_copy(mycopy)
192 return NAY # Propagate "didn't work" back
(Pdb) p mycopy[24]
<s3.Cell instance at 0x4049398c>
(Pdb) p mycopy[24].pos
24
(Pdb) p mycopy[24].set_of_possibles
[]
(Pdb) p cells[24].set_of_possibles
[]
(Pdb)
My copying code is wrong. How wrong is it?
(Pdb) p cells[24].value
2
(Pdb) p mycopy[24].value
2
(Pdb) p cells[24]
<s3.Cell instance at 0x4049398c>
(Pdb) p mycopy[24]
<s3.Cell instance at 0x4049398c>
It's totally and completely wrong! See how cells[24]
and mycopy[24] refer to the exact same address?
This
tutorial suggests that maybe a shallow copy could work:
(Pdb) import copy
(Pdb) foo = copy.copy(cells[1])
(Pdb) foo.set_of_possibles = foo.set_of_possibles[:]
(Pdb) print cells[1]
<s3.Cell instance at 0x404934ac>
(Pdb) p foo
<s3.Cell instance at 0x4049cc0c>
(Pdb) p foo.set_of_possibles
[1, 2]
(Pdb) p cells[1].set_of_possibles
[1, 2]
(Pdb) cells[1].setvalue(1)
(Pdb) p cells[1].set_of_possibles
[]
(Pdb) p foo.set_of_possibles
[1, 2]
(Pdb)
Yes, that should work fine. So try this:
# for 2.6.1: maybe overkill since we need only value and
# set_of_possibles
def save_a_copy():
the_copy = []
for acell in cells: # have to copy the set of possibles, not just a ref.
acopy = copy.copy(acell)
acopy.set_of_possibles = acell.set_of_possibles[:]
the_copy.append(acopy)
return the_copy
Fools rush in where wise men fear to tread; uncomment the last
line and let 'er rip:
% python s3.py hardest-known
2 1 5 8 7 6 9 4 3
6 7 8 3 9 4 2 1 5
3 4 9 1 2 5 8 7 6
5 8 7 4 3 2 1 6 9
4 6 3 9 8 1 7 5 2
1 9 2 6 5 7 3 8 4
8 2 6 7 4 3 5 9 1
7 3 4 5 1 9 6 2 8
9 5 1 2 6 8 4 3 7
%
Although that worked, it's not so nice, because both
save_a_copy() and restore_the_copy() refer to fields in the
class. It's violating the object-oriented principle of
information hiding. Instead, let's hide the information
within the class and treat the backup copy as an opaque
object. We only need the value and the set_of_possibles,
so that's what we'll put into each opaque object. Add
these lines to s3cell_class.py:
def getbackup(self):
return [ self.value, self.set_of_possibles[:] ]
def restore(self,bkp):
self.value = bkp[0]
self.set_of_possibles = bkp[1] # See Appendix C on this.
Now, code save_a_copy and restore_the_copy to use the above routines:
# for 2.6.1: cell.getbackup() returns an opaque (to us) object for
# later use
def save_a_copy():
the_copy = []
for acell in cells:
the_copy.append(acell.getbackup())
return the_copy
# for 2.6.4: restore, using the object given by cell.getbackup()
def restore_the_copy(acopy):
for i in range(81):
cells[i].restore(acopy[i])
pass # end of restore_the_copy
Appendix C: Why cell_class.restore() Must Restore a Slice
We were discussing these methods:
62 def getbackup(self): 63 return [ self.XYZvalue, self.XYZset_of_possibles[:] ] 64 def restore(self,bkp): 65 self.XYZvalue = bkp[0] 66 self.XYZset_of_possibles = bkp[1][:]
I mentioned that restore originally was coded as
def restore(self,bkp):
self.value = bkp[0]
self.set_of_possibles = bkp[1]
which caused some subtle problems. What kind of problems might this
cause? My original thinking was this: for getbackup(), the "original"
list--the one in cells[]--is about to change, hence we need to make
a copy of the list. But we don't need to do that on restore(),
because after all, what's going to happen to the backup copy?
We're simply going to restore from it again, aren't we?
That thinking has a fatal flaw. Recall that the backup copy was saved
only once (lines 112-114). The copy, and in particular the copy of
the set_of_possibles, is completely separate from the data structures in
cells[]. But what happens if the loop (lines 121-131) is executed
three or four times? Suppose, in other words, that at line 121,
myset contains [1,2,4]. We execute lines 123-124 with myval==1. Assuming
that we didn't find a solution, we execute the loop in
128-129. When restore() was coded the old way, the set_of_possibles
(XYZset_of_possibles) was set to reference the same list that's
in mycopy. We don't have a problem yet, because what's in cells[] does
match what was there before.
Now we execute 123-124 with myval==2. During the course of that
execution, the set_of_possibles[] in various cells is modified.
Because the old version of restore() copied references, at
this point we were modifying the lists stored in mycopy[]. Let's assume
that line 124 returns NAY a second time, and we execute lines 128-129.
The restore() call restores the old value that any cell had, but
it does absolutely nothing with a cell's XYZset_of_possibles, because
the cell's XYZset_of_possibles is a second reference to the
same list that mycopy[] has!
So, when we execute lines 123-124 with myval==4, each cell's
XYZset_of_possibles is set to the exact same set that it was after
the failed attempt that occurred when myval was 2. This results in spurious
executions of line 64--Inconsistent is raised--and the puzzle is not solved.
What hid this problem was the sort operation in s3.py:
116 # unknown_cells.sort( 117 # lambda c1, c2: cmp(len(c1.possibles()), len(c2.possibles()))) 118 mycell = unknown_cells[0] # ... shortest set_of_possibles 119 myset = mycell.possibles() 120 if verbose: print "L=", level, "cell", mycell.getpos(), myset 121 for myval in myset: 122 if verbose: print "L=", level, "; cell[", mycell.getpos(), "] =", myval
When lines 116-117 were uncommented, line 118 would set mycell
to a cell with a short set_of_possibles. In the case of the
hardest-known puzzle--the only puzzle I know of that actually requires
guessing--this means len(myset)==2 always. Therefore, the loop
(lines 122-131) is executed twice at the most. Because the loop never
is executed a third time in the same instance of solve_completely(),
we don't run into the case where a cell's XYZset_of_possibles[] loses
some values erroneously. Thus, the defect in restore() remained hidden.
So how did I come to find out about the defect? Well, the original
program executed the sort operation, and the program took some 25
seconds on my old Toshiba 460CDX laptop (Pentium MMX, bogomips 163).
I tried commenting out lines 116-117 and found that the hardest-known
puzzle was solved in about eight seconds. So I wondered what would
happen if I ran the recursion with the longest list first. I uncommented
lines 116-117 and changed the sort to arrange unknown_cells in descending
order, like this:
116 unknown_cells.sort( 117 lambda c2, c1: cmp(len(c1.possibles()), len(c2.possibles()))) 118 mycell = unknown_cells[0]
Notice that c1 and c2 are reversed in line 117. Then, line 118 gets
a cell with the longest set of possibles--six possible values,
actually. That's how I discovered the defect with the original version of
restore().
Collin Park works for Network Appliance, where he uses Linux on his
desktop and laptop computers. He does data recovery and other
Linux-related stuff at home, where he lives with his wife and their two
teenage daughters. All use Linux to meet their computing needs.










This week 5 lucky Members will receive a copy of The Official Ubuntu Server Book by Benjamin Mako Hill and Linux Journal's very own Kyle Rankin. No entry necessary. Check back here early next week to find out who the lucky Online Members are.




Comments
Great article!
Collin Park deserves commendation for this great article. I sure hope it will be published. When I read part1 and 2 of this article, I started buying the Linux Journal. (At my ripe age I like to read in bed, besides I spend too much time at the monitor already).
It is a courageous account of a C haker exploring the intricacies of Python. And an instructive account. I sure want to see it printed!
From down-under Australia,
With sincere best wishes to Collin, his better half and their two brighter-than-your daughters (but I would have to compare them to our granddaughters...)
OldAl.
I hope this doesn't get
I hope this doesn't get printed.
Pure code
Is it possible to get a pure code copy of you program. It's not as straghtforwards as I thought to remove the numbering when copy-pasting it directly from here.
unnumbered code
OK, I've put a copy here with a pointer back to part4 of the article.
Yes and No constants.
Python does actually have yes and no constants, they're True and False. None is closer to null rather than no.
Yes/No, YEA/NAY, True/False
I have rather older pythons. In Part 2 I mentioned using 1.5.x, which is the Python version on my Toshiba 460CDX laptop that I bought used in Japan in 1998 -- I wrote most of these articles on that old reliable laptop. But even 2.2 doesn't grok "False":
Python 2.2 (#1, Mar 26 2002, 15:46:04)
[GCC 2.95.3 20010315 (SuSE)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> def foo(urk):
... if urk==17: return True
... return False
...
>>> print foo(14)
Traceback (most recent call last):
File "", line 1, in ?
File "", line 3, in foo
NameError: global name 'False' is not defined
>>>
Almost there ...
That is because they were introduced in 2.2.1, not 2.2.0. ;)
Post new comment