of this article introduced the Python programming language
from the perspective of an old C hacker and presented a short program
to solve the "Coconuts" problem. Part 2
described the Sudoku puzzle and showed how a Sudoku-solving program might be
written in Python. The program would contain three steps:
- read in the values of the cells specified and draw some elementary
conclusions about what the blanks will be
- fill in the empty cells; that is, solve the puzzle
- print out the answer
Last time, we coded the sections of the algorithm for Step 1 and Step 3:
- 1.2: read in each value from the input
- 1.2.1: if unknown (i.e, a '-' or '.'), leave the cell
- 1.2.2: if known: a. set the data structure's digit (the "if known" part) to
that known value; andb. remove that value from the set of possible digits in the
rest of that cell's row, column and 3x3 sub-matrix.
We refer to this code as we refine Step 2 below, which involves
filling in the empty cells--solving the puzzle. The first thing
that came to my mind was something like this:
- 2.1: for each cell whose value we don't yet know, if
its set_of_possibles has only one member, assign that value to the cell and execute 1.2.2b
- 2.2: for each cell whose value we don't yet know, for each digit in
its set_of_possibles, if this cell is the only one
in the row, column or 3x3 sub-matrix that could be that value,
assign that value to the cell and execute 1.2.2b
- 2.3: repeat 2.1 and 2.2 until no assignments are made or all cells
are filled in
- 2.4: if any cell is unknown but its set_of_possibles is empty,
we have an inconsistent puzzle; raise a signal
- 2.5: if all cells are known, declare victory and go to
- 2.6: create a "guess"
Creating a guess involves the following steps:
- a. for each cell, save a copy of its value and its
- b. hypothetically fill in an unknown cell with the first
value in its set_of_possibles and execute 1.2.2b
- c. execute 2.1-2.6 recursively
- d. if we get an inconsistency, restore the copy
from 2.6.1 and then go back to 2.6.2. But, this time use the
next value in the set_of_possibles; repeat for all
values in the set_of_possibles unless...
- e. if 2.5 said "go to Step 3", then we're done (assume that
the puzzle has only one solution, which we've just found).
To see what I mean by 2.1 and 2.2, consider this picture of the above
puzzle, where I've labeled three of the "unknown" cells.
Let's apply step 2.1 to the cells marked "A" and "B" in the figure.
Cell A's row already has a 1, 2, 3 and 4. Its column has 3, 8 and 7.
So it looks like cell A legitimately could be 5, 6, or 9. So 2.1
doesn't help us with A.
Does 2.1 help us with B? Well, B's row has 4, 5, 7 and 8, while its column has
1, 3, 5, 7 and 9. Its submatrix (the grey area) has 3, 4, 5, 6 and 7. The only
possible value for B is 2. So Step 2.1 allows us to set cell B to 2.
Now let's apply step 2.2 to A. Because A could be 5, 6 or 9, let's
start with the value 5 and see if any other cell in A's row (that is, row 0)
also could have the value 5. Well, none of them can. Row 0, column 5 is
in the same (grey) submatrix as row 1, column 3 (=5) so row 0, column 5
cannot also be 5. Row 0 columns 6, 7 and 8 can't be 5, because row 2, column 6
is a 5. So row 0, column 1 is the only cell in its row that could be a 5.
Thus, Step 2.2 allows us to set cell A to 5.
Step 2.3 says to go through the cells until either all of the cells are filled
in or we've made a whole pass without filling in anything. Why do that?
Consider cell C. Before cell B is filled in, we might think that
cell C could be either 2 or 6. But once we set cell B to 2, C
no longer can be 2. Step 2.1 then allows us to set cell C to 6.
Step 2.4 is for the times when we have to make a guess (see Step 2.6).
Sometimes we will guess wrong, and we'll want to get out of that.
Now what's this about guessing? Well, Steps 2.1-2.5 are enough to solve
every puzzle I've tried that came from a newspaper. But there are some
puzzles, such as the one shown below, that require you to guess at some
point. Basically, when Steps 2.1 and 2.2 don't let you fill in anything
at all, we have to fill in a cell tentatively--we have to guess--and
see where it takes us. Sometimes, a guess leads to an inconsistency,
which is what Step 2.4 is for. In this case, we know our guess was
wrong and we have to forget it. We also must forget all the cells we
filled in, including any new guesses, based on that first wrong guess.
And that part about forgetting all the cells that we filled in based on the
wrong guess--that's why we need Steps 2.6.1 and 2.6.4.
But one thing at a time. For now, let's simply code Steps 2.1-2.5. As
I mentioned above, that's enough to solve all the Sudoku puzzles I've
tried from the newspaper.
So let me make a few revisions in the class Cell. I need to add a few access
functions, plus a test for whether a certain cell could have a given value.
While I'm at it, I should put the Cell class into a separate module. To do
that, I move everything in the class Cell clause from the main program
into a separate file, cell_class.py.
I'm also going to change the names of the fields in cell_class.py.
I want some fields to be modified only by functions that are part of the
class. So, although Python doesn't let you actually hide data, I can make
it my policy always to use the access routines introduced in Part 2 whenever
I want to access those fields. To that end, I took the names of all the
data fields and prefixed "XYZ" to them, as a kind of "Stop me before I modify
private data again!" tool. Then, in the main part of the program, I can make
sure there aren't any XYZ's and know I've been a good boy.
1 class Cell: 2 rows = [ , , , , , , , ,  ] # nine each... 3 columns = [ , , , , , , , ,  ] 4 submatrices = [ , , , , , , , ,  ] 5 6 # for step 1.1, do "cells[i]=Cell(i)" for i in range(0,81) 7 def __init__(self,pos): 8 # "pos" = position in the puzzle. Valid values: range (0,81) 9 global rows, columns, submatrices 10 if pos not in range(0,81): 11 raise Illegal_pos_in_Cells_initializer 12 self.XYZpos = pos 13 self.XYZvalue = 0 14 self.XYZset_of_possibles = range(1, 10) # 1-9 INclusive 15 16 # For step 1.2.2b, track which row, col, sub that I belong to. 17 myrow = int(pos / 9) 18 mycol = pos % 9 19 mysub = int(myrow/3) * 3 + int(mycol/3) 20 21 self.XYZrow = Cell.rows[myrow] 22 self.XYZcol = Cell.columns[mycol] 23 self.XYZsub = Cell.submatrices[mysub] 24 25 self.XYZrow.append(self) 26 self.XYZcol.append(self) 27 self.XYZsub.append(self) 28 29 def known(self): 30 return (self.XYZvalue != 0) 31 32 # setvalue is used for 1.2.2 33 def setvalue(self, val): 34 # a couple of sanity checks 35 if val not in range(1,9+1): 36 raise val_must_be_between_1_and_9 37 if val not in self.XYZset_of_possibles: 38 raise setting_impossible_value 39 if self.known(): 40 raise setvalue_called_but_already_known 41 42 self.XYZvalue = val # 1.2.2a 43 self.XYZset_of_possibles =  # make life easier in step2.6 44 45 # Now do 1.2.2b 46 for other in self.XYZrow + self.XYZcol + self.XYZsub: 47 if other is self: continue 48 if other.known(): continue 49 if val in other.XYZset_of_possibles: 50 # Though "remove" isn't in the book, it's not deprecated... 51 other.XYZset_of_possibles.remove(val) 52 53 def getvalue(self): return self.XYZvalue 54 def getrow(self): return self.XYZrow 55 def getcolumn(self): return self.XYZcol 56 def getsubmatrix(self): return self.XYZsub 57 def possibles(self): return self.XYZset_of_possibles[:] 58 def could_be(self,val): 59 if self.known(): return (val == self.XYZvalue) 60 return (val in self.possibles())
I slid a few more changes in, which I describe here. In lines 10-11
and 35-40, the "raise" statements are put on their own lines to make
debugging (with pdb) easier. For example, I could set a breakpoint on
line 36, and I'd hit that breakpoint only when I got an illegal value.
This is much more convenient than having a breakpoint on line 35 and
hitting it every time I check for a bad value.
Once a cell's value is decided, there is no point in speculating about
what other values it might have. Line 43 makes that explicit. It's
not really necessary, but the data structure feels a little cleaner this way.
Instead of an .index() and a del call, I put a single .remove() call
in line 51. Learning Python, my reference book of
choise, didn't say anything about .remove, but it is mentioned in on-line
documentation. I didn't see anything that saying .remove was deprecated.
Now, I'll do some minor surgery on what's left. The first few lines now
1 #!/usr/bin/python 2 import cell_class 3 import sys 4 def doit(argv): 5 cells =  6 for i in range(0,81): cells.insert(i,cell_class.Cell(i)) # 1.1 7 8 # Here, any cell's set_of_possibles should be full
Line 1 was added because I got tired of typing python
s2.py.... I wanted to tell the Linux kernel to run
s2.py as a Python program. Besides adding line 1, though, I also had
to set execute permission on s2.py, which I did like this:
% chmod +x s2.py
After that, I could type:
% ./s2.py 1109.puz
instead of explicitly typing python every time.
Line 2 says we want to use cell_class.py. Note, though, that we simply say
import cell_class; Python will add .py, so it
looks for a file named cell_class.py.
Finally, line 6 was changed from
for i in range(0,81): cells.insert(i,Cell(i)) # 1.1
for i in range(0,81): cells.insert(i,cell_class.Cell(i)) # 1.1
I did this because, as the tutorial explains, you have to give the name of the
module (cell_class in this case) as part of the name:
>>> import fibo
This does not enter the names of the functions defined in
fibo directly in the current symbol table; it only enters the
module name fibo there. Using the module name you can access
Initially, I forgot to do this; see Appendix A. I then consulted the tutorial
and saw the error of my ways.
Below is the complete s2.py program. It contains the code for Steps 1,
2.1-2.5 and 3 as described in this article series. It solves many, but
not all, Sudoku puzzles.
Note that there is no XYZ to be found, so the only accesses to
cell's data fields are in fact via the access routines. In other
words, I have tamed my inner hacker--for now, anyway. Step 2
code starts after line 30:
1 #!/usr/bin/python 2 import cell_class 3 import sys 4 def doit(argv): 5 cells =  6 for i in range(0,81): cells.insert(i,cell_class.Cell(i)) # 1.1 7 8 # Here, any cell's set_of_possibles should be full 9 10 if len(argv) > 1 and argv: 11 input = open(argv, 'r') 12 else: 13 input = sys.stdin 14 15 all_input_lines = input.readlines() 16 input.close() 17 18 which_cell = 0 19 for one_input_line in all_input_lines: 20 for char in one_input_line: 21 if char in '\t\n ': continue 22 if char in '-.': 23 which_cell = which_cell + 1 24 elif ord(char) in range (ord('1'), ord('9')+1): 25 cells[which_cell].setvalue(ord(char)-ord('0')) 26 which_cell = which_cell + 1 27 if which_cell > 81: raise too_much_input 28 pass # so the debugger can break here 29 30 # Step 2 should go here 31 something_was_set = 1 32 while something_was_set: 33 something_was_set = 0 34 unknown_cells = filter(lambda a: not a.known(), cells) 35 36 # 2.3 quit when all cells are known 37 if len(unknown_cells) == 0: break ## Done! All known. 38 39 for acell in unknown_cells: 40 41 # 2.1 only one member in set_of_possibles? 42 possible_digits = acell.possibles() 43 if len(possible_digits) == 1: 44 acell.setvalue(possible_digits) 45 something_was_set = 1 46 continue 47 48 # 2.4 unknown but no possibilities? 49 if len(possible_digits) == 0: raise Unknown_with_no_hope 50 51 # 2.2 any digit in only one cell in a row, column, or submatrix? 52 global dig 53 for dig in possible_digits: 54 whohas = filter(lambda a: a.could_be(dig), acell.getrow()) 55 if len(whohas) == 1: break # unique in row 56 whohas = filter(lambda a: a.could_be(dig), acell.getcolumn()) 57 if len(whohas) == 1: break # unique in col 58 whohas = filter(lambda a: a.could_be(dig), acell.getsubmatrix()) 59 if len(whohas) == 1: break # unique in sub 60 else: # "acell" has no unique digit (no "break" hit) 61 continue 62 63 # Unique! Set that value 64 acell.setvalue(dig) 65 something_was_set = 1 66 pass # end of "for acell in unknown_cells" 67 pass # end of "while something_was_set" 68 69 print len(unknown_cells), "unknown cells" 70 71 # Code for step 3 72 for bor in range(0,81,9): 73 for i in range(bor,bor+9): 74 if cells[i].known(): print cells[i].getvalue(), # usual case 75 else: print '-', # unknown 76 print # end of this row 77 78 # main begins here 79 if __name__ == '__main__': doit(sys.argv)
Let me explain some of the less obvious parts. Lines 31-33 set up the
main loop for Steps 2.1-2.5, where I try to deduce the value of the
unknown cells using, well, Steps 2.1 and 2.2. When I go through the
entire list of unknown cells and can't determine any value, it's time
to give up rather than spin my wheels. So we try, and each time we
set the value of a cell (with setvalue()), we make
something_was_set "true"; see lines 44-45 and 64-65.
Line 34 uses both "filter" and "lambda". What I'm trying to do in this
line is create a list of cells whose values we don't know. These are the
cells that we'll look at in Steps 2.1 and 2.2. Now, as the tutorial
"filter(function, sequence)" returns a sequence (of the same type,
if possible) consisting of those items from the sequence for which
function(item) is true. For example, to compute some primes:>>> def f(x): return x%2 != 0 and x%3 != 0 ... >>> filter(f, range(2, 25)) [5, 7, 11, 13, 17, 19, 23]
The example from the tutorial uses "f" for the function. What if I don't have
a convenient function "f" lying around, and I'm too lazy to define
one? After all, the function--I might call it "cell_is_unknown"
or something--is going to be used in only one place. Well, it turns
out that Python offers a way to create an anonymous function using
this lambda thing. In line 34, lambda a: not
a.known() means we have a function that returns "true" whenever its single
argument refers to a cell of unknown value. So, the effect of line
34 is to put the unknown cells (from the array "cells") into the
If there are no unknown cells, the puzzle is solved already,
so we should declare victory and go print out the answer (Step 2.3,
line 37). Otherwise, we look at each unknown cell in the loop
Lines 42-46 implement Step 2.1. If a given cell can have only one
value, we go ahead and set that value (line 44) and then tel the loop
that we did something so it's worth trying again (line 45).
Line 49 is going to be important to Step 2.6. Here, the basic idea is:
if any of the cells isn't known, but it can't be anything, then the
puzzle--as it is right now--can't be solved, because we can't fill in the value for this cell.
This is coded as an exception because, at least before we code Step 2.6,
it shouldn't ever happen.
Lines 52-60 check to see if, for any of the possible values
this cell could have, nobody else in this cell's row (lines 54-55),
column (lines 56-57) or submatrix (lines 58-59) possibly could
hold that value. If so, we execute lines 64-65 and set that value.
I didn't realize that I would need to declare "dig" as
a global in line 52. It's needed because otherwise the
"lambda" in line 54 (or 56 or 58) wouldn't know about "dig", due to
the Python naming rules. In Learning Python,
Lutz and Ascher refer to this as the "LGB" rule--Local, Global, Built-in, the three naming
scopes. Line 53 knows about dig because it's local (to the present
function), but the lambdas don't. They are lexically separate from
the module or function that creates them.
At line 69, the program tells you how many cells are still unknown.
I may remove this later. Line 79 is a way that the program can
run with or without the debugger. (A reader provided this tip.)
So, let's apply this program to the puzzle shown in Part 2 of this
article. The file 1109.puz is an ASCII representation of that puzzle:
% cat 1109.puz 1 - 2 3 4 - - - - - - - 5 6 7 - - - - - 7 - - - 5 8 4 6 3 - 7 - 4 - - - 4 8 - - - - - 9 7 - - - 1 - 5 - 4 3 5 7 8 - - - 9 - - - - - 9 7 3 - - - - - - - 5 6 4 - 2 %
When we run s2.py on that file, we get:
% ./s2.py 1109.puz 0 unknown cells 1 5 2 3 4 8 7 6 9 8 9 4 5 6 7 3 2 1 3 6 7 2 1 9 5 8 4 6 3 1 7 9 4 2 5 8 4 8 5 6 3 2 1 9 7 7 2 9 1 8 5 6 4 3 5 7 8 4 2 1 9 3 6 2 4 6 9 7 3 8 1 5 9 1 3 8 5 6 4 7 2 %
As before, in the spirit of full disclosure, my debugging diary appears in
Appendix A below.
Now, does this program solve all possible puzzles? It does solve all
of the ones I tried from the newspaper, but it doesn't solve this one:
- - - - 7 - 9 4 - - 7 - - 9 - - - 5 3 - - - - 5 - 7 - - 8 7 4 - - 1 - - 4 6 3 - - - - - - - - - - - 7 - 8 - 8 - - 7 - - - - - 7 - - - - - - 2 8 - 5 - 2 6 8 - - -
According to some of the postings on
Sudoku.com, it's the hardest known
puzzle having a unique solution. It's certainly too hard for my
simple program above.
Appendix A: Debugging Steps 2.1-2.5
As I said before, the working code for steps 2.1-2.5 didn't just
happen. My first attempt to run the code looked something like this:
% python s2_orig.py 1109.puz Traceback (innermost last): File "s2_orig.py", line 79, in ? doit(sys.argv) ## COMMENT OUT FOR DEBUGGING File "s2_orig.py", line 53, in doit whohas = filter(lambda a: cell_could_be(a,dig), acell.getrow()) File "s2_orig.py", line 53, in <lambda> whohas = filter(lambda a: cell_could_be(a,dig), acell.getrow()) NameError: dig %
I had forgotten to declare "dig" as global, so the lambda expression in
line 53 (now 54) couldn't find it; the remedy was provided here:
52 global dig
After fixing that, there was another problem:
% python s2_orig.py 1109.puz Traceback (innermost last): File "s2_orig.py", line 80, in ? doit(sys.argv) ## COMMENT OUT FOR DEBUGGING File "s2_orig.py", line 61, in doit unknown_cells = unknown_cells + 1 TypeError: illegal argument type for built-in operation %
What was I thinking? unknown_cells is a list!
I probably thought I was counting up the number of unknown cells. So, I
simply deleted the line, because we're using the length of
"unknown_cells"; the length is recomputed each time.
Collin Park works for Network Appliance, where he uses Linux on his
desktop and laptop computers. He does data recovery and other
Linux-related stuff at home, where he lives with his wife and their two
teenage daughters. All use Linux to meet their computing needs.