Haskell Sudoku

This post is incredibly technical – it is software engineer at a deep level.  I’m not sorry.

I have a working Haskell program to solve Sudoku puzzles; right now, the board name is hard-coded, but I’m at least reading from a file, and I am quite sure there are boards I cannot solve – given my method is incomplete.
But next I am going to work on a program to run tests fr. outside Haskell, and other improvements.

The code is 17 lines in the main file (plus forty lines of “test” code); and SBoard.hs is 261 lines total. Finally, 24 lines of file reading: one main function, one sub-function and a one-liner for typed Integer square root of an Int

*BoardIO> :t sqrtInt
sqrtInt :: (Integral b, Integral a) => a -> b

(I’m no longer sure why I did that, but here’s the actual code:

sqrtInt i = floor $ sqrtFloat $ 1.0 * (fromIntegral i)

Now then, the primary function is:

main =
      b <- doRead "n.sdk"
      -- print b
      -- return (solve b)
      solv_dbg b

As you can see, I orignally called solve, which takes a Board as input and returns a Board, but since main has an IO monad, I needed the return there — which has nothing to do with “return” as most programmers know it and love it.

Sans putStrLn and print expressions, here is my main function for solving:

solve b =
  let b1 = calcUniqs b
       b2 = calcOpen b1
   if b2 == b
      then b
     else solve b2


To examine the function to calculate “unique” cells (by that I mean cells where only one value is posssible – being forced by being unique), first I need to example this utility function

    calculate a cell - if it's already known, done; 
         if there's only one possibility, done
    otherwise, it's zero for now
calc_cell c =
  let d = fst c
       u = snd c
     if d > 0
        then d
       else if length u == 1
               then head u
              else 0

This is used in the “final” expression, below.  Given tuples (a, [a]) wehere a is an Ord and also a Num (due to its usage in the function); the first is the current value of the cell (zero = blank), and the second part is the list of “possible” values for this cell — it’s possible, given the calculation of this list, that there may be multiple values even though the cell is already solved, but this function doesn’t care.  The logic is,

If this “cell” is already calculated, then we’re done, return that value; else, if the list (of possible values) length is 1, then that is the new value; otherwise, the value is (still) unknown/blank, so return zero.

I should explain, this “fell out” naturally from the way I decided to encode the Board itself, and also the resulting possible values and so on.

type Symbols = [Char]
data Board = B Symbols Int [String]

A Board is, initially, a list of symbols (identical to a string, by the way), an integer which is the square root of symbols (for a 9×9 board, this is 3), then a list of Strings (technically, I ought to make that a list of symbols, I suppose…)

One very important helper function converts the list of strings — each of which is one row — into a list of lists of integers;


The following is the “heart” of most of the calcs that follow (and precede)

   convert a board into an array of "cells,"
   where each cell is (d, [Ints]) - if d is zero, the cell is open
   the list of integers represents the possible* values there
   (* note that is not true if d is non-zero, but in that case, we don't care)
toSolve b =
        rs = arrFromBoard b
        cs = transpose rs
        ps = patches b
        unf_r = map avail rs
        unf_c = map avail cs
        unf_p = map avail ps
        sq = square b
        unf = map (\x -> map (intersect x) unf_c) unf_r
        -- the following creates an "array" (list of lists)
        -- which is the same dimensions as arrFromBoard
        idx_p = map (\r -> map (\c -> sq * (sect r sq) + (sect c sq))
                           [0..(sq*sq-1)]) [0..(sq*sq-1)]
        cmplx = map (\r -> zip (fst r) (snd r)) $ zip unf idx_p
        unf2 = map (\r -> map (\i -> intersect (fst i) (unf_p !! snd i) ) r) cmplx
        map (\r -> zip (fst r) (snd r)) $ zip rs unf2

From the inside out, the underlined map takes a list (e.g. “[0..8]”) of column indicies and returns a list of indicies of the patch for this “cell” (r, [0..8]).  That map is within a func. mapped over a list of row indicies

At the end of this, idx_p is a list of lists which represents the patch index of the corresponding cell in the board array.

So, the let section of this calculates the rows (rs), then the columns (cs), then the patches (ps) — my term for the sub-squares, 3×3 on a 9×9 board; the remaining values per row (unf_r), column (unf_c) and patch (unf_c).  Next, unf represents the “intersection” of remaining values per row and column. 

Now “cmpx” is an array of (p, [Int]) tuples – same dimensions as the Sudoku board — twice is use the “pattern”

   map (\r (zip (fst r) (snd r)) $ zip <something> <else>

Where <something> and <else> represent matching arrays; in one case, “elements” of <something>” are lists, in the other, elements of <else> are lists.   The result is an array of tuples (s_i, e_i) where each “s” and each “e” are an element of <something> and <else> respectively.
In both cases, <something> and <else> are lists of lists of elements – either [[a]], or [[[a]]]; in the easier case, a list of rows, where each row is a list of columns, where each column is a single value (an Int, say) which is the value of a cell; in the more complex case, a list of rows, of columns, where each column ‘value’ is itself a list of some data representing a list of possible values, for instance, for this cell.

… and that is perhaps the most complex description I’ll write.  (I only hope that, should I come back to read this later, it will still make sense.)


calcOpen b =
  let syms = symbols b
   --  the following assumes that the blank space (" ") is not a symbol
   B syms (square b) $ arrToSyms (map (\r -> map calc_cell r) (toSolve b)) syms ' '

So this calls toSolve on b, the Board, then “map (\r -> map…” because it’s a list of lists calls calc_cell on each element — and that’s why calc_cell takes the input it does.

The function toSolve takes a Board as input, as we’ve seen, and returns an array [a list of lists] of cells, where each cell is a tuple, which can be input to calc_cell far above.

After the map-map, the result is an array of new values, zero thru nine for a 9×9 board, and arrToSyms <array> syms ' ' converts that back to the [String] data to store in the Board.  “B...” is the board constructor.


The following code, in three functions, allows me to (pretty) print a Board

rowToStr r s =
  if length r > s 

     then show (take s r) ++ "  " ++ (rowToStr (drop s r) s)

    else show r

extraRow a s =

  if length a > s

     then (take s a) ++ [[]] ++ (extraRow (drop s a) s)

    else a

instance Show Board where

  show b =
    let s = square b
        rows = map (\r -> rowToStr r s) $ arrFromBoard b

     in unlines $ extraRow rows s

The magic, which I ought to remember, is in “instance Show Board where\n show…”

The function takes a Board [type] and returns a string.


What I’ve learned

There’s a lot to write here, but I’ve started to get the “hang” of monds — which I’ll have to write later, but others have done better.

Most important, for now, is this: the book Real World Haskell recommends that “beginners” should write the type signature first, to validate the code they get is what they intended.  I, however, took only half that advise — I thought carefully about what I wanted, but coded the function then checked the result, without including the “declaration.”  (I think I’d get the book, if I can find an updated version; there is a lot more in there than I know, yet.)




Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: