Hi,

Does anyone know a simple algorithm to check if a Sudoku-Configuration is valid? The simplest algorithm I came up with is (for a board of size n) in Pseudocode

for each row
  for each number k in 1..n
    if k is not in the row (using another for-loop)
      return not-a-solution

..do the same for each column

But I'm quite sure there must be a better (in the sense of more elegant) solution. Efficiency is quite unimportant.

Best Regards,

Michael

+4  A: 

Just a thought: don't you need to also check the numbers in each 3x3 square?

I'm trying to figure out if it is possible to have the rows and columns conditions satisfied without having a correct sudoku

Luk
I don't know whether this is true: first row 1..9, second row 2..9,1, third 3..9,1,2 ... will lead to correct rows and columns, but not 3x3 squares. Or am I missing something here?
Ralph Rickenbach
@malach: You're absolutely right. Without the box constraint, it is a latin square. The set of valid, completed sudoku of a size is therefore a subset of all latin squares of that size. For example, the 4x4 grid with rows [1234][2143][3412][4321] is a latin square, but not a valid 2x2 sudoku.
Michael Madsen
You do need to check all three (rows, columns, and boxes).
Bill the Lizard
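To make the comments above concrete, here is a minimal Python sketch (the function names `is_latin_square` and `boxes_ok` are made up for illustration, and a square grid of ints is assumed). It checks both examples from the comments: the 4x4 grid and the 9x9 grid of shifted rows pass the row/column test but fail the box test.

```python
def is_latin_square(grid):
    """Rows and columns each contain 1..n exactly once (square grid assumed)."""
    n = len(grid)
    want = set(range(1, n + 1))
    rows_ok = all(set(row) == want for row in grid)
    cols_ok = all(set(col) == want for col in zip(*grid))
    return rows_ok and cols_ok


def boxes_ok(grid, box):
    """Every box-by-box sub-square contains 1..n exactly once."""
    n = len(grid)
    want = set(range(1, n + 1))
    for r0 in range(0, n, box):
        for c0 in range(0, n, box):
            cells = {grid[r][c]
                     for r in range(r0, r0 + box)
                     for c in range(c0, c0 + box)}
            if cells != want:
                return False
    return True


# The 4x4 example from the comment above.
grid = [[1, 2, 3, 4], [2, 1, 4, 3], [3, 4, 1, 2], [4, 3, 2, 1]]
# The 9x9 example: each row is the previous one shifted by one position.
shifted = [[(r + c) % 9 + 1 for c in range(9)] for r in range(9)]

print(is_latin_square(grid), boxes_ok(grid, 2))        # True False
print(is_latin_square(shifted), boxes_ok(shifted, 3))  # True False
```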
A: 

Hello Lance,

that was my first thought too. But if you have a 3x3 Sudoku

2 2 2
2 2 2
2 2 2

your algorithm would also return that it's a correct solution.

should be a comment, not an answer...
SPWorley
A: 

You need to check for all the constraints of Sudoku :

  • check the sum on each row
  • check the sum on each column
  • check for sum on each box
  • check for duplicate numbers on each row
  • check for duplicate numbers on each column
  • check for duplicate numbers on each box

That's 6 checks altogether, using a brute force approach. Some sort of mathematical optimization can be used if you know the size of the board (i.e. 3x3 or 9x9).

Edit: explanation for the sum constraint: Checking for the sum first (and stopping if the sum is not 45) is much faster (and simpler) than checking for duplicates. It provides an easy way of discarding a wrong solution.

Radu094
Sudoku doesn't involve sums.
cjm
This answer is wrong. Sudoku has only three constraints.
Bill the Lizard
The sum of any row or column will be the same ie 1+2+3+4+5+6+7+8+9 = 45 and the dupe checks are because 45 can be reached in other ways if there are duplicates. The rules above ARE the constraints required to get a valid configuration.
TreeUK
This answer _is_ wrong. The sum constraints are unnecessary, because there is no way to obtain a sum of 45 in any other way without violating one of the 3 actual constraints.
sykora
Checking for the sum beforehand may give better performance than just a naive duplicate check, but it is not a requirement. Checking for duplicates is sufficient.
Svante
Checking for the sum first (and stopping if the sum is not 45) is much faster than checking for duplicates. Can people who skipped math classes stop downvoting this please?
Radu094
I agree that this algorithm would be faster for most real-world checks, but the question specifically states that efficiency is "quite unimportant" and he's looking for "elegance." This answer, while it always produces correct answers, is certainly not the most efficient or the most elegant solution posted here. It's odd to me that it was chosen as the answer. I don't think it deserves down-votes, though.
StriplingWarrior
Radu094: why don't you edit your answer explaining the reason the sums are there? I know the reason, but adding that in the answer as well would avoid some of the downvotes (IMO).
Dom De Felice
A: 

It would be very interesting to check whether the following is enough to satisfy the rules of a sudoku:

the sum of each row/column/box equals n*(n+1)/2
and the product equals n!
with n = number of rows or columns

Because that would allow for an algorithm of O(n^2), summing and multiplying the correct cells.

Looking at n = 9, the sums should be 45, the products 362880.

You would do something like:

for i = 0 to n-1 do
  boxsum[i] := 0;
  colsum[i] := 0;
  rowsum[i] := 0;
  boxprod[i] := 1;
  colprod[i] := 1;
  rowprod[i] := 1;    
end;

for i = 0 to n-1 do
  for j = 0 to n-1 do
    box := (i div sqrt(n)) + (j div sqrt(n))*sqrt(n);
    boxsum[box] := boxsum[box] + cell[i,j];
    boxprod[box] := boxprod[box] * cell[i,j];
    colsum[i] := colsum[i] + cell[i,j];
    colprod[i] := colprod[i] * cell[i,j];
    rowsum[j] := rowsum[j] + cell[i,j];
    rowprod[j] := rowprod[j] * cell[i,j];
   end;
end;

for i = 0 to n-1 do
  if boxsum[i] <> 45
  or colsum[i] <> 45
  or rowsum[i] <> 45
  or boxprod[i] <> 362880
  or colprod[i] <> 362880
  or rowprod[i] <> 362880
   return false;
end;
return true;
Ralph Rickenbach
It would be a great solution. Sadly, the prime factors can be rearranged to trick it. For example, the sequence 4 4 4 9 5 7 9 2 1 has sum 45 and product 9!
Marco M.
marco, ha, I wrote a test program to check it as well, and was about to triumphantly post a counterexample, but you beat me. I should read the comments before jumping in!
SPWorley
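The counterexample from the comments can be verified with a few lines of Python. This is only a sketch; the helper names `passes_sum_and_product` and `is_permutation` are invented here, and `math.prod` needs Python 3.8 or later.

```python
from math import factorial, prod


def passes_sum_and_product(values):
    """The proposed test: sum equals n*(n+1)/2 and product equals n!."""
    n = len(values)
    return sum(values) == n * (n + 1) // 2 and prod(values) == factorial(n)


def is_permutation(values):
    """The real requirement: each of 1..n appears exactly once."""
    return sorted(values) == list(range(1, len(values) + 1))


row = [4, 4, 4, 9, 5, 7, 9, 2, 1]
print(passes_sum_and_product(row))  # True
print(is_permutation(row))          # False
```

So a row can pass both the sum and the product test and still contain duplicates.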
A: 

Let's say int sudoku[0..8,0..8] is the sudoku field.

bool CheckSudoku(int[,] sudoku)
{
    int flag = 0;

    // Check rows
    for (int row = 0; row < 9; row++)
    {
        flag = 0;
        for (int col = 0; col < 9; col++)
        {
            // edited : check range step (see comments);
            // this pass visits every cell, so the range is checked only here
            if ((sudoku[row, col] < 1) || (sudoku[row, col] > 9))
            {
                return false;
            }

            // if n-th bit is set.. but you can use a bool array for readability
            if ((flag & (1 << sudoku[row, col])) != 0)
            {
                return false;
            }

            // set the n-th bit
            flag |= (1 << sudoku[row, col]);
        }
    }

    // Check columns
    for (int col = 0; col < 9; col++)
    {
        flag = 0;
        for (int row = 0; row < 9; row++)
        {
            if ((flag & (1 << sudoku[row, col])) != 0)
            {
                return false;
            }
            flag |= (1 << sudoku[row, col]);
        }
    }

    // Check 3x3 boxes
    for (int box = 0; box < 9; box++)
    {
        flag = 0;
        for (int ofs = 0; ofs < 9; ofs++)
        {
            // top-left corner of the box plus the offset inside the box
            int col = (box % 3) * 3 + (ofs % 3);
            int row = (box / 3) * 3 + (ofs / 3);

            if ((flag & (1 << sudoku[row, col])) != 0)
            {
                return false;
            }
            flag |= (1 << sudoku[row, col]);
        }
    }
    return true;
}

Marco M.
If somebody would enter a value that is not valid (e.g. 0 or 10), this algorithm would still evaluate to true. Just being picky :-) Could be fixed by one more test at the end of each inner loop: flag xor 0x1FF = 0
Ralph Rickenbach
To be true that would also not work for values > 31.. I think value range for the entire sudoku should be a separate step.
Marco M.
O(3(n^2)) is very inefficient..
Josh Smeaton
You can't say O(3(n^2)) being better or worse than O(n^2) without considering the implementation of the single operations and the architecture it runs on; it's the same complexity class. You have to check 3*n^2 data. If you do in one big loop or in three smaller ones, it's the same.
Marco M.
+18  A: 

Peter Norvig has a great article on solving sudoku puzzles (with python),

http://norvig.com/sudoku.html

Maybe it's too much for what you want to do, but it's a great read anyway

daniel
+8  A: 

Wikipedia has an article on algorithms for sudoku. It pretty much covers everything.

Ilya Martynov
No idea why this is voted down. Upvoted to correct this.
Aron Rotteveel
Because at this link there are only solvers, not checkers! The question is about a checking algorithm.
KeesDijk
It's not clear to me, from the wording of the question, whether the question is about: 1) checking that the numbers currently on the board obey the rules, or 2) checking whether a Sudoku puzzle is "solvable". The former is easy, the latter is hard.
Craig McQueen
+2  A: 

Create cell sets, where each set contains 9 cells, and create sets for vertical columns, horizontal rows, and 3x3 squares.

Then for each cell, simply identify the sets it's part of and analyze those.
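One possible reading of this idea, as a Python sketch (the names `build_units`, `units_of` and `is_valid` are invented for illustration; cells are assumed to be `(row, column)` tuples with 0-based indices, and the board side is assumed to be the square of the box size):

```python
def build_units(n=9, box=3):
    """Return the 3*n cell sets: n rows, n columns and n boxes."""
    rows = [[(r, c) for c in range(n)] for r in range(n)]
    cols = [[(r, c) for r in range(n)] for c in range(n)]
    boxes = [[(r, c)
              for r in range(r0, r0 + box)
              for c in range(c0, c0 + box)]
             for r0 in range(0, n, box)
             for c0 in range(0, n, box)]
    return rows + cols + boxes


def units_of(cell, units):
    """The sets a given cell is part of (always one row, one column, one box)."""
    return [unit for unit in units if cell in unit]


def is_valid(grid, units):
    """A completed grid is valid if every set holds each of 1..n exactly once."""
    want = set(range(1, len(grid) + 1))
    return all({grid[r][c] for r, c in unit} == want for unit in units)
```

With the defaults, `build_units()` yields 27 sets and `units_of((4, 4), units)` returns the three sets that contain the center cell.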

Lasse V. Karlsen
This is how I've written it in the past.
jmucchiello
A: 

Let's assume that your board goes from 1 - n.

We'll create a verification array, fill it and then verify it.

grid [0-(n-1)][0-(n-1)]; //this is the input grid
//each verification takes n^2 bits, so three verifications gives us 3n^2
boolean VArray (3*n*n) //make sure this is initialized to false


for i = 0 to n-1
 for j = 0 to n-1
  /*
   each coordinate consists of three parts
   row/col/box start pos, index offset, val offset 
  */

  //to validate rows
  VArray( (0)     + (j*n)                             + (grid[i][j]-1) ) = 1
  //to validate cols
  VArray( (n*n)   + (i*n)                             + (grid[i][j]-1) ) = 1
  //to validate boxes
  VArray( (2*n*n) + (3*(floor (i/3)*n)+ floor(j/3)*n) + (grid[i][j]-1) ) = 1
 next    
next

if every array value is true then the solution is correct.

I think that will do the trick, although I'm sure I made a couple of stupid mistakes in there. I might even have missed the boat entirely.
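For what it's worth, here is how the verification array might look in runnable Python. This is a sketch under a few assumptions that are not in the pseudocode above: the function name is made up, the grid is a list of lists of ints, the board side equals `box * box`, and a range check is added so that an out-of-range value cannot index outside the array.

```python
def check_with_verification_array(grid, box=3):
    n = len(grid)                      # assumes n == box * box
    seen = [False] * (3 * n * n)       # rows, then columns, then boxes
    for i in range(n):
        for j in range(n):
            v = grid[i][j]
            if not 1 <= v <= n:        # assumption: reject out-of-range values early
                return False
            b = (i // box) * box + (j // box)        # index of the box holding (i, j)
            seen[0 * n * n + i * n + (v - 1)] = True  # row i contains v
            seen[1 * n * n + j * n + (v - 1)] = True  # column j contains v
            seen[2 * n * n + b * n + (v - 1)] = True  # box b contains v
    # Every row, column and box must have seen every value.
    return all(seen)
```

Because each row, column and box has exactly n cells and n slots, all slots can only be true if there are no duplicates.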

Bryan
+1  A: 

You could extract all values in a set (row, column, box) into a list, sort it, then compare to '(1, 2, 3, 4, 5, 6, 7, 8, 9)
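In Python, for instance, the comparison could be sketched like this (the name `group_ok` is made up; `values` is assumed to be the list of numbers extracted from one row, column or box):

```python
def group_ok(values):
    """Sort the extracted values and compare them to 1..n."""
    return sorted(values) == list(range(1, len(values) + 1))


print(group_ok([5, 3, 4, 6, 7, 8, 9, 1, 2]))  # True
print(group_ok([4, 4, 4, 9, 5, 7, 9, 2, 1]))  # False
```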

Svante
This seems like an intriguing path to try...
Leonardo Herrera
A: 
PUZZLE = 9  # 9x9 board
UNIT_L = 3  # box width/height


def strike_numbers(line, line_num, columns, units, unit_l):
    remaining = set(range(1, len(line) + 1))  # numbers still expected in this row
    for count, n in enumerate(line):
        # check which unit (box) we're in: box row * boxes per row + box column
        unit = (line_num // unit_l) * unit_l + count // unit_l
        if n in units[unit]:  # is n in unit already?
            return 1
        units[unit].add(n)
        if n in columns[count]:  # is n in column already?
            return 1
        columns[count].add(n)
        remaining.discard(n)  # strike n from the row
    return len(remaining)  # was a number not eliminated?


def check_puzzle(sudoku, puzzle=PUZZLE, unit_l=UNIT_L):
    columns = [set() for _ in range(puzzle)]
    units = [set() for _ in range(puzzle)]  # boxes
    for i in range(puzzle):  # iterate through rows
        left_over = strike_numbers(sudoku[i], i, columns, units, unit_l)
        if left_over > 0:
            return False
    return True

Without thoroughly checking, off the top of my head, this should work (with a bit of debugging) while only looping twice. O(n^2) instead of O(3(n^2))

Josh Smeaton
+2  A: 

I did this once for a class project. I used a total of 27 sets to represent each row, column and box. I'd check the numbers as I added them to each set (each placement of a number causes the number to be added to 3 sets, a row, a column, and a box) to make sure the user only entered the digits 1-9. The only way a set could get filled is if it was properly filled with unique digits. If all 27 sets got filled, the puzzle was solved. Setting up the mappings from the user interface to the 27 sets was a bit tedious, but made the rest of the logic a breeze to implement.

Bill the Lizard
A: 

Some time ago, I wrote a sudoku checker that checks for duplicate numbers on each row, each column and each box. I would love it if someone could come up with one in a few lines of LINQ code, though.

char VerifySudoku(char grid[81])
{
    for (char r = 0; r < 9; ++r)
    {
        unsigned int bitFlags = 0;

        for (char c = 0; c < 9; ++c)
        {
            unsigned short buffer = r/3*3+c/3;

                        // check horizontally
            bitFlags |= 1 << (27-grid[(r<<3)+r+c]) 
                        // check vertically
                     |  1 << (18-grid[(c<<3)+c+r])
                        // check subgrids
                     |  1 << (9-grid[(buffer<<3)+buffer+r%3*3+c%3]);

        }

        if (bitFlags != 0x7ffffff)
            return 0; // invalid
    }

    return 1; // valid
}
Hao Wooi Lim
A: 

If you need a solver, too, a "cool" algorithm (in 3 lines) can be found here:

http://www.ecclestoad.co.uk/blog/2005/06/02/sudoku_solver_in_three_lines_explained.html

Peter
A: 

Here is a paper by math professor J.F. Crook: A Pencil-and-Paper Algorithm for Solving Sudoku Puzzles

This paper was published in April 2009 and it got lots of publicity as the definitive Sudoku solution (check Google for "J.F.Crook Sudoku").

Besides the algorithm, there is also a mathematical proof that the algorithm works (the professor admitted that he does not find Sudoku very interesting, so he threw some math into the paper to make it more fun).

zendar
A: 

Check each row, column and box such that it contains the numbers 1-9 each, with no duplicates. Most answers here already discuss this.

But how to do that efficiently? Answer: Use a loop like

result=0;
for each entry:
  result |= 1<<(value-1)
return (result==511);

Each number will set one bit of the result. If all 9 numbers are unique, the lowest 9 bits will be set. So the "check for duplicates" test is just a check that all 9 bits are set, which is the same as testing result==511. You need to do 27 of these checks.. one for each row, column, and box.
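Spelled out in Python, the 27 checks might look like the sketch below. The function names are invented, a 9x9 grid given as a list of lists of ints is assumed, and the range check is an addition: without it a value such as 0 or 10 would set a bit outside the lowest nine (or make the shift invalid).

```python
def unit_mask_ok(values):
    """Set one bit per value; nine values fill all nine bits only if all differ."""
    result = 0
    for value in values:
        if not 1 <= value <= 9:        # assumption: reject out-of-range entries
            return False
        result |= 1 << (value - 1)
    return result == 511               # 0b111111111


def check_sudoku(grid):
    rows = grid
    cols = list(zip(*grid))
    boxes = [[grid[r][c]
              for r in range(r0, r0 + 3)
              for c in range(c0, c0 + 3)]
             for r0 in (0, 3, 6)
             for c0 in (0, 3, 6)]
    # 9 rows + 9 columns + 9 boxes = 27 checks
    return all(unit_mask_ok(unit) for unit in [*rows, *cols, *boxes])
```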

SPWorley
A: 

I'd write an interface that has functions that receive the sudoku field and returns true/false if it's a solution. Then implement the constraints as single validation classes per constraint.

To verify, just iterate through all constraint classes; when all pass, the sudoku is correct. To speed this up, put the ones most likely to fail at the front and stop at the first result that indicates an invalid field.

Pretty generic pattern. ;-)

You can of course enhance this to provide hints which field is presumably wrong and so on.

  • First constraint: just check if all fields are filled out (simple loop).
  • Second: check if all numbers are in each block (nested loops).
  • Third: check for complete rows and columns (almost the same procedure as above, but with a different access scheme).
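A rough Python sketch of this pattern follows. To keep it short it uses plain functions instead of an interface with one class per constraint, which is a simplification of what is described above; all names are invented, and a 9x9 grid of ints is assumed.

```python
from typing import Callable, List

Grid = List[List[int]]
DIGITS = set(range(1, 10))


def all_filled(grid: Grid) -> bool:
    """First constraint: every field holds a digit from 1 to 9."""
    return all(v in DIGITS for row in grid for v in row)


def blocks_complete(grid: Grid) -> bool:
    """Second constraint: every 3x3 block contains all nine digits."""
    return all(
        {grid[r][c] for r in range(r0, r0 + 3) for c in range(c0, c0 + 3)} == DIGITS
        for r0 in (0, 3, 6)
        for c0 in (0, 3, 6)
    )


def lines_complete(grid: Grid) -> bool:
    """Third constraint: every row and every column contains all nine digits."""
    return all(set(line) == DIGITS for line in [*grid, *zip(*grid)])


# Order matters only for speed: put the checks most likely to fail first.
CONSTRAINTS: List[Callable[[Grid], bool]] = [all_filled, blocks_complete, lines_complete]


def is_solution(grid: Grid) -> bool:
    # all() short-circuits, so checking stops at the first failing constraint.
    return all(constraint(grid) for constraint in CONSTRAINTS)
```

Adding a new rule (say, for a Sudoku variant) would then only mean appending another function to `CONSTRAINTS`.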

Patrick Cornelissen
A: 

Check whether the sum and the product of each row/column equal the right numbers, 45 and 362880.

Eldan-San
A: 

Create an array of booleans for every row, column, and square. The array's index represents the value that got placed into that row, column, or square. In other words, if you add a 5 to the second row, first column, you would set rows[2][5] to true, along with columns[1][5] and squares[1][5], to indicate that the row, column, and square now have a 5 value.

Regardless of how your original board is being represented, this can be a simple and very fast way to check it for completeness and correctness. Simply take the numbers in the order that they appear on the board, and begin building this data structure. As you place numbers in the board, it becomes an O(1) operation to determine whether any values are being duplicated in a given row, column, or square. (You'll also want to check that each value is a legitimate number: if they give you a blank or a too-high number, you know that the board is not complete.) When you get to the end of the board, you'll know that all the values are correct, and there is no more checking required.
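A minimal Python sketch of this single pass, under some assumptions of my own: the function name is made up, rows, columns and squares are numbered from 0 (unlike the 1-based example above), blanks are represented by `None` or `0`, and the board side equals `box * box`.

```python
def check_board(board, box=3):
    n = len(board)                     # assumes n == box * box
    # One boolean array per row, column and square; index = value (slot 0 unused).
    rows = [[False] * (n + 1) for _ in range(n)]
    columns = [[False] * (n + 1) for _ in range(n)]
    squares = [[False] * (n + 1) for _ in range(n)]

    for r in range(n):
        for c in range(n):
            v = board[r][c]
            # A blank or out-of-range value means the board is not complete.
            if not isinstance(v, int) or not 1 <= v <= n:
                return False
            s = (r // box) * box + c // box          # which square holds (r, c)
            # O(1) duplicate test for this placement.
            if rows[r][v] or columns[c][v] or squares[s][v]:
                return False
            rows[r][v] = columns[c][v] = squares[s][v] = True
    return True
```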

Someone also pointed out that you can use any form of Set to do this. Arrays arranged in this manner are just a particularly lightweight and performant form of a Set that works well for a small, consecutive, fixed set of numbers. If you know the size of your board, you could also choose to do bit-masking, but that's probably a little overly tedious considering that efficiency isn't that big a deal to you.

StriplingWarrior