views:

577

answers:

6

I'm writing an algorithm in PHP to solve a given Sudoku puzzle. I've set up a somewhat object-oriented implementation with two classes: a Square class for each individual tile on the 9x9 board, and a Sudoku class, which has a matrix of Squares to represent the board.

The implementation of the algorithm I'm using is a sort of triple-tier approach. The first step, which will solve only the most basic puzzles (but is the most efficient), is to fill in any squares which can only take a single value based on the board's initial setup, and to adjust the constraints accordingly on the rest of the unsolved squares.

Usually, this process of "constant propagation" doesn't solve the board entirely, but it does solve a sizable chunk. The second tier will then kick in. This parses each unit (or 9 squares which must all have unique number assignments, e.g. a row or column) for the "possible" values of each unsolved square. This list of possible values is represented as a string in the Square class:

class Square {

private $name;                // 00, 01, 02, ... , 86, 87, 88
private $peers;               // All squares in same row, col, and box
private $number;              // Assigned value (0 if not assigned)
private $possibles;           // String of possible numbers (1-9)

public function __construct($name, $p = 0) {
  $this->name = $name;
  $this->setNumber($p);
  if ($p == 0) {
    $this->possibles = "123456789";
  }
}

// ... other functions

Given a whole array of unsolved squares in a unit (as described in the second tier above), the second tier will concatenate all the strings of "possibles" into a single string. It will then search through that single string for any unique character values - values which do not repeat themselves. This will indicate that, within the unit of squares, there is only one square that can take on that particular value.

My question is: for implementing this second tier, how can I parse this string of all the possible values in a unit and easily detect the unique value(s)? I know I could create an array where each index is represented by the numbers 1-9, and I could increment the value at the corresponding index by 1 for each possible-value of that number that I find, then scan the array again for any values of 1, but this seems extremely inefficient, requiring two linear scans of an array for each unit, and in a Sudoku puzzle there are 27 units.

+3  A: 

This is somewhat like what you have already ruled out as "extremely inefficient", but with builtin functions so it might be quite efficient:

$all_possibilities = "1234567891234";
$unique = array();
foreach (count_chars($all_possibilities, 1) as $c => $occurrences) {
  if ($occurrences == 1)
    $unique[] = chr($c);
}
print join("", $unique) . "\n";

Prints: "56789"

PEZ
A: 
 function singletonsInString($instring) {

    $results = array();

    for($i = 1; $i < 10; $i++) {

        $first_pos = strpos($instring, str($i));
        $last_pos = strrpos($instring, str($i));

        if ( $first_pos !== FALSE and $first_pos == $last_pos ) 
            $results[] = $i;

    }

    return $results;

 }

That'll give you every singleton. Get the first and last positions of a number in that array, and if they match and aren't both FALSE (strict comparison in case there's a singleton right at the start) then there's only one such number in that array.

If you're super super worried about speed here, you can probably replace the interior of that loop with

 $istr = str($i);
 if ( ($first = strpos($instring, $istr)) !== FALSE 
       and $first == $strrpos($instring, $istr) ) $results[] = $i;

for a minimum number of computations. Well, assuming PHP's native strpos is the best way to go about these things, which as far as I know is not unreasonable.

Glazius
+1  A: 

Consider using a binary number to represent your "possibles" instead, because binary operations like AND, OR, XOR tend to be much faster than string operations.

E.g. if "2" and "3" are possible for a square, use the binary number 000000110 to represent the possibilities for that square.

Here's how you could find uniques:

$seenonce = 0;
$seenmore = 0;
foreach(all_possibles_for_this_unit as $possibles) {
    $seenmore |= ($possibles & $seenonce);
    $seenonce |= $possibles;
}
$seenonce ^= $seenmore;
if ($seenonce) {
    //something was seen once - now it must be located
}

I'm not sure if this method will actually work faster but it's worth looking into.

Artelius
A: 

The last time I fooled with Sudoku solving, I had a third class called "Run". A Run instance is created for each row, col and 3x3 square. Every square has three runs associated with it. The Run class contains the set of numbers not yet placed within the run. Solving the board then involves intersecting the sets at each square iteratively. This takes care of 80% of most medium boards and 60% of most hard boards. Once you've gone through the whole board with no changes, you can move on to higher level logic. Each time your higher level logic fills a square, you run through your squares again.

The nice thing about this setup is you can easily add variants to the solver. Say you use the variant where the two diagonals are also unique. You just add a 4th run to those 18 squares.

jmucchiello
A: 

What I would do, is actually use binary bits for storing actual values as another answer suggested. That allows to do efficient checks and in general might lend Sudoku itself to more mathematical(=efficient and shorter) solution (just my impression, I have not researched this).

Basically, you represent the numbers in squares not with digits, but with powers of 2

"1" = 2^0 = 1 =  000000001
"2" = 2^1 = 2 =  000000010
"3" = 2^2 = 4 =  000000100
"4" = 2^3 = 8 =  000001000
... etc up to 
"9" = 2^8 = 256= 100000000

this way, you can simply add contents' of single squares to find out what numbers are missing in a 3x3 or a row or any other subset of sudoku, like this:

// shows the possibles for 3x3 square number 1 (00-22)
$sum=0;
for ($i=0; $i< 3; $i++)
  for ($j=0; $j < 3; $j++)
         $sum += $square["${i}${j}"]->number

$possibles = $sum ^ 511  // ^ stands for bitwise XOR and 511 is binary 11111111

now the $possibles contains "1" in bit positions of digits that are possible in this square and you can do bitwise operations with the results for other squares to match them together, like this:

eg. let's say:

$possibles1 = 146 // is binary 100100101, 
                 //indicating that this row or 3x3 square has place for "9", "6", "3" and "1"
$possibles2 = 7 //   is binary 000000111, indicating it has place for "3", "2" and "1".

// so:
$possibles1 & $possibles2 
// bitwise AND, will show binary 101 saying that "3" and "1" is unfilled in both bloces
$possibles1 | $possibles2 
// bitwise OR will give that in total it is possible to use "9", "6", "3", "2" and "1" in those two squares together
Gnudiff
A: 

Here is a way using only PHP built-in functions which should be pretty fast.

function getUniques($sNumbers)
{
    return join(array_keys(array_count_values(str_split($sNumbers)),1));
}

echo getUniques("1234567891234"); // return 56789;
null