views:

1126

answers:

7

Note: This is my first Code Golf challenge/question, so I might not be using the correct format below. I'm not really sure how to tag this particular question, and should this be community wiki? Thanks!

This Code Golf challenge is about solving word searches!

A word search, as defined by Wikipedia, is:

A word search, word find, word seek, word sleuth or mystery word puzzle is a word game that is letters of a word in a grid, that usually has a rectangular or square shape. The objective of this puzzle is to find and mark all the words hidden inside the box. The words may be horizontally, vertically or diagonally. Often a list of the hidden words is provided, but more challenging puzzles may let the player figure them out. Many word search puzzles have a theme to which all the hidden words are related.

The word searches for this challenge will all be rectangular grids with a list of words to find provided. The words can be written vertically, horizontally, or diagonally.


Input/Output

The user inputs their word search and then inputs a word to be found in their grid. These two inputs are passed to the function that you will be writing. It is up to you how you want to declare and handle these objects.

Using a strategy described below or one of your own, the function finds the specific word in the search and outputs its starting coordinates (simply row number and column number) and ending coordinates. If you find two occurrences of the word, you must output both sets of coordinates. If the word is a palindrome, you may arbitrarily choose one end to be the "start" of the word.


Example

Input:

A I Y R J J Y T A S V Q T Z E 
X B X G R Z P W V T B K U F O 
E A F L V F J J I A G B A J K 
R E S U R E P U S C Y R S Y K 
F B B Q Y T K O I K H E W G N 
G L W Z F R F H L O R W A R E 
J A O S F U E H Q V L O A Z B 
J F B G I F Q X E E A L W A C 
F W K Z E U U R Z R T N P L D 
F L M P H D F W H F E C G W Z 
B J S V O A O Y D L M S T C R 
B E S J U V T C S O O X P F F 
R J T L C V W R N W L Q U F I 
B L T O O S Q V K R O W G N D 
B C D E J Y E L W X J D F X M 

Word to find: codegolf

Output:

row 12, column 8 --> row 5, column 1

Strategies

Here are a few strategies you might consider using. It is completely up to you to decide what strategy you want to use; it doesn't have to be in this list.

  • Looking for the first letter of the word; on each occurrence, looking at the eight surrounding letters to see whether the next letter of the word is there.
  • Same as above, except looking for a part of a word that has two of the same letter side-by-side.
  • Counting how often each letter of the alphabet is present in the whole grid, then selecting one of the least-occurring letters from the word you have to find and searching for the letter. On each occurrence of the letter, you look at its eight surrounding letters to see whether the next and previous letters of the word is there.
+2  A: 

C++ C, 411 400 388 367 characters

My first attempt ever at a code golf.

#include<stdio.h>
main(){char m[999],w[99];int r,c,l=-1,x=-1,y=-1,i=0,j,k;scanf("%s %d %d"
,w,&r,&c);for(;w[l+1];++l);for(;i<r*c;scanf("%s",&m[i++]));for(;y<2;++y)
for(;x<2;++x)if(x|y)for(i=(y<0)*l;i<r-(y>0)*l;++i)for(j=(x<0)*l;j<c-(x>0
)*l;++j)for(k=0;k<=l&&w[k++]==m[(i+k*y)*c+j+k*x];)k>l?printf(
"row %i, column %i --> row %i, column %i\n",i+1,j+1,i+y*l+1,j+x*l+1):0;}

It compiles without warnings on gcc with no additional flags.

Here's the version with whitespace:

#include<stdio.h>
main() {
    char m[999], w[99];
    int r, c, l = -1, x = -1, y = -1, i = 0, j, k;

    scanf("%s %d %d", w, &r, &c);
    for (; w[l+1]; ++l);
    for (; i < r*c; scanf("%s", &m[i++]));
    for (; y < 2; ++y)
        for (; x < 2; ++x)
            if (x | y)
                for (i = (y<0) * l; i < r - (y>0) * l; ++i)
                    for (j = (x<0) * l; j < c - (x>0) * l; ++j)
                        for (k = 0; k <= l && w[k++] == m[(i+k*y) * c + j+k*x];)
                            k > l ? printf("row %i, column %i --> row %i, column %i\n",
                                    i + 1, j + 1, i + y*l + 1, j + x*l + 1)
                                  : 0;
}

Expected input on stdin looks like:

CODEGOLF 15 15
A I Y R J J Y T A S V Q T Z E
X B X G R Z P W V T B K U F O
...

where 15 15 are the numbers of rows and columns, respectively.

Since this is my first round of golf, and I could find precious little common tricks on the web, I'll list some that I found:

  • Try to keep loop bodies to a single statement to save on moustaches (2 chars).
    Example: anything using the comma operator.

  • Use a conditional expression (?:) instead of an if when you can (1 char).
    Example: instead of if(x)printf("");, write x?printf(""):0;. This works because printf returns int.

  • Use the fact that booleans are 0 or 1 when used as an int (? chars).
    Example: instead of (y>0?r-l:r), write r-(y>0)*l. The savings here are in the parentheses, which were necessary in this situation.

  • while loops are never shorter than for loops, and longer most of the time (>= 0 chars).
    Example: while(x)y; is just as long as for(;x;y); but with for you get the loop body to play with as well, without paying for the extra ;.

  • Put your loop increments into the last expression that uses the loop counter (1 char).
    Example: instead of for(;x<b;++x)printf("%i",x);, write for(;x<b;)printf("%i",x++);.

  • Leave off main's return type (4 chars).

  • Leave off main's return statement (9 chars).

  • Use & and | instead of && and || whenever possible (1 char).
    Example: instead of if(x||y), write if(x|y). This only works if you don't depend on the short-circuit behaviour of && and ||.

  • Inline the strlen function and remove #include<string.h> (11 chars).

If you don't care about compiler warnings:

  • Do not include any headers (many chars).
Thomas
You could also use the #define macro if it's permitted to shorten the code. (e.g. #define f(x) for(; x) or something more sophisticated)
tomp
@tomp: I considered that, but I couldn't find any rules about it. I found it felt too much like cheating.
Thomas
+3  A: 

Python, 318 strokes.

from itertools import*
f=lambda x,y,i,j,n:(x-i+1,y-j+1)*(not n)or all((2*i+j,x+1,y+1,x<c,y<d))and a[x][y*2]==n[0]and f(x+i,y+j,i,j,n[1:])
w=raw_input
a=list(iter(w,''))
c=len(a)
d=len(a[0])/2
r=range
z=r(-1,2)
s='row %d, column %d'
for x in product(r(c),r(d),z,z,[w()]):
 if f(*x):print s%(x[0]+1,x[1]+1),'-->',s%f(*x)


Sample input:

A I Y R J J Y T A S V Q T Z E 
X C O D E G O L F B K U F O Z
E A F L V F J J I A G B A J K 
R E S U R E P U S C Y R S Y K 
F B B Q Y T K O I K H E W G N 
G L W Z F R F H L O R W A R E 
J A O S F U E H Q V L O A Z B 
J F B G I F Q X E E A L W A C 
F W K Z E U U R Z R T N P L D 
F L M P H D F W H F E C G W Z 
B J S V O A O Y D L M S T C R 
B E S J U V T C S O O X P F F 
R J T L C V W R N W L Q U F I 
B L T O O S Q V K R O W G N D 
B C D E J Y E L W X J D F X M 

CODEGOLF

Output:

$ python wordsearch < wordsearchinput.txt
row 2, column 2 --> row 2, column 9
row 12, column 8 --> row 5, column 1

Released under the "edit-the-answer-instead-of-commenting-on-how-I-should-fix-this" license.

Jimmy
I know it's nitpicking, but in the sample input there is a space between every character in the input (the raster)
ChristopheD
fixed :) count += 4
Jimmy
A: 

Python: 491 - 353 characters

Still quite some room for improvement I guess, all comments welcome.

Strategy: finding all occurences of the first letter, and then trying to fetch the word in all directions.

import sys;r=range(-1,2);k=''.join;q="row %s, column %s"
a=[l[:-1:2]for l in sys.stdin]
b=sys.argv[1];c=len(a[0])
for x,y in[(x/c,x%c)for x,h in enumerate(k(map(k,a)))]:
 for i,j in[(i,j)for i in r for j in r if i or j<>0]:
  if k([a[x+z*i][y+z*j]for z in range(len(b))if 0<=x+z*i<c and 0<=y+z*j<len(a)])==b:print q%(x+1,y+1)+" --> "+q%(x+z*i+1,y+z*j+1)

Example usage:

cat input.txt | python test.py CODEGOLF

Example output:

row 12, column 8 --> row 5, column 1

ChristopheD
You can leave off the `if h==b[0]`. You're testing that again in the inner loop anyway.
Thomas
@Thomas: good catch, thanks (shaves off 10 characters)
ChristopheD
+3  A: 

AWK - 252 255

NF<2{l=split($1,w,"")}NF>1{for(i=1;i<=NF;)t[x=i++,y=NR]=$i}END{
for(i=1;i<=x;++i)for(j=0;++j<=y;)for(a=0;a<9;++a)if(t[I=i,J=j]==w[1])
for(k=1;k<l;){if(!(T=t[I+=int(a/3)-1,J+=a%3-1])||T!=w[++k])break;if(k==l)
print"row "j (B=", column ")i" --> row "J B I}}

The input should be in the form

....
B L T O O S Q V K R O W G N D 
B C D E J Y E L W X J D F X M 
CODEGOLF
Dan Andreatta
+6  A: 

Perl - 243

(Not including linebreaks, all of which are optional)

Mostly de-golfed code can be found here.

$/="";@w=split//,pop;($b=<>)=~y/ //d;$W=1+index$b,"\n";
for(1,2){for(0,$W-2..$W){$p=join"."x$_,@w;
push@m,[$-[0],$+[0]-1]while$b=~/$p/sg}
@w=reverse@w;@m=map[reverse@$_],@m}
$t="row %d, column %d";
printf"$t --> $t\n",map{1+$_/$W,1+$_%$W}@$_ for@m;

Usage: run as a script. Either wordsearch.pl boardfile searchword to read from a file, or wordsearch.pl searchword to read the board from STDIN (hey, pop is two characters shorter than shift!)

The idea is that we read the board into a string (throwing away the extra spaces between letters), and also make note of how wide it is (by counting characters until the first newline). Then we do a series of regex matches to find the word, using the s regex modifier to allow a match to span lines. Given a board that's 4 letters wide (so each line is 5 chars including newline), and the search word "CAR", we can make the patterns /CAR/ to match forwards, /C...A...R/ to match going southwest, /C....A....R/ to match going downwards, and /C.....A.....R/ to match going southeast. For each match, we record the starting and ending positions of the match.

Then we reverse the search word ("CAR" -> "RAC") and do it again, which lets us match the word going left, up, northwest, and northeast. Of course, we want to make sure that the start/end positions are also swapped. To save code, I swap the match positions twice; the matches for the forward word are swapped both times (so they come out un-swapped), while the matches for the backward word are only swapped once, so they will come out swapped.

Finally, it's just a matter of taking the match offsets into the board string and turning them into row/column pairs, and formatting the output how the problem wants it.

hobbs
It's hard to see where the script deals with diagonal matches. Could you explain the logic?
Zaid
@Zaid explanation added.
hobbs
great explanation :) is it shorter to do the reverse than to generate extra regexes?
Jimmy
Where's the regex to rotate the board ?!?
mobrule
That's really slick. I only understood the trick when I replaced the `"."` with a `" "`. Match anything on the board while treating it as a single string to deal with the diagonals and verticals.
Zaid
If only, mobrule. If only. :)
hobbs
An interesting side-effect of using the /s modifier is that it does *not* match cases like `... C O D` on one line and `E G O L F ...` on the next, which is what is wanted.
Zaid
Shave off a character by using `printf` instead of `sprintf` ?
Zaid
@Zaid for a little while after I submitted it I was really worried that it *would* accidentally match across lines and be disqualified, but then I realized that no matter what, if it's going to wrap like that it's going to run into a newline in the middle of the word and break the match :)
hobbs
+7  A: 

Perl, 226 char

sub h($){"row ",$%=1+($x=pop)/$W,", column ",1+$x%$W}
@S=map/[^ ]/g,<STDIN>;
B:for(@ARGV){
 for$d(map{$_,-$_}1,$W=@S/$.,$W-1,$W+1){
  A:for$s(0..$#S){
   $t=$s-$d;for(/./g){next A if$S[$t+=$d]ne$_||$t<0}
   print h$s,' --> ',h$t,$/;
   next B
}}}

Character count excludes newlines and indentation, which are included for readability. I assume the matrix of letters is specified with standard input and the target words are command-line arguments.

First line (after the sub h definition) maps all non-space characters into a single array @S. Then iterate over all the target words, over the possible directions that words may appear in ($W is the width of the puzzle), and over all of the letters in the current target word until a match is found.

$ perl wordsearch.pl CODEGOLF RACECAR BYKLHQU <<EOF
A I Y R J J Y T A S V Q T Z E 
X B X G R Z P W V T B K U F O 
E A F L V F J J I A G B A J K 
R E S U R E P U S C Y R S Y K 
F B B Q Y T K O I K H E W G N 
G L W Z F R F H L O R W A R E 
J A O S F U E H Q V L O A Z B 
J F B G I F Q X E E A L W A C 
F W K Z E U U R Z R T R A C E 
C A R P H D F W H F E C G W Z 
B J S V O A O Y D L M S T C R 
B E S J U V T C S O O X P F F 
R J T L C V W R N W L Q U F I 
B L R A C E C A R R O W G N D 
B C D E J Y E L W X J D F X M 
EOF
row 12, column 8 --> row 5, column 1
row 14, column 3 --> row 14, column 9
row 3, column 12 --> row 9, column 6
mobrule
You have a tendency of sneaking up on me. Good job. :)
hobbs
+10  A: 

Python - 186 chars

def f(g,W):w=g.find("\n")+1;L=len(W);print" --> ".join("row %s, column %s"%(x/w+1
,x%w+1)for i in range(len(g))for j in(-w-1,-w,-w+1,-1,1,w-1,w,w+1)for x in(i,i+(L
-1)*j)if g[i::j][:L]==W)

testing code:

grid="""A I Y R J J Y T A S V Q T Z E 
X B X G R Z P W V T B K U F O 
E A F L V F J J I A G B A J K 
R E S U R E P U S C Y R S Y K 
F B B Q Y T K O I K H E W G N 
G L W Z F R F H L O R W A R E 
J A O S F U E H Q V L O A Z B 
J F B G I F Q X E E A L W A C 
F W K Z E U U R Z R T N P L D 
F L M P H D F W H F E C G W Z 
B J S V O A O Y D L M S T C R 
B E S J U V T C S O O X P F F 
R J T L C V W R N W L Q U F I 
B L T O O S Q V K R O W G N D 
B C D E J Y E L W X J D F X M """.lower().replace(" ","")
f(grid,"codegolf")
f(grid,"serverfault")
f(grid,"superuser")

This version is 196 chars and accepts the grid without doing any extra tweaking

def f(g,W):w=g.find("\n")+1;L=len(W);print" --> ".join("row %s, column %s"%(x/w+1,x%w/2+1)for i in range(len(g))for j in(-w-2,-w,-w+2,-2,2,w-2,w,w+2)for x in(i,i+(L-1)*j)if g[i::j][:L].lower()==W)

testing code:

grid="""A I Y R J J Y T A S V Q T Z E 
X B X G R Z P W V T B K U F O 
E A F L V F J J I A G B A J K 
R E S U R E P U S C Y R S Y K 
F B B Q Y T K O I K H E W G N 
G L W Z F R F H L O R W A R E 
J A O S F U E H Q V L O A Z B 
J F B G I F Q X E E A L W A C 
F W K Z E U U R Z R T N P L D 
F L M P H D F W H F E C G W Z 
B J S V O A O Y D L M S T C R 
B E S J U V T C S O O X P F F 
R J T L C V W R N W L Q U F I 
B L T O O S Q V K R O W G N D 
B C D E J Y E L W X J D F X M """
f(grid,"codegolf")
f(grid,"serverfault")
f(grid,"superuser")
gnibbler
Congratulations for producing a shorter version that the perl mongers so far!
Matthieu M.