I'm making a boggle-like word game. The user is given a grid of letters like this:
O V Z W X
S T A C K
Y R F L Q
The user picks out a word using any adjacent chains of letters, like the word "STACK" across the middle line. The letters used are then replaced by the machine e.g. (new letters in lowercase):
O V Z W X
z e x o p
Y R F L Q
Notice you can now spell "OVeRFLoW" by using the new letters. My problem is: What algorithm can I use to pick new letters that maximizes the number of long words the user can spell? I want the game to be fun and involve spelling e.g. 6 letter words sometimes but, if you pick bad letters, games involve the user just spelling 3 letter words and not getting a chance to find larger words.
For example:
You could just randomly pick new letters from the alphabet. This does not work well.
Likewise, I found picking randomly but using the letter frequencies from Scrabble didn't work well. This works better in Scrabble I think as you are less constrained about the order you use the letters in.
I tried having a set of lists, each representing one of the dies from the Boggle game, and each letter would be picked from a random die side (I also wonder whether I can legally use this data in a product). I didn't notice this working well. I imagine the Boggle dice sides were chosen in some sensible manner, but I cannot find how this was done.
Some ideas I've considered:
Make a table of how often letter pairs occur together in the dictionary. For the sake of argument, say E is seen next to A 30% of the time. When picking a new letter, I would randomly pick a letter based on the frequency of this letter occurring next to a randomly chosen adjacent letter on the grid. For example, if the neighboring letter was E, the new letter would be "A" 30% of the time. The should mean there are lots of decent pairs to use scattered around the map. I could maybe improve this by making probability tables of a letter occurring between two other letters.
Somehow do a search for what words can be spelt on the current grid, taking the new letters to be wildcards. I would then replace the wildcards with letters that allowed the biggest words to be spelt. I'm not sure how you would do this efficiently however.
Any other ideas are appreciated. I wonder if there is a common way to solve this problem and what other word games use.
Edit: Thanks for the great answers so far! I forgot to mention, I'm really aiming for low memory/cpu requirements if possible, I'm probably going to use the SOWPODS dictionary (about 250,000) and my grid will be able 6 x 6.