Today’s Squramble developer post talks about how one hand-crafts a “dictionary” for a casual word game.
Consider a game that chooses a word, mixes it up, and asks you to unscramble it as a minor diversion. What constitutes a good list of words, and where does such a list come from?
The game can’t just mix up random letters, right? It would be easy to program it, but you would be mad if you were asked to unscramble DACG to find the “correct” answer of AGCD. At a minimum, the words have to be words, according to some kind of English dictionary somewhere.
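For the curious, the "mix it up" step might look something like the minimal Python sketch below. The function name `scramble` and the retry loop are my own illustration, not the game's actual code. The one idea it shows is that a shuffle should never hand the player the word already solved.

```python
import random
from typing import Optional


def scramble(word: str, rng: Optional[random.Random] = None) -> str:
    """Return the letters of `word` in a shuffled order that differs from `word`."""
    rng = rng or random.Random()
    letters = list(word)
    # A word made of one repeated letter cannot be rearranged into anything new.
    if len(set(letters)) < 2:
        return word
    while True:
        rng.shuffle(letters)
        candidate = "".join(letters)
        # Retry if the shuffle happened to reproduce the original word.
        if candidate != word:
            return candidate


if __name__ == "__main__":
    print(scramble("CAUL"))
```

A shuffle like this can still land on a different real word by accident, which is one more reason the game needs a proper word list behind it.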
There are plenty of web sites that can give you a list of Scrabble-style words (yes, Scrabble words are words, and are in dictionaries of a fashion), but a casual game can’t realistically use these words, either. Unless that game has Scrabble in the name, I guess. My reasoning is that a Scrabble maniac might know that the word “BIX” means (fill in), but a mere mortal looking for a minor puzzle diversion is not going to enjoy, in any way, a game that oh-so-cleverly hides BIX as XBI.
So you can get a list of Scrabble words somewhat easily, but picking through that long list to find ones suitable for our minor puzzle diversion is no small feat. The main obstacle is that, by its very nature, this task cannot be automated. Some person who knows something about words is going to have to look at each Scrabble word and give it the thumbs up or thumbs down. Unfortunately, that’s a lot of words to wiggle your thumb at.
For the list of all possible three-letter Scrabble words, this is a doable task, given the right tools to move the job along. For the four-letter Scrabble words, it’s a bigger job, but something on the order of tens of hours, again with the tools. For the five-letter Scrabble words, it’s pretty much insane.
All of this effort is intended to produce a word list that every casual game lover can enjoy. Every word chosen by the game has been vetted against a set of consistent standards.
I started making the four-letter word list using this method and found that handcrafting your word list has drawbacks above and beyond the large amount of time you have to spend doing it. Everyone loves to supply arch comments on your word selections, and I hadn’t even hit the Internets yet! The word CAUL, for example, was the subject of some controversy, being near the beginning of the alphabet and likely to be selected by the prototype game at that point, with its limited word database. Of course, a CAUL is a troublesome biological “cover” that babies can be born with, but not everybody knows that.
This led to the conclusion that I needed “difficulty” levels if I wanted my CAUL. So now I needed to have multiple lists. And if you’ll allow me to get anatomical for a moment because this came up early in the alphabet: what about the words BARF and SNOT? These are actual words, okay, known to most people, and not on their own super medical or anything. (It’s easy to dismiss AGUE, a type of fever, as being too medical to allow at all, but not so much for those particular words for liquids that your head might emit.) It’s somewhat clear that you don’t want your family game to be expecting you to dig out the word BARF from its grid of mixed-up letters, but what if the player can complete the puzzle while using that word?
It became clear that, in addition to EASY words and HARD words (like CAUL!), I also needed a category of words such as DON’T USE BUT ALLOW. And while I was at it, I thought it might be fun to have a category for real curse words so that the game can give an achievement to people who complete puzzles using these words. After all, most of them are four letters, right?
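To make the categories concrete, here is a rough Python sketch of how such a classification might be represented. The `Category` names, the `WORDS` table, and the `pickable_words` helper are all hypothetical names I made up for illustration. The sample entries use only classifications mentioned in this post.

```python
from enum import Enum


class Category(Enum):
    EASY = "easy"
    HARD = "hard"
    ALLOW_ONLY = "allow_only"  # the "DON'T USE BUT ALLOW" bucket
    CURSE = "curse"            # never picked, but could earn an achievement


# Tiny illustrative sample, not the real list.
WORDS = {
    "ACHE": Category.EASY,
    "CAUL": Category.HARD,
    "BARF": Category.ALLOW_ONLY,
    "SNOT": Category.ALLOW_ONLY,
}


def pickable_words(words, include_hard=False):
    """Return the sorted words the game may choose for a puzzle."""
    allowed = {Category.EASY}
    if include_hard:
        allowed.add(Category.HARD)
    # ALLOW_ONLY and CURSE words are never chosen by the game itself.
    return sorted(w for w, c in words.items() if c in allowed)
```

With this sample, `pickable_words(WORDS)` returns `['ACHE']`, and passing `include_hard=True` adds CAUL. BARF and SNOT never come back from this function under either setting.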
I wrote a utility application to assist in the creation of the list. It runs on my PC, sorry Mac lovers. Below is a screenshot.
When constructing the list, I can double-click on any word to go to a dictionary web site to see the definition. In most cases, I try to write a description of why I either included or excluded the word, though by the time I get to the “Z”’s that may go out the window. In any case, I can easily change my mind about a word (perhaps based on user feedback) and can recreate the word database files used by the game. I can change ACHE from an easy word (my original classification) to a hard word with just a few clicks.
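The "recreate the word database files" step could be sketched along these lines in Python. This is only a guess at the shape of the thing: the one-file-per-category layout, the category names, the notes, and the `build_word_files` function are assumptions of mine, not a description of the real utility or its file format.

```python
from pathlib import Path

# Hypothetical category names; "excluded" words are kept in the master
# list with their notes but never written to any game file.
CATEGORIES = ("easy", "hard", "allow_only", "curse")

# Hypothetical master list: word -> (category, reviewer note).
master = {
    "ACHE": ("easy", "everyday word"),
    "CAUL": ("hard", "real word, but not everybody knows it"),
    "AGUE": ("excluded", "too medical"),
}


def build_word_files(master, out_dir):
    """Write one sorted word file per category and return the paths written."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = {}
    for category in CATEGORIES:
        words = sorted(w for w, (c, _note) in master.items() if c == category)
        path = out_dir / f"{category}.txt"
        # Always rewrite every file so a reclassified word leaves no stale copy.
        path.write_text("".join(w + "\n" for w in words), encoding="utf-8")
        paths[category] = path
    return paths
```

Changing my mind about ACHE would then amount to setting `master["ACHE"] = ("hard", "...")` and calling `build_word_files` again. Every file is rewritten on each run, so ACHE disappears from the easy file and shows up in the hard one.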
This has worked out pretty well. Only players who bump the word difficulty above EASY will ever be asked to discern harder (or less-common) words like CAUL. And lastly, if the player finds a word that differs from the one chosen by the game but is still valid (for example, POET and LOOM when the game picked POEM and LOOT), the full dictionary can be brought to bear in rewarding this creativity rather than penalizing it. And the same goes for common-but-icky anatomical words and curse words.
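As a rough illustration of that last point, here is a small Python sketch of checking a player's word against the full list instead of only against the game's pick. The `judge` function, its result labels, and the `FULL_DICTIONARY` set are hypothetical. The sketch also assumes that something else (the puzzle grid) has already confirmed the letters fit.

```python
def judge(guess, picked, full_dictionary, curse_words=frozenset()):
    """Classify a player's word as 'match', 'curse', 'alternate', or 'rejected'."""
    guess = guess.strip().upper()
    if guess == picked:
        return "match"      # exactly the word the game chose
    if guess in curse_words:
        return "curse"      # valid, and might earn an achievement
    if guess in full_dictionary:
        return "alternate"  # not the game's pick, but still a real word
    return "rejected"


# Illustrative word set built from the examples in this post.
FULL_DICTIONARY = {"POEM", "POET", "LOOT", "LOOM", "BARF", "SNOT", "CAUL"}
```

With this sample set, `judge("poet", "POEM", FULL_DICTIONARY)` returns `"alternate"`, while a non-word such as `"ZZZZ"` comes back as `"rejected"`.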