Some words on move generation and evaluation in Elise

The true value of a move in Scrabble® (or any similar crossword game) can be thought of as the sum of various components. First and most basically, there is its score. If you are able to "play out" and end the game with the move, then the score component is the only one that matters -- you will win, lose, or draw based entirely on the score. Generally, the higher the score the better the move; but if the game is not going to end immediately after the present move, there are many exceptions. For example, in English-language play you can tack an "S" onto almost any singular noun or infinitive-form verb, and add another point to your play's score. But the "S" is such a versatile tile that it should not normally be spent to gain a single point now. You should wait in hopes of making a much better play with your "S". The blank is an even more valuable example: it is so useful that it should almost always be held until you can make a "bingo", or a similarly high-scoring, play with it.

That brings us into the second component of a move's value: what it leaves on your rack. The tiles a move leaves on your rack say a lot about your next move's expected value. In fact, if the end of the game is not imminent, the value of your rack leave is almost as important a component as the actual score of your move. If you play one tile and leave six vowels, your next move will probably be weak; if you leave something like ERT? on your rack ("?" representing the blank), you will probably have many strong options for your next move.

We can extend the idea and say that, not only are the tiles left on your rack important, but the composition of the well (the bag of remaining tiles) also. For example, if only ten tiles remain in the bag and both blanks have not been seen, it may make sense to make a move playing five tiles now in hopes of drawing one or both blanks, even if a move that uses three tiles has better score and a better rack leave.

Calculating the value of a rack leave, while perhaps more complicated than calculating score, is fairly straightforward. The value of each different rack can be estimated by playing many millions of games. (Having the speed of a computer is helpful, isn't it?) Then, the value of any rack leave can be estimated from the full-rack data. For greatest accuracy, the value of a partial rack can be estimated by finding all possible "full racks" containing it, using the tile distribution of the remaining unseen tiles to fill the partial racks. This takes into account the composition of the well, as well as the remaining composition of your rack.

The next component to consider, and perhaps the most difficult to quantify, is the value of the board left by the move. For example, a move that opens up a triple-triple word score creates an unlikely enormous scoring opportunity as well as many much more likely good scoring opportunities (moves that close one of the triple word scores.) Since your opponent gets the "first crack" at these new scoring opportunities, opening a triple-triple is necessarily a risky move and should only be done if the potential reward of the open board outweighs the risk.

Many board leave considerations could (and I'd say "should") be thought of as changing "risk" more substantially than the potential expected value: some board considerations may not change the expected value of our next play much, but could greatly broaden the expected range in value of both our and our opponent's next plays. Obviously, if we are far ahead, potentially allowing our opponent to score big and narrow the gap is unacceptable. However, if we are far behind, we want to create scoring opportunities, and the additional risk is something we can bear; we cannot win unless we create potential big plays. We run the risk that our opponent gets them before we can, and thus fall further behind, but that's life.

The open triple-triple is only the most extreme case; almost any move has some board ramifications, affecting both risk and expected value, to consider. For example, if a move plays a tile like "E" on a triple letter score, that takes away an opportunity to hook an "X" or "J" there; a move played alongside another usually reduces the number of chances for playing a "bingo" (all tiles on your rack) later, etc. Calculating the expected value of a board is a complicated and nuanced process, and while Elise is aware of the more important cases, there are without a doubt many potential smaller cases that could be considered.

Move evaluation in Elise

As a basis for comparison, let's consider the simplest, most straightforward way to evaluate moves in crossword puzzle games. This is simply to generate all possible moves, and choose a move with the highest score as the "best" move. This move evaluation algorithm, which only takes into account one component of a move's value, we will call the "naive" move algorithm.

The most basic quick move generation algorithm used in Elise is called the "hi-leave" algorithm. As you might guess from the name, this algorithm takes into account the second component, the values of rack leave. However, it also includes some basic estimates for the third (board) component: it performs a fast evaluation of possible opponent hooks, open bonus squares, open bingo lines, has some endgame timing considerations, and even plays a quite good endgame (although it does not do an exhaustive endgame search). It does all of this while being basically instantaneous; Elise can generate many thousands of hi-leave moves in a second, even on a single processor.

This is the least sophisticated move evaluator in Elise, but against the "naive" approach, "hi-leave" will win approximately 65% of games, and tie about one half of one percent more. Even a basic consideration of rack and board leave, it can be seen, produces a substantial improvement in game play over simplistic score-based approaches.

Then there is the next step up, the "best fast" method, which is used by "Quick move find" (Shift+Q) in the Elise GUI. This, along with having all of the smarts of basic hi-leave, also can utilize the exhaustive-search end-game code, includes deeper second component (rack leave expectancy) analysis that takes into account well composition, and has some additional third component (board position) adjustments that are slighly more computationally expensive. Even so, "best fast" move find is also basically instantaneous, with the exception of difficult end-game positions.

"Best fast" move generation beats "hi-leave" generation by almost as sound a margin as "hi-leave" beats "naive" high score -- about sixty percent to forty percent. Using "best fast" moves, with no simulation at all, Elise will present a challenge to seasoned tournament players.

Then there is lookahead -- simulation -- deep move find. Whatever you want to call it, Elise can do thorough simulations many ply deep speedily. It will use all of the processing power you care to throw at it; Elise really does play better on better hardware. In simulation, Elise plays ahead in the game a specified number of moves, using the smarts of both "hi-leave" move generation algorithms to determine move responses, and carefully evaluates the expected point spread or winning probability of each move. It repeats the process thousands of times with differing rack draws, and plays the move that gives it the best outcome.

To be furtherly cruel, in simulation, Elise can construct statistical models of opponent's rack leaves based on known moves. This information is used to crush you badly. It's possible to fool Elise's tile guesses, but to do it consistently you have to play quite suboptimally; I wouldn't try it. On top of that, Elise in simulation uses specialized evaluation functions to evaluate the board's likeness to a won endgame position. It's good at finding its way there. Simulation can maximize spread (expected score difference), or win probability. At 3 ply, I estimate that Elise is stronger than any human player. At higher ply levels it quickly becomes better than any other current computer engine -- I'm quite confident in saying that.

There is a lot of chance involved in games such as Scrabble®. It's hard to win against an even passably decent player with good tile luck -- who gets both blanks, the X, the Z, the Ses, etc. But Elise, at 3-ply simulation, not even close to its full strength, will beat the "naive" algorithm almost all of the time, good tile draws or bad tile draws. It's fun to watch -- if sometimes a little unsettling.

Closing

It should be said that simulation picks up much of the strength of considering the second and third components -- rack leave and board value -- by its nature. By looking ahead a few moves, it "sees" the range of possible new racks made available by making a move, as well as the possible scoring opportunities that are opened or closed by playing the current move on the board. Accordingly, you could build a simulation based on even the "naive" move evalution algorithm and still have a pretty strong engine at the end of the day. However, by building the rack-leave and board-leave smarts into the fast move evaluation and allowing simulation to avail itself of them, this gives simulation the ability to see how a game unfolds against a powerful player; it gets a much more realistic notion of the scoring opportunities available to it than it does if it's playing against a "naive" evaluator. Interestingly, not only does it play a much stronger game as a result, but it is often able to find its move faster -- against a better opponent Elise often finds less variance in expected point spread or win probability, allowing it to converge on the best move more quickly.

This situation is similar to the role of move evaluation in chess engines. The strongest, smartest chess engines of twenty or thirty years ago often had relatively primitive move evaluation, based on calculated piece values. Their strength came mainly from their incredible hardware, allowing them to search many ply (half-moves) ahead. They often could not prune their move trees until after a very deep search. Today's best chess engines have a significant amount of smarts built into their move evaluators (including chess concepts like pawn structure, situational piece values, and squares of control) which allow them to converge on the best move more efficiently. The improvement in gameplay is such that, even on today's PCs (which, while powerful, are not comparable to the multimillion dollar supercomputers of the 1990s), the best human grandmasters will be hard-pressed even to draw against the best modern chess engines.

Now, chess is a game of perfect information, not a mixed game. In a mixed game, a game of both skill and luck, where it is computationally infeasible to enumerate every possibility, proper move evaluation is, if anything, far more important. Elise, through both its cunning move evaluation and taking advantage of highly optimized parallel processing, seeks to be the crossword-game equivalent of the very best chess engines.

Back to the main page