Notes about playing phonies with Elise

In order to make Elise a more entertaining and "human-like" opponent, or in order to help instill more doubt and fear into the experience of playing against Elise (I haven't decided which), as of version 0.1.8 I have given Elise the ability to play phonies in a mostly intelligent way. To play a game with phonies, you must choose "American English with phonies" or "International English with phonies" as the game type. (Alas, I do not have long lists of plausible phonies for other languages -- if you would like to contribute to such an effort you are more than welcome to!)

The first thing I would like to note is that Elise, in many games, will not play a phony. Elise has a rich vocabulary of phonies -- it knows well over 200,000 words not legal for play in either TWL or CSW. They are all English words, and are either headwords found in at least one unabridged dictionary, or plurals or conjugations of headwords. In almost any board position, Elise will find multiple phonies that, if they stay, are nearly as good as or better than the best legitimate plays. It will often find phonies that score much better than any legal word. But Elise seldom actually plays a phony -- in most cases, the risk of having the move challenged off the board, and the cost when that happens, make the phonies less valuable than the best allowable moves. Elise is most likely to break out the phonies while far behind or far ahead, or in a losing endgame position.

That said, Elise can go several games without playing a phony, and then put down three phonies in one game -- I've seen that happen, so watch out!

Playing a phony and having it challenged off the board hurts in multiple ways: first, you lose your turn, and second, your opponent gains knowledge of some or all of your rack. This can be especially damaging if you tip your opponent to the presence of blanks or power tiles on your rack. It is also more damaging in situations (such as close late games) where the difference between the best moves and the worst moves is large. Elise takes all of this into account when simulating, and thus is only likely to play a phony when it is much, much better than other options!

Elise becomes bolder and more willing to play phonies the stricter the challenge settings become. If you are playing with "double challenge" (an unsuccessful challenger loses his or her turn), Elise will play phonies more frequently. Similarly, a "penalty challenge" with a high penalty for unsuccessful challenges will make Elise bolder. If you are playing "single challenge" (effectively free challenges), Elise will almost never play a phony.

Let's look at some positions from a game, and Elise's valuation of phonies in those positions:

The game has just begun, Alvin versus Simon, using the TWL06 lexicon. Simon is to play, with the rack EHIOPV?. There are no playable (legal) bingos. There is the phony OVErHIP* playable at 9I or 7I for 75 points, and the phony OVErWHIP* I4 for 72 points. The highest scoring non-phonies available are in the low twenties: POH 9F (24 pts), HOVE 7H or 9H (22 pts), HOP 7H (20 pts).

Opponent Alvin's opening move does not reveal much about his rack; JAW is a pretty likely opener whether the remaining four tiles are good or bad.

Elise in this position favors VOE 9F (19 pts) greatly over any of the phonies. It gives VOE a winning probability of approximately 55%. OVErWHIP* is just under 49%, while both OVErHIP* plays are in the 47-48% range. (For this game, Elise's analysis assumes a penalty challenge in which an incorrect challenge gives the opponent 10 points. If stricter challenge settings are in effect, phonies are more valuable relative to legal plays, while with more lenient challenge rules, phonies are much less valuable.)

The fact is that playing one of the phony bingos here is quite weak -- Simon still has excellent bingo tiles after VOE, it's 19 points that can't be challenged away, and Simon does not reveal to Alvin that he has a blank! If Simon played one of the phonies and it was challenged off the board, Alvin (assuming he's sensible) would know Simon's rack and play defensively, blocking Simon from bingoing for one or more turns.

Although the phonies score more than 50 points better than any of the legitimate moves, they aren't the best play in this position. If Simon plays a quieter move like VOE, there will quite likely be a legal 70+ point move on the next turn. There is no compelling reason to play one of the phony bingos, and they are costly moves if they are challenged.

Here is a position from a few moves later in the same game:

Simon has 132 points and Alvin has 85 points. Alvin is to play. As you can see, Simon played VOE, and on the next turn was able to make the (legal) 98-point bingo THOlEPIN K5, using the double-double lane opened up by Alvin's TACO 7H. Had Simon tipped off the presence of a blank on his rack, Alvin would hardly have made such a juicy opportunity available!

Alvin's rack is ACILNNR, which is quite terrible on this board: 12.3 points below average. Simon's last move was LOUIE 8K 15. What should Alvin do here? Again, there is no legal bingo available, but Alvin does have ENCRINAL# O8 83 playable -- legal in CSW, but not in TWL. The best legal moves all score poorly: CAPLIN 10I 14, LUNAR M7 7, REIN 9J 10, CAIRN N6 15, URN J13 5, RIN 11J 3...

In this case, Elise would favor the phony bingo. It gives ENCRINAL# O8 a 39.4% winning probability, versus 31.8% for the best legal move, CAPLIN 10I.

Why is the phony bingo OK here? Alvin is quickly falling behind, and this rack on this board doesn't offer much relief. If ENCRINAL# is challenged, and Simon uses full knowledge of Alvin's rack, Alvin has only about a 20% winning probability. If Simon lets ENCRINAL# go, Alvin's average winning probability rockets to about 60%! So, if p is Simon's probability of challenging the bingo, playing the phony makes sense if (20% × p) + (60% × (1 - p)) > 31.8% (the winning probability of the best legal move). Rearranging, 60% - (40% × p) > 31.8%, so p must be less than 28.2 / 40 = 70.5%. In other words, Simon can challenge the phony fully seven times out of ten and Alvin still comes out ahead by choosing ENCRINAL# over CAPLIN 10I. It is a high-leverage play.
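The break-even arithmetic above can be sketched in a few lines of Python. This is only an illustration of the formula in the text, not Elise's actual code (Elise arrives at its numbers by simulation); the function names are made up for this example, and the three probabilities are the approximate figures quoted above.

```python
def phony_win_probability(p_challenge, win_if_challenged, win_if_stays):
    """Expected winning probability of a phony, given the chance it is challenged."""
    return p_challenge * win_if_challenged + (1.0 - p_challenge) * win_if_stays


def break_even_challenge_probability(win_if_challenged, win_if_stays, win_best_legal):
    """Challenge probability below which the phony beats the best legal move.

    Assumes the phony is worth more when it stays than when it is challenged off.
    """
    return (win_if_stays - win_best_legal) / (win_if_stays - win_if_challenged)


# Approximate figures from the ENCRINAL# position (hypothetical helper names):
# 20% if challenged off, 60% if it stays, 31.8% for CAPLIN 10I.
p_star = break_even_challenge_probability(0.20, 0.60, 0.318)
print(f"break-even challenge probability: {p_star:.3f}")
```

Plugging the break-even value back into phony_win_probability gives 31.8% again, which is a quick check that the two functions agree with each other.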

Elise primarily likes the phony bingo here because the downside is fairly low. A 20% win probability is certainly less than the roughly 30% win probability offered by the best legal moves, but unless Alvin's rack improves dramatically the difference isn't huge, while if the bingo stays Alvin is actually winning. Simon is not likely to challenge this word if he isn't sure, since the game is still close and Alvin's tiles, while weak, don't immediately look as weak as they are. Perhaps Simon knows the word and knows it's only good in CSW -- perhaps Simon doesn't know the word and lets it go as plausible. Obviously, against an opponent with perfect word knowledge, playing a phony is (almost) never the best option.

In this case, much to Alvin's delight, Simon lets the phony go. He knows "encrinal" is a word, and in fact he eruditely quotes its definition as Alvin lays it down ("of or relating to encrinites, fossil crinoids of the genus Encrinus") but he does not know that it is not in TWL06.

Much later in the same game:

Alvin 342, Simon 357. The bag has just been emptied. Alvin's rack is EIMORST, Simon's rack is ALNRSU?. Alvin to play.

Without phonies, Alvin is out of luck in this position. He could try something like LIMO I1 9 (to block out-bingos by Simon) → AINS 2K 31 → TERMS 3F 8, but this loses by a solid 23 points. Simon could play well sub-optimally in this endgame position and still win.
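The 23-point margin can be checked with a small Python sketch. It rests on two assumptions that are not spelled out above: that going out earns twice the face value of the opponent's unplayed tiles (the usual rule), and that after AINS Simon is left holding L, R and U (plus possibly the blank, which is worth nothing either way). The function name and tile table are hypothetical, for illustration only.

```python
# Face values of the tiles assumed to be left on Simon's rack (blank = 0).
TILE_VALUES = {"L": 1, "R": 1, "U": 1, "?": 0}


def final_spread(score_a, score_b, plays_a, plays_b, leftover_b):
    """Final margin for player A when A goes out and B keeps leftover_b."""
    total_a = score_a + sum(plays_a)
    total_b = score_b + sum(plays_b)
    # Assumed rule: the player who goes out gains twice the opponent's leftovers.
    total_a += 2 * sum(TILE_VALUES[tile] for tile in leftover_b)
    return total_a - total_b


# Alvin 342, Simon 357; LIMO 9 and TERMS 8 for Alvin, AINS 31 for Simon.
spread = final_spread(342, 357, plays_a=[9, 8], plays_b=[31], leftover_b="LRU")
print(spread)  # negative means Alvin loses by that many points
```

Under these assumptions Alvin finishes on 365 to Simon's 388, which matches the 23-point loss described above.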

If you allow Elise to consider 4-letter phonies, then it will suggest two possibly winning moves: MORTISE E3 75 and TRISOME E3 75. Both of these words are valid, of course, but they both form EVOE# at 9E, which is not legal in TWL. Simon is almost certain to challenge this move whatever he thinks of it, but assuming that this is a friendly game where the final point spread does not matter, do you have a better idea?

Incidentally, Simon played another phony in this game (one Elise didn't agree with). Do you see it?

Further notes

Elise evaluates phonies more accurately at higher simulation ply levels -- this is true of all moves, but especially true of phonies, since a phony will sometimes cost a turn when the opponent challenges it. 3 ply is often too shallow. Elise plays well with phonies on a fast computer with at least 20 minutes on its clock, or with at least 5 ply of search depth.

Elise playing English with phonies uses quite a bit more RAM than normal. Move generation, since it's working with a super-sized lexicon of over 450,000 words, is also a little slower. This means that, in timed games, Elise will be slightly weaker in a game with phonies than otherwise. And, of course, against an opponent with very strong word knowledge, Elise with phonies will be weaker than Elise without. Playing with phonies is mostly an option for fun -- and why not? Isn't fun, ultimately, the point?

You can set the minimum length phony Elise will consider using the "Lexicon → Phonies..." menu option. By default, it's set to length 5. It can be set as low as 2, but in most positions setting this lower than 5 doesn't do much -- Elise is less likely to play a phony the shorter it is, and is extremely unlikely to play 2-4 letter phonies regardless of the circumstances.

If Elise can't win an endgame without phonies, it'll try to find a winning move with phonies. It would be foolish to play a phony from a winning endgame position -- Elise does not even consider phonies in that state.

Please note that, while Elise can and will play phonies with impunity whenever it thinks it might get away with it, it will challenge your phonies whenever it is to its advantage to do so. (This is at least ninety percent of the time.) After you play a phony, Elise will always say "hold" -- you don't get to see any new tiles until Elise has determined whether or not to challenge. If Elise thinks that letting the phony stay suits its purposes, it will choose not to challenge. In other words, if you play a phony and Elise doesn't challenge it, you are in for some immediate pain.

Have fun!

— CMS
