Simulation-based AI

by **Myrd** » 01 Jan 2016, 20:43

About a year ago, I started the implementation of a Simulation-based AI algorithm. The idea is to model it after Chess AIs - i.e. instead of using heuristics to figure out what to play (which is what the current AI does), making the AI simulate (using Forge's rules engine) what would happen as a result of a move and then evaluate that game state.

This way, the AI would be able to see the full effects of its moves and use that information in its decision making process. A simple example would be seeing that playing a creature when the opponent has a Soul Warden and Serra Ascendant with 29 life would cause the Serra Ascendant to become a 6/6 flier.

Some discussion regarding my initial implementation can be found in the Traversable Game State thread, but I'm starting this new thread, since I don't want to hijack that thread more than I already have. Although the goals of the two are similar, the distinction is my Simulation AI uses the existing Forge code base and rules engine, whereas that thread is focused on essentially building a new engine.

Anyway, with the above out of the way, I'd like to start the conversation about the current state of the Simulation AI and potential future work.

First, let me summarize the algorithm briefly. When deciding what Spell or Ability to play, the Simulation AI does the following:

Code: Select all: findBestSpellOrAbilityToPlay(game): for sa in all valid spells & abilities: if sa.needsTargets: for each possible set of targets: gameCopy = makeCopy(game) gameCopy.play(spellOrAbility, targets) score = eval(gameCopy) else gameCopy = makeCopy(game) gameCopy.play(spellOrAbility) score = eval(gameCopy) return sa with best score

Now, the above is greatly simplified and there's actually one other important part to it: recursion. Instead of checking the score for each ability it can immediately play, it will take the gameCopy where that ability was played and then see what other abilities are possible to play and simulate them. After some hard-coded recursion depth, it returns the ability that results in the best final score. So, it may find that the best ability is to activate a fetch land to get a land that produces a mana that is needed to cast the best spell that couldn't have otherwise been cast.

So, back to the current state of its implementation in Forge:
All the code for the current implementation is landed in Forge and it is generally functional (i.e. doesn't crash very often and is able to make reasonable decisions in most cases). While it's generally not ready for end user use, it can be experimented with in the desktop client by right-clicking on the "AI" radio-button and activating the Simulation AI checkbox that appears.

However, this current implementation does have a number of problems:
- There are still game states that it chokes on. You might see this as a "Game copy error". This happens when the simulation code detects that it didn't accurately copy the game state - i.e. when eval(game) != eval(gameCopy). When this happens, the code will print a diff output of the two states, which helps in figuring out the discrepancy. The fix is usually to update the copying code to take into account the extra state.
- Speed. Because the algorithm is recursive and abilities can have many targets it can take a long time for the AI to make a move when there are a lot of things on the battlefield. There are a number of ways to improve this, which I'll cover in a separate post.
- Decisions. Sometimes it will still make sub-optimal decisions. Often, this is because there are aspects that the new AI doesn't yet take into account (e.g. what phase to play spells, or how to best interact with combat, etc) - and sometimes the old heuristic based AI still makes better decisions in those circumstances.

by **Myrd** » 01 Jan 2016, 20:55

In terms of speed, the main problem is the current algorithm is essentially a depth-first traversal of the possible moves, with a hard-coded depth limit.

This quickly gets very slow when a) there's many possible spells and abilities and b) when there's many targets. Not only are we evaluating all possible orderings of the spells to play, but also multiplying that at each recursion stage by the number of possible targets of spells that have them. It gets unwieldy, very fast.

How can this be sped up:
- Pruning decision trees. Either with heuristics to just ignore bad actions (e.g. killing own creatures) or by memoizing state that was previously evaluated - e.g. if we found the best target for a spell previously, we shouldn't need to re-evaluate it at every stage of the recursion when considering that spell again. Of course, it gets tricky, because there may be new targets since the game state might be different now.
- Switching from a depth-first traversal of the decision tree to a bread-first traversal of it. Basically, queuing up new game states and simulating them in queue order, adding new sub-states to the queue as they're encountered. This way, we can set a fixed time limit on how long the AI should execute for and when that's reached, it will return the "best result so far". The challenge here is that this involves keeping many copies of the game state around in the queue, whereas depth first only needs as many copies as the max recursion depth. I believe doing this naively (i.e. without other things like pruning) will cause memory issues, as game state objects are not cheap RAM-wise. Still, I think this is the right thing to do, we just need to figure out how to get there.
- Speeding up Forge game copying and simulation. Since each step of the recursion involves making a copy of the game and then evaluating a spell/ability, speeding those parts up will speed up the Simulation AI algorithm. In particularly, I've recently found that the slowest part (e.g. something like 80% of the time) is taken by GameCopier code that copies cards via Card.fromPaperCard(). So, just fixing this part alone (i.e. by copying the live properties of cards rather than re-parsing them from the PaperCard object) will speed up this algorithm by 80%.

So the above are some of the best ways to speed it up to make it more useable when the game gets more complicated. I think realistically all of the above techniques will need to be applied to make Simulated AI "production ready".

Part of the reason I'm writing these posts is to document the state of the system and hopefully encourage others to contribute to it. I think it has a lot of potential to produce a really good magic AI that takes advantage of Forge's rule engine to decide how to play.

by **Marek14** » 02 Jan 2016, 10:37

What would be the result if you combine the approach with the current fast but not smart AI? For example, simulating several depth levels, then playing "normally" from there to determine the winner?

Second question is about hidden information. If you copy the game state, doesn't that mean that the AI will see the effect of its decision with hidden information revealed? For example, the simulation tells "if you cast this spell, opponent will counter it", and so it won't cast it?

To make hidden information work correctly, it would be necessary to fully randomize anything that is not accessible to AI before starting the simulation.

by **Myrd** » 02 Jan 2016, 16:48

Hidden information is definitely a problem.

Although, right now the AI doesn't try to simulate the opponenrs moves yet, it essentially plays solitaire in its simulation, so the counterspell example doesn't apply. (At least in terms of casting spells - it does try to predict possible attacks by the human.)

But other hidden information does - e.g. if he simulates an ability that makes him draw a card, he will know what card it is. I think long term it is something that's worth fixing, but I think it's not the top priority at the moment.

In terms of using the normal AI during its simulation - it actually does do some of this already - mainly because the new AI is an extension of the old AI code. So parts of it that have not been modified will make decisions the old way.

However, I'm not sure about using the new AI during the recursion. One problem with this is that the result won't be accurate in the sense that if there's a chain of moves X->Y->Z that the AI came up with that lead to the best solution, if the AI won't actually pick Y and Z after making X (due to differences in simulation vs. Heuristic AI) then the reason for choosing X might not be correct.

Although, perhaps it can be used just for target selection at recursion levels greater than 0. This will be a form of pruning that will likely help a lot when there are a lot choices.

by **timmermac** » 16 Jan 2016, 20:02

There are a couple of potential issues here that both relate to system requirements. First, with the desktop version, I believe we're requiring 2 GB of RAM. With your new AI system, how much more RAM will be needed, and will Java support more? This leads to the other issue. I am somewhat dubious regarding the usability of this new AI on the mobile platform, with its far more limited resources.