Traversable Game State/Engine Code

by **KrazyTheFox** » 07 Sep 2014, 23:35

Hey guys,

I've been toying around with some code and data structures that would let Forge have a fully traversable game state and I have an extremely rudimentary implementation currently that kind of functions as a proof-of-concept (you can pass priority and that's it). The goal behind this would be to allow for the creation of minimax AI that would be a whole lot smarter than what we have now (while simultaneously allowing perfect playback of previous games and undoing of any action).

The way it works in my example code is as follows:

The Game has a Stack and a LinkedList: one for GameEvents (List) and another for GameActions (Stack). The List is stored in another class that tracks the index of the last executed Event.
A GameAction is a single command (think command pattern) that changes the state of one thing and can undo that change.
A GameEvent (in this context; forget about the existing GameEvent classes for now) is a collection of GameActions that are all executed in order, allowing for complex interactions with small parts.
When something in the game happens, nearly all of the time a GameEvent will be passed to the game to be processed. The game creates an EventMarkerAction, signifying the start of a new event (important later) and puts it on the GameAction stack. The GameEvent is then told to execute and put its own GameActions on the stack.
To undo an event, the GameEvent list is told to move its index back one and GameActions are popped off the stack and are undone until an EventMarkerAction is removed. Once an EventMarkerAction is removed, we know that a full Event has been undone and we can safely continue.
If a GameEvent is undone, it can be redone by grabbing the next GameEvent from the event list and re-applying the Actions to the game state.
If a GameEvent is undone and a new GameEvent is executed (the player does something different instead of redoing a GameEvent), the GameEvents ahead of the current index are removed and the new GameEvent is put on the end.
For actions such as shuffling a deck that require random numbers, I've created a "ForgeRandom" class that tracks the seed of the Random generator so that it can be set to the same thing every time an action needs redoing.

Since all this is a lot to take in and is probably confusing since I'm awful at explaining things, I'll go through the start of an example game real quick (lots of text warning) right here. This is the output of a simulated game using some simple actions and events (no AI, input is hardcoded) and is likely much easier to follow than anything I could put into text.

Now, here's the same game, except after it finishes (gets to the point above), I undo every action back to the beginning of the game.

Finally, here's the same game again, except after it finishes, I undo every action back to the tapping of the forest and decide to cast a Llanowar Elves card instead.

A similar concept to the EventMarkerAction can be implemented to rewind the game state for AI if necessary, though a better way (possibly?) would be to clone() the game (which clones the queues/stacks) and then add new events to those. It would really depend on memory vs. processing time.

Obviously, this wouldn't be a simple task as every effect, trigger, etc would need rewriting to use this (or a similar) system. It might be something I just work on every so often, but I'm not sure I could ever complete it on my own while I'm employed.

So at this point I'm mostly just wondering what people think and if they have any ideas/comments.

The source code of the above examples is here if you want to take a look: https://github.com/KrazyTheFox/Forge-Engine-Example

by **excessum** » 08 Sep 2014, 00:22

I think that this is an excellent idea. This should greatly improve and simplify the AI logic for combat and stack logic for stuff like removal and pumping. The existing code involves calling long sequences of functions over multiple classes for those tasks and having the option to just run a sequence through a simulation will be a great improvement.

Just as a simple example, this will probably eliminate most of the bugs with the AI enchanting/pumping/equipping/etc. "useless" creatures (Pacifism/Encrust etc.) by checking that the recipient after the "sequence" is useful instead of adding the checks in each of the respective effect APIs.

by **KrazyTheFox** » 08 Sep 2014, 00:48

excessum wrote:I think that this is an excellent idea. This should greatly improve and simplify the AI logic for combat and stack logic for stuff like removal and pumping. The existing code involves calling long sequences of functions over multiple classes for those tasks and having the option to just run a sequence through a simulation will be a great improvement.

Just as a simple example, this will probably eliminate most of the bugs with the AI enchanting/pumping/equipping/etc. "useless" creatures (Pacifism/Encrust etc.) by checking that the recipient after the "sequence" is useful instead of adding the checks in each of the respective effect APIs.

The idea would be to do something similar to both Magarena and Duels of the Planeswalkers as far as AI goes. Basically take the game state, then get every conceivable action the AI can do in the current game state, clone the game for each action, then perform said actions in their own states. The game state would then be scored to find out if an action was advantageous. The states are then sorted by score and the process is repeated for each game state. Ultimately, you'd end up with a whole lot of states with scores. The AI would propagate those scores back up the tree and select the action with the highest total score, which would be "best" one. Not perfect by any means, but that's just the basics of it and it can definitely be improved.

It'll definitely fix a whole lot of problems with the AI pumping, blocking, etc incorrectly and will automatically pick up a lot of advanced interactions (for example, it'd lightning bolt your 6/6 and then attack with its 3/3 with first strike, keeping you from safely blocking).

That said, this will be much more processor intensive since we'll be simulating many hundreds (or even thousands) of actions at once. I'm hoping we can combat some of that by completely fixing the way static effects are handled. To keep this in check, we'd limit the amount of time the AI spends calculating states to X seconds. That way it'll be performant enough on both desktop and mobile and can be configurable for those who don't mind waiting for better AI so much.

by **drdev** » 08 Sep 2014, 01:04

This could also help a lot with the network implementation. Good luck and thanks.

by **KrazyTheFox** » 08 Sep 2014, 01:10

drdev wrote:This could also help a lot with the network implementation. Good luck and thanks.

That it should. I'll keep that in mind when I work on things.

by **excessum** » 08 Sep 2014, 01:45

KrazyTheFox wrote:The idea would be to do something similar to both Magarena and Duels of the Planeswalkers as far as AI goes. Basically take the game state, then get every conceivable action the AI can do in the current game state, clone the game for each action, then perform said actions in their own states. The game state would then be scored to find out if an action was advantageous. The states are then sorted by score and the process is repeated for each game state. Ultimately, you'd end up with a whole lot of states with scores. The AI would propagate those scores back up the tree and select the action with the highest total score, which would be "best" one. Not perfect by any means, but that's just the basics of it and it can definitely be improved.

It'll definitely fix a whole lot of problems with the AI pumping, blocking, etc incorrectly and will automatically pick up a lot of advanced interactions (for example, it'd lightning bolt your 6/6 and then attack with its 3/3 with first strike, keeping you from safely blocking).

That said, this will be much more processor intensive since we'll be simulating many hundreds (or even thousands) of actions at once. I'm hoping we can combat some of that by completely fixing the way static effects are handled. To keep this in check, we'd limit the amount of time the AI spends calculating states to X seconds. That way it'll be performant enough on both desktop and mobile and can be configurable for those who don't mind waiting for better AI so much.

If I am not wrong both Magarena and Duels of the Planeswalkers have far less cards than Forge so it is probably impossible to design a game state system similar to their's without chopping off AI support for thousands of cards. Limiting the game state system to just a single phase or "n" priority steps (to prevent exploding decision trees) is probably sufficient to greatly improve the current AI. Improving the static effect handling is going to be a beast by itself, considering how long the Blood Artist trigger bug has been sitting in "Known Issues".

For your example, optimally the attack will attack with the first-strike 3/3 and then Lightning Bolt when the player/opponent tries to block or the reverse of blocking the 6/6 and then bolting. This should be possible if the game state evaluation extends over the full combat phase though the computation cost might be an issue. Bolting and then attacking is usually a terrible choice since the AI just wasted a card and mana to prevent the opponent from blocking ie. Mugging.

The other main advantage is to refactor a crap ton of old AI code so that it is easier for new developers to come in.

by **KrazyTheFox** » 08 Sep 2014, 02:02

excessum wrote:If I am not wrong both Magarena and Duels of the Planeswalkers have far less cards than Forge so it is probably impossible to design a game state system similar to their's without chopping off AI support for thousands of cards. Limiting the game state system to just a single phase or "n" priority steps (to prevent exploding decision trees) is probably sufficient to greatly improve the current AI. Improving the static effect handling is going to be a beast by itself, considering how long the Blood Artist trigger bug has been sitting in "Known Issues".

For your example, optimally the attack will attack with the first-strike 3/3 and then Lightning Bolt when the player/opponent tries to block or the reverse of blocking the 6/6 and then bolting. This should be possible if the game state evaluation extends over the full combat phase though the computation cost might be an issue. Bolting and then attacking is usually a terrible choice since the AI just wasted a card and mana to prevent the opponent from blocking ie. Mugging.

The other main advantage is to refactor a crap ton of old AI code so that it is easier for new developers to come in.

I think you're thinking about this wrong. Instead of trying to figure out AI for each card, the AI will simulate playing the card, then it'll evaluate the board by assigning scores to things. For example, losing the game would get a score of, say -Integer.MAX_VALUE. A creature on the opponent's board would get -1. A creature on its own board would get +1. Life total of 20 would get +20. Things like that. Evaluating the board like this will help the AI determine what the best course of action would be by steering away from poor scores, regardless of what actually causes it.

I will admit that my example was a poor one and you should add another, capable blocker to the AI's side that can deal with the 6/6, give the player 3 life, then it'll have the desired behavior. The point here is that we wouldn't be telling the AI what to do, but rather, it'd be figuring it out on its own.

by **friarsol** » 08 Sep 2014, 02:19

The key to comparing Game States is the weighting. How do you properly weight cards the opponent controls that it doesn't know how to use? If the Human has two 3/3 creatures, how does the AI know which one to cast Temrinate on?

by **KrazyTheFox** » 08 Sep 2014, 02:36

friarsol wrote:The key to comparing Game States is the weighting. How do you properly weight cards the opponent controls that it doesn't know how to use? If the Human has two 3/3 creatures, how does the AI know which one to cast Temrinate on?

Assuming they're identical 3/3s, it'd just pick one at random (assuming that killing a creature is the highest scoring option at the time). If one has, say, flying, and the AI has nothing that can block it, then it should be able to identify that it needs to kill the flyer by simulating the future attack.

The actual scoring methods will need plenty of adjusting and tweaking to get priorities right, but that's the easy part.

Edit: Another thought just occurred to me. We could give AIs different profiles by adjusting the values assigned to various things on the board. A reckless AI might not care about its own life total, whereas a life gain AI would prioritize it.

by **excessum** » 08 Sep 2014, 02:53

KrazyTheFox wrote:I think you're thinking about this wrong. Instead of trying to figure out AI for each card, the AI will simulate playing the card, then it'll evaluate the board by assigning scores to things. For example, losing the game would get a score of, say -Integer.MAX_VALUE. A creature on the opponent's board would get -1. A creature on its own board would get +1. Life total of 20 would get +20. Things like that. Evaluating the board like this will help the AI determine what the best course of action would be by steering away from poor scores, regardless of what actually causes it.

I will admit that my example was a poor one and you should add another, capable blocker to the AI's side that can deal with the 6/6, give the player 3 life, then it'll have the desired behavior. The point here is that we wouldn't be telling the AI what to do, but rather, it'd be figuring it out on its own.

What I meant was those other engines implement less cards so they can afford to run minimax over more phases to get better AI coverage as you described. Trying to run minimax for cards like Chord of Calling and Thoughtseize is going to be a nightmare if you go deeper than one "step" (ie. do not consider the ETB triggers of the fetched creature, discard effects like Loxodon Smiter). This is not even considering the real AI killers like Steam Augury with combinations of choices.

Just applying minimax for Forge's implemented cards will involve all of the above so you will get exploding decision trees.

Chord of Calling = choose what to tap (lands and creatures), what to fetch, fetched creature's ETB/static effect, combat options. Being an instant, all of the above will happen for each priority step...

by **KrazyTheFox** » 08 Sep 2014, 02:56

excessum wrote:
KrazyTheFox wrote:I think you're thinking about this wrong. Instead of trying to figure out AI for each card, the AI will simulate playing the card, then it'll evaluate the board by assigning scores to things. For example, losing the game would get a score of, say -Integer.MAX_VALUE. A creature on the opponent's board would get -1. A creature on its own board would get +1. Life total of 20 would get +20. Things like that. Evaluating the board like this will help the AI determine what the best course of action would be by steering away from poor scores, regardless of what actually causes it.

I will admit that my example was a poor one and you should add another, capable blocker to the AI's side that can deal with the 6/6, give the player 3 life, then it'll have the desired behavior. The point here is that we wouldn't be telling the AI what to do, but rather, it'd be figuring it out on its own.
What I meant was those other engines implement less cards so they can afford to run minimax over more phases to get better AI coverage as you described. Trying to run minimax for cards like Chord of Calling and Thoughtseize is going to be a nightmare if you go deeper than one "step" (ie. do not consider the ETB triggers of the fetched creature, discard effects like Loxodon Smiter). This is not even considering the real AI killers like Steam Augury with combinations of choices.

Just applying minimax for Forge's implemented cards will involve all of the above so you will get exploding decision trees.

Chord of Calling = choose what to tap (lands and creatures), what to fetch, fetched creature's ETB/static effect, combat options. Being an instant, all of the above will happen for each priority step...

Ah, yeah, I see how that could be a problem. I'm not immediately sure how this would be fixed, but I can say for sure that we have plenty of time to figure it out. I don't expect any of this to be done for at least a couple months, if not much longer.

by **mastroego** » 08 Sep 2014, 21:27

Speaking purely as a curious observer/ex amateurish coder.

If I get it right you have "Steps" (each point in a sequences of actual decisions), i.e. a Level in the Decision Tree, and singular potential "Choices" (which can be encountered in any number even on a specific "Step", because of cards that offer many different possibilities), i.e. the elements that populate the tree.

What I'd instinctively consider possible is to use a sort of "hard counter" for the Choices. Before running the simulations the AI could count how many Choices are possible at the current Level, and if that number exceeds a fixed parameter (possibly adjustable in the Options panel), the AI stops the iteration at the previous level, and uses whatever scores it obtained already or, at worst, reverts to default "dumb" Forge behavior.

Even with a relatively low value of max choices, you'd still get a substantial improvement in the perceived AI for the most frequent cases, and even the hard ones could be handled at least "decently" if the set value allowed for a depth of a couple of Steps. I know, the exponential behavior is dreadful, but a middle-ground can be probably found: after all we ARE willing to give some time to think to a human player.
You could even set a different (lower) max value for instants, since they have the potential of creating more troubles (as in the above example), accepting the trade-off with performance/playability.

Ok, this kind of considerations are probably totally trivial to you, and/or contain idiocies.
If either of the above is true, just disregard this post

by **jeffwadsworth** » 09 Sep 2014, 16:56

I have been looking over the AI code of XMage. Aside from being very difficult to follow, it only checks for moves in one phase. Your code looks like a nice stepping stone to getting it on the right track. Please keep up the good work.

by **Agetian** » 09 Sep 2014, 17:11

This sounds like an absolutely amazing idea! Best of luck with the implementation!

- Agetian

by **KrazyTheFox** » 09 Sep 2014, 21:58

mastroego wrote:Speaking purely as a curious observer/ex amateurish coder.

If I get it right you have "Steps" (each point in a sequences of actual decisions), i.e. a Level in the Decision Tree, and singular potential "Choices" (which can be encountered in any number even on a specific "Step", because of cards that offer many different possibilities), i.e. the elements that populate the tree.

What I'd instinctively consider possible is to use a sort of "hard counter" for the Choices. Before running the simulations the AI could count how many Choices are possible at the current Level, and if that number exceeds a fixed parameter (possibly adjustable in the Options panel), the AI stops the iteration at the previous level, and uses whatever scores it obtained already or, at worst, reverts to default "dumb" Forge behavior.

Even with a relatively low value of max choices, you'd still get a substantial improvement in the perceived AI for the most frequent cases, and even the hard ones could be handled at least "decently" if the set value allowed for a depth of a couple of Steps. I know, the exponential behavior is dreadful, but a middle-ground can be probably found: after all we ARE willing to give some time to think to a human player.
You could even set a different (lower) max value for instants, since they have the potential of creating more troubles (as in the above example), accepting the trade-off with performance/playability.

Ok, this kind of considerations are probably totally trivial to you, and/or contain idiocies.
If either of the above is true, just disregard this post

I think you're on the right track, here (Events are collections of Actions and Actions do only one thing; cards, abilities, player input is all submitted to the game in the form of Events). For things with many choices, it's likely that I'll default to the current AI to select a bunch (say, 10) of the "best" options and explore those. This AI could be improved over time, as, like you said, just about anything would be an improvement.

jeffwadsworth wrote:I have been looking over the AI code of XMage. Aside from being very difficult to follow, it only checks for moves in one phase. Your code looks like a nice stepping stone to getting it on the right track. Please keep up the good work.

Thanks! I'll try to keep it moving along.

Agetian wrote:This sounds like an absolutely amazing idea! Best of luck with the implementation!

- Agetian

I just hope it turns out the way it works in my head.

Thanks for the encouragement!

Traversable Game State/Engine Code

Traversable Game State/Engine Code

Re: Traversable Game State/Engine Code

Re: Traversable Game State/Engine Code

Re: Traversable Game State/Engine Code

Re: Traversable Game State/Engine Code

Re: Traversable Game State/Engine Code

Re: Traversable Game State/Engine Code

Re: Traversable Game State/Engine Code

Re: Traversable Game State/Engine Code

Re: Traversable Game State/Engine Code

Re: Traversable Game State/Engine Code

Re: Traversable Game State/Engine Code

Re: Traversable Game State/Engine Code

Re: Traversable Game State/Engine Code

Re: Traversable Game State/Engine Code

Who is online

Main Menu

User Menu

Our Partners

Who is online

Who is online

Main Menu

User Menu

Our Partners

Who is online

Login Form