Next-gen AI considerations (long text)
If you are interested in a next-generation AI, let's start a discussion. I have dropped some hints here and there but never wrote anything down in detail, and I don't think many developers have experience with AI, so here is an article that orders my thoughts about AI (I am a little rusty in this field) and tries to explain the most important concepts in relation to MtG (well, and chess ^^). So if you are a user or programmer without any clue about AI, you are in danger of possibly learning something 
Actually it is not that bad, and having structured thoughts is the most important thing; I hope my article is structured and clear enough ^^. If you need deeper insight, read Russell/Norvig, "Artificial Intelligence: A Modern Approach" (AI in general) and Brachman/Levesque, "Knowledge Representation and Reasoning" (formal logic and KBs); the information here is from these books (but ultra-short and without the pictures and such).
I have to say that I don't know anything about how Forge works internally, and I don't know everything about Magic either, but who does...
Feel free to comment and add ideas; maybe we can use this article to plan this far enough that some people can implement a nice AI and add it to Forge.
First: the goal is to write a rational system that tries to win a Magic game with a given deck as quickly as possible, and it must not be a set of static rules (see below for why).
Second: let's call the AI player an agent from now on, because that is the common term in AI theory.
Every agent has a perception of the world (robots use sensors for that; we need some software interface as our sensor) and may interact with it. In this case, every MtG agent has the ability to read some game information during the game and can play cards. It has an updatable memory and the ability to choose the best action. In theory, a basic AI agent is defined (in pseudocode) as
Code:
    function agent(percept) {
        memory = updateMemory(memory, percept)
        action = chooseBestAction(memory)
        memory = updateMemory(memory, action)
        return action
    }
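To make that loop concrete, here is a minimal runnable sketch in Python; the percept format and the trivial "take the first legal action" policy are placeholders I made up, not Forge's actual interfaces:

```python
# Minimal runnable sketch of the agent skeleton above.
# The percept format and the "first legal action" policy are
# placeholders for illustration, not Forge's real interfaces.

def update_memory(memory, event):
    """Append whatever we perceived or did to the agent's memory."""
    return memory + [event]

def choose_best_action(memory):
    """Placeholder policy: pick the first legal action seen so far."""
    legal = [e for e in memory if e.get("type") == "legal_action"]
    return legal[0] if legal else {"type": "pass"}

def agent(memory, percept):
    memory = update_memory(memory, percept)
    action = choose_best_action(memory)
    memory = update_memory(memory, action)
    return memory, action

memory = []
memory, action = agent(memory, {"type": "legal_action", "play": "Forest"})
print(action)  # the only legal action seen so far: play the Forest
```

The rest of the article is basically about replacing choose_best_action with something smarter.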
As we want to improve the agent's ability to properly choose the best action, we have to think about the game first:
Let's think about the game setup on the kitchen table with real cards and real players: There are different zones that may or may not contain cards (deck, hand, graveyard, exile, command zone and maybe more).
After all players have shuffled their decks and drawn their first 7 cards, the game begins (I count the mulligan as part of the game, because the agent has to decide whether it needs to mulligan), and the current distribution of cards on the table can be seen as a specific, unique game state.
From now on, every action taken by a player creates a new game state (a mulligan, playing a land, casting a spell with that land's mana: everything changes the game state you had before). It is like a game of chess where you move the pieces and create new states.
As in chess, there are several possibilities each "turn": in chess you can draw (yes, on a piece of paper) the starting position as the root of a tree and then consider the possible moves for white. With the pawns and knights there are 20 possible game states you can move to on the very first turn (each of the 8 pawns can move 1 or 2 squares, and each of the 2 knights can jump to 2 positions). For each of those new game states the second player may answer with (in this case) 20 possible reactions, so we have a tree with 1 initial state at the root, 20 child states, each with 20 children of their own. That is a lot, and it covers only the first turn! Remark: both players could jump their knights back and forth, creating an infinite game, which might be a problem.
But let's assume we could create the complete tree of game states (even with infinite paths inside). Then we need a strategy to win. This means we mark which player "owns" which state by coloring every odd level white and every even level black, and then start a search (BFS, not DFS; remember the possible infinity of the tree!) to find a shortest path from the root of the tree to a state in which we win the game.
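On a toy tree this search looks like the following sketch (the tree, the looping branch and the win marker are invented; a real MtG tree would of course be generated lazily, not written down as a dictionary):

```python
from collections import deque

# Toy game tree: each state maps to its successor states.
# "W" marks a state in which we win. BFS finds the shortest
# root-to-win path level by level, so it is safe even though the
# branch via "a" loops forever, like the knights jumping back and forth.
tree = {
    "root": ["a", "b"],
    "a": ["root"],   # loops back: an infinite path
    "b": ["c", "W"],
    "c": [],
    "W": [],
}

def shortest_win(tree, root, is_win):
    queue = deque([[root]])
    seen = {root}
    while queue:
        path = queue.popleft()
        state = path[-1]
        if is_win(state):
            return path
        for nxt in tree.get(state, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no winning state reachable

print(shortest_win(tree, "root", lambda s: s == "W"))  # ['root', 'b', 'W']
```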
That was too easy, there must be a catch...
Compared to chess (which is already tough to crunch; remember all the chess tournaments against supercomputers?) you have even more branching possibilities in MtG, because there are over 20,000 cards with funny abilities and you might mix them randomly together into your deck...
Obviously the AI will have some obstacles to deal with, but more about that later.
Let's have a look at the AI we have right now in Forge:
If it were a chess AI, it would look at the board, push the first piece it finds to some valid square, and end its turn. Yes, it is a little more intelligent than that, but not much (more below).
The weakness of the current Forge AI is that it cannot see how the game will develop (well, like many human players ^^), and as you might guess, computing all the possibilities would make every home computer explode for just one turn: either the CPU melts, the player dies of old age, or 100 GB of RAM are not enough to store the tree of states, because the MtG tree also has infinite paths (if both players bounce and replay Azorius Chancery forever, the computation never ends, and all the clouds of the world have only a finite amount of RAM).
The current AI is called a reflex agent, because it acts according to a set of rules (the MtG rules plus some extensions the developers programmed into Forge, like: if you would play a land that returns a land to your hand (like Azorius Chancery) on the very first turn, don't do it, because that is stupid!).
A reflex agent is (in this use case) inferior to other models, because the ruleset is static (unlike MtG) and can never deal with all the cases you might throw at it (covering them all would require a very, very huge and complex ruleset that becomes outdated every time a new card comes out; no one wants to program that thing!).
Before I introduce another type of agent that will be superior, we have to look at the MtG kitchen-table setup (aka the world) first. After investigating it (I did, here is the solution), we find that, from the view of an agent, the world is
- inaccessible (not all information can be gathered, unless the AI cheats and looks through every pile of cards and all hands),
- deterministic up to chance events (each game state emerges from the previous one, remember the tree; shuffling, coin tosses and dice add randomness, more on that later),
- dynamic (all/most game zones change),
- sequential (the possible actions depend on the current state, and earlier decisions affect later ones),
- discrete (the model of a single game state is finite: we have a finite number of cards in a finite number of zones).
Now let's return to the agent: we still want to work out how it chooses its best action during the game, and we have discovered that the world is very dynamic (plus the other properties above). We already saw that reflex agents are not a good choice for picking the best action.
So here is another type of agent (two in fact, but they can be combined) that is much more useful as an opponent:
- The agent is utility-based (it knows parts of the world and how to reach its goal. It needs search or planning functions, and it can judge how happy it is with its work using a so-called utility function).
- The agent learns (a critic compares what the agent does with some performance standard, and the agent adapts. A problem generator proposes actions that might be less effective right now but open up better options later).
Before we can evaluate the game state (cards in zones, life totals and so on), the agent has to store this information in its memory (some sort of database), and every time something changes, the agent has to be able to update this information.
A database like that is called a knowledge base (KB), and the agent may also use it to deduce new information (by resolution: https://en.wikipedia.org/wiki/Resolution_%28logic%29 ; the article is not great, but if you have never heard of resolution, you can get a first impression there).
This requires two additional functions:
Code:
    function ask(KB, question) { ... }
    function tell(KB, fact) { ... }
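A minimal tell/ask pair over a plain fact store could look like this (the tuple-based fact format is my own simplification; a real KB would also deduce new facts by resolution instead of doing exact lookups):

```python
# Minimal knowledge-base sketch: tell() stores facts, ask() checks them.
# Facts are plain tuples like ("in_zone", card, zone); this is a toy
# simplification, a real KB would reason over logical clauses.

def make_kb():
    return set()

def tell(kb, fact):
    kb.add(fact)

def ask(kb, question):
    return question in kb

kb = make_kb()
tell(kb, ("in_zone", "Azorius Chancery", "battlefield"))
print(ask(kb, ("in_zone", "Azorius Chancery", "battlefield")))  # True
print(ask(kb, ("in_zone", "Azorius Chancery", "graveyard")))    # False
```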
The KB can be created fresh for every single game or saved persistently on disk. The latter simulates how a human player learns during play, or across play sessions with the same opponent, and increases the difficulty.
The KB has to be filled with the following information about every team and player:
- Save all game logs (to be able to learn card combos and the behaviour of the players, and to choose appropriate (counter)strategies), including life and poison counter history, what attacked/blocked what, etc.
- Cheating AI: all cards and their exact positions in the game are known at all times. This KB contains perfect information and thus helps create the most difficult player to deal with (ever played against god? ^^).
- Non-cheating AI: remember every card that was ever revealed and in which zone it is or could be now, including the position's probability, i.e. try to approximate the perfect KB by considering
  - Graveyards, Exile, Command Zone, ...
  - Own hand
  - Enemy's hand (revealed cards and the number of unknown cards)
  - Decks:
    - in the first round: only your own cards, without positions; optionally all cards from your own team
    - in later rounds: all cards that were revealed so far (previous games, this game).
(I am thinking of a class AIKnowledgeBase or similar, with appropriate containers for each zone of each player of each team, where cards, actions and results are saved.)
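As a sketch, such an AIKnowledgeBase could start as nothing more than nested containers (all names and zones here are invented for illustration, not Forge code):

```python
from collections import defaultdict

# Sketch of per-team / per-player / per-zone containers.
# Class name, zone names and the record format are all invented
# for illustration, not taken from Forge.
class AIKnowledgeBase:
    ZONES = ("library", "hand", "battlefield", "graveyard", "exile", "command")

    def __init__(self):
        # zones[team][player][zone] -> list of known cards
        self.zones = defaultdict(
            lambda: defaultdict(lambda: {z: [] for z in self.ZONES})
        )
        self.game_log = []  # every observed event, for learning later

    def record(self, team, player, zone, card):
        self.zones[team][player][zone].append(card)
        self.game_log.append((team, player, zone, card))

kb = AIKnowledgeBase()
kb.record("team1", "alice", "battlefield", "Azorius Chancery")
print(kb.zones["team1"]["alice"]["battlefield"])  # ['Azorius Chancery']
```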
Let's assume our KB system is ready and perfect, and the only thing left is the agent. If the new agent is utility-based, it needs a utility function that delivers numerical values for game states, so they can be compared to each other.
A prototype of the utility function would be
Code:
    utility(state, player) -> {1, 0, -1}   // 1 = player wins, 0 = draw, -1 = player loses
If we apply the function to each leaf of the game state tree, the return value will tell the agent whether it wants to play towards that state.
As I stated before, building the complete tree may be impossible (infinite paths) or impractical (it crashes the PC because of exponential growth). A first improvement is the Minimax algorithm, which yields an optimal strategy against a perfect opponent, and with alpha-beta pruning it can discard large parts of the tree (see https://en.wikipedia.org/wiki/Minimax).
With standard Minimax you still have to search from the root all the way down to the leaves, which can take a ridiculous amount of time and space. So instead we limit the search to a few levels of the tree (the depth could be hardcoded, adaptive, or choosable by the player), stop there, and call a heuristic function that evaluates the positions reached so far. This heuristic is the fun part.
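Such a depth-limited search, with alpha-beta pruning and a heuristic cutoff, could be sketched like this (the game-state interface with children(), is_terminal(), utility() and heuristic() is an assumption for illustration, and the toy game at the bottom is invented):

```python
# Depth-limited minimax with alpha-beta pruning. When the depth budget
# runs out before a terminal state, the heuristic estimates the value.
# The game-state interface here is an assumption for illustration.

def minimax(state, depth, alpha, beta, maximizing, game):
    if game.is_terminal(state):
        return game.utility(state)
    if depth == 0:
        return game.heuristic(state)  # cutoff: estimate instead of searching on
    if maximizing:
        value = float("-inf")
        for child in game.children(state):
            value = max(value, minimax(child, depth - 1, alpha, beta, False, game))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # prune: the opponent will avoid this branch anyway
        return value
    else:
        value = float("inf")
        for child in game.children(state):
            value = min(value, minimax(child, depth - 1, alpha, beta, True, game))
            beta = min(beta, value)
            if alpha >= beta:
                break
        return value

# Tiny toy game to exercise it: leaves carry fixed utilities.
class ToyGame:
    tree = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
    leaf = {"a1": 3, "a2": 5, "b1": 2, "b2": 9}
    def children(self, s): return self.tree[s]
    def is_terminal(self, s): return s in self.leaf
    def utility(self, s): return self.leaf[s]
    def heuristic(self, s): return 0  # stand-in estimate at the cutoff

game = ToyGame()
print(minimax("root", 4, float("-inf"), float("inf"), True, game))  # 3
```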
In principle, its goal is to evaluate
- card advantage
- number of cards in the zones
- free mana advantage
- life points advantage
- other modifiers (which ones?)
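In code, such a heuristic could start as a simple weighted sum of those advantages (the feature names and weights below are invented; tuning them is the real work):

```python
# A heuristic as a weighted sum of simple "advantage" features.
# Feature names and weights are made up for illustration.

WEIGHTS = {
    "card_advantage": 3.0,   # cards in hand vs. opponent
    "board_advantage": 2.0,  # permanents on the battlefield vs. opponent
    "mana_advantage": 1.0,   # free/untapped mana vs. opponent
    "life_advantage": 0.5,   # life total difference
}

def heuristic(features):
    return sum(WEIGHTS[name] * value for name, value in features.items())

state = {"card_advantage": 2, "board_advantage": -1,
         "mana_advantage": 1, "life_advantage": 4}
print(heuristic(state))  # 3.0*2 + 2.0*(-1) + 1.0*1 + 0.5*4 = 7.0
```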
After calling the heuristic, the agent can decide which path to choose, play accordingly, and update its KB.
There is still a lot of fine-tuning to do, though:
As we know, the non-cheating AI cannot know all cards and positions, but it may have gathered some probability estimates about them. The heuristic should take these into account to improve its results.
A good idea is a generic interface to this heuristic, plus a directory where anyone can drop their own heuristic scripts (these could be Lua or another language, by the way; many game projects outsource AI to a scripting language). It is also wise to write very small heuristic scripts for exactly one purpose each and then combine them. Compare it to the Starcraft Broodwar AI Loader, where you can write your own AI scripts and select or deselect them before creating a new game (people run AI competitions with Starcraft 1!).
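The "many small scripts, combined" idea boils down to one shared signature plus a combiner. A sketch (a real version would load the scripts from a directory and let the player toggle them; the two mini-scripts here are invented):

```python
# Each mini-heuristic shares one signature: state -> number.
# A combiner sums whichever scripts are currently enabled, so scripts
# can be added, removed or swapped without touching the agent itself.

def life_script(state):
    return state["own_life"] - state["enemy_life"]

def hand_script(state):
    return 2 * (state["own_hand"] - state["enemy_hand"])

def combine(scripts):
    def combined(state):
        return sum(script(state) for script in scripts)
    return combined

evaluate = combine([life_script, hand_script])
state = {"own_life": 20, "enemy_life": 14, "own_hand": 3, "enemy_hand": 5}
print(evaluate(state))  # (20-14) + 2*(3-5) = 2
```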
A funny idea is a meta-heuristic that finds the best heuristic (we put a car into your car ^^). It would ask the KB about all known cards, colors and abilities and estimate which scripts are best against this particular opponent.
One last addition to the game tree: we have to add chance nodes for coin tosses and the planar die (Planechase). If we have enough computing power and memory, the way to handle chance is a so-called Monte Carlo simulation, where the same game is played a few thousand times to find acceptable approximations.
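The Monte Carlo idea for a chance node is simply: simulate the random event many times and average the outcomes. A toy sketch (the coin-plus-die "game" below is a made-up stand-in for a real game state):

```python
import random

# Monte Carlo estimate of a chance node's value: simulate the random
# events many times and average the outcomes. The toy "game" below
# wins (+1) on heads followed by a die roll of 4+, otherwise loses (-1).

def rollout(rng):
    heads = rng.random() < 0.5
    roll = rng.randint(1, 6)
    return 1 if heads and roll >= 4 else -1

def monte_carlo_value(n, seed=0):
    rng = random.Random(seed)  # fixed seed for reproducibility
    return sum(rollout(rng) for _ in range(n)) / n

print(monte_carlo_value(100_000))  # close to the exact value 0.25*1 + 0.75*(-1) = -0.5
```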
My example heuristic:
- Card advantage:
  - +1 per common, +2 per uncommon, +3 per rare, +4 per mythic permanent on the battlefield
  - +1 per point of power/toughness of every creature, incl. +1/+1 and -1/-1 counters, Auras, ... and damage
  - x2 per Infect, Trample, Vigilance, Deathtouch, Lifelink, Unblockable, Protection, ... ability of each creature
  - x0.5 per "cannot (block, attack, use ability, ...)" effect, or when the creature is tapped
- Cards in hand:
  - many cards punished (e.g. by damage, or milling above 7)? -1 per card over the limit
  - few cards punished (e.g. The Rack)? -1 per card under the limit
- Mana advantage:
  - +1 for each free mana if instants, flashback cards or abilities are playable
- Life advantage:
  - +1 for each own life point
  - -1 for each of the opponent's life points
  - +1 for each of the opponent's poison counters
  - -10% for each own poison counter
- Other modifiers:
  - -0.5 for playing a land while the mana advantage is already sufficient and few cards are left in hand (bluff)
  - players are worth less than teams (in multiplayer/multi-team games, killing yourself or your teammates could be the key to victory):
    - -1M points if your own team loses
    - +1M points if your own team wins
    - -200k if a teammate dies
    - +200k if an enemy player dies
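A slice of this example heuristic in code (the rarity scores and doubling abilities come from the list above; the sample cards and exact evaluation order are my own assumptions):

```python
# Partial implementation of the example heuristic above:
# rarity score plus power/toughness, doubled per listed ability,
# halved when tapped. The card records are invented sample data.

RARITY_SCORE = {"common": 1, "uncommon": 2, "rare": 3, "mythic": 4}
DOUBLING_ABILITIES = {"infect", "trample", "vigilance", "deathtouch", "lifelink"}

def score_permanent(card):
    score = RARITY_SCORE[card["rarity"]]
    score += card.get("power", 0) + card.get("toughness", 0)
    for ability in card.get("abilities", []):
        if ability in DOUBLING_ABILITIES:
            score *= 2
    if card.get("tapped"):
        score *= 0.5
    return score

bear = {"rarity": "common", "power": 2, "toughness": 2, "abilities": []}
angel = {"rarity": "rare", "power": 4, "toughness": 4,
         "abilities": ["lifelink"], "tapped": True}
print(score_permanent(bear))   # 1 + 2 + 2 = 5
print(score_permanent(angel))  # (3 + 4 + 4) * 2 * 0.5 = 11.0
```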
The learning part I mentioned above needs some love too, but I am exhausted right now and I guess this should be enough for now. What I will say about learning is that the game log may be used to capture player behaviour, like specific card combos (e.g. he always blocks and attacks with his Bloodthrone Vampire and sacrifices his Reassembling Skeleton, immediately returns it to the battlefield, sacrifices it again and again... I should do something about that and stop blocking his dumb 2/2 token with my wall while taking 8 damage from the Bloodthrone Vampire every turn).