AI improvements
A version of WOTC's game by telengard
AI improvements
by telengard » 21 Sep 2011, 22:27
For the past few weeks I've been focusing on AI improvements. I've made some great strides with the minmax, and even more so with the hybrid AI: against the minmax it wins close to 100% of the time. I also spent time with pure Monte Carlo, and even at very high numbers of playouts it doesn't do all that well. I've made enough performance improvements to bump the default minmax depth up to 8.
Over the past night or so I let the 2 AIs play against themselves (using the same warband to keep things fair).
The results were:
Silver wins 4, Gold wins 96
The 4 losses were:
Winning player was Silver; Final score: Silver 6 ---- Gold 3
Winning player was Silver; Final score: Silver 6 ---- Gold 5
Winning player was Silver; Final score: Silver 6 ---- Gold 5
Winning player was Silver; Final score: Silver 6 ---- Gold 3
This was at depth 8 for the minmax; the hybrid looked 4 turns ahead with 10000 playouts and a depth-8 minmax.
The hybrid AI is a combination of Monte Carlo and minmax. However, the Monte Carlo portion does not play to the end of the game; it plays a set number of turns ahead (in this case 4). If the Monte Carlo cannot find a truly optimal choice, the minmax takes over using a candidate list pruned by the MC eval. Also, if the MC portion determines that all moves are winners (or all losers), the minmax still makes an optimal choice among them using the eval function. The Monte Carlo also doesn't just score whether the player is ahead; it scores by the net gain/loss in victory points.
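As a rough illustration, the selection flow described above might look like this on a toy game. Every name here (`mc_score`, `eval_fn`, the +/-1 toy moves, the 10% margin) is a made-up stand-in for illustration, not the actual Dreamblade code:

```python
# Toy sketch of the hybrid MC + minmax selection flow described above.
# All names and constants are illustrative stand-ins.
import random

def eval_fn(state):
    # Stand-in evaluation function: the toy "state" is just a score.
    return state

def mc_score(state, move, playouts, turns):
    # Average net victory-point gain over short random playouts,
    # rather than win/loss over full games.
    total = 0
    for _ in range(playouts):
        s = state + move
        for _ in range(turns):
            s += random.choice([-1, 0, 1])   # one random toy turn
        total += s - state                   # net gain/loss
    return total / playouts

def minimax(state, depth, maximizing=True):
    # Plain minimax on the toy game; the real engine adds alpha-beta.
    if depth == 0:
        return eval_fn(state)
    children = [minimax(state + m, depth - 1, not maximizing)
                for m in (-1, 1)]
    return max(children) if maximizing else min(children)

def choose_move(state, moves, playouts=200, turns=4, depth=4):
    # 1. Monte Carlo scores every candidate move.
    scores = {m: mc_score(state, m, playouts, turns) for m in moves}
    best = max(scores.values())
    # 2. Prune: keep only moves within a small margin of the best.
    contenders = [m for m in moves if scores[m] >= best - abs(best) * 0.1]
    if len(contenders) == 1:
        return contenders[0]          # MC found a clear best move
    # 3. Otherwise minimax breaks the tie on the pruned list.
    return max(contenders, key=lambda m: minimax(state + m, depth))
```

The shape matches the description: MC first, then a pruned minmax pass only when MC alone can't separate the candidates.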
Promising as this is, it has all just been testing so far. Now I need to play some games against it myself.
~telengard
Author of Dreamblade:
viewtopic.php?f=51&t=1215
-
telengard - DEVELOPER
- Posts: 379
- Joined: 23 May 2009, 23:04
- Has thanked: 2 times
- Been thanked: 27 times
Re: AI improvements
by melvin » 22 Sep 2011, 02:21
Your hybrid AI looks very impressive. It is nice to see more applications of Monte Carlo methods for AI. I've been thinking about using a hybrid approach for Magarena.
What are the general steps for the hybrid AI? From your description, it seems that you use Monte Carlo first to check if there are any optimal moves; if not, you use the evaluation score from Monte Carlo to prune the list of choices and then run minimax.
telengard wrote:If the Monte Carlo cannot find a truly optimal choice the minmax takes over using a list pruned by the MC eval.
How do you determine whether Monte Carlo finds an optimal choice? By optimal choice do you mean one that would guarantee a win? How does the MC eval prune the list of options?
telengard wrote:Also, if the MC portion determines that all moves are winners, the minmax allows for an optimal choice.
If all moves are winners, then it should be possible to play any one of them, right? How does minimax help in this case?
telengard wrote:The Monte Carlo also doesn't just score if the player is ahead, it takes into account the net gain/loss in victory points for scoring.
I assume by this you mean that your Monte Carlo playouts return a range of values and not just a binary 0 or 1. Is that correct?
-
melvin - AI Programmer
- Posts: 1062
- Joined: 21 Mar 2010, 12:26
- Location: Singapore
- Has thanked: 36 times
- Been thanked: 459 times
Re: AI improvements
by telengard » 22 Sep 2011, 03:30
melvin wrote:Your hybrid AI looks very impressive. It is nice to see more applications of Monte Carlo methods for AI. I've been thinking about using a hybrid approach for Magarena.
Thanks!
I'm pretty happy with the minmax in my app, but it has reached the point where it can't get much better without a tremendous increase in horsepower (it seems to scale exponentially due to branching, and DB has a very high branching factor; there are sometimes more than 60 choices). This is after implementing multi-core support and spending countless hours optimizing. Things start to get a little slow looking 10 moves ahead even with 16 cores going at it.
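To put back-of-the-envelope numbers on that scaling complaint: with branching factor b, plain minmax visits on the order of b^d nodes at depth d, and even best-case alpha-beta move ordering only cuts that to roughly b^(d/2):

```python
# Rough node-count arithmetic behind the exponential-scaling complaint.
b = 60                            # branching factor mentioned for Dreamblade
d = 8                             # the current default minmax depth
full_tree = b ** d                # plain minmax: ~1.7e14 nodes
alpha_beta_best = b ** (d // 2)   # perfect-ordering alpha-beta: ~1.3e7 nodes
```

So each extra ply multiplies the work by up to 60, which is why depth 10 hurts even on 16 cores.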
I've been eyeballing the UCT stuff in Magarena, very cool, but I don't think I can implement that in my app due to the way I generate the tree (mostly for memory considerations, I have my own memory pool for speed/efficiency at high depths).
melvin wrote:What are the general steps for the hybrid AI? From your description, it seems that you use Monte Carlo first to check if there are any optimal moves; if not, you use the evaluation score from Monte Carlo to prune the list of choices and then run minimax.
Yep, that's exactly what it does. Here are the steps:
1. MC runs first, and if there is an obvious best move, that is taken.
2. If not, the best ones (those within a percentage of the top score) are kept and sent to the alpha-beta pruning minmax code.
melvin wrote:How do you determine whether Monte Carlo finds an optimal choice? By optimal choice do you mean one that would guarantee a win? How does the MC eval prune the list of options?
An optimal choice is the one with the most number of "wins". They're not really wins, because the MC does not play games out to the absolute end; it only plays N turns out (currently 4; a turn in DB is quite long, and each player does all their steps in a single turn).
The "wins" count is the net gain (or loss) of victory points (the way you win) over the duration of the MC run.
Now that I think about it, this wouldn't work well with alternate win conditions (which I added not too long ago, booooo), although the minmax would notice them. I should probably handle that in some way.
As for pruning, it's quite crude but seems to be effective at handling "noise": if there are multiple choices within a small percentage of wins of each other, those are kept and all the others are pruned out.
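That noise-tolerant pruning rule can be sketched as a tiny filter (the 5% tolerance here is an invented example value, not the actual threshold used):

```python
# Sketch of the "keep everything within a small percentage of the best"
# pruning rule described above. The tolerance is an illustrative value.
def prune(scores, tolerance=0.05):
    best = max(scores.values())
    cutoff = best - abs(best) * tolerance
    # keep every choice within `tolerance` of the best; drop the rest as noise
    return [move for move, s in scores.items() if s >= cutoff]
```

Moves that survive the cut go on to the alpha-beta minmax for the final decision.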
melvin wrote:If all moves are winners, then it should be possible to play any one of them, right? How does minimax help in this case?
True, if this were the end of the game you could choose any of them, but since we're only looking ahead a few turns there may still be a better long-term move, or one with better positioning. The minmax handles this, since the eval function does tactical choices quite well.
melvin wrote:I assume by this you mean that your Monte Carlo playouts return a range of values and not just a binary 0 or 1. Is that correct?
Yep. So if player 1 had 1 victory point and player 2 had 2, and after the MC eval the score was 4 to 3, that would be a net gain of 2 "wins". If the score had been 1 to 4, it would have subtracted 2 "wins". I'm doing another run tonight with only 5000 MC playouts for each choice, and it seems to be doing just as well.
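The net-gain scoring in that worked example reduces to a one-line function (argument names are illustrative):

```python
# Score a playout by the swing in victory points rather than a binary
# win/loss: (my gain) minus (opponent's gain) over the playout.
def playout_score(my_before, opp_before, my_after, opp_after):
    return (my_after - my_before) - (opp_after - opp_before)
```

For the numbers above: 1 VP to 4 VP against 2 VP to 3 VP gives +2, while 1 to 1 against 2 to 4 gives -2.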
My real reason for trying out these other AIs is to try and have the AI understand longer term strategies that the minmax just can't see unless you are at crazy depths. For instance seeing the benefits of spawning something one turn and getting rewards, etc in the next few turns.
~telengard
Re: AI improvements
by telengard » 22 Sep 2011, 04:44
Just finished my first game playing against it:
Winning player was Silver; Final score: Silver 6 ---- Gold 0
I was Gold.
I did give the AI a very good warband while I had a randomly generated one, but it played quite well, if a little unorthodox in the beginning. I need to tweak it a little to keep it from overextending itself too much, although that worked out well for it.
Next up is trying to play against it with a purposefully constructed warband.
~telengard
Re: AI improvements
by Huggybaby » 29 Sep 2011, 17:50
telengard wrote:This is after implementing multi-core support and spending countless hours optimizing. Things start to get a little slow looking 10 moves ahead even with 16 cores going at it.
Can you describe how you implemented multi-core support? I notice when Magarena (for example) is thinking, it only uses one of my four cores.
-
Huggybaby - Administrator
- Posts: 3207
- Joined: 15 Jan 2006, 19:44
- Location: Finally out of Atlanta
- Has thanked: 702 times
- Been thanked: 595 times
Re: AI improvements
by telengard » 30 Sep 2011, 03:51
Huggybaby wrote:Can you describe how you implemented multi-core support? I notice when Magarena (for example) is thinking, it only uses one of my four cores.
Sure! It was actually a lot more work than I was expecting to get right (at least the way I implemented it). First off, you have to enable your app to use multiple threads. When running my program you can specify how many cores to use (one worker thread per core specified). Each of these threads can be scheduled on a processor, which is where the parallelism comes from.
Each worker thread also has a dedicated game_state instance in addition to the "main" one. These have to stay in sync, including each one's RNG, etc. I have an undo engine, so each game state has its own undo queue (as well as its own memory management, which helps avoid locking bottlenecks when running with threads). Getting the synchronization of the game_states right was probably the bulk of the work. Even simple things like the order of a list in one of the states somewhere could cause them to get out of sync.
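The per-worker state idea might be sketched like this. `WorkerState` and all of its fields are invented for illustration; the real per-thread game_state is far richer:

```python
# Toy sketch of per-worker state: each worker owns a private copy of
# the game state, its own RNG, and its own undo queue, so nothing is
# shared with the main state during search.
import random

class WorkerState:
    def __init__(self, state, seed):
        self.state = dict(state)         # private copy of the main state
        self.rng = random.Random(seed)   # private RNG; seeding keeps copies in sync
        self.undo = []                   # private undo queue

    def apply(self, key, value):
        # Record the old value so the change can be rolled back.
        self.undo.append((key, self.state.get(key)))
        self.state[key] = value

    def undo_last(self):
        key, old = self.undo.pop()
        if old is None:
            self.state.pop(key, None)
        else:
            self.state[key] = old
```

Because each worker only touches its own copy, no locks are needed during the search itself, which is exactly what keeps the cores busy.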
When the AI starts thinking, the available actions are queued up and all the worker threads are woken up; each takes an action off the queue and runs the specified algorithm, and when done it grabs another. When all the actions have been evaluated, the threads wait to be woken up again. This is standard threaded-app fare.
The AI threads do what the main thread would do if only 1 core were specified. It works quite well. Given the branching factor of Dreamblade, the cores get a nice workout; I enjoy seeing all 16 cores pegged on my Mac.
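The queue-and-workers pattern described above can be sketched in Python with illustrative names (the real implementation is the author's own threaded C++ code, not this):

```python
# Sketch of the worker-pool pattern: candidate actions go onto a shared
# queue, one worker thread per core pulls and evaluates them, and the
# results are collected for the final move decision.
import queue
import threading

def evaluate_actions(actions, evaluate, num_workers=4):
    todo = queue.Queue()
    results = {}
    lock = threading.Lock()
    for action in actions:
        todo.put(action)

    def worker():
        while True:
            try:
                action = todo.get_nowait()   # grab the next queued action
            except queue.Empty:
                return                       # queue drained: worker stops
            score = evaluate(action)         # e.g. minmax/MC on a private state
            with lock:
                results[action] = score

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

In a long-lived engine the workers would sleep on a condition variable between searches rather than exiting, but the pull-an-action-evaluate-repeat loop is the same.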
The only downside, which is offset by the ability to evaluate many actions at once, is the memory allocator for things like STL vectors (I have my own memory allocator for my own code). In a multithreaded app there is a lot of lock contention there. I've tried MT-aware memory allocators like the Hoard allocator, but didn't have much luck getting them to work.
As for Magarena, do you ever see it running code on more than 1 core? It's possible that one game tree takes much longer to evaluate, so the other threads finish quickly.
~telengard
Re: AI improvements
by Huggybaby » 30 Sep 2011, 20:12
Thanks for the explanation, I hope others here find it useful, and more than that, I hope to see more multi-core aware programs around here in the future.
I've looked at my CPU utilization while Magarena was thinking and it never goes over 25%, so I don't think it's using more than one core.