AI improvements
A version of WOTC's game by telengard
AI improvements
by telengard » 21 Sep 2011, 22:27
For the past few weeks I've been focusing on AI improvements. I've made some great strides with the minmax, and even more so with the hybrid AI: against the minmax it wins close to 100% of the time. I also spent time with pure Monte Carlo, and even at very high numbers of playouts it doesn't do all that well. I've made enough performance improvements to bump the default minmax depth up to 8.
Over the past night or so I let the 2 AIs play against themselves (using the same warband to keep things fair).
The results were:
Silver wins 4, Gold wins 96
The 4 losses were:
Winning player was Silver; Final score: Silver 6 ---- Gold 3
Winning player was Silver; Final score: Silver 6 ---- Gold 5
Winning player was Silver; Final score: Silver 6 ---- Gold 5
Winning player was Silver; Final score: Silver 6 ---- Gold 3
This was at depth 8 for the minmax; the hybrid looked 4 turns ahead with 10000 playouts and a depth-8 minmax.
The hybrid AI is a combination of Monte Carlo and minmax. However, the Monte Carlo portion does not play to the end of the game; it plays a set number of turns ahead (in this case 4). If the Monte Carlo cannot find a truly optimal choice, the minmax takes over using a candidate list pruned by the MC eval. Also, if the MC portion determines that all moves are winners (or all losers), the minmax still makes an optimal choice among them using the eval function. The Monte Carlo also doesn't just score whether the player is ahead; it scores by the net gain/loss in victory points.
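As a rough illustration, the selection flow described above might look like this on a toy game. Every name here (`mc_score`, `eval_fn`, the +/-1 toy moves, the 10% margin) is a made-up stand-in for illustration, not the actual Dreamblade code:

```python
# Toy sketch of the hybrid MC + minmax selection flow described above.
# All names and constants are illustrative stand-ins.
import random

def eval_fn(state):
    # Stand-in evaluation function: the toy "state" is just a score.
    return state

def mc_score(state, move, playouts, turns):
    # Average net victory-point gain over short random playouts,
    # rather than win/loss over full games.
    total = 0
    for _ in range(playouts):
        s = state + move
        for _ in range(turns):
            s += random.choice([-1, 0, 1])   # one random toy turn
        total += s - state                   # net gain/loss
    return total / playouts

def minimax(state, depth, maximizing=True):
    # Plain minimax on the toy game; the real engine adds alpha-beta.
    if depth == 0:
        return eval_fn(state)
    children = [minimax(state + m, depth - 1, not maximizing)
                for m in (-1, 1)]
    return max(children) if maximizing else min(children)

def choose_move(state, moves, playouts=200, turns=4, depth=4):
    # 1. Monte Carlo scores every candidate move.
    scores = {m: mc_score(state, m, playouts, turns) for m in moves}
    best = max(scores.values())
    # 2. Prune: keep only moves within a small margin of the best.
    contenders = [m for m in moves if scores[m] >= best - abs(best) * 0.1]
    if len(contenders) == 1:
        return contenders[0]          # MC found a clear best move
    # 3. Otherwise minimax breaks the tie on the pruned list.
    return max(contenders, key=lambda m: minimax(state + m, depth))
```

The shape matches the description: MC first, then a pruned minmax pass only when MC alone can't separate the candidates.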
Promising as this is, it has all just been testing so far. Now I need to play some games against it myself.
~telengard
Author of Dreamblade:
viewtopic.php?f=51&t=1215
-
telengard - DEVELOPER
- Posts: 379
- Joined: 23 May 2009, 23:04
- Has thanked: 2 times
- Been thanked: 27 times
Re: AI improvements
by melvin » 22 Sep 2011, 02:21
Your hybrid AI looks very impressive. It is nice to see more applications of Monte Carlo methods for AI. I've been thinking about using a hybrid approach for Magarena.
What are the general steps for the hybrid AI? From your description, it seems that you use Monte Carlo first to check if there are any optimal moves; if not, you use the evaluation score from Monte Carlo to prune the list of choices and then run minimax.
telengard wrote:If the Monte Carlo cannot find a truly optimal choice the minmax takes over using a list pruned by the MC eval.
How do you determine whether Monte Carlo finds an optimal choice? By optimal choice do you mean one that would guarantee a win? How does the MC eval prune the list of options?
telengard wrote:Also, if the MC portion determines that all moves are winners, the minmax allows for an optimal choice.
If all moves are winners, then it should be possible to play any one of them, right? How does minimax help in this case?
telengard wrote:The Monte Carlo also doesn't just score if the player is ahead, it takes into account the net gain/loss in victory points for scoring.
I assume by this you mean that your Monte Carlo playouts return a range of values and not just a binary 0 or 1. Is that correct?
-
melvin - AI Programmer
- Posts: 1062
- Joined: 21 Mar 2010, 12:26
- Location: Singapore
- Has thanked: 36 times
- Been thanked: 459 times
Re: AI improvements
by telengard » 22 Sep 2011, 03:30
melvin wrote:Your hybrid AI looks very impressive. It is nice to see more applications of Monte Carlo methods for AI. I've been thinking about using a hybrid approach for Magarena.
Thanks!
I'm pretty happy with the minmax in my app, but it has reached the point where it can't get much better without a tremendous increase in horsepower (it seems to scale exponentially due to branching, and DB has a very high branching factor; there are sometimes more than 60 choices). This is after implementing multi-core support and spending countless hours optimizing. Things start to get a little slow looking 10 moves ahead even with 16 cores going at it.
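To put back-of-the-envelope numbers on that scaling complaint: with branching factor b, plain minmax visits on the order of b^d nodes at depth d, and even best-case alpha-beta move ordering only cuts that to roughly b^(d/2):

```python
# Rough node-count arithmetic behind the exponential-scaling complaint.
b = 60                            # branching factor mentioned for Dreamblade
d = 8                             # the current default minmax depth
full_tree = b ** d                # plain minmax: ~1.7e14 nodes
alpha_beta_best = b ** (d // 2)   # perfect-ordering alpha-beta: ~1.3e7 nodes
```

So each extra ply multiplies the work by up to 60, which is why depth 10 hurts even on 16 cores.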
I've been eyeballing the UCT stuff in Magarena, very cool, but I don't think I can implement that in my app due to the way I generate the tree (mostly for memory considerations, I have my own memory pool for speed/efficiency at high depths).
melvin wrote:What are the general steps for the hybrid AI? From your description, it seems that you use Monte Carlo first to check if there are any optimal moves; if not, you use the evaluation score from Monte Carlo to prune the list of choices and then run minimax.
Yep, that's exactly what it does. Here are the steps:
1. MC runs first, and if there is an obvious best move, that is taken.
2. If not, the best ones (those within a percentage of the top score) are kept and sent to the alpha-beta pruning minmax code.
melvin wrote:How do you determine whether Monte Carlo finds an optimal choice? By optimal choice do you mean one that would guarantee a win? How does the MC eval prune the list of options?
An optimal choice is the one with the most number of "wins". They're not really wins, because the MC does not play games out to the absolute end; it only plays N turns out (currently 4; a turn in DB is quite long, and each player does all their steps in a single turn).
The "wins" count is the net gain (or loss) of victory points (the way you win) over the duration of the MC run.
Now that I think about it, this wouldn't work well with alternate win conditions (which I added not too long ago, booooo), although the minmax would notice them. I should probably handle that in some way.
As for pruning, it's quite crude but seems to be effective at handling "noise": if there are multiple choices within a small percentage of wins of each other, those are kept and all the others are pruned out.
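That noise-tolerant pruning rule can be sketched as a tiny filter (the 5% tolerance here is an invented example value, not the actual threshold used):

```python
# Sketch of the "keep everything within a small percentage of the best"
# pruning rule described above. The tolerance is an illustrative value.
def prune(scores, tolerance=0.05):
    best = max(scores.values())
    cutoff = best - abs(best) * tolerance
    # keep every choice within `tolerance` of the best; drop the rest as noise
    return [move for move, s in scores.items() if s >= cutoff]
```

Moves that survive the cut go on to the alpha-beta minmax for the final decision.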
melvin wrote:If all moves are winners, then it should be possible to play any one of them, right? How does minimax help in this case?
True, if this were the end of the game you could choose any of them, but since we're only looking ahead a few turns there may still be a better long-term move, or one with better positioning. The minmax handles this, since the eval function does tactical choices quite well.
melvin wrote:I assume by this you mean that your Monte Carlo playouts return a range of values and not just a binary 0 or 1. Is that correct?
Yep. So if player 1 had 1 victory point and player 2 had 2, and after the MC eval the score was 4 to 3, that would be a net gain of 2 "wins". If the score had been 1 to 4, it would have subtracted 2 "wins". I'm doing another run tonight with only 5000 MC playouts for each choice, and it seems to be doing just as well.
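The net-gain scoring in that worked example reduces to a one-line function (argument names are illustrative):

```python
# Score a playout by the swing in victory points rather than a binary
# win/loss: (my gain) minus (opponent's gain) over the playout.
def playout_score(my_before, opp_before, my_after, opp_after):
    return (my_after - my_before) - (opp_after - opp_before)
```

For the numbers above: 1 VP to 4 VP against 2 VP to 3 VP gives +2, while 1 to 1 against 2 to 4 gives -2.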
My real reason for trying out these other AIs is to try and have the AI understand longer term strategies that the minmax just can't see unless you are at crazy depths. For instance seeing the benefits of spawning something one turn and getting rewards, etc in the next few turns.
~telengard
Re: AI improvements
by telengard » 22 Sep 2011, 04:44
Just finished my first game playing against it:
Winning player was Silver; Final score: Silver 6 ---- Gold 0
I was Gold.
I did give the AI a very good warband while I had a randomly generated one, but it played quite well, if a little unorthodox in the beginning. I need to tweak it a little to keep it from overextending itself too much, although that worked out well for it.
Next up is trying to play against it with a purposefully constructed warband.
~telengard
Re: AI improvements
by Huggybaby » 29 Sep 2011, 17:50
telengard wrote:This is after implementing multi-core support and spending countless hours optimizing. Things start to get a little slow looking 10 moves ahead even with 16 cores going at it.
Can you describe how you implemented multi-core support? I notice when Magarena (for example) is thinking, it only uses one of my four cores.
-
Huggybaby - Administrator
- Posts: 3207
- Joined: 15 Jan 2006, 19:44
- Location: Finally out of Atlanta
- Has thanked: 702 times
- Been thanked: 595 times
Re: AI improvements
by telengard » 30 Sep 2011, 03:51
Huggybaby wrote:Can you describe how you implemented multi-core support? I notice when Magarena (for example) is thinking, it only uses one of my four cores.
Sure! It was actually a lot more work than I was expecting to get right (at least the way I implemented it). First off, you have to enable your app to use multiple threads. When running my program you can specify how many cores to use (one worker thread per core specified). Each of these threads can be scheduled on a processor, which is where the parallelism comes from.
Each worker thread also has a dedicated game_state instance in addition to the "main" one. These have to stay in sync, including each one's RNG, etc. I have an undo engine, so each game state has its own undo queue (as well as its own memory management, which helps avoid locking bottlenecks when running with threads). Getting the synchronization of the game_states right was probably the bulk of the work. Even simple things like the order of a list in one of the states somewhere could cause them to get out of sync.
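The per-worker state idea might be sketched like this. `WorkerState` and all of its fields are invented for illustration; the real per-thread game_state is far richer:

```python
# Toy sketch of per-worker state: each worker owns a private copy of
# the game state, its own RNG, and its own undo queue, so nothing is
# shared with the main state during search.
import random

class WorkerState:
    def __init__(self, state, seed):
        self.state = dict(state)         # private copy of the main state
        self.rng = random.Random(seed)   # private RNG; seeding keeps copies in sync
        self.undo = []                   # private undo queue

    def apply(self, key, value):
        # Record the old value so the change can be rolled back.
        self.undo.append((key, self.state.get(key)))
        self.state[key] = value

    def undo_last(self):
        key, old = self.undo.pop()
        if old is None:
            self.state.pop(key, None)
        else:
            self.state[key] = old
```

Because each worker only touches its own copy, no locks are needed during the search itself, which is exactly what keeps the cores busy.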
When the AI starts thinking, the available actions are queued up and all the worker threads are woken up; each takes an action off the queue and runs the specified algorithm, and when done it grabs another. When all the actions have been evaluated, the threads wait to be woken up again. This is standard threaded-app fare.
The AI threads do what the main thread would do if only 1 core were specified. It works quite well. Given the branching factor of Dreamblade, the cores get a nice workout; I enjoy seeing all 16 cores pegged on my Mac.
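The queue-and-workers pattern described above can be sketched in Python with illustrative names (the real implementation is the author's own threaded C++ code, not this):

```python
# Sketch of the worker-pool pattern: candidate actions go onto a shared
# queue, one worker thread per core pulls and evaluates them, and the
# results are collected for the final move decision.
import queue
import threading

def evaluate_actions(actions, evaluate, num_workers=4):
    todo = queue.Queue()
    results = {}
    lock = threading.Lock()
    for action in actions:
        todo.put(action)

    def worker():
        while True:
            try:
                action = todo.get_nowait()   # grab the next queued action
            except queue.Empty:
                return                       # queue drained: worker stops
            score = evaluate(action)         # e.g. minmax/MC on a private state
            with lock:
                results[action] = score

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

In a long-lived engine the workers would sleep on a condition variable between searches rather than exiting, but the pull-an-action-evaluate-repeat loop is the same.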
The only downside, which is offset by the ability to evaluate many actions at once, is the memory allocator for things like STL vectors (I have my own memory allocator for my own code). In a multithreaded app there is a lot of lock contention there. I've tried MT-aware memory allocators like the Hoard allocator, but didn't have much luck getting them to work.
As for Magarena, do you ever see it running code on more than 1 core? It's possible that one game tree takes much longer to evaluate, so the other threads finish quickly.
~telengard
Re: AI improvements
by Huggybaby » 30 Sep 2011, 20:12
Thanks for the explanation, I hope others here find it useful, and more than that, I hope to see more multi-core aware programs around here in the future.
I've looked at my CPU utilization while Magarena was thinking and it never goes over 25%, so I don't think it's using more than one core.