Improving Default Policy (Rollout Policy) in Monte Carlo Tree Search


I've written an MCTS AI in Python and am now trying to improve on its first iteration. I've been told that I need to improve my rollout function. The purpose of the AI is to play the game of dots and boxes.
Right now, after receiving the state of the game, the rollout just plays out the remainder of the game randomly.
Rollout:
from random import choice  # assumed source of `choice`
while not state.is_terminal:
    state.apply_move(choice(state.legal_moves))
How can I improve the AI by altering the rollout function?

In dots and boxes, random play is probably pretty poor because it will (1) miss opportunities to fill boxes and (2) give the opponent opportunities to fill boxes, both of which make the playouts less like real play.
So the simplest change would be to order the moves in the playout. First, take a random move that fills a box, if one exists. Second, take a random move that doesn't give the opponent a chance to fill a box. Finally, if nothing else is left, take a move that gives the opponent an opportunity to fill a box. (But here you might want to select, with high probability, a move that gives the opponent the smallest region to fill.)
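The three tiers above can be sketched as follows. This is only a minimal illustration, not the asker's code: the board representation is an assumption. Here `edge_boxes` (hypothetical) maps each undrawn edge to the one or two boxes it borders, and `sides` (hypothetical) maps each box to the number of its sides already drawn. An edge fills a box when that box already has three sides, and it hands a box to the opponent when that box already has two.

```python
import random

def rollout_move(legal_edges, edge_boxes, sides, rng=random):
    """Pick a playout move by priority tier, choosing randomly within a tier.

    edge_boxes and sides are assumed data structures (see lead-in).
    """
    edges = list(legal_edges)

    # Tier 1: edges that complete a box (some adjacent box has 3 sides drawn).
    scoring = [e for e in edges if any(sides[b] == 3 for b in edge_boxes[e])]
    if scoring:
        return rng.choice(scoring)

    # Tier 2: edges that do not create a 3-sided box for the opponent.
    safe = [e for e in edges if not any(sides[b] == 2 for b in edge_boxes[e])]
    if safe:
        return rng.choice(safe)

    # Tier 3: every remaining edge gives something away; pick one at random.
    return rng.choice(edges)
```

The final tier is where the "smallest region" refinement would go: instead of a uniform choice, the moves could be weighted by the length of the chain they open up.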

Related

Add Solver Constraint For Non-Mutually Exclusive Ranges

I use or-tools to optimize my fantasy baseball team. My setup very much resembles the program described here. The only difference in my particular case is that players can actually be eligible for a number of different positions. So, I end up with 1 player in a list for a specific position type, and the same player in another list for another position type. I am trying to avoid having the solver select the same player for multiple positions (which wouldn't be realistic).
Is there any way to modify the aforementioned program to constrain the use of a player to a single position even while they are technically eligible for many? Please let me know if I can clarify any further & thanks for your input.

How to decide the class that should be built for a problem statement?

I'm writing code for blackjack using classes and other OOP concepts. Currently I'm stuck on how to decide which classes it should have.
Following are the rules of blackjack:
1. Create a deck of 52 cards
2. Shuffle the deck
3. Ask the Player for their bet
4. Make sure that the Player’s bet does not exceed their available chips
5. Deal two cards to the Dealer and two cards to the Player
6. Show only one of the Dealer’s cards, the other remains hidden
7. Show both of the Player’s cards
8. Ask the Player if they wish to Hit, and take another card
9. If the Player’s hand doesn’t Bust (go over 21), ask if they’d like to Hit again.
10. If a Player Stands, play the Dealer’s hand. The dealer will always Hit until the Dealer’s value meets or exceeds 17
11. Determine the winner and adjust the Player’s chips accordingly
12. Ask the Player if they’d like to play again
I'm new to coding and OOP, so any help is appreciated.
NOTE: this is not a homework problem; there are many solutions on GitHub that I could copy and submit if required. I just want to learn OOP and classes. I'm not looking for a solution, I'm looking for the correct thought process.

It is a difficult process. Usually, the first step is to identify the nouns in the description of the project. That gives you a starting point for thinking about the shape you will give your code and how you see these objects interacting.
From your description, we could list the following nouns:
Deck, Player, Bet, Chips, Cards, Dealer, Hand, Winner, Player’s stash.
They may or may not represent a useful object in your representation of blackjack. Some are obvious objects you need now (Deck, Card, Hand, Player, Dealer); some may be combined (Chip, Bet, Stash); some may not be needed in a modest application (Winner, Bet, Stash) and can be replaced by data structures such as lists, vectors, hash tables, etc.
Create a deck of 52 cards
Shuffle the deck
Ask the Player for their bet
Make sure that the Player’s bet does not exceed their available chips
Deal two cards to the Dealer and two cards to the Player
Show only one of the Dealer’s cards, the other remains hidden
Show both of the Player’s cards
Ask the Player if they wish to Hit, and take another card
If the Player’s hand doesn’t Bust (go over 21), ask if they’d like to Hit again.
If a Player Stands, play the Dealer’s hand. The dealer will always Hit until the Dealer’s value meets or exceeds 17
Determine the winner and adjust the Player’s stash accordingly
Ask the Player if they’d like to play again

Think about every object you will need in your game, and about the attributes each one has and the actions it performs. For example, a Card should have a suit and a value; these are the card's attributes. The Deck should contain every card and be able to shuffle them, so create a method inside your Deck class that does this.
Here is a link that should give you a better understanding of OOP in Python.
https://realpython.com/python3-object-oriented-programming/
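As a rough illustration of the advice above, here is one possible way to turn the nouns Card, Deck and Hand into classes. The class and attribute names are only a sketch, not a recommended final design, and the ace handling (11, dropping to 1 when the hand would bust) is the usual blackjack convention rather than something stated in the question's rule list.

```python
import random
from dataclasses import dataclass

SUITS = ("Hearts", "Diamonds", "Clubs", "Spades")
RANKS = ("2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A")

@dataclass(frozen=True)
class Card:
    suit: str
    rank: str

    @property
    def value(self):
        # Face cards count 10; an ace starts at 11 (adjusted in Hand.value).
        if self.rank in ("J", "Q", "K"):
            return 10
        if self.rank == "A":
            return 11
        return int(self.rank)

class Deck:
    def __init__(self):
        # One card for every suit/rank combination: 52 in total.
        self.cards = [Card(suit, rank) for suit in SUITS for rank in RANKS]

    def shuffle(self, rng=random):
        rng.shuffle(self.cards)

    def deal(self):
        return self.cards.pop()

class Hand:
    def __init__(self):
        self.cards = []

    def add(self, card):
        self.cards.append(card)

    @property
    def value(self):
        total = sum(card.value for card in self.cards)
        aces = sum(1 for card in self.cards if card.rank == "A")
        # Count an ace as 1 instead of 11 while the hand would otherwise bust.
        while total > 21 and aces:
            total -= 10
            aces -= 1
        return total
```

Player and Dealer could then each own a Hand, with the Player also holding a chip count, which keeps the betting rules separate from the card logic.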

How do I create a Leaderboard using a text file in python?

I'm working on a quiz for my exam in computer science. I'm relatively new to the program, in the sense that I know all of the basics, but I am on the point where I want to expand my knowledge. One way I want to do this is by adding a Leaderboard system. The user gets a number of points, and then the program checks in a text file that has other high scores in it, and adds the user to it. It then prints out the leaderboard. This means that I'm going to have to use some sort of operations to determine whether the user's score is higher or lower than another score in the file, and then delete the score it is higher than and replace it. Any idea on how to do this? I'm completely stuck.

Try writing pseudocode and working through the steps.
Get Score
Compare Score
Add Score
You have to think like a computer and break the problem all the way down. At each step, ask yourself how you would tell the computer to do that. Once you have all of that, look at what you have written and remember DRY -> Don't Repeat Yourself. Your coding will go much faster.
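One way the three steps (get, compare, add) might look in Python is sketched below. The file format is an assumption: one `name,score` pair per line. The function names and the default leaderboard size of 5 are likewise made up for illustration. Rather than comparing against each stored score by hand, the sketch appends the new score, sorts, and keeps the top entries, which has the same effect as replacing the score that was beaten.

```python
from pathlib import Path

def load_scores(path):
    """Read 'name,score' lines; a missing file means an empty leaderboard."""
    path = Path(path)
    if not path.exists():
        return []
    entries = []
    for line in path.read_text().splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, score = line.rsplit(",", 1)
        entries.append((name, int(score)))
    return entries

def add_score(entries, name, score, size=5):
    """Return the top `size` entries after adding the new score."""
    # sorted() is stable, so on a tie the earlier entry keeps its place.
    ranked = sorted(entries + [(name, score)], key=lambda e: e[1], reverse=True)
    return ranked[:size]

def save_scores(path, entries):
    text = "".join(f"{name},{score}\n" for name, score in entries)
    Path(path).write_text(text)

def record(path, name, score, size=5):
    """Load the file, add the score, write the file back, return the board."""
    board = add_score(load_scores(path), name, score, size)
    save_scores(path, board)
    return board
```

Printing the leaderboard is then a loop over the list returned by `record`.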

How to detect moving object on a moving conveyor using opencv

I'm building a grading system for crabs. In this system, the animals (crabs) are placed on a moving conveyor, and I need to identify dead or alive animals by detecting their motion in images captured by a camera above the conveyor.
The color of the conveyor belt is black.
Because the conveyor is always moving, I can't apply methods that assume a stationary scene, like here. Does anyone have a suggestion for detecting the animals' motion in this case using OpenCV? I can use more than one camera if necessary. Thanks.

Well, the most obvious answer is:
1) align the pictures of the conveyor taken at different times so that they cover the same area of the belt.
2) check which crabs have changed pose (i.e. subtract the images): regions (pixels) that differ indicate that motion occurred.
If you want to use tracking, you would need to train a classifier to find the crabs and then compare the crab regions in the same way. But I think that is too complicated for your particular problem.
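The align-then-subtract idea in steps 1 and 2 could look roughly like this. It is a sketch under several assumptions that do not come from the question: both frames are grayscale NumPy arrays of the same size, the belt moves to the right along the image's x axis, and the displacement between the two frames (`shift_px`) is already known, for example from the belt speed and the frame interval. The function name and the threshold value are illustrative only. OpenCV's `cv2.absdiff` computes the same per-pixel difference; plain NumPy is used here to keep the example short.

```python
import numpy as np

def motion_mask(frame_a, frame_b, shift_px, threshold=30):
    """Return a boolean mask of pixels that changed beyond the belt's own motion.

    frame_b is assumed to be taken after the belt advanced shift_px columns
    to the right, so column x in frame_a lines up with x + shift_px in frame_b.
    """
    width = frame_a.shape[1]
    # Keep only the strip of belt that is visible in both frames.
    a = frame_a[:, : width - shift_px].astype(np.int16)
    b = frame_b[:, shift_px:].astype(np.int16)
    # A crab that only rode the belt cancels out; a crab that moved does not.
    return np.abs(a - b) > threshold
```

A crab could then be labelled as alive when the number of `True` pixels inside its region exceeds some small count chosen to ignore noise.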

Well, this is an interesting question. While weighing different solutions to the problem, I learned that crabs are ectothermic animals, i.e. they cannot control their body temperature; it matches the temperature of their environment. So using a remote thermometer is out of the question. (But I learned something new, thank you for that :] )
A different, though somewhat cruel, method would be to take a shot of a crab on the belt, give it a small electric pulse (a very low voltage, just enough to make it react, similar to us when we get a static discharge), and take another shot of the crab immediately. Compare the two images to see if the crab has moved. If so, it should be alive; if not, RIP crab.
There are downsides to this solution too:
I really do not like the idea of giving electric shocks to crabs, even at low voltage. It sounds very cruel to me, and I am not sure it is legal where you live.
It adds another step to the process.
I have no idea what voltage such a system would need, or whether it would pose any danger to the employees around the conveyor belt.
[I hope I don't get stoned for suggesting giving electric shocks to crabs here]

Calculating a game's high score table

I need to create a function/method (in Python) which calculates a high score "leaderboard". Each player will have played any number of rounds of the game, receiving a score for each round. I want to know the best way to sort the top-ranking players (accounting for score AND number of rounds played). The possible scores for each round are F, D-, D, D+, C-, C, C+, B-, B, B+, A-, and A.
Obviously a simple average won't work because it doesn't take into account the number of rounds played. What's the best way to set up a fair sorting function?
EDIT: I've been reading some of the really great answers here and I want to try to clear up my question a bit. I want both the player's score AND the number of rounds they've played to count towards their ranking in a way that's fair, meaning a player with 20 B's should rank higher than a player with 5 A's. Basically the high score should reflect general effort and skill: the more rounds played PLUS the higher their scores, the higher their ranking should be.
EDIT 2: After reading the answers, I think the best way to do it is a simple total sum of the players points across all rounds. I'm not sure which answer to assign the green check to because you were all correct.

There are many ways that you could do this. For example, let F through A map to 0 through 11 (you can choose your own values, but try to take difficulty into account), so each grade is worth one point more than the previous one. For every game you play, you receive a score from 0 to 11. Keep a total score and add each game's score to it. That way, a person who receives 7 A's gets 77, while a person who receives 7 A-'s gets 70; then simply sort players by total. Each function has its drawbacks, of course. This function is not the "best": consider that 20 B's (160 points) would exceed 7 A's, even though 7 A's is a much better per-round result. If you can give me more details about how you want to rank players, it will be much easier to get the algorithm down.
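The cumulative scheme described above can be written in a few lines. This is a minimal sketch: the 0 to 11 point values follow the answer's suggestion, while the function names and the dictionary layout (player name mapped to a list of grades) are assumptions made for the example.

```python
# Grades in ascending order; a grade's index is its point value (F=0 ... A=11).
GRADES = ["F", "D-", "D", "D+", "C-", "C", "C+", "B-", "B", "B+", "A-", "A"]
POINTS = {grade: points for points, grade in enumerate(GRADES)}

def total_points(rounds):
    """Sum the point values of every round a player has played."""
    return sum(POINTS[grade] for grade in rounds)

def leaderboard(players):
    """Return player names sorted by total points, highest first."""
    return sorted(players, key=lambda name: total_points(players[name]), reverse=True)
```

With these values, a player with 20 B's (160 points) ranks above a player with 5 A's (55 points), which matches the ordering requested in the question's first edit.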

What you are asking is essentially how we define "good" players, and it's not an easy problem. As you mentioned, a simple average score or picking the highest score may not be an ideal answer, depending on your game design.
I'd recommend that you read about the Elo rating system for chess, and modified versions of it, before you design your own player rating system.
One simple option is to set a window (like the 10 most recent games) and use the average score over that window. Players who have played fewer games than the window size would be in an "in placement" state. Again, it's not an easy problem and it depends heavily on what your game is. Good luck!
[UPDATE]
I assumed that your game is player vs. player. If not, this is another story. Most games just keep each player's highest score, no matter how many times they play, and that becomes their entry in the leaderboard. Since you don't say anything about your game, I have no idea why that wouldn't be fair. As I mentioned earlier, you could set a window for the average score or the highest score. You can even reset your leaderboard every month or remove players who haven't played for a week. It all depends on your game and what you want. Whatever you do, make sure that the rules are crystal clear to players; otherwise they will easily get upset and frustrated.
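The windowed average mentioned above might be sketched like this. The function name is hypothetical, scores are assumed to be numeric and listed oldest first, and returning `None` is just one way to represent the "in placement" state.

```python
def windowed_average(scores, window=10):
    """Average of the most recent `window` scores.

    Returns None while the player has fewer than `window` games,
    i.e. is still "in placement".
    """
    if len(scores) < window:
        return None
    recent = scores[-window:]
    return sum(recent) / window
```

Players whose result is `None` can be left off the leaderboard until they have played enough games.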
