Chess position evaluation using neural network

I'm working on an AI that should be able to play chess. I want to make use of keras neural networks to evaluate position on the board. I would like to teach the NN by playing plenty of games between AI and AI. I already have alpha-beta pruning implemented.
My idea was to create a csv file with positions from every single game the AI has played. I would choose the variables I would like to store there. Very simple example:
"white_pawns","black_pawns","white_queens","black_queens","white_pawns_on_side","white_won"
3,7,1,2,0,False
3,5,3,0,1,True
I would like to train a model using these values and then use it to evaluate current board position. So the main question is:
How can I make a neural network output a value for a position given these variables? E.g. 0 when it's a draw or 1 when we are one pawn up. Keras preferred, but I'm open to any other Python library.
I would also be grateful if you could dispel a few other doubts of mine.
Are there any flaws in that approach? Wouldn't using every position from a single game cause the neural network to overfit? Maybe I should pick only a few positions from each game?

I think you know this, but when a human evaluates the board, they look not only at the material but also at the positions of the pieces. Secondly, with this csv you can't decide which move is better if the only thing you see is true or false. This is why an engine's evaluation is numerical. Or do you want the network to output a number from -1 to 1 and treat that as the score? I am looking to do the same thing, but with 1 for a white win, -1 for a black win and 0 for a draw (in the dataset file). If you want to do this with me, hit me up (is there a messaging service for Stack Overflow?).
Conclusion
In my opinion, the input should be a numerical representation of the board, and the target should not be a class label but a numeric score (a regression target). It is actually simpler.
I have a Python engine that I am working on, and this is an opportunity to meet new people who are interested in the same things I am.
This is my first answer, so if something is unclear please leave a comment and I will try to help!
Also, like krish said, this can be implemented with reinforcement learning. But first you need to build a DQN (deep Q-network; Q-learning is a really popular reinforcement learning algorithm), and for that you need another network, because otherwise this will take a lot of time to train.
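The labelling scheme discussed above (1 for a white win, -1 for a black win, 0 for a draw) and the idea of keeping only a few positions per game can be sketched roughly as follows. This is only an illustration: `label_positions`, `RESULT_TO_TARGET`, the result strings and the sample feature rows are all hypothetical names and data, not part of any library.

```python
import random

# Hypothetical mapping from a game result to a numeric training target.
RESULT_TO_TARGET = {"white": 1.0, "draw": 0.0, "black": -1.0}

def label_positions(games, per_game=3, seed=0):
    """Sample a few positions from each game and label them with the outcome.

    `games` is a list of (positions, result) pairs, where `positions` is a
    list of feature vectors and `result` is "white", "black" or "draw".
    Sampling only a few positions per game reduces the number of highly
    correlated rows that come from the same game.
    """
    rng = random.Random(seed)
    rows = []
    for positions, result in games:
        target = RESULT_TO_TARGET[result]
        k = min(per_game, len(positions))
        for features in rng.sample(positions, k):
            rows.append((list(features), target))
    return rows

# Made-up feature rows in the spirit of the csv from the question.
games = [
    ([[3, 7, 1, 2, 0], [3, 6, 1, 1, 0], [2, 6, 1, 1, 1], [2, 5, 0, 1, 1]], "black"),
    ([[3, 5, 3, 0, 1], [3, 4, 3, 0, 1]], "white"),
]
rows = label_positions(games, per_game=2)
```

The resulting `(features, target)` rows could then be fed to a regression model, for example a network with a single tanh output unit trained with mean squared error, so that its output stays in the -1 to 1 range.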

Related

Limit RNN legal output for each time step

I'm trying to make an RNN predict moves for a card game. In each time step, only certain actions are legal (i.e. some moves cannot be made in certain situations).
So at any given point through the game, one out of 12 moves is the correct one. Each move is labeled as an int in range 0 through 11. In most situations, only a subset of these are actually legal moves. So say I try to train the model, and it predicts e.g. move 4 in a situation, however only moves 2, 3 and 9 are legal at this time. After this move has been made, a different subset is allowed for the next time step. How do I make it predict from only a subset of the moves?
I haven't come so far as to coding the model yet, but I intend to use Keras/TensorFlow LSTM in Python to do this.
I would be happy if you could point me in a good direction here!

You can pass a mask as an argument when calling your model.
But the simplest way is to get the prediction as probabilities over all moves and then just ignore those that are not permitted.
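A minimal sketch of the "ignore illegal moves" approach, assuming the model already returns one probability per move (12 here, as in the question). The helper name `pick_legal_move` and the probability values are made up for illustration.

```python
def pick_legal_move(probs, legal_moves):
    """Return the most probable legal move and the renormalised distribution.

    `probs` holds one predicted probability per move index; `legal_moves`
    lists the move indices allowed at the current time step.
    """
    legal = sorted(set(legal_moves))
    # Zero out every illegal move.
    masked = [p if i in legal else 0.0 for i, p in enumerate(probs)]
    total = sum(masked)
    if total == 0.0:
        # The model put no mass on any legal move: fall back to uniform.
        masked = [1.0 / len(legal) if i in legal else 0.0
                  for i in range(len(probs))]
    else:
        masked = [p / total for p in masked]
    best = max(legal, key=lambda i: masked[i])
    return best, masked

# The model prefers move 4, but only moves 2, 3 and 9 are legal right now.
probs = [0.02, 0.03, 0.10, 0.05, 0.40, 0.05, 0.05, 0.05, 0.05, 0.15, 0.03, 0.02]
move, masked = pick_legal_move(probs, legal_moves=[2, 3, 9])
```

The same masking can be applied at every time step with that step's own set of legal moves.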

How to give an AI controls in a video game?

So I made Pong using PyGame and I want to use genetic algorithms to have an AI learn to play the game. I want it to only know the location of its paddle and the ball and controls. I just don't know how to have the AI move the paddle on its own. I don't want to do like: "If the ball is above you, go up." I want it to just try random stuff until it learns what to do.
So my question is, how do I get the AI to try controls and see what works?

Learning Atari Pong has become a standard task in reinforcement learning. For example, there is the OpenAI Baselines GitHub repo, which implements RL algorithms that can be plugged into various tasks.
You definitely don't need those advanced algorithms just to learn Pong the way you describe, but you can learn from the API they use to separate the task (the "environment" in reinforcement learning terms) from the AI part (the "controller" or "agent"). For this, I suggest reading the OpenAI Gym documentation on how to add a new environment.
As inputs, you could either use a few float numbers (position and velocity of the ball, or two positions instead of the velocity, plus the position of the paddle) or discrete inputs (integers, or just pixels, which are much harder to learn from). Those inputs could be connected to a small neural network.
For the command output, the simplest thing to do is to predict a probability for moving up or down. This is a good idea because when you evaluate your controller, it will have some non-zero chance of scoring points, so your genetic algorithm can compare different controllers (with different weights) against each other. Just use the sigmoid function on your neural net output, and interpret it as probability.
If you initialize all your neural network weights to a good random range, you probably can get a pong player that doesn't completely suck just by trying random weights for long enough (even without a GA).
PS: if you didn't plan to use a neural network: they are really simple to implement from scratch if you only have to implement the forward pass, e.g. if you skip back-propagation training and use a GA to learn the weights instead (or an evolution strategy, or just random weights). The hardest part is finding a good range for the initial random weights.
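A forward-pass-only network of the kind described above can be sketched in a few lines of plain Python. Everything here is an assumption made for illustration: the five inputs, the hidden layer size, the weight range `scale` and the function names are not taken from any library.

```python
import math
import random

def random_weights(n_inputs, n_hidden, scale=1.0, rng=random):
    """Draw all weights uniformly from [-scale, scale].

    The last entry of each row is the bias. `scale` is a guess that
    would need tuning, as noted above.
    """
    hidden = [[rng.uniform(-scale, scale) for _ in range(n_inputs + 1)]
              for _ in range(n_hidden)]
    output = [rng.uniform(-scale, scale) for _ in range(n_hidden + 1)]
    return hidden, output

def prob_move_up(inputs, weights):
    """Forward pass: one tanh hidden layer, sigmoid output in (0, 1)."""
    hidden_w, output_w = weights
    hidden = [math.tanh(sum(w * x for w, x in zip(row, inputs)) + row[-1])
              for row in hidden_w]
    z = sum(w * h for w, h in zip(output_w, hidden)) + output_w[-1]
    return 1.0 / (1.0 + math.exp(-z))

rng = random.Random(42)
weights = random_weights(n_inputs=5, n_hidden=8, rng=rng)
# Assumed inputs: ball x, ball y, ball vx, ball vy, paddle y (all scaled).
state = [0.5, 0.3, 0.01, -0.02, 0.4]
p = prob_move_up(state, weights)
go_up = rng.random() < p  # sample the action from the probability
```

A GA would treat the nested `weights` lists as the genome, score each candidate by playing some games, and mutate or recombine the best ones.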

One design consideration that may be helpful is to expose a minimal set of game state details through another interface and, conversely, to accept commands for the player paddle through it. For example, with each frame update you could send a simple structure describing the ball position and both paddle positions out through a socket to another process. Following the same pattern, you could create a structure that is sent as a reply to that message describing how to move the player paddle. For example:
# Pong game program
import socket
import struct
# Set up server or client socket
# ... Into game loop
state = (p1_paddle_y, p2_paddle_y, ball_x, ball_y, victory_state)
# assuming integer pixel locations, and victory_state is -1:Loss, 0:InProgress, 1:Win
# '>LLLLh' = big-endian, four unsigned 32-bit ints and one signed 16-bit int
myGameStateMsg = struct.pack('>LLLLh', *state)
sock.sendall(myGameStateMsg) # Sending game state to player
playerMsg = sock.recv(4) # Get player command (one 4-byte integer)
playerCmd = struct.unpack('>i', playerMsg)[0] # unpack() returns a tuple
# playerCmd is an integer describing direction & speed of paddle motion
# ... Process game state update, repeat loop
You could accomplish the same effect using threads and a transacted structure, but you'll need to consider properly guarding those structures (read-while-write problems, etc.)
Personally, I prefer the first approach (sockets & multi-processing) for stability reasons. Suppose there's some sort of bug that causes a crash; if you've already got process separation, it becomes easier to identify the source of the crash. At the thread-level, it's still possible but a little more challenging. One of the other benefits of the multi-processing approach is that you can easily set up multiple players and have the game expand (1vInGameAI, 1v1, 3v3, 4v4). Especially when you expand, you could test out different algorithms, like Q-Learning, adaptive dynamic programming, etc. and have them play each other!
Addendum: Sockets 101
Sockets are a mechanism that lets more than one process (i.e., a running program) send messages to one another. These processes can be running on the same machine or across the network. In a sense, using them is like reading and writing a file that is constantly being modified (that's the abstraction sockets provide), but they also provide blocking calls that make the process wait until information is available.
There is a lot more detail that can be discussed about sockets (like file-sockets vs network-sockets (FD vs IP); UDP vs TCP, etc.) that could easily fill multiple pages. Instead, please refer to the following tutorial about a basic setup: https://docs.python.org/3/howto/sockets.html. With that, you'll have a basic understanding of what they can provide and where to go for more advanced techniques with them.
You may also want to consult the struct documentation for introductory message packing: https://docs.python.org/3/library/struct.html. There are better ways of doing this, but you won't understand much about how they work and break down without understanding structs.
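To try the message exchange without writing a server and a client first, a connected pair of sockets inside one process is enough. The sketch below uses `socket.socketpair()` from the standard library; the state values and the trivial "player" rule are placeholders standing in for a real game and a learned controller.

```python
import socket
import struct

STATE_FORMAT = ">LLLLh"  # four unsigned pixel coordinates + signed victory flag
COMMAND_FORMAT = ">i"    # signed integer: direction & speed of the paddle

# Two already-connected sockets; in a real setup these would live in
# separate processes.
game_sock, player_sock = socket.socketpair()

# Game side: pack and send the current state (placeholder values).
state = (120, 200, 315, 88, 0)
game_sock.sendall(struct.pack(STATE_FORMAT, *state))

# Player side: receive the state, decide, and reply with a command.
raw_state = player_sock.recv(struct.calcsize(STATE_FORMAT))
p1_y, p2_y, ball_x, ball_y, victory = struct.unpack(STATE_FORMAT, raw_state)
command = -5 if ball_y < p1_y else 5  # stand-in for the AI's decision
player_sock.sendall(struct.pack(COMMAND_FORMAT, command))

# Game side: read the command back.
raw_command = game_sock.recv(struct.calcsize(COMMAND_FORMAT))
(received_command,) = struct.unpack(COMMAND_FORMAT, raw_command)

game_sock.close()
player_sock.close()
```

Using the same explicit byte order (`>`) in both format strings keeps the two sides in agreement regardless of the machine they run on.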

So as the AI input you'd want the position of the paddle and the position of the ball. The AI output is two boolean values indicating whether the AI should press the up or the down button on the next simulation step.
I'd also suggest adding another input value, the ball's velocity. Otherwise, you would likely need to add the location of the ball in the previous simulation step as another input, plus a much more complicated middle layer for the AI to learn the concept of velocity.

How can I make my neural network emphasize that some data is more important than the rest?

I looked around online but couldn't find anything, but I may well have missed a piece of literature on this. I am running a basic neural net on a 289 component vector to produce a 285 component vector. In my input, the last 4 pieces of data are critical to change the rest of the input into the resultant 285 for the output. That is to say, the input is 285 + 4, such that the 4 morph the rest of the input into the output.
But when running a neural network on this, I am not sure how to reflect this. Would I need to use convolution on the rest of the input? I want my system to emphasize the 4 data points that critically affect the other 285. I am still new to all of this, so a few pointers would be great!
Again, if there is something already written on this, then that would be awesome too.

I don't think you have any reason to do this, since the network will infer that on its own. The weights for each input will be reduced or increased according to its importance for the output.
What you could do though, is to have a preliminary network that is going to have the 285 component as an input, and then a new network that is going to have the 4 critical components and the output of the preliminary network as an input.
[285 compo.]---[neural network]---+---[neural network]---[output 285 compo.]
                                  |
[4 compo.]------------------------+
For instance, you could treat a picture with convolution networks and then add some meta information later in a fully connected network to process everything.
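The two-stage layout can be sketched as a plain forward pass. This is an untrained illustration only: the intermediate size `N_CODE`, the weight range and the helper names `dense` and `forward` are arbitrary assumptions, and real training would normally be done with a framework that provides back-propagation.

```python
import math
import random

def dense(n_in, n_out, rng):
    """Random weights for one fully connected layer (last column is the bias)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)]
            for _ in range(n_out)]

def forward(layer, inputs, activation=math.tanh):
    """Apply one fully connected layer to a list of inputs."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + row[-1])
            for row in layer]

rng = random.Random(0)
N_MAIN, N_CRITICAL, N_CODE = 285, 4, 32  # N_CODE is an arbitrary choice

preliminary = dense(N_MAIN, N_CODE, rng)           # sees only the 285 values
final = dense(N_CODE + N_CRITICAL, N_MAIN, rng)    # sees code + 4 critical values

main_part = [rng.uniform(-1, 1) for _ in range(N_MAIN)]
critical_part = [rng.uniform(-1, 1) for _ in range(N_CRITICAL)]

code = forward(preliminary, main_part)
# The 4 critical components are concatenated with the preliminary output.
output = forward(final, code + critical_part, activation=lambda z: z)
```

Because the four critical values enter after the 285 have been compressed, they make up a much larger share of the second network's input.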

The neural network should more or less learn this thing by itself. Especially with newer approaches like deep learning & friends, where the amount of hand-tuning is almost zero. However, this does assume that the function which you're trying to learn is learnable and that the system you use has enough power to learn it. That's a function of the complexity of the network involved (number of layers, nodes, types of activations etc.), the learning algorithms involved, as well as the data you supply.
It's really hard to tell without knowing more about the domain you're addressing. What sort of signals are we talking about (I assume they're signals since you speak of convolution)? What are the four inputs about? I assume they have a different modality than the other 285.
Perhaps this doc will help a little bit though.

Theoretically, you can let the network try to learn this relationship. However, there are good reasons to try to rethink the way you're formulating the problem. Also, the difficulty a neural network will have learning this function is going to depend strongly on your specific problem (and the best way to figure it out is probably just to try it and find out).
Let me try to help by making an analogy to a simpler problem: let's take your 289-element vector and assume that 285 elements take values from -1 to 1 and the remaining four take values from -1000 to 1000. This maintains your original premise: that the four variables are somehow far more important in determining the output than the 285. (I understand that this loses the coupled relationship between the variables, but let's run with the example anyways.)
This is a simpler example for two reasons:
it's easier to see why it's harder to learn
there is a bag of well-understood tricks to solve it
Compared to a scenario where all 289 inputs have the same input range, a gradient descent algorithm will be slower to converge on the heterogeneous case. (Extra credit: try this!) Geoff Hinton has a rather famous set of slides which describes this effect fairly well: Lecture 6. I believe this is also part of a Coursera course now.
Hinton's slides also touch on two ways to attack this simpler version of the problem. The first is just to pre-process your inputs. If you scale down the inputs to have the same mean and variance, your gradient descent optimizer will converge more quickly. The other is to use a more powerful optimization method, specifically one with per-parameter adaptive learning rates, which handles this case as well as trickier scenarios. Andrej Karpathy's fantastic notes from Stanford's CS231n class are a good intro.
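The pre-processing option amounts to standardising each input column to zero mean and unit variance. A minimal sketch using only the standard library, with made-up data in which one column spans roughly -1 to 1 and the other -1000 to 1000 (the function name `standardize_columns` is hypothetical):

```python
from statistics import mean, pstdev

def standardize_columns(rows):
    """Scale every column to zero mean and unit (population) variance."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    # `or 1.0` guards against division by zero for constant columns.
    stds = [pstdev(c) or 1.0 for c in columns]
    scaled = [[(x - m) / s for x, m, s in zip(row, means, stds)]
              for row in rows]
    return scaled, means, stds

# Column 0 is a "normal" input, column 1 one of the large-range inputs.
rows = [[0.5, -800.0], [-0.5, 400.0], [0.0, 1000.0], [1.0, -600.0]]
scaled, means, stds = standardize_columns(rows)
```

The `means` and `stds` computed on the training data should be stored and reused to scale any later inputs in the same way.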
But let's tie this back to your problem: that there are four "special" variables which transform the entire input. Given enough time and input, it's possible that a network can learn this function. But understand that if this transformation is complex and makes the optimization landscape rough, your network will likely have some trouble dealing with it.
If there's a way to transform your representation of the problem to avoid this link, I'd say try to pursue that. If not, then be prepared to resort to some bigger guns to solve the problem.
Without knowing the specifics of your problem, it's hard to give more concrete advice. Plus, ultimately, you're the one that will be solving it, so you're going to be the expert eventually!

To emphasize particular elements of your input vector, you have to give the neural network less information about the unimportant ones.
Try to encode the first, less important 285 numbers into one number (or a vector of any size you like) with a multilayer neural network, then use that number together with the other 4 numbers as the input to a second neural network.
Example:
v1=[1,2,3,..........285]
v2=[286,287,288,289]
v_out = Neural_network(input_vector=v1, neurons=[100,1]) # 100 hidden units with one output
v_final = Neural_network(input_vector=v_out + v2, neurons=[100,1]) # encoded value plus the 4 important numbers; 100 hidden units with one output

Use machine learning for simple robot control

I'd like to improve my little robot with machine learning.
Up to now it uses simple while and if then decisions in its main function to act as a lawn mowing robot.
My idea is to use SKLearn for that purpose.
Please help me to find the right first steps.
I have a few sensors that tell the robot about the outside world:
World ={yaw, pan, tilt, distance_to_front_obstacle, ground_color}
I have a state vector
State = {left_motor, right_motor, cutter_motor}
that controls the 3 actuators of the robot.
I'd like to build a dataset of input and output values to teach sklearn the desired behaviour; after that, the input values should produce the correct output values for the actuators.
One example: if the motors are on and the robot should be moving forward, but the distance meter keeps reporting constant values, the robot seems to be blocked. It should then decide to back up, turn, and move in another direction.
First of all, do you think this is possible with sklearn, and second, how should I start?
My (simple) robot control code is here: http://github.com/bgewehr/RPiMower
Please help me with the first steps!

I would suggest using reinforcement learning. Here you have a tutorial on Q-learning that fits your problem well.
If you want code in python, right now I think there is no implementation of Q-learning in scikit-learn. However, I can give you some examples of code in python that you could use: 1, 2 and 3.
Also, please keep in mind that reinforcement learning aims to maximize the sum of all future rewards, so you have to look at the long-term picture rather than only the immediate reward.
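The core of tabular Q-learning is a single update rule, sketched below. The state and action names ("blocked", "free", "forward", "reverse_and_turn"), the rewards and the learning parameters are invented for this example and would have to be derived from the robot's real sensors and motors.

```python
from collections import defaultdict

def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """One Q-learning step:
    Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]

ACTIONS = ["forward", "reverse_and_turn"]
q = defaultdict(float)  # every unseen (state, action) pair starts at 0.0

# Pushing forward while blocked is penalised; backing away is rewarded.
q_update(q, "blocked", "forward", -1.0, "blocked", ACTIONS)
q_update(q, "blocked", "reverse_and_turn", 1.0, "free", ACTIONS)

# The greedy policy now prefers to reverse and turn when blocked.
best_action = max(ACTIONS, key=lambda a: q[("blocked", a)])
```

The `gamma * best_next` term is what makes the agent care about future rewards and not only the immediate one.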
Good luck :-)

The sklearn package contains a lot of useful tools for machine learning, so I don't think that's a problem. If it is, there are definitely other useful Python packages. I think collecting data for the supervised learning phase will be the challenging part, and I wonder whether it would be smart to mark out a track with tape within a grid system. That would make it easier to translate the track into labels (x, y positions in the grid). I think each cell in the grid should be small if you want to make complex tracks later on. It may also be worth checking how Google approached this for their self-driving car.

how to build game playing neural network in Python?

I am a neural-network beginner. I'd like to learn the basics of neural networks by teaching computers to play checkers. Actually, the games I want to learn are Domineering and Hex.
These games are pretty easy to store and the rules are much simpler than chess, but there aren't too many people who play them. If I can get this idea off the ground, it would be great for experimenting with Combinatorial Game Theory.
PyBrain seems to be the clear winner for Python neural networks, but who can walk me through how to set up a neural net for my game-playing task? A google search turned up Blondie24 in 2001 but it uses some genetic algorithms - I don't want to complicate things.

Once you replace "neural networks" by machine learning (or even artificial intelligence, rather, imho) as the comments rightly suggest, I think you're better off starting with Alpha-beta pruning, the Minimax algorithm, and Branch and bound ideas.
Basically:
At each step, you build the tree of all possible futures and evaluate the leaf positions with an evaluation function (e.g. board domination, connectivity, material, etc.)
Propagate the results up the tree, choosing the best play you can make and the worst your opponent can inflict on you (the best for them), until you know which move to play in the position you're at.
Rinse, repeat. Branch and bound saves you a lot of computation if you have a few good heuristics, and the level of your program will basically be determined by how deep it is able to search the game tree.
This will most probably be the basic framework in which anyone would introduce new ideas, so if you're not familiar with it, go for it :-)
