The quick summary of my question:
I'm trying to solve a clone of the Flappy Bird game found on the internet with the reinforcement learning algorithm Proximal Policy Optimization (PPO). I've run into an issue with designing the reward system. How can I specify a reward for the agent, given that it's a third-party game that does not return anything to me, and the only information I get is the visual information from the window?
Some details and the background:
Prior to trying to solve a third-party game, I've played with several OpenAI Gym environments such as Cart-Pole, Mountain Car, Lunar Lander and, recently, Car Racing. To solve them I used PG, DQN, Actor-Critic and PPO algorithms. After understanding how to work with problems where the state is an image, I decided to take on a new challenge and get out of the sandbox (Gym).
I picked Flappy Bird because it's simple in concept, its action space is tiny (really just two actions), and it's notoriously hard for humans.
My code can be found here: https://github.com/Mike-Kom/Flappy-Bird-PPO
The agent class and buffer were tested on Car Racing, so there shouldn't be any issues with the RL algorithm. The neural net was changed a little due to the different state size, but conceptually it's the same, so there should not be any problems there either.
My current guess is that the reward system is not robust and causes the agent not to learn properly.
Currently I'm just giving the agent 0.025 points each step, plus 2 points from the 25th frame onward (I've found that this is exactly the frame at which the agent passes between the first two pipes), but it does not seem to work.
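In code, the scheme looks roughly like this (just a sketch of what I described; the `game_over` flag would still have to be detected from the pixels, e.g. via template matching, which is the part I'm unsure about):

```python
def compute_reward(frame_index, game_over):
    """Current reward scheme: small per-step bonus, extra reward past frame 25.

    `game_over` is assumed to be detected from the screen image somehow;
    that detection is not shown here.
    """
    if game_over:
        return 0.0            # no explicit death penalty yet
    reward = 0.025            # small "alive" bonus every step
    if frame_index >= 25:     # empirically, the frame where the first pipe gap is passed
        reward += 2.0         # bonus once the agent is past the first pipes
    return reward
```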
Any suggestions on how to solve an external environment and especially on how to design the reward system are welcome!
Sorry if the code is messy and not professional; it was originally meant to be just for me :) Programming is just my hobby, and my occupation is far from code writing.
Moreover, this is my first question here, and I wanted to take the opportunity to thank all of you for writing your answers and suggestions to different questions! You make this community super helpful for so many people! Even though I hadn't asked a question before, I found a ton of answers and good suggestions here :)
As a mechanical engineer, how do I start learning Python, and what do I need to learn in Python for machine learning in the mechanical field?
Can anyone suggest the best resources?
Well, there are plenty of free courses on YouTube. Personally, I am not in the machine learning field.
However, to get started in Python, I would suggest the following Python video: https://www.youtube.com/watch?v=rfscVS0vtbw
This is the same video I used when I first got started in Python.
Once you have the basics down, I would suggest you do some beginner projects just to brush up on the skills you have learned. Some of the ones I recommend for total beginners are programs like Tic-Tac-Toe, Hangman, and Random code generators, or even a program that takes a string and turns it into a simple code (like making every letter "a" into "b" and "b" into "c"). Needless to say, there are many, many simple beginner projects. Do a couple of these (around 3-4). This will get you used to Python (remember, you are just scratching the surface).
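For example, the letter-shifting beginner project mentioned above might look something like this (one possible version, just to show the scale of these exercises):

```python
def shift_letters(text, shift=1):
    """Turn each letter into the next one ('a' -> 'b', 'b' -> 'c', 'z' wraps to 'a')."""
    result = []
    for ch in text:
        if ch.isalpha():
            # Shift within the alphabet, preserving upper/lower case.
            base = ord('a') if ch.islower() else ord('A')
            result.append(chr(base + (ord(ch) - base + shift) % 26))
        else:
            result.append(ch)  # leave spaces and punctuation untouched
    return ''.join(result)
```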
Next, I would suggest you take a dive into numpy and pandas modules in Python. These two modules are very important for handling data (and that is critical for machine learning). I am not proficient in numpy or pandas, but I did look into them at one point. For numpy, I watched the following video: https://www.youtube.com/watch?v=QUT1VHiLmmI
and for pandas I watched videos by the YouTube channel named Corey Schafer that were about pandas.
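Just to give a tiny taste of what these two modules do (an illustrative sketch of my own, not taken from any of the videos):

```python
import numpy as np
import pandas as pd

# numpy: vectorized arithmetic over whole arrays, no explicit loops needed.
heights_cm = np.array([170, 165, 180])
heights_m = heights_cm / 100.0

# pandas: tabular data with labeled columns, filtering, and quick statistics.
df = pd.DataFrame({"name": ["Ana", "Ben", "Cy"], "height_m": heights_m})
tall = df[df["height_m"] > 1.66]     # boolean filtering, central to data prep
mean_height = df["height_m"].mean()  # column-wise statistics in one call
```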
Now, if you do not have knowledge of the math behind machine learning, you have one of two choices. The machine learning tutorials I am suggesting will allow you to make progress, but you will definitely not understand the underlying concepts (you will see what each line of code does and what each block of code is meant to be, but you will not understand the why). You can choose to skip this next step, but know that it is critical for success to understand the underlying mechanisms of something rather than just the surface (as with everything else). So, I would strongly suggest you also look into machine learning courses that teach theory rather than practice: https://www.youtube.com/watch?v=GwIo3gDZCVQ
It is important for me to point out that these resources more than likely will not be enough. As you watch these videos, you will end up having questions that they may not cover, so you will have to do a bit of research on the internet yourself as well.
After you get those down, you can then actually look into Machine Learning with Python.
I believe there are two main libraries for practicing Machine Learning: TensorFlow and PyTorch. I am not aware of the primary differences between the two, but I do know that they are both quite popular and have plenty of documentation.
Again, same technique as before. Go on YouTube, and look into courses about Machine Learning, here are a few:
A very, very, long tutorial (~25 hrs) in PyTorch: https://www.youtube.com/watch?v=V_xro1bcAuA
Shorter tutorial in PyTorch (~5hrs): https://www.youtube.com/watch?v=c36lUUr864M
A 2-part TensorFlow Course: https://www.youtube.com/watch?v=tpCFfeUEGs8 and https://www.youtube.com/watch?v=ZUKz4125WNI&t=0s
Also, I would like to add that along with PyTorch and TensorFlow there is also Keras, but I believe it may not be as popular and/or well-documented as the other two.
Also, these are just my suggestions for you to get started from the little knowledge I have about machine learning.
The common thing about beginning anything is that, no matter what it is, it generally starts with scratching the surface of the basics and sitting down and watching tutorials for long hours until you get the hang of it.
I have the following scenario:
I want to have a vector-field simulation which shows the current of a fluid, let's say water. This current produces a certain noise, which can change when a solid object is submerged in the current.
Is there a way to somehow attach this noise/sound to the visuals of VTK?
I am not really experienced with VTK, so any point in the right direction is appreciated.
Thanks in advance!
This is a pretty general question on an esoteric topic. A good first step in these cases is to do a scientific literature review to see what researchers have attempted before, what tools they used, and what success they had. After a quick search I found a few relevant papers that cover generating sound from simulations/data.
Sounding liquids: Automatic sound synthesis from fluid simulation
Visual to Sound: Generating Natural Sound for Videos in the Wild
Auditory Display and the VTK Sonification Toolkit
Listen to your data: Model-based sonification for data analysis
After reviewing these, you'll have a better idea of what's already been attempted and what's possible.
For my MSc thesis I want to apply multi-agent RL to a bus control problem. The idea is that the buses operate on a given line, but without a timetable. The buses serve stops where passengers accumulate over time; the longer the interval between buses, the more passengers will be waiting at a stop (on average; it's a stochastic process). I also want to implement some intersections where buses will have to wait for a green light.
I'm not sure yet what my reward function will look like, but it will be something along the lines of keeping the intervals between buses as regular as possible, or minimising the total travel time of the passengers.
The agents in the problem will be the buses, but also the traffic lights. The traffic lights can choose when to show a green light for which road: apart from the buses, they will have other demand to process as well. The buses can choose to speed up, slow down, wait longer at a stop, or continue at normal speed.
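To make the first reward idea concrete, a headway-regularity reward could look something like this (purely a sketch of my own; the function name and the encoding of bus positions along a circular route are placeholders, not a settled design):

```python
def headway_reward(bus_positions, route_length):
    """Return a reward in (-inf, 0]; 0 means perfectly even spacing.

    `bus_positions` are positions along a circular route of `route_length`;
    the reward penalizes squared deviation of each gap from the ideal
    even headway.
    """
    n = len(bus_positions)
    ideal = route_length / n
    positions = sorted(p % route_length for p in bus_positions)
    # Gap from each bus to the next, including the wrap-around gap.
    gaps = [(positions[(i + 1) % n] - positions[i]) % route_length
            for i in range(n)]
    return -sum((g - ideal) ** 2 for g in gaps)
```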
To put this problem in an RL framework I will need an environment and suitable RL algorithms. Ideally I would have a flexible simulation environment to re-create my case-study bus line and connect it to off-the-shelf RL algorithms. However, so far I haven't found this, which means I may have to connect a simulation environment to something like an OpenAI Gym interface myself.
Does anyone have advice on which simulation environment may be suitable, and on whether it's possible to connect it to off-the-shelf RL algorithms?
I feel most comfortable programming in Python, but other languages are an option as well (though this would mean considerable extra effort on my side).
So far I have found the following simulation environments that may be suitable:
NetLogo
SimPy
Mesa
MATSim (https://www.matsim.org)
Matlab
CityFlow (https://cityflow-project.github.io/#about)
Flatland (https://www.aicrowd.com/challenges/neurips-2020-flatland-challenge/)
For the RL algorithms the options seem to be:
Code them myself
Create the environment according to the OpenAI gym API guidelines and use the OpenAI baselines algorithms.
I would love to hear some suggestions and advice on which environments may be most suitable for my problem!
You can also check SUMO as a traffic simulator and the RLlib library for multi-agent reinforcement learning.
I want to develop a Math quiz program using reinforcement learning.
Assume that we have 1000 questions in hand and 25 questions to be asked in each quiz. Instead of asking questions at random, the program has to learn from the way the user answers and choose the next question accordingly.
The quiz program should be a reinforcement learning agent. How should the solution be designed, and which reinforcement learning techniques should be used?
Example:
Bot: What is 5 + 1?
User: 3 (wrong answer)
Bot: asks an easier question next; if the answer had been correct, it would ask a more difficult question.
PPO is a very common technique for this type of RL application in the edTech space. You can take a lot of inspiration from this article. They use the RLgraph package and the PPO algorithm.
You would first have to define your goal/reward function. In your case, I would define the reward function to depend on the percentage of previous questions that were answered correctly. If this percentage is 0% or 100%, the reward is low (the questions are too difficult/easy). If it's close to 50%, you might choose the reward to be high.
That way, the algo will move towards questions that get 50% correctness (are medium difficulty). You can play with the range (last 2 q's or last 10 q's).
In the state space, you can also include the questions answered correctly, and maybe characteristics like age etc. to help initialise the algo when the user hasn't used it much yet.
As the action space, you can have all the questions. You could also cluster questions (e.g. difficult/easy, or geometry/algebra) based on your intuition and make the clusters the actions, to decrease the action space.
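Putting the reward idea above into code, a sketch might look like this (the window size and the linear shape of the peak are choices you would tune; none of this is from the linked article):

```python
def quiz_reward(recent_correct):
    """Reward peaked when ~50% of the recent answers are correct.

    `recent_correct` is a list of 0/1 outcomes for the last few questions
    (the window size is a tunable choice, e.g. the last 2 or the last 10).
    """
    if not recent_correct:
        return 0.0                      # no history yet: neutral reward
    p = sum(recent_correct) / len(recent_correct)
    # 1.0 at p = 0.5, falling linearly to 0.0 at p = 0 or p = 1.
    return 1.0 - abs(p - 0.5) / 0.5
```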
I'd like to improve my little robot with machine learning.
Up to now it uses simple while and if-then decisions in its main function to act as a lawn-mowing robot.
My idea is to use SKLearn for that purpose.
Please help me to find the right first steps.
I have a few sensors that tell it about the world outside:
World ={yaw, pan, tilt, distance_to_front_obstacle, ground_color}
I have a state vector
State = {left_motor, right_motor, cutter_motor}
that controls the 3 actuators of the robot.
I'd like to build a dataset of input and output values to teach sklearn the desired behaviour; after that, the input values should produce the correct output values for the actuators.
One example: if the motors are on and the robot should be moving forward, but the distance meter reports constant values, the robot seems to be blocked. It should then decide to back up, turn, and move in another direction.
First of all, do you think this is possible with sklearn, and second, how should I start?
My (simple) robot control code is here: http://github.com/bgewehr/RPiMower
Please help me with the first steps!
I would suggest using Reinforcement Learning. Here is a tutorial on Q-learning that fits your problem well.
If you want code in Python: right now I think there is no implementation of Q-learning in scikit-learn. However, I can give you some examples of Python code that you could use: 1, 2 and 3.
Also, please keep in mind that reinforcement learning is set up to maximize the sum of all future rewards. You have to focus on the long-term picture.
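To make the Q-learning suggestion concrete, here is a minimal tabular sketch (the state encoding and action names are illustrative placeholders for your robot, e.g. a discretized "blocked"/"free" state derived from the distance sensor):

```python
import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1
ACTIONS = ["forward", "back_and_turn", "stop_cutter"]   # placeholder motor commands
Q = defaultdict(float)                                  # maps (state, action) -> value

def choose_action(state):
    """Epsilon-greedy selection over the discrete action set."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state):
    """One Q-learning step: move Q toward reward + gamma * best next value."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
```

In a loop, you would read the sensors, discretize them into a state, call `choose_action`, execute the motor command, observe a reward (e.g. negative when blocked), and call `update`.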
Good luck :-)
The sklearn package contains a lot of useful tools for machine learning, so I don't think that's a problem; if it is, there are definitely other useful Python packages. I think collecting data for the supervised learning phase will be the challenging part, and I wonder if it would be smart to make a track with tape within a grid system. That would make it easier to translate the track to labels (x, y positions in the grid). Each cell in the grid should be small if you want to make complex tracks later on. It may also be smart to check how they did it with the self-driving Google car.
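To show that the sklearn route is at least plausible, here is a toy sketch with a decision tree mapping sensor readings to motor commands (the feature encoding, thresholds, and action labels are made up for illustration, not taken from the robot's code):

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical labeled examples in the sensor order from the question:
# [yaw, pan, tilt, distance_to_front_obstacle, ground_color_id]
X = [
    [0, 0, 0, 200, 1],   # path clear, on grass     -> drive forward
    [0, 0, 0,  10, 1],   # obstacle right ahead     -> back up and turn
    [0, 0, 0, 200, 0],   # clear, but off the grass -> turn back
]
y = ["forward", "back_and_turn", "turn_back"]

clf = DecisionTreeClassifier(random_state=0).fit(X, y)

# A new reading with a very close obstacle should map to the evasive action.
action = clf.predict([[0, 0, 0, 5, 1]])[0]
```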