Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 5 years ago.
I'm currently trying to write a bot to play Tetris on tetrisfriends.com to practice machine learning, but I've become stuck. I'm trying to find a way to read the player's score from the game, but Tesseract doesn't recognize the font/numbers, and I don't think I can retrain Tesseract to recognize them either, because the game doesn't use a full font, just numbers.
The image that I'm trying to read the numbers from is this:
https://imgur.com/a/OVwV5
When I use Tesseract I can get it to recognize other words on the page, just not the numbers, which is the part I need.
Does anyone have a way to do this, either by retraining Tesseract or by some other method?
I'm not very familiar with Tesseract in particular, but it might not be your best bet here. If the end goal is just to make a bot, you could probably pull the text directly from the app rather than worrying about OCR. If you want to learn more about machine learning and you haven't worked with them already, the MNIST and CIFAR-10 datasets are fantastic places to start.
Anyway! The image you're trying to test has very low contrast, and the font is heavily stylised. Looking at the website itself, the characters appear to be coloured yellow.
If you preprocess your image so that yellow pixels become black and all others become white, you will have a much cleaner source to work with.
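As a rough sketch of that preprocessing, assuming the frame is already loaded as an RGB NumPy array: the threshold values and the function name `yellow_to_black` are my own guesses for illustration, not values measured from the game, so expect to tune them against a real screenshot.

```python
import numpy as np

def yellow_to_black(rgb, min_red=150, min_green=150, max_blue=120):
    """Return a 2-D uint8 image: 0 where the pixel looks yellow, 255 elsewhere.

    The three thresholds are assumed values, not taken from the game.
    """
    r = rgb[..., 0].astype(int)
    g = rgb[..., 1].astype(int)
    b = rgb[..., 2].astype(int)
    # Yellow = strong red and green with comparatively little blue.
    is_yellow = (r >= min_red) & (g >= min_green) & (b <= max_blue)
    return np.where(is_yellow, 0, 255).astype(np.uint8)

# Tiny synthetic frame: dark-blue background with a yellow stroke in the middle column.
frame = np.zeros((3, 3, 3), dtype=np.uint8)
frame[...] = (20, 30, 90)
frame[:, 1] = (250, 220, 40)
clean = yellow_to_black(frame)
```

The resulting black-on-white image is what you would hand to Tesseract. If you already use OpenCV, `cv2.inRange` on an HSV copy of the frame is a common way to build the same kind of mask.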
If you want to push forward with Tesseract for this and the preprocessing isn't enough then you will probably have to retrain it for this font. You will need to prepare a corpus, process it similarly to how you expect your source data to look, and then use something like qt-box-editor to correct the data. This guide should be able to walk you through the basic steps of retraining.
Related
Closed 4 years ago.
I have images of the following type
etc..
What would be the easiest way of identifying what piece it is and if it is black or white? Do I need to use machine learning or is there an easier way?
It depends on your inputs. If all of your data looks like this (nice and clean, with contours being identical, just background and color changes), you could probably use some kind of pixel + color matching and you could be good to go.
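A minimal sketch of what such pixel + color matching could look like, assuming you have one boolean silhouette mask per piece shape at the same size as the input image. The helper name `identify_piece`, the toy "rook" and "pawn" masks, and the brightness cut-off of 127 are all made up for illustration.

```python
import numpy as np

def identify_piece(gray, templates):
    """gray: 2-D intensity array; templates: dict name -> boolean mask of the same shape.

    Returns (best_name, colour). Hypothetical helper, not a library function.
    """
    pixels = gray.astype(float).ravel()
    best_name, best_score = None, -1.0
    for name, mask in templates.items():
        # |correlation| is high when the mask separates piece from background,
        # whether the piece is lighter or darker than the square it stands on.
        score = abs(np.corrcoef(pixels, mask.ravel().astype(float))[0, 1])
        if score > best_score:
            best_name, best_score = name, score
    # Colour decision: mean brightness inside the matched silhouette (assumed cut-off).
    inside = gray[templates[best_name]].mean()
    return best_name, ("white" if inside > 127 else "black")

# Toy silhouettes standing in for real piece masks.
rook = np.zeros((8, 8), dtype=bool)
rook[2:6, 1:7] = True
pawn = np.zeros((8, 8), dtype=bool)
pawn[1:7, 3:5] = True
templates = {"rook": rook, "pawn": pawn}

white_pawn = np.full((8, 8), 120, dtype=np.uint8)   # bright piece on a mid-grey square
white_pawn[pawn] = 230
black_rook = np.full((8, 8), 200, dtype=np.uint8)   # dark piece on a light square
black_rook[rook] = 30

result_pawn = identify_piece(white_pawn, templates)
result_rook = identify_piece(black_rook, templates)
```

This only works under the assumption stated above, that the contours are identical across images and only the background and piece color change.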
As you probably know, deep learning and machine learning only approximate a function (a function of the pixels, in this case). If you can find that function without those methods, with a sensible amount of work, you always should.
And no, machine learning is not a silver bullet. You don't just take an image, throw it into a convolutional neural network as black-box magic, and get your results. That's not the point.
Sorry, but deep learning may be overkill for recognizing a known set of images. Use template matching!
https://machinelearningmastery.com/using-opencv-python-and-template-matching-to-play-wheres-waldo/
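To show the idea behind template matching without any dependencies beyond NumPy, here is a brute-force sketch that slides the template over the image and keeps the position with the smallest sum of squared differences. In practice you would more likely call OpenCV's `cv2.matchTemplate`; the function below is my own illustration, not the code from the linked article.

```python
import numpy as np

def match_template(image, template):
    """Return ((row, col), ssd) for the best-matching top-left corner.

    Brute-force sum of squared differences; fine for small images only.
    """
    image = image.astype(float)
    template = template.astype(float)
    ih, iw = image.shape
    th, tw = template.shape
    best_pos, best_ssd = None, np.inf
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            window = image[r:r + th, c:c + tw]
            ssd = float(np.sum((window - template) ** 2))
            if ssd < best_ssd:
                best_pos, best_ssd = (r, c), ssd
    return best_pos, best_ssd

# Synthetic scene: a small cross pasted at row 4, column 5 of an empty image.
template = np.array([[0, 9, 0],
                     [9, 9, 9],
                     [0, 9, 0]])
scene = np.zeros((10, 10))
scene[4:7, 5:8] = template
position, score = match_template(scene, template)
```

With one template per piece and per color, the template with the lowest score tells you which piece it is.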
You could do this using machine learning (plain or convolutional neural nets). This isn't a very hard problem, but you have to do the manual work of creating a proper dataset with lots of pictures.
For example, for each piece you need pictures on both white and black squares, and you need different combinations of chess piece sets and board color schemes. To make the system more robust to the color scheme, you can try different color channels.
There are also open questions: will the pictures you test always have the same resolution? If they won't, you should take that into consideration when creating the dataset.
Closed 5 years ago.
I am trying to implement some kind of text detection algorithm, and I want to separate the image into regions where each region contains text of a different font size.
As in this image, for example:
Is there an easy way to implement this using Python and/or OpenCV? If so, how?
I did try googling it but could not find anything useful.
Thanks.
This is an interesting question. There are a few steps you need to take to achieve your goal. I hope you know enough basic computer vision (familiarity with OpenCV functions helps) to follow the steps I am suggesting.
Group the words together using a morphological dilation.
Use OpenCV's findContours function to find all the blobs. Taking the bounding rectangle of each contour gives you the position, width and height of each blob.
Here is the tricky part: now that you have data on each blob, run a clustering algorithm on it, with location (x, y) and geometry (width, height) as your features.
Once the blobs are clustered correctly, it's a matter of finding the leftmost, rightmost, topmost and bottommost blob in each cluster to draw the bounding rectangle.
I hope this gives you enough information to start your work. It is not detailed, but I think it is enough to guide you.
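The steps above can be sketched in plain NumPy as follows. This is a simplified illustration under my own assumptions: the dilation is horizontal only, the blob finder is a hand-written flood fill standing in for `cv2.findContours` plus `cv2.boundingRect`, and the "clustering" is just a two-way split on blob height at the largest gap rather than a real clustering algorithm on (x, y, width, height).

```python
from collections import deque
import numpy as np

def dilate_horizontal(mask, radius):
    """Binary dilation with a 1 x (2*radius+1) structuring element."""
    out = mask.copy()
    for shift in range(1, radius + 1):
        out[:, shift:] |= mask[:, :-shift]
        out[:, :-shift] |= mask[:, shift:]
    return out

def blob_boxes(mask):
    """Bounding boxes (x, y, w, h) of 4-connected blobs, found by flood fill."""
    seen = np.zeros(mask.shape, dtype=bool)
    boxes = []
    for y0, x0 in zip(*np.nonzero(mask)):
        if seen[y0, x0]:
            continue
        seen[y0, x0] = True
        queue, ys, xs = deque([(y0, x0)]), [y0], [x0]
        while queue:
            y, x = queue.popleft()
            for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                inside = 0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                if inside and mask[ny, nx] and not seen[ny, nx]:
                    seen[ny, nx] = True
                    queue.append((ny, nx))
                    ys.append(ny)
                    xs.append(nx)
        boxes.append((int(min(xs)), int(min(ys)),
                      int(max(xs) - min(xs) + 1), int(max(ys) - min(ys) + 1)))
    return boxes

def split_by_height(boxes):
    """Two-way split at the largest gap between sorted heights (needs >= 2 blobs)."""
    ordered = sorted(boxes, key=lambda box: box[3])
    gaps = [ordered[i + 1][3] - ordered[i][3] for i in range(len(ordered) - 1)]
    cut = gaps.index(max(gaps)) + 1
    return ordered[:cut], ordered[cut:]

def enclosing_rect(boxes):
    """Smallest (x, y, w, h) rectangle containing every box in the group."""
    x0 = min(x for x, y, w, h in boxes)
    y0 = min(y for x, y, w, h in boxes)
    x1 = max(x + w for x, y, w, h in boxes)
    y1 = max(y + h for x, y, w, h in boxes)
    return (x0, y0, x1 - x0, y1 - y0)

# Synthetic page: one line of tall "letters" and one line of short ones.
page = np.zeros((20, 30), dtype=bool)
for x in (1, 5, 11, 15):
    page[1:7, x:x + 3] = True      # large text, 6 px tall
for x in (1, 4, 9, 12):
    page[12:14, x:x + 2] = True    # small text, 2 px tall

words = blob_boxes(dilate_horizontal(page, 1))
small, large = split_by_height(words)
large_region = enclosing_rect(large)
small_region = enclosing_rect(small)
```

On real scans you would want a proper clustering step (for example k-means on the blob features) and a dilation radius tuned to the letter spacing.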
Closed 7 years ago.
My project is recognition of handwritten Tamil characters using Python, OpenCV and scikit-learn.
Input: images of handwritten Tamil characters.
Output: the recognised characters in a text file.
What are the basic steps for this project?
I know of three steps:
preprocessing, feature point extraction and classification.
But I don't know exactly how to proceed with the project.
How do I do the preprocessing?
Where should I store the training data set images?
How do I extract feature points in OpenCV?
How do I implement this?
Please help.
I am working on a similar project, Handwritten Arabic Character Recognition and Generation, but I haven't used OpenCV so far. In OpenCV you apply filters to the image and get back a processed image of the same size every time, but Arabic has so much variation in every character that I found OpenCV of no use for that purpose.
For your problem, I have some suggestions and some helpful material too. Before starting, you have to do a lot of research on character recognition. Read the research papers of Alex Graves; he has done a lot of work on character recognition and generation, and it will help you a lot.
I am using a neural network for this purpose. Initially it is a bit difficult to understand, but once you understand it, you will get everything you want. Python is a very good language for this too. I have a lot of material for learning about neural networks and how to train one on your dataset, and I have shared some useful links below:
Alex Graves's Profile: http://www.cs.toronto.edu/~graves/
Neural Network Understanding: http://nikhilbuduma.com/2014/12/29/deep-learning-in-a-nutshell/
Video: https://www.youtube.com/watch?v=q0pm3BrIUFo
Neural Network Code In Python: http://iamtrask.github.io/2015/07/12/basic-python-network/
Hope it helps you.
Thanks
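To make the three steps from the question (preprocessing, feature extraction, classification) a bit more concrete, here is a toy sketch in NumPy. It is not a neural network and not a Tamil recognizer: the zoning features, the nearest-centroid classifier `CentroidClassifier`, and the synthetic "L" and "T" glyphs are all stand-ins I made up to show how the stages connect. It also assumes every image contains at least one ink pixel.

```python
import numpy as np

def preprocess(gray, threshold=128):
    """Binarise (ink = dark pixels) and crop to the ink's bounding box."""
    ink = gray < threshold
    rows = np.flatnonzero(ink.any(axis=1))
    cols = np.flatnonzero(ink.any(axis=0))
    return ink[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

def zoning_features(ink, grid=4):
    """Fraction of ink pixels in each cell of a grid x grid partition."""
    row_parts = np.array_split(np.arange(ink.shape[0]), grid)
    col_parts = np.array_split(np.arange(ink.shape[1]), grid)
    return np.array([ink[np.ix_(r, c)].mean() if r.size and c.size else 0.0
                     for r in row_parts for c in col_parts])

class CentroidClassifier:
    """Hypothetical minimal classifier: assign the label of the nearest class mean."""
    def fit(self, features, labels):
        self.labels = sorted(set(labels))
        self.centroids = np.array([
            np.mean([f for f, lab in zip(features, labels) if lab == label], axis=0)
            for label in self.labels])
        return self

    def predict(self, feature):
        distances = np.linalg.norm(self.centroids - feature, axis=1)
        return self.labels[int(np.argmin(distances))]

def draw(shape, top, left, size=8):
    """Synthetic 16x16 glyph (black ink on white), standing in for a scanned character."""
    canvas = np.full((16, 16), 255, dtype=np.uint8)
    if shape == "L":
        canvas[top:top + size, left:left + 2] = 0
        canvas[top + size - 2:top + size, left:left + size] = 0
    else:  # "T"
        canvas[top:top + 2, left:left + size] = 0
        canvas[top:top + size, left + 3:left + 5] = 0
    return canvas

train = [(draw(s, t, l), s) for s in ("L", "T") for t, l in ((1, 1), (4, 3))]
X = [zoning_features(preprocess(img)) for img, _ in train]
y = [label for _, label in train]
model = CentroidClassifier().fit(X, y)
predicted = [model.predict(zoning_features(preprocess(draw(s, 6, 7)))) for s in ("L", "T")]
```

For real handwriting you would replace the classifier with something stronger, such as the neural networks from the links above or a scikit-learn classifier, and keep the training images in one folder per character so the folder name can serve as the label.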
Closed 7 years ago.
I am doing a project where I am attempting to remove the green screen from a video in Python, but I do not know how to go about it.
Thank you!
Well, I presume you have some kind of RGB image for each frame of video,
i.e. an array of shape (height, width, 3), which you can also treat as an N by 3 array with one row per pixel.
(You could use OpenCV to read each frame.)
So it is a case of going through the image, locating all the green, and replacing it with what you want.
E.g. if the N by 3 array is called arr, then for each row i you would check whether arr[i] == [0,255,0].
But due to the nature of film, you aren't going to have a perfectly uniform 0,255,0 green. There will be shadows and other slight variations. Perhaps it wasn't even 0,255,0 to start out with.
So you are going to be looking at removing a range of colours. Now for each row we are searching for a range of colours and replacing them with your choice.
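A minimal sketch of removing a range of greens instead of one exact value, assuming the frame and the replacement background are arrays of the same shape. The rule "green is bright and clearly larger than the other two channels" and the numbers 100 and 40 are my own guesses; tune them on your footage.

```python
import numpy as np

def green_mask(frame, min_green=100, dominance=40):
    """True where green is bright and clearly dominates the other two channels.

    Green is channel 1 in both RGB and OpenCV's BGR order, and the test is
    symmetric in the other two channels, so either order works here.
    """
    first = frame[..., 0].astype(int)
    green = frame[..., 1].astype(int)
    last = frame[..., 2].astype(int)
    return (green >= min_green) & (green - first >= dominance) & (green - last >= dominance)

def replace_green(frame, background):
    """Copy background pixels into the frame wherever the mask fires."""
    mask = green_mask(frame)
    out = frame.copy()
    out[mask] = background[mask]
    return out

# One-row frame: pure green, shadowed green, and a skin-like tone that must survive.
frame = np.array([[[0, 255, 0], [30, 200, 40], [200, 180, 160]]], dtype=np.uint8)
background = np.full_like(frame, (10, 10, 200))
keyed = replace_green(frame, background)
```

Working on the whole array at once like this is much faster than looping over rows in Python.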
We now run the risk of marking a colour for removal that we actually want to keep, so how can we check for that?
We still probably won't get a perfect match around the edges (of the objects/people we want to keep in the image), so to make this less obvious, we might want to use a little bit of blur and so on and so forth.
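One way to get that softer edge is to blur the keep/remove mask and use it as an alpha channel when compositing. The sketch below assumes you already have a boolean `keep_mask` (True for pixels to keep) and uses a simple box blur written by hand; `cv2.GaussianBlur` on the mask would be the usual OpenCV route. The function names and the radius are illustrative choices.

```python
import numpy as np

def box_blur(alpha, radius=1):
    """Average each value over a (2*radius+1)^2 window, replicating the edges."""
    padded = np.pad(alpha.astype(float), radius, mode="edge")
    size = 2 * radius + 1
    out = np.zeros(alpha.shape, dtype=float)
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + alpha.shape[0], dx:dx + alpha.shape[1]]
    return out / (size * size)

def composite(foreground, background, keep_mask, radius=1):
    """Blend foreground over background with a feathered version of keep_mask."""
    alpha = box_blur(keep_mask, radius)[..., None]   # 1.0 means keep the foreground
    blended = alpha * foreground + (1.0 - alpha) * background
    return blended.round().astype(np.uint8)

# Foreground occupies the left three columns; the edge gets a two-pixel ramp.
foreground = np.full((3, 6, 3), 200, dtype=np.uint8)
background = np.zeros((3, 6, 3), dtype=np.uint8)
keep_mask = np.zeros((3, 6), dtype=bool)
keep_mask[:, :3] = True
result = composite(foreground, background, keep_mask)
middle_row = result[1, :, 0].tolist()
```

A larger radius gives a wider, softer transition, at the cost of letting more of the background bleed into the subject.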
Look at this video: https://www.youtube.com/watch?v=rIWoLCFvjME
Try to think about what code logic is required for each little step the user takes.
Also think about all the decisions the user makes that are purely subjective. Obviously these would be nigh-on impossible to automate reliably. So now we are talking about some kind of interactive application that allows the user to select different actions based on their subjective choice.
And we quickly see why green screen is often removed manually, frame by frame, using a powerful editing application like Photoshop, After Effects, etc.
OpenCV (http://opencv-python-tutroals.readthedocs.org/en/latest/) implements a lot of the algorithms for you. There is almost enough there to build your own interactive green screen removal software.
Closed 8 years ago.
In sliding window object detectors, is it possible to do object detection "intelligently"? For example, if a human is looking for a vehicle, they're not going to look into the sky for a car. But an object detector that uses a sliding window is going to slide the window across the entire image (including the sky) and run the object classifier on each window, resulting in a lot of wasted time. Are there any techniques out there to make sure it only looks in reasonable places?
Edit
I understand we'll have to look through everything at least once, but I wouldn't want to run a heavy complicated classifier on each window. A pre-classification classifier of sorts, perhaps?
Have you considered looking at saliency detection algorithms? They give you an indication of where in the image a human would most likely focus. A good example would be a human in an open field: the sky would have low saliency, while the human would have high saliency.
Maybe put your image through a saliency detection algorithm first, then threshold and find regions of where to search instead of the entire image.
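A rough sketch of that pipeline: compute a saliency map, threshold it, and only hand windows with enough salient pixels to the expensive classifier. The saliency measure here (absolute deviation from the mean intensity) is a crude stand-in I picked to keep the example short; it is not the context-aware algorithm mentioned in this answer, and the window size, stride and thresholds are arbitrary.

```python
import numpy as np

def crude_saliency(gray):
    """Stand-in saliency: absolute deviation from the mean intensity, scaled to [0, 1]."""
    deviation = np.abs(gray.astype(float) - gray.mean())
    peak = deviation.max()
    return deviation / peak if peak > 0 else deviation

def candidate_windows(saliency, size, stride, threshold=0.5, min_fraction=0.2):
    """Yield (row, col) of windows in which at least min_fraction of pixels are salient."""
    salient = saliency >= threshold
    rows, cols = salient.shape
    for r in range(0, rows - size + 1, stride):
        for c in range(0, cols - size + 1, stride):
            if salient[r:r + size, c:c + size].mean() >= min_fraction:
                yield (r, c)

# Synthetic image: uniform bright "sky" with one small dark object near the bottom.
image = np.full((16, 16), 200, dtype=np.uint8)
image[10:14, 4:8] = 20

all_windows = [(r, c) for r in range(0, 9, 4) for c in range(0, 9, 4)]
windows = list(candidate_windows(crude_saliency(image), size=8, stride=4))
```

In this toy case only two of the nine windows survive, so the heavy classifier would run on far fewer windows than a full sweep of the image.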
A great algorithm for this is by Stas Goferman: Context-Aware Saliency Detection - http://webee.technion.ac.il/~ayellet/Ps/10-Saliency.pdf.
There is also code here to get you started: https://sites.google.com/a/jyunfan.co.cc/site/opensource-1/contextsaliency
Unfortunately it is in MATLAB, and from your tag you want to look at Python. However, there are many similarities between numpy / scipy and MATLAB so hopefully that will help you if you want to transcribe any code.
Check it out!