This question may be a little different, since I'm pretty much a noob at programming. I've recently started playing a Pokémon game, and I thought of an idea for a cool Python program that would be able to grab a color on a certain pixel to detect if a pokémon is shiny or not.
However, due to my very limited programming experience, I don't know what modules to use and how to use them.
So basically, here's what I want it to do:
1 - Move the cursor to a certain pixel and click.
2 - Detect the color of a certain pixel, and compare that to the desired color.
3 - If it's not desirable, click a button and re-loop until it's desirable.
So, it's pretty obvious that we'll be needing a while loop, but can someone explain how to do the above three things in relatively simple terms? Thanks.
Try breaking down this list into actions and searching for answers to each action.
For example, 1 is performed by the user? So we don't have to program that.
For 2, we need to determine the location of the mouse when clicked and get the color under it.
For 3, compare the RGB values (or whatever) to the desired values for that pokemon. This is complicated because your program needs to figure out which pokemon it is checking against. There are probably pokemon where their regular color is another's shiny. Try breaking down this into even smaller problems :)
No guarantees that these links will be perfect, just trying to show how you need to break down the problem into smaller, workable chunks which you can address either directly in code or by searching for other people who have already solved those smaller problems.
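To make the "smaller chunks" idea concrete, here is a minimal sketch of steps 2 and 3 as a loop. The names `colors_match`, `hunt`, `get_pixel` and `click_retry` are made up for illustration, and the tolerance of 10 per channel is an arbitrary guess. Reading the screen and clicking are passed in as functions so the loop itself can be tried without touching the mouse.

```python
from typing import Callable, Tuple

RGB = Tuple[int, int, int]


def colors_match(pixel: RGB, target: RGB, tolerance: int = 10) -> bool:
    """Return True if every channel is within `tolerance` of the target."""
    return all(abs(p - t) <= tolerance for p, t in zip(pixel, target))


def hunt(get_pixel: Callable[[], RGB],
         click_retry: Callable[[], None],
         target: RGB,
         max_tries: int = 1000) -> int:
    """Loop until the pixel matches; return the number of retries used.

    Returns -1 if no match was seen within `max_tries` checks.
    """
    for tries in range(max_tries):
        if colors_match(get_pixel(), target):
            return tries
        click_retry()  # not the desired color: press the button and check again
    return -1
```

One possible way to wire it up, assuming you install the third-party pyautogui package (my suggestion, not something the question names): pass `lambda: pyautogui.pixel(x, y)` as `get_pixel` and `lambda: pyautogui.click(bx, by)` as `click_retry`, where `x, y, bx, by` are screen coordinates you have to find yourself.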
So let's say I have an object like a face. I take multiple pictures of the face, all at different angles and from far and close. I have a rough idea of how to make a 3D model out of these pictures but don't know how to accomplish it. My idea goes like this.
First make code that gets the image object and gets rid of all background "noise".
Second find what part of the 3d model the picture is about and place a tag on the image for where it should fit.
Third collect and overlap all the images together to create a 3d object.
Anyone have any idea how to accomplish any of these steps or any ideas how to create a 3d model out of a series of images? I use python 3.10.4.
It seems that you are asking if there are some Python modules that would help to implement a complete photogrammetry process.
Please note that, even in the existing (and commercial) photogrammetry solutions, the process is not always fully automated; sometimes it requires some manual tweaking and point cloud selection.
Anyway, to the best of my knowledge, what you asked for requires implementing the following steps:
detect common features between the different photographs
infer the position in space of the camera that took each photograph
generate a point cloud from the photographs based on their relative position in space and the common features
convert the point cloud into a 3D mesh.
Possibly, all of these steps can be implemented in Python, but I'm not aware of an "off-the-shelf" module that does so.
There's a commercial solution called Metashape, from Agisoft. It has a Python module you can use, but beware that it has its pitfalls (it threw a segmentation fault for me at the end of processing, which makes things... icky), and the support tends to ignore bigger problems, so you can expect your ticket to be ignored. Still, it does the job quite well.
I am asking this question because two weeks of research have left me really confused.
I have a bunch of images from which I want to read the numbers at runtime (it is needed for the reward function in reinforcement learning). The thing is that they look pretty clear to me (I know it is a completely different matter for OCR systems, which is why I am providing additional images to show what I am talking about).
Because they are rather clear, I tried PyTesseract first, and when that did not work out I tried to research which other methods could be useful to me.
... and that's how my search ended here, because two weeks of trying to find out which method would be best suited to my problem just raised more questions.
Currently I think the best solution is to create a digit-recognition model from the MNIST/SVHN datasets, but isn't that a bit of overkill? I mean, the images are standardized, they are in grayscale, they are small, and the font of the numbers stays the same, so I suppose there is an easier way of modifying those images or using a different OCR method.
That is why I am asking two questions:
Which method would be most useful for my case, if not a model trained with the MNIST/SVHN datasets?
Is there any kind of documentation/books/sources which could make the actual choice of infrastructure easier? I mean, let's say that in the future I again need to plan which OCR system to use. On what basis should I make the choice? Is it purely a trial-and-error thing?
If what you have to recognize are those 7 segment digits, forget about any OCR package.
Use the outline of the window to find the size and position of the digits. Then count the black pixels in seven predefined areas, facing the segments.
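A minimal sketch of the segment-counting idea, assuming each digit has already been cropped to its own box and thresholded into a boolean array (True where the pixel is dark). The seven probe areas in `SEGMENT_AREAS` and the `min_fill` threshold are hypothetical values that would need tuning to the real font; only the lookup from lit segments to digits is standard.

```python
import numpy as np

# Probe areas as fractions of the digit box: (top, bottom, left, right).
# Order: a (top), b (upper right), c (lower right), d (bottom),
# e (lower left), f (upper left), g (middle). These numbers are guesses.
SEGMENT_AREAS = [
    (0.00, 0.15, 0.25, 0.75),  # a
    (0.20, 0.40, 0.80, 1.00),  # b
    (0.60, 0.80, 0.80, 1.00),  # c
    (0.85, 1.00, 0.25, 0.75),  # d
    (0.60, 0.80, 0.00, 0.20),  # e
    (0.20, 0.40, 0.00, 0.20),  # f
    (0.45, 0.55, 0.25, 0.75),  # g
]

# Which segments (a, b, c, d, e, f, g) are lit for each digit.
DIGITS = {
    (1, 1, 1, 1, 1, 1, 0): 0,
    (0, 1, 1, 0, 0, 0, 0): 1,
    (1, 1, 0, 1, 1, 0, 1): 2,
    (1, 1, 1, 1, 0, 0, 1): 3,
    (0, 1, 1, 0, 0, 1, 1): 4,
    (1, 0, 1, 1, 0, 1, 1): 5,
    (1, 0, 1, 1, 1, 1, 1): 6,
    (1, 1, 1, 0, 0, 0, 0): 7,
    (1, 1, 1, 1, 1, 1, 1): 8,
    (1, 1, 1, 1, 0, 1, 1): 9,
}


def read_digit(dark, min_fill=0.5):
    """Return the digit shown in a 2-D boolean array, or None if unknown."""
    h, w = dark.shape
    lit = []
    for top, bottom, left, right in SEGMENT_AREAS:
        area = dark[int(top * h):int(bottom * h), int(left * w):int(right * w)]
        # A segment counts as lit when enough of its probe area is dark.
        lit.append(int(area.size > 0 and area.mean() >= min_fill))
    return DIGITS.get(tuple(lit))
```

Because the images are standardized, the crop boxes for each digit position can be fixed once by hand, and `read_digit` is then called once per position.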
Recently I've been working on a project using the Raspberry Pi camera, OpenCV, and Python to count people passing by a specific area, live, since for my usage that will be easier than processing a recorded video.
Overall the code works, but I've been experiencing two problems with the counting part of it:
1 - If an object stays in the reference line, it keeps adding to the counts;
2 - Sometimes depending on the speed of the object, it is counted multiple times;
I am not an expert in Python, and may be lacking the words in English to look for the proper solution, so I thought maybe someone could tell me what would be a better way to solve this problem. To illustrate, here is a GIF sample:
Even though it looks like more than one reference box is crossing the line, this happens when only one box crosses it, as well as when the object stays on the line.
This is the code that checks if the object is crossing the line:
    if (TestaInterseccaoEntrada(CoordenadaYCentroContorno,CoordenadaYLinhaEntrada,CoordenadaYLinhaSaida)):
        ContadorEntradas += 1
    if (TestaInterseccaoSaida(CoordenadaYCentroContorno,CoordenadaYLinhaEntrada,CoordenadaYLinhaSaida)):
        ContadorSaidas += 1
I thought of using some kind of delay with time.sleep(x) on the loop, but that does not solve it obviously, and also looks bad =D.
If needed, I can post the rest of the code here, but to keep things tidy it's at this link: Code Paste
Don't mind any bad syntax or errors, part of it is not mine and the part that is, looks terrible! XD
Thanks in advance.
Cool project! It's quite a challenge to count the number of bounding boxes that pass each line if you don't track them. It's even worse if you want to count them going both ways.
Because of this difficulty, people usually prefer to track the object and then look at its trajectory to determine whether the object passed the line or not.
This link can help you understand the difference. It also provides code to do the detection (but you got that part working already) and the tracking (which you will need).
https://www.pyimagesearch.com/2018/08/13/opencv-people-counter/
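To show why a trajectory fixes both problems from the question, here is a rough sketch that is independent of the linked article's code. It assumes you already have, for one tracked object, the list of its centroid y-coordinates frame by frame; `count_crossings` is a made-up helper name.

```python
def count_crossings(track_ys, line_y):
    """Count line crossings for one tracked object's centroid y-history.

    Returns (down, up): moves from above the line to on/below it, and back.
    """
    down = up = 0
    # Compare each position with the previous one: a crossing is a change
    # of side, so sitting on the line adds nothing after the first frame.
    for prev, curr in zip(track_ys, track_ys[1:]):
        if prev < line_y <= curr:
            down += 1
        elif prev >= line_y > curr:
            up += 1
    return down, up
```

An object that stops on the line is counted once, and a fast object that jumps over the line between two frames is still counted once, because only the change of side matters.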
Besides that, the easiest way to track is to link each box to the box in the next frame with which it has the highest IoU (intersection over union). A good and easy implementation can be found here:
https://github.com/bochinski/iou-tracker
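For a feel of what IoU linking means, here is a simplified greedy sketch. It is not the code from the repository above; the function names, the `(x1, y1, x2, y2)` box format and the `min_iou` value of 0.3 are all assumptions for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_boxes(prev_boxes, new_boxes, min_iou=0.3):
    """Greedily link each previous box to the unused new box with highest IoU.

    Returns a dict {index in prev_boxes: index in new_boxes}.
    """
    links = {}
    used = set()
    for i, prev in enumerate(prev_boxes):
        best_j, best_score = None, min_iou
        for j, new in enumerate(new_boxes):
            if j in used:
                continue
            score = iou(prev, new)
            if score >= best_score:
                best_j, best_score = j, score
        if best_j is not None:
            links[i] = best_j
            used.add(best_j)
    return links
```

Boxes in the new frame that end up unlinked would start new tracks, and each track's centroid history can then be checked against the reference line.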
Good luck!
I'm working on a basic Tkinter image viewer. After fiddling with a lot of the code, I have two lines which, depending on the operation that triggered the refresh, account for 50-95% of the execution time.
self.photo = ImageTk.PhotoImage(resized)
self.main_image.config(image=self.photo)
Is there a faster way to display a PIL/Pillow Image in Tkinter?
I would recommend testing the 4000x4000 images you're concerned about (which is twice the resolution of a 4K monitor). Use a Google advanced search to find such images, or use a photo/image editor to tile an image you already have. Since I don't think many people would connect a 4K (or better) monitor to a low-end computer, you could then test the difference between scaling down a large image and simply displaying it, so if most of the work is in resizing the image you don't have to worry as much about that part.
Next, test the individual performance of each of the two lines you posted. You might try implementing some kind of intelligent pre-caching, which many programs do: resize and create the next photo as the user is looking at the current one, then when the user goes to the next image all the program has to do is reconfigure self.main_image. You can see this general strategy at work in the standard Windows Photo Viewer, which responds apparently instantaneously to normal usage, but can have noticeable lag if you browse too quickly or switch your browsing direction.
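A bare-bones sketch of the pre-caching idea, kept independent of Tkinter so it can be tried on its own. `Prefetcher` and its `render` argument are made-up names; `render` stands for whatever turns a file path into the object you display.

```python
class Prefetcher:
    """Keep the current item and its neighbours rendered ahead of time."""

    def __init__(self, paths, render):
        self.paths = list(paths)
        self.render = render  # callable: path -> displayable object
        self.cache = {}       # index -> rendered object

    def get(self, index):
        """Return the rendered item at `index`, rendering it if needed."""
        if index not in self.cache:
            self.cache[index] = self.render(self.paths[index])
        return self.cache[index]

    def prefetch_around(self, index):
        """Render the neighbours of `index` and evict everything else."""
        wanted = {i for i in (index - 1, index, index + 1)
                  if 0 <= i < len(self.paths)}
        for i in wanted:
            self.get(i)
        # Drop items far from the current one so memory use stays bounded.
        for i in list(self.cache):
            if i not in wanted:
                del self.cache[i]
```

In the viewer, `render` would presumably do the resize and the `ImageTk.PhotoImage(resized)` call. After configuring `self.main_image` with `get(index)`, the program could schedule `prefetch_around(index)` with the widget's `after_idle` method so the neighbours are prepared while the user looks at the current image.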
Also ask your target userbase what sort of machines they're using and how large their images are. Remember that reasonable "recommended minimum specs" are acceptable.
Most importantly, keep in mind that the point of Python is clear, concise, readable code. If you have to throw away those benefits for the sake of performance, you might as well use a language that places more emphasis on performance, like C. Make absolutely sure that these performance improvements are an actual need, rather than just "my program might lag! I have to fix that."
Quick question: I'm looking for a Python function that does the equivalent job of MATLAB's imfill.m. I realize that Python has OpenCV, but I have been unable to get that to work properly and am trying to find a substitute for it. The part of imfill that I'm trying to replicate is the 'holes' part of it.
I have a mask that I've generated, and I'm trying to fill in all regions that are surrounded by 'land' and leave only the water regions unfilled.
If this isn't clear enough please let me know and I can try and be more specific. Thank you for your time.
I was able to find a function within scipy that performs similarly to what imfill does. It's called binary_fill_holes, and it can be found here for anyone who is having the same problem as me.
I can't take any real credit for finding it, though, since unutbu pointed it out here in an answer to one of my other questions, PIL Plus/imToolkit replacements.
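For anyone landing here, a small example of how it can be used. The 6x6 mask is made-up data: True stands for 'land', and the block of False in the middle is a hole fully enclosed by it.

```python
import numpy as np
from scipy import ndimage

# A ring of land (True) around a 2x2 hole, with open water (False) outside.
mask = np.array([
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
], dtype=bool)

# Fills only regions that are completely surrounded by True values;
# background connected to the image border is left untouched.
filled = ndimage.binary_fill_holes(mask)
```

Here the enclosed 2x2 block becomes True while the outer border of water stays False, which is the behaviour of imfill's 'holes' option described in the question.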