I want to start a Python project that identifies elements on the screen, such as the size of a square or all elements of a certain color, and I don't even know where to start.
For example, all the peaks above a certain value on a graph.
Could anyone please point me in the right direction?
If you're looking to interact with elements inside a browser, look at Selenium. If you want to control the desktop itself, look into Sikuli.
In either case, you can use OpenCV to identify elements and do template matching.
Edit: more comments after more details from OP
If you're just looking to identify the peaks in the graph, you can take a screenshot of the display at regular intervals using Sikuli or PyScreenshot and then use template matching in OpenCV (either directly or using Sikuli) to get the coordinates of the peaks in the screenshot. The horizontal line across the graph might throw off some of the template matching, but you can play around with the various parameters to get the results you want.
Check out this tutorial for template matching.
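To make the template-matching idea concrete, here is a minimal pure-Python sketch of what the matching step computes. It is an illustration, not OpenCV itself: the function name match_template and the toy data are made up for this example. In OpenCV the comparable calls would be cv2.matchTemplate(image, template, cv2.TM_SQDIFF) followed by cv2.minMaxLoc, which do the same search far faster on real screenshots.

```python
def match_template(image, template):
    """Return ((row, col), score) of the best match; lower score is better.

    image and template are 2D lists of grayscale values. The score is the
    sum of squared differences, the measure OpenCV calls TM_SQDIFF.
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best_pos, best_score = None, None
    # Slide the template over every position where it fits completely.
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            score = sum(
                (image[r + dr][c + dc] - template[dr][dc]) ** 2
                for dr in range(th)
                for dc in range(tw)
            )
            if best_score is None or score < best_score:
                best_pos, best_score = (r, c), score
    return best_pos, best_score


# Tiny synthetic "screenshot": background 0 with one bright peak shape.
screenshot = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 9, 0, 0],
    [0, 0, 9, 9, 9, 0],
    [0, 0, 0, 0, 0, 0],
]
peak = [
    [0, 9, 0],
    [9, 9, 9],
]
position, score = match_template(screenshot, peak)
print(position, score)
```

The returned position is the top-left corner of the best match in (row, column) order. To find several peaks instead of one, you would keep every position whose score is below a threshold you choose.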
Related
I'm trying to automatically insert text into DXF contours using Python. I have a batch of DXF files for laser cutting. Often we want to engrave the part number into the sheet-metal part.
My approach is to make a rectangular box whose length and width equal the text height and width. Once the program has found a place where the box is outside the inner contours and inside the outer contour, I want to fill the box with text. I tried approximating the contours with polygons and checking whether the box fits, which works reasonably well, but it is not completely finished yet.
Before putting more effort into the program, I wondered whether there is a library or tool that already has this function, because computing time is quite high at the moment, or whether anyone has an easier approach than mine.
The next release of ezdxf (v0.16 is in beta now) has a new text2path add-on, which uses Matplotlib to render text strings as path objects. These path objects can be rendered as:
POLYLINE and SPLINE entities, which preserve the smooth curves
POLYLINE or LWPOLYLINE entities flattened to straight line segments
HATCH entities with or without spline edges
Alternatively, the paths can be flattened into simple vertices.
There are some examples in the https://github.com/mozman/ezdxf/tree/master/examples/addons folder (*_to_path.py).
It is possible to transform the path objects with a Matrix44 transformation, and there is even a function to fit paths into a box: fit_paths_into_box().
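Independent of ezdxf, the placement test described in the question (box inside the outer contour, outside every inner contour) can be sketched in plain Python. This is a rough sketch under stated assumptions: contours are already approximated as polygons, the names point_in_polygon and box_fits are hypothetical, and only corners and hole vertices are tested, so it is an approximation rather than an exact geometric check. A geometry library such as Shapely could do the containment test exactly.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from (x, y) to the right cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def box_fits(box, outer, holes):
    """Approximate check for box = (xmin, ymin, xmax, ymax).

    Only box corners and hole vertices are tested, so a contour edge that
    cuts through the box without a vertex inside it is not detected.
    """
    xmin, ymin, xmax, ymax = box
    corners = [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax)]
    if not all(point_in_polygon(x, y, outer) for x, y in corners):
        return False
    for hole in holes:
        if any(point_in_polygon(x, y, hole) for x, y in corners):
            return False
        # A hole vertex lying inside the box also blocks the text.
        if any(xmin <= hx <= xmax and ymin <= hy <= ymax for hx, hy in hole):
            return False
    return True


# Toy part: a 100 x 50 plate with one rectangular cut-out on the right.
outer = [(0, 0), (100, 0), (100, 50), (0, 50)]
holes = [[(60, 10), (90, 10), (90, 40), (60, 40)]]
fits_left = box_fits((5, 5, 45, 15), outer, holes)
overlaps_hole = box_fits((50, 20, 70, 30), outer, holes)
print(fits_left, overlaps_hole)
```

Because each test is cheap, candidate positions can be scanned on a coarse grid first and refined only around the positions that pass, which may help with the computing time mentioned in the question.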
For additional questions, you can use the discussions board on GitHub.
I am writing a Python tool to find specific symbols (e.g. a circle or square with a number inside) in a drawing (a PDF or a PNG screenshot).
I know from another data source the specific number(s) that should be inside the circle/square.
Using OpenCV's matchTemplate I can find symbols and their coordinates.
One way would be to create all possible symbols (circles/squares with the numbers 1 to 1000) and save them, then use OpenCV to find the right one on the drawing, since I know the number to look for and thus the filled symbol.
I am sure that there is a smarter way to do this. Can somebody point me in the right direction?
Note: pdfminer will not work, since I would not be able to distinguish between measurement numbers and the text belonging to a symbol, but I could be wrong here.
I am also trying to solve a similar problem in a coding assignment. The input is a low-poly art illustration.
Once you find the locations of the UFOs, you need to crop each one and pass it through a classifier to find the number that the UFO contains. The classifier is trained on 5000 images.
I am now going to try the matchTemplate method you suggested to find the coordinates of the UFOs.
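The crop-then-classify step mentioned above could replace the idea of storing 1000 filled templates: locate candidate symbols once, then read the number inside each hit and compare it with the number you expect. The sketch below assumes the detections (top-left corners) already exist, for instance from matchTemplate; the names crop, find_symbol and toy_classifier are hypothetical, and the classifier is a placeholder for a real OCR step or trained model.

```python
def crop(image, top, left, height, width):
    """Return the sub-image as a new 2D list (rows x columns)."""
    return [row[left:left + width] for row in image[top:top + height]]


def find_symbol(image, detections, size, expected, classify):
    """Return the (top, left) of the first detection whose crop reads as
    `expected`, or None. `classify` is any callable mapping a crop to a number.
    """
    height, width = size
    for top, left in detections:
        if classify(crop(image, top, left, height, width)) == expected:
            return (top, left)
    return None


# Toy data: a 4 x 6 "drawing" whose cell values stand in for pixels.
drawing = [
    [0, 0, 0, 0, 0, 0],
    [0, 7, 7, 0, 3, 3],
    [0, 7, 7, 0, 3, 3],
    [0, 0, 0, 0, 0, 0],
]
detections = [(1, 1), (1, 4)]  # pretend output of the symbol search


def toy_classifier(patch):
    # Placeholder for a real OCR step or trained classifier.
    return patch[0][0]


location = find_symbol(drawing, detections, (2, 2), 3, toy_classifier)
print(location)
```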
I'm working on a project to help my visually impaired friend. A Python script will take a screenshot every second, convert whatever is on the image to text, and output the character nearest to the cursor's coordinates.
The user can move the cursor anywhere on the screen, and the character nearest to the cursor will be the program's output.
Don't worry about the form of the output; it will be audio. But for the sake of simplicity, let's assume it is a single text character.
Every tutorial I could find explained how to use OCR libraries only to convert all the text into one continuous text file.
For my particular application, each character must be associated with a specific coordinate, but I couldn't find a single resource on how to identify the location of a recognized character in the image.
Please explain how to extract the coordinates of a character from an image.
This is a good project, but I think it is a chicken-and-egg problem. You need OCR performed by a capable OCR engine (most don't provide coordinates), and the result will contain the text and the associated coordinates. Your question, "how to extract the coordinates of a character from an image", amounts to: perform OCR and get coordinates. If you perform zonal OCR, i.e. not on the entire screen, you need to know which zone to OCR, and establishing a zone that includes all the necessary text around your mouse location is probably the biggest challenge. My company at www.wisetrend.com builds such OCR-specialized projects per case. We'll be glad to help with this non-commercial project if you'd like to work jointly.
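One engine that can report per-character positions is Tesseract, whose box-file format lists one character per line as "char left bottom right top page", with y measured from the bottom of the image. The sketch below only covers the step after OCR: parsing such text and picking the character nearest to the cursor. The function names and the sample data are made up for illustration; with pytesseract installed, text in this format would come from pytesseract.image_to_boxes(image), which is not run here.

```python
import math


def parse_boxes(box_text, image_height):
    """Parse Tesseract box-file lines: 'char left bottom right top page'.

    Box files measure y from the bottom edge, so y is flipped here to match
    screen coordinates (origin at the top-left).
    """
    chars = []
    for line in box_text.splitlines():
        parts = line.split()
        if len(parts) < 5:
            continue  # skip empty or malformed lines
        char = parts[0]
        left, bottom, right, top = (int(p) for p in parts[1:5])
        center_x = (left + right) / 2
        center_y = image_height - (bottom + top) / 2
        chars.append((char, center_x, center_y))
    return chars


def nearest_char(chars, cursor):
    """Return the character whose box center is closest to cursor (x, y)."""
    cx, cy = cursor
    return min(chars, key=lambda c: math.hypot(c[1] - cx, c[2] - cy))[0]


# Hand-written sample in box-file format for a 100-pixel-tall screenshot.
sample = "H 10 70 20 90 0\ni 24 70 28 90 0\nX 60 10 72 30 0"
chars = parse_boxes(sample, image_height=100)
under_cursor = nearest_char(chars, cursor=(65, 82))
print(under_cursor)
```

To keep the once-per-second loop fast, the screenshot could be cropped to a region around the cursor before OCR, which is the zonal approach described above; the cursor position then has to be shifted by the crop offset.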
I'm looking to have Python take input from a camera and read what it sees.
For example, the camera lens sees ...---... and the program would translate that to SOS.
I'm trying to find a good pattern that can be looked up in a database to tell the program a location. But first I'm trying to figure out how the camera could see the pattern and feed what it sees to a program that determines the location.
I found this article: count colored dots in image
But that was reading and counting colored dots. It is similar, but I want to be able to recognize a pattern (bar code, Morse code, etc.) and relate it back to a location (I'm thinking of keeping this in a database of some sort).
Thanks for any input or direction you can give.
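For the decoding half of this (turning a recognized pattern into text and then into a location), a minimal sketch follows. It assumes the camera stage has already produced dots and dashes with a space between letters, because an unbroken run such as ...---... can be split into letters in more than one way. The LOCATIONS table and its entry are hypothetical stand-ins for the database.

```python
# International Morse code for the letters A-Z.
MORSE = {
    ".-": "A", "-...": "B", "-.-.": "C", "-..": "D", ".": "E",
    "..-.": "F", "--.": "G", "....": "H", "..": "I", ".---": "J",
    "-.-": "K", ".-..": "L", "--": "M", "-.": "N", "---": "O",
    ".--.": "P", "--.-": "Q", ".-.": "R", "...": "S", "-": "T",
    "..-": "U", "...-": "V", ".--": "W", "-..-": "X", "-.--": "Y",
    "--..": "Z",
}


def decode_morse(signal):
    """Decode letters separated by spaces and words separated by ' / '.

    Unknown symbols are replaced with '?'.
    """
    words = []
    for word in signal.strip().split(" / "):
        words.append("".join(MORSE.get(sym, "?") for sym in word.split()))
    return " ".join(words)


# Hypothetical lookup table standing in for the database of locations.
LOCATIONS = {"SOS": "dock 3"}

code = decode_morse("... --- ...")
location = LOCATIONS.get(code)
print(code, location)
```

For printed patterns, a bar code or QR code may be easier to read reliably than Morse, since those formats carry their own framing and error checking; the lookup step would stay the same.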
I have to build a rudimentary CAD tool in Python, using matplotlib to display the content.
After all the parts have been put together, the whole layout shall be exported as line elements (basically just tuples of the start / end coordinates of the lines, e.g. [x1,y1,x2,y2]) and just points.
So far I have all the basic geometric stuff implemented, but I cannot figure out how to implement text properly. To be able to use different fonts etc., I want to use the text capabilities of matplotlib, but I can't find a way to export the text properly from matplotlib.
Is there a way to get a vectorized output right away? Or at least an array of the plotted text?
After some days of struggling, I found a way to get the outline of the text: https://github.com/rougier/freetype-py , more precisely the example https://github.com/rougier/freetype-py/blob/master/examples/glyph-vector.py
If you just want to get the outline as a vector array, you can delete everything after line 78 and do this:
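from matplotlib.path import Path  # matplotlib's path class
# VERTS and CODES come from the part of the example script that is kept
path = Path(VERTS, CODES)
outline = path.to_polygons()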
This gives you an array of polygons, each of which is itself an array of (x, y) points.
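To get from those polygons to the [x1,y1,x2,y2] line elements asked for in the question, consecutive points can be paired up. This is a small sketch with a hypothetical helper name, polygons_to_segments, and a hand-written triangle in place of a real glyph outline. It assumes each polygon already repeats its first point at the end; if yours do not, append the first point before converting.

```python
def polygons_to_segments(polygons):
    """Turn each polygon (a sequence of (x, y) points) into [x1, y1, x2, y2] lines."""
    segments = []
    for polygon in polygons:
        points = [tuple(p) for p in polygon]
        # Pair every point with its successor to form one line per edge.
        for (x1, y1), (x2, y2) in zip(points, points[1:]):
            segments.append([x1, y1, x2, y2])
    return segments


# Stand-in for the result of path.to_polygons(): one closed triangle.
outline = [[(0.0, 0.0), (4.0, 0.0), (0.0, 3.0), (0.0, 0.0)]]
lines = polygons_to_segments(outline)
print(lines)
```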
Although it was some trouble to get freetype running on Windows and I still have not figured out how to make it portable, I think I will stick with this solution, because it is fast, reliable, and allows one to use all the nice system fonts.