I'm building a UGV (unmanned ground vehicle) prototype. The goal is to perform the desired actions on targets placed inside a maze. When I search the Internet, basic maze navigation is usually done with a distance sensor. I'd like to gather more ideas than just that approach.
I want to navigate the labyrinth by analyzing images from a 3D stereo camera. Is there a resource or proven method you can suggest for this? As a secondary problem, the vehicle must start in front of the labyrinth entrance, detect the entrance and drive in, and then leave the labyrinth after it completes its tasks inside.
I would be glad if you could suggest a source for this problem. :)
The problem description is a bit vague, but I'll try to highlight some general ideas.
A useful assumption is that the labyrinth is a 2D environment that you want to explore. At every moment you need to know which part of the map has been explored, which part still needs exploring, and which part is accessible at all (in other words, where the walls are).
An easy initial data structure to help with this is a simple matrix, where each cell represents a square in the real world. Each cell can then be labelled according to its state, starting as unexplored. Then you start moving and exploring: based on the distances reported by the camera, you can estimate the state of each cell. The exploration itself can be guided by something such as A* or Q-learning.
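A minimal sketch of such a grid in Python (the class name, cell states and grid resolution are illustrative assumptions, not part of any particular library):

```python
import numpy as np

# Cell states for the exploration grid (the values are arbitrary labels).
UNEXPLORED, FREE, WALL = 0, 1, 2

class MazeGrid:
    def __init__(self, width_cells, height_cells):
        # Every cell starts as unexplored.
        self.grid = np.full((height_cells, width_cells), UNEXPLORED, dtype=np.uint8)

    def mark(self, row, col, state):
        # Update a cell once the camera has measured the distance to it.
        self.grid[row, col] = state

    def frontier_cells(self):
        # Frontier = free cells adjacent to at least one unexplored cell;
        # these are natural next targets for A*, Q-learning, etc.
        frontiers = []
        rows, cols = self.grid.shape
        for r in range(rows):
            for c in range(cols):
                if self.grid[r, c] != FREE:
                    continue
                neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
                if any(0 <= nr < rows and 0 <= nc < cols
                       and self.grid[nr, nc] == UNEXPLORED
                       for nr, nc in neighbours):
                    frontiers.append((r, c))
        return frontiers

# Example: a 20x20 cell maze where each cell is, say, 10x10 cm in the real world.
maze = MazeGrid(20, 20)
maze.mark(0, 0, FREE)   # the robot's starting square
maze.mark(0, 1, WALL)   # a wall reported by the stereo camera
print(maze.frontier_cells())
```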
Now, a rather subtle issue is that you will have to deal with uncertainty and noise. Sometimes you can ignore it, sometimes you can't. The finer the resolution you need, the bigger the issue becomes. A probabilistic framework is most likely the best solution.
There is an entire field of research around so-called SLAM algorithms. SLAM stands for simultaneous localization and mapping. These algorithms build a map from the input of various types of cameras or sensors and, while building the map, they also solve the localization problem within it. They are usually designed for 3D environments and are more demanding than the simpler solution sketched above, but you can find ready-to-use implementations. For exploration, something like Q-learning would still have to be used on top.
So let's say I have an object like a face. I take multiple pictures of the face, all at different angles, and from far and close. I have a rough idea of how to make a 3D model out of these pictures but don't know how to accomplish it. My idea goes like this:
First, write code that isolates the object in each image and removes all background "noise".
Second, work out which part of the 3D model each picture covers and tag the image with where it should fit.
Third, combine and overlap all the images to create a 3D object.
Does anyone have any idea how to accomplish any of these steps, or any ideas on how to create a 3D model out of a series of images? I use Python 3.10.4.
It seems that you are asking if there are some Python modules that would help to implement a complete photogrammetry process.
Please note that, even in existing (and commercial) photogrammetry solutions, the process is not always fully automated; sometimes it requires manual tweaking and point-cloud selection.
Anyway, to the best of my knowledge, what you are asking for requires implementing the following steps:
detecting common features between the different photographs (a minimal sketch of this step is shown after this list)
infer the position in space of the camera that took each photograph
generate a point cloud of the photographs based on their relative position in space and the common features
convert the point cloud into a 3D mesh.
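As a rough illustration of the first step only, here is a minimal sketch using OpenCV (the file names are placeholders; OpenCV and NumPy are assumed to be installed, e.g. via opencv-contrib-python):

```python
import cv2

# Load two of the photographs (placeholder file names).
img1 = cv2.imread("face_view_1.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("face_view_2.jpg", cv2.IMREAD_GRAYSCALE)

# Detect SIFT keypoints and descriptors in both images.
sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Match descriptors and keep only the clearly-best matches (Lowe's ratio test).
matcher = cv2.BFMatcher()
matches = matcher.knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]

print(f"{len(good)} common features found between the two photographs")
```

These matched features are what a structure-from-motion pipeline would then use to estimate the camera poses (second step) and triangulate a point cloud (third step).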
All of these steps can possibly be implemented in Python, but I'm not aware that such an "off-the-shelf" module exists.
There's a commercial solution called Metashape from Agisoft, and it has a Python module you can use. Beware that it has its pitfalls (it threw a segmentation fault for me at the end of processing, which makes things... icky), and support tends to ignore the bigger problems, so you can expect them to ignore your ticket. Still, it does the job quite well.
I am learning OpenCV applications by reading research papers and attempting to duplicate their tests and results. I may have jumped a bit too far off the beaten path and am now curious about the proper way to go about this investigation.
Goal: 1) Register these two images. 2) Stack the exposures (there are actually 20+ in this series). 3) Learn.
Attached below is an example image, shot with a cell phone, in low light, in burst mode. If one were to level-stretch it, one would see there are very few hard edges (some sheets), but there are enough details to manually align portions of the images with each other. I ran this through the default OpenCV implementations of ORB and SIFT and, as expected, came back with poor matches.
I have not yet stumbled upon a description of the right technique for improving the edge detection. As mentioned, no hard edges are present. However, I thought I had previously read that one could downsample the image using a max function and get better 'edge' detection; those edges should then provide a registration homography that can be applied to the higher-resolution image. But I can neither find that resource nor any description of similar work. Help here would be appreciated.
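For reference, here is a minimal sketch of that idea as I understand it, assuming OpenCV and NumPy (the pooling factor and file names are arbitrary, and I don't yet know whether this actually improves matching on these images):

```python
import cv2
import numpy as np

def max_pool(img, k):
    # Block-wise max downsampling: each k x k block is replaced by its maximum.
    h, w = img.shape[:2]
    img = img[:h // k * k, :w // k * k]
    return img.reshape(h // k, k, w // k, k).max(axis=(1, 3)).astype(np.uint8)

K = 4  # pooling factor (assumption)
full1 = cv2.imread("frame_01.png", cv2.IMREAD_GRAYSCALE)  # placeholder names
full2 = cv2.imread("frame_02.png", cv2.IMREAD_GRAYSCALE)
small1, small2 = max_pool(full1, K), max_pool(full2, K)

# Match features on the downsampled images.
orb = cv2.ORB_create(nfeatures=2000)
kp1, des1 = orb.detectAndCompute(small1, None)
kp2, des2 = orb.detectAndCompute(small2, None)
matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)

# Estimate the homography at low resolution ...
src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
H_small, _ = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)

# ... then rescale it so it applies to the full-resolution images.
S = np.diag([K, K, 1.0])
H_full = S @ H_small @ np.linalg.inv(S)
aligned = cv2.warpPerspective(full1, H_full, (full2.shape[1], full2.shape[0]))
```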
In addition, if there are any published papers discussing this technique that I could be pointed to, I'd appreciate it. I'm quite familiar with astrophotography and star stacking, and am looking forward to trying drizzle on a different type of image set.
Techniques I've tried on the downsampled image to better bring out edges: difference of Gaussians, Laplacian, directional edge detection, and a few others.
I appreciate the time you've taken to help me learn how to expand my efforts for this.
Thank you.
Edit: Modifying the image's contrast, brightness, or tonal response has no effect on the correlation of the image content, at least in the limited set of tests I've been able to run. It makes the images 'prettier' but, honestly, the algorithms don't care whether they're in 'human visual space' or in 'linear digital counts'. I can post it as a pretty image but, without those sharp edges, most of the filters fail and matches don't succeed, which is the crux of my issue here.
I am trying to assess whether Python is a suitable tool for calculating the service area of two points.
The idea is to create a map showing which terminal is more efficient in serving a given cell based on distance, cost, or time (a different map for each).
The image shows point A and point B as terminals, I am trying to calculate the service area (or influence area) for each of the terminals.
In the example on the right the domain is homogeneous, while in the example on the left we have rail (green) and a waterway (yellow). The different transportation modes change the cost and time to market of any shipment to/from A and B. Intermodal operations are possible wherever the modes intersect, i.e. green to white, white to yellow, yellow to green, etc.
By service area I mean whether a given cell is closer/cheaper/faster to reach from A or from B. Once I have this information, I'd be able to create a service area map for A and B.
My question is whether Python is the right tool for this. As you might notice, I am not familiar with programming and would appreciate any tips (tutorials, etc.).
Please feel free to ask any questions back if the problem description is not clear.
Domain of the problem:
You can solve this problem in almost any programming language.
Python is a high-level programming language, meaning it takes care of things like memory management for you. This makes it somewhat slower, but also easier to learn, since you have to write fewer lines of code to do what you want.
It is also versatile, well supported and established, making it a good candidate for a first language.
However, ultimately the question is what you are going to do with it. For example, if you want to develop something for the web, then going with JavaScript is probably the better choice.
Here is a rough guide to where different programming languages are used.
Otherwise, Google "which programming language should I learn" to find any of the millions of articles on this topic.
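As a rough illustration that Python can handle your particular problem, here is a minimal sketch of a cost-distance / service-area computation on a grid (all numbers, costs and the grid layout are invented; in practice you would also look at GIS tools with Python bindings):

```python
import heapq
import numpy as np

def travel_cost_map(cost_grid, start):
    # Dijkstra from one terminal over a grid of per-cell traversal costs.
    rows, cols = cost_grid.shape
    dist = np.full((rows, cols), np.inf)
    dist[start] = 0.0
    heap = [(0.0, start)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if d > dist[r, c]:
            continue
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + cost_grid[nr, nc]
                if nd < dist[nr, nc]:
                    dist[nr, nc] = nd
                    heapq.heappush(heap, (nd, (nr, nc)))
    return dist

# Toy 6x6 domain: 1.0 = plain terrain, 0.3 = a cheaper corridor (e.g. rail or waterway).
cost = np.ones((6, 6))
cost[2, :] = 0.3
A, B = (0, 0), (5, 5)          # the two terminals

cost_from_A = travel_cost_map(cost, A)
cost_from_B = travel_cost_map(cost, B)

# Service area map: 0 where A is cheaper to reach, 1 where B is cheaper.
service_area = (cost_from_B < cost_from_A).astype(int)
print(service_area)
```

The same structure works for distance or time instead of cost: you just change the values stored in the grid.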
I'd like to improve my little robot with machine learning.
Up to now it uses simple while and if-then decisions in its main function to act as a lawn-mowing robot.
My idea is to use SKLearn for that purpose.
Please help me to find the right first steps.
I have a few sensors that tell it about the world outside:
World ={yaw, pan, tilt, distance_to_front_obstacle, ground_color}
I have a state vector
State = {left_motor, right_motor, cutter_motor}
that controls the three actuators of the robot.
I'd like to build a dataset of input and output values to teach sklearn the desired behaviour; after that, the input values should produce the correct output values for the actuators.
One example: if the motors are on and the robot should be moving forward, but the distance sensor keeps reporting constant values, the robot is apparently blocked. It should then decide to back up, turn, and move in another direction.
First of all, do you think this is possible with sklearn, and second, how should I start?
My (simple) robot control code is here: http://github.com/bgewehr/RPiMower
Please help me with the first steps!
I would suggest using reinforcement learning. Here is a tutorial on Q-learning that fits your problem well.
If you want code in Python: right now I think there is no implementation of Q-learning in scikit-learn. However, I can give you some examples of Python code that you could use: 1, 2 and 3.
Also, keep in mind that reinforcement learning is set up to maximize the sum of all future rewards, so you have to focus on the long-term behaviour rather than on individual steps.
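To make the idea concrete, here is a minimal tabular Q-learning sketch for a heavily simplified version of your mower problem (the states, actions, reward function and toy dynamics are all invented for illustration and would need to be replaced with your real sensor values and motor commands):

```python
import random

# Simplified discrete world: the robot is either blocked or free.
STATES = ["free", "blocked"]
ACTIONS = ["forward", "back_and_turn"]

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2   # learning rate, discount, exploration
Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}

def reward(state, action):
    # Invented reward: moving forward while free is good,
    # pushing against an obstacle is bad, backing off when blocked is okay.
    if state == "free" and action == "forward":
        return 1.0
    if state == "blocked" and action == "forward":
        return -5.0
    return 0.2

def step(state, action):
    # Invented toy dynamics standing in for the real robot or a simulator.
    if state == "blocked" and action == "back_and_turn":
        return "free"
    return random.choice(STATES)

state = "free"
for _ in range(5000):
    # Epsilon-greedy action selection.
    if random.random() < EPSILON:
        action = random.choice(ACTIONS)
    else:
        action = max(ACTIONS, key=lambda a: Q[(state, a)])
    r = reward(state, action)
    next_state = step(state, action)
    # Standard Q-learning update.
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (r + GAMMA * best_next - Q[(state, action)])
    state = next_state

print(Q)  # the learned policy: per state, pick the action with the highest Q-value
```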
Good luck :-)
The sklearn package contains a lot of useful tools for machine learning, so I don't think that's a problem; if it is, there are definitely other useful Python packages. I think collecting data for the supervised learning phase will be the challenging part. I wonder if it would be smart to mark a track with tape inside a grid system; that would make it easier to translate the track into labels (x, y positions in the grid). Each cell in the grid should be small if you want to make complex tracks later on. It may also be worth checking how the Google self-driving car team did this.
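A minimal sketch of the supervised variant with sklearn (the sensor values, the command labels and the number of rows are all made up; in practice the rows would come from recordings of you driving the robot manually):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Each row: [yaw, pan, tilt, distance_to_front_obstacle, ground_color]
# Each label: a discrete command for (left_motor, right_motor, cutter_motor).
X = np.array([
    [0.0, 0, 0, 2.0, 1],   # path clear, on grass
    [0.1, 0, 0, 1.5, 1],
    [0.0, 0, 0, 0.2, 1],   # obstacle close ahead
    [0.2, 0, 0, 0.1, 1],
    [0.0, 0, 0, 1.8, 0],   # path clear but not on grass
])
y = ["forward_cut", "forward_cut", "back_and_turn", "back_and_turn", "forward_no_cut"]

model = DecisionTreeClassifier().fit(X, y)

# New sensor reading -> predicted motor command.
print(model.predict([[0.05, 0, 0, 0.15, 1]]))   # expected: back_and_turn
```

A decision tree is a reasonable first choice here because the learned rules stay human-readable, which makes them easy to compare with the while/if rules you already have.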
I have loaded an OBJ file to render my OpenGL model using PyOpenGL and pygame. The 3D model shows successfully.
Below is the 3D model I render from the OBJ file. Now I want to cut my model into ten pieces along the y axis; my question is how to get the sectional drawing of each piece.
I'm really very new to OpenGL. Is there any way to do that?
There are two ways to do this and both use clipping to "slice" the object.
In older versions of OpenGL you can use user clip planes to "isolate" the slices you desire. You probably want to rotate the object before you clip it, but that's unclear from your question. You will need to call glClipPlane(), and enable each plane using glEnable with the argument GL_CLIP_PLANE0, GL_CLIP_PLANE1, ...
If you don't understand what a plane equation is you will have to read up on that.
In theory you should check how many user clip planes exist on your GPU by calling glGetIntegerv with the argument GL_MAX_CLIP_PLANES, but all GPUs support at least 6.
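A minimal sketch of this approach with PyOpenGL, assuming the legacy fixed-function pipeline you are presumably already using with pygame (the slice bounds and the draw_model() call are placeholders for your own code):

```python
from OpenGL.GL import (glClipPlane, glEnable, glDisable,
                       GL_CLIP_PLANE0, GL_CLIP_PLANE1)

def draw_slice(y_low, y_high, draw_model):
    """Render only the part of the model with y_low <= y <= y_high."""
    # A plane (A, B, C, D) keeps every point where A*x + B*y + C*z + D >= 0.
    # Note: the plane is transformed by the modelview matrix in effect
    # at the moment glClipPlane is called.
    glClipPlane(GL_CLIP_PLANE0, (0.0, 1.0, 0.0, -y_low))   # keep y >= y_low
    glClipPlane(GL_CLIP_PLANE1, (0.0, -1.0, 0.0, y_high))  # keep y <= y_high
    glEnable(GL_CLIP_PLANE0)
    glEnable(GL_CLIP_PLANE1)
    draw_model()                                           # your existing OBJ draw call
    glDisable(GL_CLIP_PLANE0)
    glDisable(GL_CLIP_PLANE1)

# Example usage for ten slices, assuming the model spans y in [y_min, y_max]:
# for i in range(10):
#     draw_slice(y_min + i * (y_max - y_min) / 10,
#                y_min + (i + 1) * (y_max - y_min) / 10, draw_model)
```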
Since user clip planes are deprecated in modern core OpenGL, you will need to use a shader to get the same effect. See gl_ClipDistance[].
Searching around on Google should get you plenty of examples for either of these.
Sorry not to provide source code, but I don't like to post code unless I am 100% sure it works, and I don't have the time right now to check it. However, I am 100% sure you can easily find some great examples on the internet.
Finally, if you can't make it work with clip planes and some hacks to make the cross sections visible, then this may indeed be complicated, because creating closed cross sections from an existing model is a hard problem.
You would need to split the object, and then rotate the pieces so that they are seen from the side. (Or move the camera. The two ideas are equivalent. But if you're coding this from scratch, you don't really have the abstraction of a 'camera'.) At that point, you can just render all the slices.
This is complicated to do in raw OpenGL and Python, essentially because objects in OpenGL are not solid. I would highly recommend that you slice the object into pieces ahead of time in a modeling program. If you need to drive those operations with scripting, perhaps look into Blender's Python scripting system.
Now, to explain why:
When you slice a real-life orange, you expect to get cross sections. You expect to be able to see the flesh of the fruit inside, with all those triangular pieces.
There is nothing inside a standard polygonal 3D model.
Additionally, as the rind of a real orange has thickness, it is possible to view the rind from the side. In contrast, one face of a 3D model is infinitely thin, so when you view it from the side, you will see nothing at all. So if you were to render the slices of this simple model, from the side, each render would be completely blank.
(Well, the pieces at the ends will have 'caps', like the ends of a loaf of bread, but the middle sections will be totally invisible.)
Without a programming library that has a conception of what a cut is, this will get very complicated, very fast. Simply making the cuts is not enough. You must seal up the holes created by slicing into the original shape, if you want to see the cross-sections. However, filling up the cross sections has to be done intelligently, otherwise you'll wind up with all sorts of weird shading artifacts (fyi: this is caused by n-gons, if you want to go discover more about those issues).
To return to the original statement:
Modeling programs are designed to address problems such as these, so I would suggest you leverage their power if possible. Or at least, you can examine how Blender implements this functionality, as it is open source.
In Blender, you could make these cuts with the knife tool*, and then fill up the holes with the 'make face' command (just hit F). Very simple, even for those who are not great at art. I encourage you to learn a little bit about 3D modeling before doing too much 3D programming. It personally helped me a lot.
*(The loop cut tool may do the job as well, but it's hard to tell without understanding the topology of your model. You probably don't want to get into understanding topology right now, so just use the knife)