I am using the Python bindings of OpenKinect to get the depth and RGB images. For some reason, I need the disparity image.
Could someone please explain how to get that image?
Thanks a lot.
AFAIK you can retrieve different types of images both from the RGB camera (e.g. the IR image) and from the depth camera. I don't know exactly which one is useful for your purposes.
These are the constants for the video camera:
('VIDEO_BAYER',
'VIDEO_IR_10BIT',
'VIDEO_IR_10BIT_PACKED',
'VIDEO_IR_8BIT',
'VIDEO_RGB',
'VIDEO_YUV_RAW',
'VIDEO_YUV_RGB')
and these for the depth camera:
('DEPTH_10BIT',
'DEPTH_10BIT_PACKED',
'DEPTH_11BIT',
'DEPTH_11BIT_PACKED')
For example, to retrieve the IR image you can do:
freenect.sync_get_video(0, freenect.VIDEO_IR_10BIT)
Working with the non-sync API is analogous. Maybe you could give some of the above video types a try; sorry I can't be more precise, but I lack some of the theory behind the Kinect.
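If by "disparity image" you mean the raw measurement the Kinect reports before it is converted to metric depth, the 11-bit depth formats are probably the closest thing: as far as I know, libfreenect hands back the sensor's raw disparity-like values in that mode. A minimal sketch with the sync API (the normalization step is just a guess for visualization):

import freenect
import numpy as np

# Raw 11-bit depth frame; on the Kinect this is the sensor's raw
# disparity-like measurement rather than calibrated metric depth
# (assumption about how libfreenect exposes the 11-bit format).
depth, _ = freenect.sync_get_depth(0, freenect.DEPTH_11BIT)

# Scale the 0..2047 range to 8 bits just for display (illustrative only).
disparity_vis = (depth.astype(np.float32) / 2047.0 * 255).astype(np.uint8)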
My goal is to transform an image captured by a camera into an orthographic image, without the effects of perspective.
I have a few objects of known size on a surface. A camera is placed above and pointed at those objects, as exemplified in the scene. The camera captures images as in the image captured by the camera. I want to get an orthographic image of the environment, as in the orthographic image I want to get.
I have read a few posts, but did not really understand their relevance to my problem, as I am not an expert on these transforms. The answer to this question made me think it is possible, although I did not understand how.
I would appreciate a clear explanation, or a pointer to a clear tutorial, using Python or Lua if possible.
Any help is appreciated.
This turned out not to be possible without distorting the image. A straightforward explanation is that perspective causes some parts of the scene not to be visible; for example, the white line in the marked area is not visible, and there could be something small that we are not able to observe. For those parts, the algorithm would have to produce some kind of prediction based on heuristics.
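If the surface the objects sit on is (approximately) planar, a perspective warp can still give a usable top-down approximation of the visible parts, even though occluded regions stay unknown. A minimal sketch, assuming you can identify four points in the photo whose real-world rectangle is known (all file names and coordinates below are placeholders):

import cv2
import numpy as np

img = cv2.imread("scene.jpg")  # hypothetical captured image

# Pixel corners of a known rectangle on the surface, picked by hand
# from the captured image (placeholder values).
src = np.float32([[420, 310], [880, 330], [940, 720], [360, 700]])

# Where those corners should land in the top-down output, e.g. a
# 500x300 px rectangle matching the rectangle's known aspect ratio.
dst = np.float32([[0, 0], [500, 0], [500, 300], [0, 300]])

H = cv2.getPerspectiveTransform(src, dst)
top_down = cv2.warpPerspective(img, H, (500, 300))
cv2.imwrite("orthographic.jpg", top_down)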
I want to change the video resolution from landscape to portrait mode for the output from my laptop's built-in webcam (cv2.VideoCapture(0)). I tried rescaling the frames; it does go to portrait mode (height bigger than width), but the video is skewed/stretched. Is there a way around this? Please help. I am using OpenCV with Python.
Welcome to Stack Overflow. What you want to achieve depends on the webcam you use. The resolution you want needs to be supported by your camera; this small tutorial explains it very well.
If your camera does not support the resolution you want, you have two possibilities:
Crop the image to the resolution you want (a cropping sketch follows below).
If your maximum resolution does not allow the resolution you want, you can crop to the biggest resolution possible with your desired aspect ratio and then upscale it.
Be careful with upscaling; there are different interpolation methods available.
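For completeness, here is a minimal sketch of the cropping option: it cuts a centered 9:16 window out of each landscape frame instead of resizing, so nothing is stretched (the 9:16 ratio and the window name are just examples):

import cv2

cap = cv2.VideoCapture(0)

while True:
    ok, frame = cap.read()
    if not ok:
        break

    h, w = frame.shape[:2]
    # Width of a centered 9:16 portrait window that fits the frame height.
    target_w = int(h * 9 / 16)
    x0 = (w - target_w) // 2
    portrait = frame[:, x0:x0 + target_w]

    cv2.imshow("portrait", portrait)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()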
I'm currently learning about computer vision OCR. I have an image that needs to be scanned, and I am running into a problem while cleaning the image.
I use OpenCV (cv2) in Python. This is the original image:
import cv2
image = cv2.imread(image_path)
cv2.imshow("imageWindow", image)
cv2.waitKey(0)
I want to clean the above image; the number in the middle (64) is the area I want to scan. However, the number gets cleaned away as well:
import numpy as np
# Whiten every pixel whose B, G and R values all exceed 0, 0 and 105.
image[np.where((image > [0, 0, 105]).all(axis=2))] = [255, 255, 255]
cv2.imshow("imageWindow", image)
What should I do to correct the cleaning here? I want the screen area where the number 64 is located to be cleaned, because I will perform an OCR scan afterwards.
Please help, and thank you in advance.
What you're trying to do is called "thresholding". It looks like your technique recolors every pixel on one side of a fixed threshold, but the darkness of the LCD digits varies enough in that image to throw it off.
I'd spend some time reading about thresholding; here's a good starting place:
Thresholding in OpenCV with Python. You're probably going to need an adaptive technique (like Adaptive Gaussian Thresholding), but you may find other ways that work for your images.
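As a rough illustration of what that could look like for your image, here is a sketch using cv2.adaptiveThreshold; the block size and constant are starting guesses you would need to tune:

import cv2

gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

# The threshold is computed per neighbourhood, so uneven lighting on the
# LCD affects it less than a single global cut-off. 31 and 10 are guesses.
cleaned = cv2.adaptiveThreshold(gray, 255,
                                cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                cv2.THRESH_BINARY, 31, 10)

cv2.imshow("imageWindow", cleaned)
cv2.waitKey(0)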
Is there any predefined code for this, or do I have to write my own?
Also, I do not have the camera properties; I only have the image taken with a fisheye lens, and now I have to flatten it.
OpenCV provides a module for working with fisheye images: https://docs.opencv.org/3.4/db/d58/group__calib3d__fisheye.html
This is a tutorial with an example application.
Keep in mind that your task might be hard to achieve, since the problem is under-determined. If you have some cues in the image (such as straight lines), that might help. Otherwise, you should seek a way of getting more information about the lens. If it's a known lens type, you might find calibration info online. Also, some images record the lens used to capture them in their EXIF data.
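If you just want to experiment without calibration data, here is a rough sketch with cv2.fisheye.undistortImage; the intrinsic matrix K and the distortion coefficients D below are made-up placeholders, not values for your actual lens:

import cv2
import numpy as np

img = cv2.imread("fisheye.jpg")  # hypothetical input image
h, w = img.shape[:2]

# Guessed intrinsics: focal length and principal point are placeholders.
K = np.array([[w / 3.0, 0.0, w / 2.0],
              [0.0, w / 3.0, h / 2.0],
              [0.0, 0.0, 1.0]])
# Guessed fisheye distortion coefficients k1..k4.
D = np.array([0.1, 0.01, 0.0, 0.0])

undistorted = cv2.fisheye.undistortImage(img, K, D, Knew=K)
cv2.imwrite("flattened.jpg", undistorted)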
I am new to OpenCV and Python. I am trying to count people in an image. The image is supposed to be captured with an overhead camera, the way a CCTV camera is placed.
I have converted the colored image into a binary image and then inverted it. Then I used bitwise OR on the original and the inverted binary image, so that the background is white and the people are colored.
How do I count these people? Is it necessary to use a classifier, or can I just count the contours? If so, how do I count them?
There are also some issues with the technique I'm using:
People's faces are light in color, so sometimes only the hair gets extracted.
Dark objects other than people also get extracted.
If the floor is dark, it won't produce the binary image that is needed.
So is there any other method to achieve what I'm trying to do here?
Not sure, but it may be worth checking there.
It explains how to perform face recognition using OpenCV and Python on pictures, and extends it to a webcam here; it's not quite what you're looking for, but it may give you some clues.
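As for simply counting contours, as the question asks, here is a minimal sketch (assuming OpenCV 4 and a binary image with a white background and dark people, as described; the file name and area cut-off are placeholders):

import cv2

# Binary image with a white background and dark blobs for people
# (assumption about the input described in the question).
binary = cv2.imread("binary_people.png", cv2.IMREAD_GRAYSCALE)

# findContours expects a white foreground, so invert first.
inverted = cv2.bitwise_not(binary)
contours, _ = cv2.findContours(inverted, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)

# Drop tiny contours that are probably noise; the area cut-off is a guess
# that depends on camera height and resolution.
people = [c for c in contours if cv2.contourArea(c) > 500]
print("estimated count:", len(people))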