I'm trying to tile a folder of 3-band images into 256 x 256 pixel squares (keeping the original CRS), feed the tiles into a U-Net model, and then re-organize my tiled data back into one complete image to view the results of my model.
I've used this answer, but the last row of tiles is composed of the remaining pixels (edge cases).
Is there a way to either add padding or automatically determine an overlap to prevent this and ensure all tiles are 256 x 256?
makeLastPartFull = True is supposed to do this with cv2, so I am looking for something similar in GDAL.
I'm using Pycharm version 2021.1.
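For reference, here is a rough sketch of the kind of thing I'm hoping for, assuming gdal.Translate will accept a srcWin that runs past the raster edge and pad the overflow with nodata (I haven't confirmed that padding behaviour, so treat it as an assumption; the file names are placeholders):

from osgeo import gdal

src_path = "input.tif"   # hypothetical input file
tile_size = 256

ds = gdal.Open(src_path)
xsize, ysize = ds.RasterXSize, ds.RasterYSize

for yoff in range(0, ysize, tile_size):
    for xoff in range(0, xsize, tile_size):
        out_path = f"tile_{xoff}_{yoff}.tif"
        # srcWin may extend past the raster edge; the idea is that the
        # overflow gets filled with the nodata value so every tile is 256x256
        gdal.Translate(out_path, ds,
                       srcWin=[xoff, yoff, tile_size, tile_size],
                       noData=0)

Each tile should keep the source CRS, since gdal.Translate carries over the projection and computes the geotransform for the window.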
I have a list of images (each image is a separate file); let's say they are jigsaw puzzle pieces and, for each of them, I know its position (x, y) and rotation in the complete puzzle.
How can I show the complete puzzle by stitching these pieces together into a single image (given that I know where to put each of them)?
I don't know if this is important, but the pieces are not of regular shape (e.g. they are not squares), and they are all of different sizes.
EDIT:
For the moment it seems to be working without the rotation, but there is another problem: the pieces seem to have a black background rather than a transparent one.
I have loaded them with OpenCV in the following way:
import cv2
import glob

folder = './img/2/frag_eroded/'
frags = []
files = glob.glob(folder + "/*.png")
for file in files:
    # keep the alpha channel when reading
    image = cv2.imread(file, cv2.IMREAD_UNCHANGED)
    image = cv2.cvtColor(image, cv2.COLOR_BGRA2RGBA)
    frags.append(image)
Example of the resulting image: you can kind of see the squares around each piece and see how the pieces overlap with their "background", which should be transparent rather than black.
This depends on how you want to handle it when there's an overlapping transparent area.
Suppose all pixels are either transparent or opaque, and
Suppose each image has RGBA (4-channels including alpha),
then you can set all RGB values to zero whenever the pixel is transparent.
Then proceed to add the smaller images to a bigger canvas (initialized to be all zeros RGB). The canvas can either have an alpha layer or not, depending on your preference.
Make sure the canvas is big enough to contain all of them, so the first step here would be to make a large enough matrix / OpenCV image.
How to add images: https://stackoverflow.com/a/68878529/19042045
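A minimal sketch of what that could look like, assuming every piece is RGBA, each (x, y) is the top-left corner of the piece on the canvas, and the canvas size below is just a placeholder you'd compute from the puzzle dimensions:

import cv2
import numpy as np

# Assumed: 'pieces' is a list of (rgba_image, (x, y)) pairs.
canvas = np.zeros((2000, 2000, 3), dtype=np.uint8)

for rgba, (x, y) in pieces:
    h, w = rgba.shape[:2]
    rgb = rgba[:, :, :3].copy()
    rgb[rgba[:, :, 3] == 0] = 0   # zero out RGB wherever the pixel is transparent
    # adding onto the canvas: transparent (all-zero) pixels leave the canvas untouched
    canvas[y:y + h, x:x + w] = cv2.add(canvas[y:y + h, x:x + w], rgb)

Because the transparent pixels were zeroed first, adding a piece onto the canvas only writes its opaque pixels, which is why the black "background" no longer covers neighbouring pieces.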
I have a large .tiff raster (at least 200x150), but I need a bunch of 32x32 .tiff files. Is there an easy way to cut up that .tiff in Python? I think the workflow would look something like:
Create 32x32 box in bottom right corner of raster
Clip raster with that box, saving new clipped .tiff raster
Shift box left by 32 pixels (if less than 32 pixels left, shift up 32 pixels and restart on right side)
Repeat clip/shift until can't shift up or sideways
The input raster won't be an even multiple of 32, but I don't care if I lose some of the original raster off of the sides. As long as the original data is preserved for each 32x32 raster, I'm happy.
I was able to solve this using the arcpy module.
The arcpy.management.SplitRaster() documentation is available at https://pro.arcgis.com/en/pro-app/2.8/tool-reference/data-management/split-raster.htm. After this, I used os.listdir() to get a list of the output files. OpenCV loads each image as a NumPy array, whose .shape attribute gives the dimensions of the image, so I looped over each file and deleted any that didn't match the size I wanted.
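A rough sketch of that clean-up loop, assuming the tiles were written to a hypothetical output_tiles folder and that anything that isn't exactly 32 x 32 should be deleted:

import os
import cv2

tile_dir = "output_tiles"   # assumed output folder from SplitRaster
for name in os.listdir(tile_dir):
    if not name.lower().endswith(".tif"):
        continue
    path = os.path.join(tile_dir, name)
    img = cv2.imread(path, cv2.IMREAD_UNCHANGED)
    # img.shape is (rows, cols) or (rows, cols, bands); keep only full 32x32 tiles
    if img is None or img.shape[0] != 32 or img.shape[1] != 32:
        os.remove(path)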
My yolov5 model was trained on 416 x 416 images. I need to detect objects in an input image of size 4008 x 2672. I split the image into 416 x 416 tiles, fed them to the model, and it is able to detect objects. However, when stitching the predicted tiles back together to reconstruct the original image, I can see that some objects on tile edges get split, with half detected in one tile and the other half in the adjacent tile. Can someone tell me how to merge those half detections into a single detection during the reconstruction?
Running a second detection pass after offsetting the tile grid would ensure that all previously cut objects land inside a single tile (assuming they are smaller than a tile). You could then combine the two sets of results to keep only the full objects.
You wrote "I need to detect objects" but didn't say why splitting the image is the solution you chose. I must ask, is splitting the image necessary?
Here is the output of yolov4 on a (3840, 2160, 3) image. yolov4 resizes the image internally to the size specified as an argument (the YOLO family allows input dims of (320, 320), (416, 416), (512, 512), and (608, 608)); that should be transparent to the user.
I think you have to calculate the union of the overlapping detections to get merged bounding boxes, in the same way you might have calculated IoU while tiling the images. Did you try that?
I am on the same path, using an image tiling technique for detecting small objects.
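A rough sketch of that merge step, assuming every detection has already been shifted into full-image coordinates and that boxes from neighbouring tiles which overlap by some IoU belong to the same object (the threshold is a guess):

def iou(a, b):
    # a, b are (x1, y1, x2, y2) boxes in full-image coordinates
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / float(area_a + area_b - inter) if inter else 0.0

def merge_boxes(boxes, iou_thresh=0.1):
    # greedily merge overlapping boxes into their union; repeat until stable
    boxes = [list(b) for b in boxes]
    merged = True
    while merged:
        merged = False
        out = []
        while boxes:
            b = boxes.pop()
            for o in out:
                if iou(b, o) > iou_thresh:
                    o[0], o[1] = min(o[0], b[0]), min(o[1], b[1])
                    o[2], o[3] = max(o[2], b[2]), max(o[3], b[3])
                    merged = True
                    break
            else:
                out.append(b)
        boxes = out
    return boxes

One caveat: two halves split exactly at a tile boundary touch but don't overlap, so their IoU is zero; in practice you'd either tile with some overlap or slightly dilate the boxes before merging.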
I am new to deep learning, and I am trying to train a ResNet50 model to classify 3 different surgical tools. The problem is that every article I read tells me that I need to use 224 X 224 images to train ResNet, but the images I have are of size 512 X 288.
So my questions are:
Is it possible to use 512 X 288 images to train ResNet without cropping the images? I do not want to crop the image because the tools are positioned rather randomly inside the image, and I think cropping the image will cut off part of the tools as well.
For the training and test set images, do I need to draw a rectangle around the object I want to classify?
Is it okay if multiple different objects are in one image? The data set I am using often has multiple tools appearing in one image, and I wonder if I must only use images that only have one tool appearing at a time.
If I were to crop the images to fit one tool, will it be okay even if the sizes of the images vary?
Thank you.
Is it possible to use 512 X 288 images to train ResNet without cropping the images? I do not want to crop the image because the tools are positioned rather randomly inside the image, and I think cropping the image will cut off part of the tools as well.
Yes, you can train ResNet without cropping your images. You can resize them, or if that's not possible for some reason, you can alter the network, e.g. add global pooling at the very end to account for the different input sizes (you might need to change kernel sizes or the downsampling rate).
If your biggest issue here is that ResNet requires 224x224 while your images are of size 512x288, the simplest solution would be to first resize them to 224x224. Only if that's not a possibility for some technical reason, create a fully convolutional network by adding global pooling at the end. (I believe ResNet already has global average pooling at the end; in case it does not, you can add it.)
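A minimal sketch of the second option, assuming Keras is being used (the question doesn't say which framework): include_top=False plus pooling='avg' lets the backbone accept the original 512x288 input while producing a fixed-size feature vector.

from tensorflow.keras import layers, models
from tensorflow.keras.applications import ResNet50

# Backbone without the 224x224 classification head; global average pooling
# makes the output shape independent of the spatial input size.
backbone = ResNet50(include_top=False, weights="imagenet",
                    input_shape=(288, 512, 3), pooling="avg")

model = models.Sequential([
    backbone,
    layers.Dense(3, activation="softmax"),   # 3 surgical tool classes
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])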
For the training and test set images, do I need to draw a rectangle around the object I want to classify?
For classification, no, you do not. Having a bounding box for an object is only needed if you want to do detection (that's when you want your model to also draw a rectangle around the objects of interest).
Is it okay if multiple different objects are in one image? The data set I am using often has multiple tools appearing in one image, and I wonder if I must only use images that only have one tool appearing at a time.
It's okay to have multiple different objects in one image, as long as they do not belong to different classes that you are training against. That is, if you are trying to classify apples vs. oranges, an image obviously cannot contain both at the same time; but if, for example, it contains anything else (a screwdriver, a key, a person, a cucumber, etc.), it's fine.
If I were to crop the images to fit one tool, will it be okay even if the sizes of the images vary?
It depends on your model. Cropping and image size are two different things: you can crop an image of any size and then resize the crop to your desired dimensions. You usually want all images to have the same size, as it makes your life easier, but it's not a hard requirement, and depending on your needs you can work with varying sizes as well.
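A small sketch of that crop-then-resize idea, assuming OpenCV and a hypothetical bounding box (x, y, w, h) around one tool:

import cv2

img = cv2.imread("frame.png")             # hypothetical source image
x, y, w, h = 120, 40, 200, 150            # hypothetical crop around one tool

crop = img[y:y + h, x:x + w]              # crop of arbitrary size
crop_224 = cv2.resize(crop, (224, 224))   # resize the crop to the network's input size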
First time posting here.
I am working on image processing for a bioengineering project, and I am using Python, mostly the Skimage package, to batch process the images we have taken. There are 6 - 8 tubes in each image, where cells flow through. At each time point, we captured normal bright field images, as well as fluorescence images. I am trying to identify the tubes based on the bright field images, separate the tubes from the background and label them. The masked/labeled images will be used for downstream processing for the fluorescence images, where we identify cells and get their shape metrics.
tl;dr: Python; skimage; image processing; separate tube-like structures from background in a bright field image.
I will use one example to show what I have done. I wanted to include all the intermediate images, but I do not have enough reputation points to post more than two images, so I will show the first and last images.
I cropped out the scale bar first and obtained a greyscale image; here is the resulting image (bf_image).
I used the sobel_h filter to find the horizontal edges.
from skimage import io
from skimage.filters import sobel_h
bf_sobel = sobel_h(bf_cropped)
io.imshow(bf_sobel)
I then tried all thresholds and picked out a threshold algorithm that looked good to me (Otsu).
from skimage.filters import try_all_threshold, threshold_otsu
import matplotlib.pyplot as plt
fig, ax = try_all_threshold(bf_sobel, figsize=(25, 20), verbose=False)
plt.show()
bf_threshold = threshold_otsu(bf_sobel)
bf_thresholded = bf_sobel > bf_threshold
io.imshow(bf_thresholded)
I then applied the closing function and remove_small_objects.
from skimage.morphology import closing, remove_small_objects
bf_closed = closing(bf_thresholded)
bf_small_removed = remove_small_objects(bf_closed, 50)
io.imshow(bf_small_removed)
This is where I got stuck. I am trying to fill the gaps between the tube edges, and create masks for individual tubes to separate them from the background. Any advice? Thanks!!!
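A rough sketch of one possible next step, assuming the tubes run roughly horizontally (sobel_h picked up their top and bottom edges), so a tall vertical footprint can bridge the gap between the two edges of each tube before filling and labelling. The footprint and minimum-size values are guesses that would need tuning:

import numpy as np
from scipy import ndimage as ndi
from skimage.morphology import binary_closing, remove_small_objects
from skimage.measure import label, regionprops

# close vertically to join the top and bottom edge of each tube
# (the 25 x 3 footprint is an assumption to tune to the tube spacing)
footprint = np.ones((25, 3), dtype=bool)
bf_bridged = binary_closing(bf_small_removed, footprint)

# fill whatever holes remain inside each tube and drop small speckle
bf_filled = ndi.binary_fill_holes(bf_bridged)
bf_filled = remove_small_objects(bf_filled, 500)

# label each connected region so every tube gets its own mask
bf_labels = label(bf_filled)
for region in regionprops(bf_labels):
    print(region.label, region.area, region.bbox)

Each labelled region can then be used as a mask (bf_labels == region.label) when processing the corresponding fluorescence image.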