I am currently training images for a facial recognition system using Python and OpenCV. I collected the samples from a webcam, but the sample images differ in size, for example 376 x 376, 412 x 412, and 836 x 836.
The screenshot of current working directory:
The sample images are saved in a main folder named 'sampleImgFolder', which contains a separate subfolder for each sample.
Source code for training the images:
import os
import cv2
import numpy as np
from PIL import Image

recognizer = cv2.face.LBPHFaceRecognizer_create()
targetImagesDirectory = "sampleImgFolder/"
dataset = cv2.CascadeClassifier('resources/haarcascade_frontalface_default.xml')

def getImageWithID(path):
    # empty lists to store processed data
    sampleFaces = []
    sampleFaceId = []
    os.chdir(targetImagesDirectory)
    for directory in os.listdir():
        os.chdir(directory)
        for files in os.listdir():
            imagePath = '{}/{}'.format(os.getcwd(), files)
            imagePil = Image.open(imagePath).convert('L')
            imageNumpy = np.array(imagePil, 'uint8')  # conversion of normal image to numpy array
            # imageNumpy.astype(np.float32)
            # detect face
            faces = dataset.detectMultiScale(imageNumpy)
            # extracting id from file name
            id = files.split('_')
            id = id[0].split('-')
            id = id[2]
            for (x, y, w, h) in faces:
                sampleFaces.append(imageNumpy[y:y + h, x:x + w])
                sampleFaceId.append(id)
        os.chdir('../')
    os.chdir('../')
    return np.array(sampleFaceId), sampleFaces

print("reading images")
Ids, faces = getImageWithID(targetImagesDirectory)
print('reading completed')
recognizer.train(faces, Ids)
print("training")
# train the dataset. Create a file name trainningData.yml
recognizer.write('train/trainningData.yml')
cv2.destroyAllWindows()
I am getting the following error while running the code above:
That is because Ids holds strings (str). The .train() method expects integer labels.
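A minimal sketch of the conversion, assuming the IDs parsed from the file names are digit-only strings. The helper name `to_int_labels` and the sample IDs are made up for illustration and do not come from the question.

```python
import numpy as np


def to_int_labels(raw_ids):
    """Convert ID strings such as "7" into an int32 label array."""
    # int() raises ValueError if an ID is not purely numeric,
    # which surfaces a badly named file early.
    return np.array([int(i) for i in raw_ids], dtype=np.int32)


sample_ids = ["1", "1", "2"]  # made-up IDs, standing in for sampleFaceId
labels = to_int_labels(sample_ids)
print(labels.dtype, labels)
```

With labels in this form, the call would become `recognizer.train(faces, to_int_labels(Ids))`.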
Related
The h5 file does not have a group or subgroup, and when I try to extract the images it shows me this error. These are depth images.
This code works for an h5 file with a group, i.e. images: I just write image_ds = hf['images'] and it works. For an h5 file without a group it does not work.
Maybe there is an error in the imwrite call, because print(IMAGE_arr) and print(imagename) both print fine. The number of dimensions is 3 and the type is float32.
Here is my code:
import h5py
import numpy as np
import cv2

save_dir = 'C:/Users.../depth_imgs'
with h5py.File('depth.h5', 'r') as hf:
    image_ds = hf
    for imagename in image_ds.keys():
        IMAGE_arr = image_ds[imagename][()]
        cv2.imwrite(f"{save_dir}/{imagename}", IMAGE_arr)
cv2.waitKey(1000)
cv2.destroyAllWindows()
Loaded data: [two screenshots, not reproduced here]
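One thing worth checking here is the array type: cv2.imwrite picks the encoder from the file extension, and PNG stores 8-bit or 16-bit unsigned integers, not float32. A minimal sketch of a rescaling step follows, assuming each dataset holds a single depth map with no NaN values. The helper name `depth_to_uint16` is hypothetical, and the rescaling discards the absolute depth values.

```python
import numpy as np


def depth_to_uint16(depth):
    """Linearly rescale a float depth map to the full uint16 range."""
    depth = np.asarray(depth, dtype=np.float64)
    lo, hi = depth.min(), depth.max()
    if hi == lo:
        # Constant image: avoid dividing by zero.
        return np.zeros(depth.shape, dtype=np.uint16)
    scaled = (depth - lo) / (hi - lo) * 65535.0
    return np.round(scaled).astype(np.uint16)


example = np.array([[0.0, 0.5], [1.0, 2.0]], dtype=np.float32)  # made-up depths
converted = depth_to_uint16(example)
print(converted.dtype, converted.min(), converted.max())
```

If the dataset names carry no file extension, the write call would then look like `cv2.imwrite(f"{save_dir}/{imagename}.png", depth_to_uint16(IMAGE_arr))`.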
I am currently preparing a synthetic dataset for an object detection task. Annotated datasets such as COCO and Open Images V6 are available for this kind of task. I am trying to download images from them, but only the foreground objects of a specific class, e.g. person; in other words, images of the object alone, with the background removed. The reason is that I want to edit those cut-outs and insert them into new images, e.g. a street scene.
What I have tried so far: I used a library called FiftyOne and downloaded the dataset with its semantic labels. I am stuck here and I don't know what else to do.
It is not necessary to use FiftyOne; any other method would work.
Here is the code that I used to download a sample of the dataset with its labels:
import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset(
    "coco-2017",
    split="validation",
    dataset_dir="path/fiftyone",
    label_types=["segmentations"],
    classes=["person"],
    max_samples=10,
    label_field="instances",
    dataset_name="coco-images-person",
)

# Export the dataset
dataset.export(
    export_dir="path/fiftyone/image-segmentation-dataset",
    dataset_type=fo.types.ImageSegmentationDirectory,
    label_field="instances",
)
Thank you
The easiest way to do this is to iterate over your FiftyOne dataset in a simple Python loop, using OpenCV and NumPy to format the images of object instances and write them to disk.
For example, this function takes any collection of FiftyOne samples (either a Dataset or a View) and writes all object instances to disk in folders separated by class label:
import os
import cv2
import numpy as np

def extract_classwise_instances(samples, output_dir, label_field, ext=".png"):
    print("Extracting object instances...")
    for sample in samples.iter_samples(progress=True):
        img = cv2.imread(sample.filepath)
        img_h, img_w, c = img.shape
        for det in sample[label_field].detections:
            mask = det.mask
            [x, y, w, h] = det.bounding_box
            x = int(x * img_w)
            y = int(y * img_h)
            h, w = mask.shape
            mask_img = img[y:y+h, x:x+w, :]
            alpha = mask.astype(np.uint8)*255
            alpha = np.expand_dims(alpha, 2)
            mask_img = np.concatenate((mask_img, alpha), axis=2)
            label = det.label
            label_dir = os.path.join(output_dir, label)
            if not os.path.exists(label_dir):
                os.mkdir(label_dir)
            output_filepath = os.path.join(label_dir, det.id+ext)
            cv2.imwrite(output_filepath, mask_img)
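The alpha-channel step in that function can be tried in isolation. Below is a small sketch on made-up data, assuming a BGR crop and a boolean mask of the same height and width. The helper name `add_alpha` is hypothetical and not part of FiftyOne or OpenCV.

```python
import numpy as np


def add_alpha(crop_bgr, mask):
    """Append the mask as an alpha channel: 255 inside the object, 0 outside."""
    alpha = mask.astype(np.uint8) * 255
    # dstack promotes the 2D alpha array to (H, W, 1) before stacking.
    return np.dstack((crop_bgr, alpha))


crop = np.full((2, 3, 3), 128, dtype=np.uint8)  # made-up 2x3 gray crop
mask = np.array([[True, False, True],
                 [False, True, False]])
bgra = add_alpha(crop, mask)
print(bgra.shape, bgra[..., 3])
```

Saving the four-channel result as PNG keeps the transparency, which is why the function defaults to `ext=".png"`.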
Here is a complete example that loads a subset of the COCO2017 dataset and writes all "person" instances to disk:
import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset_name = "coco-image-example"
if dataset_name in fo.list_datasets():
    fo.delete_dataset(dataset_name)

label_field = "ground_truth"
classes = ["person"]
dataset = foz.load_zoo_dataset(
    "coco-2017",
    split="validation",
    label_types=["segmentations"],
    classes=classes,
    max_samples=20,
    label_field=label_field,
    dataset_name=dataset_name,
)

view = dataset.filter_labels(label_field, F("label").is_in(classes))
output_dir = "/path/to/output/segmentations/dir/"
os.makedirs(output_dir, exist_ok=True)
extract_classwise_instances(view, output_dir, label_field)
If this capability is something that will be used regularly, it may be useful to write a custom dataset exporter for this format.
I have always worked with images with the extensions .png, .jpg, and .jpeg. Now I have come across medical images with the extension .nii.gz.
I'm using Python and I read one with the following code:
import glob
import nibabel as nib

path = "./Task01_BrainTumour/imagesTr"
path_list = glob.glob(path + '/*.gz')  # list with all paths of image.nii.gz
img = nib.load(path_list[0]).get_data()  # load a single image
The image is now an array of float32 with shape (240, 240, 155, 4). I have read online that in (240, 240, 155, 4) the first two numbers are the image size (240, 240), and 155 is the depth of the image object, namely there are 155 layers in every image object. However, this layer/depth information is not clear to me: what does it mean for an image to have layers? Finally, 4 is the number of channels of the image.
I would like to convert these images to a classical format: (240, 240, 3) for RGB or (240, 240) for grayscale. I don't know whether that is possible.
You're halfway there.
It looks like you're using the Brain Tumours data from the Medical Segmentation Decathlon, and NiBabel to read the images. You can install e.g. scikit-image to save the JPGs.
from pathlib import Path

import numpy as np
import nibabel as nib
from skimage import io

def to_uint8(data):
    data -= data.min()
    data /= data.max()
    data *= 255
    return data.astype(np.uint8)

def nii_to_jpgs(input_path, output_dir, rgb=False):
    output_dir = Path(output_dir)
    data = nib.load(input_path).get_fdata()
    *_, num_slices, num_channels = data.shape
    for channel in range(num_channels):
        volume = data[..., channel]
        volume = to_uint8(volume)
        channel_dir = output_dir / f'channel_{channel}'
        channel_dir.mkdir(exist_ok=True, parents=True)
        for slice in range(num_slices):
            slice_data = volume[..., slice]
            if rgb:
                slice_data = np.stack(3 * [slice_data], axis=2)
            output_path = channel_dir / f'channel_{channel}_slice_{slice}.jpg'
            io.imsave(output_path, slice_data)
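To make the "layers" in the shape concrete: the array can be read as a stack of 2D images along the third axis, once per channel. The sketch below uses a tiny synthetic array in place of the real (240, 240, 155, 4) volume, and the marked slice and channel indices are arbitrary.

```python
import numpy as np

# Small stand-in for a (240, 240, 155, 4) volume: 6x6 pixels, 5 slices, 4 channels.
volume = np.zeros((6, 6, 5, 4), dtype=np.float32)
volume[..., 2, 1] = 1.0  # mark slice 2 of channel 1 so it can be found again

channel_volume = volume[..., 1]        # one 3D volume per channel: (6, 6, 5)
single_slice = channel_volume[..., 2]  # one 2D grayscale image: (6, 6)
rgb_slice = np.stack(3 * [single_slice], axis=2)  # same image repeated: (6, 6, 3)

print(channel_volume.shape, single_slice.shape, rgb_slice.shape)
```

Each `single_slice` is what `nii_to_jpgs` writes as one JPG, so a full file yields 155 images per channel.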
I created a model in Blender. From it I took 2D slices through the y-plane of the model, which gave me the following:
600 PNG files, each corresponding to a y location, i.e. y=0, y=0.1, etc.
Each PNG file has a resolution of 500 x 600.
I am now trying to merge the 600 PNGs into an h5 file using Python before loading the .h5 into some software. Each individual PNG file is read fine and looks great. However, when I look at the final 3D image there is some stretching, and I'm not sure how it is being introduced.
The images are resized (from 600x600 to 500x600), but I have checked and this is not the cause of the stretching. I would like to know why I am introducing such stretching in the other planes (not the y-plane).
Here is my code. Please note that it is a work in progress, which is why I append the dataset to a list (this is to be used by later code).
from PIL import Image
import sys
import os
import fnmatch  # needed for fnmatch.fnmatch below
import h5py
import numpy as np
import cv2
from datetime import datetime

dir_path = os.path.dirname(os.path.realpath(__file__))
sys.path.append(dir_path + '//..//..')
Xlen = 500
Ylen = 600
Zlen = 600
directory = dir_path + "/LowPolyA21/"

# resize every PNG in place; PIL's resize takes (width, height)
for filename in os.listdir(directory):
    if fnmatch.fnmatch(filename, '*.png'):
        image = Image.open(directory + filename)
        new_image = image.resize((Zlen, Xlen))
        new_image.save(directory + filename)

dataset = np.zeros((Xlen, Zlen, Ylen), float)  # np.float was removed in NumPy 1.24

# traverse all the pictures under the specified address
cnt_num = 0
img_list = sorted(os.listdir(directory))
os.chdir(directory)
for img in img_list:
    if img.endswith(".png"):
        gray_img = cv2.imread(img, 0)  # read as grayscale
        dataset[:, :, cnt_num] = gray_img
        cnt_num += 1

dataset[dataset == 0] = -1
dataset = dataset.swapaxes(1, 2)  # shape becomes (Xlen, Ylen, Zlen)
datasetlist = []
datasetlist.append(dataset)
dz_dy_dz = (float(0.001), float(0.001), float(0.001))

i = 0  # index into datasetlist; undefined in the original, assumed to be the single volume
for j in range(Xlen):
    for k in range(Ylen):
        for l in range(Zlen):
            if datasetlist[i][j, k, l] > 1:
                datasetlist[i][j, k, l] = 1

now = datetime.now()
timestamp = now.strftime("%d%m%Y_%H%M%S%f")
out_h5_path = 'voxelA_' + timestamp + '_flipped'
out_h5_path2 = 'voxelA_' + timestamp + '_flipped.h5'
with h5py.File(out_h5_path2, 'w') as f:
    f.attrs['dx_dy_dz'] = dz_dy_dz
    f['data'] = datasetlist[i]  # Write data to the file's primary key data below
Example of image without stretching (in y-plane)
Example of image with stretching (in x-plane)
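As a side note on the code in this question, the triple loop that caps values at 1 visits 500 x 600 x 600 voxels one by one. A vectorized sketch of the same clamping step is shown below. It does not address the stretching itself, and the helper name `clamp_to_one` is made up for illustration.

```python
import numpy as np


def clamp_to_one(volume):
    """Cap every value above 1 at 1, leaving smaller values (such as -1) untouched."""
    return np.minimum(volume, 1)


sample = np.array([[-1.0, 0.5], [1.0, 255.0]])  # made-up voxel values
clamped = clamp_to_one(sample)
print(clamped)
```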
I am trying to open a set of images in Python, but I am a bit puzzled about how to do that. I know how to do it with one image, but I don't have a clue how to handle several hundred images.
I have a file folder with a few hundred .jpg images. I want to load them in a Python program to do machine learning on them. How can I do this properly?
I don't have any code yet since I am already struggling with this.
But my idea in pseudocode was:
dataset = load(images)
do some manipulations on it
How I have done it before:
from sklearn.svm import LinearSVC
from numpy import genfromtxt,savetxt
load = lambda x: genfromtxt(open(x,"r"),delimiter = ",",dtype = "f8")[1:]
dataset = load("train.csv")
train = [x[1:] for x in dataset]
target = [x[0] for x in dataset]
test = load("test.csv")
linear = LinearSVC()
linear.fit(train,target)
savetxt("digit2.csv",linear.predict(test),delimiter = ",", fmt = "%d")
That worked fine because of the format: all the data was in one file.
If you want to process each image individually (assuming you're using PIL or Pillow) then do so sequentially:
import os
from glob import glob

from PIL import Image  # Pillow

def process_image(img_path):
    print("Processing image: %s" % img_path)
    # Open the image
    img = Image.open(img_path)
    # Do your processing here
    print(img.info)
    # Not strictly necessary, but let's be explicit:
    # Close the image
    img.close()

images_dir = "/home/user/images"

if __name__ == "__main__":
    # List all JPEG files in your directory
    images_list = glob(os.path.join(images_dir, "*.jpg"))
    for img_path in images_list:
        # glob already returns paths that include images_dir
        process_image(img_path)
Read the documentation for Python's glob module, and process each of the images in turn in a loop.
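If the goal is a single array to hand to a classifier, as with the CSV data in the question, the per-image loop can collect the pixels as it goes. This is a minimal sketch, assuming grayscale features and a fixed target size are acceptable. The function name `load_images` and the default size are illustrative choices, not something from the question.

```python
from pathlib import Path

import numpy as np
from PIL import Image


def load_images(folder, size=(64, 64), pattern="*.jpg"):
    """Load matching images as grayscale, resize them, and flatten each to one row."""
    paths = sorted(Path(folder).glob(pattern))
    rows = []
    for path in paths:
        with Image.open(path) as img:
            gray = img.convert("L").resize(size)  # size is (width, height)
            rows.append(np.asarray(gray, dtype=np.float64).ravel())
    if not rows:
        # No files matched: return an empty array with the right row width.
        return paths, np.empty((0, size[0] * size[1]))
    return paths, np.stack(rows)
```

A call such as `paths, train = load_images("/home/user/images")` gives one row per image. The labels still have to come from elsewhere, for example from the file names.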