Hi, I'm preprocessing some image data to run through a simple feed-forward network.
I have two options that look identical to me, but one performs a lot better than the other:
Option 1
I save the images in a directory with corresponding subdirectories per class and run this:
xy_training = tf.keras.preprocessing.image_dataset_from_directory("/content/data/train", image_size=(48,48), color_mode='grayscale',label_mode="int")
xy_validation = tf.keras.preprocessing.image_dataset_from_directory("/content/data/valid", image_size=(48,48), color_mode='grayscale',label_mode="int")
xy_testing = tf.keras.preprocessing.image_dataset_from_directory("/content/data/test", image_size=(48,48), color_mode='grayscale',label_mode="int")
Option 2
I have the raw arrays of the grayscale images and do this:
def preprocess(data):
    X = []
    pixels_list = data["pixels"].values
    for pixels in pixels_list:
        single_image = np.reshape(pixels.split(" "), (WIDTH, HEIGHT)).astype("float")
        X.append(single_image)
    # Convert list to 4D array:
    X = np.expand_dims(np.array(X), -1)
    # Normalize pixel values to be between 0 and 1
    X = X / 255.0
    return X

train_images = preprocess(train_data)
valid_images = preprocess(valid_data)
test_images = preprocess(test_data)
Option 2 performs much better than Option 1. Is there a parameter in tf.keras.preprocessing.image_dataset_from_directory() I'm not setting?
Thanks!
This is most probably because tf.keras.preprocessing.image_dataset_from_directory has no built-in normalization. Your custom function applies normalization, so the comparison is not an apples-to-apples one.
You will have to do the normalization in a later step after loading the datasets using image_dataset_from_directory.
Here's a sample code for normalizing after loading a batch dataset:
def normalize(image, label):
    image = tf.cast(image / 255., tf.float32)
    label = tf.cast(label, tf.float32)
    return image, label

xy_training = xy_training.map(normalize)
xy_validation = xy_validation.map(normalize)
xy_testing = xy_testing.map(normalize)
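As an alternative (a sketch, assuming a TF version of 2.6 or later where tf.keras.layers.Rescaling is available), you can put the scaling into the pipeline as a layer instead of a hand-written map function:

rescale = tf.keras.layers.Rescaling(1. / 255)
xy_training = xy_training.map(lambda x, y: (rescale(x), y))
xy_validation = xy_validation.map(lambda x, y: (rescale(x), y))
xy_testing = xy_testing.map(lambda x, y: (rescale(x), y))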
Related
I am working on a classification problem in Python and would like to scale the dataset as a first step.
I have 3463 images, each with dimensions (40, 90, 3), i.e. (x, y, channel). Overall, the array has dimensions (3463, 40, 90, 3).
How can I apply StandardScaler correctly, and how can I display an image afterwards?
Code:
#------------- Image Preprocessing -----------------------------------
Eingangsbilder2 = np.asarray(Eingangsbilder2)
print("Image-dim: ",Eingangsbilder2.shape)
scalers = {}
for x in range(0, len(Eingangsbilder2)):
    for i in range(0, Eingangsbilder2[x].shape[2]):
        scalers[i] = StandardScaler()
        Eingangsbilder2[x][:, :, i] = scalers[i].fit_transform(Eingangsbilder2[x][:, :, i])
plt.imshow(Eingangsbilder2[2010])
You can get rid of the for loop altogether by applying z-scoring, which is equivalent to scikit-learn's StandardScaler, along the first "image number" axis:
Eingangsbilder2 = scipy.stats.zscore(Eingangsbilder2, axis=0)
Hint: in Python you can simply write range(len(Eingangsbilder2)), since indexing (unlike in MATLAB) always starts at 0.
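If you want to convince yourself of the equivalence, here is a quick check on random 2D data (a sketch; StandardScaler standardizes each feature over the sample axis, exactly what zscore with axis=0 does):

import numpy as np
from scipy import stats
from sklearn.preprocessing import StandardScaler

a = np.random.rand(100, 12)             # 100 samples, 12 features
z1 = stats.zscore(a, axis=0)            # z-score along the sample axis
z2 = StandardScaler().fit_transform(a)  # same operation via scikit-learn
print(np.allclose(z1, z2))              # True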
I am trying to implement an image augmentation strategy similar to RandAugment in TensorFlow. From the RandAugment paper, the following code shows how N augmentations are randomly selected to be applied to images.
transforms = ['Identity', 'AutoContrast', 'Equalize', 'Rotate', 'Solarize',
              'Color', 'Posterize', 'Contrast', 'Brightness', 'Sharpness',
              'ShearX', 'ShearY', 'TranslateX', 'TranslateY']
def randaugment(N, M):
    """Generate a set of distortions.

    Args:
        N: Number of augmentation transformations to apply sequentially.
        M: Magnitude for all the transformations.
    """
    sampled_ops = np.random.choice(transforms, N)
    return [(op, M) for op in sampled_ops]
However, I wish to do this per batch of images in TensorFlow, ideally as efficiently as possible. It would look something like
transform_names = ['Identity', 'Brightness', 'Colour', 'Contrast', 'Equalise', 'Rotate',
'Sharpness', 'ShearX', 'ShearY', 'TranslateX', 'TranslateY']
transforms = {'Identity':identity, 'Brightness':brightness, 'Colour':colour,
'Contrast':contrast, 'Equalise':equalise, 'Rotate':rotate,
'Sharpness':sharpness, 'ShearX':shear_x, 'ShearY':shear_y,
'TranslateX':translate_x, 'TranslateY':translate_y}
def brightness(image, M):
    M = tf.math.minimum(M, 0.95)
    M = tf.math.maximum(M, 0.05)
    B = M - 1
    image = tf.image.adjust_brightness(image, delta=B)
    image = tf.clip_by_value(image, clip_value_min=0, clip_value_max=1)
    return image

def augment(image):
    N = 3
    M = tf.random.uniform(minval=0, maxval=1, shape=[])
    sampled_ops = np.random.choice(transform_names, N)
    for op in sampled_ops:
        image = transforms[op](image, M)
    return image
x = tf.data.Dataset.from_tensor_slices(x)
x = x.batch(batch_size)
x_a = x.map(augment)
where x is the dataset of images and augment is the augmentation function that randomly samples N augmentations to apply to each image. I've added the brightness function to illustrate the composition of the individual augmentation functions. From what I've gathered, any NumPy function inside the mapped function is only called once, when the graph is traced, meaning the sampled augmentations will be the same for every image.
How could I write this code such that the individual augmentations are randomly sampled independently for each batch?
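One way to keep the sampling inside the graph, so that it reruns on every call (a sketch, not the paper's implementation; identity here stands in for the remaining ops), is to draw the op index with tf.random and dispatch with tf.switch_case:

def identity(image, M):
    return image

transform_fns = [identity, brightness]  # extend with the remaining ops

def augment(image):
    N = 3
    M = tf.random.uniform(minval=0, maxval=1, shape=[])
    for _ in range(N):
        # tf.random ops execute on every run of the graph, unlike np.random,
        # which is evaluated only once at trace time
        idx = tf.random.uniform(shape=[], maxval=len(transform_fns), dtype=tf.int32)
        image = tf.switch_case(idx, [lambda fn=fn: fn(image, M) for fn in transform_fns])
    return image

Because the random index is a graph op rather than a Python value, each batch passing through map draws a fresh set of augmentations.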
I'm using the following example to analyse the performance of a computer vision system depending on data quality.
Keras Implementation Retinanet: https://keras.io/examples/vision/retinanet/
My goal is to corrupt (stretch, shift) certain percentages (10%, 20%, 30%) of the total bounding boxes across all images. This means that images should be randomly picked and some of their bounding boxes corrupted so that, in total, the target percentage is affected.
I'm using the tensorflow datasets as my training data (e.g. https://www.tensorflow.org/datasets/catalog/kitti).
My basic idea was to generate an array the size of the total number of boxes, fill it with 1 (modify box) and 0 (ignore box), and then iterate through all boxes:
random_array = np.concatenate((np.ones(int(error_rate_size*TOTAL_NUMBER_OF_BOXES)+1,dtype=int),np.zeros(int((1-error_rate_size)*TOTAL_NUMBER_OF_BOXES)+1,dtype=int)))
The problem is that the implementation I'm using relies heavily on graph execution and specifically on the map function (https://www.tensorflow.org/api_docs/python/tf/data/Dataset#map). I would like to follow this pattern in order to keep the implemented data pipeline.
What I am hoping to do is to use the map function in combination with a global counter, so I can loop through the array and modify a box whenever the condition is met. It should roughly look something like this:
COUNT = 0

def damage_data(box):
    scaling_range = 2.0
    global COUNT
    COUNT += 1
    if random_array[COUNT] == 1:
        new_box = tf.stack(
            [
                box[0]*scaling_range*tf.random.uniform(shape=(),minval=0.0,maxval=1.0,dtype=tf.float32,seed=1),  # x center
                box[1]*scaling_range*tf.random.uniform(shape=(),minval=0.0,maxval=1.0,dtype=tf.float32,seed=2),  # y center
                box[2]*scaling_range*tf.random.uniform(shape=(),minval=0.0,maxval=1.0,dtype=tf.float32,seed=3),  # width
                box[3]*scaling_range*tf.random.uniform(shape=(),minval=0.0,maxval=1.0,dtype=tf.float32,seed=4),  # height
            ],
            axis=-1,
        )
    else:
        tf.print("Not Changed")
        new_box = tf.stack(
            [
                box[0],  # x center
                box[1],  # y center
                box[2],  # width
                box[3],  # height
            ],
            axis=-1,
        )
    return new_box
def damage_data_cross_sequential(image, bbox, class_id):
    # bbox format [x_center, y_center, width, height]
    bbox = tf.map_fn(damage_data, bbox)
    return image, bbox, class_id

train_dataset = train_dataset.map(damage_data_cross_sequential, num_parallel_calls=1)
But with this code the variable COUNT is not incremented globally; rather, every map() call starts from the initial value 0. I assume this is caused by the graph implementation and the parallel processing in map().
The question is now whether there is any way to globally increase a counter through the map function, or whether I could extend the given dataset with a unique identifier (e.g. add box[5] = id).
I hope the problem is clear and thanks already! :)
--------------UPDATE 1-------------------------------
The second approach as described by @Lescurel is what I'm trying to do.
Some clarifications about the dataset structure.
The number of boxes per image is not identical; it changes from image to image, e.g. sample 1: ((x_dim, y_dim, 3), (4, 4)), sample 2: ((x_dim, y_dim, 3), (2, 4)).
For a better understanding the structure can be reproduced with the following:
import tensorflow as tf
import tensorflow_datasets as tfds
import numpy as np
valid_ds = tfds.load('kitti', split='validation') # validation is a smaller set
def select_relevant_info(sample):
    image = sample["image"]
    bbox = sample["objects"]["bbox"]
    class_id = tf.cast(sample["objects"]["type"], dtype=tf.int32)
    return image, bbox, class_id

valid_ds = valid_ds.map(select_relevant_info)

for sample in valid_ds.take(1):
    print(sample)
For plenty of reasons, using a global state is not a terribly good idea, but it's probably even worse in a concurrent context like this one.
There are at least two other ways of implementing what you want:
using a random sample with a threshold as the condition to modify the label
putting your random array into the dataset as the condition to modify the label
I personally prefer the first option, which is simpler.
An example.
Let's generate some random data and create a tf.data.Dataset. In this example, the total number of samples is 1000:
imgs = tf.random.uniform((1000, 4, 4))
boxes = tf.ones((1000, 4))
ds = tf.data.Dataset.from_tensor_slices((imgs, boxes))
First option: Random Sample
This function will draw a number uniformly between 0 and 1. If this number is higher than the threshold prob, nothing happens. Otherwise, we modify the label. In this example, that gives a 5% chance of modifying the label.
def change_label_with_prob(label, prob=0.05, scaling_range=2.):
    return tf.cond(
        tf.random.uniform(()) > prob,
        lambda: label,
        lambda: label * scaling_range * tf.random.uniform((4,), 0., 1., dtype=tf.float32),
    )
You can simply call it with Dataset.map:
new_ds = ds.map(lambda img, box: (img, change_label_with_prob(box)))
Second option: Pass the condition array around
First, we generate an array filled with our conditions: 1 if we want to modify the label, 0 if not.
# let's set the number to change to 200
N_TO_CHANGE = 200
# randomly shuffled array with 200 ones and 800 zeros
cond_array = tf.random.shuffle(
    tf.concat([tf.ones((N_TO_CHANGE,), dtype=tf.bool),
               tf.zeros((1000 - N_TO_CHANGE,), dtype=tf.bool)], axis=0)
)
Then we can create a dataset from that array of conditions, and zip it with our previous dataset:
# creating a dataset from the conditional array
ds_cond = tf.data.Dataset.from_tensor_slices(cond_array)
# zipping the two datasets together
ds_data_and_cond = tf.data.Dataset.zip((ds, ds_cond))
# each element of that dataset is ((img, box), cond)
We can write our function, roughly the same as before:
def change_label_with_cond(label, cond, scaling_range=2.0):
    # if cond is true, modify the label; do nothing otherwise
    return tf.cond(
        cond,
        lambda: label * scaling_range * tf.random.uniform((4,), 0.0, 1.0, dtype=tf.float32),
        lambda: label,
    )
And then map the function on our new dataset, paying attention to the nested shape of each element of the dataset:
ds_changed_label = ds_data_and_cond.map(
    lambda img_and_box, z: (img_and_box[0], change_label_with_cond(img_and_box[1], z))
)
# New dataset has a shape (img, box), same as before the zipping
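As a quick sanity check (a sketch; it relies on eager iteration and on the all-ones boxes from the toy data above), you can count how many labels actually changed:

n_changed = sum(
    int(tf.reduce_any(box != 1.0)) for _, box in ds_changed_label
)
print(n_changed)  # should equal N_TO_CHANGE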
I am working on a captcha recognition project with the Keras library. For the training set, I am using the following function to generate captchas of at most 5 digits.
def genData(n=1000, max_digs=5, width=60):
    capgen = ImageCaptcha()
    data = []
    target = []
    for i in range(n):
        x = np.random.randint(0, 10 ** max_digs)
        img = misc.imread(capgen.generate(str(x)))
        img = np.mean(img, axis=2)[:, :width]
        data.append(img.flatten())
        target.append(x)
    return np.array(data), np.array(target)
Then, I am trying to reshape the training data array as follows:
train_data = train_data.reshape(train_data.shape[0], 60, 60, 3)
I guess my captchas have 3 color channels. However, when I try to reshape the training data I get the following error:
ValueError: cannot reshape array of size 3600000 into shape
(1000,60,60,3)
Note: if I try with 1 instead of 3, the error does not occur, but my accuracy is not even close to 1%.
You are creating a single-channel image by taking the mean over the color axis. The error says that you are trying to reshape an array with 3,600,000 elements into an array three times as big (1000*60*60*3 = 10,800,000). Adapt your function as in the example below to get it to work.
Also, because you are cropping the image width to 60 pixels, part of the captcha is cut off and the target no longer matches the image. This explains the low accuracy. Try using a bigger width and your accuracy will most likely increase (e.g. 150-155).
def genData(n=1000, max_digs=5, width=60):
    capgen = ImageCaptcha()
    data = []
    target = []
    for i in range(n):
        x = np.random.randint(0, 10 ** max_digs)
        img = misc.imread(capgen.generate(str(x)))
        img = img[:, :width, :]
        data.append(img.flatten())
        target.append(x)
    return np.array(data), np.array(target)
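With all three channels kept, each flattened image now has 60*60*3 = 10,800 values (assuming the default ImageCaptcha height of 60 pixels), so the reshape from the question works:

train_data, train_target = genData()
train_data = train_data.reshape(train_data.shape[0], 60, 60, 3)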
I'm training/testing ML models over a dataset containing images of multiple sizes. I know Keras allows us to extract a random patch of fixed size using the target_size parameter:
gen = ImageDataGenerator(width_shift_range=.9, height_shift_range=.9)
data = gen.flow_from_directory('/path/to/dataset/train',
                               target_size=(224, 224),
                               classes=10,
                               batch_size=32,
                               seed=0)

for _ in range(data.N // data.batch_size):
    X, y = next(data)
For each iteration, X contains 32 patches (one for each different sample). Across all iterations, I have access to one patch of each sample in the dataset.
Question: what is the best way to extract MULTIPLE patches of the same sample?
Something like:
data = gen.flow_from_directory(..., nb_patches=10)
X, y = next(data)
# X contains 320 rows (10 patches for each of the 32 samples in the batch)
I know I can write a second for loop and iterate multiple times over the dataset, but this seems a little messy. I would also like a stronger guarantee that I am really fetching patches of the same sample.
skimage has utility methods that allow you to split images into overlapping or non-overlapping patches.
Check out view_as_windows and view_as_blocks
http://scikit-image.org/docs/dev/api/skimage.util.html?highlight=view_as_windows#skimage.util.view_as_windows
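A minimal sketch of view_as_windows on a single image (with made-up sizes: a 20x20 window and a stride of 10 over one (40, 90, 3) array):

import numpy as np
from skimage.util import view_as_windows

img = np.random.rand(40, 90, 3)
patches = view_as_windows(img, window_shape=(20, 20, 3), step=10)
print(patches.shape)  # (3, 8, 1, 20, 20, 3): a grid of overlapping 20x20 windows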
I decided to implement it myself. That's how it ended up:
n_patches = 10
labels = ('class1', 'class2', ...)
X, y = [], []
for label in labels:
    data_dir = os.path.join('path-to-dir', label)
    for name in os.listdir(data_dir):
        full_name = os.path.join(data_dir, name)
        img = Image.open(full_name).convert('RGB')
        patches = []
        for patch in range(n_patches):
            # pick a random top-left corner so the crop stays inside the image
            start = (np.random.rand(2) * (img.width - image_shape[1],
                                          img.height - image_shape[0])).astype('int')
            end = start + (image_shape[1], image_shape[0])
            patches.append(img_to_array(img.crop((start[0], start[1],
                                                  end[0], end[1]))))
        X.append(patches)
        y.append(label)
X, y = np.array(X, dtype='float32'), np.array(y)  # y holds string labels, so no float cast
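Since X ends up with shape (n_images, n_patches, height, width, 3), one way (a sketch) to treat every patch as its own training sample is to flatten the first two axes and repeat the labels accordingly:

X = X.reshape((-1,) + X.shape[2:])  # (n_images * n_patches, height, width, 3)
y = np.repeat(y, n_patches)         # one label per patch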