I want to create a NumPy array of arrays, where each sub-array has the shape [128, audio_length, 1], so I can feed it to Keras' fit method. However, I cannot figure out how to do this, as np.array just throws a "cannot broadcast" error.
def prepare_data(df, config, data_dir, bands=128):
    log_specgrams_2048 = []
    for i, fname in enumerate(df.index):
        file_path = data_dir + fname
        data, _ = librosa.core.load(file_path, sr=config.sampling_rate, res_type="kaiser_fast")
        melspec = librosa.feature.melspectrogram(data, sr=config.sampling_rate, n_mels=bands)
        logspec = librosa.core.power_to_db(melspec)  # shape: [128, audio_length]
        logspec = logspec[..., np.newaxis]  # shape: [128, audio_length, 1]
        log_specgrams_2048.append(normalize_data(logspec))
    return log_specgrams_2048
Because each clip has a different audio_length, the sub-arrays have different shapes, and NumPy cannot stack them into one rectangular array. You have to group sequences by length and call fit multiple times, or otherwise make the shapes uniform. Your options are:
Bucketing (group clips of similar length and call fit once per bucket)
Zero-padding (pad every spectrogram to the longest length; see the sketch below)
Batches of size 1 (feed each sample individually)
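A minimal zero-padding sketch, assuming specs is the list returned by prepare_data above (pad_specs is a hypothetical helper, not part of your code):

import numpy as np

def pad_specs(specs):
    # pad each [128, T, 1] spectrogram along the time axis to the longest T
    max_len = max(s.shape[1] for s in specs)
    padded = [np.pad(s, ((0, 0), (0, max_len - s.shape[1]), (0, 0))) for s in specs]
    return np.stack(padded)  # shape: [n_samples, 128, max_len, 1]

X = pad_specs(prepare_data(df, config, data_dir))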
I am trying to use the following code as a data generator for brain vessel defect segmentation. I have generated .npy files from the NIfTI files; each .npy file has different dimensions, e.g. [512, 512, 140] and [560, 560, 141]. I am using the following code:
def load_img(img_dir, img_list):
    images = []
    for i, image_name in enumerate(img_list):
        if image_name.split('.')[1] == 'npy':
            image = np.load(img_dir + image_name)
            images.append(image)
    images = np.array(images)
    return images
def imageLoader(img_dir, img_list, mask_dir, mask_list, batch_size):
    L = len(img_list)
    # Keras needs the generator to be infinite, so we use while True
    while True:
        batch_start = 0
        batch_end = batch_size
        while batch_start < L:
            limit = min(batch_end, L)
            X = load_img(img_dir, img_list[batch_start:limit])
            Y = load_img(mask_dir, mask_list[batch_start:limit])
            yield (X, Y)  # a tuple of two numpy arrays with batch_size samples
            batch_start += batch_size
            batch_end += batch_size
############################################
# Test the generator
from matplotlib import pyplot as plt
import os
import random

train_img_dir = "/content/drive/MyDrive/input_data_512/train/images/"
train_mask_dir = "/content/drive/MyDrive/input_data_512/train/masks/"
train_img_list = os.listdir(train_img_dir)
train_mask_list = os.listdir(train_mask_dir)
batch_size = 2
train_img_datagen = imageLoader(train_img_dir, train_img_list,
                                train_mask_dir, train_mask_list, batch_size)
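Drawing one batch is what triggers the error below (this is the line the traceback points at):

img, msk = train_img_datagen.__next__()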
The error:
/usr/local/lib/python3.7/dist-packages/ipykernel_launcher.py:13: VisibleDeprecationWarning: Creating an ndarray from ragged nested
sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated.
If you meant to do this, you must specify 'dtype=object' when creating the ndarray
del sys.path[0]
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-6-ac15ea7a12c9> in <module>()
57
58 #Verify generator.... In python 3 next() is renamed as __next__()
---> 59 img, msk = train_img_datagen.__next__()
60
61
1 frames
<ipython-input-6-ac15ea7a12c9> in load_img(img_dir, img_list)
11
12 images.append(image)
---> 13 images = np.array(images)
14
15 return(images)
ValueError: could not broadcast input array from shape (512,512,100) into shape (512,512)
Your problem is with the line images = np.array(images).
The input is a list of differently shaped volumes, and you ask NumPy to convert it into a single higher-dimensional array. That cannot work, because NumPy arrays must be rectangular. So what do you want to achieve?
From the looks of it, your inputs have shapes (512, 512, 100) and (512, 512). What output shape do you want? Where should the pixels go? What you told NumPy to do is create shape (2, 512, 512), but that obviously doesn't fit the data.
What you could do is create (512, 512, 101) by stacking along the last axis. If that is what you want, replace the offending line with images = np.dstack(images).
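A minimal demonstration of the failure and of the np.dstack fix, with zero arrays standing in for your volumes:

import numpy as np

a = np.zeros((512, 512, 100))
b = np.zeros((512, 512))

# np.array([a, b]) raises the "could not broadcast" ValueError from your traceback
stacked = np.dstack([a, b])  # b is treated as shape (512, 512, 1)
print(stacked.shape)  # (512, 512, 101)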
That worked, thanks. However, when I try to plot the images I get this error: too many indices for array: array is 2-dimensional, but 3 were indexed
img, msk = train_img_datagen.__next__()
img_num = random.randint(0, img.shape[0] - 1)
test_img = img[img_num]
test_mask = msk[img_num]
test_mask = np.argmax(test_mask, axis=2)
n_slice = random.randint(0, test_mask.shape[0])
plt.figure(figsize=(12, 8))
plt.subplot(221)
plt.imshow(test_img[:, :, n_slice], cmap='gray')
plt.title('Image flair')
plt.subplot(222)
plt.imshow(test_mask[:, :, n_slice], cmap='gray')
plt.title('Mask')
plt.show()
If I change the axis to 3, I get:
axis 3 is out of bounds for array of dimension 3
I'm struggling to create a data generator in PyTorch that extracts 2D images from many 3D cubes saved in .dat format.
There are 200 3D cubes in total, each of shape 128×128×128. I want to extract 2D images from all of these cubes along the length and breadth.
For example, let a be a cube of size 128×128×128.
I want to extract all 2D images along the length, i.e. a[:, i, :], which gives 128 2D images, and similarly along the width, i.e. a[:, :, i], which gives another 128. That makes 256 2D images per cube, and repeating this for all 200 cubes gives 51,200 2D images.
So far I've tried a very basic implementation which works fine but takes approximately 10 minutes to run. I'd like help creating a more optimal implementation, keeping time and space complexity in mind. My current approach has O(n²) time complexity; can we decrease it further?
I'm providing the current implementation below:
from os.path import join as pjoin
import torch
import numpy as np
import os
from tqdm import tqdm
from torch.utils import data

class DataGenerator(data.Dataset):
    def __init__(self, is_transform=True, augmentations=None):
        self.is_transform = is_transform
        self.augmentations = augmentations
        self.dim = (128, 128, 128)
        seismicSections = []  # input
        faultSections = []    # ground truth
        for fileName in tqdm(os.listdir(pjoin('train', 'seis')), total=len(os.listdir(pjoin('train', 'seis')))):
            # the .dat file contains the unrolled cube, so we need to reshape it
            unrolledVolSeismic = np.fromfile(pjoin('train', 'seis', fileName), dtype=np.single)
            # transpose so that height is axis 0, length axis 1, width axis 2
            reshapedVolSeismic = np.transpose(unrolledVolSeismic.reshape(self.dim))
            unrolledVolFault = np.fromfile(pjoin('train', 'fault', fileName), dtype=np.single)
            reshapedVolFault = np.transpose(unrolledVolFault.reshape(self.dim))
            for idx in range(reshapedVolSeismic.shape[2]):
                seismicSections.append(reshapedVolSeismic[:, :, idx])
                faultSections.append(reshapedVolFault[:, :, idx])
            for idx in range(reshapedVolSeismic.shape[1]):
                seismicSections.append(reshapedVolSeismic[:, idx, :])
                faultSections.append(reshapedVolFault[:, idx, :])
        self.seismicSections = seismicSections
        self.faultSections = faultSections

    def __len__(self):
        return len(self.seismicSections)

    def __getitem__(self, index):
        X = self.seismicSections[index]
        Y = self.faultSections[index]
        return X, Y
Please Help!!!
Why not store only the 3D data in memory and let the __getitem__ method "slice" it on the fly? A sketch (the two volume lists are passed to the constructor and are assumed to be torch tensors):
import torch
from torch.utils.data import Dataset

class CachedVolumeDataset(Dataset):
    def __init__(self, volumes_x, volumes_y):
        super().__init__()
        self._volumes_x = volumes_x  # a list of 200 128x128x128 volumes
        self._volumes_y = volumes_y  # a list of 200 128x128x128 volumes

    def __len__(self):
        return len(self._volumes_x) * (128 + 128)

    def __getitem__(self, index):
        # extract volume index from the flat index:
        vidx = index // (128 + 128)
        # extract slice index
        sidx = index % (128 + 128)
        if sidx < 128:
            # first dim
            x = self._volumes_x[vidx][:, :, sidx]
            y = self._volumes_y[vidx][:, :, sidx]
        else:
            sidx -= 128
            # second dim
            x = self._volumes_x[vidx][:, sidx, :]
            y = self._volumes_y[vidx][:, sidx, :]
        return torch.squeeze(x), torch.squeeze(y)
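A minimal usage sketch; volumes_x and volumes_y here are hypothetical lists of 128x128x128 torch tensors built with the same np.fromfile / reshape / transpose logic as in your __init__:

from torch.utils.data import DataLoader

dataset = CachedVolumeDataset(volumes_x, volumes_y)
loader = DataLoader(dataset, batch_size=16, shuffle=True)
for x, y in loader:
    pass  # x and y each have shape (16, 128, 128)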
I'm trying to create an array of images using NumPy to feed into an image classification neural network. When I load an image into an array it has 3 dimensions, but when I use np.append to append it to my array of all the images, the resulting shape is (631800003,). Why is this happening and how do I fix it? Or should I be loading the images some other way?
Here is my code for the variable definition cell:
normal = np.array([])
normalSet = np.array([])
badSet = np.array([])
Labels = np.array([])
Training_data = np.array([])
validationSet = []
process_data = True
ramCheck = 0
And the image loading:
if process_data:
    for image in os.listdir('train/NORMAL/'):
        normal = imread('train/NORMAL/' + image)
        normalSet = np.append(normal, normalSet)
        Labels = np.append(Labels, 0)
        validationSet.append(normal)
    for image in os.listdir('train/PNEUMONIA/'):
        bad = imread('train/PNEUMONIA/' + image)
        badSet = np.append(badSet, bad)
        Labels = np.append(Labels, 1)
        validationSet.append(bad)
    print("done!")
    Training_data = np.append(badSet, normalSet)
    np.save("TrainingData.npy", Training_data)
    np.save("TrainingLabels.npy", Labels)
else:
    Training_data = np.load("TrainingData.npy")
    Labels = np.load("TrainingLabels.npy")
It's easier to append to a plain Python list and convert the list to a NumPy array once at the end; the resulting array will have the correct dimensions to feed to a neural network.
normalSet = []
labels = []
for image in os.listdir('train/NORMAL/'):
    normal = imread('train/NORMAL/' + image)
    normalSet.append(normal)
    validationSet.append(normal)
    labels.append(0)
normalSet = np.array(normalSet)
labels = np.array(labels)
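Note that np.array(normalSet) only produces a clean (N, H, W) stack if every image has the same dimensions; if your images vary in size, resize them to a common shape first, or NumPy will raise an error (or, on older versions, silently build a ragged object array).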
Say I have a for loop which runs 10 times, and each loop generates a NumPy array of shape (32, 128).
How do I iteratively combine them in the loop to finally get a NumPy array of shape (10, 32, 128, 1)?
I'm working with the IAM database for handwriting recognition, and I want the NumPy array to store all my images as pixels.
file = open(path, 'rb')
l = file.readlines()
y_train = np.array([])
x_train = np.array([])
count = 0
for x in l:
    a = x.split()
    y_train = np.append(y_train, str(a[-1].decode("utf-8")))
    path1 = a[0].decode("utf-8").split('-')
    os.chdir(path_toimages + path1[0] + '/' + path1[0] + '-' + path1[1])
    try:
        im = Image.open(a[0].decode("utf-8") + ".png")
        np_im = np.array(im)
        np_im = preprocess(np_im, (32, 128))
        x_train = np.append(x_train, np_im)
        print(np_im.shape)
    except OSError:
        continue
    count += 1
    print(count)
    if count == 10:
        break
Since you already know x_train's final shape, you can preallocate an array of that shape, then assign each newly processed image to x_train[index], incrementing index on every iteration.
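A minimal sketch of that preallocation approach, assuming 10 images of shape (32, 128) as in the question (np_im stands in for the preprocessed image):

import numpy as np

n_samples = 10
x_train = np.empty((n_samples, 32, 128, 1), dtype=np.float32)
for index in range(n_samples):
    np_im = np.zeros((32, 128))              # stand-in for preprocess(...)
    x_train[index] = np_im[..., np.newaxis]  # add the trailing channel axis
print(x_train.shape)  # (10, 32, 128, 1)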
I'm using OpenCV to read images into numpy.array, and they have the following shape.
import cv2
import os
import numpy

def readImages(path):
    imgs = []
    for file in os.listdir(path):
        if file.endswith('.png'):
            img = cv2.imread(os.path.join(path, file))
            imgs.append(img)
    imgs = numpy.array(imgs)
    return imgs

imgs = readImages(...)
print(imgs.shape)  # (100, 718, 686, 3)
Each image has 718×686 pixels. There are 100 images.
I don't want to work with the (718, 686) spatial dimensions; I'd like to combine the pixels into a single dimension, so the shape becomes (100, 492548, 3). Is there any way, either in OpenCV (or any other library) or in NumPy, to do that?
Without modifying your reading function:
imgs = readImages(...)
print(imgs.shape)  # (100, 718, 686, 3)

# flatten axes -2 and -3, using -1 to autocalculate the size
pixel_lists = imgs.reshape(imgs.shape[:-3] + (-1, 3))
print(pixel_lists.shape)  # (100, 492548, 3)
In case anyone wants it, here's a general way of doing this:

import functools
import numpy as np

def combine_dims(a, i=0, n=1):
    """
    Combines dimensions of numpy array `a`,
    starting at index `i`,
    and merging the next `n` dimensions into it.
    """
    s = list(a.shape)
    combined = functools.reduce(lambda x, y: x * y, s[i:i+n+1])
    return np.reshape(a, s[:i] + [combined] + s[i+n+1:])

With this function you can use it like this:

imgs = combine_dims(imgs, 1)  # combines dimensions 1 and 2
# imgs.shape == (100, 718*686, 3)
import numpy

def combine_dims(a, start=0, count=2):
    """ Reshapes numpy array a by combining count dimensions,
    starting at dimension index start """
    s = a.shape
    return numpy.reshape(a, s[:start] + (-1,) + s[start+count:])

This function does what you need in a more general way.

imgs = combine_dims(imgs, 1)  # combines dimensions 1 and 2
# imgs.shape == (100, 718*686, 3)
It works by using numpy.reshape, which turns an array of one shape into an array with the same data but viewed as another shape. The target shape is just the initial shape, but with the dimensions to be combined replaced by -1. numpy uses -1 as a flag to indicate that it should work out itself how big that dimension should be (based on the total number of elements.)
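A quick check of that -1 flag on a synthetic array of the same shape:

import numpy
a = numpy.zeros((100, 718, 686, 3))
print(numpy.reshape(a, (100, -1, 3)).shape)  # (100, 492548, 3)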
This code is essentially a simplified version of Multihunter's answer, but my edit was rejected and hinted that it should be a separate answer. So there you go.
import cv2
import os
import numpy as np

def readImages(path):
    imgs = np.empty((0, 492548, 3))
    for file in os.listdir(path):
        if file.endswith('.png'):
            img = cv2.imread(os.path.join(path, file))
            img = img.reshape((1, 492548, 3))
            imgs = np.append(imgs, img, axis=0)
    return imgs

imgs = readImages(...)
print(imgs.shape)  # (100, 492548, 3)
The trick is to reshape and then append to a NumPy array. It's not good practice to hardcode the length of the vector (492548), so if I were you I'd also add a line that calculates this number from the image dimensions and puts it in a variable, for use in the rest of the script.
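For example, a sketch of computing that length instead of hardcoding it (first_img is a hypothetical first image read with cv2.imread):

n_pixels = first_img.shape[0] * first_img.shape[1]  # 718 * 686 == 492548
img = img.reshape((1, n_pixels, 3))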