How to get a single index from a DataSet in PyTorch? - python

I want to randomly draw a sample from my test DataSet object to perform a prediction using my trained model.
To do this I use the following code block, which causes the error shown below:
rng = np.random.default_rng()
ind = rng.integers(0, len(test_ds), (1,))[-1]
I = test_ds[ind]  # Note: I is a list of tensors of equal size
I = [Ik.to(device) for Ik in I]
with torch.no_grad():
    _, y_f_hat, _, y_f = model.forward_F(I)
    y_f_hat = y_f_hat.cpu().numpy().flatten()
    y_f = y_f.cpu().numpy().flatten()
ERROR: /usr/local/lib/python3.8/dist-packages/torch/nn/modules/flatten.py in forward(self, input)
44
45 def forward(self, input: Tensor) -> Tensor:
---> 46 return input.flatten(self.start_dim, self.end_dim)
47
48 def extra_repr(self) -> str:
IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)
There is no problem when using the dataloader:
for I in test_dataloader:
    with torch.no_grad():
        _, y_f_hat, _, y_f = model.forward_F(I)
        y_f_hat = y_f_hat.cpu().numpy().flatten()
        y_f = y_f.cpu().numpy().flatten()
    break
test_ds is the dataset used in test_dataloader.
Notes: running on a Google Colab GPU, Python 3.9.

When using a DataLoader, the data comes out as a batch of samples, so its shape is (B, ...), where B is the batch size and ... are the other dimensions (I do not know what your samples look like; for images, for example, it would be (B, C, H, W), where C, H, W are the number of channels, height, and width, respectively). This is what PyTorch layers expect; in other words, the input needs a leading batch dimension.
As a solution, you can call .unsqueeze(0) on the input tensor before feeding it into the model.
_, y_f_hat, _, y_f = model.forward_F(I.unsqueeze(0))
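Since test_ds[ind] in the question returns a list of tensors (one per model input), a minimal sketch under that assumption would unsqueeze each element before the forward pass; the names test_ds, device, and forward_F are taken from the question:
# Assumption: test_ds[ind] returns a list of tensors, one per model input,
# each without a leading batch dimension
I = test_ds[ind]
I = [Ik.unsqueeze(0).to(device) for Ik in I]  # add a batch dimension of size 1 to each tensor
with torch.no_grad():
    _, y_f_hat, _, y_f = model.forward_F(I)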

Related

How to handle transforms.FiveCrop change in tensor size

I am trying to add transforms.FiveCrop to my model. I understand that this method of data augmentation adds dimensions to my tensor but I am not sure how to handle it. The documentation notes:
This transform returns a tuple of images and there may be a mismatch in the number of inputs and targets your Dataset returns. See below for an example of how to deal with this.
Example
>>> transform = Compose([
>>> FiveCrop(size), # this is a list of PIL Images
>>> Lambda(lambda crops: torch.stack([ToTensor()(crop) for crop in crops])) # returns a 4D tensor
>>> ])
>>> #In your test loop you can do the following:
>>> input, target = batch # input is a 5d tensor, target is 2d
>>> bs, ncrops, c, h, w = input.size()
>>> result = model(input.view(-1, c, h, w)) # fuse batch size and ncrops
>>> result_avg = result.view(bs, ncrops, -1).mean(1) # avg over crops
...But I am not sure how to implement this.
Train loop:
for batch_idx, (data, target) in enumerate(train_loader):
print(f'Data: {data.shape}, Target: {target.shape}')
# Before Fivecrop: Data: torch.Size([32, 3, 224, 224]), Target: torch.Size([32])
# After Fivecrop: Data: torch.Size([32, 5, 3, 224, 224]), Target: torch.Size([32])
indx_target = target.clone()
data = data.to(train_config.device)
target = target.to(train_config.device)
optimizer.zero_grad()
output = model(data)
loss = F.cross_entropy(output, target)
loss.backward()
optimizer.step()
Can someone help explain how this is implemented in my train loop and what I am not understanding?
Thanks
This makes sense now. For anyone who needs it, here is the new code:
for batch_idx, (data, target) in enumerate(train_loader):
    bs, ncrops, c, h, w = data.size()
    print(f'Data: {data.view(-1, c, h, w).shape}, Target: {target.shape}')
    indx_target = target.clone()
    data = data.to(train_config.device)
    target = target.to(train_config.device)
    optimizer.zero_grad()
    output = model(data.view(-1, c, h, w))            # fuse batch size and ncrops
    output_avg = output.view(bs, ncrops, -1).mean(1)  # average the output over the five crops
    loss = F.cross_entropy(output_avg, target)
    loss.backward()
    optimizer.step()
    batch_loss = np.append(batch_loss, [loss.item()])
    prob = F.softmax(output_avg, dim=1)
    pred = prob.data.max(dim=1)[1]
    correct = pred.cpu().eq(indx_target).sum()
    accuracy = float(correct) / float(len(data))
    batch_acc = np.append(batch_acc, [accuracy])
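For evaluation, the documentation's test-loop pattern quoted above applies in the same way. This is a minimal sketch reusing model and train_config.device from the question; the name test_loader is an assumption here:
model.eval()
with torch.no_grad():
    for data, target in test_loader:
        bs, ncrops, c, h, w = data.size()
        data = data.to(train_config.device)
        target = target.to(train_config.device)
        output = model(data.view(-1, c, h, w))            # fuse batch size and ncrops
        output_avg = output.view(bs, ncrops, -1).mean(1)  # average predictions over the five crops
        pred = output_avg.argmax(dim=1)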

Shaping Numpy Arrays for pytorch GAN

(pre-processing for qiskit QGAN but the use case is somewhat irrelevant)
I'm a bit lost trying to figure out how to preprocess an image dataset before passing it through a GAN. Below is all the relevant code up to my error. This code is derived from https://github.com/Qiskit/qiskit-tutorials/blob/master/legacy_tutorials/aqua/machine_learning/qgans_for_loading_random_distributions.ipynb and has been altered (or so I attempted) to accommodate a different input dataset. (The original generated dummy data with much simpler dimensions.)
# Root directory for dataset
dataroot = "./data/land"
# Number of workers for dataloader
workers = 2
# Batch size during training
batch_size = 128
# Image size
image_size = 64

dataset = dset.ImageFolder(root=dataroot,
                           transform=transforms.Compose([
                               transforms.Resize(image_size),
                               transforms.CenterCrop(image_size),
                               transforms.ToTensor(),
                               transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
                           ]))

# Create the dataloader
dataloader = torch.utils.data.DataLoader(dataset, batch_size=batch_size,
                                         shuffle=True, num_workers=workers)

real_batch = next(iter(dataloader))
real_batch_arr = [t.numpy() for t in real_batch]

# Set number of qubits per data dimension as a list of k qubit values [#q_0, ..., #q_k-1]
num_qubits = [4]
k = len(num_qubits)
num_epochs = 100

# Initialize qGAN
qgan = QGAN(real_batch_arr, bounds=bounds, num_qubits=num_qubits, batch_size=128,
            num_epochs=num_epochs, snapshot_dir=None)
This gives me the following error:
ValueError Traceback (most recent call last)
<ipython-input-42-8cba9a74f024> in <module>
5
6 # Initialize qGAN
----> 7 qgan = QGAN(real_batch_arr,bounds=bounds, num_qubits = num_qubits,batch_size = 128, num_epochs=num_epochs, snapshot_dir=None)
8 qgan.seed = 1
9 # Set quantum instance to run the quantum generator
~\Anaconda3\lib\site-packages\qiskit\aqua\algorithms\distribution_learners\qgan.py in __init__(self, data, bounds, num_qubits, batch_size, num_epochs, seed, discriminator, generator, tol_rel_ent, snapshot_dir, quantum_instance)
99 if data is None:
100 raise AquaError('Training data not given.')
--> 101 self._data = np.array(data)
102 if bounds is None:
103 bounds_min = np.percentile(self._data, 5, axis=0)
ValueError: could not broadcast input array from shape (128,3,64,64) into shape (128)
I understand that at some point the qiskit function (QGAN) attempts to turn real_batch_arr (which is passed in as a list) into an array, and that this array is expected to have shape (128,). On top of that, QGAN needs to be passed an array, not a list (based on the original code linked above).
My question is how I can transform my list into the array that I need. There could also be something I am simply fundamentally missing. I truly appreciate any advice or comments.
The current implementation of the qGAN algorithm does not support data sets that are given as tensors. The data must be given either as a flat array or as an array of k-dimensional data points, i.e., the shape of the data should be num_data_samples x dim_data_samples.
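A minimal sketch of what that could look like for the image batch from the question (the reshaping choices below are illustrative assumptions, not part of the original tutorial, and whether such high-dimensional data is practical for the qGAN is a separate question):
# real_batch[0] holds the images from the DataLoader, shape (128, 3, 64, 64)
images = real_batch[0].numpy()

# Option 1: flatten each image into one feature vector -> shape (128, 12288)
data_2d = images.reshape(images.shape[0], -1)

# Option 2: reduce each image to a single scalar (e.g. its mean pixel value)
# to get a flat 1-D array -> shape (128,), closer to the original toy example
data_1d = images.mean(axis=(1, 2, 3))

qgan = QGAN(data_1d, bounds=bounds, num_qubits=num_qubits, batch_size=128,
            num_epochs=num_epochs, snapshot_dir=None)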

How to pass y_true as a dict to a custom loss function unchanged?

I need to implement a simple OCR model using tf.keras.*.
But:
The blank class is not zero, as TF expects, but (num_classes - 1) instead.
The width of the input images is not known beforehand (it differs between batches).
I want to use tf.nn.ctc_loss, which has a nice argument: blank_index.
So I made a simple wrapper to compute CTC loss:
class CTCLossWrapper(tf.keras.losses.Loss):
    def __init__(self, blank_class: int, reduction: str = tf.keras.losses.Reduction.AUTO, name: str = 'ctc_loss'):
        super().__init__(reduction=reduction, name=name)
        self.blank_class = blank_class

    def call(self, y_true, y_pred):
        output = y_true['output']
        targets, target_lengths = output['targets'], output['target_lengths']
        y_pred = tf.math.log(tf.transpose(y_pred, perm=[1, 0, 2]) + K.epsilon())
        max_input_len = K.cast(K.shape(y_pred)[1], dtype='int32')
        input_lengths = tf.ones((K.shape(y_pred)[0]), dtype='int32') * max_input_len
        return tf.nn.ctc_loss(
            labels=targets,
            logits=y_pred,
            label_length=target_lengths,
            logit_length=input_lengths,
            blank_index=self.blank_class
        )
I also wrote a simple generator function which yields training samples:
def generator(dataset, batch_size: int, shuffle=False):
    indexes = np.arange(len(dataset))
    while True:
        if shuffle:
            indexes = np.random.permutation(indexes)
        for i in range(0, len(dataset), batch_size):
            # Get next batch
            batch = dataset[indexes[i:i+batch_size]]
            images, image_widths = batch['images'], batch['image_widths']
            targets, target_lengths = batch['targets'], batch['target_lengths']

            # Re-arrange dimensions (B, H, W, C) -> (B, W, H, C)
            # Important note: width=W and height=H are swapped from the typical Keras convention
            # because width is the time dimension when it gets fed into the RNN
            images = np.transpose(images, axes=(0, 2, 1, 3)).astype(np.float32) / 255.0

            # Change zero target lengths to 1 due to invalid implementation of ctc_batch_cost in Keras
            target_lengths[target_lengths == 0] = 1

            # Add singleton dimension
            # image_widths = image_widths[:, np.newaxis]
            # target_lengths = target_lengths[:, np.newaxis]

            # Construct output value
            outputs = {
                'images': images,                  # (batch_size, max_image_width, 32, 1)
                'image_widths': image_widths,      # (batch_size,)
                'targets': targets,                # (batch_size, max_target_len)
                'target_lengths': target_lengths,  # (batch_size,)
            }
            yield images, dict(output=outputs)
As you can see, the generator outputs not just (x, y_true) but four pieces of data:
input images,
input image widths,
target sequences,
lengths for each target sequence.
This is so because tf.nn.ctc_loss also requires at least 4 arguments to work.
My plan was to pass input images as x and a dictionary of all 4 values as y_true.
Then of course I compile the model using my CTCLossWrapper and blank_class:
model.compile(
    optimizer=Adam(),
    loss=CTCLossWrapper(blank_class=blank_class),
)
After that I can start training by:
model.fit(
    x=generator(train_dataset, batch_size=batch_size, shuffle=True),
    steps_per_epoch=int(len(train_dataset) // batch_size),
    epochs=200
)
The problem is that when my CTCLossWrapper is invoked, it does not receive the dict() as y_true; it only receives one of the tensors from it.
How can I avoid or turn off TensorFlow's preprocessing and get the y_true values in the same form as they were supplied by the dataset?

How to load 2D data into an LSTM in pytorch

I have a series of sine waves that I have loaded in using a custom dataloader. The data is converted to a torch tensor using from_numpy. I then try to load the data using an enumerator over the train_loader. The iterator is shown below.
for epoch in range(epochs):
    for i, data in enumerate(train_loader):
        input = np.array(data)
        train(epoch)
The error I receive is:
RuntimeError: input must have 3 dimensions, got 2
I know I need to have my input data in [sequence_length, batch_size, input_size] for an LSTM, but I have no idea how to format my array data of 1000 sine waves of length 10000.
Below is my training method.
def train(epoch):
    model.train()
    train_loss = 0

    def closure():
        optimizer.zero_grad()
        print(input.shape)
        output = model(Variable(input))
        loss = loss_function(output)
        print('epoch: ', epoch.item(), 'loss:', loss.item())
        loss.backward()
        return loss

    optimizer.step(closure)
I thought I would try adding (seq_length, batch_size, input_size) as a tuple, but this can't be fed into the network. Further to this, my assumption was that the dataloader fed the batch size into the system. Any help would be appreciated.
edit:
Here is my sample data:
T = 20
L = 1000
N = 100
x = np.empty((N, L), 'int64')
x[:] = np.array(range(L)) + np.random.randint(-4 * T, 4 * T, N).reshape(N, 1)
data = np.sin(x / 1.0 / T).astype('float64')
torch.save(data, open('traindata.pt', 'wb'))
Can you share a simple example of your data just to confirm?
Also, you need a different order for your shape. Generally, the first dimension is the batch size, and the other dimensions come afterwards, like [batch_size, sequence_length, input_dim].
One way to achieve this, if you have a batch size of 1, is to use torch.unsqueeze(). This allows you to create a "fake" dimension:
import torch as t
x = t.Tensor([1,2,3])
print(x.shape)
x = x.unsqueeze(dim=0) # adds a 0-th dimension of size 1
print(x.shape)
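Applied to the sine-wave data saved above, a minimal sketch could look like the following (the hidden_size value and the use of batch_first=True are illustrative assumptions, not part of the question):
import torch

# data was saved as a NumPy array of shape (N, L): N sine waves of length L
data = torch.load('traindata.pt')
x = torch.from_numpy(data).float()   # (N, L)
x = x.unsqueeze(-1)                  # (N, L, 1): adds the input_size dimension

# Assumption: the LSTM is created with batch_first=True and one feature per time step
lstm = torch.nn.LSTM(input_size=1, hidden_size=51, batch_first=True)
output, (h_n, c_n) = lstm(x)         # output: (N, L, hidden_size)
print(output.shape)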

Dimension out of range when applying l2 normalization in Pytorch

I'm getting a runtime error:
RuntimeError: Dimension out of range (expected to be in range of [-1, 0], but got 1)
and can't figure out how to fix it.
The error appears to refer to the line:
i_enc = F.normalize(input=i_batch, p=2, dim=1, eps=1e-12)  # (batch, K, feat_dim)
I'm trying to encode image features (batch x 36 x 2048) by applying an L2 norm. Below is the full code for the section.
def forward(self, q_batch, i_batch):
    # batch size = 512
    # q -> 512 (batch) x 14 (length)
    # i -> 512 (batch) x 36 (K) x 2048 (f_dim)

    # one-hot -> glove
    emb = self.embed(q_batch)
    output, hn = self.gru(emb.permute(1, 0, 2))
    q_enc = hn.view(-1, self.h_dim)

    # image encoding with l2 norm
    i_enc = F.normalize(input=i_batch, p=2, dim=1, eps=1e-12)  # (batch, K, feat_dim)

    q_enc_copy = q_enc.repeat(1, self.K).view(-1, self.K, self.h_dim)
    q_i_concat = torch.cat((i_enc, q_enc_copy), -1)
    q_i_concat = self.non_linear(q_i_concat, self.td_W, self.td_W2)  # 512 x 36 x 512
    i_attention = self.att_w(q_i_concat)  # 512 x 36 x 1
    i_attention = F.softmax(i_attention.squeeze(), 1)

    # weighted sum
    i_enc = torch.bmm(i_attention.unsqueeze(1), i_enc).squeeze()  # (batch, feat_dim)

    # element-wise multiplication
    q = self.non_linear(q_enc, self.q_W, self.q_W2)
    i = self.non_linear(i_enc, self.i_W, self.i_W2)
    h = torch.mul(q, i)  # (batch, hid_dim)

    # output classifier
    # BCE with logits loss
    score = self.c_Wo(self.non_linear(h, self.c_W, self.c_W2))
    return score
I would appreciate any help.
Thanks
I would suggest checking the shape of i_batch (e.g. print(i_batch.shape)), as I suspect i_batch has only one dimension (e.g. shape [N]).
That would explain why PyTorch complains that you can only normalize over dimension #0, while you are asking for the operation to be done over dimension #1 (cf. dim=1).
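A minimal sketch of that check (the shapes here are illustrative, not taken from the questioner's actual data):
import torch
import torch.nn.functional as F

# A 1-D tensor only has dimension 0 (or -1), so dim=1 is out of range
i_batch_1d = torch.rand(512)
print(i_batch_1d.shape)                                 # torch.Size([512])
# F.normalize(i_batch_1d, p=2, dim=1, eps=1e-12)        # raises the dimension-out-of-range error

# A 3-D tensor of the expected (batch, K, feat_dim) shape accepts dim=1
i_batch_3d = torch.rand(512, 36, 2048)
i_enc = F.normalize(i_batch_3d, p=2, dim=1, eps=1e-12)
print(i_enc.shape)                                      # torch.Size([512, 36, 2048])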
