'Reverse' a transpose / flatten in Python

I have a numpy array data of shape (3, 3, k), where the length k is fixed.
The array was flattened to a one-dimensional one with:
mat2 = numpy.transpose(data, (1, 0, 2)).flatten('C')
How do I reverse this transpose / flattening process to get back the original (3, 3, k) shape and ordering of the data array?

>>> import numpy as np
>>> k = 10
# Generate a (3, 3, k) array:
>>> a = np.linspace(0, 89, 90).reshape((3, 3, k))
>>> a
array([[[ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.],
        [10., 11., 12., 13., 14., 15., 16., 17., 18., 19.],
        [20., 21., 22., 23., 24., 25., 26., 27., 28., 29.]],

       [[30., 31., 32., 33., 34., 35., 36., 37., 38., 39.],
        [40., 41., 42., 43., 44., 45., 46., 47., 48., 49.],
        [50., 51., 52., 53., 54., 55., 56., 57., 58., 59.]],

       [[60., 61., 62., 63., 64., 65., 66., 67., 68., 69.],
        [70., 71., 72., 73., 74., 75., 76., 77., 78., 79.],
        [80., 81., 82., 83., 84., 85., 86., 87., 88., 89.]]])
# Apply your transform:
>>> b = np.transpose(a, (1, 0, 2)).flatten('C')
>>> b
array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 30., 31., 32.,
       33., 34., 35., 36., 37., 38., 39., 60., 61., 62., 63., 64., 65.,
       66., 67., 68., 69., 10., 11., 12., 13., 14., 15., 16., 17., 18.,
       19., 40., 41., 42., 43., 44., 45., 46., 47., 48., 49., 70., 71.,
       72., 73., 74., 75., 76., 77., 78., 79., 20., 21., 22., 23., 24.,
       25., 26., 27., 28., 29., 50., 51., 52., 53., 54., 55., 56., 57.,
       58., 59., 80., 81., 82., 83., 84., 85., 86., 87., 88., 89.])
# Reverse the transform: reshape first, then undo the transpose:
>>> c = b.reshape((3, 3, k)).transpose((1, 0, 2))
>>> c
array([[[ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.],
        [10., 11., 12., 13., 14., 15., 16., 17., 18., 19.],
        [20., 21., 22., 23., 24., 25., 26., 27., 28., 29.]],

       [[30., 31., 32., 33., 34., 35., 36., 37., 38., 39.],
        [40., 41., 42., 43., 44., 45., 46., 47., 48., 49.],
        [50., 51., 52., 53., 54., 55., 56., 57., 58., 59.]],

       [[60., 61., 62., 63., 64., 65., 66., 67., 68., 69.],
        [70., 71., 72., 73., 74., 75., 76., 77., 78., 79.],
        [80., 81., 82., 83., 84., 85., 86., 87., 88., 89.]]])
# Check that we got the original array back:
>>> np.array_equal(a, c)
True
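More generally, reshape back to the shape the array had just before flatten(), then undo the transpose with the inverse permutation, which np.argsort gives you. A short sketch reusing a, b, and k from above (here (1, 0, 2) happens to be its own inverse):

perm = (1, 0, 2)
pre_flatten_shape = tuple(np.array(a.shape)[list(perm)])  # shape right before flatten()
inv_perm = tuple(np.argsort(perm))                        # inverse permutation
restored = b.reshape(pre_flatten_shape).transpose(inv_perm)
assert np.array_equal(a, restored)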

Related

Pytorch matrix multiplication

I'm struggling with dimensions and matrix multiplication in PyTorch.
I want to multiply matrix A
tensor([[[104.7500, 111.3750, 138.2500, 144.8750],
         [104.2500, 110.8750, 137.7500, 144.3750]],

        [[356.8750, 363.5000, 390.3750, 397.0000],
         [356.3750, 363.0000, 389.8750, 396.5000]]])
with matrix B
tensor([[[[  0.,   1.,   2.,   5.,   6.,   7.,  10.,  11.,  12.],
          [  2.,   3.,   4.,   7.,   8.,   9.,  12.,  13.,  14.],
          [ 10.,  11.,  12.,  15.,  16.,  17.,  20.,  21.,  22.],
          [ 12.,  13.,  14.,  17.,  18.,  19.,  22.,  23.,  24.]],

         [[ 25.,  26.,  27.,  30.,  31.,  32.,  35.,  36.,  37.],
          [ 27.,  28.,  29.,  32.,  33.,  34.,  37.,  38.,  39.],
          [ 35.,  36.,  37.,  40.,  41.,  42.,  45.,  46.,  47.],
          [ 37.,  38.,  39.,  42.,  43.,  44.,  47.,  48.,  49.]],

         [[ 50.,  51.,  52.,  55.,  56.,  57.,  60.,  61.,  62.],
          [ 52.,  53.,  54.,  57.,  58.,  59.,  62.,  63.,  64.],
          [ 60.,  61.,  62.,  65.,  66.,  67.,  70.,  71.,  72.],
          [ 62.,  63.,  64.,  67.,  68.,  69.,  72.,  73.,  74.]]],

        [[[ 75.,  76.,  77.,  80.,  81.,  82.,  85.,  86.,  87.],
          [ 77.,  78.,  79.,  82.,  83.,  84.,  87.,  88.,  89.],
          [ 85.,  86.,  87.,  90.,  91.,  92.,  95.,  96.,  97.],
          [ 87.,  88.,  89.,  92.,  93.,  94.,  97.,  98.,  99.]],

         [[100., 101., 102., 105., 106., 107., 110., 111., 112.],
          [102., 103., 104., 107., 108., 109., 112., 113., 114.],
          [110., 111., 112., 115., 116., 117., 120., 121., 122.],
          [112., 113., 114., 117., 118., 119., 122., 123., 124.]],

         [[125., 126., 127., 130., 131., 132., 135., 136., 137.],
          [127., 128., 129., 132., 133., 134., 137., 138., 139.],
          [135., 136., 137., 140., 141., 142., 145., 146., 147.],
          [137., 138., 139., 142., 143., 144., 147., 148., 149.]]]])
However, simply using the @ operator to multiply them doesn't lead me to the desired result.
What I want is something like: multiply the first two rows of A by the first 3 4x9 submatrices of B (let's say B[:,:,0,:]) so that I have two results, then in the same way multiply the third and fourth rows of A by the second 3 4x9 submatrices of B, so as to again have two results; then I want to sum the first results of each multiplication, and the second results of each.
I know I have to work with some kind of reshapes, but I find it so confusing; can you help me with a fairly generalizable solution?
In case someone is wondering how to perform this with torch.einsum: you just have to think in terms of dimensions and make the operation explicit with the subscripts:
>>> torch.einsum('ijk,ilkm->ljm', A, B)
The overall operation performed is, in pseudo-code:
for i, j, k, l, m in IxJxKxLxM:
    out[l][j][m] += A[i][j][k] * B[i][l][k][m]
This example would be helpful:
import torch

a = torch.ones((4, 4)).long()
a = a.reshape(2, 2, 4)
b = torch.tensor(list(range(36*6)))
b = b.reshape(2, 3, 4, 9)

t1 = a[0] @ b[0, :]   # batched matmul broadcasts (2, 4) @ (3, 4, 9) -> (3, 2, 9)
t2 = a[1] @ b[1, :]
result = t1 + t2

# Equivalently, accumulate over the first axis in a loop:
accum = torch.zeros((b.shape[1], a.shape[1], b.shape[3]))
for i in range(a.shape[0]):
    accum = accum + (a[i] @ b[i, :])
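As a quick sanity check (a sketch reusing a, b, and accum from the example above; the operands are cast to float so all dtypes match), the einsum form agrees with the loop:

out = torch.einsum('ijk,ilkm->ljm', a.float(), b.float())
print(torch.equal(out, accum))  # True: both compute the same (3, 2, 9) result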

Reshape tensors in pytorch?

I'm struggling with the result of a matrix multiplication in PyTorch and I don't know how to fix it. In particular:
I'm multiplying these two matrices
tensor([[[[209.5000, 222.7500],
          [276.5000, 289.7500]],

         [[208.5000, 221.7500],
          [275.5000, 288.7500]]]], dtype=torch.float64)
and
tensor([[[[ 0.,  1.,  2.,  5.,  6.,  7., 10., 11., 12.],
          [ 2.,  3.,  4.,  7.,  8.,  9., 12., 13., 14.],
          [10., 11., 12., 15., 16., 17., 20., 21., 22.],
          [12., 13., 14., 17., 18., 19., 22., 23., 24.]],

         [[25., 26., 27., 30., 31., 32., 35., 36., 37.],
          [27., 28., 29., 32., 33., 34., 37., 38., 39.],
          [35., 36., 37., 40., 41., 42., 45., 46., 47.],
          [37., 38., 39., 42., 43., 44., 47., 48., 49.]],

         [[50., 51., 52., 55., 56., 57., 60., 61., 62.],
          [52., 53., 54., 57., 58., 59., 62., 63., 64.],
          [60., 61., 62., 65., 66., 67., 70., 71., 72.],
          [62., 63., 64., 67., 68., 69., 72., 73., 74.]]]],
       dtype=torch.float64)
with the following line of code, A.view(2,-1) @ B, and then I reshape the result with result.view(2, 3, 3, 3).
The resulting matrix is
tensor([[[[ 6687.5000,  7686.0000,  8684.5000],
          [11680.0000, 12678.5000, 13677.0000],
          [16672.5000, 17671.0000, 18669.5000]],

         [[ 6663.5000,  7658.0000,  8652.5000],
          [11636.0000, 12630.5000, 13625.0000],
          [16608.5000, 17603.0000, 18597.5000]],

         [[31650.0000, 32648.5000, 33647.0000],
          [36642.5000, 37641.0000, 38639.5000],
          [41635.0000, 42633.5000, 43632.0000]]],

        [[[31526.0000, 32520.5000, 33515.0000],
          [36498.5000, 37493.0000, 38487.5000],
          [41471.0000, 42465.5000, 43460.0000]],

         [[56612.5000, 57611.0000, 58609.5000],
          [61605.0000, 62603.5000, 63602.0000],
          [66597.5000, 67596.0000, 68594.5000]],

         [[56388.5000, 57383.0000, 58377.5000],
          [61361.0000, 62355.5000, 63350.0000],
          [66333.5000, 67328.0000, 68322.5000]]]], dtype=torch.float64)
Instead I want
tensor([[[[ 6687.5000,  7686.0000,  8684.5000],
          [11680.0000, 12678.5000, 13677.0000],
          [16672.5000, 17671.0000, 18669.5000]],

         [[31650.0000, 32648.5000, 33647.0000],
          [36642.5000, 37641.0000, 38639.5000],
          [41635.0000, 42633.5000, 43632.0000]],

         [[56612.5000, 57611.0000, 58609.5000],
          [61605.0000, 62603.5000, 63602.0000],
          [66597.5000, 67596.0000, 68594.5000]]],

        [[[ 6663.5000,  7658.0000,  8652.5000],
          [11636.0000, 12630.5000, 13625.0000],
          [16608.5000, 17603.0000, 18597.5000]],

         [[31526.0000, 32520.5000, 33515.0000],
          [36498.5000, 37493.0000, 38487.5000],
          [41471.0000, 42465.5000, 43460.0000]],

         [[56388.5000, 57383.0000, 58377.5000],
          [61361.0000, 62355.5000, 63350.0000],
          [66333.5000, 67328.0000, 68322.5000]]]], dtype=torch.float64)
Can someone help me? Thanks
This is a common but interesting problem, because solving it involves a combination of torch.reshape and torch.transpose. More specifically, you will need to:
1. Apply an initial reshape to restructure the tensor and expose the axes you want to swap;
2. Then swap them using a transpose operation;
3. Lastly apply a second reshape to get to the desired format.
In your case, you could do:
>>> result.reshape(3,2,3,3).transpose(0,1).reshape(2,3,3,3)
tensor([[[[ 6687.5000,  7686.0000,  8684.5000],
          [11680.0000, 12678.5000, 13677.0000],
          [16672.5000, 17671.0000, 18669.5000]],

         [[31650.0000, 32648.5000, 33647.0000],
          [36642.5000, 37641.0000, 38639.5000],
          [41635.0000, 42633.5000, 43632.0000]],

         [[56612.5000, 57611.0000, 58609.5000],
          [61605.0000, 62603.5000, 63602.0000],
          [66597.5000, 67596.0000, 68594.5000]]],

        [[[ 6663.5000,  7658.0000,  8652.5000],
          [11636.0000, 12630.5000, 13625.0000],
          [16608.5000, 17603.0000, 18597.5000]],

         [[31526.0000, 32520.5000, 33515.0000],
          [36498.5000, 37493.0000, 38487.5000],
          [41471.0000, 42465.5000, 43460.0000]],

         [[56388.5000, 57383.0000, 58377.5000],
          [61361.0000, 62355.5000, 63350.0000],
          [66333.5000, 67328.0000, 68322.5000]]]], dtype=torch.float64)
I encourage you to look at the intermediate results to get an idea of how the method works, so you can apply it to other use cases in the future.
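For instance, here is a short sketch (reusing result from above) that prints the shape after each step:

step1 = result.reshape(3, 2, 3, 3)  # expose the two axes that ended up interleaved
step2 = step1.transpose(0, 1)       # swap them -> torch.Size([2, 3, 3, 3])
step3 = step2.reshape(2, 3, 3, 3)   # final reshape, mirroring the general recipe
print(step1.shape, step2.shape, step3.shape)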

Is there a way to get batches with continuous examples in Pytorch

I have a dataset with 10,000+ examples, and using DataLoader I create batches of size 50. I'm trying to find a way to have batch 1 start at example 1 and end at example 50, then have batch 2 start at example 2 and end at example 51, and so on.
This is a snippet of where I use DataLoader:
train_loader = torch.utils.data.DataLoader(train, batch_size=batch_size, drop_last=True, shuffle=False)

for epoch in range(num_epochs):
    totalEpochs += 1
    for X, y in train_loader:
        train = X.view(-1, 1, X.shape[1]).float()
One way to get overlapping windows is to build them yourself before batching, so that example i is data[i : i + seq_length]:
import numpy as np
import torch
from torch.autograd import Variable

def timeseries_to_supervised(data, seq_length):
    x = []
    y = []
    for i in range(len(data) - seq_length - 1):
        _x = data[i:(i + seq_length)]
        _y = data[i + seq_length]
        x.append(_x)
        y.append(_y)
    return np.array(x), np.array(y)
data = range(180)
window_size = 30 # 60 mins
x,y = timeseries_to_supervised(data, window_size)
train_data = x[:90]
train_label = y[:90]
test_data = x[90:]
test_label = y[90:]
trainX = Variable(torch.Tensor(np.array(train_data)))
trainY = Variable(torch.Tensor(np.array(train_label)))
testX = Variable(torch.Tensor(np.array(test_data)))
testY = Variable(torch.Tensor(np.array(test_label)))
batch = 24
from torch.utils.data import Dataset, DataLoader
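# TimeSeriesDataSet is used below but was not defined in the original post;
# a minimal wrapper like this (an assumed implementation) makes the snippet runnable:
class TimeSeriesDataSet(Dataset):
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __len__(self):
        return len(self.x)
    def __getitem__(self, idx):
        return self.x[idx], self.y[idx]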
train_loader = (DataLoader(TimeSeriesDataSet(trainX, trainY), batch_size=batch, shuffle=False))
test_loader = (DataLoader(TimeSeriesDataSet(testX, testY), batch_size=batch, shuffle=False))
for i, d in enumerate(train_loader):
    print(i, d[0].shape, d[1].shape)
    print(d)  # d[0] - features, d[1] - labels
Results:
0 torch.Size([24, 30]) torch.Size([24])
[tensor([[ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12., 13.,
          14., 15., 16., 17., 18., 19., 20., 21., 22., 23., 24., 25., 26., 27.,
          28., 29.],
         [ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12., 13., 14.,
          15., 16., 17., 18., 19., 20., 21., 22., 23., 24., 25., 26., 27., 28.,
          29., 30.],
         [ 2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12., 13., 14., 15.,
          16., 17., 18., 19., 20., 21., 22., 23., 24., 25., 26., 27., 28., 29.,
          30., 31.],
         ...
         [23., 24., 25., 26., 27., 28., 29., 30., 31., 32., 33., 34., 35., 36.,
          37., 38., 39., 40., 41., 42., 43., 44., 45., 46., 47., 48., 49., 50.,
          51., 52.]]),
 tensor([30., 31., 32., 33., 34., 35., 36., 37., 38., 39., 40., 41., 42., 43.,
         44., 45., 46., 47., 48., 49., 50., 51., 52., 53.])]
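As an aside (not part of the original answer): if the series is already a tensor, torch.Tensor.unfold can build the same overlapping windows without a Python loop. A minimal sketch, assuming the same toy series of 180 points and a window of 30:

import torch

series = torch.arange(180).float()
window_size = 30
windows = series.unfold(0, window_size, 1)  # shape (151, 30): one window per start index
features = windows[:-1]                     # features[i] == series[i : i + window_size]
labels = series[window_size:]               # the value right after each window
print(features.shape, labels.shape)         # torch.Size([150, 30]) torch.Size([150])

Wrapping features and labels in a torch.utils.data.TensorDataset and a DataLoader with shuffle=False then yields consecutive, overlapping batches.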

Output array after performing a fast Fourier transform of a data set

I'm trying to perform a Fourier transform of a data set that I have, and subsequently write out its real and imaginary parts separately.
This is my code:
import sys
import numpy as np

temparray = []
for i in range(200000):
    line = sys.stdin.readline()  # reads data from a text file
    fields = list(map(float, line.split()))
    temparray.append(fields)

acf = sum(temparray, [])  # flatten: the list appended above is nested
y = np.fft.fft(acf)
z = (y.real, y.imag)
print(z)
The output that I get is as follows:
(array([ 2600.36368107, 2439.50426935, 1617.52631545, ..., 1575.78483016, 1617.52631545, 2439.50426935]), array([ 0. , -767.19967198, -743.75183367, ..., 726.45052092, 743.75183367, 767.19967198]))
It looks like it's only printing the first few and last few values, completely skipping everything in between. Can anybody please tell me why this is happening?
Thanks
As others have indicated, this is just NumPy summarizing long arrays when it prints them; nothing is missing from the data. You can control it with a modified version of:
>>> np.set_printoptions(edgeitems=5, linewidth=80, precision=2, suppress=True, threshold=10)
>>> a = np.arange(0, 100.)
>>> a
array([ 0.,  1.,  2.,  3.,  4., ..., 95., 96., 97., 98., 99.])
>>> np.set_printoptions(edgeitems=5, linewidth=80, precision=2, suppress=True, threshold=100)
>>> a
array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11.,
       12., 13., 14., 15., 16., 17., 18., 19., 20., 21., 22., 23.,
       24., 25., 26., 27., 28., 29., 30., 31., 32., 33., 34., 35.,
       36., 37., 38., 39., 40., 41., 42., 43., 44., 45., 46., 47.,
       48., 49., 50., 51., 52., 53., 54., 55., 56., 57., 58., 59.,
       60., 61., 62., 63., 64., 65., 66., 67., 68., 69., 70., 71.,
       72., 73., 74., 75., 76., 77., 78., 79., 80., 81., 82., 83.,
       84., 85., 86., 87., 88., 89., 90., 91., 92., 93., 94., 95.,
       96., 97., 98., 99.])
For your case, use a call like
np.set_printoptions(edgeitems=3, linewidth=80, precision=2, suppress=True, threshold=5)
but with threshold set to some very large number.
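For example, the NumPy documentation suggests sys.maxsize as the threshold to disable summarization entirely:

import sys
import numpy as np

np.set_printoptions(threshold=sys.maxsize)  # always print the full array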
Addendum
I would be remiss if I didn't state that the simplest solution to the above is to simply use
>>> list(a)
should you not care whether the result is displayed as an array or not.

Tolerances, linalg.solve, polynomial solve

I have the following problem:
I try to solve the equation system using linalg.solve, and it seems to work. But if I try to check it by inserting the acquired coefficients and one of the required points, I get a difference of about 30% from the original data. Have I made a mistake that I'm not seeing? Or do I have to use a different method to get more accurate results? If yes, which one?
Furthermore, if I use values different from the ones I entered while calculating the coefficients, I get strangely high results.
data = np.genfromtxt("data1.csv",dtype=float,delimiter=";")
to = data[0,2:]
tc = data[1:,0]
y = data[1:,2:]
a = np.array([[1, to[0], tc[0], to[0]**2, to[0]*tc[0], tc[0]**2, to[0]**3, tc[0]*to[0]**2, to[0]*tc[0]**2, tc[0]**3],
              [1, to[1], tc[1], to[1]**2, to[1]*tc[1], tc[1]**2, to[1]**3, tc[1]*to[1]**2, to[1]*tc[1]**2, tc[1]**3],
              [1, to[2], tc[2], to[2]**2, to[2]*tc[2], tc[2]**2, to[2]**3, tc[2]*to[2]**2, to[2]*tc[2]**2, tc[2]**3],
              [1, to[3], tc[3], to[3]**2, to[3]*tc[3], tc[3]**2, to[3]**3, tc[3]*to[3]**2, to[3]*tc[3]**2, tc[3]**3],
              [1, to[4], tc[4], to[4]**2, to[4]*tc[4], tc[4]**2, to[4]**3, tc[4]*to[4]**2, to[4]*tc[4]**2, tc[4]**3],
              [1, to[5], tc[5], to[5]**2, to[5]*tc[5], tc[5]**2, to[5]**3, tc[5]*to[5]**2, to[5]*tc[5]**2, tc[5]**3],
              [1, to[6], tc[6], to[6]**2, to[6]*tc[6], tc[6]**2, to[6]**3, tc[6]*to[6]**2, to[6]*tc[6]**2, tc[6]**3],
              [1, to[7], tc[7], to[7]**2, to[7]*tc[7], tc[7]**2, to[7]**3, tc[7]*to[7]**2, to[7]*tc[7]**2, tc[7]**3],
              [1, to[8], tc[8], to[8]**2, to[8]*tc[8], tc[8]**2, to[8]**3, tc[8]*to[8]**2, to[8]*tc[8]**2, tc[8]**3],
              [1, to[9], tc[9], to[9]**2, to[9]*tc[9], tc[9]**2, to[9]**3, tc[9]*to[9]**2, to[9]*tc[9]**2, tc[9]**3]])
b = np.array([y[0,0],y[1,1],y[2,2],y[3,3],y[4,4],y[5,5],y[6,6],y[7,7],y[8,8],y[9,9]])
c = np.linalg.solve(a, b)
ges_to = 10
ges_tc = 35
ges_y = c[0] + c[1]*ges_to + c[2]*ges_tc + c[3]*ges_to**2 + c[4]*ges_to*ges_tc + c[5]*ges_tc**2 + c[6]*ges_to**3 \
        + c[7]*ges_tc*ges_to**2 + c[8]*ges_to*ges_tc**2 + c[9]*ges_tc**3
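As an aside, the same design matrix can be built far more compactly (a sketch assuming to and tc as loaded above, keeping the same monomial order):

# each column is one monomial evaluated at the first ten (to, tc) pairs
t_o, t_c = to[:10], tc[:10]
a = np.column_stack([np.ones(10), t_o, t_c,
                     t_o**2, t_o*t_c, t_c**2,
                     t_o**3, t_c*t_o**2, t_o*t_c**2, t_c**3])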
Here are the values I use for the calculation:
('to:', array([ 15.,  10.,   5.,   0.,  -5., -10., -15., -20., -25., -30., -35.]))
('tc:', array([30., 35., 40., 45., 50., 55., 60., 65., 70., 80., 90.]))
('b', array([24., 31., 35., 36., 35., 33., 30., 25., 21., 18.]))
('y:', array([[24., 26., 27., 27., 26., 25., 23., 20., 18., 15., 13.],
              [30., 31., 31., 30., 29., 27., 24., 21., 18., 16., 14.],
              [35., 35., 35., 33., 31., 29., 26., 22., 19., 16., 15.],
              [40., 40., 38., 36., 33., 30., 27., 23., 20., 16., 15.],
              [45., 44., 41., 39., 35., 32., 28., 24., 20., 17., 16.],
              [49., 47., 44., 41., 37., 33., 29., 25., 20., 17., 16.],
              [53., 51., 47., 43., 39., 34., 30., 25., 21., 17., 16.],
              [57., 54., 50., 45., 40., 35., 30., 25., 21., 17., 16.],
              [61., 57., 52., 47., 41., 36., 31., 26., 21., 17., 16.],
              [64., 60., 54., 59., 53., 37., 32., 27., 22., 18., 19.],
              [67., 63., 56., 61., 55., 59., 34., 29., 24., 18., 19.]]))
('ges_y:', 49.0625)
Floating point arithmetic is leading you astray. If you look at the determinant of that matrix a, it's something incredibly small, like 1.551864434916621e-51. If you compute the determinant with the entries as integers (avoiding floating point rounding) you'll see it's actually 0, and the rank of your matrix is 5. So it's singular, and in general equations like ax = b may not have any solution.
Another quick way to see this: np.dot(a, np.linalg.inv(a)) is nowhere close to the identity matrix. Similarly, np.dot(a, c) is nowhere close to b.
There may or may not be an actual solution to ax = b, but np.linalg.lstsq(a, b) will get you an approximate (least-squares) solution in either case, if that's sufficient for your needs.
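A minimal sketch of those checks and the fallback, assuming a, b, and c from the question:

import numpy as np

print(np.linalg.det(a))          # ~1.55e-51, i.e. numerically zero
print(np.linalg.matrix_rank(a))  # 5 rather than 10: the system is singular
c, residuals, rank, sv = np.linalg.lstsq(a, b, rcond=None)  # least-squares solution
print(np.allclose(a @ c, b))     # may be False; lstsq only minimizes ||a @ c - b||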
