I have a long 121 element array where the data is stored in ascending order and I want to reshape to an 11x11 matrix and so I use the NumPy reshape command
Z = data.attributevalue[2,time,axial,:]
Z = np.reshape(Z, (int(math.sqrt(datacount)), int(math.sqrt(datacount))))
The data should be oriented in a Cartesian plane and I create the mesh grid with the following
x = np.arange(1.75, 12.5, 1)
y = np.arange(1.75, 12.5, 1)
X,Y = np.meshgrid(x, y)
The issue is that rows of Z are in the wrong order so the data in the last row of the matrix should be in the first and vice-versa. I want to rearrange so the rows are filled in the proper manner. The starting array Z is assembled in the following arrangement [datapoint #1, datapoint #2 ...., datapoint #N]. Datapoint #1 should be in the top left and the last point in the bottom right. Is there a simple way of accomplishing this or do I have to make a function to changed the order of the rows?
my plot statement is the following
surf = self.ax.plot_surface(X, Y, Z, rstride=1, cstride=1, cmap=cm.jet,
linewidth=1, antialiased=True)
***UPDATE****
I tried populating the initial array backwards and still no luck. I changed the orientation of the axis to the following
y = np.arrange(12.5,1,-1)
This flipped the data but my axis label is wrong so it is not a real solution to my issue. Any ideas?
It is possible that your original array does not look like a 1x121 array. The following code block shows how you reshape an array from 1x121 to 11x11.
import numpy as np
A = np.arange(1,122)
print A
print A.reshape((11,11))
Gives:
[ 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121]
[[ 1 2 3 4 5 6 7 8 9 10 11]
[ 12 13 14 15 16 17 18 19 20 21 22]
[ 23 24 25 26 27 28 29 30 31 32 33]
[ 34 35 36 37 38 39 40 41 42 43 44]
[ 45 46 47 48 49 50 51 52 53 54 55]
[ 56 57 58 59 60 61 62 63 64 65 66]
[ 67 68 69 70 71 72 73 74 75 76 77]
[ 78 79 80 81 82 83 84 85 86 87 88]
[ 89 90 91 92 93 94 95 96 97 98 99]
[100 101 102 103 104 105 106 107 108 109 110]
[111 112 113 114 115 116 117 118 119 120 121]]
Related
To select data for training and validation in my machine learning projects, I usually use numpys masking functionality. So a typical reoccuring block of code to select the indices for validation and test data looks like this:
import numpy as np
validation_split = 0.2
all_idx = np.arange(0,100000)
idxValid = np.random.choice(all_idx, int(validation_split * len(all_idx)))
idxTrain = np.setdiff1d(all_idx, idxValid)
Now the following should always be true:
len(all_idx) == len(idxValid)+len(idxTrain)
Unfortunately, I found out that somehow this is not always the case. As I inrease the number of elements that are chosen from the all_idx-array the resulting numbers do not add up properly. Here another standalone example which breaks as soon as I increase the number of randomly chosen validation indices above 1000:
import numpy as np
all_idx = np.arange(0,100000)
idxValid = np.random.choice(all_idx, 1000)
idxTrain = np.setdiff1d(all_idx, idxValid)
print(len(all_idx), len(idxValid), len(idxTrain))
This results in -> 100000, 1000, 99005
I am confused?! Please try yourself. I would be glad to understand this.
idxValid = np.random.choice(all_idx, 10, replace=False)
Careful, you need to indicate that you don't want to have duplicates in idxValid. To do so, you just have to had replace=False in np.random.choice
replace boolean, optional
Whether the sample is with or without replacement
Consider the following example:
all_idx = np.arange(0, 100)
print(all_idx)
>>> [ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95
96 97 98 99]
Now if you print out your validation dataset:
idxValid = np.random.choice(all_idx, int(validation_split * len(all_idx)))
print(idxValid)
>>> [31 57 55 45 26 25 55 76 33 69 49 90 46 14 18 30 89 73 47 82]
You can actually observe that there are duplicates in the resulting set and thus
len(all_idx) == len(idxValid)+len(idxTrain)
wouldn't result to True.
What you need to do is to make sure that np.random.choice does a sampling without replcacement by passing replace=False:
idxValid = np.random.choice(all_idx, int(validation_split * len(all_idx)), replace=False)
Now the results should be as expected:
import numpy as np
validation_split = 0.2
all_idx = np.arange(0, 100)
print(all_idx)
idxValid = np.random.choice(all_idx, int(validation_split * len(all_idx)), replace=False)
print(idxValid)
idxTrain = np.setdiff1d(all_idx, idxValid)
print(idxTrain)
print(len(all_idx) == len(idxValid)+len(idxTrain))
and the output is:
[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95
96 97 98 99]
[12 85 96 64 48 21 55 56 80 42 11 92 54 77 49 36 28 31 70 66]
[ 0 1 2 3 4 5 6 7 8 9 10 13 14 15 16 17 18 19 20 22 23 24 25 26
27 29 30 32 33 34 35 37 38 39 40 41 43 44 45 46 47 50 51 52 53 57 58 59
60 61 62 63 65 67 68 69 71 72 73 74 75 76 78 79 81 82 83 84 86 87 88 89
90 91 93 94 95 97 98 99]
True
Consider using train_test_split from scikit-learn which is straight-forward:
from sklearn.model_selection import train_test_split
train, test = train_test_split(df, test_size=0.2)
I want to perform a manual short time fourier transform. I have a simple time series in the form of a cosine wave. I want to perform a short time fourier transform by splitting up the time series into a number of evenly spaced segments that include overlap... how do i do that?
this is my time series:
fs = 10e3 # Sampling frequency
N = 1e5 # Number of samples
time = np.arange(N) / fs
x = np.cos(5*time) # Some random audio wave
# x.shape gives (100000,)
How do i split into say, 10 evenly spaced segments?
Here's one way to do this.
import numpy as np
def get_windows(n, Mt, olap):
# Split a signal of length n into olap% overlapping windows each containing Mt terms
ists = []
ieds = []
ist = 0
while 1:
ied = ist + Mt
if ied > n:
break
ists.append(ist)
ieds.append(ied)
ist += int(Mt * (1 - olap/100))
return ists, ieds
n = 100
x = np.arange(n)
ists, ieds = get_windows(n, Mt=20, olap=50) # windows of length 20 and 50% overlap
for ist, ied in zip(ists, ieds):
print(x[ist:ied])
result:
[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19]
[10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29]
[20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39]
[30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49]
[40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59]
[50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69]
[60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79]
[70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89]
[80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99]
If your data is relatively small and you are comfortable with storing all the windows in RAM, then you can continue as follows:
X = np.array([x[ist:ied] for ist, ied in zip(ists, ieds)])
# X.shape is (nwindows, Mt)
By doing this, you can generate W a windowing function (e.g. Hanning window) as a 1D array of shape (Mt, ), so that W*X will broadcast in a way so that W applies to each window in X.
I just noticed that the term "window" is used with two meanings in this context. Sorry for the confusion.
i am trying to convert code from matlab to python.
Can you please help me to convert this code from matlab to python?
in matlab code
z is list and z length is 121
z= 7.0502 5.8030 4.4657 3.0404 1.5416 0 -1.5416 -3.0404 -4.4657
-5.8030 -7.0502 7.5944 6.3059 4.8990 3.3662 1.7189 0 -1.7189 -3.3662 -4.8990 -6.3059 -7.5944 8.2427 6.9282 5.4611 3.8122 1.9735 0 -1.9735 -3.8122 -5.4611 -6.9282 -8.2427 9.0135 7.7027 6.2075 4.4590 2.3803 0 -2.3803 -4.4590 -6.2075 -7.7027 -9.0135 9.9185 8.6576 7.2038 5.4466 3.1530 0 -3.1530 -5.4466 -7.2038 -8.6576 -9.9185 10.9545 9.7980 8.4853 6.9282 4.8990 0 -4.8990 -6.9282 -8.4853 -9.7980 -10.9545 12.0986 11.0885 9.9947 8.8128 7.6119 -6.9282 -7.6119 -8.8128 -9.9947 -11.0885 -12.0986 13.3133 12.4632 11.5988 10.7649 10.0829 -9.7980 -10.0829 -10.7649 -11.5988 -12.4632 -13.3133 14.5583 13.8564 13.1842 12.5910 12.1612 -12.0000 -12.1612 -12.5910 -13.1842 -13.8564 -14.5583 15.8011 15.2238 14.6969 14.2594 13.9626 -13.8564 -13.9626 -14.2594 -14.6969 -15.2238 -15.8011 17.0207 16.5431 16.1227 15.7875 15.5684 -15.4919 -15.5684 -15.7875 -16.1227 -16.5431 -17.0207
Matlab code : [z,index]=sort(abs(z));
after the code
z = 0 0 0 0 0 0 1.5416 1.5416 1.7189 1.7189 1.9735 1.9735 2.3803 2.3803 3.0404 3.0404 3.1530 3.1530 3.3662 3.3662 3.8122 3.8122 4.4590 4.4590 4.4657 4.4657 4.8990 4.8990 4.8990 4.8990 5.4466 5.4466 5.4611 5.4611 5.8030 5.8030 6.2075 6.2075 6.3059 6.3059 6.9282 6.9282 6.9282 6.9282 6.9282 7.0502 7.0502 7.2038 7.2038 7.5944 7.5944 7.6119 7.6119 7.7027 7.7027 8.2427 8.2427 8.4853 8.4853 8.6576 8.6576 8.8128 8.8128 9.0135 9.0135 9.7980 9.7980 9.7980 9.9185 9.9185 9.9947 9.9947 10.0829 10.0829 10.7649 10.7649 10.9545 10.9545 11.0885 11.0885 11.5988 11.5988 12.0000 12.0986 12.0986 12.1612 12.1612 12.4632 12.4632 12.5910
12.5910 13.1842 13.1842 13.3133 13.3133 13.8564 13.8564 13.8564 13.9626 13.9626 14.2594 14.2594 14.5583 14.5583 14.6969 14.6969 15.2238 15.2238 15.4919 15.5684 15.5684 15.7875 15.7875 15.8011 15.8011 16.1227 16.1227 16.5431 16.5431 17.0207 17.0207
and index is
index = 6 17 28 39 50 61 5 7 16 18 27 29 38 40 4 8 49 51 15 19 26 30 37 41 3 9 14 20 60 62 48 52 25 31 2 10 36 42 13 21 24 32 59 63 72 1 11 47 53 12 22 71 73 35 43 23 33 58 64 46 54 70 74 34 44 57 65 83 45 55 69 75 82 84 81 85 56 66 68 76 80 86 94 67 77 93 95 79 87 92 96 91 97 78 88 90 98 105 104 106 103 107 89 99 102 108 101 109 116 115 117 114 118 100 110 113 119 112 120 111 121
so what is the [z,index] in python ?
Do you need to return the index? If you don't, you could use:
z = abs(z)
new_list = sorted(map(abs, z))
index = sorted(range(len(z)), key=lambda k: z[k])
where x is the output and z is the list.
EDIT:
Try that now
I have the following code in R and I need to write it in an optimal way in python using pandas. I wrote it but it takes a long time to run.
1) is there someone who can confirm that this is an equivalent of R code in python
2) how to write it in a pythonic way(optimal way)
in R
for (i in 1:dim(df1)[1])
df1$column1[i] <- sum(df2[i,4:33])
in Python
for i in range(df1.shape[0]):
df1['column1'][i] = df2.iloc[i,3:34].sum()
These are two ways to make the replacement
df1['column1'] = df2.iloc[:, 3:34].sum(axis=1)
OR
df1.loc[:, 'column1'] = df2.iloc[:, 3:34].sum(axis=1)
Use vectorized operations:
>>> df = pd.DataFrame(np.random.randint(0, 100, (10, 15)), columns=list('abcdefghijklmno'))
>>> df
a b c d e f g h i j k l m n o
0 71 93 12 32 17 23 35 57 26 89 4 29 28 83 30
1 98 78 75 0 61 81 8 17 93 71 48 47 72 52 11
2 13 62 93 48 31 23 42 66 77 99 59 1 40 72 87
3 7 5 5 43 83 19 59 36 18 96 50 60 46 45 54
4 32 69 93 6 7 12 15 49 29 11 37 83 75 97 84
5 52 53 43 61 93 85 91 99 65 62 35 89 55 77 62
6 44 7 41 56 40 11 39 91 87 46 95 48 30 75 16
7 93 15 63 23 14 20 7 33 29 31 41 40 82 0 16
8 46 63 59 59 81 51 34 41 89 68 20 64 95 70 74
9 33 58 49 91 51 46 43 83 37 53 47 32 42 12 59
Then simply:
>>> df['column1'] = df.iloc[:, 3:8].sum(axis=1)
>>> df
a b c d e f g h i j k l m n o column1
0 71 93 12 32 17 23 35 57 26 89 4 29 28 83 30 164
1 98 78 75 0 61 81 8 17 93 71 48 47 72 52 11 167
2 13 62 93 48 31 23 42 66 77 99 59 1 40 72 87 210
3 7 5 5 43 83 19 59 36 18 96 50 60 46 45 54 240
4 32 69 93 6 7 12 15 49 29 11 37 83 75 97 84 89
5 52 53 43 61 93 85 91 99 65 62 35 89 55 77 62 429
6 44 7 41 56 40 11 39 91 87 46 95 48 30 75 16 237
7 93 15 63 23 14 20 7 33 29 31 41 40 82 0 16 97
8 46 63 59 59 81 51 34 41 89 68 20 64 95 70 74 266
9 33 58 49 91 51 46 43 83 37 53 47 32 42 12 59 314
>>>
I am trying to rotate some images for data augmentation to train a network for image segmentation task. After searching a lot, the best candidate for rotating each image and its corresponding mask was to use the scipy.ndimage.rotate function, but the problem with this is that after rotating the mask image numpy array ( which includes only 0 and 255 values for pixel values) the rotated mask has got all the values from 0 to 255 while I expect the mask array to have only 0 and 255 as its pixel values.
Here is the code:
from scipy.ndimage import rotate
import numpy as np
ample = dataset[1]
print(np.unique(sample['image']))
print(np.unique(sample['mask']))
print(sample['image'].shape)
print(sample['mask'].shape)
rot_image = rotate(sample['image'], 60, reshape = False)
rot_mask = rotate(sample['mask'], 60, reshape = False)
print(np.unique(rot_image))
print(np.unique(rot_mask))
print(rot_image.shape)
print(rot_mask.shape)
Here are the results:
[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35
36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53
54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89
90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107
108 109 110 111 112 113 114 115 118 119 120 121 125 139]
[ 0 255]
(512, 512, 1)
(512, 512, 1)
[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35
36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53
54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89
90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107
108 109 110 111 112 113 115 117 118 124 125 132 135]
[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 17 18 20
24 25 26 28 31 34 35 38 39 41 42 43 45 46 48 49 50 51
52 58 59 62 66 67 68 73 75 76 79 80 82 85 86 88 90 96
98 101 108 109 111 114 116 118 119 123 124 125 127 128 130 138 140 142
146 148 151 156 157 158 161 164 165 166 168 169 176 180 184 185 188 189
194 196 197 198 199 201 203 204 205 207 208 210 211 213 216 217 218 219
220 221 222 225 228 229 230 231 233 234 235 237 239 240 241 242 243 244
245 246 247 248 249 250 251 252 253 254 255]
(512, 512, 1)
(512, 512, 1)
It seems to be a simple problem to rotate image array, but I'm searching for days and I didn't find any solution to this problem. I am really confused how to prevent mask array values( 0 and 255) to take all values from 0 to 255 after rotation. I mean something like this:
x = np.unique(sample['mask'])
rot_mask = rotate(sample['mask'], 30, reshape = False)
x_rot = np.unique(rot_mask)
print(np.unique(x - x_rot))
[ 0]
Since you are using numpy arrays to represent images, why not using numpy functions? This library has all sorts of array manipulations. Try the rot90 function