Subtraction on numpy array not working - python

I have a 4D numpy array. I am trying to normalize its values, and for that I need to subtract a constant from it, but the operation seems to add the value instead.
Please help
print(X_train.shape)
print(X_train[0][0][0])
print(X_train[0][0][0]-128)
Its output is:
(34799, 32, 32, 3)
[28 25 24]
[156 153 152]
Shouldn't it be?
[-100, -103, -104]
Please let me know what I am doing wrong.
I am new to numpy.

The fact that it's a 4-dimensional array is not the point here.
I guess that your problem is with the data type of that numpy array. For example, if it's numpy.uint8 (unsigned byte, which only holds values in [0, 255]), arithmetic wraps around modulo 256, so subtracting 128 from 28 gives -100 + 256 = 156... :)
Try print(X_train.dtype) to see the data type of your numpy array.
If that's the case, consider converting it to a wider signed dtype before subtracting, e.g. X_train = X_train.astype(numpy.int16). numpy.int8 is only an option if every value already fits in [-128, 127]; larger values would wrap around during the conversion.
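A minimal sketch of the wraparound, assuming the dtype really is numpy.uint8 as guessed above. The name pixel is a hypothetical stand-in for X_train[0][0][0] and is not taken from the asker's code:

```python
import numpy as np

# Stand-in for X_train[0][0][0]; the uint8 dtype is an assumption
pixel = np.array([28, 25, 24], dtype=np.uint8)

# uint8 arithmetic wraps modulo 256, so 28 - 128 becomes 156
wrapped = pixel - 128
print(wrapped.dtype, wrapped)

# Converting to a signed type first gives the expected negative values
signed = pixel.astype(np.int16) - 128
print(signed.dtype, signed)
```

If memory matters for the full (34799, 32, 32, 3) array, int16 doubles the footprint of uint8, and a float type is another common choice when the next step is scaling.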

Related

could not broadcast input array of shape (what I want) to (what I don't want)

I'm writing code that reads monthly data (28, 29, 30 or 31 files per month): we read the files and get a shape (31,24) array for each variable (9 variables total).
Then we take the mean and standard deviation over the 31 rows and stack them as written below, so that the resulting table has shape (9,2,24).
table9_2_24 = np.array([np.array([daymean1,daystdev1]),
np.array([daymean2,daystdev2]),
np.array([daymean3,daystdev3]),
np.array([daymean4,daystdev4]),
np.array([daymean5,daystdev5]),
np.array([daymean6,daystdev6]),
np.array([daymean7,daystdev7]),
np.array([daymean8,daystdev8]),
np.array([daymean9,daystdev9])])
This code worked without a problem for the January dataset, but then when I ran the exact same thing for the February dataset, I get the error:
could not broadcast input array from shape (2,24) into shape (2,)
which is tremendously infuriating as I don't even WANT the array to be broadcast into (2,), and in fact shape (2,24) was exactly what I wanted. Does anyone know how to resolve this?
Ps. using np.ndarray instead of np.array didn't work, I get :
only integer scalar arrays can be converted to a scalar index
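The thread does not show the February data, so the cause can only be guessed. One plausible explanation is that one of the mean or standard deviation arrays does not have length 24 for that month, which makes the list ragged and stops np.array from stacking it. The sketch below only reports shapes so the odd one out is visible. The helper report_shapes and the stand-in data are hypothetical and not part of the original code:

```python
import numpy as np

def report_shapes(pairs):
    """Return the shapes of every (mean, stdev) pair so mismatches stand out."""
    return [(np.shape(mean), np.shape(stdev)) for mean, stdev in pairs]

# Hypothetical stand-in data: nine variables, 24 hourly values each
pairs = [(np.zeros(24), np.ones(24)) for _ in range(9)]
# Assumed fault for illustration: the third variable is missing one hour
pairs[2] = (np.zeros(23), np.ones(24))

shapes = report_shapes(pairs)
bad = [i for i, (m, s) in enumerate(shapes) if m != (24,) or s != (24,)]
print("pairs with unexpected shapes:", bad)
```

With the real data, the pairs list would be built from daymean1, daystdev1 and so on. Once every pair has shape (24,), the stacked table has shape (9,2,24) again.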

How to map np.array values?

I have a numpy array of the following shape - (363, 640, 4),
with the following values - [ 67 219 250 255]
I want to map this array to one of the same size (363,640), but with every value set to the integer 127.
I have tried to use numpy.vectorize without success; it returns None for the whole array.
Is there a way to do it?
Thank you.
I do not fully understand the question, so I'm gonna guess you want to create an array, which we'll call B, with the same shape as the array you are showing, which we'll call A.
So I'm assuming you want B to have the same shape as A, but filled with 127? Correct me if that's not the case, but what I'm guessing would look something like this:
import numpy as np
A = np.zeros((363, 640, 4)) # Just to initialize the array to something with your shape
B = np.full(np.shape(A), 127) # Returns an array of shape A filled with 127
Tell me if that is what you expected, if not, please could you detail a bit more the issue?
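Two variations, as a sketch that makes no claim about what the asker intended: np.full_like keeps A's shape and dtype, and slicing the shape gives the 2-D (363, 640) result mentioned in the question. The uint8 dtype of the placeholder A is an assumption:

```python
import numpy as np

# Placeholder with the question's shape; the uint8 dtype is an assumption
A = np.zeros((363, 640, 4), dtype=np.uint8)

# Same shape and dtype as A, every element set to 127
B_same = np.full_like(A, 127)

# One value per pixel: drop the last (channel) axis from the shape
B_2d = np.full(A.shape[:2], 127)

print(B_same.shape, B_2d.shape)
```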

Stack dimensions of numpy array

I have a numpy array of shape (2500, 16, 32, 24), and I want to make it into a ( something, 24) array, but I don't want numpy to shuffle my values. The 32 x 24 dimension at the end represent images and I want the corresponding elements to be consistent. Any ideas?
EDIT: Ok , I wasn't clear enough. (something, 24) = (1280000, 24).
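Use arr.reshape(-1, arr.shape[-1]) or, if you know the last dimension will be 24, arr.reshape(-1, 24). Reshaping keeps the elements in their original (row-major) order, so nothing is shuffled.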
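A small sketch of what that reshape does, using a (2, 3, 4, 5) array as a stand-in for the (2500, 16, 32, 24) one so the result is easy to inspect:

```python
import numpy as np

# Small stand-in for the (2500, 16, 32, 24) array
arr = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)

# Collapse all leading axes, keep the last axis intact
flat = arr.reshape(-1, arr.shape[-1])
print(flat.shape)

# Row k of flat is arr[i, j, r] with k = (i * 3 + j) * 4 + r
i, j, r = 1, 2, 3
k = (i * 3 + j) * 4 + r
print(np.array_equal(flat[k], arr[i, j, r]))
```

For the original array the same call gives 2500 * 16 * 32 = 1280000 rows of length 24, with each image's rows still adjacent and in order.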

What does -1 in numpy reshape mean? [duplicate]

I have a numpy array (A) of shape = (100000, 28, 28)
I reshape it using A.reshape(-1, 28*28)
This is very common use in Machine learning pipelines.
How does this work ? I have never understood the meaning of '-1' in reshape.
An exact question is this
But no solid explanation. Any answers pls ?
In numpy, creating a 100x100 matrix looks like this:
import numpy as np
x = np.zeros((100, 100))  # np.ndarray((100, 100)) also works but leaves the contents uninitialized
x.shape # outputs: (100, 100)
numpy internally stores all 10000 items in a flat buffer of 10000 items regardless of the object's shape. This allows us to reshape the array into any dimensions, as long as the number of items does not change.
For example, reshaping our object to 10x1000 is fine, as we keep the 10000 items:
x = x.reshape(10, 1000)
Reshaping to 10x2000 won't work, as we do not have enough items in the array:
x.reshape(10, 2000)
ValueError: total size of new array must be unchanged
So, back to the -1 question: it is the notation for an unknown dimension, meaning:
let numpy fill in the missing dimension with the correct value so that the array keeps the same number of items.
So this:
x = x.reshape(10, 1000)
is equivalent to this:
x = x.reshape(10, -1)
Internally, numpy just calculates 10000 / 10 to get the missing dimension.
The -1 can also be at the start of the shape or in the middle.
The two examples above are equivalent to this:
x = x.reshape(-1, 1000)
If we try to mark two dimensions as unknown, numpy raises an exception, because there is more than one way to reshape the array and it cannot know which one we mean:
x = x.reshape(-1, -1)
ValueError: can only specify one unknown dimension
It means that the size of the dimension for which you passed -1 is inferred. Thus,
A.reshape(-1, 28*28)
means, "reshape A so that its second dimension has a size of 28*28 and calculate the correct size of the first dimension".
See documentation of reshape.
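The points from both answers can be checked in one short sketch. It uses 1000 images instead of 100000 to keep it light, and the variable names are illustrative only:

```python
import numpy as np

# Smaller stand-in for the (100000, 28, 28) array from the question
A = np.zeros((1000, 28, 28))

# -1 is inferred as 1000 because 1000 * 28 * 28 / 784 == 1000
flat = A.reshape(-1, 28 * 28)

# Equivalent: fix the first dimension and let numpy infer 784
same = A.reshape(A.shape[0], -1)

# Two unknown dimensions are ambiguous, so numpy raises ValueError
try:
    A.reshape(-1, -1)
    error = None
except ValueError as exc:
    error = exc

print(flat.shape, same.shape, type(error).__name__)
```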

assigning different weights to every numpy column

I have the following numpy array:
from sklearn.decomposition import PCA
from sklearn.preprocessing import normalize
import numpy as np
# NumPy array comprising associate metrics
# i.e. Open TA's, Open SR's, Open SE's
associateMetrics = np.array([[11, 28, 21],
[27, 17, 20],
[19, 31, 3],
[17, 24, 17]]).astype(np.float64)
print("raw metrics=", associateMetrics)
Now, I want to assign different weights to every column in the above array and later normalize it. For example, let's say I want to give the 1st column a higher weight by multiplying it by 5, multiply column 2 by 3, and the last column by 2.
How do I do this in Python? Sorry, I'm a bit new to Python and numpy.
I have tried this for just one column, but it won't work:
# Assign weights to metrics
weightedMetrics = associateMetrics
np.multiply(2, weightedMetrics[:,0])
print("weighted metrics=", weightedMetrics)
You should make use of numpy's array broadcasting. This means that lower-dimensional arrays can be automatically expanded to perform a vectorized operation with an array of higher (but compatible) dimensions. In your specific case, you can multiply your (4,3)-shaped array with a 1d weight array of shape (3,) and obtain what you want:
weightedMetrics = associateMetrics * np.array([5,3,2])
The trick is that you can imagine numpy ndarrays to have leading singleton dimensions, along which broadcasting is automatic. By this I mean that your 1d numpy weight array of shape (3,) can be thought to have a leading singleton dimension (but only from the point of view of broadcasting!). And it's easy to see how the array of shape (4,3) and (1,3) should be multiplied: each element of the latter has to be used for full columns of the former.
In the very general case, you can even use arithmetic operations on, say, an array of shape (3,1,3,1,4) and one of shape (2,3,4,4). What's important is that the dimensions that meet (aligned from the right) either agree or one of the arrays has a singleton dimension at that place; one of the arrays is allowed to be longer (in the front).
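Both cases described above can be checked with a short sketch. The weights 5, 3 and 2 are the ones from the question, and the names weights, a, b and result_shape are illustrative:

```python
import numpy as np

associateMetrics = np.array([[11, 28, 21],
                             [27, 17, 20],
                             [19, 31, 3],
                             [17, 24, 17]], dtype=np.float64)

# One weight per column; shape (3,) broadcasts against shape (4, 3)
weights = np.array([5, 3, 2])
weightedMetrics = associateMetrics * weights
print("weighted metrics=", weightedMetrics)

# The general case: shapes are aligned from the right
a = np.ones((3, 1, 3, 1, 4))
b = np.ones((2, 3, 4, 4))
result_shape = (a * b).shape
print(result_shape)
```

The attempt in the question printed unchanged values because np.multiply returns a new array and the result was never assigned to anything.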
I found my answer. This is what I used:
print("weighted metrics=", np.multiply([ 1, 2, 3], associateMetrics))
