Why can't numpy remove this useless dimension? - python

No matter what I do to this array:
data = np.mean(np.mat(segment_data), axis=0)
print(data)
print(data.shape)
print(data[0].shape)
print(data[0,:].shape)
print(data.squeeze().shape)
print(data.flatten().shape)
print(data.transpose().shape)
print(data.transpose()[:,0].shape)
The output is still two-dimensional:
[[-0.48134436 13.09216948 10.63232405 10.6977263 11.95639315 13.83434023
13.61501793 8.21932062 8.93592935 26.15871746 58.73205665]]
(1, 11)
(1, 11)
(1, 11)
(1, 11)
(1, 11)
(11, 1)
(11, 1)
What is happening? Why does numpy refuse to give me a 1-dimensional array?

You specifically used numpy.matrix, which refuses to be 1-dimensional. Don't use numpy.matrix! Remove that np.mat call.
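A minimal sketch of the difference, assuming a small made-up `segment_data` (the real data is not shown in the question). `np.asmatrix` is used here as the equivalent spelling of `np.mat`:

```python
import numpy as np

# Hypothetical stand-in for the question's segment_data.
segment_data = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]

# A numpy.matrix stays 2-D through mean, squeeze, flatten and indexing.
as_matrix = np.mean(np.asmatrix(segment_data), axis=0)
matrix_shape = as_matrix.shape
squeezed_shape = as_matrix.squeeze().shape

# A plain ndarray drops the reduced axis, giving a 1-D result.
as_array = np.mean(np.asarray(segment_data), axis=0)

# If a matrix already exists, convert it to an ndarray first, then flatten.
flattened = np.asarray(as_matrix).ravel()

print(matrix_shape, squeezed_shape, as_array.shape, flattened.shape)
```

If the matrix comes from code you cannot change, converting with `np.asarray` before reshaping is usually enough.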

Related

Indices of all values in an array

I have a matrix A. I would like to generate the indices of all the values in this matrix.
A=np.array([[1,2,3],[4,5,6],[7,8,9]])
The desired output should look like:
[(0,0),(0,1),(0,2),(1,0),(1,1),(1,2),(2,0),(2,1),(2,2)]
You can use:
from itertools import product
list(product(*map(range, A.shape)))
This outputs:
[(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)]
Explanation:
A.shape gives the dimensions of the array. For each dimension, we create a range() that generates all of the numbers between 0 and the length of a given dimension. We use map() to perform this for each dimension of the array. Finally, we unpack all of these ranges into the arguments of itertools.product() to create the Cartesian product among all these ranges.
Notably, the use of list unpacking and map() means that this approach can handle ndarrays with an arbitrary number of dimensions. At the time of posting this answer, all of the other answers cannot be immediately extended to a non-2D array.
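To illustrate the claim about arbitrary dimensions, here is a short sketch on a 3-D array. The array `B` and its shape are made up for the example; the result is compared against `np.ndindex`, which yields the same index tuples in the same order:

```python
from itertools import product

import numpy as np

# Hypothetical 3-D array; only its shape matters here.
B = np.zeros((2, 1, 2))

# One range per dimension, combined into a Cartesian product.
indices = list(product(*map(range, B.shape)))
print(indices)
# [(0, 0, 0), (0, 0, 1), (1, 0, 0), (1, 0, 1)]

# np.ndindex iterates over the same index tuples.
same_as_ndindex = indices == list(np.ndindex(B.shape))
```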
This should work.
indices = []
for i in range(len(A)):
    for j in range(len(A[i])):
        indices.append((i, j))
Here's a way of doing it with itertools.combinations:
from itertools import combinations
sorted(set(combinations(tuple(range(A.shape[0])) * 2, 2)))
combinations picks two elements from the repeated tuple and pairs them. This produces duplicates, so the result is converted to a set to remove them and then sorted. Note that this only uses A.shape[0], so it is only correct for square 2D arrays.
This line of list comprehension works. It probably isn't as fast as using itertools, but it does work.
[(i,j) for i in range(len(A)) for j in range(len(A[i]))]
Using numpy only you can take advantage of ndindex
list(np.ndindex(A.shape))
or unravel_index:
list(zip(*np.unravel_index(np.arange(A.size), A.shape)))
Output:
[(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)]
NB. The second option lets you pass an order='C' (row-major) or order='F' (column-major) parameter to get the coordinates in a different order.
Example on A = np.array([[1,2,3],[4,5,6]])
order='C' (default):
[(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]
order='F':
[(0, 0), (1, 0), (0, 1), (1, 1), (0, 2), (1, 2)]

TypeError: Argument has incorrect type (expected numpy.ndarray, got numpy.bool_)

I am now working on Cython where I need to define the NumPy array type and dimension before running the code. I have these coordinates (x,y) as input:
list_a = [(2, 16), (24, 26)]
list_b = [(18, 8), (30, 22)]
Since I use NumPy, I transform the list into a NumPy array:
arr_a = numpy.array([(2, 16), (24, 26)])
arr_b = numpy.array([(18, 8), (30, 22)])
This is the code compiled in Cython that I use:
%%cython
import numpy
cimport numpy
cimport cython
ctypedef numpy.int_t DTYPE_t
def do_iter(numpy.ndarray[DTYPE_t,ndim=2] arr_a, numpy.ndarray[DTYPE_t,ndim=2] arr_b):
    for a in arr_a:
        for b in arr_a:
            if a != b:
                for i in arr_b:
                    for j in arr_b:
                        if i != j:
                            print(a,b,i,j)
I expect the following output:
(2, 16) (24, 26) (18, 8) (30, 22)
(2, 16) (24, 26) (30, 22) (18, 8)
(24, 26) (2, 16) (18, 8) (30, 22)
(24, 26) (2, 16) (30, 22) (18, 8)
At first, I got the following error:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Then I changed the call to:
do_iter(arr_a.any(),arr_b.any())
But it produced another error:
TypeError: Argument 'arr_a' has incorrect type (expected numpy.ndarray, got numpy.bool_)
I suspect that I defined the wrong NumPy type or dimension. Any help is appreciated, thanks!
To solve the first error you got, you have to wrap the comparisons with .any() rather than changing the input.
...
if (a!=b).any():
...
if (i!=j).any():
...
The error occurs because a != b is an element-wise comparison of two length-2 arrays, so it returns something like (True, False) rather than a single boolean. If the condition should be True when either of the two elements differs, use .any() as above. If it should be True only when both elements differ, use .all() instead.
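As a rough illustration, here is a plain Python/NumPy version of the loop with the `.any()` fix applied. It is a sketch, not the compiled Cython function: the typed signature is dropped, and the rows are collected in a list (the name `rows` is introduced here) instead of being printed, so the result is easy to inspect:

```python
import numpy as np

arr_a = np.array([(2, 16), (24, 26)])
arr_b = np.array([(18, 8), (30, 22)])


def do_iter(arr_a, arr_b):
    """Collect every ordered pair of distinct rows from arr_a and arr_b."""
    rows = []
    for a in arr_a:
        for b in arr_a:
            # a != b is element-wise; .any() reduces it to one boolean.
            if (a != b).any():
                for i in arr_b:
                    for j in arr_b:
                        if (i != j).any():
                            rows.append(tuple(tuple(r.tolist()) for r in (a, b, i, j)))
    return rows


rows = do_iter(arr_a, arr_b)
for row in rows:
    print(*row)
```

The arrays themselves are passed in unchanged; only the comparisons inside the loop are reduced with `.any()`.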

How to create a numpy array from two lists of tuples, but only when the tuples are the same

For image analysis I loaded a float image with SciPy's imread.
Next, I had SciPy's argrelmax search for local maxima along axes 0 and 1 and stored the results as arrays of tuples.
data = msc.imread(prediction1, 'F')
datarelmax_0 = almax(data, axis = 0)
datarelmax_1 = almax(data, axis = 1)
How can I create a NumPy array from both lists that contains only the tuples present in both lists?
Edit:
argrelmax creates a tuple with two arrays:
datarelmax_0 = ([1,2,3,4,5],[6,7,8,9,10])
datarelmax_1 = ([11,2,13,14,5], [11,7,13,14,10])
I want to create a NumPy array that looks like:
result_ar[(2,7),(5,10)]
How about this "naive" way?
import numpy as np
result = np.array([x for x in datarelmax_0 if x in datarelmax_1])
Pretty simple. Maybe there's a better/faster/fancier way by using some numpy methods but this should work for now.
EDIT:
To answer your edited question, you can do this:
result = [x for x in zip(datarelmax_0[0], datarelmax_0[1]) if x in zip(datarelmax_1[0], datarelmax_1[1])]
This gives you
result = [(2, 7), (5, 10)]
If you convert it to a numpy array by using
result = np.array(result)
it looks like this:
result = array([[ 2, 7],
[ 5, 10]])
In case you are interested in what zip does:
>>> list(zip(datarelmax_0[0], datarelmax_0[1]))
[(1, 6), (2, 7), (3, 8), (4, 9), (5, 10)]
>>> list(zip(datarelmax_1[0], datarelmax_1[1]))
[(11, 11), (2, 7), (13, 13), (14, 14), (5, 10)]
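The list comprehension above rebuilds and scans the second zip for every element. One possible alternative, sketched here with the sample data from the question, is a set intersection of the coordinate pairs. The names `pairs_0`, `pairs_1` and `common` are introduced for the example, and sorting is only there to make the order deterministic:

```python
import numpy as np

# Sample data from the question's edit.
datarelmax_0 = ([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
datarelmax_1 = ([11, 2, 13, 14, 5], [11, 7, 13, 14, 10])

# Turn each (rows, cols) pair of sequences into a set of coordinate tuples.
pairs_0 = set(zip(*datarelmax_0))
pairs_1 = set(zip(*datarelmax_1))

# Keep only the coordinates present in both, in sorted order.
common = sorted(pairs_0 & pairs_1)
result = np.array(common)
print(result)
```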

Getting positions of specific values of 2d array

I need some help finding the coordinates of all values in a 2D array that satisfy a specific condition.
At first, I tried converting my 2D array to a 1D array and recording the position in the 1D array, but it seems difficult to recover the right position, and not very "safe", when I convert back to 2D.
Is it possible to find these positions without the 1D transformation?
Thanks for your help!
As example :
import numpy as np
test2D = np.array([[ 3051.11, 2984.85, 3059.17],
[ 3510.78, 3442.43, 3520.7 ],
[ 4045.91, 3975.03, 4058.15],
[ 4646.37, 4575.01, 4662.29],
[ 5322.75, 5249.33, 5342.1 ],
[ 6102.73, 6025.72, 6127.86],
[ 6985.96, 6906.81, 7018.22],
[ 7979.81, 7901.04, 8021. ],
[ 9107.18, 9021.98, 9156.44],
[ 10364.26, 10277.02, 10423.1 ],
[ 11776.65, 11682.76, 11843.18]])
a,b = test2D.shape
test1D = np.reshape(test2D,(1,a*b))
positions=[]
for i in range(test1D.shape[1]):
    if test1D[0,i] > 5000.:
        positions.append(i)
print(positions)
So for this example my input is the 2D array "test2D" and I want all coordinates that satisfy the condition > 5000 as a list.
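If you want positions, use something like
positions = list(zip(*np.where(test2D > 5000.)))
See the documentation for numpy.where. The list() call is needed on Python 3, where zip returns an iterator.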
This will return
In [15]: zip(*np.where(test2D > 5000.))
Out[15]:
[(4, 0),
(4, 1),
(4, 2),
(5, 0),
(5, 1),
(5, 2),
(6, 0),
(6, 1),
(6, 2),
(7, 0),
(7, 1),
(7, 2),
(8, 0),
(8, 1),
(8, 2),
(9, 0),
(9, 1),
(9, 2),
(10, 0),
(10, 1),
(10, 2)]
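As a possible alternative to zipping the output of `np.where`, `np.argwhere` returns the matching coordinates directly as an array with one (row, column) pair per row. A small sketch on a made-up array (`grid` is a shortened stand-in for `test2D`):

```python
import numpy as np

# Shortened, hypothetical stand-in for test2D.
grid = np.array([[3051.11, 2984.85],
                 [5322.75, 4249.33],
                 [6102.73, 6025.72]])

# One row per match, columns are (row index, column index).
coords = np.argwhere(grid > 5000.)

# Convert to a list of plain Python tuples if a list is preferred.
positions = [tuple(p) for p in coords.tolist()]
print(positions)
# [(1, 0), (2, 0), (2, 1)]
```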
In general, with NumPy arrays you can index directly with a boolean condition. For example, test2D > 5000 returns a boolean array with the same shape as test2D, and test2D[test2D > 5000] returns the values where the condition is true. The same boolean array can also index any other array of the same shape, so in many cases you do not need the positions at all.
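A brief sketch of boolean-mask indexing, using two small made-up arrays of the same shape (`values` and `labels` are hypothetical names):

```python
import numpy as np

# Hypothetical data: two arrays with the same shape.
values = np.array([[1.0, 6000.0],
                   [7000.0, 2.0]])
labels = np.array([["a", "b"],
                   ["c", "d"]])

# Boolean array, True where the condition holds.
mask = values > 5000.

# The mask selects matching elements from any array of that shape.
selected_values = values[mask]
selected_labels = labels[mask]
print(selected_values, selected_labels)
```

The selected elements come back as a 1-D array in row-major order.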

Visually Representing X and Y Values

I have a list of (x,y) values that are in a list like [(x,y),(x,y),(x,y)....]. I feel like there is a solution in matplotlib, but I couldn't quite get there because of my formatting. I would like to plot it as a histogram or line plot. Any help is appreciated.
You can quite easily convert a list of (x, y) tuples into two tuples of x- and y-coordinates using zip with the * ('splat') operator:
>>> list(zip(*[(0, 0), (1, 1), (2, 4), (3, 9)]))
[(0, 1, 2, 3), (0, 1, 4, 9)]
And then, you can use the * operator again to unpack those arguments into plt.plot
>>> plt.plot(*zip(*[(0, 0), (1, 1), (2, 4), (3, 9)]))
or even plt.bar
>>> plt.bar(*zip(*[(0, 0), (1, 1), (2, 4), (3, 9)]))
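If the nested unpacking is hard to read, the same idea can be written in two steps. This sketch only does the data transformation; `points` is a made-up list, and the plotting call is left as a comment that assumes matplotlib is imported as `plt`:

```python
# Hypothetical list of (x, y) pairs.
points = [(0, 0), (1, 1), (2, 4), (3, 9)]

# zip(*points) groups all first elements together and all second elements together.
xs, ys = zip(*points)
print(xs, ys)
# (0, 1, 2, 3) (0, 1, 4, 9)

# The two tuples can then be passed on explicitly, e.g. plt.plot(xs, ys)
# or plt.bar(xs, ys), assuming `import matplotlib.pyplot as plt`.
```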
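Perhaps you could try something like this:
import numpy as np
import matplotlib.pyplot as plt

xs = []; ys = []
# split the (x, y) pairs into separate coordinate lists
for x, y in xy_list:
    xs.append(x)
    ys.append(y)
xs = np.asarray(xs)
ys = np.asarray(ys)
plt.plot(xs, ys, 'ro')  # 'ro' draws red circles, one per point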
Maybe not the most elegant solution, but it should work. Cheers, Trond
