Python: Creating a multidimensional array of multidimensional zero arrays

Hello, I have the following question: I create zero arrays of shape (40, 30, 80). Now I need 7*7*7 of these zero arrays in an array. How can I do this?
One of my matrices is created like this:
import numpy as np
zeroMatrix = np.zeros((40,30,80))
My first approach was to put the zero matrices in a 7*7*7 list, but I want to have it all in a numpy array. I know there is a way with structured arrays, I think, but I don't know how. If I copy my 7*7*7 list with np.copy(), it creates a numpy array with the given shape, but there must be a way to do this directly, isn't there?
EDIT
Maybe I have to make my question clearer. I have a 7*7 list of my zero matrices. In a for loop, each of those arrays is modified. In another step, this temporary list is appended to an empty list that will have a length of 7 in the end (so I append the 7*7 list 7 times to the empty list). In the end I have a 7*7*7 list of those matrices. But I think this would be better if I had a numpy array of these zero matrices from the beginning.

Building an array of same-shaped arrays is not well supported by numpy, which prefers to create an array of maximum depth with elements of minimum depth instead.
It turns out that numpy.frompyfunc is quite useful in circumventing this tendency where it is unwanted.
In your specific case one could do:
result = np.frompyfunc(zeroMatrix.copy, 0, 1)(np.empty((7, 7, 7), object))
Indeed:
>>> result.shape
(7, 7, 7)
>>> result.dtype
dtype('O')
>>> result[0, 0, 0].shape
(40, 30, 80)
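Here the (7, 7, 7) object array is passed as the output operand of the zero-input ufunc, so zeroMatrix.copy() is called once per cell and every cell receives its own independent copy.
If every block keeps the same (40, 30, 80) shape anyway, a plain 6-dimensional zero array may be simpler than an object array of separate matrices. A minimal sketch, not part of the original answer:
import numpy as np
# One contiguous 6-D array of zeros instead of an object array
# holding 7*7*7 separate (40, 30, 80) blocks.
blocks = np.zeros((7, 7, 7, 40, 30, 80))
# Each block is then an ordinary view that can be modified in place:
blocks[0, 0, 0].shape   # (40, 30, 80)
blocks[0, 0, 0] += 1.0  # modifies only that block
The trade-off: vectorized operations then work across all blocks at once, at the cost of allocating the full array up front.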

Related

Update numpy array with sparse indices and values

I have a 1-dimensional numpy array and want to store sparse updates of it.
Say I have an array of length 500,000 and want to do 100 updates of 100 elements each. The updates either add to the values or just change them (I do not think it matters).
What is the best way to do it using numpy?
I wanted to just store two arrays, indices and values_to_add, and therefore have two objects: one stores the dense matrix, and the other just keeps the indices and values to add, so that I can do something like this with the dense matrix:
dense_matrix[indices] += values_to_add
And if I have multiple updates, I just concatenate them.
But this numpy syntax doesn't handle repeated indices well: the duplicates are simply ignored.
Merging an update that repeats an index into the (indices, values) pair is O(n). I thought of using a dict instead of an array to store the updates, which looks fine in terms of complexity, but it doesn't look like good numpy style.
What is the most expressive way to achieve this? I know about scipy sparse objects, but (1) I want pure numpy because (2) I want to understand the most efficient way to implement it.
If you have repeated indices you can use np.ufunc.at (here, np.add.at). From the documentation:
Performs unbuffered in place operation on operand ‘a’ for elements
specified by ‘indices’. For addition ufunc, this method is equivalent
to a[indices] += b, except that results are accumulated for elements
that are indexed more than once.
Code
import numpy as np

a = np.arange(10)
indices = [0, 2, 2]
np.add.at(a, indices, [-44, -55, -55])
print(a)
Output
[ -44    1 -108    3    4    5    6    7    8    9]
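To accumulate several batches of sparse updates, the asker's concatenation idea combines naturally with np.add.at. A small sketch (variable names are mine):
import numpy as np
dense = np.zeros(500000)
# Two batches of sparse updates, concatenated as suggested in the question.
idx1, val1 = np.array([10, 20, 10]), np.array([1.0, 2.0, 3.0])
idx2, val2 = np.array([20, 30]), np.array([4.0, 5.0])
indices = np.concatenate([idx1, idx2])
values = np.concatenate([val1, val2])
# Unlike dense[indices] += values, np.add.at accumulates over repeated indices.
np.add.at(dense, indices, values)
print(dense[10], dense[20], dense[30])  # 4.0 6.0 5.0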

Numpy-like slicing in Julia

In Python/Numpy I can slice arrays in this form:
arr = np.ones((3,4,5))
arr[2]
and the shape will be maintained:
(arr[2]).shape # prints (4, 5)
Which means that, if I want to keep the shape of the array, the following code works for N-dimensional arrays
arr = np.ones((3,4,5,2,2))
(arr[2]).shape # prints (4, 5, 2, 2)
This is great if I want to write functions that work for N-dim arrays while preserving their dimensionality.
In Julia, however, the same action does not preserve the structure:
arr = ones(3,4,5)
size(arr[3]) # prints () (0-dimensional)
size(arr[3,:]) # prints (20,)
because of partial linear indexing. So if I want to keep the original dimensions I need to write arr[3,:,:], which only works for 3D arrays. For a 4D array I would have to use arr[3,:,:,:], and so on; the code isn't general.
Furthermore, once you get to arrays of 5 dimensions or more (which is the case I'm working with now), this notation gets extremely cumbersome.
Is there any way I can write code like I do in Python and make it general? I couldn't even think of a nice clean way with reshape, let alone a way that's as clean as Python.
Notice that in Python the shape is only preserved if you slice the first dimension of the array. In Julia you can use slicedim(A, d, i) to slice dimension d of array A at index i.
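For example (a sketch; note that in Julia 0.7 and later, slicedim was replaced by selectdim, which returns a view):
arr = ones(3, 4, 5)
# Slice dimension 1 at index 3; the sliced dimension is dropped,
# matching NumPy's arr[2] behaviour.
size(slicedim(arr, 1, 3))   # (4, 5)
# Generalizes to any dimensionality without writing out the colons:
arr5 = ones(3, 4, 5, 2, 2)
size(slicedim(arr5, 1, 3))  # (4, 5, 2, 2)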

Interpreting numpy.where results

I'm confused by what the results of numpy.where mean, and how to use it to index into an array.
Have a look at the code sample below:
import numpy as np
a = np.random.randn(20,20,2)
indices = np.where(a[:,:,0] > 0.5)
I expect the indices array to be 2-dim and contain the indices where the condition is true. We can see that by
indices = np.array(indices)
indices.shape # (2,120)
So it looks like indices is acting on the flattened array somehow, but I'm not able to figure out exactly how. More confusingly,
a.shape # (20,20,2)
a[indices].shape # (2,120,20,2)
Question:
How does indexing my array with the output of np.where actually grow the size of the array? What is going on here?
You are basing your indexing on a wrong assumption: np.where returns something that can be immediately used for advanced indexing (it's a tuple of np.ndarrays). But you converted it to a numpy array (so it's now an np.ndarray of np.ndarrays).
So
import numpy as np
a = np.random.randn(20,20,2)
indices = np.where(a[:,:,0] > 0.5)
a[:,:,0][indices]
# If you do a[indices] the result would be different, I'm not sure what
# you intended.
gives you the elements that were found by np.where. If you convert indices to an np.array, it triggers another form of indexing (see this section of the numpy docs), and the warning message in the docs becomes very important. That's the reason why it increases the total size of your array.
Some additional information about what np.where returns: you get a tuple containing n arrays, where n is the number of dimensions of the input array. So the coordinates of the first element that satisfies the condition are [0][0], [1][0], ..., [n-1][0], not [0][0], [0][1], ..., [0][n-1]. In your case the shape (2, 120) means you have 2 dimensions and 120 found points.
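A short sketch of the tuple behaviour (variable names are mine):
import numpy as np
a = np.random.randn(20, 20, 2)
indices = np.where(a[:, :, 0] > 0.5)   # tuple of two 1-D index arrays
# Used as the tuple it is, it selects exactly the matching elements:
selected = a[:, :, 0][indices]
selected.shape                         # (k,) with one entry per True cell
# The coordinates of the first match are read "down the columns":
row, col = indices[0][0], indices[1][0]
a[row, col, 0] > 0.5                   # True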

How to declare a 2 dimensional array with different row lengths using np.array?

For example, I want a 2-row matrix, with a first row of length 1 and a second row of length 2. I could do:
import numpy as np

list1 = np.array([1])
list2 = np.array([2,3])
matrix = []
matrix.append(list1)
matrix.append(list2)
matrix = np.array(matrix, dtype=object)  # dtype=object is required in newer NumPy
I wonder if I could declare a matrix of this shape directly in the beginning of a program without going through the above procedure?
A matrix is by definition a rectangular array of numbers, and NumPy does not support arrays that do not have a rectangular shape. What your code produces instead is a one-dimensional object array containing two arrays of different lengths:
array([array([1]), array([2, 3])], dtype=object)
I don't really see what the purpose of this shape could be, and would advise you to simply use nested lists for whatever you are doing with this shape. Should you have found some use for this structure with NumPy, however, you can produce it much more idiomatically like this:
>>> np.array([list1, list2], dtype=object)
array([array([1]), array([2, 3])], dtype=object)
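For comparison, the nested-list alternative recommended above is a one-liner (a sketch):
matrix = [[1], [2, 3]]
matrix[1]       # [2, 3]
len(matrix[0])  # 1
len(matrix[1])  # 2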

How to create an ndarray with ndim==0 and size==0?

I am testing some edge cases of my program and observed a strange fact: when I create a scalar numpy array, it has size==1 and ndim==0.
>>> A=np.array(1.0)
>>> A.ndim # returns 0
>>> A.size # returns 1
But when I create an empty array with no elements, it has size==0 but ndim==1.
>>> A=np.array([])
>>> A.ndim # returns 1
>>> A.size # returns 0
Why is that? I would expect ndim to also be 0. Or is there another way to create a 'really' empty array with both size and ndim equal to 0?
UPDATE: even A = np.empty(shape=None) does not create a dimensionless array of size 0...
I believe the answer is that "No, you can't create an ndarray with both ndim and size of zero". As you've already found out yourself, the (ndim,size) pairs of (1,0) and (0,1) are as low as you can go.
This very nice answer explains a lot about numpy scalar types and why they're a bit odd to have around. It makes clear that scalar numpy arrays like array(1) are a very special kind of beast: they hold a single value (hence size==1), but by definition they have no sense of dimensionality, hence ndim==0. Non-scalar numpy arrays, on the other hand, can be empty, but they contain at least one pair of square brackets, leading to a minimal ndim of 1, even though their size can be 0 if they are built from empty lists. (This is how I think about the situation: ndarrays are in a way lists of lists of lists of ..., on as many levels as there are dimensions. 1d arrays are compatible with lists, so an empty list, being still a list, also has a defining dimension.)
The only way to come up with an empty scalar would be to call np.array() with no argument at all, but arrays can only be initialized from some actual object, so this raises a TypeError. I believe your program is safe from this edge case.
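A quick empirical check of that (ndim, size) floor (my own sketch):
import numpy as np
np.array(1.0).ndim, np.array(1.0).size        # (0, 1)
np.array([]).ndim, np.array([]).size          # (1, 0)
# Any shape containing a 0 keeps ndim >= 1 while size == 0:
np.empty((3, 0)).ndim, np.empty((3, 0)).size  # (2, 0)
# A 0-d array always holds exactly one element:
np.empty(()).ndim, np.empty(()).size          # (0, 1)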
