Python list of numpy matrices behaving strangely - python

I am trying to work with lists of numpy matrices and am encountering an annoying problem.
Let's say I start with a list of ten 2x2 zero matrices
para=[numpy.matrix(numpy.zeros((2,2)))]*(10)
I access individual matrices like this
para[0]
para[1]
and so on. So far so good.
Now, I want to modify the first row of the second matrix only, leaving all the others unchanged. So I do this
para[1][0]=numpy.matrix([[1,1]])
The first index points to the second matrix in the list and the second index points to the first row in that matrix, replacing it with [1,1].
But strangely enough, this command changes the first row of ALL ten matrices in the list to [1,1] instead of just the second one like I wanted. What gives?

When you multiply the initial list by 10, you end up with a list of 10 references to the same underlying matrix. Modifying one modifies all of them because there is in fact only one numpy matrix, not 10.
If you need proof, check out this example in the REPL:
>>> import numpy
>>> a = [numpy.zeros(10)]*10
>>> a[0] is a[1]
True
The is operator checks whether both operands are in fact the same object (not whether they are equal in value).
What you should do is use a list comprehension to generate your initial arrays instead of a multiplication, like so:
para=[numpy.matrix(numpy.zeros((2,2))) for i in range(10)]
That will call numpy.matrix() ten times instead of just once and generate 10 distinct matrices.
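To make the difference concrete, here is a minimal sketch that applies the same row assignment to both kinds of list and counts how many matrices end up changed. The variable names (shared, distinct, and the two counters) are chosen only for this illustration.

```python
import numpy

# List multiplication repeats one reference ten times.
shared = [numpy.matrix(numpy.zeros((2, 2)))] * 10
shared[1][0] = numpy.matrix([[1, 1]])
shared_changed = sum(1 for m in shared if m[0, 0] == 1)

# A list comprehension builds ten separate matrices.
distinct = [numpy.matrix(numpy.zeros((2, 2))) for _ in range(10)]
distinct[1][0] = numpy.matrix([[1, 1]])
distinct_changed = sum(1 for m in distinct if m[0, 0] == 1)

print(shared_changed, distinct_changed)  # 10 1
```

With the multiplied list every entry sees the change; with the comprehension only the second matrix does.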

Related

Replacing new vector after it gets empty on Python

I have an original vector. I would like to put its first 3 elements into a new vector, do some math on them, and store the resulting elements in another new vector. Then I want to delete those first 3 elements from the original vector and repeat this exact procedure until the original vector is empty.
This is what I have done so far
OR=np.array([1,2,3,4,5,6])
new=OR[0:3]
while (True):
    tran=-2*c_[new]
    OR= delete(OR, [0,1,2])
    new=OR[0:3]
    if (OR==[]):
        break
However it is not working out properly, do you have any suggestions?
Not sure what c_ is in your code, but regardless, since numpy arrays have a fixed size, you can't remove or add elements in place. Deleting elements creates a new array without those elements, which is wasteful inside a loop. I think you should either use a Python deque (collections.deque), which has fast pop methods for removing one element from the front/end, or just iterate over the original numpy array in chunks, for example like this:
import numpy as np

def modify_array(arr):
    # your code for modifying the array here
    return -2 * arr  # placeholder: the -2 scaling from the question

result = []
original_array = np.arange(1, 10)
# walk over the array in consecutive chunks of three elements
for idx in range(0, len(original_array), 3):
    result.append(modify_array(original_array[idx:idx+3]))
result = np.concatenate(result)
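For comparison, the deque option mentioned above could look roughly like the sketch below. It assumes the "math" step is the -2 scaling from the question; the names original, chunks and result_from_deque are made up for this example.

```python
from collections import deque

import numpy as np

original = deque([1, 2, 3, 4, 5, 6])
chunks = []
while original:
    # popleft() removes one element from the front without copying the rest
    chunk = [original.popleft() for _ in range(min(3, len(original)))]
    chunks.append(-2 * np.array(chunk))
result_from_deque = np.concatenate(chunks)

print(result_from_deque.tolist())  # [-2, -4, -6, -8, -10, -12]
```

The loop ends naturally once the deque is empty, so no explicit emptiness comparison against [] is needed.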

Getting a single array containing several sub-arrays iteratively

I have a little question about python Numpy. What I want to do is the following:
having two numpy arrays arr1 = [1,2,3] and arr2 = [3,4,5] I would like to obtain a new array arr3 = [[1,2,3],[3,4,5]], but in an iterative way. For a single instance, this is just obtained by typing arr3 = np.array([arr1,arr2]).
What I have instead are several arrays, e.g. [4,3,1 ..], [4,3,5, ...],[1,2,1,...], and I would like to end up with [[4,3,1 ..], [4,3,5, ...],[1,2,1,...]], potentially using a for loop. How should I do this?
EDIT:
Ok I'm trying to add more details to the overall problem. First, I have a list of strings list_strings=['A', 'B','C', 'D', ...]. I'm using a specific method to obtain informative numbers out of a single string, so for example I have method(list_strings[0]) = [1,2,3,...], and I can do this for each single string I have in the initial list.
What I would like to come up with is a for loop that ends up collecting all the numbers extracted from each string in turn, in the way I've described at the beginning, i.e. a single array containing all the numeric sub-arrays with the information extracted from each string. Hope this makes more sense now, and sorry if I haven't explained it correctly. I'm really new to programming and trying to figure stuff out.
Well if your strings are in a list, we want to put the arrays that result from calling method in a list as well. Python's list comprehension is a great way to achieve that.
list_strings = ['A', ...]
list_of_converted_strings = [method(item) for item in list_strings]
arr = np.array(list_of_converted_strings)
Numpy arrays have a fixed, rectangular shape: a 2D numpy array of shape n x m has n rows and m columns. If you want to convert a list of lists into a regular 2D numpy array, all the sublists in the main list must have the same length. Sublists of varying size cannot be converted into such an array.
For example, the code below raises a ValueError in recent NumPy versions (older versions silently built a 1D array of list objects instead)
np.array([[1], [3,4]])
so if all the sublists are of the same size then you can use
np.array([method(x) for x in strings])
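Since the question asks specifically for a for loop, here is a small sketch of the same idea written that way: collect one row per string in a plain list and stack once at the end. The method function below is a hypothetical stand-in (the real one is not shown in the question) and is assumed to return the same number of values for every string.

```python
import numpy as np

def method(s):
    # hypothetical stand-in: three numbers derived from a one-character string
    return [ord(s), ord(s) + 1, ord(s) + 2]

list_strings = ['A', 'B', 'C']

rows = []
for item in list_strings:
    rows.append(np.asarray(method(item)))

# np.stack joins equal-length 1D arrays along a new first axis
arr3 = np.stack(rows)
print(arr3.shape)  # (3, 3)
```

Appending to a Python list and stacking once is usually preferable to growing a numpy array inside the loop, because every append to an array would allocate a new copy.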

Update numpy array with sparse indices and values

I have 1-dimensional numpy array and want to store sparse updates of it.
Say I have array of length 500000 and want to do 100 updates of 100 elements. Updates are either adds or just changing the values (I do not think it matters).
What is the best way to do it using numpy?
I wanted to just store two arrays: indices, values_to_add and therefore have two objects: one stores dense matrix and other just keeps indices and values to add, and I can just do something like this with the dense matrix:
dense_matrix[indices] += values_to_add
And if I have multiple updates, I just concat them.
But this numpy syntax doesn't work fine with repeated elements: they are just ignored.
Updating pair when we have an update that repeats index is O(n). I thought of using dict instead of array to store updates, which looks fine from the point of view of complexity, but it doesn't look good numpy style.
What is the most expressive way to achieve this? I know about scipy sparse objects, but (1) I want pure numpy because (2) I want to understand the most efficient way to implement it.
If you have repeated indices you could use the ufunc method at (here np.add.at). From the documentation:
Performs unbuffered in place operation on operand ‘a’ for elements
specified by ‘indices’. For addition ufunc, this method is equivalent
to a[indices] += b, except that results are accumulated for elements
that are indexed more than once.
Code
a = np.arange(10)
indices = [0, 2, 2]
np.add.at(a, indices, [-44, -55, -55])
print(a)
Output
[ -44    1 -108    3    4    5    6    7    8    9]
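To see exactly what changes with repeated indices, the following sketch runs the buffered a[indices] += b form and np.add.at side by side on the same data. The names buffered and accumulated are only for this illustration.

```python
import numpy as np

indices = [0, 2, 2]
values = [-44, -55, -55]

buffered = np.arange(10)
buffered[indices] += values  # index 2 is written once; one update is lost

accumulated = np.arange(10)
np.add.at(accumulated, indices, values)  # index 2 receives both updates

print(buffered[2], accumulated[2])  # -53 -108
```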

Avoid copying when indexing a numpy array using lists

Is there a simple way to index arrays using lists or any other collection so that no copy is made (just a view of the array is taken). Please do not try to answer the question in terms of the snippet of code below --- the list I use to index the element is not always short (i.e. thousands of elements, not 4) and the list is a product of an algorithm and hence the number are not necessarily ordered, etc.
For example in the code below columns 1,2 and 3 are selected in both cases but only in the first case a view of the data is returned:
>>> a[:,1:4]
>>> b = a[:,1:4]
>>> b.base is a
True
>>> c = a[:,[1,3,2]]
>>> c.base is a
False
Fancy indexing (using a list of indices to access elements of an array) always produces a copy, as there is no way for numpy to translate it into a new view of the same data, but with a different fixed stride and shape, starting from a particular element.
Under the hood, a numpy array is a pointer to the first element in memory of an array, a dtype, shape and information about how far to move in memory to get to each of the dimensions (next row, column, etc) and some flags. A view on some pre-existing memory just points to some element in that array and fiddles with the stride and shape. Fancy indexing generally specifies random access into that pre-existing memory and you can't force that data into the necessary form, so a copy has to be made.
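A short sketch of the difference, using np.shares_memory rather than the base attribute so that it also works when the array being indexed is itself a view. The array contents and the names view and copy are arbitrary choices for this example.

```python
import numpy as np

a = np.arange(20).reshape(4, 5)

view = a[:, 1:4]        # basic slicing: same buffer, new shape and strides
copy = a[:, [1, 3, 2]]  # fancy indexing: data gathered into a new buffer

print(np.shares_memory(a, view), np.shares_memory(a, copy))  # True False

# Writing through the view changes a; writing to the copy does not.
view[0, 0] = -1
copy[0, 1] = -2
print(a[0, 1], a[0, 3])  # -1 3
```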

How to get the number of elements in an np.array?

Suppose there is an array
(1) x=np.array([[1,2],[1,2],[1,2]])
and a second array
(2) y=np.array([[1],[1,2],[1,2,3]])
The command size(x) returns the total count of all elements along every axis, in this case 6. However, size(y) returns 3. This must be because numpy interprets (2) as three elements (the three subarrays) along one axis, which matches shape(y) returning (3, ). My question is now: how can I get numpy to interpret (2) so that size(y) returns the total count of all atomic elements, which is 6?
I don't think it's possible to get the number of elements from y without looping over the objects.
The problem is that the elements of y are not numbers; they are Python objects (lists). Numpy does not support ragged arrays, whose rows have different lengths, so it stores y as a 1-dimensional array of objects (recent NumPy versions only build it if you pass dtype=object). I don't think there are Numpy methods to get the total number of elements in y.
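The loop itself is short. A minimal sketch, assuming y is built explicitly as an object array (which recent NumPy versions require for ragged input); the name total is just for this example.

```python
import numpy as np

# dtype=object makes the ragged construction explicit
y = np.array([[1], [1, 2], [1, 2, 3]], dtype=object)

print(y.shape, y.size)  # (3,) 3

# Count the atomic elements by summing the length of each stored list
total = sum(len(row) for row in y)
print(total)  # 6
```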
