I would like to update (prepend each one with additional elements) many numpy arrays in a loop, without having to repeat the code for each one.
I tried creating a list of all the arrays and looping through the items in that list and updating each one, but that doesn't change the original array.
import numpy as np
arr01 = [1,2,3]
arr02 = [4,5,6]
arr99 = [7,8,9]
print('initial arr01', arr01)
arraylist = [arr01, arr02, arr99]
for array in arraylist:
    array = np.concatenate((np.zeros(3, dtype=int), array))
    print('array being modified inside the loop', array)
print('final arr01', arr01)
In the sample code, I expected arr01, arr02, arr99 to all be modified with the prepended zeros.
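array = np.concatenate((np.zeros(3, dtype=int), array)) does not change the current array; it creates a new one and rebinds the loop variable array to it, so the objects stored in arraylist are untouched. To change the existing object you have to assign into it, which can be done with array[:]. Note that this works here because arr01, arr02 and arr99 are plain Python lists, which can grow under slice assignment; a real NumPy array has a fixed size, so array[:] = ... with a longer right-hand side raises a ValueError.
That means the only change you would have to make is replacing that one line with
array[:] = np.concatenate((np.zeros(3, dtype=int), array))
So the corrected code would be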
import numpy as np
arr01 = [1,2,3]
arr02 = [4,5,6]
arr99 = [7,8,9]
print('initial arr01', arr01)
arraylist = [arr01, arr02, arr99]
for array in arraylist:
    array[:] = np.concatenate((np.zeros(3, dtype=int), array))
    print('array being modified inside the loop', array)
print('final arr01', arr01)
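If arr01, arr02 and arr99 were actual NumPy arrays rather than lists, slice assignment could not change their length. A minimal sketch of one alternative, assuming you are free to keep the arrays in a dict instead of separate variables (the dict and its keys are illustrative, not part of the original code):

```python
import numpy as np

# Hypothetical container: one dict entry per array instead of one variable each
arrays = {
    "arr01": np.array([1, 2, 3]),
    "arr02": np.array([4, 5, 6]),
    "arr99": np.array([7, 8, 9]),
}

# Rebinding the dict entry (not a loop variable) makes the change visible afterwards
for name in arrays:
    arrays[name] = np.concatenate((np.zeros(3, dtype=int), arrays[name]))

print("final arr01", arrays["arr01"])
```

Because the new array is stored back under the same key, every later lookup sees the prepended zeros.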
Related
I have the following code for deleting the indices inside the array, but it seems not to work
import numpy as np
length=4
indices=np.arange(length)
for num in (indices):
    np.delete(indices, num)
    print("checking", indices, num)
What seems to be the issue? Does np.delete not work on arrays?
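It's better not to iterate over a container while changing its length. Also, np.delete returns a new array rather than modifying its argument, so the result has to be assigned back. Try working with indices instead: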
import numpy as np
length=4
indices=np.arange(length)
for i in reversed(range(len(indices))):
    indices = np.delete(indices, i)
    print("checking", indices, i)
If you're looking for deletion by element you could find the element index before using delete :
import numpy as np
length=4
elements=np.array([1, 3, 2, 4])
for i in elements:
    idx = np.where(elements == i)
    elements = np.delete(elements, idx)
    print("checking", elements, i)
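When several values need to go at once, a boolean mask avoids the loop entirely. This is a small sketch rather than a replacement for the code above; the name to_remove is made up for the example:

```python
import numpy as np

elements = np.array([1, 3, 2, 4])
to_remove = [3, 4]  # hypothetical values to drop

# np.isin marks every entry that appears in to_remove; ~ inverts the mask
kept = elements[~np.isin(elements, to_remove)]
print(kept)
```

Like np.delete, the indexing expression builds a new array and leaves elements untouched.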
A NumPy array has a fixed size, so you cannot delete an item from it in place. However, you can construct a new array without the values you don't want, like this:
new_indices = np.delete(indices, [0,1,2])
In your code, you can try to get new array after deleting the element, by assigning it to a variable as below:
new_indices = np.delete(indices, num)
print(new_indices, indices) # for num = 0, prints [1 2 3] [0 1 2 3]
Now if you print new_indices, it will not have the element at index num.
numpy.delete returns a copy of the given array; it doesn't modify the existing one.
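A short sketch of that copy behaviour, using the same indices array as the question (the name trimmed is only for illustration):

```python
import numpy as np

indices = np.arange(4)

# np.delete accepts a list of positions and returns a new array
trimmed = np.delete(indices, [0, 2])

print(indices)  # the original array is unchanged
print(trimmed)  # only the new array lacks positions 0 and 2
```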
I have a list of ints, a, between 0 and 3000. len(a) = 3000. I have a for loop that is iterating through this list, searching for the indices of each element in a larger array.
import numpy as np
a = [i for i in range(3000)]
array = np.random.randint(0, 3000, size=(12, 1000, 1000))
newlist = []
for i in range(0, len(a)):
    coord = np.where(array == a[i])
    newlist.append(coord)
As you can see, coord will be 3 arrays of the coordinates x, y, z for the values in the 3D matrix that equal the value in the list.
Is there a way to do this in a vectorized manner without the for loop?
The output should be a list of tuples, one for each element in a:
# each coord looks like this:
print(coord)
(array[1, ..., 1000], array[2, ..., 1000], array[2, ..., 12])
# combined over all the iterations:
print(newlist)
[coord1, coord2, ..., coord3000]
There is actually a fully vectorized solution to this, despite the fact that the resulting arrays are all of different sizes. The idea is this:
Sort all the elements of the array along with their coordinates. argsort is ideal for this sort of thing.
Find the cut-points in the sorted data, so you know where to split the array, e.g. with diff and flatnonzero.
Split the coordinate array along the indices you found. If you have missing elements, you may need to generate a key based on the first element of each run.
Here is an example to walk you through it. Let's say you have a d-dimensional array with size n. Your coordinates will be a (d, n) array:
d = arr.ndim
n = arr.size
You can generate the coordinate arrays with np.indices directly:
coords = np.indices(arr.shape)
Now ravel/reshape the data and the coordinates into an (n,) and (d, n) array, respectively:
arr = arr.ravel() # Ravel guarantees C-order no matter the source of the data
coords = coords.reshape(d, n) # C-order by default as a result of `indices` too
Now sort the data:
order = np.argsort(arr)
arr = arr[order]
coords = coords[:, order]
Find the locations where the data changes values. You want the indices of the new values, so we can make a fake first element that is 1 less than the actual first element.
change = np.diff(arr, prepend=arr[0] - 1)
The indices of the locations give the break-points in the array:
locs = np.flatnonzero(change)
You can now split the data at those locations:
result = np.split(coords, locs[1:], axis=1)
And you can create the key of values actually found:
key = arr[locs]
If you are very confident that all the values are present in the array, then you don't need the key. Instead, you can compute locs as just np.flatnonzero(np.diff(arr)) + 1 and result as just np.split(coords, locs, axis=1).
Each element in result is already consistent with the indexing used by where/nonzero, but as a numpy array. If you specifically want a tuple, you can map it to a tuple:
result = [tuple(inds) for inds in result]
TL;DR
Combining all this into a function:
def find_locations(arr):
    coords = np.indices(arr.shape).reshape(arr.ndim, arr.size)
    arr = arr.ravel()
    order = np.argsort(arr)
    arr = arr[order]
    coords = coords[:, order]
    locs = np.flatnonzero(np.diff(arr, prepend=arr[0] - 1))
    return arr[locs], np.split(coords, locs[1:], axis=1)
You can return a list of index arrays with empty arrays for missing elements by replacing the last line with
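    result = [np.empty((coords.shape[0], 0), dtype=int)] * 3000 # Empty array, so OK to use same reference
    for key, inds in zip(arr[locs], np.split(coords, locs[1:], axis=1)):
        result[key] = inds  # store the (d, count) coordinate block under its value
    return result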
You can optionally filter for values that are in the specific range you want (e.g. 0-2999).
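To check the approach on something small, here is a sketch that runs the same steps on a tiny random array and compares each group against np.where. The stable sort kind is an addition of this sketch (it keeps coordinates within each value in the same order np.where reports them), and the test array and seed are arbitrary:

```python
import numpy as np

def find_locations(arr):
    # Same steps as above; kind="stable" is an assumption added here so that
    # the order inside each group matches np.where exactly
    coords = np.indices(arr.shape).reshape(arr.ndim, arr.size)
    flat = arr.ravel()
    order = np.argsort(flat, kind="stable")
    flat = flat[order]
    coords = coords[:, order]
    locs = np.flatnonzero(np.diff(flat, prepend=flat[0] - 1))
    return flat[locs], np.split(coords, locs[1:], axis=1)

rng = np.random.default_rng(0)
small = rng.integers(0, 5, size=(2, 3, 4))  # arbitrary small test array

keys, groups = find_locations(small)
for key, group in zip(keys, groups):
    expected = np.where(small == key)
    # each row of group is one coordinate axis, like the tuple from np.where
    assert all(np.array_equal(g, e) for g, e in zip(group, expected))
```

The loop here exists only to verify the result; the grouping itself is done without iterating over the values.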
You can use logical OR in numpy to pass all those equality conditions at once instead of one by one.
import numpy as np
conditions = False
for i in a:  # a is the list of values to search for
    conditions = np.logical_or(conditions, array3d == i)
newlist = np.where(conditions)
This calls np.where once on the combined mask instead of once per value. Note that the result holds the coordinates of all matching values together, not one tuple per value.
A more compact way to do the same thing:
np.where(np.isin(array3d, a))
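A small sketch showing that the np.isin form and the logical-OR loop produce the same mask. The array contents and the name wanted are invented for the example:

```python
import numpy as np

array3d = np.array([[[0, 1], [2, 3]],
                    [[1, 4], [5, 1]]])
wanted = [1, 5]  # hypothetical values to look for

# One call builds the combined mask
mask = np.isin(array3d, wanted)

# Equivalent mask built one value at a time
conditions = np.zeros(array3d.shape, dtype=bool)
for value in wanted:
    conditions |= (array3d == value)

coords = np.where(mask)
print(coords)
```

The coordinates of 1 and 5 come back merged into a single tuple, so this suits "where is any of these values" rather than a per-value breakdown.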
import numpy as np
a = np.arange(0,60,5)
a = a.reshape(3,4)
for x in np.nditer(a, op_flags = ['readwrite']):
    x[...] = 2*x
print('Modified array is:')
print(a)
In the above code, why can't we simply write x=2*x instead of x[...]=2*x?
No matter what kind of object we were iterating over or how that object was implemented, it would be almost impossible for x = 2*x to do anything useful to that object. x = 2*x is an assignment to the variable x; even if the previous contents of the x variable were obtained by iterating over some object, a new assignment to x would not affect the object we're iterating over.
In this specific case, iterating over a NumPy array with np.nditer(a, op_flags = ['readwrite']), each iteration of the loop sets x to a zero-dimensional array that's a writeable view of a cell of a. x[...] = 2*x writes to the contents of the zero-dimensional array, rather than rebinding the x variable. Since the array is a view of a cell of a, this assignment writes to the corresponding cell of a.
This is very similar to the difference between l = [] and l[:] = [] with ordinary lists, where l[:] = [] will clear an existing list and l = [] will replace the list with a new, empty list without modifying the original. Lists don't support views or zero-dimensional lists, though.
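The difference between rebinding a name and writing through it can be seen in a few lines. This is an illustrative sketch; the variable names are made up, and a row view stands in for the zero-dimensional views that nditer yields:

```python
import numpy as np

original = [1, 2, 3]
alias = original
alias = []                      # rebinds alias only
after_rebind = list(original)   # original still holds 1, 2, 3

alias = original
alias[:] = []                   # clears the shared list in place

a = np.arange(0, 60, 5).reshape(3, 4)
row = a[0]            # a view of the first row of a
row[...] = 2 * row    # writes through the view into a
print(a)
```

Writing row = 2 * row instead would leave a unchanged, for the same reason alias = [] leaves original unchanged.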
I have a function that returns many output arrays of varying size.
arr1,arr2,arr3,arr4,arr5, ... = func(data)
I want to run this function many times over a time series of data, and combine each output variable into one array that covers the whole time series.
To elaborate: If the output arr1 has dimensions (x,y) when the function is called, I want to run the function 't' times and end up with an array that has dimensions (x,y,t). A list of 't' arrays with size (x,y) would also be acceptable, but not preferred.
Again, the output arrays do not all have the same dimensions, or even the same number of dimensions. arr2 might have size (x2,y2), arr3 might be only a vector of length x3. I do not know the size of all of these arrays beforehand.
My current solution is something like this:
arr1 = []
arr2 = []
arr3 = []
...
for t in range(t_max):
    arr1_t, arr2_t, arr3_t, ... = func(data[t])
    arr1.append(arr1_t)
    arr2.append(arr2_t)
    arr3.append(arr3_t)
    ...
and so on. However, this looks inelegant when repeated 27 times, once for each output array.
Is there a better way to do this?
You can just make arr1, arr2, etc. a list of lists (of vectors or matrices or whatever). Then use a loop to iterate the results obtained from func and add them to the individual lists.
arrN = [[] for _ in range(N)] # N being number of results from func
for t in range(t_max):
    results = func(data[t])
    for i, res in enumerate(results):
        arrN[i].append(res)
The elements in the different sub-lists do not have to have the same dimensions.
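If each output keeps the same shape across calls, the collected lists can afterwards be stacked along a new last axis to get the (x, y, t) layout asked for. A minimal sketch, assuming a stand-in func with three outputs (its body and the data are invented for the example):

```python
import numpy as np

def func(frame):
    # Hypothetical stand-in: a 2D output, a 1D output and a scalar output
    return frame * 2, frame.sum(axis=0), frame.mean()

data = np.arange(24, dtype=float).reshape(4, 2, 3)  # 4 time steps of shape (2, 3)

collected = None
for frame in data:
    results = func(frame)
    if collected is None:
        # one bucket per output, created once the number of outputs is known
        collected = [[] for _ in results]
    for bucket, res in zip(collected, results):
        bucket.append(res)

# time becomes the last axis of every stacked output
stacked = [np.stack(bucket, axis=-1) for bucket in collected]
print([s.shape for s in stacked])
```

Creating the buckets on the first iteration avoids having to know N in advance.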
Not sure if it counts as "elegant", but you can build a list of the result tuples then use zip to group them into tuples by return position instead of by call number, then optionally map to convert those tuples to the final data type. For example, with numpy array:
from future_builtins import map, zip # Only on Python 2, to minimize temporaries
import numpy as np
def func(x):
    'Dumb function to return tuple of powers of x from 1 to 27'
    return tuple(x ** i for i in range(1, 28))
# Example inputs for func
data = [np.array([[x]*10]*10, dtype=np.uint8) for x in range(10)]
# Output is generator of results for each call to func
outputs = map(func, data)
# Pass each complete result of func as a positional argument to zip via star
# unpacking to regroup, so the first return from each func call is the first
# group, then the second return the second group, etc.
positional_groups = zip(*outputs)
# Convert regrouped data (`tuple`s of 2D results) to numpy 3D result type, unpack to names
arr1,arr2,arr3,arr4,arr5, ...,arr27 = map(np.array, positional_groups)
If the elements returned from func at a given position might have inconsistent dimensions (e.g. one call might return 10x10 as the first return, and another 5x5), you'd avoid the final map step (since the array wouldn't have consistent dimensions) and just replace the second-to-last step with:
arr1,arr2,arr3,arr4,arr5, ...,arr27 = zip(*outputs)
making arr# a tuple of 2D arrays, or if they need to be mutable:
arr1,arr2,arr3,arr4,arr5, ...,arr27 = map(list, zip(*outputs))
to make them lists of 2D arrays.
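A compact Python 3 sketch of the same zip(*outputs) regrouping with only two outputs, so the shapes are easy to follow. The func here is a made-up stand-in, not the one from the question:

```python
import numpy as np

def func(x):
    # Hypothetical stand-in: returns a 2x2 array and a length-3 vector
    return np.full((2, 2), x), np.full(3, x * 10)

outputs = [func(t) for t in range(4)]

# zip(*outputs) regroups by return position; np.array stacks each group
grids, vectors = map(np.array, zip(*outputs))

# time is the first axis after stacking; move it last for an (x, y, t) layout
grids_xyt = np.moveaxis(grids, 0, -1)
print(grids.shape, vectors.shape, grids_xyt.shape)
```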
This answer gives a solution using structured arrays. It has the following requirement: given a function f that returns N arrays, the arrays may differ in size from one another, but across all calls to f, the array at position i must always have the same length, e.g.
arrs_a = f("a")
arrs_b = f("b")
for sub_arr_a, sub_arr_b in zip(arrs_a, arrs_b):
    assert len(sub_arr_a) == len(sub_arr_b)
If the above is true, then you can use structured arrays. A structured array is like a normal array, just with a complex data type. For instance, I could specify a data type that is made up of one array of ints of shape 5, and a second array of floats of shape (2, 2). eg.
# define what a record looks like
dtype = [
    # tuples of (field_name, data_type)
    ("a", "5i4"), # array of five 4-byte ints
    ("b", "(2,2)f8"), # 2x2 array of 8-byte floats
]
Using dtype you can create a structured array, and set all the results on the structured array in one go.
import numpy as np
def func(n):
    "mock implementation of func"
    return (
        np.ones(5) * n,
        np.ones((2,2))* n
    )
# define what a record looks like
dtype = [
    # tuples of (field_name, data_type)
    ("a", "5i4"), # array of five 4-byte ints
    ("b", "(2,2)f8"), # 2x2 array of 8-byte floats
]
size = 5
# create array
arr = np.empty(size, dtype=dtype)
# fill in values
for i in range(size):
    # func must return a tuple
    # or you must convert the returned value to a tuple
    arr[i] = func(i)
# alternate way of instantiating arr
arr = np.fromiter((func(i) for i in range(size)), dtype=dtype, count=size)
# How to use structured arrays
# access individual record
print(arr[1]) # prints ([1, 1, 1, 1, 1], [[1, 1], [1, 1]])
# access specific value -- get second record -> get b field -> get value at 0,0
assert arr[2]['b'][0,0] == 2
# access all values of a specific field
print(arr['a']) # prints all the a arrays
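One useful property of this layout is that reading a field gives all records stacked into a single regular array. A small sketch, assuming the same two fields but written with the three-element (name, type, shape) dtype form and filled field by field (the name records is illustrative):

```python
import numpy as np

# Same record layout as above, using the (name, type, shape) form
dtype = [("a", np.int32, (5,)), ("b", np.float64, (2, 2))]

records = np.zeros(3, dtype=dtype)
for i in range(3):
    # records["a"] is a view, so assigning into it fills the record
    records["a"][i] = np.full(5, i)
    records["b"][i] = np.full((2, 2), i / 2)

print(records["a"].shape)  # one row of five ints per record
print(records["b"].shape)  # one 2x2 block per record
```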
I want to know how to declare a two dimensional array in Python.
arr = [[]]
arr[0].append("aa1")
arr[0].append("aa2")
arr[1].append("bb1")
arr[1].append("bb2")
arr[1].append("bb3")
The first two appends work fine. But when I try arr[1].append("bb1"), I get the following error:
IndexError: list index out of range.
Am I doing anything silly in trying to declare the 2-D array?
Edit: I do not know the number of elements in the array (rows or columns) in advance.
You do not "declare" arrays or anything else in Python. You simply assign to a (new) variable. If you want a multidimensional array, simply add a new list as a list element.
arr = []
arr.append([])
arr[0].append('aa1')
arr[0].append('aa2')
or
arr = []
arr.append(['aa1', 'aa2'])
There aren't multidimensional arrays as such in Python; what you have is a list containing other lists.
>>> arr = [[]]
>>> len(arr)
1
What you have done is declare a list containing a single list. So arr[0] contains a list but arr[1] is not defined.
You can define a list containing two lists as follows:
arr = [[],[]]
Or to define a longer list you could use:
>>> arr = [[] for _ in range(5)]
>>> arr
[[], [], [], [], []]
What you shouldn't do is this:
arr = [[]] * 3
As this puts the same list in all three places in the container list:
>>> arr[0].append('test')
>>> arr
[['test'], ['test'], ['test']]
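The aliasing can be confirmed directly with the is operator. A brief sketch (the names shared and independent are only for illustration):

```python
# Multiplying the outer list repeats a reference to the same inner list
shared = [[]] * 3
# The comprehension evaluates [] once per iteration, giving distinct lists
independent = [[] for _ in range(3)]

shared[0].append('test')
independent[0].append('test')

print(shared[0] is shared[1])            # True: one list seen three times
print(independent[0] is independent[1])  # False: separate lists
```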
What you're using here are not arrays, but lists (of lists).
If you want multidimensional arrays in Python, you can use Numpy arrays. You'd need to know the shape in advance.
For example:
import numpy as np
arr = np.empty((3, 2), dtype=object)
arr[0, 1] = 'abc'
You are trying to append to the second element of the list, but it does not exist yet.
Create it first:
arr = [[]]
arr[0].append("aa1")
arr[0].append("aa2")
arr.append([])
arr[1].append("bb1")
arr[1].append("bb2")
arr[1].append("bb3")
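When the number of rows is not known in advance, the "create it first" step can be folded into a small helper. This is a sketch; append_at is a hypothetical name, not a built-in:

```python
def append_at(grid, row, value):
    """Append value to grid[row], creating empty rows up to that index if needed."""
    while len(grid) <= row:
        grid.append([])
    grid[row].append(value)

arr = []
append_at(arr, 0, "aa1")
append_at(arr, 0, "aa2")
append_at(arr, 1, "bb1")
append_at(arr, 1, "bb2")
append_at(arr, 1, "bb3")
print(arr)
```

Skipping a row index simply leaves an empty list in that position.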
We can create a multidimensional array dynamically as follows.
Create 2 variables to read x and y from standard input:
print("Enter the value of x: ")
x=int(input())
print("Enter the value of y: ")
y=int(input())
Create a list of lists with initial values filled with 0 (or anything else) using the following code:
z=[[0 for col in range(0,y)] for row in range(0,x)]
This creates x rows of y columns for your data.
Read data from standard input:
for i in range(x):
    for j in range(y):
        z[i][j]=input()
Display the Result:
for i in range(x):
    for j in range(y):
        print(z[i][j],end=' ')
    print()
Another way to display the dynamically created array above is:
for row in z:
print(row)
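The order of the two loops in the comprehension decides which dimension is which: the outer loop produces rows, the inner one produces the entries of each row. A small sketch with fixed sizes instead of user input (rows, cols and grid are illustrative names):

```python
rows, cols = 2, 3

# Outer comprehension: one list per row; inner comprehension: cols zeros per row
grid = [[0 for _ in range(cols)] for _ in range(rows)]

grid[1][2] = 7  # last row, last column

for row in grid:
    print(row)
```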
When constructing multi-dimensional lists in Python I usually use something similar to ThiefMaster's solution, but rather than appending items to index 0, then appending items to index 1, etc., I always use index -1 which is automatically the index of the last item in the array.
i.e.
arr = []
arr.append([])
arr[-1].append("aa1")
arr[-1].append("aa2")
arr.append([])
arr[-1].append("bb1")
arr[-1].append("bb2")
arr[-1].append("bb3")
will produce the 2D-array (actually a list of lists) you're after.
You can first append elements to the initialized array and then for convenience, you can convert it into a numpy array.
import numpy as np
a = [] # start with an empty list
a.append(['aa1']) # append elements
a.append(['aa2'])
a.append(['aa3'])
print(a)
a_np = np.asarray(a) # convert to numpy array
print(a_np)
a = [[] for _ in range(n)]  # n empty rows; n must be defined beforehand
For competitive programming
1) Reading values into a 2D array
row = int(input())
main_list = []
for i in range(row):
    temp_list = list(map(int, input().split()))
    main_list.append(temp_list)
2) Displaying the 2D array
for i in range(row):
    for j in range(len(main_list[i])):
        print(main_list[i][j], end=' ')
    print()
The above method did not work for me in a for loop, where I wanted to transfer data from a 2D array to a new array under an if condition. This method works:
a_2d_list = [[1, 2], [3, 4]]
a_2d_list.append([5, 6])
print(a_2d_list)
OUTPUT - [[1, 2], [3, 4], [5, 6]]
x=3#rows
y=3#columns
a=[]#create an empty list first
for i in range(x):
    a.append([0]*y)#append a row of y zeros to the original list
    for j in range(y):
        a[i][j]=input("Enter the value")
In my case I had to do this:
for index, user in enumerate(users):
    table_body.append([])
    table_body[index].append(user.user.id)
    table_body[index].append(user.user.username)
Output:
[[1, 'john'], [2, 'bill']]