Numpy: How is this code different from each other? - python

Let r be an array where each element is a column index (less than N; the size of r is M) and let P be an MxN array.
The following two snippets behave differently. Why?
1.
P[:, r] += 1
2.
for i in range(len(r)):
    P[i, r[i]] += 1

The first one selects an entire column for each element of r. The second selects just one element per row. You can directly compare the two cases like so:
>>> P = np.arange(12).reshape(4, 3)
>>> r = np.random.randint(0, 3, (4,))
>>> r
array([1, 1, 2, 0])
>>>
>>> P[:, r]
array([[ 1,  1,  2,  0],
       [ 4,  4,  5,  3],
       [ 7,  7,  8,  6],
       [10, 10, 11,  9]])
>>> P[np.arange(4), r]
array([1, 4, 8, 9])
As you can see the second produces essentially the diagonal of the first.
You may profit from having a look at section "Combining advanced and basic indexing" in the numpy docs.

P[:, r] in the first snippet selects the entire first axis (: is like 0:M here, i.e. all rows) and, along the second axis, the column given by each element of r.
P[i, r[i]] in the for loop selects only the i-th index on the first axis and the r[i]-th on the second, which is just a single element.
Once that's clear, it's no surprise that the two give different results.
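To make the difference concrete, here is a minimal sketch that runs both snippets on a zero-filled array and also shows a loop-free form of the second one. The names P1, P2 and P3 are only illustrative, and r is fixed rather than random so the result is reproducible.

```python
import numpy as np

P = np.zeros((4, 3), dtype=int)
r = np.array([1, 1, 2, 0])

# Snippet 1: every row has the columns listed in r incremented.
# Column 1 appears twice in r, but the in-place add is applied only once.
P1 = P.copy()
P1[:, r] += 1

# Snippet 2 (the loop): exactly one element per row is incremented.
P2 = P.copy()
for i in range(len(r)):
    P2[i, r[i]] += 1

# Loop-free equivalent of snippet 2: pair row i with column r[i].
P3 = P.copy()
P3[np.arange(len(r)), r] += 1
```

Here P1 ends up with every entry incremented (r happens to cover all three columns), while P2 and P3 are identical and have a single 1 in each row.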

Related

How to split the numpy array into separate arrays in python. The number of splits is given by the user and the splitting is based on index

I want to split my numpy array into separate arrays. The separation must be based on the index. The split count is given by the user.
For example,
The input array: my_array=[1,2,3,4,5,6,7,8,9,10]
If user gives split count =2,
then, the split must be like
my_array1=[1,3,5,7,9]
my_array2=[2,4,6,8,10]
if user gives split count=3, then
the output array must be
my_array1=[1,4,7,10]
my_array2=[2,5,8]
my_array3=[3,6,9]
Could anyone please explain? I did it for split count 2 using the even/odd concept:
for i in range(len(outputarray)):
    if i%2==0:
        even_array.append(outputarray[i])
    else:
        odd_array.append(outputarray[i])
I don't know how to do the split for variable counts like 3,4,5 based on the index.
You can use indexing by vector (aka fancy indexing) for it:
>>> a=np.array([1,2,3,4,5,6,7,8,9,10])
>>> n = 3
>>> [a[np.arange(i, len(a), n)] for i in range(n)]
[array([ 1,  4,  7, 10]), array([2, 5, 8]), array([3, 6, 9])]
Explanation
arange(i, len(a), n) generates an array of integers starting at i, stopping before len(a), with step n. For example, for i = 0 it generates an array
>>> np.arange(0, 10, 3)
array([0, 3, 6, 9])
Now when you index an array with another array, you get the elements at the requested indices:
>>> a[[0, 3, 6, 9]]
array([ 1,  4,  7, 10])
These steps are repeated for i=1..2 resulting in the desired list of arrays.
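As a possible alternative to building the index arrays explicitly, a plain strided slice gives the same groups. This is a sketch, not part of the answer above; the name parts is only illustrative.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
n = 3

# a[i::n] takes every n-th element starting at index i.
# Basic slicing returns views of a, not copies.
parts = [a[i::n] for i in range(n)]
```

Because these are views, writing into one of the parts also changes a; call .copy() on each part if independent arrays are needed.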
Here is a pure-Python way of doing your task:
def split_array(array, n=3):
    # List i collects the elements whose index is congruent to i modulo n.
    return [[array[x] for x in range(len(array)) if x % n == i]
            for i in range(n)]
Input:
my_array=[1,2,3,4,5,6,7,8,9,10]
print(split_array(my_array, n=3))
Output:
[[1, 4, 7, 10], [2, 5, 8], [3, 6, 9]]

array split with overlap [duplicate]

Let's say I have a list A
A = [1,2,3,4,5,6,7,8,9,10]
I would like to create a new list (say B) using the above list in the following order.
B = [[1,2,3], [3,4,5], [5,6,7], [7,8,9], [9,10,]]
i.e. the first 3 numbers as A[0,1,2] and the second 3 numbers as A[2,3,4] and so on.
I believe there is a function in numpy for such a kind of operation.
Simply use Python's built-in list comprehension with list-slicing to do this:
>>> A = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> size = 3
>>> step = 2
>>> A = [A[i : i + size] for i in range(0, len(A), step)]
This gives you what you're looking for:
>>> A
[[1, 2, 3], [3, 4, 5], [5, 6, 7], [7, 8, 9], [9, 10]]
But you'll have to write a couple of lines to make sure that your code doesn't break for unexpected values of size/step.
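One way those extra lines might look is sketched below. The function name overlapping_chunks and the choice to raise ValueError are assumptions for illustration, not something the answer prescribes.

```python
def overlapping_chunks(seq, size, step):
    """Split seq into chunks of `size` items whose start positions are `step` apart."""
    # A non-positive step would either raise inside range() or silently
    # produce no chunks, so reject it (and a non-positive size) up front.
    if size < 1 or step < 1:
        raise ValueError("size and step must be positive integers")
    return [seq[i:i + size] for i in range(0, len(seq), step)]


B = overlapping_chunks([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], size=3, step=2)
```

As in the list comprehension above, the last chunk may be shorter than size when the list runs out.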
The 'duplicate' Partition array into N chunks with Numpy suggests np.split - that's fine for non-overlapping splits. The example (added after the close?) overlaps, one element across each subarray. Plus it pads with a 0.
How do you split a list into evenly sized chunks? has some good list answers, with various forms of generator or list comprehension, but at first glance I didn't see any that allow for overlaps - though with a clever use of iterators (such as itertools.tee) that should be possible.
We can blame this on poor question wording, but it is not a duplicate.
Working from the example and the comment:
Here my window size is 3., i.e each splitted list should have 3 elements first split [1,2,3] and the step size is 2 , So the second split start should start from 3rd element and 2nd split is [3,4,5] respectively.
Here is an advanced solution using as_strided:
In [64]: ast=np.lib.stride_tricks.as_strided # shorthand
In [65]: A=np.arange(1,12)
In [66]: ast(A,shape=[5,3],strides=(8,4))
Out[66]:
array([[ 1,  2,  3],
       [ 3,  4,  5],
       [ 5,  6,  7],
       [ 7,  8,  9],
       [ 9, 10, 11]])
I increased the range of A because I didn't want to deal with the 0 pad.
Choosing the target shape is easy, 5 sets of 3. Choosing the strides requires more knowledge about striding.
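In [69]: A.strides
Out[69]: (4,)
The 1d striding, or stepping from one element to the next, is 4 bytes (the length of one element; A has a 4-byte integer dtype here). The step from one row to the next is 2 elements of the original, or 2*4 bytes. If A.strides reports (8,) instead (the default integer is 8 bytes on many platforms), the strides in the call above become (16, 8).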
as_strided produces a view. Thus changing an element in it will affect the original, and may change overlapping values. Add .copy() to make a copy; math with the strided array will also produce a copy.
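Since the byte counts depend on the dtype, a more portable sketch derives the strides from the array itself and computes the row count so that no window reaches past the end of the buffer. The names size, hop, n_rows and windows are illustrative assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

A = np.arange(1, 12)
step = A.strides[0]  # bytes per element: 4 for int32, 8 for int64

size, hop = 3, 2     # window length and distance between window starts
n_rows = (A.size - size) // hop + 1  # count only windows that fit inside A

# Row stride is hop elements, column stride is one element.
windows = as_strided(A, shape=(n_rows, size), strides=(hop * step, step))

# windows shares memory with A; copy it before modifying anything.
safe = windows.copy()
```

Deriving n_rows from A.size is what keeps the view inside the original buffer; a larger shape would read unrelated memory.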
Changing the strides can give non overlapping rows - but be careful about the shape - it is possible to access values outside of the original data buffer.
In [82]: ast(A,shape=[4,3],strides=(12,4))
Out[82]:
array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 17]])
In [84]: ast(A,shape=[3,3],strides=(16,4))
Out[84]:
array([[ 1,  2,  3],
       [ 5,  6,  7],
       [ 9, 10, 11]])
edit
A newer function (added in NumPy 1.20) gives a safer version of as_strided:
np.lib.stride_tricks.sliding_window_view(np.arange(1,10),3)[::2]
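A short sketch of that call on the 11-element array used earlier, assuming NumPy 1.20 or later. The variable name windows is illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

A = np.arange(1, 12)

# sliding_window_view yields every length-3 window (step 1);
# keeping every second row gives a step of 2 between window starts.
windows = sliding_window_view(A, 3)[::2]
```

The result is a read-only view by default, so accidental writes into overlapping elements raise an error instead of silently changing A.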
This function that I wrote may help you, although it only outputs filled chunks with a length of len_chunk:
def overlap(array, len_chunk, len_sep=1):
    """Returns a matrix of all full overlapping chunks of the input `array`, with a chunk
    length of `len_chunk` and a separation length of `len_sep`. Begins with the first full
    chunk in the array. """
    # The built-in int replaces np.int, which was removed in NumPy 1.24.
    n_arrays = int(np.ceil((array.size - len_chunk + 1) / len_sep))
    array_matrix = np.tile(array, n_arrays).reshape(n_arrays, -1)
    columns = np.array(((len_sep*np.arange(0, n_arrays)).reshape(n_arrays, -1) + np.tile(
        np.arange(0, len_chunk), n_arrays).reshape(n_arrays, -1)), dtype=np.intp)
    rows = np.array((np.arange(n_arrays).reshape(n_arrays, -1) + np.tile(
        np.zeros(len_chunk), n_arrays).reshape(n_arrays, -1)), dtype=np.intp)
    return array_matrix[rows, columns]

Remove head and tail from numpy array PYTHON

I have a numpy.ndarray, and want to remove first h elements and last t.
As I see, the more general way is by selecting:
h, t = 1, 1
my_array = [0,1,2,3,4,5]
middle = my_array[h:-t]
and the middle is [1,2,3,4]. This is correct. But when I don't want to remove anything, I set h = 0 and t = 0, and this returns an empty array. I know it is because of t = 0 (my_array[0:-0] is the same as my_array[0:0]), and I also know that an if condition for this border case would solve it with my_array[h:], but I don't want that solution (my problem is a little more complex, with more dimensions, and the code would become ugly).
Any ideas?
Instead, use
middle = my_array[h:len(my_array)-t]
For completeness, here's the trial run:
my_array = [0,1,2,3,4,5]
h,t = 0,0
middle = my_array[h:len(my_array)-t]
print(middle)
Output: [0, 1, 2, 3, 4, 5]
This example was just for a standard array. Since your ultimate goal is to work with numpy multidimensional arrays, this problem is actually a bit trickier. When you say you want to remove the first h elements and the last t elements, are we guaranteed that h and t satisfy the proper divisibility criteria so that the result will be a well-formed array?
I actually think the cleanest solution is simply to use this solution, but divide out by the appropriate factor first. For example, in two dimensions:
import numpy
h = 3
t = 6
a = numpy.array([[ 1,  2,  3],
                 [ 4,  5,  6],
                 [ 7,  8,  9],
                 [10, 11, 12]])
d = numpy.prod(numpy.shape(a)[1:])  # number of elements per row (3 here)
mid_a = a[int(h/d):int(len(a)-t/d)]
print(mid_a)
Output: [[4 5 6]]
I have included the int casts in the indices because python 3 automatically promotes division to float, even when the numerator evenly divides the denominator.
The i:j can be replaced with a slice object, slice(i, j), and :j with slice(None, j), etc.:
In [55]: alist = [0,1,2,3,4,5]
In [56]: h,t=1,-1; alist[slice(h,t)]
Out[56]: [1, 2, 3, 4]
In [57]: h,t=None,-1; alist[slice(h,t)]
Out[57]: [0, 1, 2, 3, 4]
In [58]: h,t=None,None; alist[slice(h,t)]
Out[58]: [0, 1, 2, 3, 4, 5]
This works for lists and arrays. For multidimensional arrays, use a tuple of indices, which can include slice objects:
x[i:j, k:l]
x[(slice(i,j), Ellipsis, slice(k,l))]
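Putting the slice-object idea together for the original question, the sketch below trims along any one axis and treats t = 0 as "keep the tail". The helper name trim and its axis parameter are assumptions for illustration, and it assumes h and t are non-negative.

```python
import numpy as np

def trim(arr, h, t, axis=0):
    """Drop the first h and last t entries of arr along `axis`."""
    # Start with a full slice for every axis.
    index = [slice(None)] * arr.ndim
    # A stop of None means "to the end", which avoids the empty
    # result that arr[h:-0] would give when t == 0.
    index[axis] = slice(h, -t if t else None)
    return arr[tuple(index)]


x = np.arange(6)
a = np.arange(12).reshape(4, 3)
middle = trim(x, 1, 1)          # drops one element from each end
untouched = trim(x, 0, 0)       # nothing removed
rows = trim(a, 1, 1, axis=0)    # drops the first and last row
```

The same call works for any number of dimensions because the index tuple is built from arr.ndim rather than written out by hand.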

How can I find the dimensions of a matrix in Python?

How can I find the dimensions of a matrix in Python? len(A) returns only one value.
Edit:
close = dataobj.get_data(timestamps, symbols, closefield)
Is (I assume) generating a matrix of integers (less likely strings). I need to find the size of that matrix, so I can run some tests without having to iterate through all of the elements. As far as the data type goes, I assume it's an array of arrays (or list of lists).
The number of rows of a list of lists is len(A), and the number of columns is len(A[0]), provided that all rows have the same number of columns, i.e. all the inner lists are the same size.
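Because that only holds when every row has the same length, a small helper can check the assumption before reporting the size. This is a sketch; the name matrix_dims and the decision to raise ValueError on ragged input are illustrative choices.

```python
def matrix_dims(rows):
    """Return (n_rows, n_cols) for a list of lists, or raise if rows are ragged."""
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    # Every row must match the length of the first one.
    if any(len(row) != n_cols for row in rows):
        raise ValueError("rows have different lengths")
    return n_rows, n_cols


dims = matrix_dims([[1, 5, 6, 8], [1, 2, 5, 9], [7, 5, 6, 2]])
```

An empty list is reported as (0, 0) here, which is a convention chosen for the sketch rather than a rule.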
If you are using NumPy arrays, shape can be used.
For example
>>> a = numpy.array([[[1,2,3],[1,2,3]],[[12,3,4],[2,1,3]]])
>>> a
array([[[ 1,  2,  3],
        [ 1,  2,  3]],

       [[12,  3,  4],
        [ 2,  1,  3]]])
>>> a.shape
(2, 2, 3)
As Ayman Farhat mentioned, you can use len(matrix) to get the number of rows, and len(matrix[0]), the length of the first row, to get the number of columns:
>>> a=[[1,5,6,8],[1,2,5,9],[7,5,6,2]]
>>> len(a)
3
>>> len(a[0])
4
You can also use numpy, a library that helps you with matrices:
>>> import numpy
>>> numpy.shape(a)
(3, 4)
To get just the number of dimensions in NumPy:
len(a.shape)
In the first case:
import numpy as np
a = np.array([[[1,2,3],[1,2,3]],[[12,3,4],[2,1,3]]])
print("shape =", np.shape(a))
print("dimensions =", len(a.shape))  # a.ndim gives the same number
The output will be:
shape = (2, 2, 3)
dimensions = 3
m = [[1, 1, 1, 0],[0, 5, 0, 1],[2, 1, 3, 10]]
print(len(m),len(m[0]))
Output:
3 4
The correct answer is the following:
import numpy
numpy.shape(a)
Suppose a is an array. To get its dimensions, use shape:
import numpy as np
a = np.array([[3,20,99],[-13,4.5,26],[0,-1,20],[5,78,-19]])
a.shape
The output of this will be
(4, 3)
You may use the following to get the height and width of a NumPy array:
height = arr.shape[0]
width = arr.shape[1]
If your array has multiple dimensions, you can increase the index to access them.
You simply can find a matrix dimension by using Numpy:
import numpy as np
x = np.arange(24).reshape((6, 4))
x.ndim
output will be:
2
It means this matrix is a 2 dimensional matrix.
x.shape
Will show you the size of each dimension. The shape for x is equal to:
(6, 4)
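The attributes mentioned above can be compared side by side in a short sketch. The names n_rows and n_cols are illustrative.

```python
import numpy as np

x = np.arange(24).reshape((6, 4))

n_dims = x.ndim            # number of axes
n_rows, n_cols = x.shape   # unpacking works because x is known to be 2-D
first_axis = len(x)        # len() reports only the first axis, like shape[0]
total = x.size             # total number of elements across all axes
```

This is why len(A) alone returns a single number: it only ever sees the first axis.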
A simple way I look at it:
example:
h=np.array([[[[1,2,3],[3,4,5]],[[5,6,7],[7,8,9]],[[9,10,11],[12,13,14]]]])
h.ndim
4
h
array([[[[ 1,  2,  3],
         [ 3,  4,  5]],

        [[ 5,  6,  7],
         [ 7,  8,  9]],

        [[ 9, 10, 11],
         [12, 13, 14]]]])
If you look closely, the number of opening square brackets at the beginning equals the number of dimensions of the array.
In the above array, 7 (the first element of the row [7, 8, 9]) is accessed with the following index:
h[0,1,1,0]
However if we change the array to 3 dimensions as below,
h=np.array([[[1,2,3],[3,4,5]],[[5,6,7],[7,8,9]],[[9,10,11],[12,13,14]]])
h.ndim
3
h
array([[[ 1,  2,  3],
        [ 3,  4,  5]],

       [[ 5,  6,  7],
        [ 7,  8,  9]],

       [[ 9, 10, 11],
        [12, 13, 14]]])
To access element 7 in the above array, the index is h[1,1,0]
