How to repeat a numpy array on both axis? [duplicate] - python

I have a 2D array, let's say:
1 2 3
4 5 6
I want each element repeated 3 times along both axes, so it will look like:
1 1 1 2 2 2 3 3 3
1 1 1 2 2 2 3 3 3
1 1 1 2 2 2 3 3 3
4 4 4 5 5 5 6 6 6
4 4 4 5 5 5 6 6 6
4 4 4 5 5 5 6 6 6
I've tried using numpy.repeat, but without success.
Any suggestions? Thanks.

You can do it with the Kronecker product, np.kron, and an array of ones the size of the block.
import numpy as np
a = np.arange(6).reshape(2,3) + 1
np.kron(a, np.ones((3,3), dtype = a.dtype))
Out[]:
array([[1, 1, 1, 2, 2, 2, 3, 3, 3],
[1, 1, 1, 2, 2, 2, 3, 3, 3],
[1, 1, 1, 2, 2, 2, 3, 3, 3],
[4, 4, 4, 5, 5, 5, 6, 6, 6],
[4, 4, 4, 5, 5, 5, 6, 6, 6],
[4, 4, 4, 5, 5, 5, 6, 6, 6]])
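The block of ones does not have to be square, so the same idea covers a different repeat count per axis. A minimal sketch, assuming a hypothetical helper name `upsample_kron` that is not part of the answer above:

```python
import numpy as np

def upsample_kron(a, rows, cols):
    """Expand each element of the 2-D array `a` into a rows x cols block.

    Hypothetical helper for illustration; it only wraps np.kron.
    """
    block = np.ones((rows, cols), dtype=a.dtype)
    return np.kron(a, block)

a = np.arange(6).reshape(2, 3) + 1
out = upsample_kron(a, 3, 2)   # 3 copies down, 2 copies across
print(out.shape)               # (6, 6)
```

The result should match nesting `np.repeat` with the same counts on axis 0 and axis 1.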

You can do it with numpy repeat
>>> data = np.array([[1,2,3],[4,5,6]])
>>> data
array([[1, 2, 3],
[4, 5, 6]])
>>> np.repeat(data,[3,3,3],axis=1).repeat([3,3],axis=0)
array([[1, 1, 1, 2, 2, 2, 3, 3, 3],
[1, 1, 1, 2, 2, 2, 3, 3, 3],
[1, 1, 1, 2, 2, 2, 3, 3, 3],
[4, 4, 4, 5, 5, 5, 6, 6, 6],
[4, 4, 4, 5, 5, 5, 6, 6, 6],
[4, 4, 4, 5, 5, 5, 6, 6, 6]])

Related

Repeat values of an array on both the axes

Say I have this array:
array = np.array([[1,2,3],[4,5,6],[7,8,9]])
Returns:
123
456
789
How should I go about getting it to return something like this?
111222333
111222333
111222333
444555666
444555666
444555666
777888999
777888999
777888999
You'd have to use np.repeat twice here.
np.repeat(np.repeat(array, 3, axis=1), 3, axis=0)
# [[1 1 1 2 2 2 3 3 3]
# [1 1 1 2 2 2 3 3 3]
# [1 1 1 2 2 2 3 3 3]
# [4 4 4 5 5 5 6 6 6]
# [4 4 4 5 5 5 6 6 6]
# [4 4 4 5 5 5 6 6 6]
# [7 7 7 8 8 8 9 9 9]
# [7 7 7 8 8 8 9 9 9]
# [7 7 7 8 8 8 9 9 9]]
For fun (the nested repeat will be more efficient), you could use einsum on the input array and an array of ones with extra dimensions. This creates a 4D array whose dimensions are already in the right order to reshape to the expected 2D shape:
np.einsum('ij,ikjl->ikjl', array, np.ones((3,3,3,3), dtype=array.dtype)).reshape(9,9)
The generic method being:
i,j = array.shape
k = 3 # extra rows
l = 3 # extra cols
np.einsum('ij,ikjl->ikjl', array, np.ones((i,k,j,l), dtype=array.dtype)).reshape(i*k,j*l)
Output:
array([[1, 1, 1, 2, 2, 2, 3, 3, 3],
[1, 1, 1, 2, 2, 2, 3, 3, 3],
[1, 1, 1, 2, 2, 2, 3, 3, 3],
[4, 4, 4, 5, 5, 5, 6, 6, 6],
[4, 4, 4, 5, 5, 5, 6, 6, 6],
[4, 4, 4, 5, 5, 5, 6, 6, 6],
[7, 7, 7, 8, 8, 8, 9, 9, 9],
[7, 7, 7, 8, 8, 8, 9, 9, 9],
[7, 7, 7, 8, 8, 8, 9, 9, 9]])
What is nice about this method, however, is that it is easy to change the index order to obtain other patterns or to work with higher dimensions.
Example with other patterns:
>>> np.einsum('ij,iklj->iklj', array, np.ones((3,3,3,3), dtype=array.dtype)).reshape(9,9)
array([[1, 2, 3, 1, 2, 3, 1, 2, 3],
[1, 2, 3, 1, 2, 3, 1, 2, 3],
[1, 2, 3, 1, 2, 3, 1, 2, 3],
[4, 5, 6, 4, 5, 6, 4, 5, 6],
[4, 5, 6, 4, 5, 6, 4, 5, 6],
[4, 5, 6, 4, 5, 6, 4, 5, 6],
[7, 8, 9, 7, 8, 9, 7, 8, 9],
[7, 8, 9, 7, 8, 9, 7, 8, 9],
[7, 8, 9, 7, 8, 9, 7, 8, 9]])
>>> np.einsum('ij,kjil->kjil', array, np.ones((3,3,3,3), dtype=array.dtype)).reshape(9,9)
array([[1, 1, 1, 4, 4, 4, 7, 7, 7],
[2, 2, 2, 5, 5, 5, 8, 8, 8],
[3, 3, 3, 6, 6, 6, 9, 9, 9],
[1, 1, 1, 4, 4, 4, 7, 7, 7],
[2, 2, 2, 5, 5, 5, 8, 8, 8],
[3, 3, 3, 6, 6, 6, 9, 9, 9],
[1, 1, 1, 4, 4, 4, 7, 7, 7],
[2, 2, 2, 5, 5, 5, 8, 8, 8],
[3, 3, 3, 6, 6, 6, 9, 9, 9]])
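The einsum call above multiplies by ones only to spread each value over the extra axes. As a sketch of a different technique (plain broadcasting rather than einsum), the same 4D layout can be built with `np.broadcast_to`, which keeps the input dtype without any multiplication. The variable names below are illustrative:

```python
import numpy as np

array = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
i, j = array.shape
k, l = 3, 3  # repeats per row and per column

# Add two length-1 axes, broadcast them to k and l,
# then collapse (i, k) and (j, l) into the two output axes.
blocks = np.broadcast_to(array[:, None, :, None], (i, k, j, l))
result = blocks.reshape(i * k, j * l)
print(result)
```

Changing which positions receive the `None` axes plays the same role as reordering the einsum indices.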

Expand Array w/ duplicates (numpy, python)

If I have a 2 x 2 array like this:
1 2
3 4
and I want to double it into a 4 x 4 array like this:
1 1 2 2
1 1 2 2
3 3 4 4
3 3 4 4
or triple it into a 6 x 6 array like this:
1 1 1 2 2 2
1 1 1 2 2 2
1 1 1 2 2 2
3 3 3 4 4 4
3 3 3 4 4 4
3 3 3 4 4 4
etc etc... how would I go about doing that?
Not sure if it's the best solution, but it definitely works.
(You could use just one function as well; I split it in two to make it easier to read.)
matrix = [[1, 2], [3, 4]]
def expand_array(row, multiplier):
    # repeat every element of one row `multiplier` times
    return [x for x in row for _ in range(multiplier)]
def expand_matrix(rows, multiplier):
    # expand each row, then repeat the expanded row `multiplier` times
    return [expand_array(x, multiplier) for x in rows for _ in range(multiplier)]
print(matrix)
print(expand_matrix(matrix, 1))
print(expand_matrix(matrix, 2))
print(expand_matrix(matrix, 3))
print(expand_matrix(matrix, 4))
"""
[[1, 2], [3, 4]]
[[1, 2], [3, 4]]
[[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
[[1, 1, 1, 2, 2, 2], [1, 1, 1, 2, 2, 2], [1, 1, 1, 2, 2, 2], [3, 3, 3, 4, 4, 4], [3, 3, 3, 4, 4, 4], [3, 3, 3, 4, 4, 4]]
[[1, 1, 1, 1, 2, 2, 2, 2], [1, 1, 1, 1, 2, 2, 2, 2], [1, 1, 1, 1, 2, 2, 2, 2], [1, 1, 1, 1, 2, 2, 2, 2], [3, 3, 3, 3, 4, 4, 4, 4], [3, 3, 3, 3, 4, 4, 4, 4], [3, 3, 3, 3, 4, 4, 4, 4], [3, 3, 3, 3, 4, 4, 4, 4]]
"""
You can use np.repeat:
import numpy as np
a = [[1,2],
     [3,4]]
dim_expand = 2 # double
b = np.repeat(a, dim_expand, axis=0).repeat(dim_expand, axis=1)
print(b)
"""
[[1 1 2 2]
[1 1 2 2]
[3 3 4 4]
[3 3 4 4]]
"""
dim_expand = 3 # triple
b = np.repeat(a, dim_expand, axis=0).repeat(dim_expand, axis=1)
print(b)
"""
[[1 1 1 2 2 2]
[1 1 1 2 2 2]
[1 1 1 2 2 2]
[3 3 3 4 4 4]
[3 3 3 4 4 4]
[3 3 3 4 4 4]]
"""

Equal Less and Greater List python

I'm trying to figure out how to compare each number with the previous one, all the way to the last element.
This is the list:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7]
For each increasing run I need its highest number (e.g. for the first run it's 10).
After a run ends, the count starts again from the beginning (1, 2, 3, 4, etc.) until a condition is reached.
The problem is that I get the correct result for every run except the very last one, whose maximum should be 7 (as you can see: 1, 2, 3, 4, 5, 6, 7),
but the algorithm skips it. I tried the zip function and a loop over an iterator, with the same issue.
Example code that yields the same result:
def printElements(arr, n):
    # Traverse array from index 1 to n-2
    # and check for the given condition
    for i in range(1, n - 1, 1):
        if (arr[i] > arr[i - 1] and
                arr[i] > arr[i + 1]):
            print(arr[i], end = " ")
# Driver Code
if __name__ == '__main__':
    arr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7]
    n = len(arr)
    printElements(arr, n)
arr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7]
for prev, current in zip(arr, arr[1:]):
    print(prev,current)
    if prev > current:
        x = prev
        print(prev,'prev greater')
        print(current,'current')
Output of the last snippet:
2 3
3 4
4 5
5 6
6 7
7 8
8 9
9 10
10 1
10 prev greater
1 current
1 2
2 3
3 4
4 5
5 6
6 1
6 prev greater
1 current
1 2
2 3
3 4
4 5
5 6
6 7
7 1
7 prev greater
1 current
1 2
2 3
3 1
3 prev greater
1 current
1 2
2 3
3 4
4 5
5 1
5 prev greater
1 current
1 2
2 3
3 4
4 1
4 prev greater
1 current
1 2
2 3
3 4
4 5
5 6
6 7
7 8
8 1
8 prev greater
1 current
1 2
2 3
3 4
4 5
5 6
6 7
The last element is skipped because it has no right-hand neighbour to compare against. Append a sentinel that is smaller than every value, arr.append(float('-inf')), before running the loop:
def printElements(arr, n):
    # Traverse array from index 1 to n-2
    # and check for the given condition
    for i in range(1, n - 1, 1):
        if (arr[i] > arr[i - 1] and
                arr[i] > arr[i + 1]):
            print(arr[i], end = " ")
# Driver Code
if __name__ == '__main__':
    arr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7]
    arr.append(float('-inf'))  # sentinel so the final 7 is followed by a smaller value
    n = len(arr)
    printElements(arr, n)
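The same sentinel idea can be applied to the zip loop from the question. This is a sketch rather than the asker's code: it pairs each value with its successor and pads the shifted copy with `float('-inf')` so the last value gets a partner too. The name `peaks` is illustrative:

```python
arr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 7,
       1, 2, 3, 1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6, 7, 8,
       1, 2, 3, 4, 5, 6, 7]

sentinel = float('-inf')  # smaller than any value in arr

# A value ends a run when the value after it is smaller.
peaks = [prev for prev, current in zip(arr, arr[1:] + [sentinel])
         if prev > current]
print(peaks)  # [10, 6, 7, 3, 5, 4, 8, 7]
```

Unlike appending to `arr`, this leaves the original list unchanged.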
You can use list comprehension to get the maximum value for each sequence.
lst = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7]
maxvals = [lst[x] for x in range(len(lst)) if x == len(lst)-1 or lst[x] > lst[x+1]]
print(maxvals)
Output
[10, 6, 7, 3, 5, 4, 8, 7]
I don't see any way to use zip to find the solution.
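If the data is already in NumPy, the end-of-run test from the list comprehension can be written without an explicit loop. A minimal sketch, assuming the list fits comfortably in an array; `is_run_end` is an illustrative name:

```python
import numpy as np

lst = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6,
                1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 1, 2, 3, 4, 5, 1, 2, 3, 4,
                1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7])

# np.diff(lst)[n] is lst[n+1] - lst[n]; a negative step means a run ends at n.
# The last element always ends a run, so append True for it.
is_run_end = np.append(np.diff(lst) < 0, True)
maxvals = lst[is_run_end]
print(maxvals.tolist())  # [10, 6, 7, 3, 5, 4, 8, 7]
```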

R sequence function in Python

pandas version: 1.2
I am trying to take a pandas DataFrame column and reproduce the logic of R's sequence function, which in R would be
ss=sequence(df$los)
Which produces for the first two records
[1] 1 2 3 4 5 1 2 3 4 5
Example dataframe:
df = pd.DataFrame([('test', 5), ('t2', 5), ('t3', 2), ('t4', 6)],
columns=['first', 'los'])
df
first los
0 test 5
1 t2 5
2 t3 2
3 t4 6
So the first row is sequenced 1-5, the second row is sequenced 1-5, the third row is sequenced 1-2, and so on. In R this becomes one flat sequence. I would like the same in Python.
What I have been able to do so far is:
ss = df['los']
ss.apply(lambda x: np.array(range(1, x + 1)))
18 [1, 2, 3, 4, 5]
90 [1, 2, 3, 4, 5]
105 [1,2]
106 [1, 2, 3, 4, 5, 6]
Which is close but then I need to combine it into a single pd.Series so that it should be:
[1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 1, 2, 3, 4, 5, 6]
Use explode():
df.los.apply(lambda x: np.arange(1, x+1)).explode().tolist()
Output:
[1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 1, 2, 3, 4, 5, 6]
Note - you can skip the ss assignment step, and use np.arange to streamline a bit.
You can just use concatenate:
np.concatenate([np.arange(x)+1 for x in df['los']])
Output:
array([1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 1, 2, 3, 4, 5, 6])
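Both answers above build one small array per row. For long columns, a loop-free variant is possible; this is a sketch of a different technique (a global counter minus each run's start offset), assuming `los` holds non-negative integers:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([('test', 5), ('t2', 5), ('t3', 2), ('t4', 6)],
                  columns=['first', 'los'])

los = df['los'].to_numpy()
starts = np.cumsum(los) - los   # position where each run begins: [0, 5, 10, 12]

# Count 0..total-1, subtract each element's run start, shift to start at 1.
seq = np.arange(los.sum()) - np.repeat(starts, los) + 1
print(seq.tolist())
# [1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 1, 2, 3, 4, 5, 6]
```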

Add numpy array as column to Pandas data frame

I have a Pandas data frame object of shape (X,Y) that looks like this:
[[1, 2, 3],
[4, 5, 6],
[7, 8, 9]]
and a SciPy sparse matrix (CSC) of shape (X,Z) that looks something like this
[[0, 1, 0],
[0, 0, 1],
[1, 0, 0]]
How can I add the content from the matrix to the data frame in a new named column such that the data frame will end up like this:
[[1, 2, 3, [0, 1, 0]],
[4, 5, 6, [0, 0, 1]],
[7, 8, 9, [1, 0, 0]]]
Notice the data frame now has shape (X, Y+1) and rows from the matrix are elements in the data frame.
import numpy as np
import pandas as pd
import scipy.sparse as sparse
df = pd.DataFrame(np.arange(1,10).reshape(3,3))
arr = sparse.coo_matrix(([1,1,1], ([0,1,2], [1,2,0])), shape=(3,3))
df['newcol'] = arr.toarray().tolist()
print(df)
yields
0 1 2 newcol
0 1 2 3 [0, 1, 0]
1 4 5 6 [0, 0, 1]
2 7 8 9 [1, 0, 0]
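Consider using a higher-dimensional data structure (a Panel) rather than storing an array in your column. Note that pd.Panel has since been deprecated and removed from pandas, so this only runs on older versions: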
In [11]: p = pd.Panel({'df': df, 'csc': csc})
In [12]: p.df
Out[12]:
0 1 2
0 1 2 3
1 4 5 6
2 7 8 9
In [13]: p.csc
Out[13]:
0 1 2
0 0 1 0
1 0 0 1
2 1 0 0
Look at cross-sections etc, etc, etc.
In [14]: p.xs(0)
Out[14]:
csc df
0 0 1
1 1 2
2 0 3
See the docs for more on Panels.
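On pandas versions without `pd.Panel`, a similar layout can be sketched with a DataFrame whose columns form a MultiIndex. This is a substitute technique, not the Panel API; `csc_dense` is a hypothetical name for a dense copy of the sparse matrix, built here directly with NumPy:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(1, 10).reshape(3, 3))
# Dense stand-in for the sparse matrix from the question (hypothetical name).
csc_dense = pd.DataFrame(np.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]]))

# The dict keys become the outer level of the column MultiIndex.
combined = pd.concat({'df': df, 'csc': csc_dense}, axis=1)

print(combined['csc'])                  # the matrix part, like p.csc
row0 = combined.loc[0].unstack(level=0)
print(row0)                             # cross-section for row 0, like p.xs(0)
```

Each cell stays a scalar, so ordinary column selection and arithmetic keep working, which is not the case when whole arrays are stored in a single column.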
df = pd.DataFrame(np.arange(1,10).reshape(3,3))
# pd.Series needs 1-dimensional data, so pass a list with one row array per cell
df['newcol'] = pd.Series(list(your_2d_numpy_array))
You can add and retrieve a numpy array from dataframe using this:
import numpy as np
import pandas as pd
df = pd.DataFrame({'b':range(10)}) # target dataframe
a = np.random.normal(size=(10,2)) # numpy array
df['a']=a.tolist() # save array
np.array(df['a'].tolist()) # retrieve array
This builds on the previous answer, which confused me because of the sparse part. It works well for a non-sparse numpy array.
Here is another example:
import numpy as np
import pandas as pd
""" This just creates a list of touples, and each element of the touple is an array"""
a = [ (np.random.randint(1,10,10), np.array([0,1,2,3,4,5,6,7,8,9])) for i in
range(0,10) ]
""" Panda DataFrame will allocate each of the arrays , contained as a touple
element , as column"""
df = pd.DataFrame(data =a,columns=['random_num','sequential_num'])
The secret in general is to allocate the data in the form a = [ (array_11, array_12,...,array_1n),...,(array_m1,array_m2,...,array_mn) ] and panda DataFrame will order the data in n columns of arrays. Of course , arrays of arrays could be used instead of touples, in that case the form would be :
a = [ [array_11, array_12,...,array_1n],...,[array_m1,array_m2,...,array_mn] ]
This is the output if you print(df) from the code above:
random_num sequential_num
0 [7, 9, 2, 2, 5, 3, 5, 3, 1, 4] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
1 [8, 7, 9, 8, 1, 2, 2, 6, 6, 3] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
2 [3, 4, 1, 2, 2, 1, 4, 2, 6, 1] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
3 [3, 1, 1, 1, 6, 2, 8, 6, 7, 9] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
4 [4, 2, 8, 5, 4, 1, 2, 2, 3, 3] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
5 [3, 2, 7, 4, 1, 5, 1, 4, 6, 3] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
6 [5, 7, 3, 9, 7, 8, 4, 1, 3, 1] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
7 [7, 4, 7, 6, 2, 6, 3, 2, 5, 6] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
8 [3, 1, 6, 3, 2, 1, 5, 2, 2, 9] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
9 [7, 2, 3, 9, 5, 5, 8, 6, 9, 8] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Another variation of the example above:
b = [ (i,"text",[14, 5,], np.array([0,1,2,3,4,5,6,7,8,9])) for i in
range(0,10) ]
df = pd.DataFrame(data=b,columns=['Number','Text','2Elemnt_array','10Element_array'])
Output of df:
Number Text 2Elemnt_array 10Element_array
0 0 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
1 1 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
2 2 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
3 3 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
4 4 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
5 5 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
6 6 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
7 7 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
8 8 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
9 9 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
If you want to add other columns of arrays, then:
df['3Element_array']=[([1,2,3]),([1,2,3]),([1,2,3]),([1,2,3]),([1,2,3]),([1,2,3]),([1,2,3]),([1,2,3]),([1,2,3]),([1,2,3])]
The final output of df will be:
Number Text 2Elemnt_array 10Element_array 3Element_array
0 0 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] [1, 2, 3]
1 1 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] [1, 2, 3]
2 2 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] [1, 2, 3]
3 3 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] [1, 2, 3]
4 4 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] [1, 2, 3]
5 5 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] [1, 2, 3]
6 6 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] [1, 2, 3]
7 7 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] [1, 2, 3]
8 8 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] [1, 2, 3]
9 9 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] [1, 2, 3]
