Say I have this array:
array = np.array([[1,2,3],[4,5,6],[7,8,9]])
Returns:
123
456
789
How should I go about getting it to return something like this?
111222333
111222333
111222333
444555666
444555666
444555666
777888999
777888999
777888999
You'd have to use np.repeat twice here.
np.repeat(np.repeat(array, 3, axis=1), 3, axis=0)
# [[1 1 1 2 2 2 3 3 3]
# [1 1 1 2 2 2 3 3 3]
# [1 1 1 2 2 2 3 3 3]
# [4 4 4 5 5 5 6 6 6]
# [4 4 4 5 5 5 6 6 6]
# [4 4 4 5 5 5 6 6 6]
# [7 7 7 8 8 8 9 9 9]
# [7 7 7 8 8 8 9 9 9]
# [7 7 7 8 8 8 9 9 9]]
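If the row and column factors need to differ, the nested repeat could be wrapped in a small helper. This is only a sketch; the name `repeat_blocks` and its parameters are made up for illustration:

```python
import numpy as np

def repeat_blocks(arr, rows, cols):
    """Repeat every element `rows` times vertically and `cols` times horizontally."""
    # Repeat along the columns first, then along the rows.
    return np.repeat(np.repeat(arr, cols, axis=1), rows, axis=0)

array = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(repeat_blocks(array, 2, 3).shape)  # (6, 9)
```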
For fun (the nested repeat will be more efficient), you could use einsum with the input array and an array of ones that has two extra dimensions. This builds a 4D array whose axes are already in the right order to reshape into the expected 2D shape:
np.einsum('ij,ikjl->ikjl', array, np.ones((3,3,3,3), dtype=int)).reshape(9,9)
The generic method being:
i,j = array.shape
k = 3 # extra rows
l = 3 # extra cols
np.einsum('ij,ikjl->ikjl', array, np.ones((i,k,j,l), dtype=int)).reshape(i*k,j*l)
Output:
array([[1, 1, 1, 2, 2, 2, 3, 3, 3],
[1, 1, 1, 2, 2, 2, 3, 3, 3],
[1, 1, 1, 2, 2, 2, 3, 3, 3],
[4, 4, 4, 5, 5, 5, 6, 6, 6],
[4, 4, 4, 5, 5, 5, 6, 6, 6],
[4, 4, 4, 5, 5, 5, 6, 6, 6],
[7, 7, 7, 8, 8, 8, 9, 9, 9],
[7, 7, 7, 8, 8, 8, 9, 9, 9],
[7, 7, 7, 8, 8, 8, 9, 9, 9]])
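The array of ones is only there to supply the extra axes, so the same 4D layout can probably be obtained through broadcasting instead. A sketch, assuming `np.broadcast_to` applied to a view with two inserted axes (variable names mirror the generic method above):

```python
import numpy as np

array = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
i, j = array.shape
k = 3  # extra rows
l = 3  # extra cols

# Insert a length-1 axis after each original axis, then stretch them to k and l.
view = np.broadcast_to(array[:, None, :, None], (i, k, j, l))

# The broadcast view is not contiguous, so reshape returns a fresh copy.
result = view.reshape(i * k, j * l)
print(result)
```

This keeps the input dtype and avoids allocating the array of ones.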
What is nice about this method, however, is that it's quite easy to change the order of the subscripts to obtain other patterns, or to work with higher dimensions.
Example with other patterns:
>>> np.einsum('ij,iklj->iklj', array, np.ones((3,3,3,3), dtype=int)).reshape(9,9)
array([[1, 2, 3, 1, 2, 3, 1, 2, 3],
[1, 2, 3, 1, 2, 3, 1, 2, 3],
[1, 2, 3, 1, 2, 3, 1, 2, 3],
[4, 5, 6, 4, 5, 6, 4, 5, 6],
[4, 5, 6, 4, 5, 6, 4, 5, 6],
[4, 5, 6, 4, 5, 6, 4, 5, 6],
[7, 8, 9, 7, 8, 9, 7, 8, 9],
[7, 8, 9, 7, 8, 9, 7, 8, 9],
[7, 8, 9, 7, 8, 9, 7, 8, 9]])
>>> np.einsum('ij,kjil->kjil', array, np.ones((3,3,3,3), dtype=int)).reshape(9,9)
array([[1, 1, 1, 4, 4, 4, 7, 7, 7],
[2, 2, 2, 5, 5, 5, 8, 8, 8],
[3, 3, 3, 6, 6, 6, 9, 9, 9],
[1, 1, 1, 4, 4, 4, 7, 7, 7],
[2, 2, 2, 5, 5, 5, 8, 8, 8],
[3, 3, 3, 6, 6, 6, 9, 9, 9],
[1, 1, 1, 4, 4, 4, 7, 7, 7],
[2, 2, 2, 5, 5, 5, 8, 8, 8],
[3, 3, 3, 6, 6, 6, 9, 9, 9]])
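For comparison, these two patterns also seem to be reachable with `np.repeat` and `np.tile`. This is a sketch of one possible equivalent, not the method used above; the names `pattern_1` and `pattern_2` are illustrative:

```python
import numpy as np

array = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# First pattern: each row repeated 3 times, then the block tiled 3 times sideways.
pattern_1 = np.tile(np.repeat(array, 3, axis=0), (1, 3))

# Second pattern: transposed input, each column repeated 3 times,
# then the block tiled 3 times downwards.
pattern_2 = np.tile(np.repeat(array.T, 3, axis=1), (3, 1))

print(pattern_1[0])  # [1 2 3 1 2 3 1 2 3]
print(pattern_2[1])  # [2 2 2 5 5 5 8 8 8]
```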
If I have a 2 x 2 array like this:
1 2
3 4
and I want to double it into a 4 x 4 array like this:
1 1 2 2
1 1 2 2
3 3 4 4
3 3 4 4
or triple it into a 6 x 6 array like this:
1 1 1 2 2 2
1 1 1 2 2 2
1 1 1 2 2 2
3 3 3 4 4 4
3 3 3 4 4 4
3 3 3 4 4 4
and so on. How would I go about doing that?
Not sure if it's the best solution, but it definitely works =p
(You could use just one function as well; I split it in two to make it easier to read.)
matrix = [[1, 2], [3, 4]]
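def expand_array(input, multiplier):
    # repeat every element of one row `multiplier` times
    return [x for x in input for _ in range(multiplier)]
def expand_matrix(input, multiplier):
    # expand every row, and emit each expanded row `multiplier` times
    return [expand_array(x, multiplier) for x in input for _ in range(multiplier)]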
print(matrix)
print(expand_matrix(matrix, 1))
print(expand_matrix(matrix, 2))
print(expand_matrix(matrix, 3))
print(expand_matrix(matrix, 4))
"""
[[1, 2], [3, 4]]
[[1, 2], [3, 4]]
[[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
[[1, 1, 1, 2, 2, 2], [1, 1, 1, 2, 2, 2], [1, 1, 1, 2, 2, 2], [3, 3, 3, 4, 4, 4], [3, 3, 3, 4, 4, 4], [3, 3, 3, 4, 4, 4]]
[[1, 1, 1, 1, 2, 2, 2, 2], [1, 1, 1, 1, 2, 2, 2, 2], [1, 1, 1, 1, 2, 2, 2, 2], [1, 1, 1, 1, 2, 2, 2, 2], [3, 3, 3, 3, 4, 4, 4, 4], [3, 3, 3, 3, 4, 4, 4, 4], [3, 3, 3, 3, 4, 4, 4, 4], [3, 3, 3, 3, 4, 4, 4, 4]]
"""
You can use np.repeat:
import numpy as np
a = [[1,2],
     [3,4]]
dim_expand = 2 # double
b = np.repeat(a, dim_expand, axis=0).repeat(dim_expand, axis=1)
print(b)
"""
[[1 1 2 2]
[1 1 2 2]
[3 3 4 4]
[3 3 4 4]]
"""
dim_expand = 3 # triple
b = np.repeat(a, dim_expand, axis=0).repeat(dim_expand, axis=1)
print(b)
"""
[[1 1 1 2 2 2]
[1 1 1 2 2 2]
[1 1 1 2 2 2]
[3 3 3 4 4 4]
[3 3 3 4 4 4]
[3 3 3 4 4 4]]
"""
Hi guys, I'm trying to figure out how to compare each number with the previous one, all the way through to the last element.
This is the list:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7]
For each increasing sequence I need the highest number (e.g. in the first one it's 10).
After a sequence ends, the count starts again from the beginning (1, 2, 3, 4, etc.) until a condition is reached.
The problem is that I get the correct result for every sequence except the very last one, where the max should be 7 (as you can see: 1,2,3,4,5,6,7),
but the algorithm skips it. I tried the zip function and also a loop over an iterator, with the same issue.
Example code that yields the same result:
def printElements(arr, n):
    # Traverse array from index 1 to n-2
    # and check for the given condition
    for i in range(1, n - 1, 1):
        if (arr[i] > arr[i - 1] and
                arr[i] > arr[i + 1]):
            print(arr[i], end = " ")
# Driver Code
if __name__ == '__main__':
    arr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7]
    n = len(arr)
    printElements(arr, n)
    # print(count_shelf)  # count_shelf is not defined anywhere in this snippet
arr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7]
for prev, current in zip(arr, arr[1:]):
    print(prev,current)
    if prev > current:
        x = prev
        print(prev,'prev greater')
        print(current,'current')
Results of the last algorithm:
2 3
3 4
4 5
5 6
6 7
7 8
8 9
9 10
10 1
10 prev greater
1 current
1 2
2 3
3 4
4 5
5 6
6 1
6 prev greater
1 current
1 2
2 3
3 4
4 5
5 6
6 7
7 1
7 prev greater
1 current
1 2
2 3
3 1
3 prev greater
1 current
1 2
2 3
3 4
4 5
5 1
5 prev greater
1 current
1 2
2 3
3 4
4 1
4 prev greater
1 current
1 2
2 3
3 4
4 5
5 6
6 7
7 8
8 1
8 prev greater
1 current
1 2
2 3
3 4
4 5
5 6
6 7
arr.append(float('-inf'))
def printElements(arr, n):
    # Traverse array from index 1 to n-2
    # and check for the given condition
    for i in range(1, n - 1, 1):
        if (arr[i] > arr[i - 1] and
                arr[i] > arr[i + 1]):
            print(arr[i], end = " ")
# Driver Code
if __name__ == '__main__':
    arr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7]
    arr.append(float('-inf'))
    n = len(arr)
    printElements(arr, n)
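If mutating the caller's list is not desirable, the same sentinel idea could be applied to a padded copy and the peaks returned instead of printed. A minimal sketch; the function name `local_peaks` is an assumption made for this example:

```python
def local_peaks(values):
    """Return the elements that are larger than both of their neighbours."""
    # Work on a padded copy so the caller's list is left untouched.
    padded = [*values, float("-inf")]
    return [
        padded[i]
        for i in range(1, len(padded) - 1)
        if padded[i - 1] < padded[i] > padded[i + 1]
    ]

arr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 7,
       1, 2, 3, 1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6, 7, 8,
       1, 2, 3, 4, 5, 6, 7]
print(local_peaks(arr))  # [10, 6, 7, 3, 5, 4, 8, 7]
```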
You can use list comprehension to get the maximum value for each sequence.
lst = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7]
maxvals = [lst[x] for x in range(len(lst)) if x == len(lst)-1 or lst[x] > lst[x+1]]
print(maxvals)
Output
[10, 6, 7, 3, 5, 4, 8, 7]
I don't see a way to get this from zip(arr, arr[1:]) alone, because the last element is never paired with a successor.
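That said, a zip-style pairing may still be workable if the shifted copy is padded so that the last element also gets a partner. A sketch using `itertools.zip_longest`; the function name `run_maxima` is chosen only for illustration:

```python
from itertools import zip_longest

def run_maxima(values):
    """Return every element that is followed by a smaller one (or by nothing)."""
    # zip_longest pads the shorter, shifted copy with -inf,
    # so the final element is compared against -inf and is always kept.
    pairs = zip_longest(values, values[1:], fillvalue=float("-inf"))
    return [current for current, following in pairs if current > following]

lst = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 7,
       1, 2, 3, 1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6, 7, 8,
       1, 2, 3, 4, 5, 6, 7]
print(run_maxima(lst))  # [10, 6, 7, 3, 5, 4, 8, 7]
```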
pandas version: 1.2
I am trying to take a pandas DataFrame column and reproduce the logic that in R would be
ss=sequence(df$los)
Which produces for the first two records
[1] 1 2 3 4 5 1 2 3 4 5
Example dataframe:
df = pd.DataFrame([('test', 5), ('t2', 5), ('t3', 2), ('t4', 6)],
columns=['first', 'los'])
df
first los
0 test 5
1 t2 5
2 t3 2
3 t4 6
So the first row is sequenced 1-5, the second row is sequenced 1-5, the third row is sequenced 1-2, and so on. In R this becomes one sequenced list. I would like to do the same in Python.
What I have been able to do is:
ss = df['los']
ss.apply(lambda x: np.array(range(1, x + 1)))
18 [1, 2, 3, 4, 5]
90 [1, 2, 3, 4, 5]
105 [1,2]
106 [1, 2, 3, 4, 5, 6]
This is close, but I then need to combine it into a single pd.Series, which should be:
[1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 1, 2, 3, 4, 5, 6]
Use explode():
df.los.apply(lambda x: np.arange(1, x+1)).explode().tolist()
Output:
[1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 1, 2, 3, 4, 5, 6]
Note - you can skip the ss assignment step, and use np.arange to streamline a bit.
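Since the question asks for a single pd.Series rather than a list, the chain could plausibly end with a dtype cast and an index reset instead of tolist(). A sketch under that assumption, using the example dataframe from the question:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([('test', 5), ('t2', 5), ('t3', 2), ('t4', 6)],
                  columns=['first', 'los'])

ss = (
    df['los']
    .apply(lambda n: np.arange(1, n + 1))
    .explode()                # one row per element; object dtype, repeated index
    .astype(int)              # back to an integer dtype
    .reset_index(drop=True)   # replace the repeated index with 0..n-1
)
print(ss.tolist())
```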
You can just use concatenate:
np.concatenate([np.arange(x)+1 for x in df['los']])
Output:
array([1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 1, 2, 3, 4, 5, 6])
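For long columns, the Python-level loop can likely be avoided altogether. The following is a sketch of a vectorised stand-in for R's sequence(), built from cumsum and repeat; the function name `sequence` is borrowed from R and is not a NumPy or pandas API:

```python
import numpy as np
import pandas as pd

def sequence(lengths):
    """Concatenate 1..n for every n in `lengths` without a Python loop."""
    lengths = np.asarray(lengths)
    # Position in the output at which each run starts.
    starts = np.cumsum(lengths) - lengths
    # Global position minus the start of the run it belongs to, shifted to 1-based.
    return np.arange(lengths.sum()) - np.repeat(starts, lengths) + 1

ss = pd.Series(sequence([5, 5, 2, 6]))
print(ss.tolist())
```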
I have a Pandas data frame object of shape (X,Y) that looks like this:
[[1, 2, 3],
[4, 5, 6],
[7, 8, 9]]
and a SciPy sparse matrix (CSC) of shape (X,Z) that looks something like this
[[0, 1, 0],
[0, 0, 1],
[1, 0, 0]]
How can I add the content from the matrix to the data frame in a new named column such that the data frame will end up like this:
[[1, 2, 3, [0, 1, 0]],
[4, 5, 6, [0, 0, 1]],
[7, 8, 9, [1, 0, 0]]]
Notice the data frame now has shape (X, Y+1) and rows from the matrix are elements in the data frame.
import numpy as np
import pandas as pd
import scipy.sparse as sparse
df = pd.DataFrame(np.arange(1,10).reshape(3,3))
arr = sparse.coo_matrix(([1,1,1], ([0,1,2], [1,2,0])), shape=(3,3))
df['newcol'] = arr.toarray().tolist()
print(df)
yields
0 1 2 newcol
0 1 2 3 [0, 1, 0]
1 4 5 6 [0, 0, 1]
2 7 8 9 [1, 0, 0]
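Consider using a higher-dimensional data structure (a Panel) rather than storing an array in your column. Note that pd.Panel was deprecated in pandas 0.20 and removed in 0.25, so the session below only runs on older versions; on current pandas, a DataFrame with MultiIndex columns or the xarray package is the usual substitute: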
In [11]: p = pd.Panel({'df': df, 'csc': csc})
In [12]: p.df
Out[12]:
0 1 2
0 1 2 3
1 4 5 6
2 7 8 9
In [13]: p.csc
Out[13]:
0 1 2
0 0 1 0
1 0 0 1
2 1 0 0
You can then look at cross-sections and so on:
In [14]: p.xs(0)
Out[14]:
csc df
0 0 1
1 1 2
2 0 3
See the docs for more on Panels.
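On pandas versions without Panel, a roughly similar layout can be sketched with MultiIndex columns built by pd.concat. The dense array `csc_dense` below is an assumption standing in for the result of calling toarray() on the sparse matrix:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(1, 10).reshape(3, 3))
# Assumption: a dense copy of the sparse matrix, e.g. obtained via csc.toarray()
csc_dense = pd.DataFrame(np.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]]))

# The outer column level holds the names 'df' and 'csc',
# the inner level holds the original column labels.
combined = pd.concat({'df': df, 'csc': csc_dense}, axis=1)

print(combined['csc'])                    # the whole matrix block
cross = combined.loc[0].unstack(level=0)  # row 0 of both blocks, side by side
print(cross)
```

Selecting `combined['df']` or `combined['csc']` plays the role of `p.df` and `p.csc`, and the unstacked row corresponds to `p.xs(0)`.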
df = pd.DataFrame(np.arange(1,10).reshape(3,3))
df['newcol'] = pd.Series(list(your_2d_numpy_array))  # one 1-D row per cell; a 2-D array cannot be passed to pd.Series directly
You can add and retrieve a numpy array from dataframe using this:
import numpy as np
import pandas as pd
df = pd.DataFrame({'b':range(10)}) # target dataframe
a = np.random.normal(size=(10,2)) # numpy array
df['a']=a.tolist() # save array
np.array(df['a'].tolist()) # retrieve array
This builds on the previous answer, which confused me because of the sparse part; this version works well for a non-sparse numpy array.
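A quick way to convince yourself that the round trip is lossless is to compare the restored array with the original. A small sketch; the seeded generator and the name `restored` are assumptions added for reproducibility:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)        # seeded so the check is repeatable
a = rng.normal(size=(10, 2))          # numpy array

df = pd.DataFrame({'b': range(10)})   # target dataframe
df['a'] = a.tolist()                  # save array: one 2-element list per row

restored = np.array(df['a'].tolist()) # retrieve array
print(restored.shape)                 # (10, 2)
print(np.array_equal(a, restored))    # True
```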
Here is another example:
import numpy as np
import pandas as pd
""" This just creates a list of touples, and each element of the touple is an array"""
a = [ (np.random.randint(1,10,10), np.array([0,1,2,3,4,5,6,7,8,9])) for i in
range(0,10) ]
""" Panda DataFrame will allocate each of the arrays , contained as a touple
element , as column"""
df = pd.DataFrame(data =a,columns=['random_num','sequential_num'])
The secret in general is to arrange the data in the form a = [ (array_11, array_12,...,array_1n),...,(array_m1,array_m2,...,array_mn) ], and the pandas DataFrame will lay the data out in n columns of arrays. Of course, lists of arrays could be used instead of tuples, in which case the form would be:
a = [ [array_11, array_12,...,array_1n],...,[array_m1,array_m2,...,array_mn] ]
This is the output if you print(df) from the code above:
random_num sequential_num
0 [7, 9, 2, 2, 5, 3, 5, 3, 1, 4] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
1 [8, 7, 9, 8, 1, 2, 2, 6, 6, 3] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
2 [3, 4, 1, 2, 2, 1, 4, 2, 6, 1] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
3 [3, 1, 1, 1, 6, 2, 8, 6, 7, 9] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
4 [4, 2, 8, 5, 4, 1, 2, 2, 3, 3] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
5 [3, 2, 7, 4, 1, 5, 1, 4, 6, 3] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
6 [5, 7, 3, 9, 7, 8, 4, 1, 3, 1] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
7 [7, 4, 7, 6, 2, 6, 3, 2, 5, 6] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
8 [3, 1, 6, 3, 2, 1, 5, 2, 2, 9] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
9 [7, 2, 3, 9, 5, 5, 8, 6, 9, 8] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
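The list-of-lists form mentioned above can be sketched the same way. The arrays and column names here are made up for illustration, and `np.stack` is used to turn one column back into a regular 2D array:

```python
import numpy as np
import pandas as pd

# Each inner list is one row; each array becomes a single cell.
rows = [
    [np.array([1, 2, 3]), np.array([10, 20])],
    [np.array([4, 5, 6]), np.array([30, 40])],
]
df = pd.DataFrame(rows, columns=['first_array', 'second_array'])

print(df.dtypes)  # both columns hold Python objects

# Rebuild a 2D array from a column whose cells all have the same length.
stacked = np.stack(df['first_array'].tolist())
print(stacked.shape)  # (2, 3)
```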
Another variation of the example above:
b = [ (i,"text",[14, 5,], np.array([0,1,2,3,4,5,6,7,8,9])) for i in
range(0,10) ]
df = pd.DataFrame(data=b,columns=['Number','Text','2Elemnt_array','10Element_array'])
Output of df:
Number Text 2Elemnt_array 10Element_array
0 0 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
1 1 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
2 2 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
3 3 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
4 4 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
5 5 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
6 6 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
7 7 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
8 8 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
9 9 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
If you want to add other columns of arrays, then:
df['3Element_array']=[([1,2,3]),([1,2,3]),([1,2,3]),([1,2,3]),([1,2,3]),([1,2,3]),([1,2,3]),([1,2,3]),([1,2,3]),([1,2,3])]
The final output of df will be:
Number Text 2Elemnt_array 10Element_array 3Element_array
0 0 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] [1, 2, 3]
1 1 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] [1, 2, 3]
2 2 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] [1, 2, 3]
3 3 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] [1, 2, 3]
4 4 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] [1, 2, 3]
5 5 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] [1, 2, 3]
6 6 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] [1, 2, 3]
7 7 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] [1, 2, 3]
8 8 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] [1, 2, 3]
9 9 text [14, 5] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] [1, 2, 3]
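Typing the same list ten times gets tedious, so the new column could also be generated with a comprehension. A sketch on a cut-down dataframe (only the `Number` column is recreated here):

```python
import pandas as pd

df = pd.DataFrame({'Number': range(10)})

# One fresh list per row. [[1, 2, 3]] * len(df) would instead repeat
# references to a single list.
df['3Element_array'] = [[1, 2, 3] for _ in range(len(df))]
print(df.head(3))
```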