I have to write a function empty_matrix that returns a list of lists (a matrix).
Desired output:
empty_matrix(3,4)
returns a list with three lists, each of length four:
[[None,None,None,None],[None,None,None,None],[None,None,None,None]]
What should I change in my code below?
def empty_matrix(row,column):
first_col = []
for x in range(len(matrix)):
sublist = matrix[x]
for y in range(len(sublist)):
first_col.append(matrix[x][y])
Using a list comprehension (the inner list is one row of length column, and it is repeated row times):
def empty_matrix(row, column):
    return [[None for _ in range(column)] for _ in range(row)]
But to fix your code: you call len on the variables matrix and sublist, which are never defined, and your indentation is off. Try something like this if you don't want to use a list comprehension:
def empty_matrix(row, column):
    matrix = []
    for r in range(row):
        current_row = []  # a new list for every row
        for c in range(column):
            current_row.append(None)
        matrix.append(current_row)
    return matrix
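A related pitfall is worth a short sketch. Assuming the goal is `row` independent lists of length `column`, as in the question, building the outer list with `*` looks equivalent but repeats one and the same inner list. The variable names `aliased` and `independent` are only illustrative.

```python
def empty_matrix(row, column):
    """Return `row` independent lists, each holding `column` None values."""
    return [[None] * column for _ in range(row)]

# Pitfall: multiplying the outer list copies references, not rows.
aliased = [[None] * 4] * 3
aliased[0][0] = "x"        # shows up in every row: all three are one list

independent = empty_matrix(3, 4)
independent[0][0] = "x"    # changes only the first row

print(aliased)
print(independent)
```

Multiplying the inner list is safe here because None is immutable; it is only the outer repetition that has to create a fresh list per row.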
My Python code generates some matrices (one at a time) through a loop over some index called i.
Storing matrices with names like mat_0, mat_1,..., mat_i is straightforward, but I was wondering if it is somehow possible to store the matrices as elements of an iterable, like mat[0], mat[1],...,mat[i]?
Note: The matrices are stored in scipy sparse coo_matrix format.
Edit 1 : The index i does not necessarily follow a proper sequence and may loop over some random numbers like 0,2,3,7,... In that case the matrices have to be stored as mat[0], mat[2], mat[3], mat[7],... and so on.
Edit 2: Minimal working code
import numpy as np
from math import sqrt
from scipy.sparse import coo_matrix, csr_matrix
primesqt = np.array([1.41421356, 1.73205080, 2.23606797, 2.64575131, 3.31662479, 3.60555127, 4.12310562, 4.35889894, 4.79583152, 5.38516480, 5.56776436, 6.08276253, 6.40312423, 6.55743852, 6.85565460, 7.28010988, 7.68114574, 7.81024967, 8.18535277, 8.42614977, 8.54400374, 8.88819441, 9.11043357, 9.43398113, 9.84885780, 10.04987562, 10.14889156, 10.34408043, 10.44030650, 10.63014581, 11.26942766, 11.44552314, 11.70469991, 11.78982612, 12.20655561, 12.28820572, 12.52996408, 12.76714533, 12.92284798, 13.15294643, 13.37908816, 13.45362404, 13.82027496, 13.89244398, 14.03566884, 14.10673597, 14.52583904, 14.93318452, 15.06651917, 15.13274595])
def bg(n, k, min_elem, max_elem):
    allowed = range(max_elem, min_elem-1, -1)
    def helper(n, k, t):
        if k == 0:
            if n == 0:
                yield t
        elif k == 1:
            if n in allowed:
                yield t + (n,)
        elif min_elem * k <= n <= max_elem * k:
            for v in allowed:
                yield from helper(n - v, k - 1, t + (v,))
    return helper(n, k, ())
def BinarySearch(lys, val):
    first = 0
    last = len(lys)-1
    index = -1
    while (first <= last) and (index == -1):
        mid = (first+last)//2
        if lys[mid] == val:
            index = mid
        else:
            if val<lys[mid]:
                last = mid -1
            else:
                first = mid +1
    return index
m = 4
dim = 16
nmax = 1
a = []
for n in range(0,(nmax*m)+1):
    for x in bg(n, m, 0, nmax):
        a.append(x)
T = np.zeros(dim)
for ii in range(dim):
    for jj in range(m):
        T[ii] += primesqt[jj]*float(a[ii][jj])
ind = np.argsort(T)
T = sorted(T)
all_bs = [0,2,3,7] # i_list
# Evaluate 'mat_ee' for each 'ee' given in the list 'all_bs'
for ee in all_bs:
    row = []
    col = []
    val = []
    for ii in range(m):
        for vv in range(dim):
            Tg = 0
            if a[vv][ii]+1 < nmax+1:
                k = np.copy(a[vv])
                elem = sqrt(float(k[ii]+1.0))+ee
                k[ii] = k[ii]+1
                # Generate tag Tg for elem != 0
                for jj in range(m):
                    Tg += float((primesqt[jj])*k[jj])
                # Search location of non-zero element in sorted T
                location = BinarySearch(T, Tg)
                uu = ind[location]
                row.append(uu)
                col.append(vv)
                val.append(elem)
    mat_ee = (coo_matrix((val, (row, col)), shape=(dim, dim)).tocsr()) # To be stored as mat[0], mat[2], mat[3], mat[7]
    print(mat_ee)
A dictionary would allow you to reference an object using an arbitrary (but hashable) key. In your case, you could store the matrix mat_ee in each iteration of the outer loop (for ee in all_bs:) using that ee index as the key:
csr_matrices = {}
for ee in all_bs:
    # your inner loops, all the way to…
    mat_ee = (coo_matrix((val, (row, col)),
                         shape=(dim, dim))
              .tocsr())
    csr_matrices[ee] = mat_ee
From that moment, you can access the elements of the dictionary using the indices you had in all_bs:
print(csr_matrices[2])
and when you inspect the dictionary, you’ll notice it only contains the keys you specified:
print(csr_matrices.keys())
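As a minimal, self-contained sketch of the same idea: the helper `build_matrix` below is a hypothetical stand-in for the question's sparse-matrix construction and returns a small dense NumPy array instead, and the name `mats` is illustrative too. Only the storage and lookup pattern is the point.

```python
import numpy as np

all_bs = [0, 2, 3, 7]   # non-sequential indices, as in the question

def build_matrix(ee, dim=4):
    """Hypothetical stand-in for the question's matrix construction."""
    return np.eye(dim) * ee

# One entry per index; the key is the index itself, not a position.
mats = {ee: build_matrix(ee) for ee in all_bs}

print(mats[7].shape)    # look up by the original index
print(5 in mats)        # indices that were never stored are simply absent

# Iterate in index order, regardless of insertion order.
for ee in sorted(mats):
    print(ee, mats[ee].trace())
```

Compared with a list, this avoids having to fill the unused positions (1, 4, 5, 6) with placeholders.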
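You could use a list of your objects. Note that a list is indexed by position (0, 1, 2, ...), not by the value of ee.
items_list = []
for ee in all_bs:
    # build mat_ee for this ee as in your loop, then:
    items_list.append(mat_ee)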
I'm not familiar with list comprehensions but I would like to compute the bray-curtis dissimilarity using list comprehensions. The dissimilarity is given by
def bray(x):
    bray_diss = np.zeros((x.shape[0], x.shape[0]))
    for i in range(0, bray_diss.shape[0]):
        bray_diss[i,i] = 0
        for j in range(i+1, bray_diss.shape[0]):
            l1_diff = abs(x[i,:] - x[j,:])
            l1_sum = x[i,:] + x[j,:] + 1
            bray_diss[i,j] = l1_diff.sum() / l1_sum.sum()
            bray_diss[j,i] = bray_diss[i,j]
    return bray_diss
I tried something like:
def bray(x):
    [[((abs(x[i,:] - x[j,:])).sum() / (x[i,:] + x[j,:] + 1).sum()) for j in range(0, x.shape[0])] for i in range(0, x.shape[0])]
without success, and I can't figure out what is wrong! Moreover, in the first implementation, the second loop does not run over all the matrix rows, to save computation time. How can I do the same with a list comprehension?
Thanks !
You won't gain anything with a list comprehension... except a better comprehension of list comprehensions!
What you have to understand is that the list comprehension is a functional concept. I will not go into the details of functional programming,
but you have to keep in mind that functional programming forbids side effects. An example:
my_matrix = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        my_matrix[i,j] = value_of_cell(i,j)
The last line is a side effect: you modify the state of my_matrix. In contrast, a side-effect-free version would do:
np.array([[value_of_cell(i,j) for j in range(n)] for i in range(n)])
You don't have the "create-then-assign" sequence: you create the matrix by declaring the values at each position. More precisely, to create a matrix:
you have to declare a value for every cell;
when you are given the pair (i,j), you can't use it to declare the value of another cell (e.g. (j,i))
(If you need to transform the matrix later, you have to recreate it. That's why this method may be expensive -- in time and space.)
Now, take a look at your code. When you write a list comprehension, a good rule of thumb is to use auxiliary functions, as they help to clean up the code (we are not trying to create a one-liner here):
def bray(x):
    n = x.shape[0] # cleaner than to repeat x.shape[0] everywhere
    def diss(i,j): # I hope it's correct
        l1_diff = abs(x[i,:] - x[j,:])
        l1_sum = x[i,:] + x[j,:] + 1
        return l1_diff.sum() / l1_sum.sum()
    bray_diss = np.zeros((n, n))
    for i in range(n): # range(n) = range(0,n)
        # bray_diss[i,i] = 0 <-- you don't need to set it to zero here
        for j in range(i+1, n):
            bray_diss[i,j] = diss(i,j)
            bray_diss[j,i] = bray_diss[i,j]
    return bray_diss
That's cleaner. What is the next step? In the code above, you choose to iterate over j that are greater than i and to set two values at once. But in a list comprehension, you don't choose the cells: the list comprehension gives you, for each cell, the coordinates and you have to declare the values.
First, let's try to set only one value per iteration, that is to use two loops:
def bray(x):
    ...
    bray_diss = np.zeros((n, n))
    for i in range(n):
        for j in range(i+1, n):
            bray_diss[i,j] = diss(i,j)
    for i in range(n):
        for j in range(i):
            bray_diss[i,j] = bray_diss[j,i]
    return bray_diss
That's better. Second, we need to assign a value to every cell of the matrix, not just prefill with zeroes and choose the cells we want to update:
def bray(x):
    ...
    bray_diss = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if j>i: # j in range(i+1, n)
                bray_diss[i,j] = diss(i,j) # top right corner
            else: # j in range(i+1)
                bray_diss[i,j] = 0. # zeroes in the bottom left corner + diagonal
    for i in range(n):
        for j in range(n):
            if j<i: # j in range(i)
                bray_diss[i,j] = bray_diss[j,i] # fill the bottom left corner now
            else: # j in range(i, n)
                bray_diss[i,j] = bray_diss[i,j] # top right corner + diagonal is already ok
    return bray_diss
A shorter version, using Python's conditional expression (its "ternary operator"), would be:
def bray(x):
    ...
    bray_diss = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            bray_diss[i,j] = diss(i,j) if j>i else 0.
    for i in range(n):
        for j in range(n):
            bray_diss[i,j] = bray_diss[j,i] if j<i else bray_diss[i,j]
    return bray_diss
Now we can turn this into list comprehensions:
def bray(x):
    ...
    bray_diss_top_right = np.array([[diss(i,j) if j>i else 0. for j in range(n)] for i in range(n)])
    bray_diss = np.array([[bray_diss_top_right[j,i] if j<i else bray_diss_top_right[i,j] for j in range(n)] for i in range(n)])
    return bray_diss
And, if I'm not wrong, it is even simpler like this (final version):
def bray(x):
    n = x.shape[0]
    def diss(i,j):
        l1_diff = abs(x[i,:] - x[j,:])
        l1_sum = x[i,:] + x[j,:] + 1
        return l1_diff.sum() / l1_sum.sum()
    bray_diss_top_right = np.array([[diss(i,j) if j>i else 0. for j in range(n)] for i in range(n)])
    return bray_diss_top_right + bray_diss_top_right.transpose()
Note that this version is probably (I didn't measure) slower than yours, but the way the matrix is built is, in my opinion, easier to grasp.
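If speed matters more than practising comprehensions, NumPy broadcasting is a different technique that can compute the same quantity without any Python-level loop. The sketch below is not from the answer above: the names `bray_loops` and `bray_broadcast` are illustrative, and it assumes `x` is a 2-D array with one sample per row. It keeps the `+ 1` in the denominator exactly as in the question, and checks the result against the double loop.

```python
import numpy as np

def bray_loops(x):
    """Reference implementation: the double loop from the question."""
    n = x.shape[0]
    out = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            num = np.abs(x[i] - x[j]).sum()
            den = (x[i] + x[j] + 1).sum()
            out[i, j] = out[j, i] = num / den
    return out

def bray_broadcast(x):
    """Same quantity via broadcasting; builds (n, n, d) temporaries."""
    diff = np.abs(x[:, None, :] - x[None, :, :]).sum(axis=2)
    total = (x[:, None, :] + x[None, :, :] + 1).sum(axis=2)
    return diff / total

rng = np.random.default_rng(0)
x = rng.random((5, 3))
print(np.allclose(bray_loops(x), bray_broadcast(x)))
```

The trade-off is memory: the intermediate arrays have shape (n, n, d), so this only suits inputs where that fits comfortably.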
I see there are array_split and split methods, but these are not very handy when you have to split an array whose length is not an integer multiple of the chunk size. Moreover, these methods take the number of slices as input rather than the slice size. I need something more like Matlab's buffer method, which is more suitable for signal processing.
For example, if I want to buffer a signal into chunks of size 60, I need to do: np.vstack(np.hsplit(x.iloc[0:((len(x)//60)*60)], len(x)//60)) which is cumbersome.
I wrote the following routine to handle the use cases I needed, but I have not implemented or tested "underlap".
Please feel free to make suggestions for improvement.
def buffer(X, n, p=0, opt=None):
    '''Mimic MATLAB routine to generate buffer array
    MATLAB docs here: https://se.mathworks.com/help/signal/ref/buffer.html
    Parameters
    ----------
    X: ndarray
        Signal array
    n: int
        Length of each data segment (column)
    p: int
        Number of values to overlap
    opt: str
        Initial condition options. default sets the first `p` values to zero,
        while 'nodelay' begins filling the buffer immediately.
    Returns
    -------
    result : (n,m) ndarray
        Buffer array created from X
    '''
    import numpy as np
    if opt not in [None, 'nodelay']:
        raise ValueError('{} not implemented'.format(opt))
    i = 0
    first_iter = True
    while i < len(X):
        if first_iter:
            if opt == 'nodelay':
                # No zeros at array start
                result = X[:n]
                i = n
            else:
                # Start with `p` zeros
                result = np.hstack([np.zeros(p), X[:n-p]])
                i = n-p
            # Make 2D array and pivot
            result = np.expand_dims(result, axis=0).T
            first_iter = False
            continue
        # Create next column, add `p` results from last col if given
        col = X[i:i+(n-p)]
        if p != 0:
            col = np.hstack([result[:,-1][-p:], col])
        i += n-p
        # Append zeros if last column and not length `n`
        if len(col) < n:
            col = np.hstack([col, np.zeros(n-len(col))])
        # Combine result with next column
        result = np.hstack([result, np.expand_dims(col, axis=0).T])
    return result
import numpy as np
def buffer(X = np.array([]), n = 1, p = 0):
    #buffers data vector X into length n column vectors with overlap p
    #excess data at the end of X is discarded
    n = int(n) #length of each data vector
    p = int(p) #overlap of data vectors, 0 <= p < n
    L = len(X) #length of data to be buffered
    m = int(np.floor((L-n)/(n-p)) + 1) #number of sample vectors (no padding)
    data = np.zeros([n,m]) #initialize data matrix
    #L-n+1 so that a final column starting exactly at L-n is not skipped
    for startIndex,column in zip(range(0,L-n+1,n-p),range(0,m)):
        data[:,column] = X[startIndex:startIndex + n] #fill in by column
    return data
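The same framing (no padding, excess samples discarded) can also be written without a Python loop by building an index array. This is only a sketch under those assumptions; the name `frame_signal` is illustrative and it does not reproduce MATLAB's initial zero-delay behaviour.

```python
import numpy as np

def frame_signal(x, n, p=0):
    """Split 1-D `x` into columns of length `n` overlapping by `p` samples.

    Assumption: no zero padding; samples that do not fill a whole
    column are dropped.
    """
    x = np.asarray(x)
    if not 0 <= p < n:
        raise ValueError("need 0 <= p < n")
    step = n - p                              # hop between column starts
    if len(x) < n:
        return np.empty((n, 0), dtype=x.dtype)
    m = (len(x) - n) // step + 1              # number of full columns
    # idx[r, c] is the position in x of row r of column c
    idx = np.arange(n)[:, None] + step * np.arange(m)[None, :]
    return x[idx]                             # shape (n, m)

frames = frame_signal(np.arange(1, 11), n=4, p=2)
print(frames)
```

Fancy indexing copies the data, so the result is an ordinary array that can be modified without touching the input signal.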
The Keras function timeseries_dataset_from_array may be considered a Python equivalent of MATLAB's buffer().
Sample code:
import numpy as np
S = np.arange(1,99) #A Demo Array
import tensorflow.keras.preprocessing as kp
list(kp.timeseries_dataset_from_array(S, targets = None,sequence_length=7,sequence_stride=7,batch_size=5))
Same as the other answer, but faster.
def buffer(X, n, p=0):
    '''
    Parameters
    ----------
    X: ndarray
        Signal array
    n: int
        Length of each data segment (column)
    p: int
        Number of values to overlap
    Returns
    -------
    result : (n,m) ndarray
        Buffer array created from X
    '''
    import numpy as np
    d = n - p
    m = len(X)//d
    if m * d != len(X):
        m = m + 1
    Xn = np.zeros(d*m)
    Xn[:len(X)] = X
    Xn = np.reshape(Xn,(m,d))
    Xne = np.concatenate((Xn,np.zeros((1,d))))
    Xn = np.concatenate((Xn,Xne[1:,0:p]), axis = 1)
    return np.transpose(Xn[:-1])
ryanjdillon's answer rewritten for a significant performance improvement: it appends to a list instead of concatenating arrays; the latter copies the array on every iteration and is much slower.
import numpy as np
def buffer(x, n, p=0, opt=None):
    if opt not in ('nodelay', None):
        raise ValueError('{} not implemented'.format(opt))
    i = 0
    if opt == 'nodelay':
        # No zeros at array start
        result = x[:n]
        i = n
    else:
        # Start with `p` zeros
        result = np.hstack([np.zeros(p), x[:n-p]])
        i = n-p
    # Make 2D array, cast to list for .append()
    result = list(np.expand_dims(result, axis=0))
    while i < len(x):
        # Create next column, add `p` results from last col if given
        col = x[i:i+(n-p)]
        if p != 0:
            col = np.hstack([result[-1][-p:], col])
        # Append zeros if last column and not length `n`
        if len(col) < n:
            col = np.hstack([col, np.zeros(n - len(col))])
        # Combine result with next column
        result.append(np.array(col))
        i += (n - p)
    return np.vstack(result).T
def buffer(X, n, p=0):
    '''
    Parameters:
    X: ndarray, Signal array, input a long vector as raw speech wav
    n: int, frame length
    p: int, Number of values to overlap
    -----------
    Returns:
    result : (n,m) ndarray, Buffer array created from X
    '''
    import numpy as np
    d = n - p
    #print(d)
    m = len(X)//d
    c = n//d
    #print(c)
    if m * d != len(X):
        m = m + 1
    #print(m)
    Xn = np.zeros(d*m)
    Xn[:len(X)] = X
    Xn = np.reshape(Xn,(m,d))
    Xn_out = Xn
    for i in range(c-1):
        Xne = np.concatenate((Xn,np.zeros((i+1,d))))
        Xn_out = np.concatenate((Xn_out, Xne[i+1:,:]),axis=1)
    #print(Xn_out.shape)
    if n-d*c>0:
        Xne = np.concatenate((Xn, np.zeros((c,d))))
        # take the remaining n-d*c samples so that each frame has length n
        Xn_out = np.concatenate((Xn_out,Xne[c:,:n-d*c]),axis=1)
    return np.transpose(Xn_out)
Here is an improved version of Ali Khodabakhsh's sample code, which did not work in my cases. Feel free to comment on it and use it.
Comparing the execution time of the proposed answers by running
from timeit import default_timer as timer
x = np.arange(1,200000)
start = timer()
y = buffer(x,60,20)
end = timer()
print(end-start)
the results (in seconds) are:
Andrzej May, 0.005595300000095449
OverLordGoldDragon, 0.06954789999986133
ryanjdillon, 2.427092700000003
I am trying to create a function (or series of functions) that performs the following operations:
Given an input array (A), for each cell A[i,j], extract a window (W) of custom size around it, where the value 'min' will be:
min = np.min(W)
The output matrix (H) will store the values as:
H[i,j] = A[i,j] - min(W)
For an easier understanding of the issue, I attached a picture (Example):
My current code is this:
def res_array(matrix, size):
    result = []
    sc.generic_filter(matrix, nothing, size, extra_arguments=(result,), mode = 'nearest')
    mat_out = result
    return mat_out
def local(window):
    H = np.empty_like(window)
    w = res_array(window, 3)
    win_min = np.apply_along_axis(min, 1, w)
    # This is where I think it's broken
    for k in win_min:
        for i in range(window.shape[0]):
            for j in range(window.shape[1]):
                h[i, j] = window[i,j] - k
                k += 1
    return h
def nothing(window, out):
    list = []
    for i in range(window.shape[0]):
        list.append(window[i])
    out.append(list)
    return 0
test = np.ones((10, 10)) * np.arange(10)
a = local(test)
I need the code to pass to the next value in 'for k in win_min', for each cell of the input matrix A, or test.
Edit: I thought of something like directly accessing the index of the 'win_min', and increment by one, like I saw here: Increment the value inside a list element, but I don't know how to do that.
Thanks for any help!
import numpy as np
N=4 #matrix size
a=np.random.random((N,N)) #input
#--window size
wl=1 #left
wr=1 #right
wt=1 #top
wb=1 #bottom
#---
H=np.zeros((N,N)) #output
def h(k,l): #individual cell function
    #--- checks to not run out of array
    k1=max(k-wt,0)
    k2=min(k+wb+1,N)
    l1=max(l-wl,0)
    l2=min(l+wr+1,N) #+1 because the slice end is exclusive
    #---
    return a[k,l]-np.amin(a[k1:k2,l1:l2])
H=np.array([[h(k,l) for l in range(N)] for k in range(N)]) #running over all matrix elements
print(a)
print(H)
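For larger arrays, the per-cell Python calls can be avoided. The following is a sketch of one alternative, assuming a square window of half-width `half` (so 3x3 for `half=1`) and edge replication at the borders, which gives the same minimum as clipping the window to the array. The name `subtract_local_min` is illustrative.

```python
import numpy as np

def subtract_local_min(a, half=1):
    """Return a minus the minimum of the window centred on each cell."""
    a = np.asarray(a, dtype=float)
    rows, cols = a.shape
    padded = np.pad(a, half, mode="edge")   # replicate border values
    win_min = np.full(a.shape, np.inf)
    # Take the elementwise minimum over every shifted copy of the array.
    for dr in range(2 * half + 1):
        for dc in range(2 * half + 1):
            win_min = np.minimum(win_min, padded[dr:dr + rows, dc:dc + cols])
    return a - win_min

test = np.ones((10, 10)) * np.arange(10)
H = subtract_local_min(test)
print(H[0])
```

The two remaining loops run over the window offsets (9 iterations for a 3x3 window), not over the cells, so the cost per iteration is a single vectorized NumPy operation.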