I've ported a Haar transform matrix routine from MATLAB to Python. It works when n is 2 or 4, but when I set n to 8 I get an error:
"Traceback (most recent call last):
File "python", line 20, in
ValueError: shape too large to be a matrix."
Here's my code:
import numpy as np
import math
n=8
# check input parameter and make sure it's the power of 2
Level1 = math.log(n, 2)
Level = int(Level1)+1
#Initialization
H = [1]
NC = 1 / math.sqrt(2) #normalization constant
LP = [1, 1]
HP = [1,-1]
for i in range(1,Level):
    H = np.dot(NC, [np.matrix(np.kron(H, LP)), np.matrix(np.kron(np.eye(len(H)), HP))])
print H
I'm assuming you got the definition of the Haar transform from the Wikipedia article or a similar source, so I'll try to stick to its notation.
The problem with your code is that the Wikipedia article uses a slight abuse of notation. In the equation defining H_2N in terms of H_N, two matrices are stacked on top of each other with brackets around them. Taken literally, this would be an array consisting of two arrays, but it is meant to be a single matrix whose top half equals the first matrix and whose bottom half equals the second.
In your code, the array of two matrices is the following part:
[np.matrix(np.kron(H, LP)), np.matrix(np.kron(np.eye(len(H)), HP))]
You can make this into a single matrix as described above using the np.concatenate function as follows:
H = np.dot(NC, np.concatenate([np.matrix(np.kron(H, LP)), np.matrix(np.kron(np.eye(len(H)), HP))]))
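To make the stacking explicit, here is a minimal sketch of the same recursion written as a function. The name `haar_matrix` is my own and not part of NumPy; it uses `np.vstack` for the stacking and plain arrays instead of `np.matrix`, then checks that the result is orthonormal.

```python
import numpy as np

def haar_matrix(n):
    """Build the n x n normalized Haar matrix; n must be a power of 2."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of 2")
    h = np.array([[1.0]])
    while h.shape[0] < n:
        top = np.kron(h, [1, 1])                       # averaging rows
        bottom = np.kron(np.eye(h.shape[0]), [1, -1])  # differencing rows
        # Stack the two blocks into ONE 2-D array, then normalize.
        h = np.vstack([top, bottom]) / np.sqrt(2)
    return h

H8 = haar_matrix(8)
print(H8.shape)
print(np.allclose(H8 @ H8.T, np.eye(8)))  # rows should be orthonormal
```

Because every step returns an ordinary 2-D array, `len(h)` and `h.shape[0]` always agree with the matrix size, which is what the recursion relies on.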
I want to decompose a 3-dimensional tensor using SVD.
I am not quite sure whether, and how, the following decomposition can be achieved.
I already know how I can split the tensor horizontally from this tutorial: tensors.org Figure 2.2b
import numpy as np
from numpy import linalg as LA
d = 10; A = np.random.rand(d,d,d)
Am = A.reshape(d**2,d)
Um,Sm,Vh = LA.svd(Am,full_matrices=False)
U = Um.reshape(d,d,d); S = np.diag(Sm)
Matrix methods can be naturally extended to higher orders. SVD, for instance, can be generalized to tensors, e.g. with the Tucker decomposition, sometimes called a higher-order SVD.
We maintain a Python library for tensor methods, TensorLy, which lets you do this easily. In this case you want a partial Tucker as you want to leave one of the modes uncompressed.
Let's import the necessary parts:
import tensorly as tl
from tensorly import random
from tensorly.decomposition import partial_tucker
For testing, let's create a 3rd order tensor of size (10, 10, 10):
size = 10
order = 3
shape = (size, )*order
tensor = random.random_tensor(shape)
You can now decompose the tensor using the tensor decomposition. In your case, you want to leave one of the dimensions untouched, so you'll only have two factors (your U and V) and a core tensor (your S):
core, factors = partial_tucker(tensor, rank=size, modes=[0, 2])
You can reconstruct the original tensor from your approximation using a series of n-mode products to contract the core with the factors:
from tensorly import tenalg
rec = tenalg.multi_mode_dot(core, factors, modes=[0, 2])
rec_error = tl.norm(rec - tensor)/tl.norm(tensor)
print(f'Relative reconstruction error: {rec_error}')
In my case, I get
Relative reconstruction error: 9.66027176805661e-16
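For comparison, the reshape-then-SVD split from the question can be checked with plain NumPy. This is only a sketch under the assumption that the first two modes are merged into rows and the third is kept as columns; the reconstruction contracts the folded factor with the singular values and `Vh` over their shared index.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 10
A = rng.random((d, d, d))

# Unfold: merge the first two modes into rows, keep the third as columns.
Am = A.reshape(d * d, d)
Um, Sm, Vh = np.linalg.svd(Am, full_matrices=False)

# Fold the left factor back into a 3rd-order tensor.
U = Um.reshape(d, d, d)

# Rebuild A by contracting U with the singular values and Vh over index k.
A_rec = np.einsum('ijk,k,kl->ijl', U, Sm, Vh)
rel_err = np.linalg.norm(A_rec - A) / np.linalg.norm(A)
print(rel_err)
```

The relative error should be at the level of floating-point round-off, since no singular values are truncated here.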
You can also use the "tensorlearn" package in Python, for example with the tensor-train (TT) SVD algorithm.
https://github.com/rmsolgi/TensorLearn/tree/main/Tensor-Train%20Decomposition
import numpy as np
import tensorlearn as tl
# let's generate an arbitrary array
tensor = np.arange(0,1000)
# reshape it into a higher-dimensional (3-D) tensor
tensor = np.reshape(tensor,(10,20,5))
epsilon=0.05
# decompose the tensor into its factors
tt_factors=tl.auto_rank_tt(tensor, epsilon) # epsilon is the error bound
# tt_factors is a list of three arrays, which are the TT-cores
# rebuild (estimate) the tensor from the factors as tensor_hat
tensor_hat=tl.tt_to_tensor(tt_factors)
# let's see the error
error_tensor=tensor-tensor_hat
error=tl.tensor_frobenius_norm(error_tensor)/tl.tensor_frobenius_norm(tensor)
print('error (%)= ',error*100) # the relative error is below epsilon
# one use of tensor decomposition is data compression,
# so let's calculate the compression ratio
data_compression_ratio=tl.tt_compression_ratio(tt_factors)
# data saving
data_saving=1-(1/data_compression_ratio)
print('data_saving (%): ', data_saving*100)
I need to generate a HEALPix map (using healpy) from random $a_{\ell m}$, for a spin-2 function.
Schematically, this should look like this:
import healpy as hp
nside = 16 # for example
for el in range(1, L+1): #loop over ell mode
    for m in range(-el,el): #for each ell mode loop over m
        ind = hp.sphtfunc.Alm.getidx(nside, el, m)
        if m == 0:
            a_lm[ind] = np.random.randn()
        else:
            a_lm[ind] = np.random.randn() + 1j * np.random.randn()
a_tmp = hp.sphtfunc.alm2map(a_lm, nside, pol=True)
My two questions are:
1) how do I initialise a_lm ? Specifically, what would be its dimension, using
a_lm = np.zeros(???)
2) if I understood correctly, the output a_tmp is a 1 dimensional list. How do I reshape it into a two-dimensional list (the map) for plotting?
1) What properties do you want your alm to have? You could also just assume a certain power spectrum (C_ell) and use hp.synalm() or hp.synfast().
For the initialization, you've already implemented that m goes from $-\ell$ to $+\ell$, so you have a one-dimensional array of length $\sum_{\ell=0}^{\ell_{\max}} (2\ell+1) = (\ell_{\max}+1)^2$.
2) For the plotting, you could just directly generate a random map and then use e.g. hp.mollview(), which takes the 1-dimensional HEALPix map.
Alternatively, you can use hp.alm2map() to convert your alm to a map.
I also suggest you check out the tutorial for the plotting.
The following steps give the length of a_lm:
import numpy as np
import healpy as hp
nside = 16
# Get the maximum multipole with the current nside
lmax = 3*nside - 1 #This can vary according to the use. In cosmology, the common value is 2*nside
alm_len = hp.Alm.getsize(lmax)
a_lm = np.zeros(alm_len, dtype=np.complex128) # the a_lm coefficients are complex
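The two ways of counting coefficients mentioned in these answers can be compared without healpy. The helper names below are my own. As far as I understand, healpy stores only the m >= 0 coefficients, so `hp.Alm.getsize(lmax)` should match the second count rather than the first; treat that as an assumption to verify against your installed version.

```python
def n_alm_full(lmax):
    """Number of (l, m) pairs with 0 <= l <= lmax and -l <= m <= l."""
    return sum(2 * l + 1 for l in range(lmax + 1))

def n_alm_nonneg_m(lmax):
    """Number of (l, m) pairs with 0 <= l <= lmax and 0 <= m <= l."""
    return sum(l + 1 for l in range(lmax + 1))

nside = 16
lmax = 3 * nside - 1
print(n_alm_full(lmax))      # (lmax + 1)**2 = 2304
print(n_alm_nonneg_m(lmax))  # (lmax + 1) * (lmax + 2) // 2 = 1176
```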
I think the tutorial linked in @Daniel's answer is a good resource for plotting HEALPix maps.
I'm trying to evaluate a multivariate normal CDF several times using theano's scan function, but I'm getting a ValueError.
Here's an example of the original function I'm trying to vectorize:
from scipy.stats.mvn import mvnun # library that calculates MVN CDF
low = [-1.96, 0 ] # lower bounds of integration
upp = [0 , 1.96] # upper bounds of integration
mean = [0 , 0 ] # means of the jointly distributed random variables
cov = [[1,0.25],[0.25,1]] # covariance matrix
print(mvnun(low,upp,mean,cov))
This produces the following output:
(0.19620339269649473, 0)
Simple and straightforward, right?
What I'm really trying to do is create 4 large input objects with 1500 elements each. That way, I get to evaluate the mvnun function 1500 times. The idea is that at each iteration, all inputs are different from the last, and no information from the previous iteration is necessary.
Here is my setup:
import theano
import numpy as np
lower = theano.tensor.dmatrix("lower") # lower bounds - dim: 1500 x 2
upper = theano.tensor.dmatrix("upper") # upper bounds - dim: 1500 x 2
means = theano.tensor.dmatrix("means") # means means - dim: 1500 x 2
covs = theano.tensor.dtensor3("covs") # cov matrices - dim: 1500 x 2 x 2
results, updates = theano.scan(fn=mvnun,
sequences=[lower,upper,means,covs])
f = theano.function(inputs=[lower, upper, means, covs],
outputs=results,
updates=updates)
However, when I try to run this block of code, I get an error on the line with the scan command. The error states: ValueError: setting an array element with a sequence. The full traceback is below:
Traceback (most recent call last):
File "", line 7, in
sequences=[lower,upper,means,covs])
File "C:\Anaconda2\lib\site-packages\theano\scan_module\scan.py",
line 745, in scan
condition, outputs, updates = scan_utils.get_updates_and_outputs(fn(*args))
ValueError: setting an array element with a sequence.
I originally thought that the code wasn't working because the mvnun function returns a two-element tuple instead of a single value.
However, when I tried to vectorize a test function (that I created) that also returned a two-element tuple, things worked just fine. Here's the full example:
# Some weird crazy function that takes in three Nx1 vectors
# and an NxN matrix and spits out a tuple of scalars.
def test_func(low_i,upp_i,mean_i,cov_i):
    r1 = low_i.sum() + upp_i.sum()
    r2 = np.dot(mean_i,cov_i).sum()
    test_func_out = (r1,r2)
    return(test_func_out)
lower = theano.tensor.dmatrix("lower") # lower
upper = theano.tensor.dmatrix("upper") # upper
means = theano.tensor.dmatrix("means") # means
covs = theano.tensor.dtensor3("covs") # covs
results, updates = theano.scan(fn=test_func,
sequences=[lower,upper,means,covs])
f = theano.function(inputs=[lower, upper, means, covs],
outputs=results,
updates=updates)
np.random.seed(666)
obs = 1500 # number of elements in the dataset
dim = 2 # dimension of multivariate normal distribution
# Generating random values for the lower bounds, upper bounds and means
lower_vals = np.random.rand(obs,dim)
upper_vals = lower_vals + np.random.rand(obs,dim)
means_vals = np.random.rand(obs,dim)
# Creates a symmetric matrix - used for the random covariance matrices
def make_sym_matrix(dim,vals):
    m = np.zeros([dim,dim])
    xs,ys = np.triu_indices(dim,k=1)
    m[xs,ys] = vals[:-dim]
    m[ys,xs] = vals[:-dim]
    m[ np.diag_indices(dim) ] = vals[-dim:]
    return m
# Generating the random covariance matrices
covs_vals = []
for i in range(obs):
    cov_vals = np.random.rand((dim**2 - dim)//2+dim)
    cov_mtx = make_sym_matrix(dim,cov_vals)
    covs_vals.append(cov_mtx)
covs_vals = np.array(covs_vals)
# Evaluating the test function on all 1500 elements
print(f(lower_vals,upper_vals,means_vals,covs_vals))
When I run this block of code, everything works out fine and the output I get is a list with 2 arrays, each containing 1500 elements:
[array([ 4.24700864, 3.80830129, 2.60806493, ..., 3.12995381, 4.41907055, 4.12880839]),
array([ 0.87814314, 1.01768617, 0.45072405, ..., 1.15788282, 0.15766754, 1.32393402])]
It's also worth noting that the order in which the vectorized function is getting elements from the sequences is perfect. I ran a sanity check with the first 3 numbers in the list:
for i in range(3):
    print(test_func(lower_vals[i],upper_vals[i],means_vals[i],covs_vals[i]))
And the results are:
(4.2470086396797502, 0.87814313729162796)
(3.808301289302495, 1.017686166097616)
(2.6080649327828564, 0.45072405177076169)
These values are practically identical to the first 3 output values in the vectorized approach.
So back to the main problem: why can't I get the mvnun function to work when I use it in the scan statement? Why am I getting this odd ValueError?
Any kind of advice would be really helpful!!!
Thanks!!!
I'm trying to rewrite the Zhao Koch steganography method from MATLAB into Python, and I am stuck right at the start.
The first two steps as they are in MATLAB:
Step 1:
A = imread(casepath); % Read the steganography case image and acquire its RGB values. In my case it's a 400x400 PNG image, so it gives a 400x400x3 array.
Step 2:
D = dct2(A(:,:,3)); % Apply the 2D DCT to the blue values of the image
Python analogue:
from scipy import misc
from numpy import empty,arange,exp,real,imag,pi
from numpy.fft import rfft,irfft
arr = misc.imread('casepath') # 400x400x3 array (Step 1)
arr[20, 30, 2] # Getting blue pixel value
def dct(y): # Basic DCT built from numpy
    N = len(y)
    y2 = empty(2*N,float)
    y2[:N] = y[:]
    y2[N:] = y[::-1]
    c = rfft(y2)
    phi = exp(-1j*pi*arange(N)/(2*N))
    return real(phi*c[:N])
def dct2(y): # 2D DCT built from numpy, using the previous DCT function
    M = y.shape[0]
    N = y.shape[1]
    a = empty([M,N],float)
    b = empty([M,N],float)
    for i in range(M):
        a[i,:] = dct(y[i,:])
    for j in range(N):
        b[:,j] = dct(a[:,j])
    return b
D = dct2(arr) # Step 2 analogue
However, when I try to execute the code I get the following error:
Traceback (most recent call last):
File "path to .py file", line 31, in <module>
D = dct2(arr)
File "path to .py file", line 25, in dct2
a[i,:] = dct(y[i,:])
File "path to .py file", line 10, in dct
y2[:N] = y[:]
ValueError: could not broadcast input array from shape (400,3) into shape (400)
Perhaps someone could kindly explain to me what I am doing wrong?
Additional Info:
OS: Windows 10 Pro 64 bit
Python: 2.7.12
scipy:0.18.1
numpy:1.11.2
pillow: 3.4.1
Your code works fine, but it is designed to only accept a 2D array, just like dct2() in Matlab. Since your arr is a 3D array, you want to do
D = dct2(arr[...,2])
As mentioned in my comment, instead of reinventing the wheel, use the (fast) built-in dct() from the scipy package.
The code from the link in my comment effectively provides you this:
import numpy as np
from scipy.fftpack import dct, idct
def dct2(block):
    return dct(dct(block.T, norm='ortho').T, norm='ortho')
def idct2(block):
    return idct(idct(block.T, norm='ortho').T, norm='ortho')
But again, I must stress that you have to call this function for each colour plane individually. Scipy's dct() will happily accept any N-dimensional array and will apply the transform on the last axis. Since that's your colour planes and not your rows and columns of your pixels, you'll get the wrong result. Yes, there is a way to address this with the axis input parameter, but I won't unnecessarily overcomplicate this answer.
Regarding the various DCT implementations involved here, your version and scipy's implementation give the same result if you omit the norm='ortho' parameter from the snippet above. But with that parameter included, scipy's transform will agree with Matlab's.
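The two points above (select one colour plane first, and the FFT-based routine computes the unnormalized DCT-II) can be checked with NumPy alone. This is a sketch: `dct_fft` mirrors the question's routine, `dct_direct` is my own helper that evaluates the cosine-sum definition, and the random array only stands in for a real image.

```python
import numpy as np

def dct_fft(y):
    """Unnormalized DCT-II of a 1-D array via a length-2N real FFT."""
    n = len(y)
    y2 = np.concatenate([y, y[::-1]])          # mirror the signal
    c = np.fft.rfft(y2)
    phi = np.exp(-1j * np.pi * np.arange(n) / (2 * n))
    return np.real(phi * c[:n])

def dct_direct(y):
    """Unnormalized DCT-II straight from the cosine-sum definition."""
    n = len(y)
    k = np.arange(n)[:, None]
    j = np.arange(n)[None, :]
    return 2 * np.cos(np.pi * k * (2 * j + 1) / (2 * n)) @ y

def dct2(plane):
    """2-D DCT-II of one 2-D plane: transform the rows, then the columns."""
    rows = np.apply_along_axis(dct_fft, 1, plane)
    return np.apply_along_axis(dct_fft, 0, rows)

rng = np.random.default_rng(0)
img = rng.random((8, 8, 3))        # stand-in for an RGB image
blue = img[..., 2]                 # select a single colour plane first
D = dct2(blue)
print(D.shape)
print(np.allclose(dct_fft(blue[0]), dct_direct(blue[0])))
```

Passing `img` itself to `dct2` would hand `dct_fft` rows of shape (8, 3), which is the same kind of shape mismatch that produced the broadcast error in the question.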
In R, I am using ccf or acf to compute the pair-wise cross-correlation function so that I can find out which shift gives me the maximum value. From the looks of it, R gives me a normalized sequence of values. Is there something similar in Python's scipy or am I supposed to do it using the fft module? Currently, I am doing it as follows:
import numpy
from numpy.fft import rfft, irfft
xcorr = lambda x,y : irfft(rfft(x)*rfft(y[::-1]))
x = numpy.array([0,0,1,1])
y = numpy.array([1,1,0,0])
print(xcorr(x,y))
To cross-correlate 1d arrays use numpy.correlate.
For 2d arrays, use scipy.signal.correlate2d.
There is also scipy.stsci.convolve.correlate2d.
There is also matplotlib.pyplot.xcorr which is based on numpy.correlate.
See this post on the SciPy mailing list for some links to different implementations.
Edit: @user333700 added a link to the SciPy ticket for this issue in a comment.
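Since the question asks for a normalized sequence and the shift with the maximum value, here is one possible sketch built on numpy.correlate. The function name `normalized_xcorr` and the normalization (standardize both inputs, divide by the length) are my own choices and are not claimed to reproduce R's ccf exactly.

```python
import numpy as np

def normalized_xcorr(x, y):
    """Return (lags, values) of the normalized cross-correlation of x and y.

    Both inputs are standardized first, so for equal-length inputs the
    values lie in [-1, 1].
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xs = (x - x.mean()) / (x.std() * len(x))
    ys = (y - y.mean()) / y.std()
    # mode='full' evaluates every shift, including partial overlaps.
    values = np.correlate(xs, ys, mode='full')
    lags = np.arange(-(len(y) - 1), len(x))
    return lags, values

x = np.array([0, 1, 2, 1, 0, 0, 0, 0])
y = np.array([0, 0, 0, 1, 2, 1, 0, 0])   # x delayed by two samples
lags, values = normalized_xcorr(x, y)
best = lags[np.argmax(values)]
print(best, values.max())
```

With numpy's convention, value k is the sum over n of x[n+k]*y[n], so a negative best lag means y is a delayed copy of x.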
If you are looking for a rapid, normalized cross-correlation in either one or two dimensions, I would recommend the OpenCV library (see http://opencv.willowgarage.com/wiki/ http://opencv.org/). The cross-correlation code maintained by this group is the fastest you will find, and it will be normalized (results between -1 and 1).
While this is a C++ library the code is maintained with CMake and has python bindings so that access to the cross correlation functions is convenient. OpenCV also plays nicely with numpy. If I wanted to compute a 2-D cross-correlation starting from numpy arrays I could do it as follows.
import numpy
import cv
#Create a random template and place it in a larger image
templateNp = numpy.random.random( (100,100) )
image = numpy.random.random( (400,400) )
image[:100, :100] = templateNp
#create a numpy array for storing result
resultNp = numpy.zeros( (301, 301) )
#convert from numpy format to openCV format
templateCv = cv.fromarray(numpy.float32(templateNp))
imageCv = cv.fromarray(numpy.float32(image))
resultCv = cv.fromarray(numpy.float32(resultNp))
#perform cross correlation (the legacy cv API takes the image first, then the template)
cv.MatchTemplate(imageCv, templateCv, resultCv, cv.CV_TM_CCORR_NORMED)
#convert result back to numpy array
resultNp = numpy.asarray(resultCv)
For just a 1-D cross-correlation create a 2-D array with shape equal to (N, 1 ). Though there is some extra code involved to convert to an openCV format the speed-up over scipy is quite impressive.
I just finished writing my own optimised implementation of normalized cross-correlation for N-dimensional arrays. You can get it from here.
It will calculate cross-correlation either directly, using scipy.ndimage.correlate, or in the frequency domain, using scipy.fftpack.fftn/ifftn depending on whichever will be quickest.
For 1-D arrays, numpy.correlate is faster than scipy.signal.correlate. Across different sizes, I see a consistent 5x performance gain using numpy.correlate. When the two arrays are of similar size (the bright line along the diagonal of the plot), the performance difference is even more pronounced (50x or more).
# a simple benchmark
from datetime import datetime
import numpy as np
import scipy.signal
import matplotlib.pyplot as plt
res = []
for x in range(1, 1000):
    list_x = []
    for y in range(1, 1000):
        # generate different sizes of series to compare
        l1 = np.random.choice(range(1, 100), size=x)
        l2 = np.random.choice(range(1, 100), size=y)
        time_start = datetime.now()
        np.correlate(a=l1, v=l2)
        t_np = datetime.now() - time_start
        time_start = datetime.now()
        scipy.signal.correlate(in1=l1, in2=l2)
        t_scipy = datetime.now() - time_start
        list_x.append(t_scipy / t_np)
    res.append(list_x)
plt.imshow(np.matrix(res))
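By default, scipy.signal.correlate uses mode='full', so it also evaluates the partially overlapping shifts at both ends (as if the input were zero-padded), whereas numpy.correlate defaults to mode='valid'. That might explain the performance difference.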
>>> l1 = [1,2,3,2,1,2,3]
>>> l2 = [1,2,3]
>>> print(numpy.correlate(a=l1, v=l2))
>>> print(scipy.signal.correlate(in1=l1, in2=l2))
[14 14 10 10 14]
[ 3 8 14 14 10 10 14 8 3] # the first 3 is [0,0,1]dot[1,2,3]
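The difference between the two outputs above can be reproduced with numpy.correlate alone by switching its mode argument. A small sketch using the same inputs:

```python
import numpy as np

l1 = np.array([1, 2, 3, 2, 1, 2, 3])
l2 = np.array([1, 2, 3])

valid = np.correlate(l1, l2, mode='valid')  # only complete overlaps
full = np.correlate(l1, l2, mode='full')    # every partial overlap too

print(valid)  # [14 14 10 10 14]
print(full)   # [ 3  8 14 14 10 10 14  8  3]

# The 'valid' result is the middle slice of the 'full' one.
pad = len(l2) - 1
print(np.array_equal(full[pad:-pad], valid))  # True
```

So for a fair timing comparison, both functions should be called with the same mode.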