Correlating two 3D arrays in Python

I have two datasets that I need to correlate in Python. One comes from a .mat file and the other from a list of .bin files. From these datasets I have created two 3D arrays with the same extent (120x112x244). While I am familiar with Python, I have not worked with such datasets before, and so I am seeking advice on how to correlate these arrays. I attempted numpy.correlate and received:
"ValueError: object too deep for desired array"
Any suggestions would be greatly appreciated.

One idea I would try is to flatten the 3D arrays first, then use correlate, since numpy.correlate only takes 1D vectors:
https://numpy.org/doc/stable/reference/generated/numpy.correlate.html
Let's say your two matrices are called A and B.
>>> import numpy
>>> array_a = A.flatten()
>>> array_b = B.flatten()
>>> results = numpy.correlate(array_a, array_b)
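A minimal sketch of that flattening approach, using small stand-in arrays (the real data would be 120x112x244):

```python
import numpy as np

# Stand-in 3D arrays of matching shape.
rng = np.random.default_rng(0)
A = rng.random((4, 3, 2))
B = rng.random((4, 3, 2))

a = A.ravel()
b = B.ravel()

# With equal-length inputs, the default 'valid' mode returns a single
# value equal to the dot product of the two flattened arrays.
result = np.correlate(a, b)

# If a normalized similarity measure is what's wanted, the Pearson
# correlation coefficient of the flattened arrays may be more useful:
r = np.corrcoef(a, b)[0, 1]
```

Note that `np.correlate` on full-length inputs collapses everything to one number; `np.corrcoef` at least normalizes it to the [-1, 1] range.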

Related

Numpy array with different data types

We both know that "a NumPy array is a multidimensional array of objects, all of the same type."
However, I was able to create a NumPy array that contains different data types, as in the example below. Can anyone explain how this is possible?
import numpy as np
a = np.array([('a',1),('b',2)],dtype=[('alpha','U11'),('num','i8')])
print(a[0][1]+1)
print(len(a[0][0]))
Output:
2
1
Those are numpy records:
https://numpy.org/doc/stable/user/basics.rec.html
NumPy provides two data structures: homogeneous arrays and structured (aka record) arrays. The latter, which you just stumbled across, is a structure that not only allows you to mix different data types (float, int, str, etc.) but also provides handy methods to access them, for instance through labels.
In NumPy these are called structured arrays.
Please read more here:
https://numpy.org/doc/stable/user/basics.rec.html
P.S.: thanks Brandt
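To see the label-based access in action with the array from the question, note that each field is itself homogeneous:

```python
import numpy as np

a = np.array([('a', 1), ('b', 2)], dtype=[('alpha', 'U11'), ('num', 'i8')])

# Whole columns come out by field label, each with its own uniform dtype:
names = a['alpha']   # all strings
nums = a['num']      # all 64-bit integers
```

So the array as a whole mixes types, but every field is still a homogeneous column.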

python plot 2d numpy array

I am having a bit of a misunderstanding with NumPy arrays.
I have a set of two values (m, error) which I would like to plot.
I save them in an array like this, as I collect them in the same loop (which is probably causing the issue):
sum_error = np.append(m, error)
Then I simply try to plot like this, but it doesn't work, as apparently this array has only one dimension (with a tuple?):
plt.scatter(x=sum_error[:, 0], y=sum_error[:, 1])
What is the correct way to proceed? Should I really make two different arrays, one for m and one for error, in order to plot them?
Thanks
As answered in this thread, try using
np.vstack((m, error))
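A minimal sketch of that approach, with made-up loop data: collect the (m, error) pairs in a plain Python list and stack once after the loop. Repeated np.append flattens everything into a 1-D array, which is what breaks the 2-D indexing.

```python
import numpy as np

pairs = []
for m, error in [(1.0, 0.1), (2.0, 0.3), (3.0, 0.2)]:  # stand-in loop data
    pairs.append((m, error))

sum_error = np.vstack(pairs)   # shape (n, 2): one (m, error) row per step

# Now the 2-D indexing from the question works:
# plt.scatter(x=sum_error[:, 0], y=sum_error[:, 1])
```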

Saving List of Numpy 2D arrays using numpy.save (the arrays together are jagged)

I have a large image dataset. When I use the images, I have several components--a mirrored image, a regular image, an eigenvector matrix and an eigenvalue vector.
I would like to store it like:
training_sunsets_data = [cropped_training_sunsets,
                         mirrored_training_sunsets,
                         rgb_cov_eigvec_training_sunsets,
                         rgb_cov_eigval_training_sunsets]
np.save('training_sunsets_data', training_sunsets_data)
And as I was writing this I was testing it (because I was sure it would fail), and the strangest thing happened: it worked.
Further, when I loaded it back into the code, it was of type ndarray, but it is a jagged array.
How is this possible if NumPy does not allow jagged multidimensional arrays? Did I just find a backdoor way to create a jagged array in NumPy?
After testing on my machine:
import numpy as np
np.save('testnp.npy', [[2,3,4],[1,2]])
np.load('testnp.npy')
# array([[2, 3, 4], [1, 2]], dtype=object)
As shown in the example code, the loaded object is of type ndarray, but its data type is object. That means np.save stored an array of Python objects, which can be anything. According to the documentation, it uses Python's pickle to pack those objects (note that recent NumPy versions require allow_pickle=True when loading such files).
So you didn't find a backdoor; it behaves just as expected.
np.savez() would work in your situation: save each array as a separate named variable.
To see what you are getting at, let's run some code:
>>> a = [np.array([[1,2,3],[4,5,6]]), np.array([[1,2],[3,4]])]
>>> type(a)
<type 'list'>
>>> np.array(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: could not broadcast input array from shape (2,3) into shape (2)
We see here that we are perfectly able to make a list of np.arrays of different dimensions. We cannot, however, cast that list into an np.array.
I suspect, based on your syntax, that you are saving a list, and loading it back maintains the type np.array for each element in the list.
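A small sketch of the np.savez route mentioned above, with stand-in arrays (the names here are made up, not the ones from the question):

```python
import numpy as np

a = np.zeros((2, 3))
b = np.ones((4, 2))   # a different shape: together the pair is jagged

# Each array is stored under its own key, so no object array
# (and no pickle) is involved.
np.savez('training_data.npz', cropped=a, mirrored=b)

loaded = np.load('training_data.npz')
cropped = loaded['cropped']
mirrored = loaded['mirrored']
```

Each component keeps its own shape and dtype, and loading does not need allow_pickle.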

Load a huge sparse array and save it back as a dense array

I have a huge sparse matrix. I would like to save its dense equivalent to the file system.
The problem is the memory limit on my machine.
My original idea was to:
1. convert huge_sparse_matrix to an ndarray with np.asarray(huge_sparse_matrix)
2. assign values
3. save it back to the file system
However, at step 1, Python raises MemoryError.
One possible approach in my mind is to:
1. create a chunk of the dense array
2. assign values from the corresponding sparse one
3. save the dense array chunk back to the file system
4. repeat 1-3
But how do I do that?
You can use scipy.sparse to read the sparse matrix and then convert it to NumPy; see the documentation here: scipy.sparse docs and examples.
I think np.asarray() is not really the function you're looking for.
You might try the SciPy coordinate-format matrix, coo_matrix():
scipy.sparse.coo_matrix
This format allows you to hold huge sparse matrices in very little memory.
Furthermore, there are many mathematical SciPy functions which also work with this matrix format.
The matrix representation in this format is basically three lists:
row: the index of the row
col: the index of the column
data: the value at that position
Hope that helped, cheers
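A minimal illustration of those three lists in a COO matrix:

```python
import numpy as np
from scipy.sparse import coo_matrix

row = np.array([0, 2, 2])          # row indices of the nonzero entries
col = np.array([1, 0, 3])          # column indices
data = np.array([4.0, 5.0, 7.0])   # the values at those positions

m = coo_matrix((data, (row, col)), shape=(3, 4))
dense = m.toarray()                # only sensible for small matrices!
```

Only the three lists are stored, so memory scales with the number of nonzeros, not with the full matrix extent.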
The common and most straightforward answer to memory problems is: do not create objects; use an iterator or a generator.
If I understand correctly, you have a sparse matrix and you want to transform it into a list-of-triplets representation. Here's a sample:
def iter_sparse_matrix(m, d1, d2):
    for i in range(d1):
        for j in range(d2):
            if m[i][j]:
                yield (i, j, m[i][j])

triplet_list = list(iter_sparse_matrix(m, d1, d2))
You might also want to look here:
http://cvxopt.org/userguide/matrices.html#sparse-matrices
If I'm not wrong, the problem you have is that the dense version of the sparse matrix does not fit in your memory, and thus you are not able to save it.
What I would suggest is to use HDF5. HDF5 handles big data on disk, passing it to memory only when needed.
Something like this should work:
import h5py
data = # your sparse matrix
cx = data.tocoo() # COO sparse representation
Create your data matrix (initially all zeros) on disk:
f = h5py.File('dset.h5', 'w')
dataset = f.create_dataset("data", data.shape)
Fill the matrix with the sparse data:
dataset[cx.row, cx.col] = cx.data
Add any modifications you want to dataset:
dataset[something, something] = something
And finally, save it:
f.close()
The way HDF5 works is, I think, perfect for your needs. The matrix is always stored on disk, so it doesn't require memory; however, you can operate on it as if it were a standard NumPy matrix (indexing, slicing, np.(...) operations and so on), and the h5py driver will bring into memory only the parts of the matrix that you need (never the whole matrix, unless you specifically request it with something like data[:, :]).
PS: I'm assuming your sparse matrix is one of SciPy's sparse matrices. If not, replace cx.row, cx.col and cx.data with the equivalents provided by your matrix representation (it should be something similar).
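An alternative sketch of the chunked idea from the question, without HDF5: write row blocks of a CSR matrix into a memory-mapped .npy file, so the full dense array never exists in memory at once (the shapes, density, and chunk size here are made up):

```python
import numpy as np
from scipy.sparse import random as sparse_random

# Stand-in sparse matrix; the real one would come from the caller.
sp = sparse_random(1000, 200, density=0.01, format='csr', random_state=0)

# A dense .npy file on disk, written one row block at a time.
out = np.lib.format.open_memmap('dense.npy', mode='w+',
                                dtype=sp.dtype, shape=sp.shape)
chunk = 100
for start in range(0, sp.shape[0], chunk):
    # Only this block is densified in memory at any moment.
    out[start:start + chunk] = sp[start:start + chunk].toarray()
out.flush()
```

The resulting file can later be read back with np.load, optionally with mmap_mode to keep it on disk.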

concatenating arrays in python like matlab without knowing the size of the output array

I am trying to concatenate arrays in Python similar to MATLAB:
array1 = zeros(3,500);
array2 = ones(3,700);
array = [array1, array2];
I did the following in Python:
array1 = np.zeros((3,500))
array2 = np.ones((3,700))
array = np.concatenate((array1, array2), axis=2)
However, this gives me different results when I try to access array[0,:].
Is there a way in Python to put arrays into one array, similar to MATLAB?
Thank you
concatenate((a,b),1) or
hstack((a,b)) or
column_stack((a,b)) or
c_[a,b]
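For instance, with the shapes from the question:

```python
import numpy as np

array1 = np.zeros((3, 500))
array2 = np.ones((3, 700))

# MATLAB dimensions are 1-based while NumPy axes are 0-based, so
# MATLAB's horizontal concatenation along dim 2 is axis=1 here:
array = np.concatenate((array1, array2), axis=1)   # shape (3, 1200)

# The alternatives listed above produce the same result:
same1 = np.hstack((array1, array2))
same2 = np.c_[array1, array2]
```

The axis=2 call from the question fails because 2D arrays only have axes 0 and 1.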
