I need to store a numpy array of shape (2000, 720, 1280) that is created in every loop iteration. My code looks like:
U_list = []
for N_f in range(N):
    U = somefunction(N_f)
    U_list.append(U)
    del U
So I delete the matrix U in every iteration because my RAM gets full.
Is this a good way to store the matrices U, or would you recommend another solution? I compared the code to MATLAB, and MATLAB needs half the time to compute. I think storing U in a list could be the reason.
Using this method will tell you right out of the gate whether you are able to store all of the U arrays. If N is so large that you can't allocate the results array, you'll have to get creative. Maybe save every 20 into a pickle file or something (see the sketch after the code below).
import numpy as np

N = 20
shape = (2000, 720, 1280)
# Make sure to match the dtype returned by somefunction
results = np.zeros((N, *shape))
for N_f in range(N):
    results[N_f] = somefunction(N_f)
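If the full results array does not fit in RAM, here is a minimal sketch of the chunked-saving idea mentioned above; the chunk size and file names are placeholders, and plain np.save is used instead of pickle since the data are plain arrays:

import numpy as np

chunk_size = 20  # hypothetical: flush every 20 results to disk
chunk = []
for N_f in range(N):
    chunk.append(somefunction(N_f))
    if len(chunk) == chunk_size:
        np.save(f"results_{N_f // chunk_size}.npy", np.stack(chunk))
        chunk = []
if chunk:  # remainder that never filled a whole chunk
    np.save(f"results_{N_f // chunk_size}.npy", np.stack(chunk))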
I would like to print a Numpy array and then read it back. This is what I have done so far:
# printer
import numpy as np

N = 100
x = np.arange(N)
for xi in x:
    print(xi)

# reader
import numpy as np

N = 100
x = np.empty(N)
for i in range(N):
    x[i] = float(input())
This gets the job done, but I think it may not be the most efficient way because of the repeated calls to input(). An alternative I considered is printing only once, reading only once, and modifying what I read. This approach has some similarities with this question. In contrast to that question, I have some extra info that could possibly be used to improve performance:
N is known in advance (to both programs)
Arrays are only 1D or 2D (of sizes N and NxN respectively)
Data are floats
Data are fully trusted
Thanks in advance.
Edit: I should add that the value of N will not be that large; even N=1000 would be huge for my problem.
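A minimal sketch of the "print once, read once" idea, assuming the two scripts communicate over a pipe or a plain text file (the whitespace-separated format is just one possibility):

# printer: write the whole array with a single print call
import numpy as np

N = 100
x = np.arange(N, dtype=float)
print(" ".join(map(str, x)))

# reader: read everything at once and parse it with a single call
import sys
import numpy as np

N = 100
x = np.array(sys.stdin.read().split(), dtype=float)
# for the NxN case, follow up with x = x.reshape(N, N)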
I am generating a large set of Monte Carlo data and ideally want to store it in an array of arrays.
When I use array.append(x) and then cycle over a loop that produces a new array for x, the elements of the list at the end are all the same as the last array x that was added. I believe this must be because I'm adding a reference to the array to the list rather than the actual array data, so when the array is modified all the other elements that point to the same object update as well.
Is there any way to prevent this by setting a kwarg or something, or do I have to construct my arrays in a different way?
# test to illustrate the point
import numpy as np

x = np.random.choice((-1, 1), size=(5, 5))
array_test = []
for T in range(10):
    array_test.append(x)
    print(x)
    x += 10
print(array_test)
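A minimal fix, assuming the goal is to keep a snapshot of x from each iteration, is to append a copy instead of the array object itself:

import numpy as np

x = np.random.choice((-1, 1), size=(5, 5))
array_test = []
for T in range(10):
    # x.copy() stores an independent snapshot, so later in-place
    # changes to x do not affect the entries already in the list
    array_test.append(x.copy())
    x += 10
print(array_test)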
I have a problem to solve and cannot come up with a good solution.
To simplify it: I have a 10x10 array and I want to slice out "little arrays" of 3x3. Right now I do this the following way:
import numpy as np

array = np.arange(100).reshape((10, 10))
patch = np.array(array[:3, :3])
for n in range(3, 10, 3):
    for m in range(3, 10, 3):
        patch = np.append(patch, array[n:n+3, m:m+3])
I basically create the numpy array patch with the first slice and append all the other slices afterwards. The problem is that this is horribly slow and does not make good use of numpy's slicing capabilities. I need to do this for a large number of much bigger arrays.
Can anyone give me any advice on how to make this more efficient?
1000 thanks!
Your problem is entirely down to using numpy.append. append creates a new array each time you use it, so as your patch array gets bigger this takes progressively longer.
Instead, use a presized array (you already know the final size of the patch array) and avoid making intermediate copies of any data.
import numpy as np

# setup
x, y = 999, 999
array = np.arange(x * y)
array.shape = x, y
little_array_size = 3

# creates an array of "little arrays"
patch = np.empty(array.size, dtype=int)
patch.shape = -1, little_array_size, little_array_size

i = 0
for n in range(0, array.shape[0], little_array_size):
    for m in range(0, array.shape[1], little_array_size):
        # the slice is a view, so data is copied straight from array into patch
        patch[i, :] = array[n:n+little_array_size, m:m+little_array_size]
        i += 1

patch.shape = -1  # flattens the array
The above takes about a third of a second on my computer, two orders of magnitude faster than the 20+ seconds that using numpy.append takes.
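When the block size divides the array shape exactly, a fully vectorized alternative (not part of the answer above, just the common reshape/swapaxes trick) avoids the Python loops entirely:

import numpy as np

x = y = 999
s = 3  # little_array_size
array = np.arange(x * y).reshape(x, y)

# split rows and columns into blocks, then bring the two block axes together
patch = (array
         .reshape(x // s, s, y // s, s)
         .swapaxes(1, 2)
         .reshape(-1, s, s))
print(patch.shape)  # (110889, 3, 3)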
The code is too complicated to paste here, but I have a numpy array shaped (800, 800, 1300), or 1300 matrices shaped (800, 800). This is 5GB.
I pass this array into a function, which
multiplies each "matrix" in the above array by a float from a (1300,)-shaped array,
sums the array into one "matrix" shaped (800, 800),
and takes the inverse of that matrix.
This program runs at 20.2 GB RAM! Is that possible? I cannot see any memory leaks. I am simply taking numpy arrays, and passing them through a function. I then save the resulting arrays.
I'll try to post the code.
import math
import matplotlib.pyplot as plt
import numpy as np
import scipy
import scipy.io
import os

data_file1 = "filename1.npy"
data_file2 = "filename2.npy"
data_file3 = "filename3.npy"

data1 = np.load(data_file1)
data2 = np.load(data_file2)
data3 = np.load(data_file3)

data_total = np.concatenate((data1, data2, data3))  # this array is shape (800, 800, 1300), around 6 GB

array1 = np.arange(1300) + 1
vector = np.arange(800) + 1

def function_matrix(data_total, vector):
    Multi_matrix = array1[:, None, None] * data_total  # step 1: multiplies each (800, 800) matrix
    Sum_matrix = np.sum(Multi_matrix, axis=0)  # step 2: sum into one matrix
    mTCm = np.array([np.dot(vector.T, np.linalg.solve(Sum_matrix, vector))])
    return mTCm

draw_pointsA = np.asarray([[function_matrix(data_total[i], vector[j]) for i in np.arange(0, 100)] for j in np.arange(0, 100)])

filename = "save_datapoints.npy"
np.save(filename, draw_pointsA)
EDIT 2:
See below. It is actually 12 GB of RAM, with a 20.1 GB virtual size for the process.
This doesn't answer your question, but proposes a way to avoid the problem from the start.
Step 1 is sequential: you only need one matrix loaded at a time.
Change your code to process each matrix independently, as sketched below.
By Step 2 your memory requirement is down to 800 * 800 * sizeof(datum), which is a few megabytes, and you can certainly afford to keep that in memory.
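A minimal sketch of that idea, assuming the 1300 matrices can be loaded (or sliced from disk) one at a time; load_matrix is a hypothetical loader, and weights plays the role of the (1300,) array of floats:

import numpy as np

n_matrices = 1300
weights = np.arange(n_matrices) + 1.0   # the (1300,) array of floats
acc = np.zeros((800, 800))              # running sum, only a few MB

for k in range(n_matrices):
    matrix_k = load_matrix(k)           # hypothetical: returns one (800, 800) matrix
    acc += weights[k] * matrix_k        # steps 1 and 2 fused, one matrix in memory at a time

inverse = np.linalg.inv(acc)            # step 3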
It sounds like this could be a type issue, i.e. you converted the values in the matrices to a different type. Perhaps you stored the original matrix with int16 or single-precision values, and after multiplying it with a float it is stored as a matrix of double-precision values (which need at least twice as much memory).
You can use the dtype argument to set the value type for the matrix.
Another possible reason could be that some additional matrices are created along the way. That's obviously impossible to diagnose unless you post the code.
A possible solution to your memory problem is to use HDF5 files, and write the matrices to disk. Then you could load the matrix one at a time. This is easy with h5py, as the matrices can be compressed, and/or sliced using numpy/scipy syntax.
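A minimal sketch of that HDF5 approach with h5py (the file and dataset names are placeholders):

import h5py

# write: store the (800, 800, 1300) stack once, with compression
with h5py.File("matrices.h5", "w") as f:
    f.create_dataset("stack", data=data_total, compression="gzip")

# read back one (800, 800) matrix at a time using numpy-style slicing
with h5py.File("matrices.h5", "r") as f:
    stack = f["stack"]
    matrix_0 = stack[:, :, 0]   # only this slice is read into memory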
I have a 2D numpy array X of shape (xrow, xcol) and I want to apply the dot product to each pair of rows to obtain another array P of shape (xrow, xrow).
The code looks like the following:
P = np.zeros((xrow, xrow))
for i in range(xrow):
    for j in range(xrow):
        P[i, j] = np.dot(X[i], X[j])
This works well if the array X is small, but takes a lot of time for huge X. Is there any way to make it faster, or to do it more pythonically so that it is fast?
That is obtained by doing result = X.dot(X.T).
When the array becomes large, it can be done in blocks, but depending on your numpy backend this should already be parallelized across threads as much as possible. It seems that this is what you are looking for.
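For example, on a small array (the shape is chosen arbitrarily) the one-liner matches the double loop from the question:

import numpy as np

X = np.random.randn(50, 20)

# double loop from the question
P_loop = np.zeros((50, 50))
for i in range(50):
    for j in range(50):
        P_loop[i, j] = np.dot(X[i], X[j])

P_fast = X.dot(X.T)
print(np.allclose(P_loop, P_fast))  # True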
If for some reason you don't want to rely on that and end up resorting to multiprocessing, you can try something along the lines of
import numpy as np
from joblib import Parallel, delayed  # older scikit-learn: from sklearn.externals.joblib import Parallel, delayed

X = np.random.randn(1000, 100000)
block_size = 10000

products = Parallel(n_jobs=10)(
    delayed(np.dot)(X[:, pos:pos + block_size], X.T[pos:pos + block_size])
    for pos in range(0, X.shape[1], block_size))
product = np.sum(products, axis=0)
I don't think this is useful for relatively small arrays. And threading can sometimes take care of this better as well.
This is 10% faster on my machine as it avoids loops:
numpy.matrix(X) * numpy.matrix(X.T)
but there is still 50% redundancy, since P is symmetric and both halves are computed.