Why does this loop in Python run progressively slower?

In this code, there is a 4-D array of 13x13 images. I would like to save each 13x13 image using matplotlib.pyplot. For debugging purposes, I limit the outer loop to a single value of m.
import matplotlib.pyplot as plt

# fts is a numpy array of shape (4000, 100, 13, 13)
no_images = 4000
for m in [1]:
    for i in range(no_images):
        print i,
        fm = fts[i][m]
        if fm.min() != fm.max():
            fm -= fm.min()
            fm /= fm.max()  # scale to [0, 1]
        else:
            print 'unscaled'
        plt.imshow(fm)
        plt.savefig('m' + str(m) + '_i' + str(i) + '.png')
Saving 4000 images took more than 20 hours. Why is it so slow?
If I limit the inner loop to the first 100 images, it takes about 1 minute, so the whole thing should complete in around 40 minutes, not over 20 hours! I also notice it seems to run progressively slower.

What you are experiencing here is a memory leak: you keep creating AxesImage instances (by repeatedly calling plt.imshow) until they no longer fit into RAM, at which point the machine starts swapping to disk, which is incredibly slow. To avoid the leak, you can either destroy each AxesImage instance when you no longer need it:
...
image = plt.imshow(fm)
plt.savefig('m' + str(m) + '_i' + str(i) + '.png')
del image
Or, alternatively, you can create only one AxesImage, and then just change the data in it:
...
image = None
for m in [1]:
    for i in range(no_images):
        ...
        if image is None:
            image = plt.imshow(fm)
        else:
            image.set_data(fm)
        ...
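For completeness, here is a minimal self-contained sketch of the second approach (reusing a single AxesImage). This is a sketch, not the exact code from the question: it uses random stand-in data for fts and drops the debug prints.

import numpy as np
import matplotlib.pyplot as plt

fts = np.random.rand(4000, 100, 13, 13)  # stand-in for the real data
no_images = 4000

fig, ax = plt.subplots()
image = None
for m in [1]:
    for i in range(no_images):
        fm = fts[i][m].copy()   # copy so the scaling does not modify fts in place
        if fm.min() != fm.max():
            fm -= fm.min()
            fm /= fm.max()      # scale to [0, 1]
        if image is None:
            image = ax.imshow(fm, vmin=0, vmax=1)  # fix the color scale once
        else:
            image.set_data(fm)  # reuse the existing AxesImage
        fig.savefig('m' + str(m) + '_i' + str(i) + '.png')
plt.close(fig)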

I had the same issue and tried the above solutions, but my dataset is too big for my RAM: the script still collapses after about 20,000 images. It turned out that plt.close() and del image alone were not enough, because they do not release everything matplotlib keeps around. To clear all of the pyplot state I had to combine plt.figure().clear(), plt.close(), plt.cla() and plt.clf().
This might work for you:
# fts is a numpy array of shape (4000, 100, 13, 13)
no_images = 4000
for m in [1]:
    for i in range(no_images):
        print i,
        fm = fts[i][m]
        if fm.min() != fm.max():
            fm -= fm.min()
            fm /= fm.max()  # scale to [0, 1]
        else:
            print 'unscaled'
        plt.imshow(fm)
        plt.savefig('m' + str(m) + '_i' + str(i) + '.png')
        plt.figure().clear()
        plt.close()
        plt.cla()
        plt.clf()
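Another common pattern that avoids the leak without the pile of plt.* cleanup calls (a sketch, not from the answers above) is to create one figure per image with the object-oriented interface and close it explicitly:

import matplotlib.pyplot as plt

for m in [1]:
    for i in range(no_images):
        fm = fts[i][m]
        fig, ax = plt.subplots()
        ax.imshow(fm)
        fig.savefig('m' + str(m) + '_i' + str(i) + '.png')
        plt.close(fig)  # releases the figure and its AxesImage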

Related

Very high memory usage with simple Python loop

I have the following code, which reads in a set of (small) observations, runs a cross-correlation calculation on them, and then saves some plots:
import os
import matplotlib.pyplot as plt
import numpy as np
import astropy.units as u
from mpl_toolkits.axes_grid1 import make_axes_locatable
from sunkit_image.time_lag import cross_correlation, get_lags, max_cross_correlation, time_lag

time = np.linspace(0, 43200, num=int(43200/12))
timeu = time * u.s
for i in range(len(folders)):  # loop over all dates
    os.chdir('/Volumes/LaCie/timelags/RARs/' + folders[i])
    print(folders[i])
    for j in range(len(pairs)):  # iterates over every pair of data sets
        for x in range(36):  # sets up a sliding 2-hour window that shifts 20 min at a time
            ch_a = np.load('dc' + pairs[j][0] + '.npy', allow_pickle=True)[()][100*x:(100*x)+600, :, :]  # read in only necessary data (but entire file is only ~6 GB)
            ch_b = np.load('dc' + pairs[j][1] + '.npy', allow_pickle=True)[()][100*x:(100*x)+600, :, :]  # read in only necessary data (but entire file is only ~6 GB)
            ctime = timeu[100*x:(100*x)+600]  # sets up the correct time array
            print('ctime range:', ctime[0], ctime[-1], len(ctime))
            max_cc_map = max_cross_correlation(ch_a, ch_b, ctime)
            tl_map = time_lag(ch_a, ch_b, ctime)
            del ch_a  # trying to deal with memory issue
            del ch_b  # trying to deal with memory issue
            plt.close('all')  # making sure I don't just create endless open plots
            fig = plt.figure()
            ax = fig.add_subplot()
            im = ax.imshow(np.flip(tl_map, axis=0), cmap="cubehelix", vmin=-6000, vmax=6000)
            cax = make_axes_locatable(ax).append_axes("right", size="5%", pad="10%")
            fig.colorbar(im, cax=cax, label=r"$\tau_{AB}$ [s]")
            plt.tight_layout()
            fig.savefig('timelag_' + pairs[j][0] + '_' + pairs[j][1] + '_' + str(x) + '.png', dpi=400)
            fig = plt.figure()
            ax = fig.add_subplot()
            im = ax.imshow(np.flip(max_cc_map, axis=0), cmap="plasma", vmin=0, vmax=1)
            cax = make_axes_locatable(ax).append_axes("right", size="5%", pad="10%")
            fig.colorbar(im, cax=cax, label=r"Max Cross-correlation")
            plt.tight_layout()
            fig.savefig('maxcc_' + pairs[j][0] + '_' + pairs[j][1] + '_' + str(x) + '.png', dpi=400)
            fig = plt.figure(figsize=(10, 6))
            values_tl, bins_tl, bars = plt.hist(np.ravel(np.asarray(tl_map)),
                                                bins=np.arange(-6000, 6000, 12000/50), log=True, label='Time Lags')
            values_masked, bins_masked, bars = plt.hist(np.ravel(np.asarray(tl_map)[np.where(np.asarray(max_cc_map) > 0.25)]),
                                                        bins=np.arange(-6000, 6000, 12000/50), log=True, label='Masked CC > 0.25')
            values_masked2, bins_masked2, bars = plt.hist(np.ravel(np.asarray(tl_map)[np.where(np.asarray(max_cc_map) > 0.5)]),
                                                          bins=np.arange(-6000, 6000, 12000/50), log=True, label='Masked CC > 0.5')
            values_masked3, bins_masked3, bars = plt.hist(np.ravel(np.asarray(tl_map)[np.where(np.asarray(max_cc_map) > 0.75)]),
                                                          bins=np.arange(-6000, 6000, 12000/50), log=True, label='Masked CC > 0.75')
            plt.ylabel('Pixel Occurrence')
            plt.legend()
            fig.savefig('hist_tl_cc_' + pairs[j][0] + '_' + pairs[j][1] + '_' + str(x) + '.png', dpi=400)
As noted in the comments, I've inserted a few lines to try to dump unnecessary data between iterations. I know a 3-deep for loop isn't the most efficient way to code, but the loops over the dates and channel pairs are very short -- almost all of the time/memory is spent in the innermost loop. The problem is that after a few minutes the memory usage oscillates between 30 and 55 GB. My Mac becomes sluggish, and it's only at the beginning of the dataset. Is there something I'm missing here? Even if the entire files were being read in at the beginning instead of a subset, that would only be ~12 GB of data, and the code would crash if I were reading in the whole thing (i.e., it's definitely only reading in part of the raw data). I tried a with statement, but that didn't take up less memory. Any suggestions would be very welcome!
In each pass through the innermost loop you create three figures but never close them. After each fig.savefig(...), close the figure with plt.close(fig).
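Roughly, the pattern looks like this (a sketch with random stand-in data rather than your actual maps):

import numpy as np
import matplotlib.pyplot as plt

data = np.random.rand(50, 50)  # stand-in for tl_map / max_cc_map

for x in range(36):
    fig = plt.figure()
    ax = fig.add_subplot()
    ax.imshow(data, cmap="cubehelix")
    fig.savefig('frame_' + str(x) + '.png', dpi=400)
    plt.close(fig)  # release the figure (and everything attached to it) immediately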

How can I simplify the following code so it runs faster?

I have a three-dimensional array containing many 2D images (frames). I want to remove the background by applying a threshold to each pixel value and copying the result into new 3D arrays. I wrote the following code, but it is far too expensive to run. How can I speed it up?
ss = stack  # 3D array (571, 1040, 1392)
T, ni, nj = ss.shape
Background_intensity = np.ones([T, ni, nj])
Intensity = np.zeros([T, ni, nj])
DeltaF_F_max = np.zeros([T, ni, nj])
for t in range(T):
    for i in range(ni):
        for j in range(nj):
            if ss[t, i, j] < 12:
                Background_intensity[t, i, j] = ss[t, i, j]
                if Background_intensity[t, i, j] == 0:
                    Background_intensity[t, i, j] = 1
            else:
                Intensity[t, i, j] = ss[t, i, j]
            DeltaF_F_max[t, i, j] = (Intensity[t, i, j] - Background_intensity[t, i, j]) / Background_intensity[t, i, j]
I had a go at this with NumPy. I am not sure what timings you got, but it takes around 20 s on my Mac. It is still quite a memory hog, even after I reduced all the array sizes by a factor of 8, because you don't need an int64 to store a 1, a number under 12, or 255.
I wonder if you need to do all 571 images in one go, or whether you could process them "on the fly" as you acquire them rather than gathering them all into one enormous lump.
You could also consider doing this with Numba, as it is very good at optimising for loops - try putting [numba] in the search box above, or look at this example using prange to parallelise the loops across your CPU cores (a sketch of that approach follows the NumPy version below).
Anyway, here is my code:
#!/usr/bin/env python3
# https://stackoverflow.com/q/71460343/2836621

import numpy as np

T, ni, nj = 571, 1040, 1392

# Create representative input data, such that around 1/3 of it is < 12 for testing
ss = np.random.randint(0, 36, (T, ni, nj), np.uint8)

# Ravel into 1-D representation for simpler indexing
ss_r = ss.ravel()

# Create extra arrays but using 800MB rather than 6.3GB each, also ravelled
Background_intensity = np.ones(T*ni*nj, np.uint8)
Intensity = np.zeros(T*ni*nj, np.uint8)

# Make Boolean (True/False) mask of elements below threshold
mask = ss_r < 12

# Quick check here - print(np.count_nonzero(mask)/np.size(ss)) and check it is 0.333

# Set Background_intensity to "ss" according to mask
Background_intensity[mask] = ss_r[mask]

# Make sure no zeroes present
Background_intensity[Background_intensity == 0] = 1

# This corresponds to the "else" of your original "if" statement
Intensity[~mask] = ss_r[~mask]

# Final calculation and reshaping back to original shape.
# Cast to float before subtracting so the uint8 arithmetic cannot wrap around.
DeltaF_F_max = (Intensity.astype(np.float32) - Background_intensity) / Background_intensity
DeltaF_F_max = DeltaF_F_max.reshape((T, ni, nj))
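And here is the kind of Numba version mentioned above - a sketch only, not benchmarked against your data, using the same representative random uint8 input:

import numpy as np
from numba import njit, prange

@njit(parallel=True)
def delta_f_max(ss):
    T, ni, nj = ss.shape
    out = np.empty((T, ni, nj), np.float32)
    for t in prange(T):                          # parallelised across frames / CPU cores
        for i in range(ni):
            for j in range(nj):
                v = float(ss[t, i, j])
                if v < 12.0:
                    bg = v if v != 0.0 else 1.0  # background, clamped away from zero
                    inten = 0.0
                else:
                    bg = 1.0                     # Background_intensity keeps its initial value of 1
                    inten = v
                out[t, i, j] = (inten - bg) / bg
    return out

ss = np.random.randint(0, 36, (571, 1040, 1392)).astype(np.uint8)
result = delta_f_max(ss)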

Is there a way to reduce RAM consumption for my Python code?

I am trying to run deep learning code for human action recognition on the Kaggle platform, and I ran into a RAM shortage caused by this part of my code, which reads the frames of the MP4 files in a dataset (350 files at high resolution and 30 fps):
data = []
labls = []
for i, item in enumerate(tqdm(names)):
    print(names[i])
    imgs = get_frames(names[i])
    for j in imgs:
        data.append(j)
        labls.append(labels[i])
and
def get_frames(fileFullPath):
    # Declare a list to store video frames.
    images = []
    video_reader = cv2.VideoCapture(fileFullPath)
    # Get the total number of frames in the video.
    video_frames_count = int(video_reader.get(cv2.CAP_PROP_FRAME_COUNT))
    SEQUENCE_LENGTH = min(int(video_frames_count * SEQUENCE_Ratio), 25)
    # Calculate the interval after which frames will be added to the list.
    skip_frames_window = max((video_frames_count / SEQUENCE_LENGTH), 1)
    frame_counter = 0
    print(" Fetched frames=", SEQUENCE_LENGTH)
    while frame_counter < SEQUENCE_LENGTH:
        # Print the percentage-progress.
        print_progress(count=frame_counter, max_count=SEQUENCE_LENGTH - 1)
        print(int(frame_counter * skip_frames_window), "Seq=", SEQUENCE_LENGTH)
        # Set the current frame position of the video.
        video_reader.set(cv2.CAP_PROP_POS_FRAMES, int(frame_counter * skip_frames_window))
        success, image = video_reader.read()
        RGB_img = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        # Resize frames to (224 * 224).
        res = cv2.resize(RGB_img, dsize=(IMAGE_HEIGHT_WIDTH, IMAGE_HEIGHT_WIDTH), interpolation=cv2.INTER_CUBIC)
        # Normalize the resized frame by dividing it by 255 so that each pixel value lies between 0 and 1.
        normalized_frame = res / 255
        images.append(normalized_frame)
        frame_counter += 1
    # Release the VideoCapture object.
    video_reader.release()
    # Return the frames list.
    return images
I lowered the number of frames to as few as 25 per file (not per second), and yet the 13 GB provided by Kaggle is not enough. To make matters worse, I next have to convert the lists into NumPy arrays, which also takes a lot of memory:
# convert the data and labels to NumPy arrays
data = np.array(data)
labls = np.array(labls)
print('number of frames that will be used to train and test the model is', len(data))
Any suggestions would be appreciated, thanks.
If your data does not fit into your RAM, use a data pipeline. It works like this:
1. Read a single batch of data from disk into memory.
2. Process that batch (e.g. normalize/scale/crop/...).
3. Feed that batch to the model and perform an optimization step.
4. Repeat from 1. until all samples in the dataset have been used.
Apply these steps to the training set first to run training for one epoch. After that you can iterate over your validation set in the same way, and subsequently continue with the second training epoch.
If you train your model on a GPU, data loading and preprocessing can be handled by the CPU while the GPU completes the optimization step, which provides a massive speed-up as well.
There are packages in PyTorch and TensorFlow that make this really easy and even apply multiprocessing to speed up the whole routine.
Check this guide by TensorFlow for the basics and this guide to optimize the pipeline.
PyTorch has something similar here.
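A minimal PyTorch-style sketch of such a pipeline, reusing get_frames from the question (this is an illustration, not the original answer's code; it assumes names and labels exist and that every video yields the same number of frames, otherwise a custom collate_fn is needed):

import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

class VideoFrameDataset(Dataset):
    """Loads the frames of one video at a time instead of keeping all videos in RAM."""

    def __init__(self, names, labels):
        self.names = names
        self.labels = labels

    def __len__(self):
        return len(self.names)

    def __getitem__(self, idx):
        # Frames are read from disk only when this sample is requested.
        frames = np.asarray(get_frames(self.names[idx]), dtype=np.float32)  # (seq_len, H, W, 3)
        return torch.from_numpy(frames), self.labels[idx]

dataset = VideoFrameDataset(names, labels)
loader = DataLoader(dataset, batch_size=4, shuffle=True, num_workers=2)

for batch_frames, batch_labels in loader:
    ...  # feed the batch to the model and run one optimization step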

Creating moving images in Python

I am wondering what the best approach is to turn a large number of images into a moving one in Python. A lot of examples I've found seem to deal with actual videos or video games, such as pygame, which seems overcomplicated for what I'm looking to do.
I have created a loop and would like the image to update every time the code runs through the loop. Is there possibly a method in Python to plot each new image on top of the previous one, erasing it with each iteration?
sweeps_no = 10
for t in range(sweeps_no):
    i = np.random.randint(N)
    j = np.random.randint(N)
    arr = nearestneighbours(lat, N, i, j)
    energy = delta_E(lat[i,j], arr, J)
    if energy <= 0:
        matrix[i,j] *= matrix[i,j]
    elif np.exp(energy/T) >= np.random.random():
        matrix[i,j] *= -matrix[i,j]
    else:
        matrix[i,j] = matrix[i,j]
    t += 1
    print t
    res.append(switch)
    image = plt.imshow(lat)
    plt.show()
Also, I can't understand why the loop above doesn't result in 10 different images showing up, since plt.imshow is called inside the loop.
You can update a single figure using fig.canvas.draw() after your call to imshow(). It is important to include a pause i.e. plt.pause(2), so that you can see the changes to your figure.
The following is a runnable example:
import matplotlib.pyplot as plt
import numpy as np

fig = plt.figure()  # create the figure
for i in range(10):
    data = np.random.randn(25).reshape(5,5)  # some fake data
    plt.imshow(data)
    fig.canvas.draw()
    plt.pause(2)  # pause for 2 seconds
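If the goal is an actual movie file rather than a live-updating window, one option (not part of the answer above) is matplotlib's animation module. A minimal sketch, with random data standing in for the lattice:

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

fig, ax = plt.subplots()
image = ax.imshow(np.random.randn(5, 5), vmin=-3, vmax=3)

def update(frame):
    image.set_data(np.random.randn(5, 5))  # replace with your lattice for this sweep
    return [image]

anim = FuncAnimation(fig, update, frames=10, interval=200)
anim.save('sweeps.gif', writer='pillow')  # requires the Pillow package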

Fast downsampling of a huge matrix using Python (numpy memmap, PyTables or other?)

As part of my data processing I produce huge non-sparse matrices on the order of 100000x100000 cells, which I want to downsample by a factor of 10 to reduce the amount of data. In this case I want to average over blocks of 10x10 pixels, to reduce the size of my matrix from 100000x100000 to 10000x10000.
What is the fastest way to do this in Python? It does not matter to me if I need to save my original data in a new data format, because I have to downsample the same dataset multiple times.
Currently I am using numpy.memmap:
import numpy as np

data_1 = 'data_1.dat'
data_2 = 'data_2.dat'
lines = 100000
pixels = 100000
window = 10

new_lines = lines // window
new_pixels = pixels // window

dat_1 = np.memmap(data_1, dtype='float32', mode='r', shape=(lines, pixels))
dat_2 = np.memmap(data_2, dtype='float32', mode='r', shape=(lines, pixels))

dat_in = dat_1 * dat_2
dat_out = dat_in.reshape([new_lines, window, new_pixels, window]).mean(3).mean(1)
But with large files this method becomes very slow. Likely this has something to do with the binary data in these files being ordered by line. Therefore, I think a data format that stores my data in blocks instead of lines would be faster, but I am not sure what the performance gain would be, or whether there are Python packages that support this.
I have also thought about downsampling the data before creating such a huge matrix (not shown here), but my input data is fractured and irregular, so that would become very complex.
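For reference (an illustration, not from the question or the answers below), HDF5 - via PyTables or h5py - is one format that stores data in blocks ("chunks") rather than line by line. A minimal h5py sketch, with hypothetical file and dataset names:

import h5py
import numpy as np

lines, pixels = 100000, 100000
src = np.memmap('data_1.dat', dtype='float32', mode='r', shape=(lines, pixels))

# Store the matrix in an HDF5 file with 1000x1000 chunks, so that later
# block-wise reads only touch the chunks they actually need.
with h5py.File('data_1.h5', 'w') as f:
    dset = f.create_dataset('data', shape=(lines, pixels), dtype='float32',
                            chunks=(1000, 1000))
    for row in range(0, lines, 1000):
        dset[row:row + 1000, :] = src[row:row + 1000, :]  # copy one band of rows at a time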
Based on this answer, I think this might be a relatively fast method, depending on how much overhead reshape gives you with memmap.
def downSample(a, window):
    i, j = a.shape
    ir = np.arange(0, i, window)
    jr = np.arange(0, j, window)
    n = 1. / (window**2)
    return n * np.add.reduceat(np.add.reduceat(a, ir), jr, axis=1)
Hard to test speed without your dataset.
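A quick way to sanity-check it on a small array (my addition, with downSample defined as above):

import numpy as np

a = np.arange(16, dtype=float).reshape(4, 4)
print(downSample(a, 2))
# Each output cell is the mean of one 2x2 block:
# [[ 2.5  4.5]
#  [10.5 12.5]]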
This avoids an intermediate copy, as the reshape keeps the dimensions contiguous:
dat_in.reshape((lines // window, window, pixels // window, window)).mean(axis=(1, 3))
