I am trying to debug a Python Flask application whose memory grows over time.
I understand that tracemalloc tracks memory allocated by Python and, in each snapshot, shows the traceback of where that memory was allocated in my program.
My question is whether a snapshot shows historical data about all the memory that was ever allocated, or only the memory that was allocated and has not been freed yet.
I took a tracemalloc snapshot after my program's memory had increased.
Based on a test I did, it looks like tracemalloc does not keep freed memory around: a snapshot only reports blocks that are still allocated at the moment it is taken.
import tracemalloc
import gc

tracemalloc.start(1)

y = None

def dump_tracemalloc_snapshot():
    gc.collect()
    snapshot = tracemalloc.take_snapshot()
    stats = snapshot.statistics('traceback')
    for stat in stats[:3]:
        print("%s memory blocks: %.1f KiB" % (stat.count, stat.size / 1024))
        for line in stat.traceback.format():
            print(line)
        print('\n\n')

def allocate_memory():
    global y
    x = []
    for i in range(100000):
        x.append(f"AAA{i}")
    y = x

if __name__ == "__main__":
    allocate_memory()
    dump_tracemalloc_snapshot()
    del y
    dump_tracemalloc_snapshot()
Output:
100001 memory blocks: 6337.7 KiB
  File "/Users/jafar.atili/Code/tracemalloc_parser/file.py", line 23
    x.append(f"AAA{i}")
1 memory blocks: 0.1 KiB
  File "/Users/jafar.atili/Code/tracemalloc_parser/file.py", line 19
    def allocate_memory():
1 memory blocks: 0.1 KiB
  File "/Users/jafar.atili/Code/tracemalloc_parser/file.py", line 6
    def dump_tracemalloc_snapshot():

26 memory blocks: 2.1 KiB
  File "/opt/homebrew/Cellar/python@3.11/3.11.1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/linecache.py", line 137
    lines = fp.readlines()
2 memory blocks: 0.2 KiB
  File "/opt/homebrew/Cellar/python@3.11/3.11.1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/linecache.py", line 143
    cache[filename] = size, mtime, lines, fullname
1 memory blocks: 0.1 KiB
  File "/Users/jafar.atili/Code/tracemalloc_parser/file.py", line 19
    def allocate_memory():
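Since a snapshot only contains blocks that are still alive when it is taken, one way to chase growth over time is to diff two snapshots rather than read a single one. A minimal sketch using tracemalloc's compare_to (the frame depth and the number of entries printed are arbitrary choices, not anything from the program above):

import tracemalloc

tracemalloc.start(25)  # keep up to 25 frames per allocation traceback

snapshot1 = tracemalloc.take_snapshot()
# ... let the application run / serve requests for a while ...
snapshot2 = tracemalloc.take_snapshot()

# compare_to() returns StatisticDiff objects; size_diff and count_diff show
# what was allocated between the two snapshots and has not been freed yet.
for stat in snapshot2.compare_to(snapshot1, 'traceback')[:5]:
    print("%+d blocks, %+.1f KiB" % (stat.count_diff, stat.size_diff / 1024))
    for line in stat.traceback.format():
        print(line)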
Related
I'm writing a Python script that makes many plots. These plots are generated from a main program that calls the plotting functions repeatedly (hundreds of times).
As the main function runs, I can watch my computer's RAM fill up during the execution. Furthermore, even after the main function finishes, the RAM usage is still much higher than before the main program ran. Sometimes it can even fill the RAM completely.
I tried deleting the heaviest variables and running the garbage collector, but the net RAM usage always ends up higher. Why is this happening?
Below is a simple (and exaggerated) example of one of my functions; I used memory_profiler to see the memory usage line by line.
Line # Mem usage Increment Occurrences Line Contents
=============================================================
15 100.926 MiB 100.926 MiB 1 @profile
16 def my_func():
17 108.559 MiB 7.633 MiB 1 a = [1] * (10 ** 6)
18 261.148 MiB 152.590 MiB 1 b = [2] * (2 * 10 ** 7)
19 421.367 MiB 160.219 MiB 1 c = a + b
20 428.609 MiB 7.242 MiB 1 plt.figure(dpi=10000)
21 430.328 MiB 1.719 MiB 1 plt.plot(np.random.rand(1000),np.random.rand(1000))
22 487.738 MiB 57.410 MiB 1 plt.show()
23 487.738 MiB 0.000 MiB 1 plt.close('all')
24 167.297 MiB -320.441 MiB 1 del a,b,c
25 118.922 MiB -48.375 MiB 1 print(gc.collect())
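For reference, a report like the table above is produced by decorating the function with memory_profiler's @profile and running the script. A minimal sketch reassembled from the lines shown in the table (the exact numbers will of course differ per machine):

import gc
import numpy as np
import matplotlib.pyplot as plt
from memory_profiler import profile

@profile
def my_func():
    a = [1] * (10 ** 6)
    b = [2] * (2 * 10 ** 7)
    c = a + b
    plt.figure(dpi=10000)
    plt.plot(np.random.rand(1000), np.random.rand(1000))
    plt.show()
    plt.close('all')
    del a, b, c
    print(gc.collect())

if __name__ == '__main__':
    my_func()  # memory_profiler prints the line-by-line report when the function returns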
I am trying to use psutil to measure the memory usage.
However, I found a strange behavior: even though I don't store or load anything, the reported memory usage keeps increasing inside a nested for loop.
For example, if I run the following code,
import os
import psutil

for i in range(10):
    print(i)
    for j in range(5):
        mem_usage = psutil.Process(os.getpid()).memory_info()[0] / 2 ** 20
        print("{}{} MB".format(j, mem_usage))
I get the following output
0
0178 MB
1178 MB
2178 MB
3178 MB
4178 MB
1
0178 MB
1178 MB
2178 MB
3178 MB
4178 MB
What is going on here?
Is psutil not doing what I intend to do?
It's the formatting of your string that is misleading you:
"{}{} MB".format(j,mem_usage)
There is no space between j and mem_usage, so it looks like the memory is increasing when it is not. Here is a version with the space added that also reuses the Process object and reads the RSS field directly:
import os
import psutil

p = psutil.Process(os.getpid())
for i in range(10):
    print(i)
    for j in range(5):
        mem_usage = p.memory_info().rss / 1024 / 1024
        print("{} {} MB".format(j, mem_usage))
I am having trouble with high memory usage when performing ffts with scipy's fftpack. Example obtained with the memory_profiler module:
Line # Mem usage Increment Line Contents
================================================
4 50.555 MiB 0.000 MiB @profile
5 def test():
6 127.012 MiB 76.457 MiB a = np.random.random(int(1e7))
7 432.840 MiB 305.828 MiB b = fftpack.fft(a)
8 891.512 MiB 458.672 MiB c = fftpack.ifft(b)
9 585.742 MiB -305.770 MiB del b, c
10 738.629 MiB 152.887 MiB b = fftpack.fft(a)
11 891.512 MiB 152.883 MiB c = fftpack.ifft(b)
12 509.293 MiB -382.219 MiB del a, b, c
13 547.520 MiB 38.227 MiB a = np.random.random(int(5e6))
14 700.410 MiB 152.891 MiB b = fftpack.fft(a)
15 929.738 MiB 229.328 MiB c = fftpack.ifft(b)
16 738.625 MiB -191.113 MiB del a, b, c
17 784.492 MiB 45.867 MiB a = np.random.random(int(6e6))
18 967.961 MiB 183.469 MiB b = fftpack.fft(a)
19 1243.160 MiB 275.199 MiB c = fftpack.ifft(b)
My attempt at understanding what is going on here:
The amount of memory allocated by both fft and ifft on lines 7 and 8 is more than what they need to allocate to return a result. For the call b = fftpack.fft(a), 305 MiB is allocated. The amount of memory needed for the b array itself is 16 B/value * 1e7 values ≈ 153 MiB (16 B per value because the result is complex128). It seems that fftpack is allocating some kind of workspace, and that the workspace is roughly equal in size to the output array (?).
On lines 10 and 11 the same procedure is run again, but the memory usage is less this time, and more in line with what I expect. It therefore seems that fftpack is able to reuse the workspace.
On lines 13-15 and 17-19 ffts with different, smaller input sizes are performed. In both of these cases more memory than what is needed is allocated, and memory does not seem to be reused.
The memory usage reported above agrees with what the Windows Task Manager reports (to the accuracy with which I can read those graphs). If I write such a script with larger input sizes, I can make my (Windows) computer very slow, indicating that it is swapping.
A second example to illustrate the problem of the memory allocated for workspace:
from time import time

import numpy as np
from scipy import fftpack

factor = 4.5

a = np.random.random(int(factor * 3e7))
start = time()
b = fftpack.fft(a)
c = fftpack.ifft(b)
end = time()
print("Elapsed: {:.4g}".format(end - start))
del a, b, c
print("Finished first fft")

a = np.random.random(int(factor * 2e7))
start = time()
b = fftpack.fft(a)
c = fftpack.ifft(b)
end = time()
print("Elapsed: {:.4g}".format(end - start))
del a, b, c
print("Finished second fft")
The code prints the following:
Elapsed: 17.62
Finished first fft
Elapsed: 38.41
Finished second fft
Notice how the second fft, which has the smaller input size, takes more than twice as long to compute. I noticed that my computer was very slow (likely swapping) during the execution of this script.
Questions:
Is it correct that the fft can be calculated in place, without the need for extra workspace? If so, why does fftpack not do that?
Is there a problem with fftpack here? Even if it needs extra workspace, why does it not reuse its workspace when the fft is rerun with different input sizes?
EDIT:
Old, but possibly related: https://mail.scipy.org/pipermail/scipy-dev/2012-March/017286.html
Is this the answer? https://github.com/scipy/scipy/issues/5986
This is a known issue. It is caused by fftpack caching its strategy for computing the fft for a given input size. That cache is about as large as the output of the computation, so if you run large ffts with many different input sizes, the memory consumption can become significant.
The problem is described in detail here:
https://github.com/scipy/scipy/issues/5986
Numpy has a similar problem, which is being worked on:
https://github.com/numpy/numpy/pull/7686
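A rough way to observe the caching described in the linked issue (assuming psutil is available; the absolute numbers depend on the machine): the resident set size stays roughly flat when the same input size is transformed again, but keeps climbing as new sizes are introduced.

import os

import numpy as np
import psutil
from scipy import fftpack

def rss_mib():
    return psutil.Process(os.getpid()).memory_info().rss / 2.0 ** 20

# Transforming the same size repeatedly reuses the cached workspace...
a = np.random.random(int(1e7))
for _ in range(3):
    b = fftpack.fft(a)
    print("same size:       %.0f MiB" % rss_mib())

# ...while every new size adds another cache entry.
for n in (int(1e7), int(5e6), int(6e6)):
    b = fftpack.fft(np.random.random(n))
    print("size %8d: %.0f MiB" % (n, rss_mib()))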
I'm loading large h5 files into memory using numpy ndarrays. I read that my system (Win 7 Professional, 6 GB RAM) is supposed to allow python.exe to use about 2 GB of physical memory.
However, I'm getting a MemoryError already just shy of 1 GB. Even stranger, this lower limit seems to apply only to numpy arrays, not to lists.
I've tested my memory consumption using the following function found here:
import psutil
import gc
import os
import numpy as np
from matplotlib.pyplot import pause

def memory_usage_psutil():
    # return the memory usage in MB
    process = psutil.Process(os.getpid())
    mem = process.get_memory_info()[0] / float(2 ** 20)
    return mem
Test 1: Testing memory limits for an ordinary list
print 'Memory - %d MB' % memory_usage_psutil()  # prints memory usage after imports

a = []
while 1:
    try:
        a.append([x * 2000 for x in xrange(10000)])
    except MemoryError:
        print 'Memory - %d MB' % memory_usage_psutil()
        a = []
        print 'Memory - %d MB' % memory_usage_psutil()
        print 'run garbage collector: collected %d objects.' % gc.collect()
        print 'Memory - %d MB\n\n' % memory_usage_psutil()
        break
Test 1 prints:
Memory - 39 MB
Memory - 1947 MB
Memory - 1516 MB
run garbage collector: collected 0 objects.
Memory - 49 MB
Test 2: Creating a number of large np.array's
shape = (5500,5500)
names = ['b', 'c', 'd', 'g', 'h']
try:
for n in names:
globals()[n] = np.ones(shape, dtype='float64')
print 'created variable %s with %0.2f MB'\
%(n,(globals()[n].nbytes/2.**20))
except MemoryError:
print 'MemoryError, Memory - %d MB. Deleting files..'\
%memory_usage_psutil()
pause(2)
# Just added the pause here to be able to observe
# the spike of memory in the Windows task manager.
for n in names:
globals()[n] = []
print 'Memory - %d MB' %memory_usage_psutil()
print 'run garbage collector: collected %d objects.' %gc.collect()
print 'Memory - %d MB' %memory_usage_psutil()
Test 2 prints:
Memory - 39 MB
created variable b with 230.79 MB
created variable c with 230.79 MB
created variable d with 230.79 MB
created variable g with 230.79 MB
MemoryError, Memory - 964 MB. Deleting files..
Memory - 39 MB
run garbage collector: collected 0 objects.
Memory - 39 MB
My question: why do I get a MemoryError before I'm even close to the 2 GB limit, and why is there a difference in the memory limit between a list and an np.array? What am I missing?
I'm using python 2.7 and numpy 1.7.1
This is probably happening because a numpy array uses a single C-level buffer (for speed) that must be obtained with one malloc-like call, and that call fails when no contiguous block of roughly 230 MB can be found for the next array. A Python list, by contrast, only stores pointers to its elements, and in your list test each appended sublist is a small, separate allocation, so the list's memory does not need to be contiguous. Hence, if you have enough memory available but it is fragmented, the large array allocation can fail while the list can still make use of all the non-contiguous pieces.
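A quick sanity check on the sizes involved (plain arithmetic, matching the 230.79 MB reported in Test 2): each of those float64 arrays needs one contiguous block of about 231 MiB, whereas Test 1's list grows through thousands of small, independent allocations.

import numpy as np

shape = (5500, 5500)
bytes_per_array = shape[0] * shape[1] * np.dtype('float64').itemsize
print('%.2f MiB per array' % (bytes_per_array / 2.0 ** 20))  # 230.79 MiB

# Test 1, by contrast, appends one ~10000-element sublist at a time, so no
# single allocation ever has to be anywhere near that large or contiguous.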
I'm curious to understand why, in this first example, the memory consumption behaves the way I imagined it would:
from StringIO import StringIO

s = StringIO()
s.write('abc' * 10000000)
# Memory increases: OK

s.seek(0)
s.truncate()
# Memory decreases: OK
while in this second example, which in the end does the same thing, the memory does not seem to decrease after the call to truncate().
The following code is in a method of a class.
from StringIO import StringIO
import requests

self.BUFFER_SIZE = 5 * 1024 * 2 ** 10  # 5 MB
self.MAX_MEMORY = 3 * 1024 * 2 ** 10   # 3 MB

r = requests.get(self.target, stream=True)  # stream=True to not download the data at once
chunks = r.iter_content(chunk_size=self.BUFFER_SIZE)
buff = StringIO()

# Get the first MAX_MEMORY bytes of data
for chunk in chunks:
    buff.write(chunk)
    if buff.len > self.MAX_MEMORY:
        break

# We left the loop because there are no more chunks: the data stays in memory
if buff.len < self.MAX_MEMORY:
    self.data = buff.getvalue()
# Otherwise, prepare a temp file and process the remaining chunks
else:
    self.path = self._create_tmp_file_path()
    with open(self.path, 'w') as f:
        # Write the first downloaded data
        buff.seek(0)
        f.write(buff.read())

        # Free the buffer?
        buff.seek(0)
        buff.truncate()
        ###################
        # Memory does not decrease here.
        # Another 5 MB will be added to the memory on the next line, which is
        # normal because that is the size of a chunk. But if the buffer were
        # freed, the memory would stay steady: -5 MB + 5 MB.

        # Write the remaining chunks directly into the file
        for chunk in chunks:
            f.write(chunk)
Any thoughts?
Thanks.
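One way to narrow this down is to take the download out of the picture and watch the process RSS around write()/truncate()/close() directly. A minimal standalone sketch (assuming psutil is installed; with very old psutil versions the call is spelled get_memory_info() instead of memory_info()):

from StringIO import StringIO  # Python 2, as in the question above
import os
import psutil

def rss_mb():
    return psutil.Process(os.getpid()).memory_info().rss / 2.0 ** 20

buff = StringIO()
print('start:          %d MB' % rss_mb())

buff.write('abc' * 10000000)
print('after write:    %d MB' % rss_mb())

buff.seek(0)
buff.truncate()
print('after truncate: %d MB' % rss_mb())

buff.close()  # drops the internal buffer entirely
del buff
print('after close:    %d MB' % rss_mb())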