Python - increased memory usage when using a wrapper function?

I wrote a test for a program and checked the memory usage. For some reason, wrapping the same operation in a function results in a polynomial (or maybe exponential) increase in memory usage. I am using memory_profiler to track memory usage, Python 2.7.1.
Code:
def convertIntToBitint(number): return (pow(2, number - 1))
Test code and results (cleaned up a bit):
Line # Mem usage Increment Line Contents
23 9.070 MB 0.000 MB def test_convertBigIntToBitint(self):
24 9.070 MB 0.000 MB result1 = convertIntToBitint(21000)
25 9.070 MB 0.000 MB answer1 = pow(2, 20999)
27 10.496 MB 1.426 MB result2 = convertIntToBitint(5015280)
28 11.785 MB 1.289 MB answer2 = pow(2, 5015279)
30 70.621 MB 58.836 MB result3 = convertIntToBitint(121000000)
31 85.367 MB 14.746 MB answer3 = pow(2, 120999999)
Why does convertIntToBitint take up more memory? And why doesn't the memory usage grow linearly?
EDIT for 2rs2ts
Interesting. Not sure if this is what you meant.
Line # Mem usage Increment Line Contents
23 9.074 MB 0.000 MB def test_convertBigIntToBitint(self):
24 56.691 MB 47.617 MB result3 = convertIntToBitint(121000000)
25 85.539 MB 28.848 MB answer3 = pow(2, 120999999)
26 85.539 MB 0.000 MB result2 = convertIntToBitint(5015280)
27 84.258 MB -1.281 MB answer2 = pow(2, 5015279)
28 83.773 MB -0.484 MB result1 = convertIntToBitint(21000)
29 81.211 MB -2.562 MB answer1 = pow(2, 20999)
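For what it's worth, one way to separate the size of the integer objects themselves from the process-level deltas that memory_profiler reports is sys.getsizeof. A minimal sketch (not from the original post; it assumes nothing beyond the standard library):

import sys


def convertIntToBitint(number):
    return pow(2, number - 1)


# The object size grows roughly linearly with the number of bits,
# unlike the RSS deltas above, which also reflect allocator behaviour
# and any temporaries created along the way.
for n in (21000, 5015280, 121000000):
    result = convertIntToBitint(n)
    print("{} -> {} bytes".format(n, sys.getsizeof(result)))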

Related

Matplotlib does not free up RAM memory after closing a function

I'm coding a Python script which makes many plots. These plots are called from a main program which calls them repeatedly (hundreds of times).
As the main function runs, I see my computer's RAM fill up during the execution. Furthermore, even after the main function finishes, RAM usage is still much higher than before the main program ran. Sometimes it can even completely fill the RAM.
I tried to delete the heaviest variables and use the garbage collector, but the net RAM usage is always higher. Why is this happening?
I attached a simple (and exaggerated) example of one of my functions, and I used memory_profiler to see the memory usage line by line.
Line # Mem usage Increment Occurrences Line Contents
=============================================================
15 100.926 MiB 100.926 MiB 1 @profile
16 def my_func():
17 108.559 MiB 7.633 MiB 1 a = [1] * (10 ** 6)
18 261.148 MiB 152.590 MiB 1 b = [2] * (2 * 10 ** 7)
19 421.367 MiB 160.219 MiB 1 c = a + b
20 428.609 MiB 7.242 MiB 1 plt.figure(dpi=10000)
21 430.328 MiB 1.719 MiB 1 plt.plot(np.random.rand(1000),np.random.rand(1000))
22 487.738 MiB 57.410 MiB 1 plt.show()
23 487.738 MiB 0.000 MiB 1 plt.close('all')
24 167.297 MiB -320.441 MiB 1 del a,b,c
25 118.922 MiB -48.375 MiB 1 print(gc.collect())
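A common mitigation for this pattern is to work with explicit Figure handles, close them, and force a collection. The sketch below is illustrative only (the Agg backend, file name, and loop count are assumptions, not from the original code), and CPython may still not return freed memory to the OS right away:

import gc

import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption, not in the original code)
import matplotlib.pyplot as plt
import numpy as np


def make_plot():
    # Use an explicit Figure handle instead of the implicit pyplot state.
    fig, ax = plt.subplots(dpi=100)
    ax.plot(np.random.rand(1000), np.random.rand(1000))
    fig.savefig("plot.png")
    plt.close(fig)  # drop pyplot's reference to this figure


for _ in range(100):
    make_plot()
gc.collect()  # reclaim reference cycles; RSS can still stay elevated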

Python memory profile using psutil

I am trying to use psutil to measure memory usage.
However, I found a strange behavior: even though I don't store or load anything, the memory usage keeps increasing in a nested for loop.
For example, if I run the following code,
import os
import psutil
for i in range(10):
    print(i)
    for j in range(5):
        mem_usage = psutil.Process(os.getpid()).memory_info()[0] / 2 ** 20
        print("{}{} MB".format(j, mem_usage))
I get the following output
0
0178 MB
1178 MB
2178 MB
3178 MB
4178 MB
1
0178 MB
1178 MB
2178 MB
3178 MB
4178 MB
What is going on here?
Is psutil not doing what I intend to do?
It's the formatting of your string which is not correct:
"{}{} MB".format(j,mem_usage)
There is no space between "j" and "mem_usage", so it looks like the memory increases when it does not. (Dividing by 2 ** 20 is the same as dividing by 1024 * 1024, so the conversion itself is fine.) A cleaner version, creating the Process object once and using memory_info().rss explicitly:
import os
import psutil

p = psutil.Process(os.getpid())
for i in range(10):
    print(i)
    for j in range(5):
        mem_usage = p.memory_info().rss / 1024 / 1024
        print("{} {} MB".format(j, mem_usage))

Python Memory Profiler results different from expected

I am trying to get an understanding of using memory_profiler on my Python app.
Referring to the Python memory_profiler guide, I copied the following code snippet:
from memory_profiler import profile

@profile
def my_func():
    a = [1] * (10 ** 6)
    b = [2] * (2 * 10 ** 7)
    del b
    return a
The expected result according to the link is:
Line # Mem usage Increment Line Contents
==============================================
3 @profile
4 5.97 MB 0.00 MB def my_func():
5 13.61 MB 7.64 MB a = [1] * (10 ** 6)
6 166.20 MB 152.59 MB b = [2] * (2 * 10 ** 7)
7 13.61 MB -152.59 MB del b
8 13.61 MB 0.00 MB return a
But when I ran it on my VM running Ubuntu 16.04, I got the following results instead:
Line # Mem usage Increment Line Contents
================================================
3 35.4 MiB 35.4 MiB @profile
4 def my_func():
5 43.0 MiB 7.7 MiB a = [1] * (10 ** 6)
6 195.7 MiB 152.6 MiB b = [2] * (2 * 10 ** 7)
7 43.1 MiB -152.5 MiB del b
8 43.1 MiB 0.0 MiB return a
There is an overhead of around 30 MiB between the expected output and my run. I am trying to understand where this comes from and whether I am doing anything incorrect. Should I be worried about it?
Please advise if anyone has any idea. Thanks.
EDIT:
OS: Ubuntu 16.04.4 (Xenial) running inside a VM
Python : Python 3.6.4 :: Anaconda, Inc.
The memory taken by a list or an integer depends heavily on the Python version/build.
For instance, in Python 3 all integers are long integers, whereas in Python 2 long is used only when the integer doesn't fit in a CPU register / C int.
On my machine, python 2:
>>> sys.getsizeof(2)
24
Python 3.6.2:
>>> sys.getsizeof(2)
28
Comparing your ratio with the 28/24 ratio, it's pretty close:
>>> 195.7/166.2
1.177496991576414
>>> 28/24
1.1666666666666667
(This is probably not the only difference, but it's the most obvious one I can think of.)
So no, as long as the results are proportional, you shouldn't worry. But if you have memory issues with Python integers (in Python 3, that is), you could use alternatives such as NumPy or other native integer types.
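To put a number on that alternative, here is a rough comparison (a sketch, not part of the original answer) of the footprint of a Python list of ints versus a NumPy array on a 64-bit CPython 3 build:

import sys

import numpy as np

n = 10 ** 6

# A list stores one 8-byte pointer per element, plus the int objects themselves
# (28 bytes each on a typical 64-bit CPython 3 build).
lst = list(range(n))
list_bytes = sys.getsizeof(lst) + sum(sys.getsizeof(x) for x in lst)

# A NumPy array stores the values inline as fixed-width machine integers.
arr = np.arange(n, dtype=np.int64)

print("list : ~{:.1f} MiB".format(list_bytes / 2.0 ** 20))
print("array: ~{:.1f} MiB".format(arr.nbytes / 2.0 ** 20))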

Memory usage of scipy.fftpack

I am having trouble with high memory usage when performing FFTs with SciPy's fftpack. Example obtained with the memory_profiler module:
Line # Mem usage Increment Line Contents
================================================
4 50.555 MiB 0.000 MiB @profile
5 def test():
6 127.012 MiB 76.457 MiB a = np.random.random(int(1e7))
7 432.840 MiB 305.828 MiB b = fftpack.fft(a)
8 891.512 MiB 458.672 MiB c = fftpack.ifft(b)
9 585.742 MiB -305.770 MiB del b, c
10 738.629 MiB 152.887 MiB b = fftpack.fft(a)
11 891.512 MiB 152.883 MiB c = fftpack.ifft(b)
12 509.293 MiB -382.219 MiB del a, b, c
13 547.520 MiB 38.227 MiB a = np.random.random(int(5e6))
14 700.410 MiB 152.891 MiB b = fftpack.fft(a)
15 929.738 MiB 229.328 MiB c = fftpack.ifft(b)
16 738.625 MiB -191.113 MiB del a, b, c
17 784.492 MiB 45.867 MiB a = np.random.random(int(6e6))
18 967.961 MiB 183.469 MiB b = fftpack.fft(a)
19 1243.160 MiB 275.199 MiB c = fftpack.ifft(b)
My attempt at understanding what is going on here:
The amount of memory allocated by both fft and ifft on lines 7 and 8 is more than what they need to allocate to return a result. For the call b = fftpack.fft(a), 305 MiB is allocated. The amount of memory needed for the b array is 16 B/value * 1e7 values = 160 MB (about 153 MiB), since the code returns complex128. It seems that fftpack is allocating some kind of workspace, and that the workspace is equal in size to the output array (?).
On lines 10 and 11 the same procedure is run again, but the memory usage is less this time, and more in line with what I expect. It therefore seems that fftpack is able to reuse the workspace.
On lines 13-15 and 17-19 ffts with different, smaller input sizes are performed. In both of these cases more memory than what is needed is allocated, and memory does not seem to be reused.
The memory usage reported above agrees with what the Windows Task Manager reports (to the accuracy that I am able to read those graphs). If I run such a script with larger input sizes, I can make my (Windows) computer very slow, indicating that it is swapping.
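As a quick check of that arithmetic (a sketch, not from the original post), the size of the complex output array can be read off directly:

import numpy as np
from scipy import fftpack

a = np.random.random(int(1e7))
b = fftpack.fft(a)

# complex128 output: 16 bytes per value, 1e7 values, about 152.6 MiB
print(b.dtype, b.nbytes / 2.0 ** 20, "MiB")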
A second example to illustrate the problem of the memory allocated for workspace:
import numpy as np
from scipy import fftpack
from time import time

factor = 4.5
a = np.random.random(int(factor * 3e7))
start = time()
b = fftpack.fft(a)
c = fftpack.ifft(b)
end = time()
print("Elapsed: {:.4g}".format(end - start))
del a, b, c
print("Finished first fft")
a = np.random.random(int(factor * 2e7))
start = time()
b = fftpack.fft(a)
c = fftpack.ifft(b)
end = time()
print("Elapsed: {:.4g}".format(end - start))
del a, b, c
print("Finished first fft")
The code prints the following:
Elapsed: 17.62
Finished first fft
Elapsed: 38.41
Finished first fft
Notice how the second fft, which has the smaller input size, takes more than twice as long to compute. I noticed that my computer was very slow (likely swapping) during the execution of this script.
Questions:
Is it correct that the FFT can be calculated in place, without the need for extra workspace? If so, why does fftpack not do that?
Is there a problem with fftpack here? Even if it needs extra workspace, why does it not reuse its workspace when the fft is rerun with different input sizes?
EDIT:
Old, but possibly related: https://mail.scipy.org/pipermail/scipy-dev/2012-March/017286.html
Is this the answer? https://github.com/scipy/scipy/issues/5986
This is a known issue, caused by fftpack caching its strategy for computing the FFT of a given size. That cache is about as large as the output of the computation, so if one does large FFTs with different input sizes, the memory consumption can become significant.
The problem is described in detail here:
https://github.com/scipy/scipy/issues/5986
Numpy has a similar problem, which is being worked on:
https://github.com/numpy/numpy/pull/7686
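One possible workaround (a sketch, not from the linked issue or the original answer) is to pad inputs to a single common length so that fftpack caches one plan/workspace instead of one per distinct size. Note that zero-padding changes the transform (finer frequency sampling), so this only applies when that is acceptable; next_fast_len requires SciPy >= 0.18:

import numpy as np
from scipy import fftpack

# One fixed length for all transforms, chosen from the largest expected input.
COMMON_LEN = fftpack.next_fast_len(int(1e7))


def padded_fft(a):
    buf = np.zeros(COMMON_LEN)
    buf[:a.size] = a
    return fftpack.fft(buf)


b1 = padded_fft(np.random.random(int(1e7)))
b2 = padded_fft(np.random.random(int(6e6)))  # reuses the cached workspace for COMMON_LEN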

How can I release memory caused by "for"?

Our game program initializes the data of all players into memory. My goal is to reduce memory that is not necessary. I traced the program and found that the "for" loop takes a lot of memory.
For example:
Line # Mem usage Increment Line Contents
================================================
52 @profile
53 11.691 MB 0.000 MB def test():
54 19.336 MB 7.645 MB a = ["1"] * (10 ** 6)
55 19.359 MB 0.023 MB print recipe.total_size(a, verbose=False)
56 82.016 MB 62.656 MB for i in a:
57 pass
print recipe.total_size(a, verbose=False): 8000098 bytes
The question is: how can I release that 62.656 MB of memory?
P.S.
Sorry, I know my English is not very good. I appreciate everyone who reads this. :-)
If you are absolutely desperate to reduce memory usage in the loop, you can do it this way:
i = 0
while 1:
    try:
        a[i]  # accessing an element here
        i += 1
    except IndexError:
        break
Memory stats (if they are accurate):
12 9.215 MB 0.000 MB i = 0
13 9.215 MB 0.000 MB while 1:
14 60.484 MB 51.270 MB try:
15 60.484 MB 0.000 MB a[i]
16 60.484 MB 0.000 MB i += 1
17 60.484 MB 0.000 MB except IndexError:
18 60.484 MB 0.000 MB break
However, this code looks ugly and dangerous, and the reduction in memory usage is tiny.
1) Instead of iterating over a list, you should use a generator. Based on your sample code:
@profile
def test():
    a = ("1" for i in range(10 ** 6))  # this returns a generator instead of a list
    for i in a:
        pass
Now if you use the generator 'a' in the for loop, it won't take that much memory.
2) If you are given a list, first convert it into a generator.
@profile
def test():
    a = ["1"] * (10 ** 6)   # the existing list
    g = (i for i in a)      # convert the list into a generator object
    for i in g:             # iterate over the generator object
        pass
Try this and see if it helps you.
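Since the question's print syntax suggests Python 2, another option (a sketch, not part of the original answers) is to never build the list at all and produce the values lazily, e.g. with itertools.repeat:

from itertools import repeat

# Yields "1" one million times without ever materializing a list of that size.
for value in repeat("1", 10 ** 6):
    pass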
