So I recently SSHed into my Linode Django server, and whenever I try to do anything it gets killed for running out of memory. After a little exploration I found this little gem (the output of top):
It seems I somehow have many instances of Apache and Python open all at once, eating up all my memory. Is this normal behavior for Django? How could this have happened? What do I do to fix my memory shortage?
Related
I have a Python script that works fine on my main computer without problems. But when I uploaded it to an Ubuntu server it started crashing. I spent a long time wondering what the problem was and looked at the system logs. It turned out that Ubuntu automatically force-kills the script due to lack of memory (the server has 512 MB of RAM). How can I profile the program's memory consumption under different workloads?
Have a look at something like Guppy3, which includes heapy, a 'heap analysis toolset' that can help you find where the memory's being used/held. Some links to information on how to use it are in the project's README.
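For instance, a minimal sketch of driving heapy by hand (assuming the guppy3 package is installed; where exactly you take the baseline and the snapshot is up to you):

from guppy import hpy

h = hpy()
h.setrelheap()   # take a baseline: later reports only show allocations made after this point
# ... exercise the code you suspect of leaking ...
print(h.heap())  # live objects grouped by type, sorted by total size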
If you have a core, consider using https://github.com/vmware/chap, which will allow you to look at both python and native allocations.
Once you have opened the core, probably "summarize used" is a good place to start.
Like I wrote above, I'm running a Python script that has an ETA of 37 hours, and I need to keep working. Is there any problem if I start another PyCharm session in another window and, for example, run another script? Would it create problems like a crash, a memory limit, or something like that?
My laptop has an i7-4710HQ, a dedicated GeForce GTX 850M, and 8 GB of RAM.
The script is just a simple scrape of a website, but because of ban problems I set a 5-second sleep after every request. And I have to make 20,000 requests...
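For reference, the pattern I'm describing is roughly this (example.com stands in for the real site):

import time

import requests

for page in range(20000):
    response = requests.get(f"https://example.com/items/{page}")  # placeholder URL
    # ... parse response.text here ...
    time.sleep(5)  # wait between requests to avoid getting banned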
thanks in advance!
If you have two instances of PyCharm running, then unless the 2nd instance actually crashes the OS (Blue Screen or equivalent), it shouldn't impact the 1st instance in any way.
They run in separate processes, which are almost sandboxes; the only things they have in common are the initial executable file, the file system, and the OS. If they are in different projects, then they only share the PyCharm system settings.
You could even cause Python to segfault (rare, but it does happen) and that won't affect your long-running PyCharm application. Also remember that your applications were started by PyCharm, but they aren't running in the PyCharm process. So in fact what you have is two PyCharm processes, each one with at least one child process, which is your running application.
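If you want to see that relationship for yourself, here is a small sketch (it assumes the third-party psutil package is installed) you can run from inside one of your scripts:

import os

import psutil

me = psutil.Process(os.getpid())
print("script PID:", me.pid)  # this process: your running script
parent = me.parent()
# Typically a PyCharm helper/launcher process, not the other script or the IDE UI itself.
print("parent:", parent.name() if parent else None)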
I have been moving from PHP to Python for my web development, and have selected Django as my preferred framework. One thing that bugs me is the time it takes for my changes to the Python code to reload during development: roughly ~10 seconds.
Probably some of those seconds are due to my chosen setup of Docker for Mac with a mounted volume. But even if it were down to 5 seconds it would be annoying. I have moved away from the built-in Django development server to Apache 2.4 with mod_wsgi; this improves the speed of the application a lot, but not the Python code reloading.
I know it's like comparing apples and oranges, but coming from PHP my code changes are available immediately. Does anyone have any tips to speed this up?
Traced it back to slow disk access with Docker for Mac.
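In case it helps anyone else: the common workaround (which I haven't benchmarked myself) is to relax the bind-mount consistency, e.g. mounting the code volume as .:/app:delegated in docker-compose, or to move the code into a named volume instead of a host mount.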
The Issue
I am using my laptop with Apache to act as a server for a local project involving TensorFlow and Python, which uses an API written in Flask to service GET and POST requests coming from an app and maybe another user on the local network.

The problem is that the initial page keeps loading when I specifically import tensorflow or the object detection folder within the research folder in the TensorFlow GitHub repository, and it never seems to finish doing so, effectively getting stuck. I suspect the issue has to do with the packages being large in size, but I didn't have any issue with that when running the application on the development server provided with Flask.
Are there any pointers I should look for when trying to solve this issue? I checked the memory usage and it doesn't seem to be rising substantially, and neither is the CPU usage.
Debugging process
I am able to print a basic hello world to the root page quite quickly, but I have isolated the issue to the point where the importing takes place; that is where it gets stuck.
The only thing I can think of is to limit the number of threads that are launched, but when I limited the number of threads per child to 5 and the number of connections to 5 in the httpd-mpm.conf file, it didn't help.
The error/access logs don't provide much insight to the matter.
A few notes:
Thus far, I had used Flask's development server with multi-threading enabled to serve those requests, but I found it to be prone to crashing after 5 minutes of continuous running, so I am now trying to use Apache through the WSGI interface in order to run Python scripts.
I should also note that I am not serving HTML files, just basic GET and POST requests; I am just viewing them using the browser.
If it helps, I also don't use virtual environments.
I am using Windows 10, Apache 2.4 and mod_wsgi 4.5.24
The tensorflow module, being a C extension module, may not be implemented so that it works properly in Python sub-interpreters. To work around this, force your application to run in the main Python interpreter context. Details in:
http://modwsgi.readthedocs.io/en/develop/user-guides/application-issues.html#python-simplified-gil-state-api
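In practice that means adding the following directive to your Apache configuration (per the mod_wsgi documentation linked above), leaving the rest of your WSGIScriptAlias/daemon setup as it is:

WSGIApplicationGroup %{GLOBAL}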
I am running some tests nightly on a VM with a CentOS operating system. Recently the tests have been taking up all the available memory and nearly all the swap memory on the machine; I assigned the VM twice as much memory and it's still happening, which results in the VM's physical host machine dying. These tests were previously running without needing half as much memory, so I need to use some form of Python memory analyzer to investigate what is going on.
I've looked at PySizer and Heapy, but after research Dowser seems to be the one I'm after, as it requires zero changes to code.
So far, from the documentation and googling, I've got this code in its own class:
import cherrypy
import dowser

class MemoryAnalyzer:
    def memoryCheck(self):
        cherrypy.config.update({'server.socket_port': 8080})  # port for the Dowser web UI
        cherrypy.tree.mount(dowser.Root())  # mount Dowser at the site root
        cherrypy.engine.start()
I was hoping this would bring up the web interface shown in the documentation to track all instances of Python running on the host, but it doesn't work. I was also confused by the documentation:
'python dowser __init__.py'.
Is it possible to just run this? I get the error:
/usr/bin/python: can't find '__main__.py' in 'dowser'
Can Dowser run independently from my test suite on the VM? Or will I have to integrate the above code into my main class and run my tests to trace instances of Python?
Dowser is meant to be run as part of your application. Therefore, wherever you initialize the application, add the lines
import dowser
cherrypy.tree.mount(dowser.Root(), '/dowser')
Then you can browse to http://localhost:8080/dowser to view the dowser interface.
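Putting the pieces together, a minimal self-contained version of your snippet would look like the following (same port as yours; the engine.block() call is what keeps the process, and hence the web interface, alive):

import cherrypy

import dowser

cherrypy.config.update({'server.socket_port': 8080})
cherrypy.tree.mount(dowser.Root(), '/dowser')  # serve the Dowser UI under /dowser
cherrypy.engine.start()
cherrypy.engine.block()  # keep the process running so the UI stays reachable

Note that run standalone like this it can only see objects in its own process, which is exactly why it needs to be embedded in the application you want to profile.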
Note that the invocation you quoted from the documentation is for testing dowser. The correct invocation for that is python dowser/__init__.py.
Managed to get Dowser to work using this blog post http://www.aminus.org/blogs/index.php/2008/06/11/tracking-memory-leaks-with-dowser?blog=2 and changing the port to 8088 instead of 8080 (8080 wasn't in use on the machine, but Dowser still didn't work there!).