Preventing a Python script from using all RAM

I use a Jupyter notebook to execute a Python script. The script calls the association_rules function from the mlxtend framework. When this function is called, RAM usage explodes from about 500 MB to over 32 GB. That by itself would not be the problem: if I execute the script locally on my Windows 10 PC, the RAM maxes out but everything keeps running. When I do the same on a Unix server (Xfce), the server crashes. Is there something I can do to prevent the server from crashing and to guarantee that the script continues?
Update:
I basically missed the fact that Windows is swapping RAM all the time; the only difference is that Windows does not crash. I'm fairly sure this could be solved on Linux by fixing the swap configuration, so the question is basically obsolete.
Update:
I had made some wrong assumptions. The Windows PC was already swapping too, and its swap space ran out as well, so the same problem appeared on all machines and all of them crashed. In the end it was a mistake in the data preprocessing. Sorry for the inconvenience; please consider this question no longer relevant.
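Even though the root cause here turned out to be a preprocessing mistake, for anyone hitting the same blow-up: with mlxtend the memory cost is usually dominated by the frequent-itemset generation that feeds association_rules, so raising min_support and capping max_len is the main lever. A minimal sketch with toy data (the column names, thresholds, and caps below are illustrative, not taken from the original script):

    import pandas as pd
    from mlxtend.frequent_patterns import apriori, association_rules

    # Tiny one-hot encoded basket matrix standing in for the real data.
    df = pd.DataFrame(
        [[1, 1, 0], [1, 1, 1], [0, 1, 1], [1, 0, 1]],
        columns=["bread", "milk", "butter"],
    ).astype(bool)

    # A higher min_support and a max_len cap keep the frequent-itemset table
    # (and the memory association_rules needs) small; low_memory trades speed
    # for a smaller working set during candidate generation.
    frequent_itemsets = apriori(df, min_support=0.5, use_colnames=True,
                                max_len=3, low_memory=True)
    rules = association_rules(frequent_itemsets, metric="confidence",
                              min_threshold=0.6)
    print(rules[["antecedents", "consequents", "support", "confidence"]])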

Run the script using the nice command to assign it a lower priority.
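Note that nice only affects CPU scheduling, so on its own it will not stop an out-of-memory crash. If the goal is to keep the server alive, one option on Unix is to cap the script's own address space so a runaway allocation raises MemoryError inside Python instead of exhausting the machine. A rough sketch using only the standard library (the 8 GB cap is an arbitrary example value):

    # Rough sketch, Unix-only: lower the process priority and cap its address
    # space so a runaway allocation fails inside Python rather than taking the
    # whole server down. The 8 GB limit is an arbitrary example value.
    import os
    import resource

    os.nice(10)                                  # lower CPU priority (like `nice`)

    limit_bytes = 8 * 1024 ** 3                  # 8 GB cap on the address space
    resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, limit_bytes))

    # ... run the memory-hungry code here, e.g. the mlxtend calls ...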

Related

Very high memory usage when profiling Python in PyCharm

I'm trying to profile a Python application in PyCharm; however, when the application terminates and the profiler results are displayed, PyCharm uses all 16 GB of RAM that I have, which makes PyCharm unusable.
The Python application in question is doing reinforcement learning, so it does take a while to run (~10 minutes or so), but while running it does not require large amounts of RAM.
I'm using the newest version of PyCharm on Ubuntu 16.04, and PyCharm uses cProfile for profiling.
I'd be very glad if one of you knows a solution.
EDIT: It seems this was an issue within PyCharm, which has since been fixed (as of 2017-11-21).
It's a defect in PyCharm: https://youtrack.jetbrains.com/issue/PY-25768
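Until the fix landed, one workaround was to skip PyCharm's profiler integration entirely: drive cProfile from the script itself and inspect the dump with pstats. A minimal sketch (main() below is a placeholder for the real reinforcement-learning entry point, and "run.prof" is an arbitrary filename):

    # Workaround sketch: profile with cProfile directly and read the results
    # with pstats, bypassing PyCharm's results viewer.
    import cProfile
    import pstats


    def main():
        sum(i * i for i in range(10 ** 6))       # stand-in for the actual workload


    profiler = cProfile.Profile()
    profiler.enable()
    main()
    profiler.disable()

    profiler.dump_stats("run.prof")              # raw stats on disk, small file
    stats = pstats.Stats("run.prof")
    stats.sort_stats("cumulative").print_stats(25)   # top 25 by cumulative time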

Python code freezes on a different PC

I have a neural network implemented with NumPy (Python 2.7) and a faster machine to test it on. Lately my code freezes on this machine, but if I test it on my notebook (less CPU, RAM, etc.) it runs without problems, only slower.
What could the problem be? I thought it was my code, but since it works on a slower PC, I suspect the faster machine has an issue.
Edit: Also, sometimes it works without problems.
Edit 2: Both PCs run Ubuntu 16.04.
Edit 3: It happens even with the same input and parameters.
If it doesn't always occur and is confined to one machine, it could very well be a hardware problem.
The difficulty is that hardware faults are often hard to test for, because they generally leave little evidence in the way of log files.
Try testing the RAM.
If that doesn't turn up errors, try logging the CPU temperature to check that it doesn't get too hot.
Also, log the various voltages; it could be that the power supply is on its way out.
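For the temperature logging suggested above, a small helper can be left running next to the training script. A sketch assuming psutil is installed (sensors_temperatures() is Linux-only and needs a reasonably recent psutil; it does not report voltages, which would still need lm-sensors or similar):

    # Sketch of a periodic CPU-temperature logger using psutil (Linux only).
    # Stop it with Ctrl-C; voltages are not exposed by psutil.
    import time
    import psutil

    with open("temps.log", "a") as log:
        while True:
            readings = psutil.sensors_temperatures()    # {sensor: [readings, ...]}
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            for sensor, entries in readings.items():
                for entry in entries:
                    label = entry.label or sensor
                    log.write("%s %s %.1fC\n" % (stamp, label, entry.current))
            log.flush()
            time.sleep(30)                              # sample every 30 seconds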
Try compiling the code on the same machine where it freezes. Each machine (more precisely, each microprocessor) has a different instruction set, and flaws in the instruction set may be worked around with microcode updates. This could be where the problem lies.

Python starts extremely slowly the first time after I reboot Windows

I apologize for not having a reproducible example. My problem is with a large base of proprietary code and I don't have an extract that shows the same behavior. Even better, it isn't my software and I know about 2% of how it works.
Simply put, this Python program I'm dealing with takes about 80 seconds to complete its entire setup and get to the point where all its Flask code is running and the web server being created is up and able to respond to requests. But that's only the first time I run it on Windows after rebooting. On subsequent starts of the Python script, it takes more like 10 seconds.
And the nutty part is, in a workgroup of 10 people, mine is the only computer that has the problem.
Things I can say:
Python 2.7.11, Windows 7, git bash version 2.9.0.windows.1.
It doesn't appear to matter whether I invoke my Python program from the git bash command line or the Windows command line.
However, in git bash, typing "python" gets no response until I hit Ctrl-C, whereas "winpty python" opens an interactive Python session as it should. I mention this because for a while I thought my main problem was related to the git bash shell bug (https://stackoverflow.com/a/32599341/5593532), but the previous point would seem to contradict this. No such weirdness occurs when invoking a bare Python interactive session from the Windows command line.
I've had trouble getting meaningful profiling output, partly because of multi-threading or child processes or something. And the web server doesn't have an exit event per se, so I can only stop it by hitting Ctrl-C in the command-line window where I ran the script, which seems to kill the part of the process that would save the profiling data.
From the fragmentary profiling info I was able to produce (with gratitude to https://ymichael.com/2014/03/08/profiling-python-with-cprofile.html), I am suspicious that something weird is happening while loading the large number of imported packages, perhaps especially alembic and/or werkzeug (and maybe even sqlalchemy). The profiling output didn't show much tottime in those packages, but it did show rather a lot of cumtime there.
My sys.path inside Python doesn't seem meaningfully different from anyone else's nearby. I might have one or two different items in the list, or three .egg files on the path when they've only got one, but it's mostly the same list in the same order. So much for the idea that it's taking a long time to hunt and learn where packages are and then re-using the information later.
I've got PyCharm Community Edition able to run the script and its associated junk in IDE mode, so I can set breakpoints and follow execution to a degree, in case that would answer a noteworthy question you could raise for me.
Anyone got a wild notion what's up? (he asked quite unreasonably)
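On the profiling problem mentioned above (losing the stats when the server is only ever stopped with Ctrl-C), one workaround is to drive cProfile by hand and register an atexit hook, which still runs when the interpreter shuts down after an unhandled KeyboardInterrupt. A rough sketch; "startup.prof" and run_app() are placeholders for the real output file and application start-up, and work done in child processes will not show up here:

    # Sketch: make the profile survive a Ctrl-C by dumping it from an atexit
    # hook during interpreter shutdown.
    import atexit
    import cProfile


    def run_app():
        pass                                     # stand-in for starting the Flask app


    profiler = cProfile.Profile()


    def _dump_profile():
        profiler.disable()
        profiler.dump_stats("startup.prof")      # inspect later with pstats


    atexit.register(_dump_profile)
    profiler.enable()
    run_app()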

Kernel crashes when increasing iterations

I am running a Python script using Spyder 2.3.9. I have a fairly large script, and when running it with 300x600 iterations (a loop inside another loop), everything appears to work fine and takes approximately 40 minutes. But when I increase the count to 500x600 iterations, after 2 hours the output reads:
It seems the kernel died unexpectedly. Use 'Restart kernel' to continue using this console.
I've been going through the code but don't see anything that might be causing this in particular. I am using Python 2.7.12 64-bit, Qt 4.8.7, PyQt4 (API v2) 4.11.4 (Anaconda2-4.0.0-MacOSX-x86_64).
I'm not entirely sure what additional information is pertinent, but if you have any suggestions or questions, I'd be happy to read them.
https://github.com/spyder-ide/spyder/issues/3114
It seems this issue has already been opened on their GitHub tracker and should be addressed soon, given the repo's track record.
Some possible solutions:
It may be helpful, if possible, to modify your script for faster convergence; very often, for most practical purposes, the incremental value of additional iterations past a certain point is negligible (see the sketch after this list).
An upgrade or downgrade of the Spyder environment may help.
Check your local firewall for blocked connections to 127.0.0.1 from pythonw.exe.
If nothing works, try using Spyder on Ubuntu.
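For the first point, a sketch of what "faster convergence" can look like in practice: track how much each iteration still changes the result and break out once the change drops below a tolerance, instead of always running the full 500x600 budget. The fixed-point update below is a toy stand-in for the real per-iteration computation:

    # Illustrative early-stopping sketch; the update rule is a toy example.
    tol = 1e-9
    value = 0.0
    max_iterations = 500 * 600                   # full iteration budget

    for i in range(max_iterations):
        new_value = 0.5 * (value + 2.0 / (value + 1.0))   # toy update rule
        if abs(new_value - value) < tol:                  # change is negligible
            print("converged after %d iterations" % (i + 1))
            break
        value = new_value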

Python Performance on Windows

Is Python generally slower on Windows vs. a *nix machine? Python seems to blaze on my Mac OS X machine, whereas it seems to run slower on my Windows Vista machine. The machines are similar in processing power, and the Vista machine has 1 GB more memory.
I particularly notice this with Mercurial, but I figure it may simply be how Mercurial is packaged on Windows.
I wanted to follow up on this, and I found something that I believe is 'my answer'. It appears that Windows (Vista, which is what I noticed this on) is not as fast at handling files. This was mentioned by tony-p-lee.
I found this comparison of Ubuntu vs Vista vs Win7. Their results are interesting and, as they say, you need to take them with a grain of salt, but I think they point to the cause. Python, which I feel was indirectly tested, is about equivalent if not a tad bit faster on Windows; see the section "Richards benchmark".
Here is their graph for file transfers (image not reproduced; source: tuxradar.com).
I think this specifically helps address the question, because Hg is really just a series of file reads, copies, and general file handling. It's likely this is what causes the delay.
http://www.tuxradar.com/content/benchmarked-ubuntu-vs-vista-vs-windows-7
No real numbers here, but it certainly feels like start-up time is slower on Windows platforms. I regularly switch between Ubuntu at home and Windows 7 at work, and start-up is an order of magnitude faster on Ubuntu, despite my work machine being at least 4x the speed.
As for runtime performance, it feels about the same for "quiet" applications. Any GUI operations using Tk are definitely slower on Windows. Console applications on Windows are slower too, but that is most likely due to the Windows cmd rendering being slow rather than Python itself running slowly.
Maybe Python depends on opening a lot of files (importing different modules), and Windows doesn't handle file opens as efficiently as Linux.
Or maybe Linux has more utilities that depend on Python, so Python scripts and modules are more likely to already be buffered in the system cache.
I run Python locally on Windows XP and 7 as well as OS X on my MacBook. I've seen no noticeable performance differences in the command-line interpreter; wx widget apps run the same, and Django apps also perform virtually identically.
One thing I noticed at work was that the Kaspersky virus scanner tended to slow the Python interpreter WAY down. It would take 3-5 seconds for the Python prompt to appear and 7-10 seconds for Django's test server to fully load. Properly disabling its active scanning brought the start-up times back to 0 seconds.
With the OS and network libraries, I can confirm slower performance on Windows, at least for versions <= 2.6.
I wrote a CLI podcast-fetcher script which ran great on Ubuntu, but then wouldn't download anything faster than about 80 kB/s (where ~1.6 MB/s is my usual maximum) on either XP or 7.
I could partially correct this by tweaking the buffer size for download streams, but there was definitely a major bottleneck on Windows, either in the network stack or in I/O, that simply wasn't a problem on Linux.
Based on this, it seems that system and OS-interfacing tasks are better optimized for *nixes than they are for Windows.
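For reference, the buffer-size tweak mentioned above usually amounts to reading the response in large explicit chunks rather than relying on small default reads. A modern-Python sketch using only the standard library (the URL, output filename, and chunk size are placeholder values to experiment with):

    # Sketch of downloading with an explicit, larger buffer size (Python 3 stdlib).
    import shutil
    import urllib.request

    CHUNK = 256 * 1024                           # 256 KiB per read

    with urllib.request.urlopen("https://example.com/episode.mp3") as response, \
            open("episode.mp3", "wb") as out_file:
        shutil.copyfileobj(response, out_file, length=CHUNK)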
Interestingly, I ran a direct comparison of a popular Python app on a Windows 10 x64 machine (admittedly low-powered) and an Ubuntu 14.04 VM running on the same machine.
I have not tested load speeds etc.; I am just looking at processor usage between the two. To make the test fair, both were fresh installs, I duplicated a part of my media library, and I applied the same config in both scenarios. Each test was run independently.
On Windows, Python was using 20% of my processor power, and it drove System Compressed Memory up to 40% (this is an old machine with 6 GB of RAM).
With the VM on Ubuntu (linked to my Windows file system), processor usage is about 5%, with compressed memory down to about 20%.
This is a huge difference. My trigger for running this test was that the app using Python was driving my CPU up to 100% and failing to operate. I have now been running it in the VM for two weeks, and my processor usage is down to 65-70% on average. So on both a short- and long-term test, and taking into account the overhead of running a VM and a second operating system, this Python app is significantly faster on Linux. I can also confirm that the Python app responds better, as does everything else on my machine.
Now this could be very application specific, but it is at minimum interesting.
The PC has an old AMD II X2 X265 processor, 6 GB of RAM, and an SSD (which Python ran from, but the VM used a regular 5200 rpm HD that also gets used for a ton of other stuff, including recording from 2 CCTV cameras).
