I had a problem with Python 3.7 running a single script in .py format. To solve this I split the program into several .py files, but it still leaks memory.
I have a web scraping program that runs some scripts at intervals, so it has to run 24/7, but with memory growing by roughly 16 MB per hour that gets hard.
import time

while True:
    with open('scrapy.py') as op:
        exec(op.read())
    time.sleep(5)
scrapy.py uses requests, pandas, etc.
I thought this code would release 'scrapy.py' every time the loop ends, but apparently it does not, since the program keeps eating memory.
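One thing you could try (not from the original post, just a hedged sketch): run scrapy.py as a child process instead of exec()ing it inside the parent interpreter, so the operating system reclaims all of its memory when the child exits:

import subprocess
import sys
import time

while True:
    # each iteration runs scrapy.py in a fresh interpreter; its memory is
    # released back to the OS when the child process exits
    subprocess.run([sys.executable, 'scrapy.py'])
    time.sleep(5)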
I have been working on an MMO bot for fun, and the script stores previous data points of where my character was so it can continue. After a few hours I came back to my machine showing a memory error and nothing working on the computer, forcing me to restart. Is there any sort of command I can give the script that would reset the memory it has cached up?
With the bot, I don't need to keep this memory cached for more than a few seconds, or minutes at most; the stored-up data does nothing for me. I was wondering if anyone had a way to wipe the stored memory after a given time and start fresh.
You can use this script:

import subprocess

# 'purge' is the macOS command that flushes the disk cache
subprocess.call(["purge"])
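If the purge command is not an option, a hedged alternative (assuming the bot keeps its previous positions in a plain Python list, which the question does not say) is to cap the history with a bounded deque so old points are discarded automatically:

from collections import deque

# keep at most the last 1000 points; older entries are dropped automatically
position_history = deque(maxlen=1000)

def record_position(x, y):
    position_history.append((x, y))  # appending past maxlen evicts the oldest point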
I work for a digital marketing agency with multiple clients. In one of the projects I have a very resource-intensive Python script (which fetches data for Facebook ads) that has to run for all of those clients (500+ of them) on an Ubuntu 16.04 server.
Originally the script took around 2 minutes to complete for one client, with 300 MB RES and 1000 MB VM (per htop). I therefore optimized it with ThreadPoolExecutor (max_workers=10) so that the script can run for roughly 4 clients concurrently.
Then I found out that the script sometimes froze during a run (basically went into a "comatose state"). I debugged and profiled it and found that it was not the script causing the issue, but the system.
Then I batched the script: if there are 20 input clients, I run 5 instances of the script (4*5=20). Sometimes this went fine, but sometimes the last instance froze.
Then I found that RAM (2 GB) was being overused, so I increased swap from 0 to 1 GB. That did the trick. But if a few clients are heavy on memory, the same thing happens.
I have attached a screenshot of the latest run where, after running 8 instances, the last 2 froze. They can stay stuck for days.
I am thinking of increasing the server RAM from 2 GB to 4 GB, but I am not sure that is a permanent solution. Has anyone faced a similar issue?
You need to fix the RAM consumption of your script.
If your script allocates more memory than your system can provide, it gets memory errors; if that happens inside thread pools or similar constructs, the threads may never return under some circumstances.
You can fix this by using async functions with timeouts and implementing automatic restart handlers, in case a process does not yield the expected result.
The best way to do that depends heavily on the script and will probably require altering already existing code.
The issue is definitely with your script and not with the OS.
The fastest workaround would be to increase the system memory or to reduce the number of threads.
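As a rough illustration of the timeout-and-restart idea above (the function name and timeout value are placeholders, not part of the original setup), each client's fetch could run in its own process that gets killed and retried if it hangs:

import multiprocessing

def fetch_client_data(client_id):
    ...  # placeholder for the real per-client Facebook ads fetch

def run_with_restart(client_id, timeout=600, retries=2):
    for attempt in range(1, retries + 1):
        proc = multiprocessing.Process(target=fetch_client_data, args=(client_id,))
        proc.start()
        proc.join(timeout)               # wait up to `timeout` seconds
        if not proc.is_alive():
            return                       # finished normally
        proc.terminate()                 # kill the hung worker and try again
        proc.join()
        print(f'client {client_id} timed out on attempt {attempt}, restarting')

A process (rather than a thread) is used here because a hung thread cannot be killed from Python, while a child process can be terminated and its memory reclaimed.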
If just adding 1GB of swap area "almost" did the trick then definitely increasing the physical memory is a good way to go. Btw remember that swapping means you're using disk storage, whose speed is measured in millisecs, while RAM speed is measured in nanosecs - so avoiding swap guarantees a performance boost.
And then, reboot your system every now and then. Although Linux is far better than Windows in this respect, memory leaks do occur in Linux too, and a reboot every few months will surely help.
As Gornoka stated, you need to reduce the memory consumption of the script. As an added detail, this can also be done by removing declared variables within the script once they have been used, with the keyword
del
This can also be done by ensuring that, if it is processing massive files, it does so line by line and saves the output as it finishes each line.
I have had this happen, and it is usually an indicator of working with too much data at once in RAM. It is always better to work with the data partially whenever possible, and if that is not possible, get more RAM.
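For what it's worth, a minimal sketch of the two suggestions above (the per-line work is a placeholder):

def process_large_file(in_path, out_path):
    with open(in_path) as src, open(out_path, 'w') as dst:
        for line in src:                   # the file is streamed, never loaded whole
            result = line.strip().upper()  # placeholder for the real per-line work
            dst.write(result + '\n')
            del result                     # drop the intermediate object right away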
I am writing a Python script to transfer large files via SFTP with the pysftp module. I have a massive amount of data to transfer, a total of around 36 TB, divided into 54 runs, or batches.
I only want to carry out these transfers between certain hours of the day, in this example between 6pm and 7am. So my idea is to use a for loop to iterate over all the runs/batches. On each iteration I check what hour it is: if it is between 6pm and 7am I transfer, otherwise the script sleeps until it is at least 6pm. The code I wrote looks like this:
import datetime as dt
import time

runsList = 'runA runB runC'.split()  # these are directories
# time constraints
bottomLimit = 7
upperLimit = 18
doNotUploadRange = range(bottomLimit, upperLimit)

for run in runsList:
    hour = dt.datetime.now().hour
    while hour in doNotUploadRange:
        print('do not upload now')
        time.sleep(1800)
        hour = dt.datetime.now().hour
    # when I leave the while condition above
    # do the transfer via pysftp (large amount of data) per run
The question here does not concern the code itself, nor do I want to check whether the script is running (which can be checked with htop). Rather, I am concerned that my script will crash, for whatever reason, before it finishes (it might run for a full week if nothing crashes).
I do sometimes call scripts that run for a very long time, and they do sometimes crash with no obvious reason.
So my question is whether it is, for whatever reason, obvious that the script will crash after running for 6-7 days, or can I expect it to finish provided there is no error in the code itself? My idea is to call this script in the background inside tmux, i.e. I would run python script.py &
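Not part of the question, but a hedged sketch of one way to make such a long-running loop more crash-tolerant: wrap each run's transfer in a retry with logging, so a single dropped connection does not kill the whole multi-day job (the host, credentials and remote path below are placeholders):

import logging
import time
import pysftp

logging.basicConfig(filename='transfer.log', level=logging.INFO)

def transfer_run(run):
    # placeholder upload: push the whole run directory via pysftp
    with pysftp.Connection('sftp.example.com', username='user', password='secret') as sftp:
        sftp.put_r(run, '/remote/' + run)

def transfer_with_retries(run, attempts=3, wait=600):
    for attempt in range(1, attempts + 1):
        try:
            transfer_run(run)
            logging.info('run %s finished', run)
            return True
        except Exception:
            logging.exception('run %s failed on attempt %d', run, attempt)
            time.sleep(wait)  # back off before retrying
    return False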
I'm trying to run pymclevel (http://www.github.com/mcedit/pymclevel) in a script context where the total time the script takes to execute matters. Simply starting the file results in it taking a few seconds until it actually reaches the end of the script, skipping the main class. If I raise a SystemExit as the last line and remove all other initialization, it still takes 3-4 seconds. The CPU isn't the problem: it's a Xeon E3-1290v2 at 3.8 GHz x 4 cores. Any help is greatly appreciated.
You have to understand where the bottleneck is; look at CPU, RAM and hard-disk usage.
I'm fairly confident that it is hard-disk related: as far as I know, Minecraft maps can be really big. Also, if your RAM can't hold the whole map, it will frequently be swapped to disk, causing a lot of additional computing time.
I've created a script to monitor the output of a serial port that receives 3-4 lines of data every half hour - the script runs fine and grabs everything that comes off the port which at the end of the day is what matters...
What bugs me, however, is that the cpu usage seems rather high for a program that's just monitoring a single serial port, 1 core will always be at 100% usage while this script is running.
I'm basically running a modified version of the code in this question: pyserial - How to Read Last Line Sent from Serial Device
I've tried polling the inWaiting() function at regular intervals and having it sleep when inWaiting() is 0 - I've tried intervals from 1 second down to 0.001 seconds (basically, as often as I can without driving up the cpu usage) - this will succeed in grabbing the first line but seems to miss the rest of the data.
Adjusting the timeout of the serial port doesn't seem to have any effect on CPU usage, nor does putting the listening function into its own thread (not that I really expected a difference, but it was worth trying).
Should python/pyserial be using this much cpu? (this seems like overkill)
Am I wasting my time on this quest / Should I just bite the bullet and schedule the script to sleep for the periods that I know no data will be coming?
Maybe you could issue a blocking read(1) call, and when it succeeds use read(inWaiting()) to get the right number of remaining bytes.
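A rough sketch of that suggestion (the port name and baud rate are placeholders):

import serial

ser = serial.Serial('/dev/ttyUSB0', 9600, timeout=None)  # timeout=None makes read() block

while True:
    first = ser.read(1)               # blocks until a byte arrives, so no busy loop
    rest = ser.read(ser.inWaiting())  # then drain whatever else is already buffered
    print((first + rest).decode(errors='replace'), end='')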
Would a system style solution be better? Create the python script and have it executed via Cron/Scheduled Task?
pySerial shouldn't be using that much CPU, but if it's just sitting there polling for an hour I can see how it may happen. Sleeping may be a better option, in conjunction with periodic wakeups and polls.
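For example (a hedged sketch; the port, baud rate and sleep interval are guesses):

import time
import serial

ser = serial.Serial('/dev/ttyUSB0', 9600, timeout=0)  # non-blocking reads

while True:
    waiting = ser.inWaiting()
    if waiting:
        print(ser.read(waiting).decode(errors='replace'), end='')
    else:
        time.sleep(60)  # data only arrives every half hour, so sleep instead of spinning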