Why do multiple processes slow down python package importing? - python

If I import numpy in a single process, it takes approximately 0.0749 seconds:
python -c "import time; s=time.time(); import numpy; print(time.time() - s)"
Now if I run the same code in multiple Processes, they all import significantly slower:
import subprocess
cmd = 'python -c "import time; s=time.time(); import numpy; print(time.time() - s)"'
for n in range(5):
    m = 2**n
    print(f"Importing numpy on {m} Process(es):")
    processes = []
    for i in range(m):
        processes.append(subprocess.Popen(cmd, shell=True))
    for p in processes:
        p.wait()
    print()
gives the output:
Importing numpy on 1 Process(es):
0.07726049423217773
Importing numpy on 2 Process(es):
0.110260009765625
0.11645245552062988
Importing numpy on 4 Process(es):
0.13133740425109863
0.1264667510986328
0.13683867454528809
0.153900146484375
Importing numpy on 8 Process(es):
0.13650751113891602
0.15682148933410645
0.17088770866394043
0.1705784797668457
0.1690073013305664
0.18076491355895996
0.18901371955871582
0.18936467170715332
Importing numpy on 16 Process(es):
0.24082279205322266
0.24885773658752441
0.25356197357177734
0.27071142196655273
0.29327893257141113
0.2999141216278076
0.297823429107666
0.31664466857910156
0.20108580589294434
0.33217334747314453
0.24672770500183105
0.34597229957580566
0.24964046478271484
0.3546409606933594
0.26511287689208984
0.2684178352355957
The import time per process seems to grow almost linearly with the number of processes (especially as the number grows large), so in total we seem to spend about O(n^2) time on importing. I know there is an import lock, but I'm not sure why it is there. Are there any workarounds? And if I work on a server with many users running many tasks, could I be slowed down by someone spawning tons of workers that just import common packages?
The pattern is clearer for larger n. Here's a script that shows it by reporting only the average import time for n workers:
import multiprocessing
import time
def f(x):
    s = time.time()
    import numpy as np
    return time.time() - s
ps = []
for n in range(10):
    m = 2**n
    with multiprocessing.Pool(m) as p:
        print(f"importing with {m} worker(s): {sum(p.map(f, range(m)))/m}")
output:
importing with 1 worker(s): 0.06654548645019531
importing with 2 worker(s): 0.11186492443084717
importing with 4 worker(s): 0.11750376224517822
importing with 8 worker(s): 0.14901494979858398
importing with 16 worker(s): 0.20824094116687775
importing with 32 worker(s): 0.32718323171138763
importing with 64 worker(s): 0.5660803504288197
importing with 128 worker(s): 1.034045523032546
importing with 256 worker(s): 1.8989756992086768
importing with 512 worker(s): 3.558808562345803
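As a rough check of the near-linear growth, the averages above can be compared from one doubling to the next. This is only arithmetic on the numbers already printed (rounded to four decimals here); the names `avg_seconds` and `doubling_ratios` are made up for the sketch. If the per-worker time were exactly linear in the number of workers, each doubling would multiply the average by 2.

```python
# Average import time per worker, copied (rounded) from the output above.
avg_seconds = {
    1: 0.0665, 2: 0.1119, 4: 0.1175, 8: 0.1490, 16: 0.2082,
    32: 0.3272, 64: 0.5661, 128: 1.0340, 256: 1.8990, 512: 3.5588,
}

workers = sorted(avg_seconds)

# Ratio of the average import time after each doubling of the worker count.
doubling_ratios = {
    cur: avg_seconds[cur] / avg_seconds[prev]
    for prev, cur in zip(workers, workers[1:])
}

for cur in workers[1:]:
    total = cur * avg_seconds[cur]  # summed import time over all workers
    print(f"{cur:>3} workers: avg x{doubling_ratios[cur]:.2f} vs half as many, "
          f"summed import time {total:.1f} s")
```

From 64 workers upward each doubling multiplies the average by a bit more than 1.8, which is what "almost linear per worker, roughly quadratic in total" looks like in these measurements.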
extra details about environment in which I ran this:
python version: 3.8.6
pip list:
Package Version
---------- -------
numpy 1.20.1
pip 21.0.1
setuptools 53.0.0
wheel 0.36.2
os:
NAME="Pop!_OS"
VERSION="20.10"
Is it just reading from filesystem that is the problem?
I've added this simple test where instead of importing, I now just read the numpy files and do some sanity check calculations:
import subprocess
cmd = 'python read_numpy.py'
for n in range(5):
    m = 2**n
    print(f"Running on {m} Process(es):")
    processes = []
    for i in range(m):
        processes.append(subprocess.Popen(cmd, shell=True))
    for p in processes:
        p.wait()
    print()
with read_numpy.py:
import os
import time
file_path = "/home/.virtualenvs/multiprocessing-import/lib/python3.8/site-packages/numpy"
t1 = time.time()
parity = 0
for root, dirs, filenames in os.walk(file_path):
    for name in filenames:
        contents = open(os.path.join(root, name), "rb").read()
        parity = (parity + sum([x%2 for x in contents]))%2
print(parity, time.time() - t1)
Running this gives me the following output:
Running on 1 Process(es):
1 0.8050086498260498
Running on 2 Process(es):
1 0.8164374828338623
1 0.8973987102508545
Running on 4 Process(es):
1 0.8233649730682373
1 0.81931471824646
1 0.8731539249420166
1 0.8883578777313232
Running on 8 Process(es):
1 0.9382946491241455
1 0.9511561393737793
1 0.9752676486968994
1 1.0584545135498047
1 1.1573944091796875
1 1.163221836090088
1 1.1602907180786133
1 1.219961166381836
Running on 16 Process(es):
1 1.337137222290039
1 1.3456192016601562
1 1.3102262020111084
1 1.527071475982666
1 1.5436983108520508
1 1.651414394378662
1 1.656200647354126
1 1.6047494411468506
1 1.6851506233215332
1 1.6949374675750732
1 1.744239330291748
1 1.798882246017456
1 1.8150532245635986
1 1.8266475200653076
1 1.769331455230713
1 1.8609044551849365
There is some slowdown: 0.805 seconds for 1 worker, and between 0.819 and 0.888 seconds for 4 workers. Compare that with import: 0.07 seconds for 1 worker, and between 0.126 and 0.153 seconds for 4 workers. It seems there might be something other than filesystem reads slowing down import.

Related

How to get CPU and RAM usage from windows machine using python? [duplicate]

How can I get the current system status (current CPU, RAM, free disk space, etc.) in Python? Ideally, it would work for both Unix and Windows platforms.
There seem to be a few possible ways of extracting that from my search:
Using a library such as PSI (that currently seems not actively developed and not supported on multiple platforms) or something like pystatgrab (again no activity since 2007 it seems and no support for Windows).
Using platform specific code such as using a os.popen("ps") or similar for the *nix systems and MEMORYSTATUS in ctypes.windll.kernel32 (see this recipe on ActiveState) for the Windows platform. One could put a Python class together with all those code snippets.
It's not that those methods are bad but is there already a well-supported, multi-platform way of doing the same thing?
The psutil library gives you information about CPU, RAM, etc., on a variety of platforms:
psutil is a module providing an interface for retrieving information on running processes and system utilization (CPU, memory) in a portable way by using Python, implementing many functionalities offered by tools like ps, top and Windows task manager.
It currently supports Linux, Windows, OSX, Sun Solaris, FreeBSD, OpenBSD and NetBSD, both 32-bit and 64-bit architectures, with Python versions from 2.6 to 3.5 (users of Python 2.4 and 2.5 may use 2.1.3 version).
Some examples:
#!/usr/bin/env python
import psutil
# gives a single float value
psutil.cpu_percent()
# gives an object with many fields
psutil.virtual_memory()
# you can convert that object to a dictionary
dict(psutil.virtual_memory()._asdict())
# you can have the percentage of used RAM
psutil.virtual_memory().percent
79.2
# you can calculate percentage of available memory
psutil.virtual_memory().available * 100 / psutil.virtual_memory().total
20.8
Here's the documentation, which covers more concepts:
https://psutil.readthedocs.io/en/latest/
Use the psutil library. On Ubuntu 18.04, pip installed 5.5.0 (latest version) as of 1-30-2019. Older versions may behave somewhat differently.
You can check your version of psutil by doing this in Python:
from __future__ import print_function # for Python2
import psutil
print(psutil.__version__)
To get some memory and CPU stats:
from __future__ import print_function
import psutil
print(psutil.cpu_percent())
print(psutil.virtual_memory()) # physical memory usage
print('memory % used:', psutil.virtual_memory()[2])
The virtual_memory (tuple) will have the percent memory used system-wide. This seemed to be overestimated by a few percent for me on Ubuntu 18.04.
You can also get the memory used by the current Python instance:
import os
import psutil
pid = os.getpid()
python_process = psutil.Process(pid)
memoryUse = python_process.memory_info()[0]/2.**30 # resident set size (rss, in bytes) converted to GiB
print('memory use:', memoryUse)
which gives the current memory use of your Python script.
There are some more in-depth examples on the pypi page for psutil.
Only for Linux:
One-liner for the RAM usage with only stdlib dependency:
import os
tot_m, used_m, free_m = map(int, os.popen('free -t -m').readlines()[-1].split()[1:])
One can get real time CPU and RAM monitoring by combining tqdm and psutil. It may be handy when running heavy computations / processing.
It also works in Jupyter without any code changes:
from tqdm import tqdm
from time import sleep
import psutil
with tqdm(total=100, desc='cpu%', position=1) as cpubar, tqdm(total=100, desc='ram%', position=0) as rambar:
    while True:
        rambar.n=psutil.virtual_memory().percent
        cpubar.n=psutil.cpu_percent()
        rambar.refresh()
        cpubar.refresh()
        sleep(0.5)
It's convenient to put those progress bars in separate process using multiprocessing library.
This code snippet is also available as a gist.
The code below worked for me without external libraries. I tested it on Python 2.7.9.
CPU Usage
import os
CPU_Pct=str(round(float(os.popen('''grep 'cpu ' /proc/stat | awk '{usage=($2+$4)*100/($2+$4+$5)} END {print usage }' ''').readline()),2))
print("CPU Usage = " + CPU_Pct) # print results
And Ram Usage, Total, Used and Free
import os
mem=str(os.popen('free -t -m').readlines())
"""
Get a whole line of memory output, it will be something like below
[' total used free shared buffers cached\n',
'Mem: 925 591 334 14 30 355\n',
'-/+ buffers/cache: 205 719\n',
'Swap: 99 0 99\n',
'Total: 1025 591 434\n']
So, we need total memory, usage and free memory.
We should find the index of capital T which is unique at this string
"""
T_ind=mem.index('T')
"""
Then we can recreate the string with this information. After T we have,
"Total: " which has 14 characters, so we can start from index of T +14
and last 4 characters are also not necessary.
We can create a new sub-string using this information
"""
mem_G=mem[T_ind+14:-4]
"""
The result will be like
1025 603 422
we need to find the index of the first space, and we can take the substring
from 0 to this index number; this will give us the string of total memory
"""
S1_ind=mem_G.index(' ')
mem_T=mem_G[0:S1_ind]
"""
Similarly we will create a new sub-string, which will start at the second value.
The resulting string will be like
603 422
Again, we should find the index of the first space and then
take the Used Memory and Free Memory.
"""
mem_G1=mem_G[S1_ind+8:]
S2_ind=mem_G1.index(' ')
mem_U=mem_G1[0:S2_ind]
mem_F=mem_G1[S2_ind+8:]
print 'Summary = ' + mem_G
print 'Total Memory = ' + mem_T +' MB'
print 'Used Memory = ' + mem_U +' MB'
print 'Free Memory = ' + mem_F +' MB'
To get a line-by-line memory and time analysis of your program, I suggest using memory_profiler and line_profiler.
Installation:
# Time profiler
$ pip install line_profiler
# Memory profiler
$ pip install memory_profiler
# Install the dependency for a faster analysis
$ pip install psutil
The common part is, you specify which function you want to analyse by using the respective decorators.
Example: I have several functions in my Python file main.py that I want to analyse. One of them is linearRegressionfit(). I need to use the decorator @profile, which lets me profile the code with respect to both time and memory.
Make the following changes to the function definition
@profile
def linearRegressionfit(Xt,Yt,Xts,Yts):
    lr=LinearRegression()
    model=lr.fit(Xt,Yt)
    predict=lr.predict(Xts)
    # More Code
For Time Profiling,
Run:
$ kernprof -l -v main.py
Output
Total time: 0.181071 s
File: main.py
Function: linearRegressionfit at line 35
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    35                                           @profile
    36                                           def linearRegressionfit(Xt,Yt,Xts,Yts):
    37         1         52.0     52.0      0.1      lr=LinearRegression()
    38         1      28942.0  28942.0     75.2      model=lr.fit(Xt,Yt)
    39         1       1347.0   1347.0      3.5      predict=lr.predict(Xts)
    40
    41         1       4924.0   4924.0     12.8      print("train Accuracy",lr.score(Xt,Yt))
    42         1       3242.0   3242.0      8.4      print("test Accuracy",lr.score(Xts,Yts))
For Memory Profiling,
Run:
$ python -m memory_profiler main.py
Output
Filename: main.py
Line #    Mem usage    Increment   Line Contents
================================================
    35  125.992 MiB  125.992 MiB   @profile
    36                             def linearRegressionfit(Xt,Yt,Xts,Yts):
    37  125.992 MiB    0.000 MiB       lr=LinearRegression()
    38  130.547 MiB    4.555 MiB       model=lr.fit(Xt,Yt)
    39  130.547 MiB    0.000 MiB       predict=lr.predict(Xts)
    40
    41  130.547 MiB    0.000 MiB       print("train Accuracy",lr.score(Xt,Yt))
    42  130.547 MiB    0.000 MiB       print("test Accuracy",lr.score(Xts,Yts))
The memory profiler results can also be plotted with matplotlib using
$ mprof run main.py
$ mprof plot
Note: Tested on
line_profiler version == 3.0.2
memory_profiler version == 0.57.0
psutil version == 5.7.0
EDIT: The results from the profilers can be parsed using the TAMPPA package, which can produce the desired line-by-line plots.
We chose the usual information source for this, /proc/meminfo, because we wanted to see instantaneous fluctuations in free memory and felt that querying the meminfo data source was helpful. It also gave us a few more related parameters that were already parsed.
Code
import os
linux_filepath = "/proc/meminfo"
meminfo = dict(
    (i.split()[0].rstrip(":"), int(i.split()[1]))
    for i in open(linux_filepath).readlines()
)
meminfo["memory_total_gb"] = meminfo["MemTotal"] / (2 ** 20)
meminfo["memory_free_gb"] = meminfo["MemFree"] / (2 ** 20)
meminfo["memory_available_gb"] = meminfo["MemAvailable"] / (2 ** 20)
Output for reference (we stripped all newlines for further analysis)
MemTotal: 1014500 kB MemFree: 562680 kB MemAvailable: 646364 kB
Buffers: 15144 kB Cached: 210720 kB SwapCached: 0 kB Active: 261476 kB
Inactive: 128888 kB Active(anon): 167092 kB Inactive(anon): 20888 kB
Active(file): 94384 kB Inactive(file): 108000 kB Unevictable: 3652 kB
Mlocked: 3652 kB SwapTotal: 0 kB SwapFree: 0 kB Dirty: 0 kB Writeback:
0 kB AnonPages: 168160 kB Mapped: 81352 kB Shmem: 21060 kB Slab: 34492
kB SReclaimable: 18044 kB SUnreclaim: 16448 kB KernelStack: 2672 kB
PageTables: 8180 kB NFS_Unstable: 0 kB Bounce: 0 kB WritebackTmp: 0 kB
CommitLimit: 507248 kB Committed_AS: 1038756 kB VmallocTotal:
34359738367 kB VmallocUsed: 0 kB VmallocChunk: 0 kB HardwareCorrupted:
0 kB AnonHugePages: 88064 kB CmaTotal: 0 kB CmaFree: 0 kB
HugePages_Total: 0 HugePages_Free: 0 HugePages_Rsvd: 0 HugePages_Surp:
0 Hugepagesize: 2048 kB DirectMap4k: 43008 kB DirectMap2M: 1005568 kB
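The same parsing can be sketched as a small function that takes the text rather than the file, which makes it easy to try on a sample. The names `parse_meminfo` and `kb_to_gib` are made up for this sketch, and the sample values are the first three fields of the reference output above; sizes in /proc/meminfo are reported in kB, while a few counters such as HugePages_Total carry no unit.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: int}; sizes are in kB."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if not sep or not parts:
            continue  # skip blank or malformed lines
        info[key.strip()] = int(parts[0])
    return info


def kb_to_gib(kb):
    """Convert a kB value to GiB (1 GiB = 2**20 kB)."""
    return kb / 2**20


sample = """MemTotal:        1014500 kB
MemFree:          562680 kB
MemAvailable:     646364 kB
HugePages_Total:       0
"""

info = parse_meminfo(sample)
for field in ("MemTotal", "MemFree", "MemAvailable"):
    print(f"{field}: {kb_to_gib(info[field]):.3f} GiB")
```

On a Linux machine the live data can be passed in with `parse_meminfo(open("/proc/meminfo").read())`.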
Here's something I put together a while ago. It's Windows only, but it may help you get part of what you need done.
Derived from:
"for sys available mem"
http://msdn2.microsoft.com/en-us/library/aa455130.aspx
"individual process information and python script examples"
http://www.microsoft.com/technet/scriptcenter/scripts/default.mspx?mfr=true
NOTE: the WMI interface/process is also available for performing similar tasks.
I'm not using it here because the current method covers my needs, but if someday this needs to be extended or improved, it may be worth investigating the available WMI tools.
WMI for python:
http://tgolden.sc.sabren.com/python/wmi.html
The code:
'''
Monitor window processes
derived from:
>for sys available mem
http://msdn2.microsoft.com/en-us/library/aa455130.aspx
> individual process information and python script examples
http://www.microsoft.com/technet/scriptcenter/scripts/default.mspx?mfr=true
NOTE: the WMI interface/process is also available for performing similar tasks
I'm not using it here because the current method covers my needs, but if someday it's needed
to extend or improve this module, then may want to investigate the WMI tools available.
WMI for python:
http://tgolden.sc.sabren.com/python/wmi.html
'''
__revision__ = 3
import win32com.client
from ctypes import *
from ctypes.wintypes import *
import pythoncom
import pywintypes
import datetime
class MEMORYSTATUS(Structure):
    _fields_ = [
        ('dwLength', DWORD),
        ('dwMemoryLoad', DWORD),
        ('dwTotalPhys', DWORD),
        ('dwAvailPhys', DWORD),
        ('dwTotalPageFile', DWORD),
        ('dwAvailPageFile', DWORD),
        ('dwTotalVirtual', DWORD),
        ('dwAvailVirtual', DWORD),
    ]
def winmem():
    x = MEMORYSTATUS() # create the structure
    windll.kernel32.GlobalMemoryStatus(byref(x)) # windll and byref come from ctypes
    return x
class process_stats:
    '''process_stats is able to provide counters of (all?) the items available in perfmon.
    Refer to the self.supported_types keys for the currently supported 'Performance Objects'
    To add logging support for other data you can derive the necessary data from perfmon:
    ---------
    perfmon can be run from windows 'run' menu by entering 'perfmon' and enter.
    Clicking on the '+' will open the 'add counters' menu,
    From the 'Add Counters' dialog, the 'Performance object' is the self.support_types key.
    --> Where spaces are removed and symbols are entered as text (Ex. # == Number, % == Percent)
    For the items you wish to log add the proper attribute name in the list in the self.supported_types dictionary,
    keyed by the 'Performance Object' name as mentioned above.
    ---------
    NOTE: The 'NETFramework_NETCLRMemory' key does not seem to log dotnet 2.0 properly.
    Initially the python implementation was derived from:
    http://www.microsoft.com/technet/scriptcenter/scripts/default.mspx?mfr=true
    '''
    def __init__(self,process_name_list=[],perf_object_list=[],filter_list=[]):
        '''process_names_list == the list of all processes to log (if empty log all)
        perf_object_list == list of process counters to log
        filter_list == list of text to filter
        print_results == boolean, output to stdout
        '''
        pythoncom.CoInitialize() # Needed when run by the same process in a thread
        self.process_name_list = process_name_list
        self.perf_object_list = perf_object_list
        self.filter_list = filter_list
        self.win32_perf_base = 'Win32_PerfFormattedData_'
        # Define new datatypes here!
        self.supported_types = {
            'NETFramework_NETCLRMemory': [
                'Name',
                'NumberTotalCommittedBytes',
                'NumberTotalReservedBytes',
                'NumberInducedGC',
                'NumberGen0Collections',
                'NumberGen1Collections',
                'NumberGen2Collections',
                'PromotedMemoryFromGen0',
                'PromotedMemoryFromGen1',
                'PercentTimeInGC',
                'LargeObjectHeapSize'
            ],
            'PerfProc_Process': [
                'Name',
                'PrivateBytes',
                'ElapsedTime',
                'IDProcess',# pid
                'Caption',
                'CreatingProcessID',
                'Description',
                'IODataBytesPersec',
                'IODataOperationsPersec',
                'IOOtherBytesPersec',
                'IOOtherOperationsPersec',
                'IOReadBytesPersec',
                'IOReadOperationsPersec',
                'IOWriteBytesPersec',
                'IOWriteOperationsPersec'
            ]
        }
    def get_pid_stats(self, pid):
        this_proc_dict = {}
        pythoncom.CoInitialize() # Needed when run by the same process in a thread
        # Fall back to every supported type when no explicit list was given
        perf_object_list = self.perf_object_list or self.supported_types.keys()
        for counter_type in perf_object_list:
            strComputer = "."
            objWMIService = win32com.client.Dispatch("WbemScripting.SWbemLocator")
            objSWbemServices = objWMIService.ConnectServer(strComputer, r"root\cimv2")
            query_str = '''Select * from %s%s''' % (self.win32_perf_base,counter_type)
            colItems = objSWbemServices.ExecQuery(query_str) # "Select * from Win32_PerfFormattedData_PerfProc_Process")# changed from Win32_Thread
            if len(colItems) > 0:
                for objItem in colItems:
                    if hasattr(objItem, 'IDProcess') and pid == objItem.IDProcess:
                        for attribute in self.supported_types[counter_type]:
                            # getattr does the attribute lookup without building a string for eval
                            this_proc_dict[attribute] = getattr(objItem, attribute)
                        this_proc_dict['TimeStamp'] = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S.') + str(datetime.datetime.now().microsecond)[:3]
                        break
        return this_proc_dict
    def get_stats(self):
        '''
        Show process stats for all processes in given list, if none given return all processes
        If filter list is defined return only the items that match or contained in the list
        Returns a list of result dictionaries
        '''
        pythoncom.CoInitialize() # Needed when run by the same process in a thread
        proc_results_list = []
        # Fall back to every supported type when no explicit list was given
        perf_object_list = self.perf_object_list or self.supported_types.keys()
        for counter_type in perf_object_list:
            strComputer = "."
            objWMIService = win32com.client.Dispatch("WbemScripting.SWbemLocator")
            objSWbemServices = objWMIService.ConnectServer(strComputer, r"root\cimv2")
            query_str = '''Select * from %s%s''' % (self.win32_perf_base,counter_type)
            colItems = objSWbemServices.ExecQuery(query_str) # "Select * from Win32_PerfFormattedData_PerfProc_Process")# changed from Win32_Thread
            try:
                if len(colItems) > 0:
                    for objItem in colItems:
                        found_flag = False
                        this_proc_dict = {}
                        if not self.process_name_list:
                            found_flag = True
                        else:
                            # Check if process name is in the process name list, allow print if it is
                            for proc_name in self.process_name_list:
                                obj_name = objItem.Name
                                if proc_name.lower() in obj_name.lower(): # will log if contains name
                                    found_flag = True
                                    break
                        if found_flag:
                            for attribute in self.supported_types[counter_type]:
                                this_proc_dict[attribute] = getattr(objItem, attribute)
                            this_proc_dict['TimeStamp'] = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S.') + str(datetime.datetime.now().microsecond)[:3]
                            proc_results_list.append(this_proc_dict)
            except pywintypes.com_error as err_msg:
                # Ignore and continue (proc_mem_logger calls this function once per second)
                continue
        return proc_results_list
def get_sys_stats():
    ''' Returns a dictionary of the system stats'''
    pythoncom.CoInitialize() # Needed when run by the same process in a thread
    x = winmem()
    sys_dict = {
        'dwAvailPhys': x.dwAvailPhys,
        'dwAvailVirtual':x.dwAvailVirtual
    }
    return sys_dict
if __name__ == '__main__':
    # This area used for testing only
    sys_dict = get_sys_stats()
    stats_processor = process_stats(process_name_list=['process2watch'],perf_object_list=[],filter_list=[])
    proc_results = stats_processor.get_stats()
    for result_dict in proc_results:
        print(result_dict)
    import os
    this_pid = os.getpid()
    this_proc_results = stats_processor.get_pid_stats(this_pid)
    print('this proc results:')
    print(this_proc_results)
I feel like these answers were written for Python 2, and in any case nobody's made mention of the standard resource package that's available for Python 3. It provides commands for obtaining the resource limits of a given process (the calling Python process by default). This isn't the same as getting the current usage of resources by the system as a whole, but it could solve some of the same problems like e.g. "I want to make sure I only use X much RAM with this script."
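A minimal sketch of what that looks like, assuming a Unix system (the resource module does not exist on Windows). The helper name `peak_rss_bytes` is made up here; `getrusage`, `getrlimit`, `RUSAGE_SELF` and `RLIMIT_AS` are the module's own names.

```python
import resource  # Unix only
import sys


def peak_rss_bytes():
    """Peak resident set size of the calling process, normalised to bytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # Linux reports ru_maxrss in kilobytes, macOS reports it in bytes.
    return peak if sys.platform == "darwin" else peak * 1024


# Current soft and hard limits on the process's address space.
soft, hard = resource.getrlimit(resource.RLIMIT_AS)

print("peak RSS (bytes):", peak_rss_bytes())
print("address-space limit (soft, hard):", soft, hard)
print("unlimited" if soft == resource.RLIM_INFINITY else "capped")
```

To cap the script at roughly X bytes, `resource.setrlimit(resource.RLIMIT_AS, (X, hard))` lowers the soft limit; allocations beyond it then typically fail with MemoryError. How strictly that limit is enforced depends on the platform.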
This aggregates all the goodies:
psutil + os gives Unix and Windows compatibility.
That allows us to get:
CPU
memory
disk
code:
import os
import psutil # need: pip install psutil
In [32]: psutil.virtual_memory()
Out[32]: svmem(total=6247907328, available=2502328320, percent=59.9, used=3327135744, free=167067648, active=3671199744, inactive=1662668800, buffers=844783616, cached=1908920320, shared=123912192, slab=613048320)
In [33]: psutil.virtual_memory().percent
Out[33]: 60.0
In [34]: psutil.cpu_percent()
Out[34]: 5.5
In [35]: os.sep
Out[35]: '/'
In [36]: psutil.disk_usage(os.sep)
Out[36]: sdiskusage(total=50190790656, used=41343860736, free=6467502080, percent=86.5)
In [37]: psutil.disk_usage(os.sep).percent
Out[37]: 86.5
I took the feedback from the first response and made some small changes:
#!/usr/bin/env python
#Execute this command on a Windows machine to install psutil>>>>python -m pip install psutil
import psutil
print (' ')
print ('----------------------CPU Information summary----------------------')
print (' ')
# gives a single float value
vcc=psutil.cpu_count()
print ('Total number of CPUs :',vcc)
vcpu=psutil.cpu_percent()
print ('Total CPUs utilized percentage :',vcpu,'%')
print (' ')
print ('----------------------RAM Information summary----------------------')
print (' ')
# you can convert that object to a dictionary
#print(dict(psutil.virtual_memory()._asdict()))
# gives an object with many fields
vvm=psutil.virtual_memory()
x=dict(psutil.virtual_memory()._asdict())
def forloop():
    for i in x:
        print (i,"--",x[i]/1024/1024/1024)#Output will be printed in GBs
forloop()
print (' ')
print ('----------------------RAM Utilization summary----------------------')
print (' ')
# you can have the percentage of used RAM
print('Percentage of used RAM :',psutil.virtual_memory().percent,'%')
#79.2
# you can calculate percentage of available memory
print('Percentage of available RAM :',psutil.virtual_memory().available * 100 / psutil.virtual_memory().total,'%')
#20.8
"... current system status (current CPU, RAM, free disk space, etc.)" And "*nix and Windows platforms" can be a difficult combination to achieve.
The operating systems are fundamentally different in the way they manage these resources. Indeed, they differ in core concepts like defining what counts as system and what counts as application time.
"Free disk space"? What counts as "disk space?" All partitions of all devices? What about foreign partitions in a multi-boot environment?
I don't think there's a clear enough consensus between Windows and *nix that makes this possible. Indeed, there may not even be any consensus between the various operating systems called Windows. Is there a single Windows API that works for both XP and Vista?
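For the disk part of the question, one possible way to sidestep the "which partitions?" ambiguity is to ask about the volume that contains a given path. The standard library's `shutil.disk_usage` does this on both Unix and Windows; the helper name `free_disk_fraction` below is made up for the sketch.

```python
import os
import shutil


def free_disk_fraction(path=os.sep):
    """Fraction of free space on the volume that contains `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.free / usage.total


print(f"{free_disk_fraction():.1%} free on the volume containing {os.sep!r}")
```

This answers "how full is the disk this path lives on", not "how much space is free on all devices"; other partitions have to be queried through their own mount points or drive letters.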
This script for CPU usage:
import os
def get_cpu_load():
    """ Returns a list CPU Loads"""
    result = []
    cmd = "WMIC CPU GET LoadPercentage "
    response = os.popen(cmd + ' 2>&1','r').read().strip().split("\r\n")
    for load in response[1:]:
        result.append(int(load))
    return result
if __name__ == '__main__':
    print(get_cpu_load())
For CPU details use psutil library
https://psutil.readthedocs.io/en/latest/#cpu
For RAM frequency (in MHz), use the Linux utility dmidecode and manipulate its output a bit. The command needs root permission, so you have to supply your password too: copy the following command, replacing mypass with your password.
import os
os.system("echo mypass | sudo -S dmidecode -t memory | grep 'Clock Speed' | cut -d ':' -f2")
------------------- Output ---------------------------
1600 MT/s
Unknown
1600 MT/s
Unknown 0
More specifically:
[i for i in os.popen("echo mypass | sudo -S dmidecode -t memory | grep 'Clock Speed' | cut -d ':' -f2").read().split(' ') if i.isdigit()]
-------------------------- output -------------------------
['1600', '1600']
You can read /proc/meminfo to get the used memory:
file1 = open('/proc/meminfo', 'r')
for line in file1:
    if 'MemTotal' in line:
        x = line.split()
        memTotal = int(x[1])
    if 'Buffers' in line:
        x = line.split()
        buffers = int(x[1])
    if 'Cached' in line and 'SwapCached' not in line:
        x = line.split()
        cached = int(x[1])
    if 'MemFree' in line:
        x = line.split()
        memFree = int(x[1])
file1.close()
percentage_used = int ( ( memTotal - (buffers + cached + memFree) ) / memTotal * 100 )
print(percentage_used)
Based on the CPU usage code by @Hrabal, this is what I use:
from subprocess import Popen, PIPE
def get_cpu_usage():
    ''' Get CPU usage on Linux by reading /proc/stat '''
    # universal_newlines=True makes communicate() return text rather than bytes
    sub = Popen(('grep', 'cpu', '/proc/stat'), stdout=PIPE, stderr=PIPE, universal_newlines=True)
    # first line is the aggregate "cpu" row: user, nice, system, idle
    top_vals = [int(val) for val in sub.communicate()[0].split('\n')[0].split()[1:5]]
    return (top_vals[0] + top_vals[2]) * 100. /(top_vals[0] + top_vals[2] + top_vals[3])
You can use psutil or psmem with subprocess
example code
import subprocess
cmd = subprocess.Popen(['sudo','./ps_mem'],stdout=subprocess.PIPE,stderr=subprocess.PIPE)
out,error = cmd.communicate()
memory = out.splitlines()
Reference
https://github.com/Leo-g/python-flask-cmd
You can also use the recently released SystemScripter library, installed with pip install SystemScripter. It builds on other libraries, psutil among them, to provide system information that spans from CPU to disk.
For current CPU usage use the function:
SystemScripter.CPU.CpuPerCurrentUtil(SystemScripter.CPU()) #class init as self param if not work
This gets the usage percentage or use:
SystemScripter.CPU.CpuCurrentUtil(SystemScripter.CPU())
https://pypi.org/project/SystemScripter/#description
When run from crontab, this won't print the pid.
Setup: add the line */1 * * * * sh dog.sh with crontab -e
import os
import re
CUT_OFF = 90
def get_cpu_load():
    cmd = "ps -Ao user,uid,comm,pid,pcpu --sort=-pcpu | head -n 2 | tail -1"
    response = os.popen(cmd, 'r').read()
    arr = re.findall(r'\S+', response)
    print(arr)
    needKill = float(arr[-1]) > CUT_OFF
    if needKill:
        r = os.popen(f"kill -9 {arr[-2]}")
        print('kill:', r)
if __name__ == '__main__':
    # Test CPU with
    # $ stress --cpu 1
    # crontab -e
    # Every 1 min
    # */1 * * * * sh dog.sh
    # ctrl o, ctrl x
    # crontab -l
    print(get_cpu_load())
A shell-out is not needed for @CodeGench's solution, so assuming Linux and Python's standard libraries:
def cpu_load():
    with open("/proc/stat", "r") as stat:
        (key, user, nice, system, idle, _) = (stat.readline().split(None, 5))
    assert key == "cpu", "'cpu ...' should be the first line in /proc/stat"
    busy = int(user) + int(nice) + int(system)
    return 100 * busy / (busy + int(idle))
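Note that the counters in /proc/stat are cumulative since boot, so a single read gives the average load since boot rather than the current load. A sketch of one way to get a recent figure is to read the first line twice and work on the difference. The function names here are made up, and the two sample lines are synthetic values chosen for illustration, not real measurements.

```python
import time


def parse_cpu_line(line):
    """Return (busy, idle) jiffies from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    user, nice, system, idle = (int(v) for v in fields[1:5])
    return user + nice + system, idle


def cpu_percent_between(first_line, second_line):
    """CPU load in percent over the interval between two 'cpu' lines."""
    busy1, idle1 = parse_cpu_line(first_line)
    busy2, idle2 = parse_cpu_line(second_line)
    busy, idle = busy2 - busy1, idle2 - idle1
    total = busy + idle
    return 100.0 * busy / total if total else 0.0


def sample_cpu_percent(interval=0.5, path="/proc/stat"):
    """Read /proc/stat twice, `interval` seconds apart (Linux only)."""
    with open(path) as stat:
        first = stat.readline()
    time.sleep(interval)
    with open(path) as stat:
        second = stat.readline()
    return cpu_percent_between(first, second)


# Synthetic example: 75 busy and 125 idle jiffies elapsed between the reads.
print(cpu_percent_between("cpu 100 0 50 850 0", "cpu 150 0 75 975 0"))
```

On a Linux machine, `sample_cpu_percent()` applies the same calculation to live data.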
I don't believe that there is a well-supported multi-platform library available. Remember that Python itself is written in C so any library is simply going to make a smart decision about which OS-specific code snippet to run, as you suggested above.

Python resource.getrusage says no new memory allocated

In the following, I try to measure the current process's memory usage with
resource.getrusage.
I allocate a large array (and can see the system memory increase with psutil), but getrusage reports the same rss value before and after allocating the array. What's up with that?
import psutil
import os
import resource
import numpy as np
def total_memory():
    v = psutil.virtual_memory()
    s = psutil.swap_memory()
    return v.used + s.used
result = resource.getrusage(resource.RUSAGE_SELF)
print "resource START ", result.ru_maxrss, result.ru_ixrss, result.ru_idrss, result.ru_isrss
init = total_memory()
print "psutil START ", init
data = np.random.rand(10000,10000)
result = resource.getrusage(resource.RUSAGE_SELF)
print "resource END ", result.ru_maxrss, result.ru_ixrss, result.ru_idrss, result.ru_isrss
print "psutil END+", total_memory()-init
reports:
resource START 1586580 0 0 0
psutil START 5074391040
resource END 1586580 0 0 0 # same
psutil END+ 793374720 # total memory increase by this much
Python 2.7.6. Ubuntu 14
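One thing worth checking here: ru_maxrss is the peak resident set size of the process so far (in kilobytes on Linux), not its current size, so it only changes when a new peak is reached. The other three fields (ru_ixrss, ru_idrss, ru_isrss) are unused on Linux, which is why they stay at 0. A possible explanation for the output above, offered as a guess, is that the process had already peaked at about 1.5 GB before the array was allocated. To watch the current resident size instead, a Linux-only sketch is to read VmRSS from /proc/self/status; the function names and the sample text below are made up for illustration.

```python
def parse_vm_rss_kb(status_text):
    """Return the VmRSS value (kB) from /proc/<pid>/status-style text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None


def current_rss_kb(path="/proc/self/status"):
    """Current resident set size of this process in kB (Linux only)."""
    with open(path) as status:
        return parse_vm_rss_kb(status.read())


# Synthetic sample of the relevant part of a status file.
sample = "Name:\tpython\nVmPeak:\t  250000 kB\nVmRSS:\t  123456 kB\n"
print(parse_vm_rss_kb(sample))
```

Calling `current_rss_kb()` before and after the allocation shows the change in the current resident size, independent of any earlier peak.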

How do I use line_profiler (from Robert Kern)?

I have tried using the line_profiler module for getting a line-by-line profile over a Python file. This is what I've done so far:
1) Installed line_profiler from pypi by using the .exe file (I am on WinXP and Win7). Just clicked through the installation wizard.
2) Written a small piece of code (similar to what has been asked in another answered question here).
from line_profiler import LineProfiler
def do_stuff(numbers):
    print numbers
numbers = 2
profile = LineProfiler(do_stuff(numbers))
profile.print_stats()
3) Run the code from IDLE/PyScripter. I got only the time.
Timer unit: 4.17188e-10 s
How do I get full line-by-line profile over the code I execute? I have never used any advanced Python features like decorators, so it is hard for me to understand how shall I use the guidelines provided by several posts like here and here.
This answer is a copy of my answer here for how to get line_profiler statistics from within a Python script (without using kernprof from the command line or having to add @profile decorators to functions and class methods). All answers (that I've seen) to similar line_profiler questions only describe using kernprof.
The line_profiler test cases (found on GitHub) have an example of how to generate profile data from within a Python script. You have to wrap the function that you want to profile and then call the wrapper passing any desired function arguments.
from line_profiler import LineProfiler
import random
def do_stuff(numbers):
    s = sum(numbers)
    l = [numbers[i]/43 for i in range(len(numbers))]
    m = ['hello'+str(numbers[i]) for i in range(len(numbers))]
numbers = [random.randint(1,100) for i in range(1000)]
lp = LineProfiler()
lp_wrapper = lp(do_stuff)
lp_wrapper(numbers)
lp.print_stats()
Output:
Timer unit: 1e-06 s
Total time: 0.000649 s
File: <ipython-input-2-2e060b054fea>
Function: do_stuff at line 4
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     4                                           def do_stuff(numbers):
     5         1           10     10.0      1.5      s = sum(numbers)
     6         1          186    186.0     28.7      l = [numbers[i]/43 for i in range(len(numbers))]
     7         1          453    453.0     69.8      m = ['hello'+str(numbers[i]) for i in range(len(numbers))]
Adding Additional Functions to Profile
You can also register additional functions to be profiled. For example, if the wrapped function calls a second function and you wrap only the caller, you'll see profile results for the caller alone.
from line_profiler import LineProfiler
import random
def do_other_stuff(numbers):
    s = sum(numbers)
def do_stuff(numbers):
    do_other_stuff(numbers)
    l = [numbers[i]/43 for i in range(len(numbers))]
    m = ['hello'+str(numbers[i]) for i in range(len(numbers))]
numbers = [random.randint(1,100) for i in range(1000)]
lp = LineProfiler()
lp_wrapper = lp(do_stuff)
lp_wrapper(numbers)
lp.print_stats()
The above would only produce the following profile output for the calling function:
Timer unit: 1e-06 s
Total time: 0.000773 s
File: <ipython-input-3-ec0394d0a501>
Function: do_stuff at line 7
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     7                                           def do_stuff(numbers):
     8         1           11     11.0      1.4      do_other_stuff(numbers)
     9         1          236    236.0     30.5      l = [numbers[i]/43 for i in range(len(numbers))]
    10         1          526    526.0     68.0      m = ['hello'+str(numbers[i]) for i in range(len(numbers))]
In this case, you can add the additional called function to profile like this:
from line_profiler import LineProfiler
import random
def do_other_stuff(numbers):
    s = sum(numbers)
def do_stuff(numbers):
    do_other_stuff(numbers)
    l = [numbers[i]/43 for i in range(len(numbers))]
    m = ['hello'+str(numbers[i]) for i in range(len(numbers))]
numbers = [random.randint(1,100) for i in range(1000)]
lp = LineProfiler()
lp.add_function(do_other_stuff) # add additional function to profile
lp_wrapper = lp(do_stuff)
lp_wrapper(numbers)
lp.print_stats()
Output:
Timer unit: 1e-06 s
Total time: 9e-06 s
File: <ipython-input-4-dae73707787c>
Function: do_other_stuff at line 4
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     4                                           def do_other_stuff(numbers):
     5         1            9      9.0    100.0      s = sum(numbers)
Total time: 0.000694 s
File: <ipython-input-4-dae73707787c>
Function: do_stuff at line 7
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     7                                           def do_stuff(numbers):
     8         1           12     12.0      1.7      do_other_stuff(numbers)
     9         1          208    208.0     30.0      l = [numbers[i]/43 for i in range(len(numbers))]
    10         1          474    474.0     68.3      m = ['hello'+str(numbers[i]) for i in range(len(numbers))]
NOTE: Adding functions to profile in this way does not require changes to the profiled code (i.e., no need to add @profile decorators).
Just follow Dan Riti's example from the first link, but use your code. All you have to do after installing the line_profiler module is add a @profile decorator right before each function you wish to profile line-by-line and make sure each one is called at least once somewhere else in the code. For your trivial example code that would be something like this:
example.py file:
@profile
def do_stuff(numbers):
    print numbers
numbers = 2
do_stuff(numbers)
Having done that, run your script via the kernprof.py✶ that was installed in your C:\Python27\Scripts directory. Here's the (not very interesting) actual output from doing this in a Windows 7 command-line session:
> python "C:\Python27\Scripts\kernprof.py" -l -v example.py
2
Wrote profile results to example.py.lprof
Timer unit: 3.2079e-07 s
File: example.py
Function: do_stuff at line 2
Total time: 0.00185256 s
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     1                                           @profile
     2                                           def do_stuff(numbers):
     3         1         5775   5775.0    100.0      print numbers
You likely need to adapt this last step—the running of your test script with kernprof.py instead of directly by the Python interpreter—in order to do the equivalent from within IDLE or PyScripter.
✶Update
It appears that in line_profiler v1.0, the kernprof utility is distributed as an executable, not a .py script file as it was when I wrote the above. This means the following now needs to be used to invoke it from the command line:
> "C:\Python27\Scripts\kernprof.exe" -l -v example.py
load the line_profiler and numpy
%load_ext line_profiler
import numpy as np
define a function for example:
def take_sqr(array):
    sqr_ar = [np.sqrt(x) for x in array]
    return sqr_ar
use line_profiler to count the time as follows:
%lprun -f take_sqr take_sqr([1,2,3])
the output looks like this:
Timer unit: 1e-06 s
Total time: 6e-05 s
File: <ipython-input-5-e50c1b05a473>
Function: take_sqr at line 1
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     1                                           def take_sqr(array):
     2         4         59.0     14.8     98.3      sqr_ar = [np.sqrt(x) for x in array]
     3         1          1.0      1.0      1.7      return sqr_ar
I found a good way to use line_profiler through a decorator, i.e. @profile, that worked for me:
def profile(func):
    from functools import wraps
    @wraps(func)
    def wrapper(*args, **kwargs):
        from line_profiler import LineProfiler
        prof = LineProfiler()
        try:
            return prof(func)(*args, **kwargs)
        finally:
            prof.print_stats()
    return wrapper
Credits to: pavelpatrin
If you're using PyCharm, you can also take a look at
https://plugins.jetbrains.com/plugin/16536-line-profiler
It's a plugin I created that allows you to load and visualize line profiler results into the PyCharm editor.
Just an addition to @Lhenkel's answer.
This is a decorator for async functions:
def async_profile(func):
    """line profiler for an async function"""
    from functools import wraps
    @wraps(func)
    async def wrapper(*args, **kwargs):
        from line_profiler import LineProfiler
        prof = LineProfiler()
        try:
            return await prof(func)(*args, **kwargs)
        finally:
            prof.print_stats()
    return wrapper
To use these decorators with methods read this answer

Why does line_profiler in python not add up the times correctly?

I am new to the line_profiler package in python. Am I reading the result incorrectly, or shouldn't the components in the output below add up to 1.67554 seconds? Instead, they add up to 3.918 seconds (2426873 microseconds + 1491105 microseconds). Thanks!
# test.py
import numpy as np
def tf():
    arr = np.random.randn(3000,6000)
    np.where(arr>1,arr,np.nan)
import test
%lprun -f test.tf test.tf()
Timer unit: 4.27654e-07 s
File: test.py
Function: tf at line 9
Total time: 1.67554 s
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     9                                           def tf():
    10         1      2426873 2426873.0     61.9      arr = np.random.randn(3000,6000)
    11         1      1491105 1491105.0     38.1      np.where(arr>1,arr,np.nan)
You misread the time there; those are not microseconds.
From the documentation:
Time: The total amount of time spent executing the line in the timer's units. In the header information before the tables, you will see a line "Timer unit:" giving the conversion factor to seconds. It may be different on different systems.
Emphasis mine. Your output shows each Timer unit is about 0.428 microseconds. The totals match if you multiply the units with the Timer unit value:
>>> unit = 4.27654e-07
>>> 2426873 * unit + 1491105 * unit
1.675538963612
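The same conversion can be wrapped in a small helper if you read these tables often. This is only a sketch: the helper name lprof_seconds is made up for illustration and is not part of line_profiler, and the numbers are copied from the output in the question.

```python
def lprof_seconds(raw_time, timer_unit):
    """Convert a raw 'Time' value from a line_profiler table to seconds."""
    return raw_time * timer_unit


# Values copied from the question's output
timer_unit = 4.27654e-07          # the "Timer unit:" header line
line_times = [2426873, 1491105]   # the "Time" column for lines 10 and 11

total_seconds = sum(lprof_seconds(t, timer_unit) for t in line_times)

# Share of each line, which should agree with the "% Time" column
shares = [round(100.0 * t / sum(line_times), 1) for t in line_times]

print(total_seconds)
print(shares)
```

The total lands on the reported "Total time" of about 1.67554 s, and the shares match the 61.9 and 38.1 shown in the "% Time" column.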

How to get current CPU and RAM usage in Python?

How can I get the current system status (current CPU, RAM, free disk space, etc.) in Python? Ideally, it would work for both Unix and Windows platforms.
There seems to be a few possible ways of extracting that from my search:
Using a library such as PSI (that currently seems not actively developed and not supported on multiple platforms) or something like pystatgrab (again no activity since 2007 it seems and no support for Windows).
Using platform specific code such as using a os.popen("ps") or similar for the *nix systems and MEMORYSTATUS in ctypes.windll.kernel32 (see this recipe on ActiveState) for the Windows platform. One could put a Python class together with all those code snippets.
It's not that those methods are bad but is there already a well-supported, multi-platform way of doing the same thing?
The psutil library gives you information about CPU, RAM, etc., on a variety of platforms:
psutil is a module providing an interface for retrieving information on running processes and system utilization (CPU, memory) in a portable way by using Python, implementing many functionalities offered by tools like ps, top and Windows task manager.
It currently supports Linux, Windows, OSX, Sun Solaris, FreeBSD, OpenBSD and NetBSD, both 32-bit and 64-bit architectures, with Python versions from 2.6 to 3.5 (users of Python 2.4 and 2.5 may use 2.1.3 version).
Some examples:
#!/usr/bin/env python
import psutil
# gives a single float value
psutil.cpu_percent()
# gives an object with many fields
psutil.virtual_memory()
# you can convert that object to a dictionary
dict(psutil.virtual_memory()._asdict())
# you can have the percentage of used RAM
psutil.virtual_memory().percent
79.2
# you can calculate percentage of available memory
psutil.virtual_memory().available * 100 / psutil.virtual_memory().total
20.8
The psutil documentation covers more concepts and examples:
https://psutil.readthedocs.io/en/latest/
Use the psutil library. On Ubuntu 18.04, pip installed 5.5.0 (latest version) as of 1-30-2019. Older versions may behave somewhat differently.
You can check your version of psutil by doing this in Python:
from __future__ import print_function # for Python2
import psutil
print(psutil.__version__)
To get some memory and CPU stats:
from __future__ import print_function
import psutil
print(psutil.cpu_percent())
print(psutil.virtual_memory()) # physical memory usage
print('memory % used:', psutil.virtual_memory()[2])
The virtual_memory (tuple) will have the percent memory used system-wide. This seemed to be overestimated by a few percent for me on Ubuntu 18.04.
You can also get the memory used by the current Python instance:
import os
import psutil
pid = os.getpid()
python_process = psutil.Process(pid)
memoryUse = python_process.memory_info()[0]/2.**30 # resident set size (RSS) in bytes, converted to GiB
print('memory use:', memoryUse)
which gives the current memory use of your Python script.
There are some more in-depth examples on the pypi page for psutil.
Only for Linux:
One-liner for the RAM usage with only stdlib dependency:
import os
tot_m, used_m, free_m = map(int, os.popen('free -t -m').readlines()[-1].split()[1:])
One can get real time CPU and RAM monitoring by combining tqdm and psutil. It may be handy when running heavy computations / processing.
It also works in Jupyter without any code changes:
from tqdm import tqdm
from time import sleep
import psutil
with tqdm(total=100, desc='cpu%', position=1) as cpubar, tqdm(total=100, desc='ram%', position=0) as rambar:
    while True:
        rambar.n=psutil.virtual_memory().percent
        cpubar.n=psutil.cpu_percent()
        rambar.refresh()
        cpubar.refresh()
        sleep(0.5)
It's convenient to put those progress bars in separate process using multiprocessing library.
This code snippet is also available as a gist.
The code below worked for me without any external libraries. I tested it on Python 2.7.9.
CPU Usage
import os
CPU_Pct=str(round(float(os.popen('''grep 'cpu ' /proc/stat | awk '{usage=($2+$4)*100/($2+$4+$5)} END {print usage }' ''').readline()),2))
print("CPU Usage = " + CPU_Pct) # print results
And Ram Usage, Total, Used and Free
import os
mem=str(os.popen('free -t -m').readlines())
"""
Get a whole line of memory output, it will be something like below
[' total used free shared buffers cached\n',
'Mem: 925 591 334 14 30 355\n',
'-/+ buffers/cache: 205 719\n',
'Swap: 99 0 99\n',
'Total: 1025 591 434\n']
So, we need total memory, usage and free memory.
We should find the index of capital T, which is unique in this string
"""
T_ind=mem.index('T')
"""
Then, we can recreate the string with this information. After T we have,
"Total: " which has 14 characters, so we can start from index of T +14
and last 4 characters are also not necessary.
We can create a new sub-string using this information
"""
mem_G=mem[T_ind+14:-4]
"""
The result will be like
1025 603 422
we need to find the index of the first space, and we can take our substring
from 0 to this index number, this will give us the string of total memory
"""
S1_ind=mem_G.index(' ')
mem_T=mem_G[0:S1_ind]
"""
Similarly we will create a new sub-string, which will start at the second value.
The resulting string will be like
603 422
Again, we should find the index of the first space and then
take the Used Memory and Free memory.
"""
mem_G1=mem_G[S1_ind+8:]
S2_ind=mem_G1.index(' ')
mem_U=mem_G1[0:S2_ind]
mem_F=mem_G1[S2_ind+8:]
print('Summary = ' + mem_G)
print('Total Memory = ' + mem_T +' MB')
print('Used Memory = ' + mem_U +' MB')
print('Free Memory = ' + mem_F +' MB')
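The index arithmetic above depends on the exact column widths that free prints. A less fragile variant is to split the "Total:" row on whitespace. The sketch below assumes the same `free -t -m` layout shown in the docstring above; parse_free_total is a name made up for illustration, and the sample lines are copied from that docstring so the snippet runs without calling free.

```python
def parse_free_total(lines):
    """Return (total, used, free) in MB from the 'Total:' row of `free -t -m`."""
    for line in lines:
        fields = line.split()
        if fields and fields[0] == "Total:":
            total, used, free = (int(v) for v in fields[1:4])
            return total, used, free
    raise ValueError("no 'Total:' row found")


# Sample output copied from the answer above
sample = [
    "             total       used       free     shared    buffers     cached\n",
    "Mem:           925        591        334         14         30        355\n",
    "-/+ buffers/cache:        205        719\n",
    "Swap:           99          0         99\n",
    "Total:        1025        591        434\n",
]

total_mb, used_mb, free_mb = parse_free_total(sample)
print(total_mb, used_mb, free_mb)
```

On a live Linux system the list could come from os.popen('free -t -m').readlines(), as in the code above.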
To get a line-by-line memory and time analysis of your program, I suggest using memory_profiler and line_profiler.
Installation:
# Time profiler
$ pip install line_profiler
# Memory profiler
$ pip install memory_profiler
# Install the dependency for a faster analysis
$ pip install psutil
The common part is, you specify which function you want to analyse by using the respective decorators.
Example: I have several functions in my Python file main.py that I want to analyse. One of them is linearRegressionfit(). I need to use the decorator @profile, which helps me profile the code with respect to both time and memory.
Make the following changes to the function definition
@profile
def linearRegressionfit(Xt,Yt,Xts,Yts):
    lr=LinearRegression()
    model=lr.fit(Xt,Yt)
    predict=lr.predict(Xts)
    # More Code
For Time Profiling,
Run:
$ kernprof -l -v main.py
Output
Total time: 0.181071 s
File: main.py
Function: linearRegressionfit at line 35
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    35                                           @profile
    36                                           def linearRegressionfit(Xt,Yt,Xts,Yts):
    37         1         52.0     52.0      0.1      lr=LinearRegression()
    38         1      28942.0  28942.0     75.2      model=lr.fit(Xt,Yt)
    39         1       1347.0   1347.0      3.5      predict=lr.predict(Xts)
    40
    41         1       4924.0   4924.0     12.8      print("train Accuracy",lr.score(Xt,Yt))
    42         1       3242.0   3242.0      8.4      print("test Accuracy",lr.score(Xts,Yts))
For Memory Profiling,
Run:
$ python -m memory_profiler main.py
Output
Filename: main.py
Line #    Mem usage    Increment   Line Contents
================================================
    35  125.992 MiB  125.992 MiB   @profile
    36                             def linearRegressionfit(Xt,Yt,Xts,Yts):
    37  125.992 MiB    0.000 MiB       lr=LinearRegression()
    38  130.547 MiB    4.555 MiB       model=lr.fit(Xt,Yt)
    39  130.547 MiB    0.000 MiB       predict=lr.predict(Xts)
    40
    41  130.547 MiB    0.000 MiB       print("train Accuracy",lr.score(Xt,Yt))
    42  130.547 MiB    0.000 MiB       print("test Accuracy",lr.score(Xts,Yts))
The memory profiler results can also be plotted with matplotlib using
$ mprof run main.py
$ mprof plot
Note: Tested on
line_profiler version == 3.0.2
memory_profiler version == 0.57.0
psutil version == 5.7.0
EDIT: The results from the profilers can be parsed using the TAMPPA package. Using it, we can get line-by-line desired plots as
We chose to use the usual information source for this, /proc/meminfo, because we could see instantaneous fluctuations in free memory and found querying the meminfo data helpful. It also gave us a few more related parameters that were pre-parsed.
Code
linux_filepath = "/proc/meminfo"
# Each line looks like "MemTotal: 1014500 kB"; keep the key and the integer value (kB)
with open(linux_filepath) as f:
    meminfo = dict(
        (i.split()[0].rstrip(":"), int(i.split()[1]))
        for i in f
    )
meminfo["memory_total_gb"] = meminfo["MemTotal"] / (2 ** 20)
meminfo["memory_free_gb"] = meminfo["MemFree"] / (2 ** 20)
meminfo["memory_available_gb"] = meminfo["MemAvailable"] / (2 ** 20)
Output for reference (we stripped all newlines for further analysis)
MemTotal: 1014500 kB MemFree: 562680 kB MemAvailable: 646364 kB
Buffers: 15144 kB Cached: 210720 kB SwapCached: 0 kB Active: 261476 kB
Inactive: 128888 kB Active(anon): 167092 kB Inactive(anon): 20888 kB
Active(file): 94384 kB Inactive(file): 108000 kB Unevictable: 3652 kB
Mlocked: 3652 kB SwapTotal: 0 kB SwapFree: 0 kB Dirty: 0 kB Writeback:
0 kB AnonPages: 168160 kB Mapped: 81352 kB Shmem: 21060 kB Slab: 34492
kB SReclaimable: 18044 kB SUnreclaim: 16448 kB KernelStack: 2672 kB
PageTables: 8180 kB NFS_Unstable: 0 kB Bounce: 0 kB WritebackTmp: 0 kB
CommitLimit: 507248 kB Committed_AS: 1038756 kB VmallocTotal:
34359738367 kB VmallocUsed: 0 kB VmallocChunk: 0 kB HardwareCorrupted:
0 kB AnonHugePages: 88064 kB CmaTotal: 0 kB CmaFree: 0 kB
HugePages_Total: 0 HugePages_Free: 0 HugePages_Rsvd: 0 HugePages_Surp:
0 Hugepagesize: 2048 kB DirectMap4k: 43008 kB DirectMap2M: 1005568 kB
Here's something I put together a while ago, it's windows only but may help you get part of what you need done.
Derived from:
"for sys available mem"
http://msdn2.microsoft.com/en-us/library/aa455130.aspx
"individual process information and python script examples"
http://www.microsoft.com/technet/scriptcenter/scripts/default.mspx?mfr=true
NOTE: the WMI interface/process is also available for performing similar tasks
I'm not using it here because the current method covers my needs, but if someday it needs to be extended or improved, you may want to investigate the WMI tools available.
WMI for python:
http://tgolden.sc.sabren.com/python/wmi.html
The code:
'''
Monitor window processes
derived from:
>for sys available mem
http://msdn2.microsoft.com/en-us/library/aa455130.aspx
> individual process information and python script examples
http://www.microsoft.com/technet/scriptcenter/scripts/default.mspx?mfr=true
NOTE: the WMI interface/process is also available for performing similar tasks
I'm not using it here because the current method covers my needs, but if someday it's needed
to extend or improve this module, then may want to investigate the WMI tools available.
WMI for python:
http://tgolden.sc.sabren.com/python/wmi.html
'''
__revision__ = 3
import win32com.client
from ctypes import *
from ctypes.wintypes import *
import pythoncom
import pywintypes
import datetime
class MEMORYSTATUS(Structure):
    _fields_ = [
        ('dwLength', DWORD),
        ('dwMemoryLoad', DWORD),
        ('dwTotalPhys', DWORD),
        ('dwAvailPhys', DWORD),
        ('dwTotalPageFile', DWORD),
        ('dwAvailPageFile', DWORD),
        ('dwTotalVirtual', DWORD),
        ('dwAvailVirtual', DWORD),
    ]
def winmem():
    x = MEMORYSTATUS() # create the structure
    windll.kernel32.GlobalMemoryStatus(byref(x)) # windll and byref come from ctypes
    return x
class process_stats:
    '''process_stats is able to provide counters of (all?) the items available in perfmon.
    Refer to the self.supported_types keys for the currently supported 'Performance Objects'
    To add logging support for other data you can derive the necessary data from perfmon:
    ---------
    perfmon can be run from windows 'run' menu by entering 'perfmon' and enter.
    Clicking on the '+' will open the 'add counters' menu,
    From the 'Add Counters' dialog, the 'Performance object' is the self.supported_types key.
    --> Where spaces are removed and symbols are entered as text (Ex. # == Number, % == Percent)
    For the items you wish to log add the proper attribute name in the list in the self.supported_types dictionary,
    keyed by the 'Performance Object' name as mentioned above.
    ---------
    NOTE: The 'NETFramework_NETCLRMemory' key does not seem to log dotnet 2.0 properly.
    Initially the python implementation was derived from:
    http://www.microsoft.com/technet/scriptcenter/scripts/default.mspx?mfr=true
    '''
    def __init__(self,process_name_list=[],perf_object_list=[],filter_list=[]):
        '''process_names_list == the list of all processes to log (if empty log all)
        perf_object_list == list of process counters to log
        filter_list == list of text to filter
        print_results == boolean, output to stdout
        '''
        pythoncom.CoInitialize() # Needed when run by the same process in a thread
        self.process_name_list = process_name_list
        self.perf_object_list = perf_object_list
        self.filter_list = filter_list
        self.win32_perf_base = 'Win32_PerfFormattedData_'
        # Define new datatypes here!
        self.supported_types = {
            'NETFramework_NETCLRMemory': [
                'Name',
                'NumberTotalCommittedBytes',
                'NumberTotalReservedBytes',
                'NumberInducedGC',
                'NumberGen0Collections',
                'NumberGen1Collections',
                'NumberGen2Collections',
                'PromotedMemoryFromGen0',
                'PromotedMemoryFromGen1',
                'PercentTimeInGC',
                'LargeObjectHeapSize'
            ],
            'PerfProc_Process': [
                'Name',
                'PrivateBytes',
                'ElapsedTime',
                'IDProcess',# pid
                'Caption',
                'CreatingProcessID',
                'Description',
                'IODataBytesPersec',
                'IODataOperationsPersec',
                'IOOtherBytesPersec',
                'IOOtherOperationsPersec',
                'IOReadBytesPersec',
                'IOReadOperationsPersec',
                'IOWriteBytesPersec',
                'IOWriteOperationsPersec'
            ]
        }
    def get_pid_stats(self, pid):
        this_proc_dict = {}
        pythoncom.CoInitialize() # Needed when run by the same process in a thread
        # Fall back to every supported performance object when none were requested
        perf_object_list = self.perf_object_list or self.supported_types.keys()
        for counter_type in perf_object_list:
            strComputer = "."
            objWMIService = win32com.client.Dispatch("WbemScripting.SWbemLocator")
            objSWbemServices = objWMIService.ConnectServer(strComputer,"root\\cimv2")
            query_str = '''Select * from %s%s''' % (self.win32_perf_base,counter_type)
            colItems = objSWbemServices.ExecQuery(query_str) # "Select * from Win32_PerfFormattedData_PerfProc_Process")# changed from Win32_Thread
            if len(colItems) > 0:
                for objItem in colItems:
                    if hasattr(objItem, 'IDProcess') and pid == objItem.IDProcess:
                        for attribute in self.supported_types[counter_type]:
                            # getattr replaces the original eval('objItem.<attribute>')
                            this_proc_dict[attribute] = getattr(objItem, attribute)
                        this_proc_dict['TimeStamp'] = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S.') + str(datetime.datetime.now().microsecond)[:3]
                        break
        return this_proc_dict
    def get_stats(self):
        '''
        Show process stats for all processes in given list, if none given return all processes
        If filter list is defined return only the items that match or contained in the list
        Returns a list of result dictionaries
        '''
        pythoncom.CoInitialize() # Needed when run by the same process in a thread
        proc_results_list = []
        # Fall back to every supported performance object when none were requested
        perf_object_list = self.perf_object_list or self.supported_types.keys()
        for counter_type in perf_object_list:
            strComputer = "."
            objWMIService = win32com.client.Dispatch("WbemScripting.SWbemLocator")
            objSWbemServices = objWMIService.ConnectServer(strComputer,"root\\cimv2")
            query_str = '''Select * from %s%s''' % (self.win32_perf_base,counter_type)
            colItems = objSWbemServices.ExecQuery(query_str) # "Select * from Win32_PerfFormattedData_PerfProc_Process")# changed from Win32_Thread
            try:
                if len(colItems) > 0:
                    for objItem in colItems:
                        found_flag = False
                        this_proc_dict = {}
                        if not self.process_name_list:
                            found_flag = True
                        else:
                            # Check if process name is in the process name list, allow print if it is
                            for proc_name in self.process_name_list:
                                obj_name = objItem.Name
                                if proc_name.lower() in obj_name.lower(): # will log if contains name
                                    found_flag = True
                                    break
                        if found_flag:
                            for attribute in self.supported_types[counter_type]:
                                # getattr replaces the original eval('objItem.<attribute>')
                                this_proc_dict[attribute] = getattr(objItem, attribute)
                            this_proc_dict['TimeStamp'] = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S.') + str(datetime.datetime.now().microsecond)[:3]
                            proc_results_list.append(this_proc_dict)
            except pywintypes.com_error as err_msg:
                # Ignore and continue (proc_mem_logger calls this function once per second)
                continue
        return proc_results_list
def get_sys_stats():
    ''' Returns a dictionary of the system stats'''
    pythoncom.CoInitialize() # Needed when run by the same process in a thread
    x = winmem()
    sys_dict = {
        'dwAvailPhys': x.dwAvailPhys,
        'dwAvailVirtual':x.dwAvailVirtual
    }
    return sys_dict
if __name__ == '__main__':
    # This area used for testing only
    sys_dict = get_sys_stats()
    stats_processor = process_stats(process_name_list=['process2watch'],perf_object_list=[],filter_list=[])
    proc_results = stats_processor.get_stats()
    for result_dict in proc_results:
        print(result_dict)
    import os
    this_pid = os.getpid()
    this_proc_results = stats_processor.get_pid_stats(this_pid)
    print('this proc results:')
    print(this_proc_results)
I feel like these answers were written for Python 2, and in any case nobody's made mention of the standard resource package that's available for Python 3. It provides commands for obtaining the resource limits of a given process (the calling Python process by default). This isn't the same as getting the current usage of resources by the system as a whole, but it could solve some of the same problems like e.g. "I want to make sure I only use X much RAM with this script."
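A minimal sketch of what that can look like, assuming a Unix platform (the resource module is not available on Windows). The function names are made up for illustration; the resource calls themselves (getrusage, getrlimit, setrlimit) are standard library.

```python
import resource
import sys


def peak_rss_bytes():
    """Peak resident set size of the calling process, in bytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux but in bytes on macOS
    return peak if sys.platform == "darwin" else peak * 1024


def address_space_limit():
    """(soft, hard) address-space limits; resource.RLIM_INFINITY means unlimited."""
    return resource.getrlimit(resource.RLIMIT_AS)


def cap_address_space(max_bytes):
    """Lower the soft address-space limit of this process to max_bytes.

    Raises ValueError if max_bytes exceeds a finite hard limit. Allocations
    beyond the cap then fail with MemoryError instead of growing unbounded.
    """
    _, hard = resource.getrlimit(resource.RLIMIT_AS)
    resource.setrlimit(resource.RLIMIT_AS, (max_bytes, hard))


print("peak RSS (bytes):", peak_rss_bytes())
print("address-space limits:", address_space_limit())
```

cap_address_space is deliberately not called here, since it changes the limits of the running interpreter. Support for RLIMIT_AS varies between Unix flavours, so treat it as something to test on your own platform.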
This aggregates all the goodies:
psutil + os to get Unix & Windows compatibility:
That allows us to get:
CPU
memory
disk
code:
import os
import psutil # need: pip install psutil
In [32]: psutil.virtual_memory()
Out[32]: svmem(total=6247907328, available=2502328320, percent=59.9, used=3327135744, free=167067648, active=3671199744, inactive=1662668800, buffers=844783616, cached=1908920320, shared=123912192, slab=613048320)
In [33]: psutil.virtual_memory().percent
Out[33]: 60.0
In [34]: psutil.cpu_percent()
Out[34]: 5.5
In [35]: os.sep
Out[35]: '/'
In [36]: psutil.disk_usage(os.sep)
Out[36]: sdiskusage(total=50190790656, used=41343860736, free=6467502080, percent=86.5)
In [37]: psutil.disk_usage(os.sep).percent
Out[37]: 86.5
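If only the disk figures are needed, the standard library's shutil.disk_usage returns total, used and free bytes without installing anything. A small sketch (disk_percent_used is a made-up helper name):

```python
import os
import shutil


def disk_percent_used(path=os.sep):
    """Percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    if usage.total == 0:
        return 0.0
    return usage.used * 100.0 / usage.total


print(shutil.disk_usage(os.sep))
print(round(disk_percent_used(), 1))
```

Note that this ratio is used/total, so it can differ from psutil's percent field. In the session above, 86.5 corresponds to used/(used+free) rather than used/total, which would be about 82.4 for the same numbers.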
I took the feedback from the first response and made a few small changes:
#!/usr/bin/env python
#Execute this command on a Windows machine to install psutil>>>>python -m pip install psutil
import psutil
print (' ')
print ('----------------------CPU Information summary----------------------')
print (' ')
# gives a single float value
vcc=psutil.cpu_count()
print ('Total number of CPUs :',vcc)
vcpu=psutil.cpu_percent()
print ('Total CPUs utilized percentage :',vcpu,'%')
print (' ')
print ('----------------------RAM Information summary----------------------')
print (' ')
# you can convert that object to a dictionary
#print(dict(psutil.virtual_memory()._asdict()))
# gives an object with many fields
vvm=psutil.virtual_memory()
x=dict(psutil.virtual_memory()._asdict())
def forloop():
    for i in x:
        print (i,"--",x[i]/1024/1024/1024)#Output will be printed in GBs
forloop()
print (' ')
print ('----------------------RAM Utilization summary----------------------')
print (' ')
# you can have the percentage of used RAM
print('Percentage of used RAM :',psutil.virtual_memory().percent,'%')
#79.2
# you can calculate percentage of available memory
print('Percentage of available RAM :',psutil.virtual_memory().available * 100 / psutil.virtual_memory().total,'%')
#20.8
"... current system status (current CPU, RAM, free disk space, etc.)" And "*nix and Windows platforms" can be a difficult combination to achieve.
The operating systems are fundamentally different in the way they manage these resources. Indeed, they differ in core concepts like defining what counts as system and what counts as application time.
"Free disk space"? What counts as "disk space?" All partitions of all devices? What about foreign partitions in a multi-boot environment?
I don't think there's a clear enough consensus between Windows and *nix that makes this possible. Indeed, there may not even be any consensus between the various operating systems called Windows. Is there a single Windows API that works for both XP and Vista?
This script for CPU usage:
import os
def get_cpu_load():
    """ Returns a list of CPU loads"""
    result = []
    cmd = "WMIC CPU GET LoadPercentage "
    response = os.popen(cmd + ' 2>&1','r').read().strip().split("\r\n")
    for load in response[1:]:
        result.append(int(load))
    return result
if __name__ == '__main__':
    print(get_cpu_load())
For CPU details use psutil library
https://psutil.readthedocs.io/en/latest/#cpu
For RAM frequency (in MHz), use the Linux dmidecode utility and manipulate its output a bit ;). The command needs root permission, so you have to supply your password too. Just copy the following command, replacing mypass with your password:
import os
os.system("echo mypass | sudo -S dmidecode -t memory | grep 'Clock Speed' | cut -d ':' -f2")
------------------- Output ---------------------------
1600 MT/s
Unknown
1600 MT/s
Unknown 0
More specifically:
[i for i in os.popen("echo mypass | sudo -S dmidecode -t memory | grep 'Clock Speed' | cut -d ':' -f2").read().split(' ') if i.isdigit()]
-------------------------- output -------------------------
['1600', '1600']
You can read /proc/meminfo to get used memory:
file1 = open('/proc/meminfo', 'r')
for line in file1:
    if 'MemTotal' in line:
        x = line.split()
        memTotal = int(x[1])
    if 'Buffers' in line:
        x = line.split()
        buffers = int(x[1])
    if 'Cached' in line and 'SwapCached' not in line:
        x = line.split()
        cached = int(x[1])
    if 'MemFree' in line:
        x = line.split()
        memFree = int(x[1])
file1.close()
percentage_used = int ( ( memTotal - (buffers + cached + memFree) ) / memTotal * 100 )
print(percentage_used)
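On kernels that report a MemAvailable row (it appears in the /proc/meminfo dump quoted earlier on this page), the used percentage can be estimated from that single field instead of summing buffers, cache and free memory. The sketch below is an alternative to the loop above, not the same calculation; the function names are made up, and the sample values are taken from that earlier dump so the snippet runs anywhere.

```python
def parse_meminfo(text):
    """Map each /proc/meminfo key to its integer value (kB for most rows)."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            info[key.strip()] = int(fields[0])
    return info


def percent_used(info):
    """Used memory in percent, counting MemAvailable as not in use."""
    return 100.0 * (info["MemTotal"] - info["MemAvailable"]) / info["MemTotal"]


# Sample rows copied from the /proc/meminfo output shown earlier
sample = """MemTotal: 1014500 kB
MemFree: 562680 kB
MemAvailable: 646364 kB
Buffers: 15144 kB
Cached: 210720 kB"""

used_pct = percent_used(parse_meminfo(sample))
print(round(used_pct, 1))
```

On a live system, pass open('/proc/meminfo').read() to parse_meminfo instead of the sample string.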
Based on the cpu usage code by @Hrabal, this is what I use:
from subprocess import Popen, PIPE
def get_cpu_usage():
    ''' Get CPU usage on Linux by reading /proc/stat '''
    sub = Popen(('grep', 'cpu', '/proc/stat'), stdout=PIPE, stderr=PIPE)
    # First line is the aggregate "cpu" row: user, nice, system, idle
    top_vals = [int(val) for val in sub.communicate()[0].decode().split('\n')[0].split()[1:5]]
    return (top_vals[0] + top_vals[2]) * 100. /(top_vals[0] + top_vals[2] + top_vals[3])
You can use psutil or psmem with subprocess
example code
import subprocess
cmd = subprocess.Popen(['sudo','./ps_mem'],stdout=subprocess.PIPE,stderr=subprocess.PIPE)
out,error = cmd.communicate()
memory = out.splitlines()
Reference
https://github.com/Leo-g/python-flask-cmd
You can also use the recently released SystemScripter library, installed with pip install SystemScripter. It builds on other libraries such as psutil to provide system information ranging from CPU to disk.
For current CPU usage use the function:
SystemScripter.CPU.CpuPerCurrentUtil(SystemScripter.CPU()) #class init as self param if not work
This gets the usage percentage or use:
SystemScripter.CPU.CpuCurrentUtil(SystemScripter.CPU())
https://pypi.org/project/SystemScripter/#description
When run from crontab, this won't print the pid.
Setup: add the line */1 * * * * sh dog.sh via crontab -e
import os
import re
CUT_OFF = 90
def get_cpu_load():
    cmd = "ps -Ao user,uid,comm,pid,pcpu --sort=-pcpu | head -n 2 | tail -1"
    response = os.popen(cmd, 'r').read()
    arr = re.findall(r'\S+', response)
    print(arr)
    needKill = float(arr[-1]) > CUT_OFF
    if needKill:
        r = os.popen(f"kill -9 {arr[-2]}")
        print('kill:', r)
if __name__ == '__main__':
    # Test CPU with
    # $ stress --cpu 1
    # crontab -e
    # Every 1 min
    # */1 * * * * sh dog.sh
    # ctrl o, ctrl x
    # crontab -l
    print(get_cpu_load())
Shell-out not needed for @CodeGench's solution, so assuming Linux and Python's standard libraries:
def cpu_load():
    with open("/proc/stat", "r") as stat:
        (key, user, nice, system, idle, _) = (stat.readline().split(None, 5))
    assert key == "cpu", "'cpu ...' should be the first line in /proc/stat"
    busy = int(user) + int(nice) + int(system)
    return 100 * busy / (busy + int(idle))
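The counters in /proc/stat are cumulative since boot, so the function above reports the average load since boot rather than the load right now. One way to get a current figure is to take two samples and compare them. A sketch under the same Linux-only assumption, with made-up helper names; like the code above it ignores iowait, irq and the other columns:

```python
import time


def split_busy_idle(fields):
    """Turn the split aggregate 'cpu' row into (busy, idle) tick counts."""
    assert fields[0] == "cpu", "'cpu ...' should be the first line in /proc/stat"
    user, nice, system, idle = (int(v) for v in fields[1:5])
    return user + nice + system, idle


def read_cpu_times(path="/proc/stat"):
    """Read the aggregate 'cpu' row from /proc/stat (Linux only)."""
    with open(path) as stat:
        return split_busy_idle(stat.readline().split())


def cpu_percent(before, after):
    """Busy share of the ticks elapsed between two (busy, idle) samples."""
    busy = after[0] - before[0]
    idle = after[1] - before[1]
    total = busy + idle
    return 100.0 * busy / total if total else 0.0


def current_cpu_load(interval=0.5):
    """Sample /proc/stat twice, `interval` seconds apart."""
    before = read_cpu_times()
    time.sleep(interval)
    return cpu_percent(before, read_cpu_times())


# Synthetic rows, so the arithmetic can be checked without /proc
first = split_busy_idle("cpu 100 0 50 850 0 0 0".split())
second = split_busy_idle("cpu 130 0 70 900 0 0 0".split())
print(cpu_percent(first, second))
```

Between the two synthetic samples, 50 busy ticks and 50 idle ticks elapse, so the result is 50.0. On a real machine, call current_cpu_load().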
I don't believe that there is a well-supported multi-platform library available. Remember that Python itself is written in C so any library is simply going to make a smart decision about which OS-specific code snippet to run, as you suggested above.
