Below is a simple solution that works well on Windows for IPC with shared memory, without having to use networking / sockets (which have annoying limits on Windows).
The only problem is that it's not portable to Linux:
Avoiding the use of the tag parameter will assist in keeping your code portable between Unix and Windows.
Question: is there a simple way, built into Python, to get a shared-memory mmap without an explicit platform branch ("if Windows do this, if Linux do that")?
Something like
mm = sharedmemory(size=2_000_000_000, name="id1234")  # 2 GB; "id1234" is a global id available to all processes
mm.seek(1_000_000)
mm.write(b"hello")
that would internally default to mmap.mmap(..., tagname="id1234") on Windows and use /dev/shm on Linux (or maybe even a better solution that I don't know?), and probably something else on Mac, but without having to handle this manually for each different OS.
Working Windows-only solution:
# server
import mmap, time
mm = mmap.mmap(-1, 1_000_000_000, tagname="foo")
while True:
    mm.seek(500_000_000)
    mm.write(str(time.time()).encode())
    mm.flush()
    time.sleep(1)
# client
import mmap, time
mm = mmap.mmap(-1, 1_000_000_000, tagname="foo")
while True:
    mm.seek(500_000_000)
    print(mm.read(128))
    time.sleep(1)
The easiest way is to use Python >= 3.8: it added a built-in abstraction for shared memory that works on both Windows and Linux:
https://docs.python.org/3.10/library/multiprocessing.shared_memory.html
The code will look something like this:
Process #1:
from multiprocessing import shared_memory

# create=True creates a new shared memory block; if one with the same
# name already exists, an exception is thrown
shm_a = shared_memory.SharedMemory(name="example", create=True, size=10)
shm_a.buf[:3] = bytearray([1, 2, 3])
while True:
    do_smt()
shm_a.close()
Process #2:
from multiprocessing import shared_memory

# create=False (the default): attach to the existing block; its size is
# taken from the block itself
shm_a = shared_memory.SharedMemory(name="example")
print(bytes(shm_a.buf[:3]))
# b'\x01\x02\x03'
while True:
    do_smt()
shm_a.close()
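One detail the snippets above leave out: when all processes are completely done with the block, the creating side should also call unlink() so the segment is actually released (on Linux it otherwise lingers in /dev/shm). A minimal sketch:

# creator side, after every process has finished with the block
shm_a.close()    # detach this process's view of the memory
shm_a.unlink()   # free the underlying segment; call exactly once, from the creator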
Otherwise, I think there are no common good solutions and you will need to reinvent the wheel :)
Personally, the following have worked well for me:
Option 1: http://www.inspirel.com/yami4/
The YAMI4 suite for general computing is a multi-language and multi-platform package.
Operating systems: Microsoft Windows, POSIX (Linux, Mac OS X, FreeBSD, ...), QNX (with native IPC messaging), FreeRTOS, ThreadX, TI-RTOS.
Programming languages: C++, Ada, Java, .NET, Python, Wolfram.
Sample code is available on the project site.
Option 2: ZeroMQ https://zeromq.org/
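For a sense of what ZeroMQ-based IPC looks like from Python, here is a minimal request/reply sketch (it assumes the pyzmq bindings, pip install pyzmq; the endpoint name is arbitrary, and on Windows you would use a tcp://127.0.0.1:... endpoint since the ipc:// transport is Unix-only):

# server
import zmq
ctx = zmq.Context()
sock = ctx.socket(zmq.REP)
sock.bind("ipc:///tmp/example")  # e.g. "tcp://127.0.0.1:5555" on Windows
while True:
    msg = sock.recv()            # blocks until a request arrives
    sock.send(b"got: " + msg)

# client
import zmq
ctx = zmq.Context()
sock = ctx.socket(zmq.REQ)
sock.connect("ipc:///tmp/example")
sock.send(b"hello")
print(sock.recv())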
Most of NumPy's functions enable multithreading by default.
For example, I work on an 8-core Intel CPU workstation. If I run the script
import numpy as np
x = np.random.random(1000000)
for i in range(100000):
    np.sqrt(x)
the Linux top command shows ~800% CPU usage while it runs.
This means numpy automatically detects that my workstation has 8 cores, and np.sqrt automatically uses all 8 of them to accelerate the computation.
However, I found a weird bug. If I run the script
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.random((10, 10)))
df + df
x = np.random.random(1000000)
for i in range(100000):
    np.sqrt(x)
the CPU usage is only 100%!
In other words, if you add two pandas DataFrames before running any numpy function, numpy's automatic multithreading is gone, without any warning! This makes no sense: why should a pandas DataFrame calculation affect numpy's threading settings? Is it a bug? How can I work around it?
PS:
I dug further using the Linux perf tool. Profiling shows that both scripts involve libmkl_vml_avx2.so, while the first script additionally involves libiomp5.so, which seems to be related to OpenMP.
And since "vml" stands for Intel Vector Math Library, I guess, according to the VML docs, that at least functions like sqrt are automatically multithreaded.
Pandas uses numexpr under the hood to calculate some operations, and numexpr sets the maximal number of VML threads to 1 when it is imported:
# The default for VML is 1 thread (see #39)
set_vml_num_threads(1)
and it gets imported by pandas when df+df is evaluated in expressions.py:
from pandas.core.computation.check import _NUMEXPR_INSTALLED
if _NUMEXPR_INSTALLED:
    import numexpr as ne
However, the Anaconda distribution also uses VML functionality for functions such as sqrt, sin, cos and so on, and once numexpr has set the maximal number of VML threads to 1, the numpy functions no longer use parallelization.
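A quick way to observe this side effect without gdb is to time np.sqrt before and after importing numexpr (a rough sketch; the exact numbers depend on an MKL-backed install, and the second measurement should be noticeably slower on a multi-core machine):

import time
import numpy as np

x = np.random.random(1_000_000)

def bench(n=200):
    # time n repeated vectorized square roots
    t0 = time.perf_counter()
    for _ in range(n):
        np.sqrt(x)
    return time.perf_counter() - t0

print("before importing numexpr:", bench())
import numexpr  # merely importing it caps VML at 1 thread
print("after importing numexpr: ", bench())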
The problem can be easily seen in gdb (using your slow script):
$ gdb --args python slow.py
(gdb) b mkl_serv_domain_set_num_threads
function "mkl_serv_domain_set_num_threads" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (mkl_serv_domain_set_num_threads) pending.
(gdb) run
Thread 1 "python" hit Breakpoint 1, 0x00007fffee65cd70 in mkl_serv_domain_set_num_threads () from /home/ed/anaconda37/lib/python3.7/site-packages/numpy/../../../libmkl_intel_thread.so
(gdb) bt
#0 0x00007fffee65cd70 in mkl_serv_domain_set_num_threads () from /home/ed/anaconda37/lib/python3.7/site-packages/numpy/../../../libmkl_intel_thread.so
#1 0x00007fffe978026c in _set_vml_num_threads(_object*, _object*) () from /home/ed/anaconda37/lib/python3.7/site-packages/numexpr/interpreter.cpython-37m-x86_64-linux-gnu.so
#2 0x00005555556cd660 in _PyMethodDef_RawFastCallKeywords () at /tmp/build/80754af9/python_1553721932202/work/Objects/call.c:694
...
(gdb) print $rdi
$1 = 1
i.e. we can see that numexpr sets the number of threads to 1, which is later used when the VML sqrt function is called:
(gdb) b mkl_serv_domain_get_max_threads
Breakpoint 2 at 0x7fffee65a900
(gdb) c
Continuing.
Thread 1 "python" hit Breakpoint 2, 0x00007fffee65a900 in mkl_serv_domain_get_max_threads () from /home/ed/anaconda37/lib/python3.7/site-packages/numpy/../../../libmkl_intel_thread.so
(gdb) bt
#0 0x00007fffee65a900 in mkl_serv_domain_get_max_threads () from /home/ed/anaconda37/lib/python3.7/site-packages/numpy/../../../libmkl_intel_thread.so
#1 0x00007ffff01fcea9 in mkl_vml_serv_threader_d_1i_1o () from /home/ed/anaconda37/lib/python3.7/site-packages/numpy/../../../libmkl_intel_thread.so
#2 0x00007fffedf78563 in vdSqrt () from /home/ed/anaconda37/lib/python3.7/site-packages/numpy/../../../libmkl_intel_lp64.so
#3 0x00007ffff5ac04ac in trivial_two_operand_loop () from /home/ed/anaconda37/lib/python3.7/site-packages/numpy/core/_multiarray_umath.cpython-37m-x86_64-linux-gnu.so
So we can see that numpy uses VML's implementation of vdSqrt, which utilizes mkl_vml_serv_threader_d_1i_1o to decide whether the calculation should be done in parallel, and which looks up the number of threads:
(gdb) fin
Run till exit from #0 0x00007fffee65a900 in mkl_serv_domain_get_max_threads () from /home/ed/anaconda37/lib/python3.7/site-packages/numpy/../../../libmkl_intel_thread.so
0x00007ffff01fcea9 in mkl_vml_serv_threader_d_1i_1o () from /home/ed/anaconda37/lib/python3.7/site-packages/numpy/../../../libmkl_intel_thread.so
(gdb) print $rax
$2 = 1
The register %rax holds the maximal number of threads, and it is 1.
Now we can use numexpr to increase the number of VML threads, i.e.:
import numpy as np
import numexpr as ne
import pandas as pd
df = pd.DataFrame(np.random.random((10, 10)))
df + df

# HERE: reset the number of VML threads
ne.set_vml_num_threads(8)

x = np.random.random(1000000)
for i in range(10000):
    np.sqrt(x)  # now in parallel
Now multiple cores are utilized!
Looking at numpy, it seems that under the hood it has had on/off issues with multithreading, and depending on which version you use, you may start to see crashes when you bump up ne.set_vml_num_threads():
http://numpy-discussion.10968.n7.nabble.com/ANN-NumExpr-2-7-0-Release-td47414.html
I need to get my head around how this is glued into the Python interpreter, given your code example where it seems to somehow allow multiple apparently synchronous/ordered calls to np.sqrt() to proceed in parallel. I guess if the Python interpreter is always just returning a reference to an object when it pops the stack, and in your example it is just pitching those references away without assigning or manipulating them in any way, it would be fine. But if subsequent loop iterations depend on previous ones, then it seems less clear how these could be safely parallelized. Arguably, silent failure / wrong results is a worse outcome than crashes.
I think that your initial premise may be incorrect. You stated: "numpy automatically detects that my workstation has 8 cores, and np.sqrt automatically uses all 8 of them to accelerate the computation."
A single function like np.sqrt() has no way of guessing how it will next be invoked, or of returning before it has completed. There are parallelism mechanisms in Python, but none of them are automatic.
Now, having said that, the Python interpreter may be able to optimize the for loop for parallelism, which may be what you are seeing, but I strongly suspect that if you look at the wall-clock time for this loop to execute, it will be no different whether you are (apparently) using 8 cores or 1 core.
UPDATE: Having read a bit more of the comments it seems as though the multi-core behavior you are seeing is related to the anaconda distribution of the python interpreter. I took a look but was unable to find any source code for it, but it seems that the python license permits entities (like anaconda.com) to compile and distribute derivatives of the interpreter without requiring their changes to be published.
I guess that you can reach out to the anaconda folks - the behaviour you are seeing will be difficult to figure out without knowing what/if anything they've changed in the interpreter ..
Also, do a quick check of the wall-clock time with/without the optimization to see whether it is indeed ~8x faster; even if you've really got all 8 cores working instead of 1, it would be good to know whether the results are actually ~8x faster or whether there are spinlocks in use that still serialize everything on a single mutex.
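A rough sketch of such a wall-clock check (the sizes and iteration counts are just illustrative):

import time
import numpy as np
import numexpr as ne
import pandas as pd

df = pd.DataFrame(np.random.random((10, 10)))
df + df  # trigger the numexpr import and its VML side effect

x = np.random.random(1_000_000)

for n_threads in (1, 8):
    ne.set_vml_num_threads(n_threads)
    t0 = time.perf_counter()
    for _ in range(1_000):
        np.sqrt(x)
    print("{} VML thread(s): {:.2f} s".format(n_threads, time.perf_counter() - t0))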
I run conda 4.6.3 with Python 3.7.2 (win32). In Python, when I import numpy, I see the RAM usage increase by about 80 MB. Since I am using multiprocessing, I wonder if this is normal and whether there is any way to avoid this RAM overhead. Please see below all the versions of the relevant packages (from conda list):
python          3.7.2   h8c8aaf0_2
mkl_fft         1.0.10  py37h14836fe_0
mkl_random      1.0.2   py37h343c172_0
numpy           1.15.4  py37h19fb1c0_0
numpy-base      1.15.4  py37hc3f5095_0
thanks!
You can't avoid this cost, but it's likely not as bad as it seems. The numpy libraries (a copy of the C-only libopenblasp, plus all of numpy's Python extension modules) occupy over 60 MB on disk, and they're all going to be memory-mapped into your Python process on import; add in all the Python modules and the dynamically allocated memory involved in loading and initializing them, and 80 MB of increased reported RAM usage is pretty normal.
That said:
The C libraries and Python extension modules are memory-mapped in, but that doesn't actually mean they occupy "real" RAM; if the code paths in a given page aren't exercised, the page will either never be loaded or will be dropped under memory pressure (not even written to the page file, since the OS can always reload it from the original DLL).
On UNIX-like systems, when you fork (multiprocessing does this by default everywhere but Windows) that memory is shared between parent and worker processes in copy-on-write mode. Since the code itself is generally not written, the only cost is the page tables themselves (a tiny fraction of the memory they reference), and both parent and child will share that RAM.
Sadly, on Windows, fork isn't an option (unless you're running Ubuntu bash on Windows, in which case it's only barely Windows, effectively Linux), so you'll likely pay more of the memory cost in each process. Even there, though, libopenblasp, the C library backing large parts of numpy, will be mapped into each process, and the OS should properly share that read-only memory across processes (along with large parts, if not all, of the Python extension modules).
Basically, until this actually causes a problem (and it's unlikely to do so), don't worry about it.
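If you want to see the copy-on-write effect for yourself on a UNIX-like system, here is a rough sketch (it assumes psutil is installed; USS, the "unique set size", counts only pages not shared with other processes, so the numpy pages inherited from the parent barely show up in the child):

import multiprocessing as mp
import numpy as np  # pay the import cost once, in the parent
import psutil

def worker():
    # uss = pages unique to this process (shared numpy pages excluded)
    print("child USS (MiB):", psutil.Process().memory_full_info().uss // 2**20)

if __name__ == "__main__":
    print("parent RSS (MiB):", psutil.Process().memory_info().rss // 2**20)
    p = mp.get_context("fork").Process(target=worker)
    p.start()
    p.join()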
[NumPy]: NumPy is the fundamental package for scientific computing with Python.
It is a big package, designed to work with large datasets and optimized (primarily) for speed. If you look in its __init__.py (which gets executed when importing it (e.g.: import numpy)), you'll notice that it imports lots of items (packages / modules):
Those items themselves may import others
Some of them are extension modules (.pyds (.dlls) or .sos) which get loaded into the current process (their dependencies as well)
I've prepared a demo.
code.py:
#!/usr/bin/env python3

import sys
import os
import psutil
#import pprint


def main():
    display_text = "This {:s} screenshot was taken. Press <Enter> to continue ... "
    pid = os.getpid()
    print("Pid: {:d}\n".format(pid))
    p = psutil.Process(pid=pid)
    mod_names0 = set(k for k in sys.modules)
    mi0 = p.memory_info()
    input(display_text.format("first"))
    import numpy
    input(display_text.format("second"))
    mi1 = p.memory_info()
    for idx, mi in enumerate([mi0, mi1], start=1):
        print("\nMemory info ({:d}): {:}".format(idx, mi))
    print("\nExtra modules imported by `{:s}` :".format(numpy.__name__))
    print(sorted(set(k for k in sys.modules) - mod_names0))
    #pprint.pprint({k: v for k, v in sys.modules.items() if k not in mod_names0})
    print("\nDone.")


if __name__ == "__main__":
    print("Python {:s} on {:s}\n".format(sys.version, sys.platform))
    main()
Output:
[cfati#CFATI-5510-0:e:\Work\Dev\StackOverflow\q054675983]> "e:\Work\Dev\VEnvs\py_064_03.06.08_test0\Scripts\python.exe" code.py
Python 3.6.8 (tags/v3.6.8:3c6b436a57, Dec 24 2018, 00:16:47) [MSC v.1916 64 bit (AMD64)] on win32
Pid: 27160
This first screenshot was taken. Press <Enter> to continue ...
This second screenshot was taken. Press <Enter> to continue ...
Memory info (1): pmem(rss=15491072, vms=8458240, num_page_faults=4149, peak_wset=15495168, wset=15491072, peak_paged_pool=181160, paged_pool=180984, peak_nonpaged_pool=13720, nonpaged_pool=13576, pagefile=8458240, peak_pagefile=8458240, private=8458240)
Memory info (2): pmem(rss=27156480, vms=253882368, num_page_faults=7283, peak_wset=27205632, wset=27156480, peak_paged_pool=272160, paged_pool=272160, peak_nonpaged_pool=21640, nonpaged_pool=21056, pagefile=253882368, peak_pagefile=253972480, private=253882368)
Extra modules imported by `numpy` :
['_ast', '_bisect', '_blake2', '_compat_pickle', '_ctypes', '_decimal', '_hashlib', '_pickle', '_random', '_sha3', '_string', '_struct', 'argparse', 'ast', 'atexit', 'bisect', 'copy', 'ctypes', 'ctypes._endian', 'cython_runtime', 'decimal', 'difflib', 'gc', 'gettext', 'hashlib', 'logging', 'mtrand', 'numbers', 'numpy', 'numpy.__config__', 'numpy._distributor_init', 'numpy._globals', 'numpy._import_tools', 'numpy.add_newdocs', 'numpy.compat', 'numpy.compat._inspect', 'numpy.compat.py3k', 'numpy.core', 'numpy.core._internal', 'numpy.core._methods', 'numpy.core._multiarray_tests', 'numpy.core.arrayprint', 'numpy.core.defchararray', 'numpy.core.einsumfunc', 'numpy.core.fromnumeric', 'numpy.core.function_base', 'numpy.core.getlimits', 'numpy.core.info', 'numpy.core.machar', 'numpy.core.memmap', 'numpy.core.multiarray', 'numpy.core.numeric', 'numpy.core.numerictypes', 'numpy.core.records', 'numpy.core.shape_base', 'numpy.core.umath', 'numpy.ctypeslib', 'numpy.fft', 'numpy.fft.fftpack', 'numpy.fft.fftpack_lite', 'numpy.fft.helper', 'numpy.fft.info', 'numpy.lib', 'numpy.lib._datasource', 'numpy.lib._iotools', 'numpy.lib._version', 'numpy.lib.arraypad', 'numpy.lib.arraysetops', 'numpy.lib.arrayterator', 'numpy.lib.financial', 'numpy.lib.format', 'numpy.lib.function_base', 'numpy.lib.histograms', 'numpy.lib.index_tricks', 'numpy.lib.info', 'numpy.lib.mixins', 'numpy.lib.nanfunctions', 'numpy.lib.npyio', 'numpy.lib.polynomial', 'numpy.lib.scimath', 'numpy.lib.shape_base', 'numpy.lib.stride_tricks', 'numpy.lib.twodim_base', 'numpy.lib.type_check', 'numpy.lib.ufunclike', 'numpy.lib.utils', 'numpy.linalg', 'numpy.linalg._umath_linalg', 'numpy.linalg.info', 'numpy.linalg.lapack_lite', 'numpy.linalg.linalg', 'numpy.ma', 'numpy.ma.core', 'numpy.ma.extras', 'numpy.matrixlib', 'numpy.matrixlib.defmatrix', 'numpy.polynomial', 'numpy.polynomial._polybase', 'numpy.polynomial.chebyshev', 'numpy.polynomial.hermite', 'numpy.polynomial.hermite_e', 'numpy.polynomial.laguerre', 'numpy.polynomial.legendre', 'numpy.polynomial.polynomial', 'numpy.polynomial.polyutils', 'numpy.random', 'numpy.random.info', 'numpy.random.mtrand', 'numpy.testing', 'numpy.testing._private', 'numpy.testing._private.decorators', 'numpy.testing._private.nosetester', 'numpy.testing._private.pytesttester', 'numpy.testing._private.utils', 'numpy.version', 'pathlib', 'pickle', 'pprint', 'random', 'string', 'struct', 'tempfile', 'textwrap', 'unittest', 'unittest.case', 'unittest.loader', 'unittest.main', 'unittest.result', 'unittest.runner', 'unittest.signals', 'unittest.suite', 'unittest.util', 'urllib', 'urllib.parse']
Done.
And the before and after import screenshots, taken with [MS.Docs]: Process Explorer (not reproduced here).
As a personal remark, I think that ~80 MiB (or whatever the exact amount is) is more than decent for the current "era", which is characterized by ridiculously high amounts of hardware resources, especially where memory is concerned. Besides, that would probably be insignificant compared to the amount required by the arrays themselves. If that's not the case, you should probably consider moving away from numpy.
There could be a way to reduce the memory footprint, by selectively importing only the modules containing the features that you need (my personal advice is against it), and thus bypassing __init__.py:
You'd have to be an expert in numpy's internals
Modules must be imported "manually" (by file name), using [Python 3]: importlib - The implementation of import (or alternatives)
Their dependencies will be imported / loaded as well (and because of this, I don't know how much free memory you'd actually gain); an illustrative sketch follows this list
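To illustrate that importlib route (again, this is discouraged; the file path below is purely illustrative and varies per install and numpy version, and a module loaded this way may still fail if it expects its parent package to be initialized):

import importlib.util

# Load one extension module directly by file name, bypassing numpy/__init__.py
so_path = "/path/to/site-packages/numpy/core/_multiarray_umath.cpython-37m-x86_64-linux-gnu.so"
spec = importlib.util.spec_from_file_location("_multiarray_umath", so_path)
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)  # may raise if the module needs its package context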
I'm playing around with pygame, and one thing I'd like to do is reduce the number of frames per second when the computer is on battery power (to lower the CPU usage and extend battery life).
How can I detect, from Python, whether the computer is currently on battery power?
I'm using Python 3.1 on Windows.
If you want to do it without win32api, you can use the built-in ctypes module. I usually run CPython without win32api, so I kinda like these solutions.
It's a tiny bit more work for GetSystemPowerStatus() because you have to define the SYSTEM_POWER_STATUS structure, but not bad.
# Get the power status of the system using ctypes to call GetSystemPowerStatus
import ctypes
from ctypes import wintypes

class SYSTEM_POWER_STATUS(ctypes.Structure):
    _fields_ = [
        ('ACLineStatus', wintypes.BYTE),
        ('BatteryFlag', wintypes.BYTE),
        ('BatteryLifePercent', wintypes.BYTE),
        ('Reserved1', wintypes.BYTE),
        ('BatteryLifeTime', wintypes.DWORD),
        ('BatteryFullLifeTime', wintypes.DWORD),
    ]

SYSTEM_POWER_STATUS_P = ctypes.POINTER(SYSTEM_POWER_STATUS)

GetSystemPowerStatus = ctypes.windll.kernel32.GetSystemPowerStatus
GetSystemPowerStatus.argtypes = [SYSTEM_POWER_STATUS_P]
GetSystemPowerStatus.restype = wintypes.BOOL

status = SYSTEM_POWER_STATUS()
if not GetSystemPowerStatus(ctypes.pointer(status)):
    raise ctypes.WinError()

print('ACLineStatus', status.ACLineStatus)
print('BatteryFlag', status.BatteryFlag)
print('BatteryLifePercent', status.BatteryLifePercent)
print('BatteryLifeTime', status.BatteryLifeTime)
print('BatteryFullLifeTime', status.BatteryFullLifeTime)
On my system that prints this (basically meaning "desktop, plugged in"; the negative values are unsigned bytes read through ctypes' signed wintypes.BYTE, i.e. 128 and 255):
ACLineStatus 1
BatteryFlag -128
BatteryLifePercent -1
BatteryLifeTime 4294967295
BatteryFullLifeTime 4294967295
The most reliable way to retrieve this information in C is by using GetSystemPowerStatus. If no battery is present, BatteryFlag will be set to 128. psutil exposes this information under Linux, Windows and FreeBSD, so to check whether a battery is present you can do this:
>>> import psutil
>>> has_battery = psutil.sensors_battery() is not None
If a battery is present and you want to know whether the power cable is plugged in you can do this:
>>> import psutil
>>> psutil.sensors_battery()
sbattery(percent=99, secsleft=20308, power_plugged=True)
>>> psutil.sensors_battery().power_plugged
True
>>>
It is easy: all you have to do is call the Windows API function GetSystemPowerStatus from Python, probably by importing the win32api module.
EDIT: GetSystemPowerStatus() is not yet implemented in win32api as of build 219 (2014-05-04).
A simple method for cross-platform power status indication is the 'power' module, which you can install with pip:
import power

ans = power.PowerManagement().get_providing_power_source_type()
if not ans:
    print("plugged into wall socket")
else:
    print("on battery")
You can install acpi. From Wikipedia:
In a computer, the Advanced Configuration and Power Interface provides an open standard that operating systems can use to discover and configure computer hardware components, to perform power management by putting unused components to sleep, and to perform status monitoring.
Then use the subprocess module in Python:
import subprocess

cmd = 'acpi -b'

# for Python 3.7+
p = subprocess.run(cmd.split(), capture_output=True)
battery_info, error = p.stdout.decode(), p.stderr.decode()

# for Python 3.x (x < 7)
battery_info = subprocess.check_output(cmd.split()).decode('utf-8')

print(battery_info)
[SO]: In Python, how can I detect whether the computer is on battery power? (#BenHoyt's answer) is portable and doesn't require extra packages, but it's negatively impacted (until Python v3.12) by a CTypes (WinTypes) bug.
More details about the bug (and fix, workaround): [SO]: Why ctypes.wintypes.BYTE is signed, but native windows BYTE is unsigned? (#CristiFati's answer).
Anyway, I submitted [GitHub]: mhammond/pywin32 - Add GetSystemPowerStatus wrapper for GetSystemPowerStatus function to be available in Win32API.
Building win32api.pyd locally and overwriting the one from site-packages directory (as I mentioned in the Test section), yields:
[cfati#CFATI-5510-0:e:\Work\Dev\StackOverflow\q006153860]> sopr.bat
### Set shorter prompt to better fit when pasted in StackOverflow (or other) pages ###
[prompt]>
[prompt]> :: Power cable unplugged
[prompt]> "e:\Work\Dev\VEnvs\py_pc064_03.10_test1_pw32\Scripts\python.exe" -c "import win32api as wapi;from pprint import pprint as pp;pp(wapi.GetSystemPowerStatus(), sort_dicts=0);print(\"\nDone.\n\")"
{'ACLineStatus': 0,
'BatteryFlag': 1,
'BatteryLifePercent': 99,
'SystemStatusFlag': 0,
'BatteryLifeTime': 13094,
'BatteryFullLifeTime': 4294967295}
Done.
[prompt]>
[prompt]> :: Plug in power cable
[prompt]> "e:\Work\Dev\VEnvs\py_pc064_03.10_test1_pw32\Scripts\python.exe" -c "import win32api as wapi;from pprint import pprint as pp;pp(wapi.GetSystemPowerStatus(), sort_dicts=0);print(\"\nDone.\n\")"
{'ACLineStatus': 1,
'BatteryFlag': 1,
'BatteryLifePercent': 100,
'SystemStatusFlag': 0,
'BatteryLifeTime': 4294967295,
'BatteryFullLifeTime': 4294967295}
Done.
Check [SO]: How to change username of job in print queue using python & win32print (#CristiFati's answer) (at the end) for possible ways to benefit from the (above) patch.
Worth mentioning (if [SO]: In Python, how can I detect whether the computer is on battery power? (#GiampaoloRodolà's answer) is not clear enough about it) that [PyPI]: psutil also uses GetSystemPowerStatus in order to retrieve battery information.
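Tying the psutil snippets back to the original question, the "am I on battery?" check collapses to a couple of lines (sensors_battery() returns None on machines that expose no battery, so guard for that):

import psutil

batt = psutil.sensors_battery()
on_battery = batt is not None and not batt.power_plugged
print("on battery" if on_battery else "plugged in (or no battery present)")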
From within a Python application, how can I get the total amount of RAM of the system and how much of it is currently free, in a cross-platform way?
Ideally, the amount of free RAM should consider only physical memory that can actually be allocated to the Python process.
Have you tried SIGAR - System Information Gatherer And Reporter?
After installing it:
import os, sigar

sg = sigar.open()
mem = sg.mem()
sg.close()
print(mem.total() / 1024, mem.free() / 1024)
Hope this helps
psutil would be another good choice, though it too requires installing a third-party library:
>>> import psutil
>>> psutil.virtual_memory()
vmem(total=8374149120L, available=2081050624L, percent=75.1,
used=8074080256L, free=300068864L, active=3294920704,
inactive=1361616896, buffers=529895424L, cached=1251086336)
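For the question as asked, the two fields of interest are total and available; the latter is psutil's estimate of how much memory can actually be handed out to processes without swapping, which matches the "can actually be allocated" requirement:

import psutil

vm = psutil.virtual_memory()
print("total:    ", vm.total // 2**20, "MiB")
print("available:", vm.available // 2**20, "MiB")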
For the free memory part, there is a function in the wx library:
wx.GetFreeMemory()
Unfortunately, this only works on Windows. Linux and Mac ports either return "-1" or raise a NotImplementedError.
You can't do this with just the standard Python library, although there might be some third party package that does it. Barring that, you can use the os package to determine which operating system you're on and use that information to acquire the info you want for that system (and encapsulate that into a single cross-platform function).
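For instance, a minimal sketch of that per-OS dispatch (only the Linux branch is filled in here; the other branches are left as stubs):

import sys

def total_ram_bytes():
    if sys.platform.startswith("linux"):
        # /proc/meminfo reports sizes in kB
        with open("/proc/meminfo") as f:
            for line in f:
                if line.startswith("MemTotal:"):
                    return int(line.split()[1]) * 1024
    raise NotImplementedError("add a branch for " + sys.platform)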
On Windows I use this method. It's kinda hacky, but it works using only the standard os library:

import os

# wmic prints the capacity (in bytes) of each installed memory module
process = os.popen('wmic memorychip get capacity')
result = process.read()
process.close()

# the first token is the "Capacity" header; the rest are one number per module
totalMem = 0
for m in result.split()[1:]:
    totalMem += int(m)

print(totalMem // (1024 ** 3))