I am new to Python. I am trying to run MATLAB from inside Python using the mlab package. I was following the guide on the website, and I entered this in the Python command line:
from mlab.releases import latest_release
The error I got was:
cannot import name find_available_releases
It seems that under matlabcom.py there was no find_available_releases function.
May I know if anyone knows how to resolve this? Thank you!
PS: I am using Windows 7, MATLAB 2012a and Python 2.7
I skimmed through the code, and I don't think all of the README file and its documentation match what's actually implemented. It appears to be mostly copied from the original mlabwrap project.
This is confusing because mlabwrap is implemented using a C extension module to interact with the MATLAB Engine API. However the mlab code seems to have replaced that part with a pure Python implementation as the MATLAB-bridge backend. It comes from "Dana Pe'er Lab" and it uses two different methods to interact with MATLAB depending on the platform (COM/ActiveX on Windows and pipes on Linux/Mac).
Now that we understand how the backend is implemented, we can start looking at the import error.
Note: the Linux/Mac part of the code tries to find the MATLAB executable in some hardcoded locations, and lets you choose between different versions.
However you are working on Windows, and the code doesn't really implement any way of picking between MATLAB releases for this platform (so all of the methods like discover_location and find_available_releases are useless on Windows). In the end, the COM object is created as:
self.client = win32com.client.Dispatch('matlab.application')
As explained here, the ProgID matlab.application is not version-specific, and will simply use whatever was registered as the default MATLAB COM server. We can explicitly specify what MATLAB version we want (assuming you have multiple installations), for instance matlab.application.8.3 will pick MATLAB R2014a.
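As a rough illustration of the two kinds of ProgID, here is a minimal sketch. The names matlab_progid and connect are hypothetical and not part of mlab; win32com is imported lazily so the helper itself runs on any platform.

```python
def matlab_progid(version=None):
    """Build the COM ProgID for MATLAB.

    version=None gives the version-independent ProgID; a string such as
    "8.3" gives the version-specific one (8.3 is R2014a).
    """
    base = "matlab.application"
    return base if version is None else "%s.%s" % (base, version)


def connect(version=None):
    """Create the COM object (Windows only, requires pywin32)."""
    import win32com.client  # imported lazily: only available on Windows
    return win32com.client.Dispatch(matlab_progid(version))


print(matlab_progid())       # matlab.application
print(matlab_progid("8.3"))  # matlab.application.8.3
```

Calling connect() with no argument corresponds to what the mlab code does; connect("8.3") would request a specific registered release.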
So to fix the code, IMO the easiest way would be to get rid of all that logic about multiple MATLAB versions (in the Windows part of the code), and just let it create the MATLAB COM object as is. I haven't attempted it, but I don't think it's too involved... Good luck!
EDIT:
I downloaded the module and I managed to get it to work on Windows (I'm using Python 2.7.6 and MATLAB R2014a). Here are the changes:
$ git diff
diff --git a/src/mlab/matlabcom.py b/src/mlab/matlabcom.py
index 93f075c..da1c6fa 100644
--- a/src/mlab/matlabcom.py
+++ b/src/mlab/matlabcom.py
@@ -21,6 +21,11 @@ except:
print 'win32com in missing, please install it'
raise
+def find_available_releases():
+ # report we have all versions
+ return [('R%d%s' % (y,v), '')
+ for y in range(2006,2015) for v in ('a','b')]
+
def discover_location(matlab_release):
pass
@@ -62,7 +67,7 @@ class MatlabCom(object):
"""
self._check_open()
try:
- self.eval('quit();')
+ pass #self.eval('quit();')
except:
pass
del self.client
diff --git a/src/mlab/mlabraw.py b/src/mlab/mlabraw.py
index 3471362..16e0e2b 100644
--- a/src/mlab/mlabraw.py
+++ b/src/mlab/mlabraw.py
@@ -42,6 +42,7 @@ def open():
if is_win:
ret = MatlabConnection()
ret.open()
+ return ret
else:
if settings.MATLAB_PATH != 'guess':
matlab_path = settings.MATLAB_PATH + '/bin/matlab'
diff --git a/src/mlab/releases.py b/src/mlab/releases.py
index d792b12..9d6cf5d 100644
--- a/src/mlab/releases.py
+++ b/src/mlab/releases.py
@@ -88,7 +88,7 @@ class MatlabVersions(dict):
# Make it a module
sys.modules['mlab.releases.' + matlab_release] = instance
sys.modules['matlab'] = instance
- return MlabWrap()
+ return instance
def pick_latest_release(self):
return get_latest_release(self._available_releases)
First I added the missing find_available_releases function. I made it report that all MATLAB versions are available (as I explained above, it doesn't really matter because of the way the COM object is created). An even better fix would be to detect the installed/registered MATLAB versions using the Windows registry (check the keys HKCR\Matlab.Application.X.Y and follow their CLSID in HKCR\CLSID). That way you can actually pick which version to run.
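The registry idea could be sketched roughly like this. It is an assumption-laden outline rather than code from mlab: the function names are hypothetical, it only collects the version numbers from the key names and does not follow the CLSID, and it is written for Python 3 (the module is called _winreg on Python 2).

```python
import re

# Matches version-specific ProgIDs such as "Matlab.Application.8.3".
_PROGID_RE = re.compile(r"^matlab\.application\.(\d+)\.(\d+)$", re.IGNORECASE)


def parse_matlab_progids(key_names):
    """Return sorted (major, minor) tuples found among registry key names."""
    versions = set()
    for name in key_names:
        match = _PROGID_RE.match(name)
        if match:
            versions.add((int(match.group(1)), int(match.group(2))))
    return sorted(versions)


def registered_matlab_versions():
    """Enumerate HKEY_CLASSES_ROOT and parse its subkey names (Windows only)."""
    import winreg  # named _winreg on Python 2
    names = []
    index = 0
    while True:
        try:
            names.append(winreg.EnumKey(winreg.HKEY_CLASSES_ROOT, index))
        except OSError:  # raised once there are no more subkeys
            break
        index += 1
    return parse_matlab_progids(names)


sample = ["Matlab.Application", "Matlab.Application.8.3",
          "Matlab.Application.7.14", "Word.Application.15"]
print(parse_matlab_progids(sample))  # [(7, 14), (8, 3)]
```

Keeping the parsing separate from the registry access makes the parsing part checkable on any platform.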
I also fixed two unrelated bugs (one where the author forgot the function return value, and the other unnecessarily creating the wrapper object twice).
Note: During testing, it might be faster NOT to start/shutdown a MATLAB instance each time the script is called. This is why I commented self.eval('quit();') in the close function. That way you can start MATLAB using matlab.exe -automation (do this only once), and then repeatedly re-use the session without shutting it down. Just kill the process when you're done :)
Here is a Python example to test the module (I also show a comparison against NumPy/SciPy/Matplotlib):
test_mlab.py
# could be anything from: latest_release, R2014b, ..., R2006a
# makes no difference :)
from mlab.releases import R2014a as matlab
# show MATLAB version
print "MATLAB version: ", matlab.version()
print matlab.matlabroot()
# compute SVD of a NumPy array
import numpy as np
A = np.random.rand(5, 5)
U, S, V = matlab.svd(A, nout=3)
print "S = \n", matlab.diag(S)
# compare MATLAB's SVD against Scipy's SVD
U, S, V = np.linalg.svd(A)
print S
# 3d plot in MATLAB
X, Y, Z = matlab.peaks(nout=3)
matlab.figure(1)
matlab.surf(X, Y, Z)
matlab.title('Peaks')
matlab.xlabel('X')
matlab.ylabel('Y')
matlab.zlabel('Z')
# compare against matplotlib surface plot
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.gca(projection='3d')
ax.plot_surface(X, Y, Z, rstride=1, cstride=1, cmap='jet')
ax.view_init(30.0, 232.5)
plt.title('Peaks')
plt.xlabel('X')
plt.ylabel('Y')
ax.set_zlabel('Z')
plt.show()
Here is the output I get:
C:\>python test_mlab.py
MATLAB version: 8.3.0.532 (R2014a)
C:\Program Files\MATLAB\R2014a
S =
[[ 2.41632007]
[ 0.78527851]
[ 0.44582117]
[ 0.29086795]
[ 0.00552422]]
[ 2.41632007 0.78527851 0.44582117 0.29086795 0.00552422]
EDIT2:
The above changes have been accepted and merged into mlab.
You are right that find_available_releases() is not implemented in matlabcom.py. There are two ways to work around this:
1. Check out the code on Linux and work on it there (but you are working on Windows!).
2. Change the code as below.
Add the following function to matlabcom.py, as in matlabpipe.py:
def find_available_releases():
    global _RELEASES
    if not _RELEASES:
        _RELEASES = list(_list_releases())
    return _RELEASES
If you look at the mlabraw.py file, the following code shows why I am saying this:
import sys
is_win = 'win' in sys.platform
if is_win:
    from matlabcom import MatlabCom as MatlabConnection
    from matlabcom import MatlabError as error
    from matlabcom import discover_location, find_available_releases
    from matlabcom import WindowsMatlabReleaseNotFound as MatlabReleaseNotFound
else:
    from matlabpipe import MatlabPipe as MatlabConnection
    from matlabpipe import MatlabError as error
    from matlabpipe import discover_location, find_available_releases
    from matlabpipe import UnixMatlabReleaseNotFound as MatlabReleaseNotFound
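A side note on the platform check in that snippet: a substring test for 'win' also matches 'darwin', the sys.platform value on macOS. The thread does not establish whether this caused problems in practice; the sketch below only shows the string behaviour and a stricter alternative, and is_windows is a hypothetical name.

```python
import sys


def is_windows(platform=None):
    """Return True only for Windows platform strings such as 'win32'."""
    platform = sys.platform if platform is None else platform
    return platform.startswith("win")


# Columns: platform string, substring test, prefix test.
for name in ("win32", "darwin", "linux"):
    print(name, "win" in name, is_windows(name))
```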
I am trying to print my Sympy-expression as a string ready to be used with Numpy. I just cannot figure out how to do it.
I found that there is sp.printing.pycode: https://docs.sympy.org/latest/_modules/sympy/printing/pycode.html
The web page states that "This module contains python code printers for plain python as well as NumPy & SciPy enabled code.", but I just cannot figure out how to get it to output the expression in NumPy format.
sp.printing.pycode(expr)
'math.cos((1/2)*alpha)*math.cos((1/2)*beta)'
That web page also contains class NumPyPrinter(PythonCodePrinter), but I do not know how to use it. def pycode(expr, **settings) just seems to use return PythonCodePrinter(settings).doprint(expr) by default all the time.
The definition of pycode is almost trivial:
def pycode(expr, **settings):
# docstring skipped
return PythonCodePrinter(settings).doprint(expr)
It should be straightforward to run NumPyPrinter().doprint(expr) instead. The problem is that sympy.printing re-exports the pycode function, which shadows the module with the same name. However, we can still import the class directly and use it:
import sympy as sy
from sympy.printing.pycode import NumPyPrinter
x = sy.Symbol('x')
y = x * sy.cos(x * sy.pi)
code = NumPyPrinter().doprint(y)
print(code)
# x*numpy.cos(numpy.pi*x)
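Applied to the expression from the question, this might look as follows. It is a sketch under some assumptions: the import fallback is there because newer SymPy releases keep the printer in sympy.printing.numpy, the sample input arrays are made up, and the exact text of the generated string can differ between SymPy versions.

```python
import numpy
import sympy as sp

try:  # newer SymPy releases keep the printer in sympy.printing.numpy
    from sympy.printing.numpy import NumPyPrinter
except ImportError:  # older releases, as used in the answer above
    from sympy.printing.pycode import NumPyPrinter

alpha, beta = sp.symbols("alpha beta")
expr = sp.cos(alpha / 2) * sp.cos(beta / 2)

code = NumPyPrinter().doprint(expr)
print(code)

# Evaluate the generated string with NumPy arrays bound to the symbol names.
namespace = {"numpy": numpy,
             "alpha": numpy.array([0.0, numpy.pi]),
             "beta": numpy.array([0.0, 0.0])}
values = eval(code, namespace)
print(values)
```

If the goal is a callable rather than a string, lambdify(..., modules='numpy') is the more usual route.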
My website runs this Python script that would be way more optimized if Cython is used. Recently I needed to add Sympy with Lambdify, and this is not going well with Cython.
So I stripped the problem to a minimum working example. In the code, I have a dictionary with string keys with values that are lists. I would like to use these keys as variables. In the following simplified example, there's only 1 variable, but generally I need more. Please check the following example:
import numpy as np
from sympy.parsing.sympy_parser import parse_expr
from sympy.utilities.lambdify import lambdify, implemented_function
from sympy import S, Symbol
from sympy.utilities.autowrap import ufuncify
def CreateMagneticFieldsList(dataToSave,equationString,DSList):
    expression = S(equationString)
    numOfElements = len(dataToSave["MagneticFields"])
    #initialize the magnetic field output array
    magFieldsArray = np.empty(numOfElements)
    magFieldsArray[:] = np.NaN
    lam_f = lambdify(tuple(DSList),expression,modules='numpy')
    try:
        # pass
        for i in range(numOfElements):
            replacementList = np.zeros(len(DSList))
            for j in range(len(DSList)):
                replacementList[j] = dataToSave[DSList[j]][i]
            try:
                val = np.double(lam_f(*replacementList))
            except:
                val = np.nan
            magFieldsArray[i] = val
    except:
        print("Error while evaluating the magnetic field expression")
    return magFieldsArray
list={"MagneticFields":[1,2,3,4,5]}
out=CreateMagneticFieldsList(list,"MagneticFields*5.1",["MagneticFields"])
print(out)
Let's call this test.py. This works very well. Now I would like to cythonize this, so I use the following script:
#!/bin/bash
cython --embed -o test.c test.py
gcc -pthread -fPIC -fwrapv -Ofast -Wall -L/lib/x86_64-linux-gnu/ -lpython3.4m -I/usr/include/python3.4 -o test.exe test.c
Now if I execute ./test.exe, it throws an exception! Here's the exception:
Traceback (most recent call last):
File "test.py", line 42, in init test (test.c:1811)
out=CreateMagneticFieldsList(list,"MagneticFields*5.1",["MagneticFields"])
File "test.py", line 19, in test.CreateMagneticFieldsList (test.c:1036)
lam_f = lambdify(tuple(DSList),expression,modules='numpy')
File "/usr/local/lib/python3.4/dist-packages/sympy/utilities/lambdify.py", line 363, in lambdify
callers_local_vars = inspect.currentframe().f_back.f_locals.items()
AttributeError: 'NoneType' object has no attribute 'f_locals'
So the question is: How can I get lambdify to work with Cython?
Notes: I would like to point out that I have Debian Jessie, and that's why I'm using Python 3.4. Also I would like to point out that I don't have any problem with Cython when not using lambdify. Also I would like to point out that Cython is updated to the last version with pip3 install cython --upgrade.
This is something of a workaround for the real problem (identified in the comments and @jjhakala's answer): Cython doesn't generate full tracebacks/introspection information for compiled functions. I gather from your comments that you'd like to keep most of your program compiled with Cython for speed reasons.
The "solution" is to use the Python interpreter for only the individual function that needs to call lambdify and leave the rest in Cython. You can do this using exec.
A very simple example of the idea is
exec("""
def f(func_to_call):
    return func_to_call()""")
# a Cython compiled version
def f2(func_to_call):
    return func_to_call()
This can be compiled as a Cython module and imported, and upon being imported the Python interpreter runs the code in the string, and correctly adds f to the module globals. If we create a pure Python function
import inspect
def g():
    return inspect.currentframe().f_back.f_locals
calling cython_module.f(g) gives me a dictionary with key func_to_call (as expected) while cython_module.f2(g) gives me the __main__ module globals (but this is because I'm running from an interpreter rather than using --embed).
Edit: Full example, based on your code
from sympy import S, lambdify # I'm assuming "S" comes from sympy
import numpy as np
CreateMagneticFieldsList = None # stops a compile error about CreateMagneticFieldsList being undefined
exec("""def CreateMagneticFieldsList(dataToSave,equationString,DSList):
    expression = S(equationString)
    numOfElements = len(dataToSave["MagneticFields"])
    #initialize the magnetic field output array
    magFieldsArray = np.empty(numOfElements)
    magFieldsArray[:] = np.NaN
    lam_f = lambdify(tuple(DSList),expression,modules='numpy')
    try:
        # pass
        for i in range(numOfElements):
            replacementList = np.zeros(len(DSList))
            for j in range(len(DSList)):
                replacementList[j] = dataToSave[DSList[j]][i]
            try:
                val = np.double(lam_f(*replacementList))
            except:
                val = np.nan
            magFieldsArray[i] = val
    except:
        print("Error while evaluating the magnetic field expression")
    return magFieldsArray""")
list={"MagneticFields":[1,2,3,4,5]}
out=CreateMagneticFieldsList(list,"MagneticFields*5.1",["MagneticFields"])
print(out)
When compiled with your script this prints
[ 5.1 10.2 15.3 20.4 25.5 ]
Essentially, all I've done is wrap your function in an exec statement so that it's executed by the Python interpreter. This part won't see any benefit from Cython, but the rest of your program still will. If you want to maximise the amount compiled with Cython, you could divide it up into multiple functions so that only the small part containing lambdify is in the exec.
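The idea of keeping only the frame-sensitive part interpreted can be sketched as follows. This is a hypothetical layout: call_with_real_frame is an invented name, and locals_of_caller is a plain-Python stand-in for a library function that, like lambdify, inspects its caller's frame. Using an explicit namespace dictionary for exec is just one option; the examples above rely on the module globals instead.

```python
import inspect


def locals_of_caller():
    """Stand-in for a library call that looks at its caller's frame."""
    frame = inspect.currentframe().f_back
    return dict(frame.f_locals)


# Only this tiny wrapper is defined through exec, so it always runs as
# ordinary interpreted Python and therefore has a real frame.
_namespace = {}
exec("""
def call_with_real_frame(func, marker):
    return func()
""", _namespace)
call_with_real_frame = _namespace["call_with_real_frame"]


def heavy_work(n):
    """Everything else can stay in regular (compilable) functions."""
    seen = call_with_real_frame(locals_of_caller, "from-wrapper")
    return sum(range(n)), seen


total, seen = heavy_work(5)
print(total, sorted(seen))  # 10 ['func', 'marker']
```

In the real program, the exec'd wrapper would be the one function that calls lambdify directly.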
The Cython docs state, under limitations, that
Stack frames
Currently we generate fake tracebacks as part of exception
propagation, but don’t fill in locals and can’t fill in co_code. To be
fully compatible, we would have to generate these stack frame objects
at function call time (with a potential performance penalty). We may
have an option to enable this for debugging.
f_locals in
AttributeError: 'NoneType' object has no attribute 'f_locals'
seems to point towards this incompatibility issue.
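To make the failing line concrete: lambdify reads its caller's local variables through inspect.currentframe().f_back, and the traceback shows f_back being None in the compiled program. The following plain-Python sketch mimics that lookup with a guard; caller_locals is a hypothetical helper, not SymPy code.

```python
import inspect


def caller_locals():
    """Return a copy of the caller's locals, or {} if no caller frame exists."""
    frame = inspect.currentframe()
    caller = frame.f_back if frame is not None else None
    if caller is None:  # the situation reported in the traceback
        return {}
    return dict(caller.f_locals)


def demo():
    x = 42
    return caller_locals()


print(demo())  # {'x': 42}
```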
This is my first attempt at using IPython.parallel so please bear with me.
I read this question
Parfor for Python
and am having trouble implementing a simple example as follows:
import gmpy2 as gm
import numpy as np
from IPython.parallel import Client
rc = Client()
lview = rc.load_balanced_view()
lview.block = True
a = 1
def L2(ii,jj):
out = []
out.append(gm.fac(ii+jj+a))
return out
Nloop = 100
ii = range(Nloop)
jj = range(Nloop)
R2 = lview.map(L2, zip(ii, jj))
The problems I have are:
1. a is defined outside the loop and I think I need to do something like "push", but I am a bit confused by that. Do I need to "pull" afterwards?
2. There are two arguments that are required for the function and I don't know how to pass them correctly. I tried things like zip(ii,jj) but got some errors.
Also, I assume the fact that I'm using an arbitrary library like gmpy2 shouldn't affect things. Is this correct? Do I need to do anything special for this?
Ideally I would like your help getting this simple example to run without errors.
If you think it would be beneficial to post my failed attempts at #2, let me know. I'm in the dark with #1.
I found two ways that make this work:
One is pushing the variable to the cores. There is no need to pull it. The variable will simply be defined in the namespace of each process-engine.
rc[:].push({'a':a})
R2 = lview.map(L2, ii, jj)
The other way is to redefine L2 to take a as an input and pass a list of a's to the map function:
def L2(ii,jj,a):
    out = []
    out.append(gm.fac(ii+jj+a))
    return out
R2 = lview.map(L2, ii, jj, [a]*Nloop)
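Both calls rely on map taking one iterable per function argument. As a local stand-in that needs no cluster, Python's built-in map pairs arguments the same way; math.factorial replaces gm.fac here purely for illustration, and the loop count is shortened.

```python
import math
from itertools import repeat


def L2(ii, jj, a):
    # Same shape as the function above, using math.factorial instead of gm.fac.
    return [math.factorial(ii + jj + a)]


a = 1
Nloop = 4
ii = range(Nloop)
jj = range(Nloop)

# One iterable per argument: the i-th call is L2(ii[i], jj[i], a).
R2 = list(map(L2, ii, jj, repeat(a, Nloop)))
print(R2)  # [[1], [6], [120], [5040]]
```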
With regard to the import, as per this website:
http://ipython.org/ipython-doc/dev/parallel/parallel_multiengine.html#non-blocking-execution
You simply import the required libraries in the function:
Note the import inside the function. This is a common model, to ensure
that the appropriate modules are imported where the task is run. You
can also manually import modules into the engine(s) namespace(s) via
view.execute('import numpy')().
Or you can do as per this link
http://ipython.org/ipython-doc/dev/parallel/parallel_multiengine.html#remote-imports
I am trying to calculate and generate plots using multiprocessing. On Linux the code below runs correctly, however on the Mac (ML) it doesn't, giving the error below:
import multiprocessing
import matplotlib.pyplot as plt
import numpy as np
import rpy2.robjects as robjects
def main():
    pool = multiprocessing.Pool()
    num_figs = 2
    # generate some random numbers
    input = zip(np.random.randint(10,1000,num_figs),
                range(num_figs))
    pool.map(plot, input)
def plot(args):
    num, i = args
    fig = plt.figure()
    data = np.random.randn(num).cumsum()
    plt.plot(data)
main()
The rpy2 version is 2.3.1 and R is 2.13.2 (I could not install R 3.0 and the latest rpy2 on any Mac without getting a segmentation fault).
The error is:
The process has forked and you cannot use this CoreFoundation functionality safely. You MUST exec().
Break on __THE_PROCESS_HAS_FORKED_AND_YOU_CANNOT_USE_THIS_COREFOUNDATION_FUNCTIONALITY___YOU_MUST_EXEC__() to debug.
The process has forked and you cannot use this CoreFoundation functionality safely. You MUST exec().
I have tried everything to understand what the problem is with no luck. My configuration is:
Danials-MacBook-Pro:~ danialt$ brew --config
HOMEBREW_VERSION: 0.9.4
ORIGIN: https://github.com/mxcl/homebrew
HEAD: 705b5e133d8334cae66710fac1c14ed8f8713d6b
HOMEBREW_PREFIX: /usr/local
HOMEBREW_CELLAR: /usr/local/Cellar
CPU: dual-core 64-bit penryn
OS X: 10.8.3-x86_64
Xcode: 4.6.2
CLT: 4.6.0.0.1.1365549073
GCC-4.2: build 5666
LLVM-GCC: build 2336
Clang: 4.2 build 425
X11: 2.7.4 => /opt/X11
System Ruby: 1.8.7-358
Perl: /usr/bin/perl
Python: /usr/local/bin/python => /usr/local/Cellar/python/2.7.4/Frameworks/Python.framework/Versions/2.7/bin/python2.7
Ruby: /usr/bin/ruby => /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/bin/ruby
Any ideas?
This error occurs on Mac OS X when you perform a GUI operation outside the main thread of the original process, which is exactly what you are doing by moving your plot function into the multiprocessing.Pool (I imagine that it will not work on Windows either, since Windows has the same requirement). The only way that I can imagine it working is to use the pool to generate the data, then have your main thread wait in a loop for the data that's returned (a queue is the way I usually handle it...).
Here is an example. It may not do exactly what you want (plot all the figures "simultaneously"?): plt.show() blocks, so only one figure is drawn at a time. I note that your sample code does not call it, but without it I don't see anything on my screen. If I take it out, there is no blocking and no error, because all GUI functions are happening in the main thread:
import multiprocessing
import matplotlib.pyplot as plt
import numpy as np
import rpy2.robjects as robjects
data_queue = multiprocessing.Queue()
def main():
    pool = multiprocessing.Pool()
    num_figs = 10
    # generate some random numbers
    input = zip(np.random.randint(10,10000,num_figs), range(num_figs))
    pool.map(worker, input)
    figs_complete = 0
    while figs_complete < num_figs:
        data = data_queue.get()
        plt.figure()
        plt.plot(data)
        plt.show()
        figs_complete += 1
def worker(args):
    num, i = args
    data = np.random.randn(num).cumsum()
    data_queue.put(data)
    print('done ',i)
main()
Hope this helps.
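A variation on the same idea, sketched under a few assumptions: instead of a queue, the workers hand their data back through the return value of pool.map, and the parent process does all the plotting. The names make_series and plot_all are hypothetical, and the standard random module stands in for NumPy to keep the sketch self-contained.

```python
import multiprocessing
import random


def make_series(args):
    """Worker: pure computation, no matplotlib calls."""
    num, index = args
    rng = random.Random(index)  # seeded per job so runs are repeatable
    total, series = 0.0, []
    for _ in range(num):
        total += rng.gauss(0.0, 1.0)
        series.append(total)
    return index, series


def plot_all(results):
    """Call this in the parent process only."""
    import matplotlib.pyplot as plt
    for index, series in results:
        plt.figure()
        plt.plot(series)
        plt.title("series %d" % index)
    plt.show()


if __name__ == "__main__":
    jobs = [(100 * (i + 1), i) for i in range(3)]
    with multiprocessing.Pool(2) as pool:
        results = pool.map(make_series, jobs)  # results come back in job order
    print([(index, len(series)) for index, series in results])
    # plot_all(results) would then draw every figure from the parent process.
```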
I had a similar issue with my worker, which was loading some data, generating a plot, and saving it to a file. Note that this is slightly different from the OP's case, which seems to be oriented around interactive plotting. Still, I think it's relevant.
A simplified version of my code:
def worker(id):
    data = load_data(id)
    plot_data_to_file(data) # Generates a plot and saves it to a file.
def plot_something_parallel(ids):
    pool = multiprocessing.Pool()
    pool.map(worker, ids)
plot_something_parallel(ids=[1,2,3])
This caused the same error others mention:
The process has forked and you cannot use this CoreFoundation functionality safely. You MUST exec().
Break on __THE_PROCESS_HAS_FORKED_AND_YOU_CANNOT_USE_THIS_COREFOUNDATION_FUNCTIONALITY___YOU_MUST_EXEC__() to debug.
Following @bbbruce's train of thought, I solved my problem by switching the matplotlib backend from TKAgg to the default. Specifically, I commented out the following line in my matplotlibrc file:
#backend : TkAgg
This might be rpy2-specific.
There are reports of a similar problem with OS X and multiprocessing here and there.
I think that using an initializer that imports the packages needed to run the code in plot could solve the problem (multiprocessing-doc).
I had a similar issue and found that setting the multiprocessing start method to forkserver works, as long as the call comes after your if __name__ == '__main__': statement.
if __name__ == '__main__':
    multiprocessing.set_start_method('forkserver')
    first_process = multiprocessing.Process(target = targetOne)
    second_process = multiprocessing.Process(target = targetTwo)
    first_process.start()
    second_process.start()
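Since forkserver is not available on every platform, a slightly more defensive variant might pick the start method at run time. This is only a sketch: pick_start_method is a hypothetical helper, and the built-in abs is used as the worker function simply to keep the example short.

```python
import multiprocessing


def pick_start_method(preferred=("forkserver", "spawn")):
    """Return the first preferred start method this platform supports."""
    available = multiprocessing.get_all_start_methods()
    for name in preferred:
        if name in available:
            return name
    return available[0]


if __name__ == "__main__":
    # get_context() leaves the global default untouched, whereas
    # set_start_method() raises RuntimeError if a method was already set.
    ctx = multiprocessing.get_context(pick_start_method())
    with ctx.Pool(2) as pool:
        print(pool.map(abs, [-1, 2, -3]))  # [1, 2, 3]
```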
Try to upgrade matplotlib to 3.0.3:
pip3 install matplotlib --upgrade
Then everything goes fine.
=======================================================================
No need to read below anymore.
Yesterday, my multiprocess plot worked on my MacBook Air. But this morning the same code did not work on my MacBook Pro, displaying many of these messages:
The process has forked and you cannot use this CoreFoundation functionality safely. You MUST exec().
Break on __THE_PROCESS_HAS_FORKED_AND_YOU_CANNOT_USE_THIS_COREFOUNDATION_FUNCTIONALITY___YOU_MUST_EXEC__() to debug.
The process has forked and you cannot use this CoreFoundation functionality safely. You MUST exec().
Both use a 4th-generation Intel Core CPU (an i5-4xxx in the Air and an i7-4xxx in the Pro). So if there is no difference in hardware, the difference must be in software.
So I updated matplotlib on the MacBook Pro to 3.0.3 (it was 3.0.1), and everything works.
Also, there is no need to use pool.apply_async anymore.
Receiving a segfault when running this very short script in Ubuntu.
from osgeo import ogr, osr
shpfile = 'Census_County_TIGER00_IN.shp'
def cust_field(field):
    '''cust_field(shpfile, field) creates a field definition, which, by calling cust_field(), can be used to create a field using the CreateField() function.
    cust_field() DOES NOT create a field -- it simply creates a "model" for a field, that can then be called later. It's weird, but that's GDAL/OGR, as far as I can tell.'''
    fieldDefn = ogr.FieldDefn(field, ogr.OFTInteger)
    fieldDefn.SetWidth(14)
    fieldDefn.SetPrecision(6)
    return fieldDefn
ds = ogr.Open(shpfile, 1)
lyr = ds.GetLayerByIndex(0)
field = cust_field("Test")
lyr.CreateField(field)
Everything runs smoothly until that last line, when IPython, the normal Python shell and the IDLE command line all crash with a segmentation fault. Is this an error on my end or an issue with the underlying C that I'm not addressing properly?
Is this an error on my end or an issue
with the underlying C that I'm not
addressing properly?
It is probably both. GDAL/OGR's bindings do tend to segfault occasionally, when objects go out of scope and are garbage collected. While this is a known bug, it is unlikely to be fixed any time soon.
Chances are you can find a way to work around this. I can't reproduce this segfault with another shapefile on Windows XP, and the following version of GDAL/OGR:
>>> gdal.VersionInfo('')
'GDAL 1.6.0, released 2008/12/04'
You could try temporarily to refactor the cust_field function into the body of the script like this:
from osgeo import ogr, osr
shpfile = 'Census_County_TIGER00_IN.shp'
ds = ogr.Open(shpfile, 1)
lyr = ds.GetLayerByIndex(0)
fieldDefn = ogr.FieldDefn("Test", ogr.OFTInteger)
fieldDefn.SetWidth(14)
fieldDefn.SetPrecision(6)
lyr.CreateField(fieldDefn)
Let me know if this solves your problem.