I am trying to embed Python 2.6 into MATLAB (7.12) via a MEX file written in C. This worked fine for small, simple examples using scalars. However, if NumPy (1.6.1) is imported in any way, MATLAB crashes. I say "in any way" because I have tried a number of ways to load the NumPy libraries, including:
In the python module (.py):
from numpy import *
With PyRun_SimpleString in the mex file:
PyRun_SimpleString("from numpy import *");
Calling NumPy functions with PyObject_CallObject:
pOut = PyObject_CallObject(pFunc, pArgs);
Originally, I thought this might be a problem with embedding NumPy in C. However, NumPy works fine when embedded in simple C programs compiled from the command line with the /MD (multithreaded DLL runtime) switch using the Visual Studio 2005 C compiler. Next, I thought I would just change the MATLAB make file to include the /MD switch; no such luck, mexopts.bat already compiles with /MD. I also manually commented out lines in the NumPy init module to find what was crashing MATLAB: loading any file with the extension .pyd crashes MATLAB, and the first such file loaded by NumPy is multiarray.pyd.
The MATLAB documentation describes how to debug MEX files with Visual Studio, which I did; the error message is below. At this point I believe the problem is a memory conflict between the .pyd files and MATLAB. Interestingly, I can use a system command in MATLAB to kick off a Python process that uses NumPy, and no error is generated. Below is the error message from MATLAB, followed by the Visual Studio debug output for the process that crashes MATLAB (I am not pasting the whole thing because the list of first-chance exceptions is very long). Are there any suggestions for solving this integration problem?
MATLAB error
Matlab has encountered an internal problem and needs to close
MATLAB crash file:C:\Users\pml355\AppData\Local\Temp\matlab_crash_dump.3484-1:
------------------------------------------------------------------------
Segmentation violation detected at Tue Oct 18 12:19:03 2011
------------------------------------------------------------------------
Configuration:
Crash Decoding : Disabled
Default Encoding: windows-1252
MATLAB License : 163857
MATLAB Root : C:\Program Files\MATLAB\R2011a
MATLAB Version : 7.12.0.635 (R2011a)
Operating System: Microsoft Windows 7
Processor ID : x86 Family 6 Model 7 Stepping 10, GenuineIntel
Virtual Machine : Java 1.6.0_17-b04 with Sun Microsystems Inc. Java HotSpot(TM) Client VM mixed mode
Window System : Version 6.1 (Build 7600)
Fault Count: 1
Abnormal termination:
Segmentation violation
Register State (from fault):
EAX = 00000001 EBX = 69c38c20
ECX = 00000001 EDX = 24ae1da8
ESP = 0088af0c EBP = 0088af44
ESI = 69c38c20 EDI = 24ae1da0
EIP = 69b93d31 EFL = 00010202
CS = 0000001b DS = 00000023 SS = 00000023
ES = 00000023 FS = 0000003b GS = 00000000
Stack Trace (from fault):
[ 0] 0x69b93d31 C:/Python26/Lib/site-packages/numpy/core/multiarray.pyd+00081201 ( ???+000000 )
[ 1] 0x69bfead4 C:/Python26/Lib/site-packages/numpy/core/multiarray.pyd+00518868 ( ???+000000 )
[ 2] 0x69c08039 C:/Python26/Lib/site-packages/numpy/core/multiarray.pyd+00557113 ( ???+000000 )
[ 3] 0x08692b09 C:/Python26/python26.dll+00076553 ( PyEval_EvalFrameEx+007833 )
[ 4] 0x08690adf C:/Python26/python26.dll+00068319 ( PyEval_EvalCodeEx+002255 )
This error was detected while a MEX-file was running. If the MEX-file
is not an official MathWorks function, please examine its source code
for errors. Please consult the External Interfaces Guide for information
on debugging MEX-files.
If this problem is reproducible, please submit a Service Request via:
http://www.mathworks.com/support/contact_us/
A technical support engineer might contact you with further information.
Thank you for your help.
Output from the Visual Studio debugger
First-chance exception at 0x0c12c128 in MATLAB.exe: 0xC0000005: Access violation reading location 0x00000004.
First-chance exception at 0x0c12c128 in MATLAB.exe: 0xC0000005: Access violation reading location 0x00000004.
First-chance exception at 0x0c12c128 in MATLAB.exe: 0xC0000005: Access violation reading location 0x00000004.
First-chance exception at 0x751d9673 in MATLAB.exe: Microsoft C++ exception: jitCgFailedException at memory location 0x00c3e210..
First-chance exception at 0x751d9673 in MATLAB.exe: Microsoft C++ exception: jitCgFailedException at memory location 0x00c3e400..
First-chance exception at 0x69b93d31 in MATLAB.exe: 0xC0000005: Access violation writing location 0x00000001.
> throw_segv_longjmp_seh_filter()
throw_segv_longjmp_seh_filter(): invoking THROW_SEGV_LONGJMP SEH filter
> mnUnhandledWindowsExceptionFilter()
MATLAB.exe has triggered a breakpoint
Try to approach the problem from the Python side: Python is a great glue language, so I would suggest having Python run your MATLAB and C programs. Python has:
Numpy
PyLab
Matplotlib
IPython
Thus, the combination is a good alternative for almost any existing Matlab module.
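For example, a minimal sketch of that glue approach (the M-file name and the MATLAB command-line flags are assumptions here; the exact flags vary by MATLAB release, and newer releases also offer matlab -batch):

import subprocess

import numpy as np

# Do the array-heavy work in NumPy and hand the result to an existing
# MATLAB script (a hypothetical "process_data.m" in the current folder).
data = np.random.rand(1000, 3)
np.savetxt("data.txt", data)

subprocess.check_call(
    ["matlab", "-nosplash", "-nodesktop", "-r", "process_data; exit"]
)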
With MATLAB R2014b, the ability to call Python functions directly from M-code was added.
Related
I am using PyGraphviz to create a large number of graphs for different configurations of data. I have found that, no matter what information is put in the graph, the program crashes after drawing the 170th graph. No error messages are generated; the program just stops. Is there something that needs to be reset if I am drawing this many graphs?
I am running Python 3.7 on a Windows 10 machine, with PyGraphviz 1.5 and Graphviz 2.38:
import os

import pygraphviz

for graph_number in range(200):
    config_graph = pygraphviz.AGraph(strict=False, directed=False, compound=True,
                                     ranksep='0.2', nodesep='0.2')
    # Create Directory
    if not os.path.exists('Graph'):
        os.makedirs('Graph')
    # Draw Graph
    print('draw_' + str(graph_number))
    config_graph.layout(prog='dot')
    config_graph.draw('Graph/' + str(graph_number) + '.png')
I was able to consistently reproduce the behavior with:
Python 3.7.6 (pc064 (64bit), then also with pc032)
PyGraphviz 1.5 (which I built; available for download at [GitHub]: CristiFati/Prebuilt-Binaries - Various software built on various platforms, under PyGraphviz, naturally; you might also want to check [SO]: Installing pygraphviz on Windows 10 64-bit, Python 3.6 (#CristiFati's answer))
Graphviz 2.42.2 (pc032, same as #2)
I suspected undefined behavior somewhere in the code, even though the behavior was precisely the same:
OK for 169 graphs
Crash for 170
I did some debugging (added some print statements in agraph.py, and in cgraph.dll (write.c)).
PyGraphviz invokes Graphviz's tools (.exes) for many operations. For that, it uses subprocess.Popen and communicates with the child process via its 3 available streams (stdin, stdout, stderr).
From the beginning I noticed that 170 * 3 = 510 (awfully close to 512 (0x200)), but didn't pay as much attention as I should have until later (mostly because the Python process (running the code below) had no more than ~150 open handles in Task Manager (TM) and also Process Explorer (PE)).
However, a bit of Googling revealed:
[SO]: Is there a limit on number of open files in Windows (#stackprogrammer's answer) (and from here)
[MS.Learn]: _setmaxstdio (which states (emphasis is mine)):
C run-time I/O now supports up to 8,192 files open simultaneously at the low I/O level. This level includes files opened and accessed using the _open, _read, and _write family of I/O functions. By default, up to 512 files can be open simultaneously at the stream I/O level. This level includes files opened and accessed using the fopen, fgetc, and fputc family of functions. The limit of 512 open files at the stream I/O level can be increased to a maximum of 8,192 by use of the _setmaxstdio function.
[SO]: Python: Which command increases the number of open files on Windows? (#NorthCat's answer)
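As an aside, the same limit query/bump can be attempted without PyWin32, via ctypes. This is only a sketch: it assumes the classic msvcrt.dll loaded by ctypes.cdll.msvcrt exports _getmaxstdio/_setmaxstdio, and that CRT instance is not necessarily the one Graphviz's tools are linked against, so it may have no effect on the crash.

import ctypes

# Windows-only sketch: query and raise the C runtime's stream limit.
msvcrt = ctypes.cdll.msvcrt
print("old limit:", msvcrt._getmaxstdio())
msvcrt._setmaxstdio(2048)
print("new limit:", msvcrt._getmaxstdio())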
Below is your code, which I modified for debugging and for reproducing the error. For brevity it uses the PyWin32 package (python -m pip install pywin32); the same thing can be achieved via ctypes.
code00.py:
#!/usr/bin/env python

import os
import sys
#import time

import pygraphviz as pgv
import win32file as wfile


def handle_graph(idx, dir_name):
    graph_name = "draw_{:03d}".format(idx)
    graph_args = {
        "name": graph_name,
        "strict": False,
        "directed": False,
        "compound": True,
        "ranksep": "0.2",
        "nodesep": "0.2",
    }
    graph = pgv.AGraph(**graph_args)
    # Draw Graph
    img_base_name = graph_name + ".png"
    print(" {:s}".format(img_base_name))
    graph.layout(prog="dot")
    img_full_name = os.path.join(dir_name, img_base_name)
    graph.draw(img_full_name)
    graph.close()  # !!! Has NO (visible) effect, but I think it should be called anyway !!!


def main(*argv):
    print("OLD max open files: {:d}".format(wfile._getmaxstdio()))
    # 513 is enough for your original code (170 graphs), but you can set it up to 8192
    wfile._setmaxstdio(513)  # !!! COMMENT this line to reproduce the crash !!!
    print("NEW max open files: {:d}".format(wfile._getmaxstdio()))
    dir_name = "Graph"
    # Create Directory
    if not os.path.isdir(dir_name):
        os.makedirs(dir_name)
    #ts_global_start = time.time()
    start = 0
    count = 170
    #count = 1
    step_sleep = 0.05
    for i in range(start, start + count):
        #ts_local_start = time.time()
        handle_graph(i, dir_name)
        #print(" Time: {:.3f}".format(time.time() - ts_local_start))
        #time.sleep(step_sleep)
    handle_graph(count, dir_name)  # one extra graph beyond the loop (the +1 in the inequality from the Conclusions)
    #print("Global time: {:.3f}".format(time.time() - ts_global_start - step_sleep * count))


if __name__ == "__main__":
    print("Python {:s} {:03d}bit on {:s}\n".format(" ".join(elem.strip() for elem in sys.version.split("\n")),
                                                   64 if sys.maxsize > 0x100000000 else 32, sys.platform))
    rc = main(*sys.argv[1:])
    print("\nDone.\n")
    sys.exit(rc)
Output:
e:\Work\Dev\StackOverflow\q060876623> "e:\Work\Dev\VEnvs\py_pc064_03.07.06_test0\Scripts\python.exe" ./code00.py
Python 3.7.6 (tags/v3.7.6:43364a7ae0, Dec 19 2019, 00:42:30) [MSC v.1916 64 bit (AMD64)] 064bit on win32
OLD max open files: 512
NEW max open files: 513
draw_000.png
draw_001.png
draw_002.png
...
draw_167.png
draw_168.png
draw_169.png
Done.
Conclusions:
Apparently, some file handles (fds) remain open, although they are not "seen" by TM or PE (they are probably at a lower level). I don't know why this happens (is it an MS UCRT bug?), but as far as I am concerned, once a child process ends its streams should be closed; I just don't know how to force that (which would be the proper fix)
Also, the behavior (a crash) when attempting to write to (not open) an fd above the limit seems a bit strange
As a workaround, the maximum number of open fds can be increased. The inequality 3 * (graph_count + 1) <= max_fds gives an idea of the numbers: if you set the limit to 8192 (I didn't test this), you should be able to handle 2729 graphs (assuming no additional fds are opened by the code)
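A tiny helper for that inequality, purely for illustration:

def max_graph_count(max_fds):
    # Largest graph_count satisfying 3 * (graph_count + 1) <= max_fds
    return max_fds // 3 - 1

print(max_graph_count(513))   # 170  (enough for the code above)
print(max_graph_count(8192))  # 2729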
Side notes:
While investigating, I ran into or noticed several adjacent issues that I tried to fix:
Graphviz:
[GitLab]: graphviz/graphviz - [Issue #1481]: MSB4018 The NativeCodeAnalysis task failed unexpectedly. (merged on 200406)
PyGraphviz:
[GitHub]: pygraphviz/pygraphviz - AGraph Graphviz handle close mechanism (merged on 200720)
There's also an issue open for this behavior (probably the same author): [GitHub]: pygraphviz/pygraphviz - Pygraphviz crashes after drawing 170 graphs
I tried your code and it generated 200 graphs with no problem (I also tried with 2000).
My suggestion is to use these versions of the packages; I installed them in a conda environment on macOS with Python 3.7:
graphviz 2.40.1 hefbbd9a_2
pygraphviz 1.3 py37h1de35cc_1
I'm building TF from source and have no trouble including contrib in the Python build. However, I get a segfault when I try to access this module, with the following error:
error: _single_image_random_dot_stereograms.so debug map object file '/private/var/tmp/_bazel_mattmurphy/7ec540cd2482edb7e06749c20652a791/execroot/org_tensorflow/bazel-out/darwin-dbg/bin/tensorflow/contrib/image/_objs/python/ops/_single_image_random_dot_stereograms.so/tensorflow/contrib/image/kernels/single_image_random_dot_stereograms_ops.o' has changed (actual time is 2018-04-23 12:26:04.000000000, debug map time is 2018-04-21 20:47:03.000000000) since this executable was linked, file will be ignored
error: _single_image_random_dot_stereograms.so debug map object file '/private/var/tmp/_bazel_mattmurphy/7ec540cd2482edb7e06749c20652a791/execroot/org_tensorflow/bazel-out/darwin-dbg/bin/tensorflow/contrib/image/_objs/python/ops/_single_image_random_dot_stereograms.so/tensorflow/contrib/image/ops/single_image_random_dot_stereograms_ops.o' has changed (actual time is 2018-04-23 12:26:05.000000000, debug map time is 2018-04-21 20:47:02.000000000) since this executable was linked, file will be ignored
Process 58138 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x48)
frame #0: 0x0000000131682290 _single_image_random_dot_stereograms.so`google::protobuf::internal::Mutex::Lock(this=0x0000000000000048) at common.cc:376
373 }
374
375 void Mutex::Lock() {
-> 376 int result = pthread_mutex_lock(&mInternal->mutex);
377 if (result != 0) {
378 GOOGLE_LOG(FATAL) << "pthread_mutex_lock: " << strerror(result);
379 }
Target 0: (python) stopped.
It looks like the issue is protobuf related, but this has been difficult to diagnose.
I'm observing the same problem when compiling on MacOS 10.13.4 with Xcode 9.3.
The problem is that protobuf is statically linked into libtensorflow_framework.so, but also into _single_image_random_dot_stereograms.so and libforestprotos.so, which get loaded when contrib is imported.
Here is a relevant protobuf issue.
A comment in that issue says that the problem appears when compiling with Xcode 8.3 or later, so I assume that the official TensorFlow binary works because it is built using an older version.
As a workaround, I have locally removed the two occurrences of "@protobuf_archive//:protobuf" in /tensorflow/tensorflow/tensorflow/contrib/image/BUILD and /tensorflow/tensorflow/tensorflow/contrib/tensor_forest/BUILD.
This doesn't seem to break anything for my use case of doing local experiments in python.
I would like to use a script to call an executable program (a console/DOS-window program) and type some instructions into it to drive it; its output is not required.
For example, if I ran the program directly, I would double-click it and type the following:
load XXX.txt
oper
quit
Here's my code; for what it's worth, I do not have a deep understanding of subprocess.
import subprocess
import os
os.chdir('D:/Design/avl3.35/Avl/Runs')
Process=subprocess.Popen(['avl.exe'], stdin=subprocess.PIPE)
Process.communicate(b'load allegro.avl\n')
When I run this code, I get the following:
===================================================
Athena Vortex Lattice Program Version 3.35
Copyright (C) 2002 Mark Drela, Harold Youngren
This software comes with ABSOLUTELY NO WARRANTY,
subject to the GNU General Public License.
Caveat computor
===================================================
==========================================================
Quit Exit program
.OPER Compute operating-point run cases
.MODE Eigenvalue analysis of run cases
.TIME Time-domain calculations
LOAD f Read configuration input file
MASS f Read mass distribution file
CASE f Read run case file
CINI Clear and initialize run cases
MSET i Apply mass file data to stored run case(s)
.PLOP Plotting options
NAME s Specify new configuration name
AVL c>
Reading file: allegro.avl ...
Configuration: Allegro-lite 2M
Building surface: WING
Reading airfoil from file: ag35.dat
Reading airfoil from file: ag36.dat
Reading airfoil from file: ag37.dat
Reading airfoil from file: ag38.dat
At line 145 of file ../src/userio.f (unit = 5, file = 'stdin') #!!!!Error here!!
Fortran runtime error: End of file #!!!!Error here!!
Building duplicate image-surface: WING (YDUP)
Building surface: Horizontal tail
Building duplicate image-surface: Horizontal tail (YDUP)
Building surface: Vertical tail
Mach = 0.0000 (default)
Nbody = 0 Nsurf = 5 Nstrp = 64 Nvor = 410
Initializing run cases...
I have no idea what is wrong with this, nor why the error appears in the middle of the program's output. From searching I gather that the communicate method waits for the process to finish and returns all of the output; I do not need the output, but I still don't know what to do.
Could you explain what is happening here, and how I can accomplish what I want to do?
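For reference, a minimal sketch of the usual pattern with communicate(): it writes everything it is given to stdin, closes the pipe, and then waits for the program to exit, so a program that keeps prompting for input after that will hit end-of-file, which is consistent with the Fortran error above. The command sequence below is taken from the question and may need adjusting for AVL's menus; the working directory is an assumption carried over from the code above.

import subprocess

# Sketch only: all commands are passed in one call, because communicate()
# writes its argument to stdin, closes the pipe, and waits for the process
# to exit. The exact AVL command sequence may need adjusting.
commands = b"load allegro.avl\noper\nquit\n"

process = subprocess.Popen(
    ["avl.exe"],
    cwd="D:/Design/avl3.35/Avl/Runs",   # assumed working directory (from the code above)
    stdin=subprocess.PIPE,
)
process.communicate(commands)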
I am trying to do an incomplete Cholesky decomposition in Python, but I cannot find a Python package that provides it directly.
Since most of the code I can find online is written in Matlab, I want to take a detour by:
compiling the Matlab code to a shared library (I am using Mac OS and MATLAB_R2014a, so it should produce a .dylib file)
loading the library in Python using ctypes
The following lists the detailed steps:
0. Download Matlab Source Code
The code can be downloaded from F. Bach's webpage (link to zip file); the archive contains the following files:
panc:csi-1.0 panc25$ ls
center.m csi.dll csi.mexglx csi_gaussian.dll csi_gaussian.mexglx readme.txt
csi.c csi.m csi_gaussian.c csi_gaussian.m demo_csi.m sqdist.m
1. Compiling the matlab code to a shared library
Then, following this post, I ran the command:
mcc -v -W cpplib:libcsi -T link:lib csi
After around a minute, the terminal printed MEX completed successfully, and my folder now contains:
panc:csi-1.0 panc25$ ls
center.m csi.m csi_gaussian.dll demo_csi.m libcsi.exports readme.txt
csi.c csi.mexglx csi_gaussian.m libcsi.cpp libcsi.h sqdist.m
csi.dll csi_gaussian.c csi_gaussian.mexglx libcsi.dylib mccExcludedFiles.log
where libcsi.dylib is the shared library I want.
2. Loading library in Python
Then I opened IPython and tried to load the library:
In [1]: import ctypes
In [2]: ctypes.CDLL('libcsi.dylib')
---------------------------------------------------------------------------
OSError Traceback (most recent call last)
<ipython-input-2-b6d0c1a91651> in <module>()
----> 1 ctypes.CDLL('libcsi.dylib')
/Users/panc25/anaconda/lib/python2.7/ctypes/__init__.pyc in __init__(self, name, mode, handle, use_errno, use_last_error)
363
364 if handle is None:
--> 365 self._handle = _dlopen(self._name, mode)
366 else:
367 self._handle = handle
OSError: dlopen(libcsi.dylib, 6): Library not loaded: @rpath/libmwmclmcrrt.8.3.dylib
Referenced from: /Users/panc25/Downloads/csi-1.0/libcsi.dylib
Reason: image not found
This problem persists even after I replace the file name in ctypes.CDLL('libcsi.dylib') with the full path.
So I am confused: the shared library is there, so why does Python say "image not found"?
BTW
Since the source code also provides a C implementation (through mex.h), I also tried first creating a MEX file and then compiling it to a shared library as follows:
panc:csi-1.0 panc25$ mex csi.c
which created the csi.mexmaci64 file. Then according to this link, I called:
panc:csi-1.0 panc25$ mcc -B csharedlib:csi2 csi.mexmaci64
which produced csi2.dylib file.
But when I tried to load it in Python, I had the same error.
Could anyone let me know what is wrong?
I would avoid Matlab altogether, and instead use the Incomplete Cholesky Decomposition available in PyMC2:
from pymc.gp.incomplete_chol import ichol_full
The f2py-wrapped Fortran code, which was actually adapted from a MEX file, can be found here; you could use it independently of PyMC2 if need be.
If you are interested, you could also propose adding this function to SciPy (see this GitHub issue).
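If all that is needed is something quick for small, dense SPD matrices, a zero fill-in incomplete Cholesky (IC(0)) can also be written directly in NumPy. This is not the PyMC2 routine, just a minimal reference sketch (the function name is made up for the example); for large sparse problems you would want a proper sparse implementation.

import numpy as np

def icholesky0(A):
    # Zero fill-in incomplete Cholesky (IC(0)) for a dense SPD matrix:
    # returns a lower-triangular L with the sparsity pattern of tril(A)
    # such that L @ L.T approximates A.
    L = np.tril(np.asarray(A, dtype=float))
    n = L.shape[0]
    for k in range(n):
        L[k, k] = np.sqrt(L[k, k])
        for i in range(k + 1, n):
            if L[i, k] != 0.0:
                L[i, k] /= L[k, k]
        for j in range(k + 1, n):
            for i in range(j, n):
                if L[i, j] != 0.0:
                    L[i, j] -= L[i, k] * L[j, k]
    return L

# Sanity check: for this A the exact Cholesky factor happens to have no
# fill-in, so IC(0) reproduces A exactly.
A = np.array([[4.0, 2.0, 0.0],
              [2.0, 5.0, 1.0],
              [0.0, 1.0, 3.0]])
L = icholesky0(A)
print(np.allclose(L @ L.T, A))  # True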
I have a Python daemon running on a 64-bit Linux box. It is crashing. Not a friendly, straightforward to debug, Python exception stack trace sort of crash, either-- this is a segmentation fault. Linux's dmesg log has a succinct post-mortem:
python2.7[27509]: segfault at 7fe500000008 ip 00007fe56644a891 sp 00007fe54e1fa230 error 4 in libpython2.7.so.1.0[7fe566359000+193000]
python2.7[23517]: segfault at 7f5600000008 ip 00007f568bb45891 sp 00007f5678e55230 error 4 in libpython2.7.so.1.0[7f568ba54000+193000]
libpython2.7.so.1.0 on this system has symbols and I can run objdump -d to get an assembly language dump. So I'm curious to know which function is causing the segfault.
How can I decode one of these dmesg segfault notices and find the errant function? One line says "7fe566359000+193000" and the next says "7f568ba54000+193000". I'm guessing this means both segfaults come from the same location. 193000 = 0x2f1e8. I thought that 0x2f1e8 would lead to an instruction in the Python library assembly dump, but it didn't; 0x2f1e8 is well out of range of the disassembly.
That is the offset from the base address at which the library was loaded, so you should compare it with the load address of .text as reported by (for example) eu-readelf:
flame#saladin ~ % eu-readelf -S /usr/lib/libpython2.7.so
There are 25 section headers, starting at offset 0x1b1a80:
Section Headers:
[Nr] Name Type Addr Off Size ES Flags Lk Inf Al
[ 0] NULL 0000000000000000 00000000 00000000 0 0 0 0
[….]
[10] .text PROGBITS 000000000003f220 0003f220 000e02a0 0 AX 0 0 16
[….]
What you should be able to do is to use the address you got with the addr2line tool:
addr2line -e /usr/lib/libpython2.7.so 0x6e408
In this case I can't get the data because my copy of the library differs from yours so the address makes no sense.
Of course you still won't get a full backtrace unless you had a core file.
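As a worked example with the numbers from the question (a sketch; the library path must be the exact file named in the dmesg line): the offset to feed to addr2line is the faulting ip minus the library's load base.

# dmesg: "... ip 00007fe56644a891 ... in libpython2.7.so.1.0[7fe566359000+193000]"
ip = 0x7fe56644a891      # faulting instruction pointer
base = 0x7fe566359000    # address the library was loaded at
print(hex(ip - base))    # 0xf1891

# The second crash gives the same offset, so both faults hit the same code:
print(hex(0x7f568bb45891 - 0x7f568ba54000))  # 0xf1891

# Then, against the matching build of the library:
#   addr2line -f -e /usr/lib/libpython2.7.so.1.0 0xf1891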