Tensorflow build from source, cannot include contrib - python

I'm building TF from source and have no trouble including contrib in python. I get a segfault when I try to access this module with the following error:
error: _single_image_random_dot_stereograms.so debug map object file '/private/var/tmp/_bazel_mattmurphy/7ec540cd2482edb7e06749c20652a791/execroot/org_tensorflow/bazel-out/darwin-dbg/bin/tensorflow/contrib/image/_objs/python/ops/_single_image_random_dot_stereograms.so/tensorflow/contrib/image/kernels/single_image_random_dot_stereograms_ops.o' has changed (actual time is 2018-04-23 12:26:04.000000000, debug map time is 2018-04-21 20:47:03.000000000) since this executable was linked, file will be ignored
error: _single_image_random_dot_stereograms.so debug map object file '/private/var/tmp/_bazel_mattmurphy/7ec540cd2482edb7e06749c20652a791/execroot/org_tensorflow/bazel-out/darwin-dbg/bin/tensorflow/contrib/image/_objs/python/ops/_single_image_random_dot_stereograms.so/tensorflow/contrib/image/ops/single_image_random_dot_stereograms_ops.o' has changed (actual time is 2018-04-23 12:26:05.000000000, debug map time is 2018-04-21 20:47:02.000000000) since this executable was linked, file will be ignored
Process 58138 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x48)
frame #0: 0x0000000131682290 _single_image_random_dot_stereograms.so`google::protobuf::internal::Mutex::Lock(this=0x0000000000000048) at common.cc:376
373 }
374
375 void Mutex::Lock() {
-> 376 int result = pthread_mutex_lock(&mInternal->mutex);
377 if (result != 0) {
378 GOOGLE_LOG(FATAL) << "pthread_mutex_lock: " << strerror(result);
379 }
Target 0: (python) stopped.
It looks like the issue is protobuf related, but this has been difficult to diagnose.

I'm observing the same problem when compiling on MacOS 10.13.4 with Xcode 9.3.
The problem is that protobuf is statically linked into libtensorflow_framework.so, but also into _single_image_random_dot_stereograms.so and libforestprotos.so, which get loaded when contrib is imported.
Here is a relevant protobuf issue.
A comment in that issue says that the problem appears when compiling with Xcode 8.3 or later, so I assume that the official tensorflow binary works, because it is built using an older version.
As a workaround, I have locally removed the two occurrences of "@protobuf_archive//:protobuf" in /tensorflow/tensorflow/tensorflow/contrib/image/BUILD and /tensorflow/tensorflow/tensorflow/contrib/tensor_forest/BUILD.
This doesn't seem to break anything for my use case of doing local experiments in python.

module 'jax' has no attribute 'tree_multimap' in AlphaFold2 CoLab

I am attempting to model a protein using an AlphaFold2 (AlphaFold v2.1.0.) CoLab (https://colab.research.google.com/github/deepmind/alphafold/blob/main/notebooks/AlphaFold.ipynb#scrollTo=pc5-mbsX9PZC).
I have done this successfully on 9/2/2022. However I have repeatedly had issues since 9/7/2022 doing the modelling with a different peptide sequence.
I get the following warning when I run the search against the genetic databases:
/opt/conda/lib/python3.7/site-packages/haiku/_src/data_structures.py:37: FutureWarning: jax.tree_structure is deprecated, and will be removed in a future release. Use jax.tree_util.tree_structure instead.
PyTreeDef = type(jax.tree_structure(None))
I then get several other future warnings when I run AlphaFold2 about other jax.tree_ deprecations.
The problem with AlphaFold running seems to be related to this:
AttributeError: module 'jax' has no attribute 'tree_multimap'
I have tried substituting jax.tree_util.tree_structure with no success.
I see another question on stackoverflow that is similar (AttributeError: module 'jaxlib.xla_extension' has no attribute 'PmapFunction'), however I do not know how best to implement the solution in the CoLab environment.
How should I fix this issue so that AlphaFold2 will run properly?
Traceback shown below:
44 processed_feature_dict = model_runner.process_features(np_example, random_seed=0)
---> 45 prediction = model_runner.predict(processed_feature_dict, random_seed=random.randrange(sys.maxsize))
/opt/conda/lib/python3.7/site-packages/haiku/_src/stateful.py in difference(before, after)
310 params_before, params_after = box_and_fill_missing(before.params,
311 after.params)
--> 312 params_after = jax.tree_multimap(functools.partial(if_changed, is_new_param),
313 params_before, params_after)
jax.tree_multimap was deprecated in JAX version 0.3.5, and removed in JAX version 0.3.16.
You can either change the source to use jax.tree_map as a drop-in replacement for jax.tree_multimap, or install an older version of JAX, e.g.:
!pip install "jax<0.3.16" "jaxlib<0.3.16"
And then be sure to restart your runtime to pick up the new version.
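If pinning an older JAX is not an option, a third possibility is to restore the missing name as an alias before the failing code runs. The following is only a sketch: the helper name add_tree_multimap_alias is made up for illustration, and a small stand-in object is used in place of the real jax module so the snippet runs without JAX installed. It assumes jax.tree_util.tree_map accepts the same arguments that jax.tree_multimap did.

```python
import types


def add_tree_multimap_alias(jax_module):
    """Add tree_multimap as an alias of tree_util.tree_map if it is missing.

    Hypothetical helper, not part of JAX or AlphaFold.
    """
    if not hasattr(jax_module, "tree_multimap"):
        jax_module.tree_multimap = jax_module.tree_util.tree_map
    return jax_module


def _fake_tree_map(f, *trees):
    # Stand-in for jax.tree_util.tree_map that only handles flat lists.
    return [f(*leaves) for leaves in zip(*trees)]


# Stand-in for a JAX module that no longer provides tree_multimap.
fake_jax = types.SimpleNamespace(
    tree_util=types.SimpleNamespace(tree_map=_fake_tree_map)
)

add_tree_multimap_alias(fake_jax)
summed = fake_jax.tree_multimap(lambda a, b: a + b, [1, 2], [10, 20])
print(summed)
```

In a notebook the real call would be add_tree_multimap_alias(jax) after import jax, placed in a cell that runs before the prediction step.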

Pygraphviz crashes after drawing 170 graphs

I am using pygraphviz to create a large number of graphs for different configurations of data. I have found that no matter what information is put in the graph, the program will crash after drawing the 170th graph. No error messages are generated; the program just stops. Is there something that needs to be reset when drawing this many graphs?
I am running Python 3.7 on a Windows 10 machine, Pygraphviz 1.5, and graphviz 2.38
import os
import pygraphviz

for graph_number in range(200):
    config_graph = pygraphviz.AGraph(strict=False, directed=False, compound=True, ranksep='0.2', nodesep='0.2')
    # Create Directory
    if not os.path.exists('Graph'):
        os.makedirs('Graph')
    # Draw Graph
    print('draw_' + str(graph_number))
    config_graph.layout(prog='dot')
    config_graph.draw('Graph/' + str(graph_number) + '.png')
I was able to consistently reproduce the behavior with:
1. Python 3.7.6 (pc064 (64bit), then also with pc032)
2. PyGraphviz 1.5 (that I built; available for download at [GitHub]: CristiFati/Prebuilt-Binaries - Various software built on various platforms (under PyGraphviz, naturally). Might also want to check [SO]: Installing pygraphviz on Windows 10 64-bit, Python 3.6 (@CristiFati's answer))
3. Graphviz 2.42.2 ((pc032) same as #2.)
I suspected Undefined Behavior somewhere in the code, even though the behavior was precisely the same every time:
OK for 169 graphs
Crash for 170
Did some debugging (added some print(f) statements in agraph.py, and cgraph.dll (write.c)).
PyGraphviz invokes Graphviz's tools (.exes) for many operations. For that, it uses subprocess.Popen and communicates with the child process via its 3 available streams (stdin, stdout, stderr).
From the beginning I noticed that 170 * 3 = 510 (awfully close to 512 (0x200)), but didn't pay as much attention as I should have until later (mostly because the Python process (running the code below) had no more than ~150 open handles in Task Manager (TM) and also Process Explorer (PE)).
However, a bit of Googling revealed:
[SO]: Is there a limit on number of open files in Windows (@stackprogrammer's answer) (and from here)
[MS.Learn]: _setmaxstdio (which states (emphasis is mine)):
C run-time I/O now supports up to 8,192 files open simultaneously at the low I/O level. This level includes files opened and accessed using the _open, _read, and _write family of I/O functions. By default, up to 512 files can be open simultaneously at the stream I/O level. This level includes files opened and accessed using the fopen, fgetc, and fputc family of functions. The limit of 512 open files at the stream I/O level can be increased to a maximum of 8,192 by use of the _setmaxstdio function.
[SO]: Python: Which command increases the number of open files on Windows? (@NorthCat's answer)
Below is your code that I modified for debugging and reproducing the error. It needs (for code shortness' sake, as same thing can be achieved via CTypes) the PyWin32 package (python -m pip install pywin32).
code00.py:
#!/usr/bin/env python

import os
import sys
#import time

import pygraphviz as pgv
import win32file as wfile


def handle_graph(idx, dir_name):
    graph_name = "draw_{:03d}".format(idx)
    graph_args = {
        "name": graph_name,
        "strict": False,
        "directed": False,
        "compound": True,
        "ranksep": "0.2",
        "nodesep": "0.2",
    }
    graph = pgv.AGraph(**graph_args)
    # Draw Graph
    img_base_name = graph_name + ".png"
    print(" {:s}".format(img_base_name))
    graph.layout(prog="dot")
    img_full_name = os.path.join(dir_name, img_base_name)
    graph.draw(img_full_name)
    graph.close()  # !!! Has NO (visible) effect, but I think it should be called anyway !!!


def main(*argv):
    print("OLD max open files: {:d}".format(wfile._getmaxstdio()))
    # 513 is enough for your original code (170 graphs), but you can set it up to 8192
    wfile._setmaxstdio(513)  # !!! COMMENT this line to reproduce the crash !!!
    print("NEW max open files: {:d}".format(wfile._getmaxstdio()))

    dir_name = "Graph"
    # Create Directory
    if not os.path.isdir(dir_name):
        os.makedirs(dir_name)

    #ts_global_start = time.time()
    start = 0
    count = 170
    #count = 1
    step_sleep = 0.05
    for i in range(start, start + count):
        #ts_local_start = time.time()
        handle_graph(i, dir_name)
        #print(" Time: {:.3f}".format(time.time() - ts_local_start))
        #time.sleep(step_sleep)
    handle_graph(count, dir_name)
    #print("Global time: {:.3f}".format(time.time() - ts_global_start - step_sleep * count))


if __name__ == "__main__":
    print("Python {:s} {:03d}bit on {:s}\n".format(" ".join(elem.strip() for elem in sys.version.split("\n")),
                                                   64 if sys.maxsize > 0x100000000 else 32, sys.platform))
    rc = main(*sys.argv[1:])
    print("\nDone.\n")
    sys.exit(rc)
Output:
e:\Work\Dev\StackOverflow\q060876623> "e:\Work\Dev\VEnvs\py_pc064_03.07.06_test0\Scripts\python.exe" ./code00.py
Python 3.7.6 (tags/v3.7.6:43364a7ae0, Dec 19 2019, 00:42:30) [MSC v.1916 64 bit (AMD64)] 064bit on win32
OLD max open files: 512
NEW max open files: 513
draw_000.png
draw_001.png
draw_002.png
...
draw_167.png
draw_168.png
draw_169.png
Done.
Conclusions:
Apparently, some file handles (fds) are open, although they are not "seen" by TM or PE (probably they are on a lower level). I don't know why this happens (is it a MS UCRT bug?). As far as I am concerned, once a child process ends, its streams should be closed, but I don't know how to force that (this would be the proper fix)
Also, the behavior (crash) when attempting to write (not open) to a fd above the limit seems a bit strange
As a workaround, the max open fds number can be increased. Based on the following inequality: 3 * (graph_count + 1) <= max_fds, you can get an idea about the numbers. From there, if you set the limit to 8192 (I didn't test this) you should be able to handle 2729 graphs (assuming that there are no additional fds opened by the code)
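To make the arithmetic behind that inequality concrete, here is a small sketch that solves it for graph_count. The function name max_graphs is just for illustration; the factor 3 is the number of streams (stdin, stdout, stderr) per child process mentioned above.

```python
def max_graphs(max_fds, streams_per_graph=3):
    """Largest graph_count with streams_per_graph * (graph_count + 1) <= max_fds."""
    return max_fds // streams_per_graph - 1


default_limit = max_graphs(512)    # default stream I/O limit
patched_limit = max_graphs(513)    # value used in code00.py
raised_limit = max_graphs(8192)    # documented maximum for _setmaxstdio

print(default_limit, patched_limit, raised_limit)
```

With the default of 512 this gives 169, which matches the "OK for 169 graphs" observation, and 8192 gives the 2729 quoted above.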
Side notes:
While investigating, I ran into or noticed several adjacent issues, that I tried to fix:
Graphviz:
[GitLab]: graphviz/graphviz - [Issue #1481]: MSB4018 The NativeCodeAnalysis task failed unexpectedly. (merged on 200406)
PyGraphviz:
[GitHub]: pygraphviz/pygraphviz - AGraph Graphviz handle close mechanism (merged on 200720)
There's also an issue open for this behavior (probably the same author): [GitHub]: pygraphviz/pygraphviz - Pygraphviz crashes after drawing 170 graphs
I tried your code and it generated 200 graphs with no problem (I also tried with 2000).
My suggestion is to use these versions of the packages. I installed them in a conda environment on macOS with Python 3.7:
graphviz 2.40.1 hefbbd9a_2
pygraphviz 1.3 py37h1de35cc_1

Yocto recipe written in python giving me an error when trying to build with Bitbake

It's the first time I have come across a recipe file written in Python, and it's giving me an error. The error is:
../meta-intel/recipes-rt/images/core-image-rt.bb: Error executing a python function in <code>:
This is a recipe which is coming from the meta-intel branch "[master] intel-vaapi-driver: 2.1.0 -> 2.2.0".
My poky version is: "[morty] documentation: Updated manual revision table for 2.2.4 release date."
My BitBake version is: "BitBake Build Tool Core version 1.32.0"
The contents of core-image-rt.bb are:
require recipes-core/images/core-image-minimal.bb
# Skip processing of this recipe if linux-intel-rt is not explicitly specified as the
# PREFERRED_PROVIDER for virtual/kernel. This avoids errors when trying
# to build multiple virtual/kernel providers.
python () {
    if d.getVar("PREFERRED_PROVIDER_virtual/kernel") != "linux-intel-rt":
        raise bb.parse.SkipPackage("Set PREFERRED_PROVIDER_virtual/kernel to linux-intel-rt to enable it")
}
DESCRIPTION = "A small image just capable of allowing a device to boot plus a \
real-time test suite and tools appropriate for real-time use."
DEPENDS += "linux-intel-rt"
IMAGE_INSTALL += "rt-tests hwlatdetect"
LICENSE = "MIT"
If you need any additional information, please let me know and I'll try to supply it.
I can normally build images on my Ubuntu machine, but I don't believe I have ever had to build an image in which the recipes were written in Python.
You are calling the d.getVar method with an API that is incompatible with your release. Morty is the last release that uses the old signature, so you still need to pass the boolean second parameter:
...
if d.getVar("PREFERRED_PROVIDER_virtual/kernel", True) != "linux-intel-rt":
...
Please take a look at one of the commits that removed this requirement in later releases.
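The failure mode can be illustrated outside BitBake with a stand-in class. This is a sketch only: MortyStyleDataStore is a hypothetical substitute for BitBake's datastore object, and it assumes (as described above) that the morty-era getVar takes the expand flag as a required positional argument.

```python
class MortyStyleDataStore:
    """Hypothetical stand-in for the datastore `d`, using the old getVar signature."""

    def __init__(self, values):
        self._values = dict(values)

    def getVar(self, var, expand):
        # 'expand' has no default value here, mirroring the assumed morty behaviour.
        return self._values.get(var)


d = MortyStyleDataStore({"PREFERRED_PROVIDER_virtual/kernel": "linux-intel-rt"})

# Newer-style call with one argument: fails against the old signature.
try:
    d.getVar("PREFERRED_PROVIDER_virtual/kernel")
    one_arg_call_failed = False
except TypeError:
    one_arg_call_failed = True

# Old-style call with the explicit boolean: works.
provider = d.getVar("PREFERRED_PROVIDER_virtual/kernel", True)
print(one_arg_call_failed, provider)
```

Passing True explicitly is the safer choice when a layer has to be parsed by an older BitBake.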

GDAL reprojection error: in method 'Geometry_Transform', argument 2 of type 'OSRCoordinateTransformationShadow *'

Using Python 2.7.9 with GDAL 1.11.1, with miniconda for package management --
Performing a simple reprojection of a coordinate point causes the error described below.
I am relatively new to GDAL, so I checked to see if the code from the Python GDAL/OGR 1.0 Cookbook produces the same issue, and it does:
from osgeo import ogr
from osgeo import osr
source = osr.SpatialReference()
source.ImportFromEPSG(2927)
target = osr.SpatialReference()
target.ImportFromEPSG(4326)
transform = osr.CoordinateTransformation(source, target)
point = ogr.CreateGeometryFromWkt("POINT (1120351.57 741921.42)")
point.Transform(transform)
print point.ExportToWkt()
This is the error:
/opt/miniconda/envs/pygeo/lib/python2.7/site-packages/osgeo/ogr.pyc in Transform(self, *args)
4880 OGRERR_NONE on success or an error code.
4881 """
-> 4882 return _ogr.Geometry_Transform(self, *args)
4883
4884 def GetSpatialReference(self, *args):
TypeError: in method 'Geometry_Transform', argument 2 of type 'OSRCoordinateTransformationShadow *'
CoordinateTransformation is a proxy for the C++ OSRCoordinateTransformationShadow class, generated by SWIG.
Per the source code for osgeo.ogr.Geometry (what Point is), the correct types were passed to the Transform method.
Best guess: Could this be caused by using a version of _ogr that is too old, so that the implementation of _ogr.Geometry_Transform(self, *args) is expecting a different type?
_ogr is another SWIG-generated proxy, I'm guessing for the OGR class?
What everyone new to GDAL must learn: assign an error handler. (example: http://pcjericks.github.io/py-gdalogr-cookbook/gdal_general.html#install-gdal-ogr-error-handler)
With an error handler assigned, the output includes the explanation for the error. In this case, it was: "Unable to load PROJ.4 library (libproj.so), creation of OGRCoordinateTransformation failed."
Hopefully, imparting the knowledge of enabling GDAL error handling will help others who may stumble upon this very issue.
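A minimal sketch of such a handler, along the lines of the cookbook page linked above, might look like this. The handler name, the collected list and the message format are illustrative choices, the error number 6 in the demonstration call is made up, and the osgeo import is guarded so the snippet also runs where GDAL is not installed.

```python
# GDAL error classes by numeric value (CE_None through CE_Fatal).
ERROR_CLASS_NAMES = {0: "None", 1: "Debug", 2: "Warning", 3: "Failure", 4: "Fatal"}

collected = []


def gdal_error_handler(err_class, err_num, err_msg):
    """Print and record every message GDAL reports instead of losing it."""
    name = ERROR_CLASS_NAMES.get(err_class, "Unknown")
    line = "GDAL {} (error {}): {}".format(name, err_num, err_msg.replace("\n", " "))
    collected.append(line)
    print(line)


try:
    from osgeo import gdal
except ImportError:
    gdal = None  # GDAL not available; the handler can still be called directly

if gdal is not None:
    # Register the handler so later GDAL/OGR/OSR calls report through it.
    gdal.PushErrorHandler(gdal_error_handler)

# Direct call, only to show the output format (error number is illustrative).
gdal_error_handler(3, 6, "Unable to load PROJ.4 library (libproj.so), "
                         "creation of OGRCoordinateTransformation failed.")
```

The handler should be installed before creating the CoordinateTransformation, since that is the call that reports the failure.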
Similar information can be found on a rasterio FAQ and in unable to load "gcs.csv" file in gdal.
I encountered this problem when running GDAL transformations in my Anaconda3 QGIS environment. The problem is that the coordinate system information was not being loaded through the GDAL_DATA environment variable.
To remedy, locate where the directory containing gcs.csv exists within your system (potentially ".../Library/share/gdal"). Add this to your environment prior to importing GDAL & other dependents.
import os
os.environ['GDAL_DATA'] = r'/path/to/dir/'
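If you do not know where that directory lives, a short search can locate it. This is a sketch: find_gdal_data_dir is a hypothetical helper, and the temporary directory tree below only imitates an installation so the snippet can run anywhere. In practice you would pass your Anaconda or QGIS install root as search_root.

```python
import os
import tempfile


def find_gdal_data_dir(search_root, marker="gcs.csv"):
    """Return the first directory under search_root that contains the marker file."""
    for dirpath, _dirnames, filenames in os.walk(search_root):
        if marker in filenames:
            return dirpath
    return None


# Imitation install tree, for demonstration only.
demo_root = tempfile.mkdtemp()
demo_data_dir = os.path.join(demo_root, "Library", "share", "gdal")
os.makedirs(demo_data_dir)
open(os.path.join(demo_data_dir, "gcs.csv"), "w").close()

found = find_gdal_data_dir(demo_root)
if found is not None:
    # Must be set before osgeo (GDAL) is imported.
    os.environ["GDAL_DATA"] = found
print(found)
```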
With the help of jeremy's answer that GDAL fails to load its information, I edited the code to specify the PROJ.4 parameters from the EPSG website directly, and it runs:
#target.ImportFromEPSG(4326)
target.ImportFromProj4('+proj=longlat +datum=WGS84 +no_defs')

Embedding Python in MATLAB

I am trying to embed Python 2.6 into MATLAB (7.12). I wanted to embed with a mex file written in C. This worked fine for small simple examples using scalars. However, if NumPy (1.6.1) is imported in any way, MATLAB crashes. I say any way because I have tried a number of ways to load the NumPy libraries, including:
In the python module (.py):
from numpy import *
With PyRun_SimpleString in the mex file:
PyRun_SimpleString("from numpy import *");
Calling numpy functions with PyObject_CallObject:
pOut = PyObject_CallObject(pFunc, pArgs);
Originally, I thought this might be a problem with embedding NumPy in C. However, NumPy works fine when embedded in simple C files that are compiled from the command line with the /MD (multithreaded DLL runtime) switch using the Visual Studio 2005 C compiler. Next, I thought I would just change the make file in MATLAB to include the /MD switch. No such luck: mexopts.bat already compiles with the /MD switch.
I also manually commented out lines in the NumPy init module to find what was crashing MATLAB. It seems that loading any file with the extension .pyd crashes MATLAB. The first such file loaded by NumPy is multiarray.pyd. The MATLAB documentation describes how to debug mex files with Visual Studio, which I did, and I have placed the error message below. At this point I know the problem is a memory problem with the .pyd files and some conflict with MATLAB. Interestingly, I can use a system command in MATLAB to kick off a Python process that uses NumPy, and no error is generated.
I will paste below the error message from MATLAB, followed by the debug output in Visual Studio for the process that crashes MATLAB. However, I am not pasting the whole thing because the list of first-chance exceptions is very long. Are there any suggestions for solving this integration problem?
MATLAB error
Matlab has encountered an internal problem and needs to close
MATLAB crash file:C:\Users\pml355\AppData\Local\Temp\matlab_crash_dump.3484-1:
------------------------------------------------------------------------
Segmentation violation detected at Tue Oct 18 12:19:03 2011
------------------------------------------------------------------------
Configuration:
Crash Decoding : Disabled
Default Encoding: windows-1252
MATLAB License : 163857
MATLAB Root : C:\Program Files\MATLAB\R2011a
MATLAB Version : 7.12.0.635 (R2011a)
Operating System: Microsoft Windows 7
Processor ID : x86 Family 6 Model 7 Stepping 10, GenuineIntel
Virtual Machine : Java 1.6.0_17-b04 with Sun Microsystems Inc. Java HotSpot(TM) Client VM mixed mode
Window System : Version 6.1 (Build 7600)
Fault Count: 1
Abnormal termination:
Segmentation violation
Register State (from fault):
EAX = 00000001 EBX = 69c38c20
ECX = 00000001 EDX = 24ae1da8
ESP = 0088af0c EBP = 0088af44
ESI = 69c38c20 EDI = 24ae1da0
EIP = 69b93d31 EFL = 00010202
CS = 0000001b DS = 00000023 SS = 00000023
ES = 00000023 FS = 0000003b GS = 00000000
Stack Trace (from fault):
[ 0] 0x69b93d31 C:/Python26/Lib/site-packages/numpy/core/multiarray.pyd+00081201 ( ???+000000 )
[ 1] 0x69bfead4 C:/Python26/Lib/site-packages/numpy/core/multiarray.pyd+00518868 ( ???+000000 )
[ 2] 0x69c08039 C:/Python26/Lib/site-packages/numpy/core/multiarray.pyd+00557113 ( ???+000000 )
[ 3] 0x08692b09 C:/Python26/python26.dll+00076553 ( PyEval_EvalFrameEx+007833 )
[ 4] 0x08690adf C:/Python26/python26.dll+00068319 ( PyEval_EvalCodeEx+002255 )
This error was detected while a MEX-file was running. If the MEX-file
is not an official MathWorks function, please examine its source code
for errors. Please consult the External Interfaces Guide for information
on debugging MEX-files.
If this problem is reproducible, please submit a Service Request via:
http://www.mathworks.com/support/contact_us/
A technical support engineer might contact you with further information.
Thank you for your help.
Output from the Visual Studio debugger
First-chance exception at 0x0c12c128 in MATLAB.exe: 0xC0000005: Access violation reading location 0x00000004.
First-chance exception at 0x0c12c128 in MATLAB.exe: 0xC0000005: Access violation reading location 0x00000004.
First-chance exception at 0x0c12c128 in MATLAB.exe: 0xC0000005: Access violation reading location 0x00000004.
First-chance exception at 0x751d9673 in MATLAB.exe: Microsoft C++ exception: jitCgFailedException at memory location 0x00c3e210..
First-chance exception at 0x751d9673 in MATLAB.exe: Microsoft C++ exception: jitCgFailedException at memory location 0x00c3e400..
First-chance exception at 0x69b93d31 in MATLAB.exe: 0xC0000005: Access violation writing location 0x00000001.
> throw_segv_longjmp_seh_filter()
throw_segv_longjmp_seh_filter(): invoking THROW_SEGV_LONGJMP SEH filter
> mnUnhandledWindowsExceptionFilter()
MATLAB.exe has triggered a breakpoint
Try to approach the problem from the Python side: Python is a great glue language, so I would suggest having Python run your MATLAB and C programs. Python has:
Numpy
PyLab
Matplotlib
IPython
Thus, the combination is a good alternative for almost any existing Matlab module.
MATLAB R2014b added the ability to call Python functions directly from M code.
