Using both matplotlib and rpy2 with multi-buildpacks on Heroku - python

I'm trying to use a multi-buildpack setup on Heroku with these two buildpacks:
https://github.com/virtualstaticvoid/heroku-buildpack-r.git
https://github.com/dbrgn/heroku-buildpack-python-sklearn/
I am using rpy2 to call R from python. I detailed the full process I used to get the slug to compile here.
It works fine for numpy, scipy, and scikit-learn with rpy2. However, I'm also trying to get matplotlib to work with this setup, and I'm getting an error.
I used matplotlib==1.1.0, as suggested by this StackOverflow post.
However, when I have my LD_LIBRARY_PATH set so that rpy2 will work, like this:
LD_LIBRARY_PATH=/app/vendor/R/lib64/R/modules:/app/vendor/R/lib64/R/lib:/app/vendor/gcc-4.3/lib64
I get this error:
>>> from matplotlib import ft2font
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: /app/vendor/R/lib64/R/lib/libstdc++.so.6: version `GLIBCXX_3.4.11' not found (required by /app/.heroku/python/lib/python2.7/site-packages/matplotlib/ft2font.so)
If I remove the LD_LIBRARY_PATH settings, then matplotlib works, but rpy2 can't find the R library located in /app/vendor/R/lib64/R/lib. Changing the order of the directories in LD_LIBRARY_PATH doesn't seem to have an effect, for some reason.
So I can get either matplotlib or rpy2 to work, but not at the same time.
I have libraries in these locations:
~ $ find . -name "*libstd*"
/app/vendor/gcc-4.3/gcc-4.3/lib64/libstdc++.so.6.0.10
/app/vendor/gcc-4.3/gcc-4.3/lib64/libstdc++.so.6
/app/vendor/gcc-4.3/gcc-4.3/lib64/libstdc++.so
/app/vendor/gcc-4.3/gcc-4.3/lib64/libstdc++.a
/app/vendor/gcc-4.3/gcc-4.3/lib64/libstdc++.la
/app/vendor/gcc-4.3/gcc-4.3/lib/libstdc++.so.6.0.10
/app/vendor/gcc-4.3/gcc-4.3/lib/libstdc++.so.6
/app/vendor/gcc-4.3/gcc-4.3/lib/libstdc++.so
/app/vendor/gcc-4.3/gcc-4.3/lib/libstdc++.a
/app/vendor/gcc-4.3/gcc-4.3/lib/libstdc++.la
/app/vendor/R/lib64/R/lib/libstdc++.so.6.0.10
/app/vendor/R/lib64/R/lib/libstdc++.so.6
/app/vendor/R/lib64/R/lib/libstdc++.so
/app/vendor/R/lib64/R/lib/libstdc++.a
/app/vendor/R/lib64/R/lib/libstdc++.la
I suspect that matplotlib should be using /app/vendor/gcc-4.3/gcc-4.3/lib64/libstdc++.so.6 (how do I tell?), but I can't seem to get it to use that one.
Any suggestions? I'm totally stuck. I must say the multi buildpack process is pretty messed up for this kind of thing.

Ok, I figured it out. It turns out that the correct library wasn't in /app/vendor/gcc-4.3/gcc-4.3/lib64/ after all, but just in /usr/lib. That explains why changing the order of the directories in LD_LIBRARY_PATH had no effect, since it wasn't in any of them.
There must be some logic that looks in the directories of LD_LIBRARY_PATH first and, if it doesn't find a match there, then looks in /usr/lib. That is why it works some of the time. The file in /app/vendor/R/lib64/R/lib/ must have been close enough to be considered a match (so the loader didn't look in /usr/lib), but it was the wrong version, which caused the later error.
The fix is just to include /usr/lib in LD_LIBRARY_PATH. I added /usr/local/lib as well, for good measure. You should now use:
heroku config:set LD_LIBRARY_PATH=/usr/lib:/usr/local/lib:/app/vendor/R/lib64/R/modules:/app/vendor/R/lib64/R/lib:/app/vendor/gcc-4.3/lib64
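On the "how do I tell?" part of the question: one rough way to see which copy of libstdc++ can satisfy the `GLIBCXX_3.4.11` requirement is to scan each file for the version tags it carries (similar in spirit to running `strings` on it and grepping for GLIBCXX). This is only a sketch; the function names are made up for illustration, and the path in the usage note is taken from the post and may not match your dyno.

```python
import re


def versions_in_bytes(data):
    """Return the GLIBCXX version tags found in raw library bytes, oldest first."""
    tags = set(re.findall(rb"GLIBCXX_(\d+(?:\.\d+)+)", data))
    # Sort numerically so that 3.4.10 comes after 3.4.9.
    return sorted(
        (tag.decode("ascii") for tag in tags),
        key=lambda version: tuple(int(part) for part in version.split(".")),
    )


def has_version(data, wanted):
    """True if the raw library bytes mention GLIBCXX_<wanted>."""
    return wanted in versions_in_bytes(data)


def library_versions(path):
    """Read a shared library from disk and list its GLIBCXX version tags."""
    with open(path, "rb") as handle:
        return versions_in_bytes(handle.read())
```

For example, `library_versions("/app/vendor/R/lib64/R/lib/libstdc++.so.6")` should list no `3.4.11` entry if that copy really is too old, while the copy under /usr/lib should.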

Related

Changing order of imports results in error in Python

I'm trying to understand the cause of the following error. First if I type the following into python
>>> import scipy.sparse
>>> import torch
it runs without error. However, when I type in
>>> import torch
>>> import scipy.sparse
I get the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/global/software/sl-7.x86_64/modules/langs/python/3.6/lib/python3.6/site-packages/scipy/sparse/__init__.py", line 229, in <module>
from .csr import *
File "/global/software/sl-7.x86_64/modules/langs/python/3.6/lib/python3.6/site-packages/scipy/sparse/csr.py", line 15, in <module>
from ._sparsetools import csr_tocsc, csr_tobsr, csr_count_blocks, \
ImportError: /lib64/libstdc++.so.6: version `CXXABI_1.3.9' not found (required by /global/software/sl-7.x86_64/modules/langs/python/3.6/lib/python3.6/site-packages/scipy/sparse/_sparsetools.cpython-36m-x86_64-linux-gnu.so)
I can even go to the directory "/global/software/sl-7.x86_64/modules/langs/python/3.6/lib/python3.6/site-packages/scipy/sparse/" and import the binary "_sparsetools.cpython-36m-x86_64-linux-gnu.so" followed by torch without issue. But if I try it the other way around I again get the above error.
Does anyone have any idea why changing the order of these imports should have a different effect?
The simple search strategy for shared objects assumes that a single version of each exists—or at least that directories containing newer versions are put first on the search path. The path includes $LD_LIBRARY_PATH (which should be avoided), DT_RPATH and its newer variant DT_RUNPATH (which crucially depend on the client being loaded), and system directories like /lib. This works well for systems following the FHS with global package management, but packages that are installed, with copies of their dependencies, in a single per-package directory (as is common on Windows and with some “normal user” package managers) can easily produce multiple versions of a shared object with the same soname.
The expectation is that sharing that name is harmless because one can be used in place of the other (and therefore be put first on the path). The reality is that there is no single directory that is newest for all libraries, and there’s not even a single path to configure given the DT_ tags.
The result is that whichever one is loaded first wins: the dynamic loader can’t load both of them (since they provide many of the same symbols), so the second request has only the effect of checking the library version tags, which here are found to be inadequate. In this case, one client (torch) is relying on the system’s C++ standard library, while the other (_sparsetools) has its own, newer version. It may or may not even need its newer version: since it was built against it, it is conservatively marked as needing it by default.
This is a hard problem: not even tools like virtual environments or environment modules can handle it, since the problem lies in the incompatible compilation of extant packages. Systems that rebuild everything from source (like Nix or Spack) can, but only at their usual cost of controlling all relevant builds. It may be that simply controlling the import order is, unfortunately, the best choice available.
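To see the "whichever one is loaded first wins" effect for yourself, you can look at which libstdc++ file is actually mapped into the running interpreter. The sketch below parses the Linux-only /proc/self/maps file; the helper names are hypothetical and it assumes the usual six-column layout of that file.

```python
def loaded_libraries(maps_text, name):
    """Return distinct mapped file paths whose file name starts with `name`."""
    paths = []
    for line in maps_text.splitlines():
        # Columns: address, perms, offset, dev, inode, pathname (optional).
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping without a path
        path = fields[5].strip()
        if path.rsplit("/", 1)[-1].startswith(name) and path not in paths:
            paths.append(path)
    return paths


def loaded_in_this_process(name="libstdc++"):
    """List the copies of `name` mapped into the current process (Linux only)."""
    with open("/proc/self/maps") as handle:
        return loaded_libraries(handle.read(), name)
```

Calling `loaded_in_this_process()` right after `import torch`, and again in a fresh interpreter right after `import scipy.sparse`, should show a different libstdc++ path in each case if the explanation above applies to your setup.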
@DavisHerring's answer explains what is going on. Here is a possible workaround to ensure that the right version is loaded, the LD_PRELOAD trick:
Step 1:
Find the right libstdc++.so version via the console:
$ ldd _sparsetools.cpython-36m-x86_64-linux-gnu.so
libstdc++.so.6 => <path to right version>/libstdc++.so.6(...)
Step 2:
When starting python, preload the right version so that the loader picks it up:
$ LD_PRELOAD=<path to right version>/libstdc++.so.6 python
However, the best solution would be to rebuild pytorch with the right libstdc++ dependency.
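If you would rather not type the variable every time, the same trick can be wrapped in a small launcher. Note that LD_PRELOAD has to be in the environment before the interpreter starts, so setting it inside an already running Python does nothing for that process; the sketch below starts a child interpreter instead. The helper names are my own, and the library path is whatever `ldd` reported for you.

```python
import os
import subprocess
import sys


def env_with_preload(library, base_env=None):
    """Return a copy of the environment with `library` put first in LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # Entries in LD_PRELOAD may be separated by colons or spaces.
    env["LD_PRELOAD"] = library if not existing else library + ":" + existing
    return env


def run_with_preload(library, script_args):
    """Run a child Python with `library` preloaded; returns its exit code."""
    command = [sys.executable] + list(script_args)
    return subprocess.call(command, env=env_with_preload(library))
```

Usage would be something like `run_with_preload("<path to right version>/libstdc++.so.6", ["my_script.py"])`, where `my_script.py` is a placeholder for your own entry point.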

RDKit installation under Windows and Python3.7.4

RDKit could be a nice package if it wasn't so complicated to install.
Here on SO, there are several questions about problems with installing RDKit, but for different operating systems or different environments.
My configuration is:
Win10, Python 3.7.4, pip is installed, PATH is set, PYTHONPATH is set.
The installation of other modules is working fine via python -m pip install <package>.
I'm aware that the site recommends the fastest installation with Anaconda.
However, I don't have and don't want Anaconda.
On the webpage it says:
"Get the appropriate windows binary build from: https://github.com/rdkit/rdkit/releases".
However, there are no binaries of the latest versions.
This means, I would have to build it from source. I'm hesitating because the process seems to be pretty complicated, many extra installations with new problems and unknowns, and furthermore, the instructions seem to be outdated and incomplete for somebody who would build binaries from the source for the first time.
So, then I tried some unofficial binaries of RDKit.
If I unpack them and set the paths according to instructions, I get this error message:
>>> from rdkit import Chem
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\xyz\Programs\RDKit\rdkit\__init__.py", line 2, in <module>
from .rdBase import rdkitVersion as __version__
ImportError: DLL load failed: The specified module could not be found.
So, finally my questions:
How to properly install RDKit with the above mentioned configuration?
What is the specified DLL which is missing?
Where is it expecting it and searching it?
Are these RDKit 3.6 binaries maybe incompatible with Python 3.7.4?
I'm pretty sure it is probably a "small" thing (a path here or a check there), but I'm stuck. Thank you for any hints.
Update:
Apparently, it is not just a "small" thing. Chances to get this to work are most likely very low.
In the meantime I found this:
https://github.com/rdkit/rdkit/issues/1812
https://github.com/rdkit/rdkit/issues/2389
If the author of rdkit writes (April 2019):
I would be happy to be able to do pip distributions of the RDKit, but
to the best of my knowledge no one has managed to figure out how to
make it actually work.
I'd be happy to accept a PR from someone who has figured this out, but
I am not likely to have the time to do this myself anytime in the near
future.
So, if anybody feels capable of achieving this, please feel free.
I will invest time in something else or will have to switch to Anaconda if I want to use RDKit.
On the webpage you linked there is a section about missing DLLs:
"In Win7 systems, you may run into trouble due to missing DLLs, see one thread from the mailing list: http://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg01632.html You can download the missing DLLs from here: http://www.microsoft.com/en-us/download/details.aspx?id=5555"
Not sure if this helps.
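On the question of whether binaries built for 3.6 can work under 3.7.4: a compiled extension normally names the Python DLL it was linked against (for example python36.dll), and if that DLL is not on the machine you would expect exactly this kind of "module could not be found" failure. A crude way to check is to search the .pyd file for such a name. This is only a heuristic sketch with made-up helper names, and it assumes the DLL name is stored as plain ASCII in the file.

```python
import re
import sys


def linked_python_dlls(data):
    """Return the pythonXY.dll names mentioned in raw extension-module bytes."""
    names = re.findall(rb"python\d+\.dll", data, flags=re.IGNORECASE)
    return sorted({name.decode("ascii").lower() for name in names})


def expected_python_dll():
    """Name of the DLL that belongs to the interpreter running this code."""
    return "python{}{}.dll".format(*sys.version_info[:2])


def check_extension(path):
    """Return (DLL names found in the file, DLL name this interpreter provides)."""
    with open(path, "rb") as handle:
        return linked_python_dlls(handle.read()), expected_python_dll()
```

If `check_extension` on the unpacked rdBase .pyd file reports `python36.dll` while the second value is `python37.dll`, the binaries were built for a different Python and no path setting is likely to fix that.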

Pylab ImportError - Library not loaded - confusing directory structure of Mountain Lion?

I wrote a little program while I was using my previous computer (my previous work's computer) which was a Windows machine. Now, based on the advice of a friend, I got a Mac, but I've had a hell of a time getting stuff to work on it.
In particular, my program uses pylab (part of matplotlib), and I am having an ImportError after import pylab:
Error: ~/Documents/New folder/Programowanie/Projekt/SimAccents_v2d.py:2:
/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pylab.py:1:
/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/pylab.py:222:
/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/mpl.py:2:
/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/axis.py:14:
/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/text.py:29:
/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/backend_bases.py:47:
/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/textpath.py:11:
/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/mathtext.py:61: ImportError: dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/_png.so, 2): Library not loaded: /opt/X11/lib/libpng15.15.dylib
I tried import matplotlib in Python IDLE, which worked, so apparently the problem is with pylab, not matplotlib. However, I tried import matplotlib.pyplot and got pretty much the same error, which I guess is because they are quite similar modules/pieces of matplotlib.
I have done a good deal of digging on the internet, and found a few potentially useful things, but the result has been more confusion. This post appears to be a similar problem to mine, which the author solved by "deleting (after making a backup) the matplotlib folder in my system's site-packages folder (/Library/Python/2.7/site-packages)". I expect that the reason that this may have worked is that perhaps Python is looking in the wrong place for the file.
To check this, I used the way of finding out matplotlib's install location:
>>> import matplotlib
>>> matplotlib.__file__
'/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/__init__.pyc'
...and matplotlib's config directory location:
>>> matplotlib.get_configdir()
'/Users/stanislawpstrokonski/.matplotlib'
Investigating these paths, I found that the second one is a hidden folder which contains only two files - .DS_Store (hidden) and fontList.cache. The first directory, however, was a bit more spooky, as Python says that that directory, including the final "problem" file of the error message above, exists:
>>> os.path.isfile('/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/matplotlib/_png.so')
True
...but I have been unable to find ~/Library/Frameworks on my Mac, as it both doesn't appear in Finder, and Mac OS Terminal can't seem to find it either:
Stanislaws-MacBook-Pro:~ stanislawpstrokonski$ cd ~/Library/Frameworks
-bash: cd: /Users/stanislawpstrokonski/Library/Frameworks: No such file or directory
It's exactly the same story for /Library/Python - Python confirms its existence, but Terminal denies it. However, when I type in this code to Terminal, it decides that the path does exist after all:
Stanislaws-MacBook-Pro:~ stanislawpstrokonski$ cd /usr/bin; ls -l python2.7
lrwxr-xr-x 1 root wheel 75 16 Nov 16:30 python2.7 -> ../../System/Library/Frameworks/Python.framework/Versions/2.7/bin/python2.7
Another thing I checked was the Library not loaded: path from the original ImportError:
>>> os.path.isfile('/opt/X11/lib/libpng15.15.dylib')
False
So perhaps the problem is that I am missing this path? What am I supposed to do about this? Isn't matplotlib meant to sort this sort of stuff out when it is installed?
I don't know why pylab is misbehaving, when wxPython and numpy (and, apparently, matplotlib aside from pylab and pyplot) appear to be working just fine. I also am baffled by Mac OS X's directory structure, although I still have a feeling that this may be the source of the problem. Another reason could be that I have installed Python on my machine, but I have heard that Mac OS already has Python installed, so maybe the two are confusing each other somehow.
I'm sorry that this post is so long, but when I don't know exactly where the problem is, I feel like I have to write down everything. Could anyone help me get pylab working, and perhaps enlighten me about Macs in the process? I would be extremely grateful.
p.s. I am using Mountain Lion and I bought my Mac about two weeks ago.
p.p.s. This person seems to be having a similar problem, though it's a different bit that isn't importing...
I had a similar problem with matplotlib on OS X. You just need to install libpng. I used brew: brew install libpng.
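To confirm that a missing libpng is really the culprit before and after installing it, you can pull the library path out of the ImportError text and test whether it exists. A small sketch (the function names are made up, and it assumes the path contains no spaces):

```python
import os
import re


def missing_library(message):
    """Extract the path after 'Library not loaded:' from a dlopen error message."""
    match = re.search(r"Library not loaded: (\S+)", message)
    return match.group(1) if match else None


def diagnose(message):
    """Return (path, exists) for the library named in the error, or None."""
    path = missing_library(message)
    if path is None:
        return None
    return path, os.path.exists(path)
```

You could call `diagnose(str(exc))` inside an `except ImportError as exc:` block wrapped around `import matplotlib.pyplot`; a result ending in `False` means the dylib the extension was built against is not at that location.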

Python program run in MATLAB can't import pygame

I'm trying to run a Python program, which uses the pygame modules, from MATLAB. I know I can use either
system('python program.py')
or just
! python program.py
However, I keep getting the error:
Traceback (most recent call last):
File "program.py", line 1, in <module>
import pygame
ImportError: No module named pygame
What is strange is that if I run the program from the command line, it works just fine. Does anyone know why if run from within MATLAB, Python can't find pygame?
The problem may be that MATLAB is not seeing your PYTHONPATH, which normally stores Python libraries and modules. For custom modules, PYTHONPATH should also include the path to your custom folders.
You can try setting the value of PYTHONPATH from within a MATLAB running session:
PATH_PYTHON = '<python_lib_folder>'
setenv('PYTHONPATH', PATH_PYTHON); % set env path (PYTHONPATH) for this session
system('python program.py');
See also the possibly relevant SO answer here: How can I call a Qtproject from matlab?
I haven't used MATLAB often and don't have it available right now, so I can't say for sure, but MATLAB may be creating a custom environment with custom paths (this happens a lot, so that the user has a consistent experience in the software). When MATLAB is installed, it may not export the paths to its own modules to your default environment. So when pygame.py is looked up outside of MATLAB, Python cannot find it under its usual lookup paths.
Solutions could be:
Find pygame.py and map the path to it directly in your code, though this could cause you headaches later on during deployment.
Try just copying the pygame.py file to your working directory; it could have dependencies that need to be addressed.
Install pygame directly from its developer at http://www.pygame.org. Version differences could be a problem, but pygame gets put under the usual lookup paths for Python. (This would be my preferred solution personally.)
Or just export the location of the path to pygame in MATLAB's library to your default environment. This could be a problem during deployment too.
For posterity, first try everything that Stewie noted here ("Undefined variable "py" or class" when trying to load Python from MATLAB R2014b?). If it doesn't work, then it's possible that you have multiple Pythons. You can check which Python works (with all the related installed modules) in your bash/terminal, and then use
pyversion PYTHONPATH
to let matlab know the right path.
Also use py.importlib.import_module('yourmodule') to import the module after that.
That should get you started.
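A quick way to test the "multiple Pythons" and PYTHONPATH theories is to run the same small script once from your terminal and once through MATLAB's system() call, then compare the output. The sketch below is just a diagnostic; the file name whichpython.py is my own choice.

```python
import json
import os
import sys


def interpreter_report():
    """Collect the facts that decide where `import pygame` is looked up."""
    return {
        "executable": sys.executable,
        "version": sys.version.split()[0],
        "pythonpath": os.environ.get("PYTHONPATH", ""),
        "sys_path": list(sys.path),
    }


if __name__ == "__main__":
    # Save as whichpython.py and run it from both the terminal and MATLAB.
    print(json.dumps(interpreter_report(), indent=2))
```

If the "executable" line differs between the two runs, MATLAB is launching a different interpreter than your shell does; if only "pythonpath" or "sys_path" differ, the environment variables are the thing to fix.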

How to use OpenCV in Python?

I have just installed OpenCV on my Windows 7 machine. As a result, I get a new directory:
C:\OpenCV2.2\Python2.7\Lib\site-packages
In this directory, I have two files: cv.lib and cv.pyd.
Then I try to use the opencv from Python. I do the following:
import sys
sys.path.append(r'C:\OpenCV2.2\Python2.7\Lib\site-packages')
import cv
As a result I get the following error message:
File "<stdin>", line 1, in <module>
ImportError: DLL load failed: The specified module could not be found.
What am I doing wrong?
ADDED
As it was recommended here, I have copied content of C:\OpenCV2.0\Python2.6\Lib\site-packages to the C:\Python26\Lib\site-packages. It did not help.
ADDED 2
My environment variables have the following values:
Path=C:\Program Files\MiKTex\miktex\bin;C:\OpenCV2.2\bin;C:\Python26;
PYTHONPATH=C:\OpenCV2.2\Python2.7\Lib\site-packages
Do I need to change something? Do I need to add something?
ADDED 3
I think my question is general: How to use a library? Probably I need to find a *.dll file somewhere? Then I need to use the name of the directory containing this file as a value for some environment variable? Or maybe I need to use sys.addpath? I also need to know how the way to call the library is related to the name of the file that contains the library.
ADDED 4
It is interesting that when I type import cv, I get:
ImportError: DLL load failed: The specified module could not be found.
But when I type import opencv I get:
ImportError: No module named opencv
ADDED 5
It has been suggested that I use an inconsistent version of Python. In more detail, OpenCV tries to use Python 2.7 and I had Python 2.6. So, I have installed Python 2.7. It makes a difference. Now I do not have the old error message, but I have a new one:
ImportError: numpy.core.multiarray failed to import
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: numpy.core.multiarray failed to import
ADDED 6
I have managed to resolve the problem by installing numpy. It took some time because I did not realize that there are different numpy installers corresponding to different versions of Python. Some details can be found in my answer to my own question (see below).
The problem was resolved. The following steps were taken:
A new version of Python (version 2.7) was installed.
After that I was still unable to run OpenCV because I had some problems with the numpy library.
I tried to install numpy, but the installer did not see my new version of Python.
I deleted the old version of Python as well as the links to the old version in the Path system variable.
After that the numpy installer was not able to finish the installation.
I realized that I needed to run another numpy installer, one that is associated with Python 2.7. It can be found here.
Finally everything worked. I was able to "import cv".
I suspect you have the same problem I've run into. If you have a 64-bit version of Python, it cannot load 32-bit DLLs. OpenCV currently only ships 32-bit binaries. If you want 64-bit .pyd and .dll files, you have to compile them yourself. There are some instructions on the OpenCV Wiki, but it's not for the faint of heart. Expect to have a substantial time investment.
The easiest solution is to:
Uninstall 64-bit Python
Install a 32-bit distribution.
The PythonXY distribution includes pyopencv -- a good set of OpenCV hooks. The only limitation is that it's 32-bit, so don't make plans to process gigapixel astronomy data with it! ;)
If you must have the 64-bit version, follow these instructions to get OpenCV to compile with Visual Studio 2010. There's a discussion on Stack Overflow that describes building 64-bit apps with VC Express.
EDIT: OpenCV now ships with 64-bit Python binaries. The .dll files need to go somewhere in your path (I put them in the scripts folder), and the .pyd files go in your site-packages directory.
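To check whether you are hitting the 32-bit versus 64-bit mismatch, you can compare the pointer size of your interpreter with the machine type recorded in the header of the .pyd or .dll file. The sketch below reads only the standard PE header fields; the helper names are made up, and it recognises just the two common x86 machine types.

```python
import struct
import sys


def interpreter_bits():
    """Pointer size of the running interpreter, in bits (32 or 64)."""
    return struct.calcsize("P") * 8


def pe_bits(data):
    """Return 32 or 64 for an x86/x64 PE file given its raw bytes, else None."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return None
    # The DOS header stores the offset of the PE header at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if len(data) < pe_offset + 6 or data[pe_offset:pe_offset + 4] != b"PE\0\0":
        return None
    # The Machine field follows the 4-byte signature.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return {0x014C: 32, 0x8664: 64}.get(machine)


def file_bits(path):
    """Read the first 4 KiB of a file and report its PE bitness."""
    with open(path, "rb") as handle:
        return pe_bits(handle.read(4096))
```

If `file_bits` on cv.pyd (under the site-packages directory from the question) returns a different number than `interpreter_bits()`, the import cannot work with that interpreter, whatever the paths are set to.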
I had trouble interfacing OpenCV with Python, and I was looking all over the place for help. Here's what worked for me. I basically followed this post: http://opencvpython.blogspot.com/2012/05/install-opencv-in-windows-for-python.html. After downloading and extracting OpenCV 2.4.6, you basically get a folder called "opencv" with a bunch of stuff in it. Navigate to build->python->2.7. Inside, there is only one file called "cv2.pyd". I copied this file and pasted it in "python-2.7.5\Lib\site-packages". I'm actually using the Spyder IDE, and it works fine. In the python interpreter, typing in "import cv" worked for me.
Maybe you should edit your environment variable
right click on the "My Computer" or something like this, click on properties.
In the properties window click on the Advanced tab.
Then, the environment variables button.
Change the path.
