How should I execute the Bazel Build Command in compiling Tensorflow? - python

I've been trying to install TensorFlow for a few weeks now, and I keep getting a lot of errors with the simple installation methods, so I think it would be best for me to install TensorFlow from source. I'm following the instructions on the TensorFlow website exactly, and my ./configure is mostly all default so I can see whether it works before I make modifications:
./configure
Please specify the location of python. [Default is /usr/bin/python]: /Library/Frameworks/Python.framework/Versions/3.6/bin/python3
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N] n
No Google Cloud Platform support will be enabled for TensorFlow
Do you wish to build TensorFlow with Hadoop File System support? [y/N] n
No Hadoop File System support will be enabled for TensorFlow
Do you wish to build TensorFlow with the XLA just-in-time compiler (experimental)? [y/N]
No XLA support will be enabled for TensorFlow
Found possible Python library paths:
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages
Please input the desired Python library path to use. Default is [/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages]
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages
Do you wish to build TensorFlow with OpenCL support? [y/N] n
No OpenCL support will be enabled for TensorFlow
Do you wish to build TensorFlow with CUDA support? [y/N] n
No CUDA support will be enabled for TensorFlow
INFO: Starting clean (this may take a while). Consider using --async if the clean takes more than several minutes.
Configuration finished
(This is not the first time I've edited the configuration)
After this, I execute the following bazel build command, taken straight from the TensorFlow.org instructions for installing from source:
bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package
(In the future, I'm going to add some additional flags to account for the fact that I've been getting CPU instruction errors about SSE, AVX, etc.)
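For reference, the build line with those flags added might look like this sketch (which -copt values apply depends on what the CPU actually supports):
# Hedged sketch only; check your CPU's features before adding these flags.
bazel build --config=opt --copt=-msse4.2 --copt=-mavx --copt=-mavx2 --copt=-mfma //tensorflow/tools/pip_package:build_pip_package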
When I execute the bazel command from above (without the extra flags yet), I get an extremely long wait time and a list of errors piles up:
r08ErCk:tensorflow kendrick$ bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package
WARNING: /Users/kendrick/tensorflow/tensorflow/contrib/learn/BUILD:15:1: in py_library rule //tensorflow/contrib/learn:learn: target '//tensorflow/contrib/learn:learn' depends on deprecated target '//tensorflow/contrib/session_bundle:exporter': Use SavedModel Builder instead.
WARNING: /Users/kendrick/tensorflow/tensorflow/contrib/learn/BUILD:15:1: in py_library rule //tensorflow/contrib/learn:learn: target '//tensorflow/contrib/learn:learn' depends on deprecated target '//tensorflow/contrib/session_bundle:gc': Use SavedModel instead.
INFO: Found 1 target...
INFO: From Compiling external/protobuf/src/google/protobuf/compiler/js/embed.cc [for host]:
external/protobuf/src/google/protobuf/compiler/js/embed.cc:37:12: warning: unused variable 'output_file' [-Wunused-const-variable]
const char output_file[] = "well_known_types_embed.cc";
^
1 warning generated.
INFO: From Compiling external/protobuf/python/google/protobuf/pyext/message_factory.cc:
external/protobuf/python/google/protobuf/pyext/message_factory.cc:78:28: warning: ISO C++11 does not allow conversion from string literal to 'char *' [-Wwritable-strings]
static char* kwlist[] = {"pool", 0};
^
external/protobuf/python/google/protobuf/pyext/message_factory.cc:222:6: warning: ISO C++11 does not allow conversion from string literal to 'char *' [-Wwritable-strings]
{"pool", (getter)GetPool, NULL, "DescriptorPool"},
^
external/protobuf/python/google/protobuf/pyext/message_factory.cc:222:37: warning: ISO C++11 does not allow conversion from string literal to 'char *' [-Wwritable-strings]
{"pool", (getter)GetPool, NULL, "DescriptorPool"},
^
3 warnings generated.
This is only a small portion of all the errors that looked similar to this and piled up. Even after all of the error messages, the command never returns, and I just get a blinking cursor on an empty line.
Can someone please provide me with some exact instructions on what I should enter into the terminal to avoid these errors? I've been following Stack Overflow advice for weeks but continue to get errors.
macOS Sierra (MacBook Air)
What should I enter into the terminal, specifically?
Everything I've done up to this point follows the TensorFlow.org instructions almost exactly.

I installed for the first time using http://queirozf.com/entries/installing-cuda-tk-and-tensorflow-on-a-clean-ubuntu-16-04-install, and not only was it a very simple process, but working with tf is really easy: just source <name_of_virtual_environment>/bin/activate and then run python/python3 through that.
Bear in mind that the walkthrough in the link is for GPU TensorFlow; using the CPU TensorFlow download for your Mac instead, with the same virtual-environment process, should work just fine.
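For example, a minimal sketch of that CPU-only setup (assuming the standard tensorflow package from PyPI is what you want):
python3 -m venv tfenv                 # create the virtual environment
source tfenv/bin/activate             # activate it
pip install --upgrade pip
pip install tensorflow                # CPU-only package from PyPI
python3 -c 'import tensorflow as tf; print(tf.__version__)'   # quick sanity check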

Since you do not have a GPU, do have SSE and AVX, and are on macOS Sierra, the instructions found on Google will NOT work with 1.3. I am befuddled as to why they do not provide an exact script for this. Regardless, here is the answer to your question: http://www.josephmiguel.com/building-tensorflow-1-3-from-source-on-mac-osx-sierra-macbook-pro-i7-with-sse-and-avx/
/*
do each of these steps independently
will take around 1hr to complete all the steps regardless of machine type
*/
one time install
install anaconda3 pkg # manually download this and install the package
conda update conda
conda create -n dl python=3.6 anaconda
source activate dl
cd /
brew install bazel
pip install six numpy wheel
pip install --upgrade https://storage.googleapis.com/tensorflow/mac/cpu/protobuf-3.1.0-cp35-none-macosx_10_11_x86_64.whl
sudo -i
cd /
rm -rf tensorflow # if rerunning the script
cd /
git clone https://github.com/tensorflow/tensorflow
Step 1
cd /tensorflow
git checkout r1.3 -f
cd /
chmod -R 777 tensorflow
cd /tensorflow
./configure # accept all default settings
Step 2
// https://stackoverflow.com/questions/41293077/how-to-compile-tensorflow-with-sse4-2-and-avx-instructions
bazel build --config=opt --copt=-mavx --copt=-mavx2 --copt=-mfma //tensorflow/tools/pip_package:build_pip_package
Step 3
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
pip install /tmp/tensorflow_pkg/tensorflow-1.0.1-cp36-cp36m-macosx_10_7_x86_64.whl
Step 4
cd ~
ipython
Step 5
import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))
Step 6
pip uninstall tensorflow  # pip uninstall takes a package name, not a wheel path
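One caveat on Step 3: the wheel filename under /tmp/tensorflow_pkg depends on the branch you checked out (an r1.3 checkout will not produce a 1.0.1 wheel), so it is safer to glob it. A hedged sketch:
# Install whichever wheel the build actually produced.
pip install /tmp/tensorflow_pkg/tensorflow-*.whl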

Related

Enable multi-threading on Caffe2

When compiling my program using Caffe2 I get these warnings:
[E init_intrinsics_check.cc:43] CPU feature avx is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
[E init_intrinsics_check.cc:43] CPU feature avx2 is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
[E init_intrinsics_check.cc:43] CPU feature fma is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
Since I do want multi-threading support for Caffe2, I searched for what to do. I found that Caffe2 has to be recompiled with some arguments set when invoking cmake, or in the CMakeLists file.
Since I already had installed pytorch in a conda env, I have first uninstalled Caffe2 with:
pip uninstall -y caffe2
Then I followed the instructions from the Caffe2 docs to build it from source.
I first installed the dependencies as indicated. Then I downloaded pytorch inside my conda env with:
git clone https://github.com/pytorch/pytorch.git && cd pytorch
git submodule update --init --recursive
I think this is the moment to change the just-downloaded pytorch/caffe2/CMakeLists file. I have read that to enable multi-threading support it is sufficient to enable the option USE_NATIVE_ARCH inside this CMakeLists, but I'm not able to find such an option where I'm looking. Maybe I'm doing something wrong. Any thoughts? Thanks.
Here some details about my platform:
I'm on macOS Big Sur
My python version is 3.8.5
UPDATE:
To answer Nega this is what I've got:
python3 -c 'import torch; print(torch.__config__.parallel_info())'
ATen/Parallel:
at::get_num_threads() : 1
at::get_num_interop_threads() : 4
OpenMP not found
Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications
mkl_get_max_threads() : 4
Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
std::thread::hardware_concurrency() : 8
Environment variables:
OMP_NUM_THREADS : [not set]
MKL_NUM_THREADS : [not set]
ATen parallel backend: OpenMP
UPDATE 2:
It turned out that the Clang that comes with Xcode doesn't support OpenMP. The gcc I was using was just a symlink to Clang. In fact, after running gcc --version I got:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/c++/4.2.1
Apple clang version 12.0.0 (clang-1200.0.32.29)
Target: x86_64-apple-darwin20.3.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
I installed gcc-10 from Homebrew and set an alias like this: alias gcc='gcc-10'. Now with gcc --version this is what I get:
gcc-10 (Homebrew GCC 10.2.0_4) 10.2.0
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
I've also tried a simple Hello World for OpenMP using 8 threads, and everything seems to be working. However, after re-running the command:
python3 -c 'import torch; print(torch.__config__.parallel_info())'
I get the same outcome. Any thoughts?
AVX, AVX2, and FMA are CPU instruction sets and are not related to multi-threading. If the pip package for pytorch/caffe2 used these instructions on a CPU that didn't support them, the software wouldn't work. PyTorch installed via pip does come with multi-threading enabled, though. You can confirm this with torch.__config__.parallel_info():
❯ python3 -c 'import torch; print(torch.__config__.parallel_info())'
ATen/Parallel:
at::get_num_threads() : 6
at::get_num_interop_threads() : 6
OpenMP 201107 (a.k.a. OpenMP 3.1)
omp_get_max_threads() : 6
Intel(R) Math Kernel Library Version 2020.0.1 Product Build 20200208 for Intel(R) 64 architecture applications
mkl_get_max_threads() : 6
Intel(R) MKL-DNN v1.6.0 (Git Hash 5ef631a030a6f73131c77892041042805a06064f)
std::thread::hardware_concurrency() : 12
Environment variables:
OMP_NUM_THREADS : [not set]
MKL_NUM_THREADS : [not set]
ATen parallel backend: OpenMP
That being said, if you still want to continue building pytorch and caffe2 from source, the flag you're looking for, USE_NATIVE_ARCH, is in pytorch/CMakeLists.txt, one level up from caffe2. Edit that file and change USE_NATIVE_ARCH to ON, then continue building pytorch with python3 setup.py build. Note that the flag doesn't do what you think it does: it only allows building MKL-DNN with CPU-native optimization flags. It does not trickle down to caffe2 (except where caffe2 uses MKL-DNN, obviously).
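As a sketch (assuming PyTorch's setup.py forwards USE_* environment variables to CMake, as it does for most build options, and that Homebrew's gcc-10 is installed), the rebuild could look like:
# A shell alias does not reach child build processes, so export the
# compilers explicitly instead of relying on alias gcc='gcc-10'.
export CC=gcc-10 CXX=g++-10
cd pytorch
# USE_NATIVE_ARCH=ON is assumed to be picked up by setup.py and passed to CMake.
USE_NATIVE_ARCH=ON python3 setup.py build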

Could not load dynamic library 'libnvinfer.so.6'

I am trying to import the TensorFlow Python package normally, but I get the following error. Here is the text from the terminal:
2020-02-23 19:01:06.163940: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory
2020-02-23 19:01:06.164019: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory
2020-02-23 19:01:06.164030: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
<module 'tensorflow_core._api.v2.version' from '/home/saman/miniconda3/envs/testconda/lib/python3.7/site-packages/tensorflow_core/_api/v2/version/__init__.py'>
This is a warning, not an error. You can still use TensorFlow. The shared libraries libnvinfer and libnvinfer_plugin are optional and only required if you are using NVIDIA's TensorRT capabilities.
To suppress this and all other warnings, set the environment variable TF_CPP_MIN_LOG_LEVEL="2".
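For example, a one-off sketch that suppresses the warnings for a single run:
# Set the variable for one invocation rather than the whole shell.
TF_CPP_MIN_LOG_LEVEL=2 python3 -c 'import tensorflow as tf; print(tf.__version__)'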
TensorFlow's installation instructions list the GPU dependencies (current as of December 13, 2022); quick version checks are sketched after the list:
The following NVIDIA® software are only required for GPU support.
NVIDIA® GPU drivers version 450.80.02 or higher.
CUDA® Toolkit 11.2.
cuDNN SDK 8.1.0.
(Optional) TensorRT to improve latency and throughput for inference.
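A hedged sketch of how to check each of these (the cuDNN header path is an assumption; it varies by install method):
nvidia-smi        # driver version, plus the CUDA version it supports
nvcc --version    # CUDA toolkit version
grep -A 2 'define CUDNN_MAJOR' /usr/include/cudnn_version.h   # cuDNN version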
I got this warning as a result of an (accidental) update of the libnvinfer6 package. It got updated to 6.0.1-1+cuda10.2, while the original installation used 6.0.1-1+cuda10.1.
After I uninstalled packages referencing cuda10.2 and re-ran
sudo apt-get install -y --no-install-recommends libnvinfer6=6.0.1-1+cuda10.1 \
libnvinfer-dev=6.0.1-1+cuda10.1 \
libnvinfer-plugin6=6.0.1-1+cuda10.1
this warning went away.
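To keep apt from accidentally upgrading these again, the packages can be held at their installed versions (a sketch using apt-mark):
# Hold the TensorRT packages so a routine upgrade cannot pull in a
# build against a different CUDA version.
sudo apt-mark hold libnvinfer6 libnvinfer-dev libnvinfer-plugin6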
Most of these messages are warnings, not errors. They just mean that the libraries needed to use an Nvidia GPU are not installed, but you don't need an Nvidia GPU to use TensorFlow, so you don't need these libraries. The comment by jakub tells how to turn off the warnings:
export TF_CPP_MIN_LOG_LEVEL="2"
However, I too run Tensorflow without Nvidia stuff and there is one more message that is an error, not a warning:
2020-04-10 10:04:13.365696: E tensorflow/stream_executor/cuda/cuda_driver.cc:351] failed call to cuInit: UNKNOWN ERROR (303)
It should be irrelevant, because it too refers to CUDA, which is for Nvidia. It doesn't seem to be a fatal error, though.
A little bit of info building on jakub's answer: this can occur if you haven't installed the 'machine-learning' repo. Try this if you already installed CUDA successfully but are still getting the error:
wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
sudo apt install ./nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
sudo apt-get update
Then install TensorRT (this requires that libcudnn7 is already installed):
sudo apt-get install -y --no-install-recommends libnvinfer6=6.0.1-1+cuda10.1 \
libnvinfer-dev=6.0.1-1+cuda10.1 \
libnvinfer-plugin6=6.0.1-1+cuda10.1
You can download the zip file of TensorRT 6 and then copy the x86 Linux folder's contents into /usr/lib/cuda. Make sure the lib folder inside the downloaded x86_linux folder is renamed to lib64. After copying all the files into the cuda directory, reboot the system. CUDA and the TensorRT engine should then run smoothly on your system.
I spent about 5 hours solving this issue. In my case, it means you have the wrong version of the library. libnvinfer.so.6 is located at 'TensorRT-*/lib', and the number 6 means TensorFlow is looking for the libnvinfer from TensorRT 6. So if it says "could not load dynamic library libnvinfer.so.5", you need TensorRT 5 to run the code.
Likewise, if it shows "Could not load dynamic library 'libcudart.so.10.0'", you need the library from CUDA 10.0 to run the code.
So updating your TensorRT/CUDA/cuDNN to match your TensorFlow version should help. Note that your TensorRT, CUDA, and cuDNN versions should also match each other.
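A quick way to see which libnvinfer (if any) the dynamic linker can actually find (sketch; library locations vary by distribution):
# List the libnvinfer libraries visible to the loader.
ldconfig -p | grep libnvinfer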

How to build MultiNEAT from sources?

I'm building MultiNEAT from source, on macOS Mojave, with Python 3, and I'm hitting a snag when running "python setup.py build_ext": I get a bunch of errors. Any help is greatly appreciated.
Here is a step-by-step record of what I'm doing from scratch, if you want to follow along in your own terminal. I'm writing all of this down so others can do it themselves too, as it has taken me a while to even get here. I've gotten bits and pieces of information here and there, but there are no straight-up instructions on how to build this library.
# Change things accordingly for you.
# Define work dir. Should be empty at this point.
WDIR=/Users/luis/Documents/neat
cd $WDIR
# Setup Python virtual environment and requirements.
python3 -m venv venv
. venv/bin/activate
pip install --upgrade pip
pip install psutil numpy opencv-python
# Get Boost.
# Get the url from here: https://www.boost.org/users/download/
curl -L https://dl.bintray.com/boostorg/release/1.70.0/source/boost_1_70_0.tar.gz | tar -xz
cd boost_1_70_0/
# Get system Python include files with: python3-config --includes
# Put that path into this exported var.
export CPLUS_INCLUDE_PATH=/Library/Frameworks/Python.framework/Versions/3.7/include/python3.7m
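# (Hedged alternative: derive the include dir automatically instead of
# hard-coding it; assumes python3-config prints its -I flags first.)
# export CPLUS_INCLUDE_PATH=$(python3-config --includes | cut -d' ' -f1 | sed 's/^-I//')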
# Build Boost! This will take a bit.
./bootstrap.sh --prefix=$WDIR/boost
./b2 install
# Get MultiNEAT.
cd $WDIR
git clone https://github.com/peter-ch/MultiNEAT.git
cd MultiNEAT/
# Setup build. (Is this incomplete?)
export MN_BUILD=boost
export PREFIX=$WDIR/boost
# Build MultiNEAT!
python setup.py build_ext
# Supposedly, I'd do "python setup.py install" after, but errors are happening now :(
This is only the top part of when errors start happening, but most are similar:
In file included from src/Innovation.cpp:34:
In file included from src/Innovation.h:37:
src/Genome.h:689:19: error: expected ':'
public void set_children()
^
:
src/Genome.h:691:49: error: indirection requires pointer operand ('std::__1::vector<double,
std::__1::allocator<double> >::size_type' (aka 'unsigned long') invalid)
for(unsigned int ix = 0; ix < 2**coord.size(); ix++){
^~~~~~~~~~~~~
Here is what works:
First install the dependencies you want to use:
Boost 1.49 and above with Boost.Python and Boost.Serialization (optional)
ProgressBar (Python package) (optional)
NumPy (Python package) (optional)
Matplotlib (Python package) (optional)
OpenCV 2.3 and above (with Python bindings) (optional)
Cython (if you want Python bindings)
Then clone this git repo (this one seems to be older, but it does not throw the errors you encountered).
And finally, enter the source directory and run "sudo python setup.py install".
This worked even with Python 3.7.
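Assuming the install succeeded, a quick import check (the module name MultiNEAT is taken from the project's own examples):
python3 -c 'import MultiNEAT as NEAT; print(NEAT.__file__)'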

Error building Tensorflow on CentOS 7

I am trying to compile TensorFlow (r1.3) on CentOS 7.
My environment: gcc (g++) 7.2.0, bazel 0.5.3, python3 (with all necessary dependencies listed on the TensorFlow website), swig 3.0.12, openjdk 8. Everything is installed in the user's scope, without root access.
Whenever I try to build the Python package by invoking the following command
"bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package"
I am getting this error:
.....
2017-08-24 11:40:35.734659: W tensorflow/core/framework/op_gen_lib.cc:372] Squeeze can't find input squeeze_dims to rename
ERROR: /home/data/software/tensorflow/tensorflow/python/BUILD:2762:1: Couldn't build file tensorflow/python/pywrap_tensorflow_internal.cc: SWIGing tensorflow/python/tensorflow.i failed (Exit 1).
...
However, building the C++ lib (bazel build --config=opt //tensorflow:libtensorflow_cc.so) works without any issues.
Am I doing something wrong?
Update 25.08.2017:
OK, it seems that SWIG is built automatically from source when running bazel build. The shipped SWIG version is 3.0.8. But still, I have no clue how to solve this problem.
OK, problem solved by using bazel version 0.5.1. Newer versions produce the same error.
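A sketch of pinning bazel to 0.5.1 (the installer URL pattern is assumed from Bazel's GitHub releases page; adjust for your platform):
bazel version   # confirm which bazel the build is actually using
curl -LO https://github.com/bazelbuild/bazel/releases/download/0.5.1/bazel-0.5.1-installer-linux-x86_64.sh
chmod +x bazel-0.5.1-installer-linux-x86_64.sh
./bazel-0.5.1-installer-linux-x86_64.sh --user   # installs to ~/bin, no root needed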

Trying to install Theano but don't have an Nvidia card

I am following:
http://deeplearning.net/software/theano/install_windows.html#install-windows
to install Theano. I just want to play with the code; I don't need a GPU to improve my speed.
I don't have an Nvidia card, and when I try to install CUDA, the installation fails. I watch as the installation tool deletes the files I need.
I am using Anaconda python so I commented this line:
REM CALL %SCISOFT%\WinPython-64bit-2.7.9.4\scripts\env.bat
In the:
C:\SciSoft\env.bat
file. I gave up and tried to install Theano with easy_install.
When I try to import Theano from Python, it fails with:
ton of stuff
Problem occurred during compilation with the command line below:
C:\SciSoft\TDM-GCC-64\bin\g++.exe -shared -g -march=bdver2 -mmmx -mno-3dnow -mss
more stuff
C:\Users\xxx\Anaconda\libs/python27.lib: error adding symbols: File in wrong format
collect2.exe: error: ld returned 1 exit status
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input> in <module>()
----> 1 import theano
C:\Users\xxx\Anaconda\lib\site-packages\theano-0.7.0-py2.7.egg\theano\__init__.pyc in <module>()
Even more stuff
Exception: Compilation failed (return status=1): C:\Users\xxxx\Anaconda\li ... collect2.exe: error: ld returned 1 exit status
If you don't need to run on GPU then don't worry about installing Visual Studio or CUDA. I think you just need Anaconda, but maybe also TDM GCC.
From a clean environment I install the latest version of Anaconda then run
conda install mingw libpython
I'd recommend installing Theano from GitHub (the bleeding-edge version), since the "stable" release is not updated often and there are usually many significant improvements (especially for performance) in the bleeding-edge version compared to the stable one.
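A sketch of that bleeding-edge install via pip (pointing at the project's public GitHub repository):
pip install --upgrade --no-deps git+https://github.com/Theano/Theano.git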
There is no need to perform all the steps in the "Configuring the Environment" section; just make sure your C++ compiler is in the PATH.
If Theano fails to work after these minimal installation instructions, I'd recommend solving each problem on a case-by-case basis instead of trying to run the full installation instructions provided in the documentation (which may be out of date).
