Optimization failure in theano - python

I am using Fedora with the Anaconda Python environment. I have an NVIDIA 960M GPU, for which I have installed the required drivers and the CUDA toolkit. But when I try to run the Theano tests, I get the following error (part of a much longer error output):
EE.EEEERROR (theano.gof.opt): Optimization failure due to: constant_folding
ERROR (theano.gof.opt): node: DimShuffle{x}(TensorConstant{2})
ERROR (theano.gof.opt): TRACEBACK:
ERROR (theano.gof.opt): Traceback (most recent call last):
I first saw the error when I was trying to compile a simple function y. Searching for a solution showed that many people had hit the same problem with the test function, but I found no definite solution. I followed the Theano documentation and set $CUDA_ROOT to my CUDA root folder, but to no avail.
I'm using Theano version 0.8.2 and NumPy 1.11.1, both from the conda repos. It seems to be a GPU issue. But if the GPU has problems, shouldn't Theano fall back to the CPU?
Any help would be highly appreciated. Thanks!
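One way to check whether the GPU is involved at all is to pin Theano to the CPU and rerun the same tests. The sketch below only builds an environment with THEANO_FLAGS set to device=cpu, which is a documented Theano flag; the helper name cpu_only_env is made up for illustration, and it overwrites any THEANO_FLAGS value that is already set. It does not claim to fix the constant_folding error.

```python
import os


def cpu_only_env(base_env=None):
    """Return a copy of the environment with Theano pinned to the CPU.

    cpu_only_env is a hypothetical helper name, not part of Theano.
    """
    env = dict(os.environ if base_env is None else base_env)
    # device=cpu must be in place before `import theano` runs,
    # so it is set through the environment rather than at runtime.
    env["THEANO_FLAGS"] = "device=cpu"
    return env


if __name__ == "__main__":
    env = cpu_only_env()
    print(env["THEANO_FLAGS"])
```

The returned dictionary can be passed as the env argument of subprocess.run when launching the tests. If the same error still appears with device=cpu, the GPU setup is probably not the cause.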

Related

ERROR WHEN IMPORTING PYTORCH (The filename or extension is too long)

I'm using Anaconda to run my Transformers project locally in Google Colab.
I've created a new environment (tf_gpu) and installed (supposedly) everything I need.
And everything works fine, but when I try to simply import pytorch, this error appears:
[WinError 206] The filename or extension is too long: 'C:\\Users\\34662\\anaconda3\\envs\\tf_gpu\\lib\\site-packages\\torch\\lib'
The path is clearly not long enough to trigger this error.
My Python version is 3.8, and my GPU is an NVIDIA GeForce GTX 1650, so it shouldn't be a GPU problem.
Does anybody know why this happens?
Any help is good at this point, I don't know how to solve this.
Here I leave a screenshot of the complete error message
Thank you in advance.
The error is not really a too-long-path error. It is a file-not-found error, which means that PyTorch is not installed correctly.
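A quick way to tell the two explanations apart is to check the reported directory directly: how long the path is, and whether it exists. This is a minimal sketch; describe_path is a hypothetical helper, and 260 is the classic Win32 MAX_PATH limit, used here as an assumption about which limit applies.

```python
import os

WINDOWS_MAX_PATH = 260  # classic Win32 MAX_PATH limit, in characters


def describe_path(path):
    """Report the length of a path and whether it is an existing directory."""
    return {
        "length": len(path),
        "exceeds_max_path": len(path) >= WINDOWS_MAX_PATH,
        "exists": os.path.isdir(path),
    }


if __name__ == "__main__":
    # Path copied from the error message in the question.
    torch_lib = r"C:\Users\34662\anaconda3\envs\tf_gpu\lib\site-packages\torch\lib"
    print(describe_path(torch_lib))
```

If exceeds_max_path is False and exists is also False, the directory is simply missing, which points to a broken installation rather than a path-length problem.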

Install keras and tensorflow using Rstudio

While trying to follow the instructions for installing Keras and TensorFlow in RStudio (https://keras.rstudio.com/index.html), I get the following error. It is a work computer running Windows 7. I am not familiar with Python, but I believe I have Python 3.6 installed correctly (I am able to run simple Python code in the Spyder IDE). Thanks in advance for any suggestions on how to get this working.
> install_keras()
Creating r-tensorflow conda environment for TensorFlow installation...
Solving environment: ...working... failed
CondaHTTPError: HTTP 000 CONNECTION FAILED for url
<https://repo.continuum.io/pkgs/main/noarch/repodata.json.bz2>
Elapsed: -
An HTTP error occurred when trying to retrieve this URL.
HTTP errors are often intermittent, and a simple retry will get you on your
way.
ConnectionError(MaxRetryError("HTTPSConnectionPool(host='repo.continuum.io', port=443): Max retries exceeded with url: /pkgs/main/noarch/repodata.json.bz2 (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x000000000474D860>: Failed to establish a new connection: [Errno 11004] getaddrinfo failed',))",),)
Error: Error 1 occurred creating conda environment r-tensorflow
In addition: Warning message:
running command '"C:\Users\...\...\Local\CONTIN~1\ANACON~1\Scripts\conda.exe" "create" "--yes" "--name" "r-tensorflow" "python=3.6"' had status 1
Installing Keras and TensorFlow using install_keras() isn't required to use the Keras R package. You can do a custom installation of Keras (and desired backend) as described on the Keras website and the Keras R package will find and use that version.
Source
So you can work around the firewall issue with a custom installation. The R package keras will then find that installation automatically. See the linked source for more information on how to do a custom installation.
Edit: there is a similar question that has been answered here. That poster goes into changing the proxy settings to circumvent the firewall. I cannot mark this question as a duplicate because of the active bounty.
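The "getaddrinfo failed" part of the error says that the host name could not be resolved, which fits a firewall or proxy problem. A minimal sketch of checking that from Python follows. The helper name can_resolve and its resolver parameter are assumptions made for illustration; the parameter only exists so the check can be exercised without network access.

```python
import socket


def can_resolve(host, port=443, resolver=socket.getaddrinfo):
    """Return True if `host` resolves, False if name resolution fails.

    can_resolve is a hypothetical helper; by default it uses the
    standard library's socket.getaddrinfo.
    """
    try:
        resolver(host, port)
        return True
    except socket.gaierror:
        # Same failure class as the "getaddrinfo failed" in the conda output.
        return False


# Example call (needs network access):
#   can_resolve("repo.continuum.io")
```

If this returns False on the work computer but True on another network, conda itself is probably fine and the proxy settings are the thing to look at.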
I suggest you first update conda in a terminal:
conda update --all
Then run the following commands in R:
install.packages("tensorflow")
library(keras)
to_categorical(0:3)
You could also try the following if you get any errors while installing tensorflow and keras:
install.packages("tensorflow")
install.packages("keras")
install_keras()
install_tensorflow()
There is a good answer here:
https://github.com/rstudio/keras/issues/649
(scroll down to the answer by skeydan)
Here is the answer:
First install tensorflow directly from GitHub, as in
devtools::install_github("rstudio/tensorflow")
devtools::install_github("rstudio/keras")
Then do
tensorflow::install_tensorflow()
tensorflow::tf_config()
which should give you version 1.12.
We have been installing TF 1.10 until yesterday because of a bug that will only be fixed in TF 1.13 (which should be out anytime but unfortunately isn't yet). Now, with people having installation problems due to incompatibilities with numpy as installed by conda, we decided to switch to TF 1.12, and as soon as TF 1.13 is actually available, we'll install that by default.
If you still run into problems installing, please open another issue and describe the problem, indicating the output of
reticulate::py_discover_config()
reticulate::use_condaenv("r-tensorflow")
reticulate::py_config()
Thanks!

MXNet ML lib C++ segmentation fault on OS X

I have a problem with Apache MXNet machine learning library on OS X.
I have been able to run Python version of Lenet, convolutional neural network.
I installed these with pip under both Anaconda Python 2.7 and 3.6.
conda create -n mxnet27 python=2.7
conda info --envs
source activate mxnet27
conda list
pip install mxnet==0.12.1
But when I run the C++ example file cpp-package/example/lenet.cpp, I get this segfault:
Segmentation fault: 11
This is the place in the code where the segfault is thrown:
Symbol conv1 =
Convolution("conv1", data, conv1_w, conv1_b, Shape(5, 5), 20);
I get a similar segfault for the other C++ examples.
I have built MXNet on OS X 10.13.2.
I disabled as many libraries as possible, e.g. OpenCV and CUDA.
On Simon Corston-Oliver's suggestion I upgraded to MXNet 1.0.0, but that version did not compile with Clang on OS X. Error message:
operator_tune.h:150:36: note: add an explicit instantiation declaration to suppress this
warning if 'mxnet::op::OperatorTuneByType<float>::tuning_mode_' is explicitly instantiated in another translation unit
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/c++/v1/unordered_map:601:15: error: object of type 'std::__1::pair<int,
mxnet::test::perf::TimingInstrument::Info>' cannot be assigned because its copy assignment operator is implicitly deleted
I don't know of a specific issue with v0.12 that would lead to a segfault but before we dig in, I'd recommend upgrading to v1.0 which was released 2017-12-04.
If you still encounter the same problem with 1.0 we can work to debug.
I found a solution to compiling MXNet 1.0.0 posted here by helloniklas:
https://github.com/apache/incubator-mxnet/issues/9217
It involved only using make instead of CMake.
This solution worked for me and the code compiled.
The C++ examples now run without the segfault, but documentation is scarce. I only got one of them to do training.

Compiling binary with tensorflow library for cpu: Cannot find cuda library?

In development, I have been using the gpu-accelerated tensorflow
https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.2.1-cp35-cp35m-linux_x86_64.whl
I am attempting to deploy my trained model along with an application binary for my users. I use PyInstaller (3.3.dev0+f0df2d2bb) on Python 3.5.2 to package my application into a binary.
For deployment, I install the cpu version, https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.2.1-cp35-cp35m-linux_x86_64.whl
However, after a successful build, when I run my program I get the infamous TensorFlow CUDA error:
tensorflow.python.framework.errors_impl.NotFoundError:
tensorflow/contrib/util/tensorflow/contrib/cudnn_rnn/python/ops/_cudnn_rnn_ops.so:
cannot open shared object file: No such file or directory
Why is it looking for CUDA when I've only got the CPU version installed? (Besides, I'm still on my development machine with CUDA, so it should find it anyway. I can use tensorflow-gpu/CUDA fine in uncompiled scripts. But this is irrelevant, because deployment machines won't have CUDA.)
My first thought was that I was somehow importing the wrong TensorFlow. But I not only ran pip uninstall tensorflow-gpu, I also deleted the tensorflow-gpu directory in /usr/local/lib/python3.5/dist-packages/.
Any ideas what could be happening? Maybe I need to start using a virtualenv.
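To rule out a leftover GPU build, it can help to list every installed distribution whose name starts with "tensorflow" from the same interpreter that PyInstaller uses. This is only a sketch: installed_tensorflow_packages is a made-up helper, and importlib.metadata needs Python 3.8 or later, which is newer than the 3.5.2 mentioned above (on older versions, pip list shows the same information).

```python
from importlib import metadata


def installed_tensorflow_packages(dist_names=None):
    """Return the sorted names of installed distributions starting with 'tensorflow'.

    Hypothetical helper; `dist_names` can be supplied to filter a given list
    instead of scanning the current environment.
    """
    if dist_names is None:
        dist_names = [d.metadata["Name"] for d in metadata.distributions()]
    # Skip entries with missing names (broken installs can report None).
    return sorted({n for n in dist_names if n and n.lower().startswith("tensorflow")})


if __name__ == "__main__":
    print(installed_tensorflow_packages())
```

If both tensorflow and tensorflow-gpu show up, the environment still contains GPU pieces, and a clean virtualenv with only the CPU wheel is a reasonable next step.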

Inception v3 guide on tensorflow broken for C++ and python

I'm following the guide here on running the pretrained inception v3 https://www.tensorflow.org/versions/r0.11/tutorials/image_recognition/index.html
However, when I try the python version, I get:
python classify_image.py
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcurand.so locally
Traceback (most recent call last):
File "classify_image.py", line 227, in <module>
tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
TypeError: run() got an unexpected keyword argument 'argv'
OK, fine, never mind. Let me try the C++ version.
I downloaded the model and ran the bazel command:
➜ tensorflow git:(master) ✗ bazel build tensorflow/examples/label_image/...
.......
ERROR: /storage/git/tensorflow/tensorflow/tensorflow.bzl:636:21: syntax error at '=': expected expression.
ERROR: /storage/git/tensorflow/tensorflow/tensorflow.bzl:711:1: nested functions are not allowed. Move the function to top-level.
ERROR: /storage/git/tensorflow/tensorflow/tensorflow.bzl:739:1: nested functions are not allowed. Move the function to top-level.
ERROR: /storage/git/tensorflow/tensorflow/tensorflow.bzl:773:1: nested functions are not allowed. Move the function to top-level.
ERROR: /storage/git/tensorflow/tensorflow/tensorflow.bzl:776:1: nested functions are not allowed. Move the function to top-level.
ERROR: com.google.devtools.build.lib.packages.BuildFileContainsErrorsException: error loading package '': Extension 'tensorflow/tensorflow.bzl' has errors.
INFO: Elapsed time: 0.600s
...Okay then. Neither seems to work. Or perhaps I'm doing this wrong. Does anyone have any guidance? :)
I'm using TensorFlow 0.11 on Ubuntu 16 with the Anaconda distribution of Python 3.5.
Thanks!
If it helps anyone:
Solving the C++ problem: update Bazel to the correct version (you likely installed TensorFlow ages ago and then git pulled the latest source, which requires a newer Bazel version).
Solving the Python problem: remove the argv argument from the tf.app.run() call.
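If the same script has to work with both older and newer TensorFlow versions, the argv argument can be passed only when the installed run function accepts it. The sketch below shows that idea with the standard library's inspect module. The name run_compat is hypothetical, and it assumes the run function is a normal Python callable whose signature can be inspected.

```python
import inspect


def run_compat(run_fn, main, argv):
    """Call run_fn with argv only if its signature has an `argv` parameter.

    run_compat is a hypothetical wrapper; in the script from the question,
    run_fn would be tf.app.run.
    """
    params = inspect.signature(run_fn).parameters
    if "argv" in params:
        return run_fn(main=main, argv=argv)
    # Older signature: fall back to calling without argv.
    return run_fn(main=main)


# In classify_image.py this would replace the failing line, for example:
#   run_compat(tf.app.run, main, [sys.argv[0]] + unparsed)
```

On a version without the argv parameter, the unparsed arguments are not forwarded, which matches the effect of removing argv by hand.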
