ERROR WHEN IMPORTING PYTORCH (The filename or extension is too long) - python

I'm using Anaconda to run my Transformers project locally instead of in Google Colab.
I've created a new environment (tf_gpu) and installed (supposedly) everything I need.
Everything works fine, but when I simply try to import PyTorch, this error appears:
[WinError 206] The filename or extension is too long: 'C:\\Users\\34662\\anaconda3\\envs\\tf_gpu\\lib\\site-packages\\torch\\lib'
Clearly the path is not long enough to trigger this error.
My Python version is 3.8, and my GPU is an Nvidia GeForce GTX 1650, so it shouldn't be a GPU problem.
Does anybody know why this happens?
Any help is welcome at this point; I don't know how to solve this.
Here is a screenshot of the complete error message.
Thank you in advance.

Your problem is that this is not actually a "path too long" error; it is a file-not-found error, which means that PyTorch is not correctly installed.
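A quick way to confirm that diagnosis is to import torch in the same environment and run a trivial operation. This is just a minimal sanity-check sketch, not specific to this setup:

import torch

print(torch.__version__)            # fails here if the installation is broken
print(torch.cuda.is_available())    # True only if the CUDA build and driver are working
x = torch.rand(2, 3)
print(x @ x.t())                    # trivial op to confirm the runtime itself works

If the import fails, reinstalling PyTorch into the tf_gpu environment (following the official selector on pytorch.org for your CUDA version) is usually the fix.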

Related

Why can't I use Tensorflow on Windows 7?

I'm in trouble. I did my best and I'm still having problems with TensorFlow. I just wanted to use it, but I can't, and that makes me extremely frustrated. Who knows when I'll be able... Anyway, I'll tell you what happened; maybe some blessed soul will clear my doubts once and for all.
I have a Windows 7 notebook, my CPU apparently doesn't support AVX, and I don't have a GPU. I tried to install two versions of TensorFlow that don't require AVX, one at a time of course; I didn't try to install both at the same time, haha.
With python 3.6: tensorflow-1.5.0-cp36-cp36m-win_amd64.whl
When using this, an error appears: ImportError: DLL load failed with error code -1073741795.
Failed to load the native TensorFlow runtime
With python 3.7: tensorflow-1.11.0-cp37-cp37m-win_amd64.whl
When using this, an error appears: ImportError: DLL load failed with error code 3221225501.
Failed to load the native TensorFlow runtime.
That is, nothing worked.
Some settings on my PC:
Microsoft Windows 7 Professional,
Processor: Intel(R) Celeron(R) CPU 847 @ 1.10GHz, 1100 MHz, 2 Cores, 2 Logical Processors,
System Type: x64-based PC,
Physical Memory (RAM): 4.00GB,
Please help me.
Try using the CPU-only TensorFlow package:
pip install tensorflow-cpu
https://pypi.org/project/tensorflow-cpu/
If that doesn't work, you can try this repo: https://github.com/fo40225/tensorflow-windows-wheel. It provides legacy and low-end CPU (without AVX) support.
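If the install succeeds, a minimal sanity check (this works for both 1.x and 2.x builds) is to import the package and confirm it is a CPU-only build:

import tensorflow as tf

print("TensorFlow version:", tf.__version__)
print("Built with CUDA:", tf.test.is_built_with_cuda())  # expect False for a CPU-only wheel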

Tensorflow CUDA - CUPTI error: CUPTI could not be loaded or symbol could not be found

I use TensorFlow v1.14.0 on Windows 10. Here is how the relevant entries look in my PATH:
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\bin
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\libnvvp
C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common
C:\Users\sinthes\AppData\Local\Programs\Python\Python37
C:\Users\sinthes\AppData\Local\Programs\Python\Python37\Scripts
C:\Program Files\NVIDIA Corporation\NVIDIA NvDLISR
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\cuda\bin
Maybe also worth mentioning, in case it is relevant: I use Sublime Text 3 for development and I do not use Anaconda. I find it a bit cumbersome to update TensorFlow in a conda environment, so I just use Sublime Text right now. (I was using Anaconda (Spyder) previously, but I uninstalled it from my computer.)
Things seem to work fine except for some occasional strange warnings. But one consistent warning I get whenever I run the fit function is the following:
E tensorflow/core/platform/default/device_tracer.cc:68] CUPTI error: CUPTI could not be loaded or symbol could not be found.
And here is how I call the fit function:
history = model.fit(x=train_x,
                    y=train_y,
                    batch_size=BATCH_SIZE,
                    epochs=110,
                    verbose=2,
                    callbacks=[tensorboard, checkpoint, reduce_lr_on_plateau],
                    validation_data=(dev_x, dev_y),
                    shuffle=True,
                    class_weight=class_weight,
                    steps_per_epoch=None,
                    validation_steps=None)
I just wonder why I see the CUPTI error message at runtime. It is only printed once. Is it something I need to fix, or can it be ignored? The message does not tell me anything concrete enough to take any action.
Add this to PATH on Windows:
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\extras\CUPTI\libx64
The NVIDIA® CUDA Profiling Tools Interface (CUPTI) is a dynamic library that enables the creation of profiling and tracing tools that target CUDA applications.
CUPTI seems to have been added by the TensorFlow developers to allow profiling. You can simply ignore the error if you don't mind the warning, or adapt your environment PATH so the dynamically linked library (DLL) can be found during execution.
Inside your CUDA installation directory there is an extras\CUPTI\lib64 directory that contains the cupti64_101.dll that TensorFlow is trying to load. Adding that directory to your PATH should resolve the issue, e.g.,
SET PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\extras\CUPTI\lib64;%PATH%
N.B. If you get an INSUFFICIENT_PRIVILEGES error next, try running your program as administrator.
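If editing the system PATH is inconvenient, the same effect can be achieved from inside the script, as long as the tweak happens before TensorFlow is imported. A minimal sketch, assuming the default CUDA 10.0 install location (adjust the version and path to your machine):

import os

# Assumed install location; change the CUDA version/path to match your setup.
cupti_dir = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\extras\CUPTI\lib64"

if os.path.isdir(cupti_dir) and cupti_dir not in os.environ.get("PATH", ""):
    os.environ["PATH"] = cupti_dir + os.pathsep + os.environ.get("PATH", "")
    # Note: on Python 3.8+ Windows no longer searches PATH for DLL dependencies;
    # os.add_dll_directory(cupti_dir) would be needed there instead.

import tensorflow as tf  # import after the PATH tweak so the DLL can be resolved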
This answer is for Ubuntu 16.04.
I had this issue when I upgraded to TensorFlow 1.14 with Python 2.7 and Python 3.6. I had to add /usr/local/cuda/extras/CUPTI/lib64 to LD_LIBRARY_PATH with export LD_LIBRARY_PATH=/usr/local/cuda/extras/CUPTI/lib64:$LD_LIBRARY_PATH and then log out and log back in; source ~/.bashrc didn't help. Note that my cuda folder pointed to cuda-10.0.
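A quick, minimal check (paths assumed from the answer above) to confirm the directory exists and is actually on LD_LIBRARY_PATH:

import os

cupti_dir = "/usr/local/cuda/extras/CUPTI/lib64"  # assumed location from the answer above
print("Directory exists:", os.path.isdir(cupti_dir))
print("On LD_LIBRARY_PATH:", cupti_dir in os.environ.get("LD_LIBRARY_PATH", ""))
# Note: LD_LIBRARY_PATH is read by the loader when the process starts,
# so setting it from inside a running Python process has no effect.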
I ran into this same issue. This is what fixed it for me, in case someone else has a similar problem.
The error I received:
function cupti_interface_->Subscribe( &subscriber_, (CUpti_CallbackFunc)ApiCallback, this)failed with error CUPTI could not be loaded or symbol could not be found.
Windows Server 2019
TensorFlow 2.5
CUDA 11.2 (the CUDA_PATH environment variable is set and added to the PATH environment variable)
cuDNN 8.1.0
I had already set C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\extras\CUPTI\lib64 in the PATH environment variable but was still receiving the error.
Running where /r c:\ cupti*.dll in a cmd prompt found DLLs in the c:\Program Files\NVIDIA Corporation\Nsight Systems 2020.4.3\target-windows-x64\ directory. Simply adding this directory to the PATH environment variable fixed the error.
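For anyone who prefers doing that search from Python rather than cmd, a rough equivalent of the where command (the search root below is an assumption; searching all of C:\ works too but is slow) looks like:

import fnmatch
import os

root = r"C:\Program Files"  # assumption: narrow the search for speed
for dirpath, _dirnames, filenames in os.walk(root):
    for name in fnmatch.filter(filenames, "cupti*.dll"):
        print(os.path.join(dirpath, name))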
I had a similar error when trying to get a TensorBoard graph; I think it only affects you if you plan to use TensorBoard.
I found the solution in this post, but it is for Linux:
https://gist.github.com/Brainiarc7/6d6c3f23ea057775b72c52817759b25c
I think you need to create a library configuration file for cupti.
Here is what solved "my" problem:
Windows 10, Tensorflow-gpu 2.4
The first issue was that it was unclear exactly which cupti64 version it was trying to load. With that in mind, I did a search for all DLLs named cupti*.
I then copied them all (yes, I know it's a hack, but given the limited information...) into my
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0\extras\CUPTI\lib64
folder (cupti64_2020.1.0.dll was in there already).
I then also needed to set the folder permissions to get it to work, which was strange, as I was running VS as admin.
Here is what solved "my" problem:
I just replaced my TensorFlow v1.14 with TensorFlow v1.13.1, and there are no more CUPTI error messages; even some other strange warnings and problems have disappeared. All issues obviously have specific causes, but TensorFlow often does not provide understandable error/warning messages that give a good idea of how to solve the issue, and I end up spending hours (even days) on such strange problems, which reduces my productivity significantly.
One general lesson for me (that might be relevant to share here) is that I should not be in a hurry to upgrade my TensorFlow installation to the latest version. The latest one is almost never stable; whenever I tried, I ended up spending a significant amount of time on problems caused by TensorFlow. Poor documentation and error messages make it very difficult to work with.
If anyone has a better answer, they are more than welcome to share their insights on the issue I described in this question.
I also just ran into this issue, exactly as jreeves did. I solved it by following jreeves' method (above). (Thanks to jreeves for the work in finding and documenting the solution.)
My setup:
Windows 10
GPU support: True
CUDA support: True
TensorFlow: 2.4.1
Python version: 3.8.8
TensorBoard version: 2.4.1
CUDA 11.1
cuDNN 8.0.5

tf-faster-rcnn on Windows

Has anyone implemented the TensorFlow version of Faster-RCNN on Windows?
I found some related repos, as follows:
1. Faster-RCNN for TensorFlow on Linux
https://github.com/endernewton/tf-faster-rcnn
2. Faster-RCNN for Caffe on Linux
https://github.com/rbgirshick/py-faster-rcnn
3. Faster-RCNN for Caffe on Windows
https://github.com/MrGF/py-faster-rcnn-windows
I successfully compiled 'cpu_nms', but encountered an error when trying to run demo.py:
tensorflow.python.framework.errors_impl.InvalidArgumentError: ValueError: Buffer dtype mismatch, expected 'int_t' but got 'long long'
P.S. I didn't compile gpu_nms because I don't know how to deal with 'kernel.cu' and 'gpu_nms.pyx'. I tried to follow what https://github.com/MrGF/py-faster-rcnn-windows did in 'setup_cuda.py' but failed; exactly the same error as in https://github.com/MrGF/py-faster-rcnn-windows/issues/17 occurred:
LINK : fatal error LNK1181: cannot open input file 'ID=2.obj'
Has anyone implemented the TensorFlow version of Faster-RCNN on Windows, or can anyone give me some advice?
Thanks a lot.
LINK : fatal error LNK1181: cannot open input file 'ID=2.obj'
This error comes from the link command: the command string contains a segment 'ID=2.obj'. You can fix it by removing that segment from the command.

Compiling binary with tensorflow library for cpu: Cannot find cuda library?

In development, I have been using the GPU-accelerated TensorFlow wheel:
https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.2.1-cp35-cp35m-linux_x86_64.whl
I am attempting to deploy my trained model along with an application binary for my users. I compile with PyInstaller (3.3.dev0+f0df2d2bb) on Python 3.5.2 to turn my application into a binary.
For deployment, I install the CPU version: https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.2.1-cp35-cp35m-linux_x86_64.whl
However, after a successful compilation, I run my program and receive the infamous TensorFlow CUDA error:
tensorflow.python.framework.errors_impl.NotFoundError:
tensorflow/contrib/util/tensorflow/contrib/cudnn_rnn/python/ops/_cudnn_rnn_ops.so:
cannot open shared object file: No such file or directory
Why is it looking for CUDA when I've only got the CPU version installed? (Not to mention that I'm still on my development machine, which has CUDA, so it should find it anyway; I can use tensorflow-gpu/CUDA fine in uncompiled scripts. But this is irrelevant, because deployment machines won't have CUDA.)
My first thought was that I'm somehow importing the wrong TensorFlow, but I not only ran pip uninstall tensorflow-gpu, I also deleted the tensorflow-gpu package in /usr/local/lib/python3.5/dist-packages/.
Any ideas what could be happening? Maybe I need to start using a virtualenv...
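Before reaching for a virtualenv, one minimal check that often settles the "wrong TensorFlow" question is to print where the imported package actually lives and whether it was built with CUDA; run this both as a plain script and from inside the PyInstaller bundle and compare:

import tensorflow as tf

print("Version:", tf.__version__)
print("Location:", tf.__file__)                          # reveals a stale or unexpected install
print("Built with CUDA:", tf.test.is_built_with_cuda())  # expect False for the CPU wheel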

undefined symbol: cudnnCreate in ubuntu google cloud vm instance

I'm trying to run a TensorFlow Python script in a Google Cloud VM instance with GPU enabled. I have followed the process for installing GPU drivers, CUDA, cuDNN and TensorFlow. However, whenever I try to run my program (which runs fine on a supercomputing cluster), I keep getting:
undefined symbol: cudnnCreate
I have added the following to my ~/.bashrc:
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda-8.0/lib64:/usr/local/cuda-8.0/extras/CUPTI/lib64:/usr/local/cuda-8.0/lib64"
export CUDA_HOME="/usr/local/cuda-8.0"
export PATH="$PATH:/usr/local/cuda-8.0/bin"
but it still does not work and produces the same error.
Answering my own question: the issue was not that the library was not installed; the installed library was the wrong version, hence the symbol could not be found. In this case it was cuDNN 5.0. However, even after installing the right version it still didn't work, due to incompatibilities between the versions of the driver, CUDA and cuDNN. I solved all these issues by re-installing everything, including the driver, taking the TensorFlow library requirements into account.
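For anyone debugging a similar mismatch, a minimal check of which cuDNN the dynamic loader can actually resolve (the soname candidates below are assumptions; adjust them to the versions your TensorFlow build expects):

import ctypes

# Try a few candidate sonames and report which ones the dynamic loader can resolve.
for name in ("libcudnn.so", "libcudnn.so.5", "libcudnn.so.6"):
    try:
        ctypes.CDLL(name)
        print(name, "-> loaded")
    except OSError as exc:
        print(name, "->", exc)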
