please use torch.load with map_location=torch.device('cpu') - python

I am running a Python program, but I do not have a GPU. What can I do to make Python use the CPU instead of the GPU?
$ python extract_feature.py --data mnist --net checkpoint_4.pth.tar --features pretrained
It gives me the following error:
=> RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
The photo shows the structure of my Python project:

I ran into a similar error. The following workaround solved it (this applies if your model is saved in a PyTorch format such as .pth):
import torch

MODEL_PATH = 'Somemodelname.pth'
# model must already be an instance of the matching architecture
model.load_state_dict(torch.load(MODEL_PATH, map_location=torch.device('cpu')))
If you want a specific GPU on your machine to be used instead, pass its index (for example 0 for the first GPU):
map_location = torch.device('cuda:0')
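As a rough sketch of how the two cases above fit together, the helper below picks a map_location string: 'cpu' when CUDA is unavailable or no GPU index is requested, otherwise 'cuda:<index>'. The names pick_map_location and load_checkpoint are made up for illustration and are not part of PyTorch; torch.load itself does accept a device string for map_location.

```python
def pick_map_location(cuda_available, device_id=None):
    """Return a map_location string for torch.load (hypothetical helper)."""
    if cuda_available and device_id is not None:
        return f"cuda:{device_id}"
    return "cpu"


def load_checkpoint(path, device_id=None):
    """Load a checkpoint onto the CPU, or onto a given GPU if one is usable."""
    import torch  # imported lazily so pick_map_location works without torch

    location = pick_map_location(torch.cuda.is_available(), device_id)
    return torch.load(path, map_location=location)
```

For the checkpoint in the question this would be called as load_checkpoint('checkpoint_4.pth.tar'), which on a CPU-only machine passes map_location='cpu' to torch.load.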

Related

MemoryError Precedes BrokenPipeError While Training CNN

Generally for running pip with no cache we use --no-cache-dir, like
pip install pytorch --no-cache-dir.
I downloaded a CNN model that I want to use from GitHub.
The first two lines of execution
python generate_dataset.py --is_train=True --use_phase=True --chip_size=100 --patch_size=94 --use_phase=True --dataset=soc
python generate_dataset.py --is_train=False --use_phase=True --chip_size=128 --patch_size=128 --use_phase=True --dataset=soc
executed successfully. But while running
python train.py --config_name=config/AConvNet-SOC.json
it gives a MemoryError.
The publisher of the above repository uses 32 GB of RAM and an 11 GB GPU, but I have 8 GB of RAM and an 8 GB GPU.
Here is what I have done:
I thought of running it without a cache, like this:
python train.py --config_name=config/AConvNet-SOC.json --no-cache-dir
But it throws the error below:
FATAL Flags parsing error: Unknown command line flag 'no-cache-dir' Pass --helpshort or --helpfull to see help on flags.
I think this is because the no-cache-dir argument is not defined in the script through absl.flags. Does Python support running without a cache directory?
I am able to avoid the error by decreasing the number of epochs and the batch_size, but I want to run it for the full number of epochs.
Using zero_grad() in PyTorch sets the gradients to zero for every minibatch, so that the GPU won't run out of memory. But it is already used in the code I am running, in _base.py. Is there any way I can take this further?
How can I resolve this?
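On the --no-cache-dir attempt: that option belongs to pip, and a script only accepts the flags it defines itself. The sketch below illustrates this with argparse from the standard library as a stand-in (the repository uses absl.flags, which, as the error above shows, also rejects undefined flags). The function names are hypothetical, and only --config_name is taken from the command above.

```python
import argparse


def build_parser():
    """Parser that defines only the flag used in the training command."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--config_name", default=None)
    return parser


def split_flags(argv):
    """Return (config_name, list of flags the script does not define)."""
    known, unknown = build_parser().parse_known_args(argv)
    return known.config_name, unknown


config, unknown = split_flags(
    ["--config_name=config/AConvNet-SOC.json", "--no-cache-dir"]
)
print(config)   # config/AConvNet-SOC.json
print(unknown)  # ['--no-cache-dir']
```

parse_known_args is used here only so the undefined flag can be inspected; the stricter parse_args would exit with an error, much like the absl message in the question.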

training YOLOv7 on CPU provides CUDA error

I am trying to train a YOLOv7 model without a GPU. This is the command line that I am currently using on Colab.
python train_aux.py --workers 1 --device cpu --batch-size 1 --data data/coco.yaml --img 128 128 --cfg /content/yolov7/cfg/training/yolov7-e6e.yaml --weights '' --name yolov7-e6e --hyp data/hyp.scratch.p6.yaml
For some reason I first get a warning
warnings.warn('User provided device_type of \'cuda\', but CUDA is not available. Disabling')
and then I get the error
RuntimeError: No CUDA GPUs are available
during the first epoch. I don't understand why it is trying to use CUDA when I am running it on the CPU. Am I missing some spot that I have to edit in the code to fix this? Here is the link to the GitHub repository that I am using
I also tried installing the CUDA library in case that helped, using:
!pip install cuda-python
but it didn't solve the issue.
So it looks like this issue is due to CUDA being hard-coded into the model for certain procedures. A more in-depth explanation can be found here link. In the meantime, removing the --device cpu option fixed it for some reason.
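Which lines in the YOLOv7 code are affected is not shown here, so the following is only a generic sketch of the alternative to hard-coding CUDA: derive the autocast arguments from whatever device was selected. The names autocast_settings and forward_pass are hypothetical and do not come from the YOLOv7 repository; torch.autocast and its device_type and enabled arguments are part of PyTorch.

```python
def autocast_settings(device):
    """Derive autocast arguments from the chosen device instead of assuming CUDA."""
    on_cuda = str(device).startswith("cuda")
    return {"device_type": "cuda" if on_cuda else "cpu", "enabled": on_cuda}


def forward_pass(model, batch, device):
    """Run one forward pass on the given device (illustrative only)."""
    import torch  # imported lazily so autocast_settings works without torch

    batch = batch.to(device)
    # Mixed precision is only switched on when the device really is a GPU.
    with torch.autocast(**autocast_settings(device)):
        return model(batch)
```

With device set to 'cpu' this asks for a disabled CPU autocast context, so no CUDA device type is requested.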

Locally opening a transformers saved model

I have a saved transformers model using BertModel.from_pretrained('test_model')
I trained this model using Google Colab's GPUs.
Then, I want to open it, with BertModel.from_pretrained('test_model/')
but I do not have a GPU in my local PC. I get this:
/home/seiji/.local/lib/python3.8/site-packages/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx (Triggered internally at /pytorch/c10/cuda/CUDAFunctions.cpp:100.)
return torch._C._cuda_getDeviceCount() > 0
What should I do? I have no idea how to open it using a CPU. Is it even possible?
The best thing you can do is save the CPU version of the model, i.e.:
model.cpu().save_pretrained("model_directory")
All the pre-trained Huggingface models are saved as CPU models anyway and you always need to move them to GPU explicitly.
PyTorch allows loading GPU models on CPU (see https://discuss.pytorch.org/t/on-a-cpu-device-how-to-load-checkpoint-saved-on-gpu-device/349), but the arguments of torch.load you would need to set are not exposed via the API, so you would need to write your own from_pretrained method.

Converted ONNX model runs on CPU but not on GPU

I converted a TensorFlow Model to ONNX using this command:
python -m tf2onnx.convert --saved-model tensorflow-model-path --opset 10 --output model.onnx
The conversion was successful and I can run inference on the CPU after installing onnxruntime.
But when I create a new environment, install onnxruntime-gpu in it and run inference on the GPU, I get different error messages depending on the model. E.g. for MobileNet I receive W:onnxruntime:Default, cuda_execution_provider.cc:1498 GetCapability] CUDA kernel not supported. Fallback to CPU execution provider for Op type: Conv node name: StatefulPartitionedCall/mobilenetv2_1.00_224/Conv1/Conv2D
I tried out different opsets.
Does anyone know why I am getting errors when running on the GPU?
That is not an error. That is a warning and it is basically telling you that that particular Conv node will run on CPU (instead of GPU). It is most likely because the GPU backend does not yet support asymmetric paddings and there is a PR in progress to mitigate this issue - https://github.com/microsoft/onnxruntime/pull/4627. Once this PR is merged, these warnings should go away and such Conv nodes will run on the GPU backend.

Is there a way to use a compiled keras model on the RPI Zero?

I am working on a letter recognition application for a robot. I used my home PC to train the model and want the recognition to run on the RPI Zero W with the already trained model.
I have an HDF model. When I try to install TensorFlow on the RPI Zero, it throws a hash error; as far as I could find out, this is because TF is built for 64-bit machines. When I try to install TensorFlow Lite, the installation gets stuck and crashes.
For saving the model I use:
classifier.save('test2.h5')
These are the prediction lines:
import numpy as np
test_image = ks.preprocessing.image.load_img('image.jpg')
test_image = ks.preprocessing.image.img_to_array(test_image)
test_image = np.expand_dims(test_image, axis=0)  # predict expects a batch: (1, height, width, channels)
result = classifier.predict(test_image)
I also tried to compile the Python script with Nuitka, but since the RPI is ARM and Nuitka does not offer cross-compilation, that option fell through.
You can use the prebuilt TFLite packages to solve your issue.
If that does not help, you can also build TFLite from source.
Please refer to below links:
https://www.tensorflow.org/lite/guide/build_rpi
https://medium.com/@haraldfernengel/compiling-tensorflow-lite-for-a-raspberry-pi-786b1b98e646
