Python Caffe CPU & GPU mode simultaneously

Is it possible to run Caffe in both CPU and GPU mode? I have several Caffe models, but my GPU resources are limited, so I can't fit all of the models into GPU memory. I want to use, e.g., 3 models in GPU mode and 2 models in CPU mode, but the set_mode_cpu() and set_mode_gpu() calls just switch the mode for the whole library.

In my opinion, you can write multiple Python scripts, one for each task.
In each script you can choose whether to use the CPU or the GPU (and which GPU device).
Then you can run these scripts at the same time.
But running multiple tasks on one GPU card will slow things down severely, in my experience. Good luck!
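A minimal sketch of that per-process approach, assuming the pycaffe bindings are installed; the .prototxt/.caffemodel paths are placeholders. Each process sets its own mode before loading its model, so CPU and GPU models can run side by side (separate scripts launched by hand work the same way).

# Sketch only: model paths are placeholders. Mode is a per-process setting,
# so each worker process picks CPU or GPU before loading its net.
import multiprocessing as mp
import caffe

def run_gpu_worker(proto, weights, device_id=0):
    caffe.set_device(device_id)   # choose the GPU for this process
    caffe.set_mode_gpu()
    net = caffe.Net(proto, weights, caffe.TEST)
    # ... fill net.blobs['data'] and call net.forward() here ...

def run_cpu_worker(proto, weights):
    caffe.set_mode_cpu()          # this process stays on the CPU
    net = caffe.Net(proto, weights, caffe.TEST)
    # ... fill net.blobs['data'] and call net.forward() here ...

if __name__ == '__main__':
    mp.Process(target=run_gpu_worker,
               args=('gpu_model.prototxt', 'gpu_model.caffemodel')).start()
    mp.Process(target=run_cpu_worker,
               args=('cpu_model.prototxt', 'cpu_model.caffemodel')).start()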

Related

Run code on GPU instead of CPU with detecto

I am using machine learning with detecto in Python. However, whenever I run it, I get a warning saying
It looks like you're training your model on a CPU. Consider switching to a GPU; otherwise,
this method can take hours upon hours or even days to finish. For more information, see
https://detecto.readthedocs.io/en/latest/usage/quickstart.html#technical-requirements
I have a GPU in the form of an Intel(R) HD Graphics 4600, but for some reason the code is running on the CPU. I have checked out the link it gives, which says
By default, Detecto will run all heavy-duty code on the GPU if it’s available and on the CPU otherwise.
It recommends using Google Colab if the computer doesn't have a GPU it can use, but I do have one and don't want to use Google Colab.
Why is it running on the CPU instead of the GPU? And how can I fix it? The part of my code where I get the warning is
losses = fitmodel(loader, Test_dataset, epochs=25, lr_step_size=5,
learning_rate=0.001, verbose=True)
The code does work; however, it takes ages to run, so I want to be able to run it on the GPU to save time.
The GPU that Detecto is referring to would need to be a CUDA-capable Nvidia GPU, so your Intel(R) HD Graphics 4600 does not meet this criterion.
Detecto uses PyTorch internally, whose GPU support is based on CUDA. So in order to use a GPU, you would need to move to a machine that has a CUDA-capable card.
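A quick way to confirm this on your machine (a plain PyTorch check, not anything Detecto-specific): torch.cuda.is_available() is the test PyTorch-based libraries rely on, and it returns False for Intel integrated graphics, so everything falls back to the CPU.

# Check what PyTorch (and therefore Detecto) can see:
import torch

print(torch.cuda.is_available())   # False without a CUDA-capable Nvidia GPU
print(torch.cuda.device_count())   # 0 in that case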

CPU utilization when using ray and torch

I use Ray and torch in my code and set one CPU core for each Ray remote actor to compute gradients (using the torch package). But I find that the CPU utilization of an actor can go up to 300% at times. This seems impossible, since the actor is supposed to use only one CPU core.
I want to know whether the actor is actually using more CPU resources, since torch may open one or more threads to compute gradients.
My OS is Win10 and my CPU is a Ryzen 5600H. Thanks.
Ray currently does not automatically pin the actor to specific CPU cores and prevent it from using other CPU cores. So what you're seeing makes sense.
It is possible to use a library like psutil to pin the actor to a specific core and prevent it from using other cores. This can be helpful if you have many parallel tasks/actors that are all multi-threaded and competing with each other for resources (e.g., because they use pytorch or numpy).
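A rough sketch of that idea; the actor class, method, and core index below are illustrative, not part of either library's required API. It caps torch's internal thread pool and pins the actor's process to a single core with psutil.

# Sketch: cap torch threads and pin the actor's process to one core.
import psutil
import ray
import torch

@ray.remote(num_cpus=1)
class GradientWorker:
    def __init__(self, core_id):
        torch.set_num_threads(1)                  # stop torch from spawning extra compute threads
        psutil.Process().cpu_affinity([core_id])  # pin this actor's process to one core (works on Windows/Linux)

    def compute(self, x):
        x = torch.as_tensor(x, dtype=torch.float32).requires_grad_(True)
        (x * x).sum().backward()                  # toy gradient computation
        return x.grad.numpy()

ray.init()
worker = GradientWorker.remote(core_id=0)
print(ray.get(worker.compute.remote([1.0, 2.0, 3.0])))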

Automatic GPU offloading in python

I have written a piece of scientific code in python, mainly using the numpy library (especially Fast Fourier Transforms), and a bit of Cython. Nothing in CUDA or anything GPU related that I am aware of. There is no graphic interface, everything runs in the terminal (I'm using WSL2 on Windows). The whole code is mostly about number crunching, nothing fancy at all.
When I run my program, I see that CPU usage is ~ 100% (to be expected of course), but GPU usage also rises, to around 5%.
Is it possible that a part of the work gets offloaded automatically to the GPU? How else can I explain this small but consistent increase in GPU usage?
Thanks for the help.
No, there is no automatic offloading in NumPy, at least not with the standard NumPy implementation. Note that some specific FFT libraries can use the GPU, but standard NumPy uses its own FFT implementation, PocketFFT (based on FFTPACK), which does not use the GPU. Cython does not perform any automatic implicit GPU offloading either; the code needs to do that explicitly/manually.
No GPU offloading is performed automatically because GPUs are not faster than CPUs for all tasks, and offloading data to the GPU is expensive, especially with small arrays (due to the relatively high latency of the PCI bus and of kernel calls in such cases). Moreover, it is hard to do this efficiently even in cases where the GPU could theoretically be faster.
The 5% GPU usage is relative to the current frequency of the GPU, which is often configured to adapt to the load. For example, my discrete Nvidia 1660S GPU is currently running at 300 MHz while it can automatically reach 1.785 GHz. Actually using 5% of a GPU running at 17% of its maximum frequency just for the 2D rendering of a terminal is entirely possible. On my machine, printing lines in a for loop at 10 FPS in a Windows terminal takes 6% of my GPU, which is still running at a low frequency (0-1% without running anything).
If you want to check the frequency and load of your GPU, there are plenty of tools for that, from vendor tools often installed with the driver to software like GPU-Z on Windows. For Nvidia GPUs, you can list the processes currently using your GPU with nvidia-smi (it would be rocm-smi on AMD GPUs).
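If you want to double-check from Python, a small sketch (assuming an Nvidia card with nvidia-smi on the PATH): list the processes actually doing compute work on the GPU. A pure NumPy/Cython script should not appear there, even while overall GPU "usage" sits at a few percent.

# Sketch: query nvidia-smi for processes running compute work on the GPU.
import subprocess

out = subprocess.run(
    ['nvidia-smi', '--query-compute-apps=pid,process_name,used_memory',
     '--format=csv,noheader'],
    capture_output=True, text=True, check=True)
print(out.stdout or 'no compute processes on the GPU')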

Is it possible to execute a tensorflow-on-spark program without GPU support?

I want to run TensorFlowOnSpark programs (for learning purposes), and I don't have GPU support. Is it possible to execute TensorFlowOnSpark programs without GPU support?
Thank you
Yes, it is possible to use the CPU. TensorFlow will automatically use the CPU if it doesn't find any GPU on your system.
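A quick sanity check, shown here with the TensorFlow 2.x API (the same fallback behaviour applies to older versions): with no GPU present the device list is simply empty, and ops run on the CPU with no extra configuration.

# Check whether TensorFlow sees any GPU; ops still run fine on the CPU.
import tensorflow as tf

print(tf.config.list_physical_devices('GPU'))          # [] on a CPU-only machine
print(tf.reduce_sum(tf.random.normal([1000, 1000])))   # executes on the CPU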

Tensorflow execution in a virtual machine with no GPUs

I have a question regarding TensorFlow that is somewhat critical to the task I'm trying to accomplish.
My scenario is as follows:
1. I have a TensorFlow script that has been set up, trained, and tested. It is working well. The training and testing were done on a devbox with 2 Titan X cards.
2. We now need to port this system to a live-pilot testing stage and are required to deploy it on a virtual machine running Ubuntu 14.04.
Here lies the problem: a VM will not have access to the underlying GPUs and must validate the incoming data in CPU-only mode. My questions:
Will the absence of GPUs hinder the validation process of my ML system? Does TensorFlow, by default, use GPUs for CNN computation, and will the absence of a GPU affect the execution?
How do I run my script in CPU-only mode?
Will setting CUDA_VISIBLE_DEVICES to none help with validation in CPU-only mode after the system has been trained on GPU boxes?
I'm sorry if this comes across as a noob question, but I am new to TF and any advice would be much appreciated. Please let me know if you need any further information about my scenario.
Testing with CUDA_VISIBLE_DEVICES set to an empty string will make sure that you don't have anything that depends on a GPU being present, and in theory it should be enough. In practice, there are some bugs in the GPU codepath that can get triggered when there are no GPUs (like this one), so you want to make sure your GPU software environment (CUDA version) is the same.
Alternatively, you could compile TensorFlow without GPU support (bazel build -c opt tensorflow); this way you don't have to worry about matching CUDA environments or setting CUDA_VISIBLE_DEVICES.
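A small sketch of the CUDA_VISIBLE_DEVICES approach, shown with the current eager API (the environment-variable trick works the same way for older graph-mode code). Note that the variable has to be set before TensorFlow is imported to take effect in that process.

# Hide all GPUs from TensorFlow before it is imported.
import os
os.environ['CUDA_VISIBLE_DEVICES'] = ''

import tensorflow as tf

# Optionally pin ops to the CPU explicitly as well:
with tf.device('/cpu:0'):
    result = tf.reduce_mean(tf.random.normal([100, 100]))
print(result)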
