Running a program on CPU and GPU without using two scripts - Python

I am working on a problem that uses both machine learning (ML) and deep learning (DL) in Python. The deep learning models are trained on the GPU, whereas the machine learning models are trained on the CPU. Since the ML part comes after the DL part in my code, it only runs once the DL part has finished. In theory, because they use different resources, the two could run at the same time. Is there any way to do this? One naive approach I can think of is to split the code into two scripts and run them separately, but I am looking for a more sophisticated way.
Thanks
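One possible single-script approach, sketched under assumptions: submit both stages to an executor from `concurrent.futures` so that neither waits for the other. The two stage functions below are hypothetical placeholders, not code from the question, and the sketch assumes the ML stage does not need the output of the DL stage.

```python
from concurrent.futures import ThreadPoolExecutor


def train_deep_model(n_steps):
    """Hypothetical placeholder for the GPU-bound deep learning stage."""
    # Real code would build the network and train it on the GPU here.
    return sum(i * i for i in range(n_steps))


def train_classical_model(n_steps):
    """Hypothetical placeholder for the CPU-bound machine learning stage."""
    # Real code would fit the classical model on the CPU here.
    return sum(range(n_steps))


def run_both(n_steps=1000):
    # Submit both stages up front, then wait for both results.
    with ThreadPoolExecutor(max_workers=2) as pool:
        dl_future = pool.submit(train_deep_model, n_steps)
        ml_future = pool.submit(train_classical_model, n_steps)
        return dl_future.result(), ml_future.result()


if __name__ == "__main__":
    print(run_both())
```

Threads only overlap well when both stages spend most of their time in native code that releases the GIL, which is an assumption about the libraries involved. If either stage is dominated by pure Python code, `ProcessPoolExecutor` offers the same `submit`/`result` interface and may be the better fit.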

Related

Utilizing hardware AI accelerators with PyTorch

I'm pretty new to StackOverflow, and also to using PyTorch. I'm an AI and CS major, and I'm working on a project involving processing video with ML models. I'm not going to get into the details because I want any answers to this question to be generally accessible to others using PyTorch, but the issue is that I'm currently using PyTorch with VapourSynth, accelerating both with CUDA, and I'm looking into purchasing an AI accelerator like this:
Amazon
Documentation on using these with Tensorflow is pretty easy to find, but I'm having trouble trying to answer for myself how I can use one of these with PyTorch. Does anybody have experience with this? I'd simply like to be able to use this card to accelerate training a Neural Net.
It is correct that you would need to convert your code to run on XLA, but that only involves changing a few lines of your code. Please refer to the README at https://github.com/pytorch/xla for references and guides. With a few modifications you can get a significant training speedup.
I think the experience of using PyTorch on a TPU would be less smooth than on an NVIDIA GPU. As far as I know, you have to use XLA to convert PyTorch models so that they can run on a TPU.

Tensorflow 2.7 GPU Memory not released

I am currently working on 1D Convolutional Neural Networks for Time Series Classification. Recently, I got CUDA working on my GeForce 3080 (which was a pain in itself). However, I noticed some weird behavior when using TensorFlow and CUDA. After training a model, the GPU memory is not released, even after deleting the variables and running garbage collection. I tried resetting the TF graph and closing the TF sessions, but the GPU memory stays allocated. This results in cross validation crashing and me having to restart my Python environment every time I want to make changes and retrain my model.
After a tedious search, I found out that people were already struggling with this 5 years ago. However, I am now using TF 2.7 on Ubuntu 20.04.3. Some of my colleagues are using Windows and are not experiencing these problems: they have no issues with models failing to retrain because of already allocated memory.
I found the workaround using multiple processes, but wasn't able to get it to work for my model using 10-fold CV.
As the issue has been open for more than 5 years now and my colleagues are not having any problems, I was wondering whether I am doing something wrong. I think the issue has very likely been fixed after 5 years, which is why I think my code is the problem here.
Is there any solution / guide for TF 2.7 and GPU memory allocation?
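For reference, one way the multi-process workaround mentioned above can be arranged for k-fold cross validation is to run every fold in a fresh Python interpreter, since the driver reclaims a process's GPU memory when that process exits. This is only a sketch: `FOLD_SCRIPT`, `run_fold`, and `cross_validate` are hypothetical names, and the score is a placeholder where the real training and evaluation would go.

```python
import json
import subprocess
import sys

# Hypothetical per-fold script. Real code would import tensorflow inside it,
# train on the given fold, and report the evaluation result.
FOLD_SCRIPT = """
import json
import sys

fold = int(sys.argv[1])
score = 1.0 / (fold + 1)  # placeholder for the real validation score
print(json.dumps({"fold": fold, "score": score}))
"""


def run_fold(fold):
    """Run one fold in its own interpreter and return its reported result."""
    completed = subprocess.run(
        [sys.executable, "-c", FOLD_SCRIPT, str(fold)],
        capture_output=True,
        text=True,
        check=True,  # raise if the fold process fails
    )
    # The script prints one JSON line; take the last line in case of log noise.
    return json.loads(completed.stdout.strip().splitlines()[-1])


def cross_validate(n_folds=10):
    # Folds run one after another, so only one process holds the GPU at a time.
    return [run_fold(fold) for fold in range(n_folds)]


if __name__ == "__main__":
    for result in cross_validate():
        print(result)
```

Because the parent process never imports the GPU framework, it never allocates GPU memory itself; data and results have to cross the process boundary, here as JSON on standard output.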

Is there any way to train numpy neural networks faster?

I implemented a Neural Network class using only python and numpy, and I want to do some experiments with it. The problem is that it takes so long to train. My computer does not have a high-end GPU nor a wonderful CPU, so I thought about some sort of 'cloud training'.
I know libraries such as TensorFlow or PyTorch use backends to train neural networks faster, and I was wondering if something similar could be achieved with numpy. Is there a way to run numpy in the cloud?
Even a solution that is slow and doesn't use GPUs would be fine for me. I tried uploading my files to Google Colab, but that didn't work so well: it stopped running due to inactivity after some time.
Is there any nice solution out there?
Thanks for reading it all!
Try using CuPy instead of NumPy. It runs on the GPU (and works well on a Colab GPU instance), and you may only need to make a few small modifications to your code.
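A minimal sketch of what that swap can look like, assuming the network code only uses array functions that CuPy mirrors from NumPy: import whichever library is available under one alias and write the layers against that alias. The `dense_forward` function is a hypothetical example layer, not code from the question.

```python
try:
    import cupy as xp
    xp.zeros(1)  # raises if CuPy is installed but no usable GPU is present
except Exception:
    import numpy as xp  # fall back to the CPU


def dense_forward(x, weights, bias):
    """One fully connected layer followed by a ReLU activation."""
    return xp.maximum(x @ weights + bias, 0)


x = xp.asarray([[1.0, -2.0]])
w = xp.asarray([[1.0, 0.0], [0.0, 1.0]])
b = xp.asarray([0.5, 0.5])
out = dense_forward(x, w, b)
print(out)
```

When CuPy is in use the arrays live on the GPU, so results need `xp.asnumpy(out)` before being passed to code that expects NumPy arrays.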

SampleRNN - Pytorch implementation beginner

I'm trying to start work with this: https://github.com/deepsound-project/samplernn-pytorch
I've installed all the library dependencies through the Anaconda console, but I'm not sure how I'm supposed to run the Python training scripts.
I guess I just need general help with getting an RNN from a git repository working in Python. I've found a lot of tutorials that show working from Jupyter notebooks or even from scratch, but I can't find any that work from Python code files.
I'm sorry if my terminology is backward; I'm an architect who is attempting coding, not a software engineer.
There are instructions for getting the SampleRNN implementation working in terminal on the git page. All of the commands listed on the page are for calling the Python scripts from terminal, not from a Jupyter Notebook. If you've installed all the correct dependencies then in theory all you should need to do is call the terminal scripts to try it out.
FYI it took me a while to find a combination of parameters with which this model would train without running into memory errors, but I was working with my own dataset, not the one provided. It's also very intensive - the default train time is 1000 epochs which even on my relatively capable GPU was prohibitively high, so you might want to reduce that value considerably just to reach the end of a training cycle unless you have a sweet setup :)

Designing an interface between C++ and Python

I have code for a physics simulation in C++. I am trying to solve a control problem using Deep Reinforcement Learning. Most of the popular packages such as keras, pytorch, are based on Python. So, my machine learning code resides in Python and the physics simulator code is in C++.
At every iteration of the machine learning algorithm, I need to call the C++ program and want the program to persist its state (maintain the variable values). One way I came up with was reading and writing variable values to files, but this didn't seem scalable considering the high number of variables in the program. I googled and found the Boost.Python library for wrapping my entire C++ code, so that it would be available to the Python program. My question is: by doing this, do I lose all the speedup of running the code in C++?
Also, I came across numba, which according to the creators is specifically designed for speed up for Scientific Computing and can reach speeds as fast as native C++. But, this would essentially mean re-writing the entire code in Python. Will this be a better choice?
I am on a strict timeline for my project and any advice on which way I should go will be much appreciated.
