TensorFlow code on Google Cloud Platform - Python

I am facing a strange issue. I have working code for a very simple neural network, which I run on my laptop. It is kind of slow, but OK. I then created a 24-core Linux instance on Google Cloud and ran the same code. It takes almost the same time, but I expected it to be a lot faster. Any idea why this could be the case? I am using a standard, vanilla pip installation of CPU TensorFlow. Nothing fancy.
Would appreciate any ideas...
Best, Umberto
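One thing worth checking (a minimal sketch, assuming TensorFlow 2.x; the thread counts below are illustrative, not recommendations) is whether TensorFlow's thread pools actually see the extra cores. A very small network is often dominated by per-step overhead rather than raw compute, in which case extra cores barely help:

    import tensorflow as tf

    # 0 means TensorFlow picks the thread counts itself.
    print(tf.config.threading.get_intra_op_parallelism_threads())
    print(tf.config.threading.get_inter_op_parallelism_threads())

    # Optionally pin the pools explicitly; this must run before any op executes.
    tf.config.threading.set_intra_op_parallelism_threads(24)
    tf.config.threading.set_inter_op_parallelism_threads(2)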

Cloud9 and SageMaker - hyperparameter optimisation

I have done quite a few Google searches but have not found a clear answer to the following use case. Basically, I would rather use Cloud9 (most of the time) as my IDE rather than Jupyter. What I am confused about is how I could execute long-running jobs like (Bayesian) hyperparameter optimisation from there. Can I use SageMaker capabilities? Should I use Docker and deploy to ECR (looking for the cheapest-ish option)? Any pointers w.r.t. this particular issue would be very much appreciated. Thanks.
You could use whatever IDE you choose (including your laptop).
A SageMaker tuning job (example) is asynchronous, so you can safely close your IDE after launching it. You can monitor the job in the AWS web console, or with a DescribeHyperParameterTuningJob API call.
You can launch TensorFlow, PyTorch, XGBoost, Scikit-learn, and other popular ML frameworks using one of the built-in framework containers, avoiding the extra work of bringing your own container.
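For illustration, a minimal sketch of polling a tuning job's status with boto3 (the job name is a placeholder):

    import boto3

    sm = boto3.client("sagemaker")
    # "my-tuning-job" is a placeholder; use the name you gave your tuning job.
    resp = sm.describe_hyper_parameter_tuning_job(
        HyperParameterTuningJobName="my-tuning-job"
    )
    print(resp["HyperParameterTuningJobStatus"])  # e.g. InProgress, Completed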

Run an ML program locally using the Colab GPU

I have a Lenovo computer, but it has no GPU installed. So when I run a machine learning program written in Python, it runs on my local CPU. I know that Colab provides a GPU for free. To use it, I need to take the content of all the Python files from my ML program and put them in a Colab notebook, which is not very convenient. Is there any way to run my ML program from my computer directly on the Colab GPU, without using the Colab notebook directly?
EDIT
Be aware that I don't want to work from a Jupyter notebook. I would like to work in Visual Studio Code and run the code on the Colab GPU directly instead of my CPU.
It is possible. Check out this article:
https://amitness.com/vscode-on-colab/
and
https://github.com/abhi1thakur/colabcode
Not that I know of. Colab's GPU and notebook run on Google's computers. Your local Jupyter notebook runs on your computer alone and essentially can't communicate with Google's machines. This is not a physics limitation or anything; it's just that no one has integrated them before.
What you can do, though, to make the transfers quick, is create a git repo for all of your files, push them to GitHub, then pull them down in Colab's notebooks. It's relatively quick, syncs well, and serves as a backup.
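A minimal Colab-cell sketch of that workflow (the repo URL and script name are placeholders):

    # Run these in a Colab cell; the leading ! executes shell commands.
    !git clone https://github.com/youruser/yourrepo.git
    %cd yourrepo
    !python train.py  # train.py stands in for your entry-point script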
Regarding the edit ("I don't want to work from a Jupyter notebook. I would like to work in Visual Studio Code and run the code on the Colab GPU directly instead of my CPU"):
Nope, not possible.
Update, with reasons:
Colab itself is a Jupyter notebook; you can't detach the machine's resources, link them to your PC, and use them with other software.
If this were possible, people would already be abusing it to mine crypto, run heavy-load programs, etc.
Colab is a free product by Google to introduce you to their cloud compute services. This means Colab has its own limitations:
"Colab resources are not guaranteed and not unlimited, and the usage limits sometimes fluctuate." - Colab FAQ
If you are a fan of Colab, you might want to try the Pro version for just $10/month.
Did you check out colab-ssh? You can SSH into Colab from VS Code and leverage the GPU the same as you would on Colab.
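A minimal sketch of the Colab side, assuming the Cloudflared-based flow from the colab-ssh README (the password is a placeholder); you run this in a Colab cell, then connect from VS Code with its Remote-SSH extension:

    # In a Colab cell:
    !pip install colab_ssh --upgrade
    from colab_ssh import launch_ssh_cloudflared
    launch_ssh_cloudflared(password="choose-a-password")  # prints SSH connection details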

Does running scripts from the cloud (AWS/Google/Azure) make my algorithms faster?

Hello, I have designed some algorithms that we would like to implement in our company's software (a start-up), but some of them take too long (10-15 min) because they handle big datasets.
I am wondering whether running my scripts on, for example, Google Cloud would make my algorithms run faster, since it could use more nodes.
Is running a script locally in Jupyter, for instance, the same as running it in the cloud?
I am thinking of using Spark too.
Thank you
I think the only applicable answer is "it depends". The cloud is just "someone else's computer", so whether it runs faster or not depends on the cloud server it's running on. For example, if it is a data-intensive task with a lot of I/O, it might run faster on a server with an SSD than on your local machine with an HDD. If it's a processor-intensive task, it might run faster if the server has a faster CPU than your local machine does. You get the point.
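As a rough first diagnostic (a minimal sketch; psutil is a third-party assumption, installed with pip install psutil), you can sample CPU utilisation while your algorithm runs: sustained near-100% suggests a CPU-bound task that a faster CPU or more parallelism could help, while low utilisation points to an I/O bottleneck:

    import psutil  # third-party: pip install psutil

    def sample_cpu(seconds=10, interval=1.0):
        # Average system-wide CPU utilisation over a short window.
        samples = [psutil.cpu_percent(interval=interval)
                   for _ in range(int(seconds / interval))]
        print(f"mean CPU utilisation: {sum(samples) / len(samples):.1f}%")

    sample_cpu()  # run this while your algorithm executes in another process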

Deploy TensorFlow model to server?

I am trying to deploy a Python ML app (made using Streamlit) to a server. This app essentially loads an NN model that I previously trained and makes classification predictions with it.
The problem I am running into is that because TensorFlow is such a large package (at least 150 MB for the latest tensorflow-cpu version), the hosting service I am trying to use (Heroku) keeps telling me that I exceed the storage limit of 300 MB.
I was wondering if anyone else had similar problems or an idea of how to fix/get around this issue?
What I've tried so far
I've already tried replacing the tensorflow requirement with tensorflow-cpu, which did significantly reduce the size, but it was still too big, so:
I also tried downgrading to tensorflow-cpu==2.1.0, which finally worked, but then I ran into issues on model.load() (which I think might be related to the fact that I downgraded the TF version, since it works fine locally).
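One hedged workaround for the load issue (a minimal sketch, assuming a Keras model; the file names are illustrative): saving the architecture as JSON and the weights separately is often more tolerant of TensorFlow version mismatches than reloading a full saved model:

    import tensorflow as tf

    def save_portable(model, prefix="model"):
        with open(f"{prefix}.json", "w") as f:
            f.write(model.to_json())  # architecture only
        model.save_weights(f"{prefix}_weights.h5")  # weights only

    def load_portable(prefix="model"):
        with open(f"{prefix}.json") as f:
            model = tf.keras.models.model_from_json(f.read())
        model.load_weights(f"{prefix}_weights.h5")
        return model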
I faced the same problem last year. I know this does not answer your Heroku-specific question, but my solution was to use Docker with AWS Elastic Beanstalk. It worked out cheaper than Heroku and I had fewer issues with deployment. I can offer guidance on how to do this if you are interested.
You might have multiple modules downloaded. I would recommend opening the file explorer and checking the actual directory of the downloaded modules.

Set up Spark using an external virtual machine

I am not as huge a computer person as many others on here; I majored in math, with MATLAB as my main computer knowledge. I recently got involved with Apache Spark through the excellent edX course offered by Berkeley.
The method they used for setting up Spark was provided in a great step-by-step guide. It involved downloading Oracle VM VirtualBox with an Ubuntu 32-bit VM, then using Vagrant (again, I'm not hugely computer-y, so not 100% sure how this worked or what it is) to connect this to an IPython notebook. This enabled me to access Spark over the internet and to code in Python with PySpark, which is exactly what I want to do.
Everything was going very well until the second lab exercise, when it became apparent that my Windows laptop has insufficient free memory (just 3 GB, and it is four years old): it continually froze and crashed when trying to work with large datasets.
It is apparently not possible to have a VM in a VM, so I have spent most of today looking for alternative ways of setting up Spark, to no avail; the guides are all aimed at someone with more computer knowledge than I have.
My (likely naive) idea now is to rent an external machine that I can interface with through my Windows laptop exactly as before, but where the virtual machine operates outside of my laptop's memory, i.e. in the cloud (using any of Ubuntu, Windows, etc.). Essentially, I want to move the Oracle VM VirtualBox to an outside source to rid my computer of memory burdens, and to use the IPython notebook as before.
How can I set up a virtual machine to use for the computational side of Spark in an IPython notebook?
Or is there an alternate method that would be simple to follow?
Don't run VMs. Instead:
Download the latest Spark version. (1.4.1 at the moment.)
Extract the archive.
Run bin/pyspark.cmd.
It's not an IPython Notebook, but you can run Python code against a local Spark instance.
If you want a beefier instance, do the same on a beefy remote machine. For example, an EC2 m4.2xlarge is $0.50 per hour, with 8 cores and 30 GB of RAM.
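For illustration, a minimal sketch of running Python against a local Spark instance (this uses the modern PySpark API; today, pip install pyspark is an alternative to downloading the archive by hand):

    from pyspark.sql import SparkSession

    # local[*] runs Spark in-process, using all available cores.
    spark = SparkSession.builder.master("local[*]").appName("demo").getOrCreate()
    rdd = spark.sparkContext.parallelize(range(1_000_000))
    print(rdd.map(lambda x: x * x).sum())  # a simple distributed computation
    spark.stop()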
