How to get a stronger CPU and more RAM on Google Colab? - python

I use Google Colab to test data structures such as chained hash maps, probing hash maps, AVL trees, red-black trees, and splay trees (written in Python). I store very large datasets of key-value pairs in these structures to measure the running time of various operations; the scale is roughly that of a small Wikipedia, so running these Python scripts uses a lot of memory (RAM). Google Colab offers approximately 12 GB of RAM, which is not enough for me: these scripts need about 20-30 GB, so when I run them Colab often raises an exception saying the program exceeded the 12 GB limit, and the runtime restarts. I also have some Python scripts that run recursive algorithms, and as is well known, deep recursion is heavy on both CPU and RAM. When I run these algorithms with 20,000+ levels of recursion, Colab often fails and restarts. I know that Colab provides two cores of an Intel Xeon CPU, but how do I request more CPU cores from Google?

You cannot upgrade the GPU or CPU, but you can increase the RAM from 12 GB to 25 GB by deliberately crashing the session with a non-terminating loop, for example:
l = []
while True:
    # keep appending until the runtime exhausts its RAM and crashes;
    # Colab should then offer to restart the session with more (25 GB) RAM
    l.append('nothing')
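
To confirm how much memory the new runtime actually provides, a quick check with psutil (which comes pre-installed on Colab) can be run in a cell; this is just a sanity check of the claim above, not part of the trick itself:

import psutil

# total physical memory of the current Colab VM, reported in GiB
print(psutil.virtual_memory().total / 1024**3)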

There is no way to request more CPU/RAM from Google Colaboratory at this point, sorry.

Google Colab Pro recently launched for $9.99 a month (Feb. 2020). Users in the US can get higher resource limits and more frequent access to better resources.
Q&A from the signup page is below:
What kinds of GPUs are available in Colab Pro?
With Colab Pro you get priority access to our fastest GPUs. For example, you may get access to T4 and P100 GPUs at times when non-subscribers get K80s. You also get priority access to TPUs. There are still usage limits in Colab Pro, though, and the types of GPUs and TPUs available in Colab Pro may vary over time.
In the free version of Colab there is very limited access to faster GPUs, and usage limits are much lower than they are in Colab Pro.
How long can notebooks run in Colab Pro?
With Colab Pro your notebooks can stay connected for up to 24 hours, and idle timeouts are relatively lenient. Durations are not guaranteed, though, and idle timeouts may sometimes vary.
In the free version of Colab notebooks can run for at most 12 hours, and idle timeouts are much stricter than in Colab Pro.
How much memory is available in Colab Pro?
With Colab Pro you get priority access to high-memory VMs. These VMs generally have double the memory of standard Colab VMs, and twice as many CPUs. You will be able to access a notebook setting to enable high-memory VMs once you are subscribed. Additionally, you may sometimes be automatically assigned a high-memory VM when Colab detects that you are likely to need it. Resources are not guaranteed, though, and there are usage limits for high memory VMs.
In the free version of Colab the high-memory preference is not available, and users are rarely automatically assigned high memory VMs.

For a paid, high-capability solution, you may want to try Google Cloud Datalab instead.

Related

Run a ML program locally using the Colab GPU

I have a Lenovo computer, but it has no GPU installed, so when I run a machine learning program written in Python it runs on my local CPU. I know that Colab provides a GPU for free, but to use it I have to copy the content of all the Python files from my ML program into a Colab notebook, which is not very convenient. Is there any way to run my ML program from my computer directly on the Colab GPU, without using the Colab notebook itself?
EDIT
Be aware that I don't want to work from a Jupyter notebook. I would like to work in Visual Studio Code and run the code on the Colab GPU directly instead of on my CPU.
It is possible. Check out this article:
https://amitness.com/vscode-on-colab/
and
https://github.com/abhi1thakur/colabcode
Not that I know of. Colab's GPU and notebook run on Google's computers, while your local Jupyter notebook runs on your computer alone and has no built-in way to communicate with Google's machines. This is not a physical limitation or anything; it's just that nobody has integrated them before.
What you can do, though, to transfer files quickly, is to create a git repository for all of your files, push them to GitHub, and then pull them down in a Colab notebook. It's relatively quick, syncs well, and doubles as a backup.
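
As a rough illustration of that workflow (the repository URL and script name below are placeholders, not taken from the answer), the Colab cells could look like this, using the same '!' shell prefix the notebooks support:

# clone your (hypothetical) repository into the Colab VM
!git clone https://github.com/your-username/your-ml-project.git
%cd your-ml-project

# run the training script on the GPU-backed Colab runtime
!python train.py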
Be aware that I don't want to work from a Jupyter notebook. I would like to work in Visual Studio Code and run the code on the Colab GPU directly instead of on my CPU.
Nope, not possible.
Update with reasoning:
Colab itself is a Jupyter notebook; you can't detach the machine's resources, link them to your PC, and use other software with them.
If this were possible, people would already be abusing it to mine crypto, run heavy workloads, and so on.
Colab is a free product Google offers to introduce you to its cloud compute services, which means Colab has its own limitations:
"Colab resources are not guaranteed and not unlimited, and the usage limits sometimes fluctuate." -Colab FAQ
If you are a fan of Colab, you might want to try the Pro version for just $10/month.
Did you check out colab-ssh? You SSH into Colab from VS Code and can leverage the GPU the same way you would in Colab.
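
A rough sketch of that setup, assuming colab-ssh's cloudflared-based helper; the exact function name and arguments may differ between versions of the package, so treat this as an assumption to check against the project's README rather than a definitive recipe:

# run inside a Colab cell
!pip install colab_ssh --upgrade

# assumption: the package exposes launch_ssh_cloudflared; verify against the current docs
from colab_ssh import launch_ssh_cloudflared
launch_ssh_cloudflared(password="choose-a-password")  # prints SSH / VS Code connection details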

Does running scripts from Cloud (AWS/Google/Azure) make my algorithms faster?

Hello, I have designed some algorithms that we would like to implement in our company's software (a start-up), but some of them take too long (10-15 minutes) because they handle big datasets.
I am wondering whether running my scripts on, for example, Google Cloud would make my algorithms run faster, since it would use more nodes.
Is running a script locally in Jupyter, for instance, the same as running it in the cloud?
I am also thinking of using Spark.
Thank you
I think the only applicable answer is "it depends". The cloud is just "someone else's computer", so whether it runs faster depends on the cloud server it's running on. For example, if it is a data-intensive task with a lot of I/O, it might run faster on a server with an SSD than on your local machine with an HDD. If it's a processor-intensive task, it might run faster if the server has a faster CPU than your local machine. You get the point.
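
A practical first step, not from the answer above but worth adding, is to measure where the time actually goes before paying for cloud hardware; Python's built-in cProfile shows whether the bottleneck is CPU work or I/O. The run_my_analytics() call below is a placeholder for your own entry point:

import cProfile
import pstats

# profile the (hypothetical) entry point and save the stats to a file
cProfile.run("run_my_analytics()", "profile.out")

# print the 10 most expensive calls; heavy time in file or network
# functions points to an I/O bottleneck rather than a CPU one
pstats.Stats("profile.out").sort_stats("cumulative").print_stats(10)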

Google colab - is there a way to record peak RAM during session?

The bar on the top right isn't very precise and I can't just watch it all day.
Is there a precise way to measure how high the RAM usage peaks during a machine learning process?
You can run any Linux command in Google Colab by adding the '!' prefix.
For example,
!ls
Your best bet is either to write a small script that runs in the background and records memory usage, or to use one of the standard Linux system tools.
There is another StackOverflow thread on this topic that will be useful for you.
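
Here is a minimal sketch of the background-monitoring idea in Python, using psutil (pre-installed on Colab): a daemon thread samples the notebook process's and the VM's memory once a second and keeps the maximum seen. Treat it as a starting point rather than a polished tool:

import threading
import time
import psutil

peak = {"rss": 0, "vm_used": 0}

def sample_memory(interval=1.0):
    proc = psutil.Process()
    while True:
        # track the largest resident set size and overall VM usage observed so far
        peak["rss"] = max(peak["rss"], proc.memory_info().rss)
        peak["vm_used"] = max(peak["vm_used"], psutil.virtual_memory().used)
        time.sleep(interval)

# start sampling in the background, then run your training code as usual
threading.Thread(target=sample_memory, daemon=True).start()

# ... after the workload finishes, inspect the peaks (in GiB):
print(peak["rss"] / 1024**3, peak["vm_used"] / 1024**3)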

How do I get my application server CPU to 100%?

I have a dedicated application server that does analytics.
I'm running on a 2-CPU, 8 GB RAM machine.
I have two instances of the same application running, as below.
python do_analytics.py &
python do_analytics.py &
However, my CPU usage is below 20%. Can I run more processes to make full use of my CPU? Will it speed things up, or will each process now run slower since I only have 2 CPUs?
Thanks.
The fact that your CPU usage is below 20% means that your CPU can take more load, so yes, you can run more processes.
Will it speed things up, or will each process now run slower since I only have 2 CPUs?
It depends on what else your application is doing. If most of the analytics logic just uses processing power and memory, you need not worry. But if more processes mean more disk access or contention for shared resources, then running more processes may reduce overall performance.
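
If the work is CPU-bound, another option (not mentioned in the answer above) is to parallelize inside a single script with Python's multiprocessing module instead of launching copies by hand; analyze_chunk and the inputs below are placeholders for the real analytics:

from multiprocessing import Pool, cpu_count

def analyze_chunk(chunk):
    # placeholder for the real analytics work on one slice of the data
    return sum(x * x for x in chunk)

if __name__ == "__main__":
    chunks = [range(1_000_000), range(1_000_000)]  # placeholder inputs
    # one worker per core keeps both CPUs busy without oversubscribing
    with Pool(processes=cpu_count()) as pool:
        results = pool.map(analyze_chunk, chunks)
    print(results)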

Set up spark using an external virtual machine

I am not as huge a computer person as many others on here; I majored in math, with MATLAB as my main computer knowledge. I recently got involved with Apache Spark through the excellent edX course offered by Berkeley.
The method they used for setting up Spark was provided in a great step-by-step guide. It involved downloading Oracle VM VirtualBox with a 32-bit Ubuntu VM, then using Vagrant (again, I'm not hugely computer-y, so I'm not 100% sure how this works or what it is) to connect it to an IPython notebook. This enabled me to have access to Spark over the internet and to code in Python with PySpark, which is exactly what I want to do.
Everything was going very well until the second lab exercise, when it became apparent that my Windows laptop (just 3 GB of memory and four years old) has insufficient free memory: it continually froze and crashed when trying to work with large datasets.
Apparently it is not possible to run a VM inside a VM, so I have spent most of today looking for alternative ways of setting up Spark, to no avail; the guides are all aimed at someone with more computer knowledge than I have.
My (likely naive) idea now is to rent an external machine that I can interface with through my Windows laptop exactly as before, but with the virtual machine operating outside my laptop's memory, i.e. in the cloud (running Ubuntu, Windows, or anything else). Essentially I want to move the Oracle VM VirtualBox to an outside source to rid my computer of the memory burden, and to use the IPython notebook as before.
How can I set up a virtual machine to use for the computational side of Spark in Ipython notebook?
Or is there an alternate method that would be simple to follow?
Don't run VMs. Instead:
Download the latest Spark version. (1.4.1 at the moment.)
Extract the archive.
Run bin/pyspark.cmd.
It's not an IPython Notebook, but you can run Python code against a local Spark instance.
If you want a beefier instance, do the same on a beefy remote machine. For example an EC2 m4.2xlarge is $0.5 per hour with 8 cores and 30 GB of RAM.
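
Once bin/pyspark.cmd (or bin/pyspark on Linux/macOS) is running, you get a Python shell with a SparkContext already available as sc. A tiny sketch of what "running Python code against a local Spark instance" can look like, assuming the RDD API of the Spark 1.x era the answer mentions:

# inside the pyspark shell, sc is already defined as a local SparkContext
data = sc.parallelize(range(1000000))

# a trivial distributed computation: sum of squares of the even numbers
result = data.filter(lambda x: x % 2 == 0).map(lambda x: x * x).sum()
print(result)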
