Can I run python scripts in colab?


I am trying to get some open source software working. It uses things that I don't have on my system (pytorch for example) and so I thought that I could try to run it on Google Colab.
When I tried to do it though, there are some python scripts that I have to run after cloning a directory from a github repository. I guess I can't run another python script from inside a Jupyter Notebook, and so I suppose that I'm trying to do something with Colab that it isn't designed to do?
Is there something available that is more like a terminal, but using the software, GPUs etc. that are available on Colab?

You can run any shell command from a Jupyter-like environment (which includes Colab) by prefixing it with ! in a code cell. For example,
!ls
lists all files in Colab's current working directory.
To run a Python script, use:
!python script.py
It works just like a terminal (python may resolve to python3; I'm not sure how it is set up in Colab).
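For reference, a line like !python script.py starts a separate Python process, much as a terminal would. Here is a minimal sketch of the same idea in plain Python. The file name script.py, its contents, and the helper run_script are placeholders made up for illustration, and this is not a description of how Colab implements the ! prefix.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_script(path, *args):
    """Run a Python script in a child process and return its stdout."""
    result = subprocess.run(
        [sys.executable, str(path), *args],  # same interpreter as the caller
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


# Hypothetical script, written to a temporary directory for the demo.
demo_dir = Path(tempfile.mkdtemp())
demo_script = demo_dir / "script.py"
demo_script.write_text("import sys\nprint('args:', sys.argv[1:])\n")

output = run_script(demo_script, "--epochs", "3")
print(output)
```

Because the script runs in its own process, nothing it defines ends up in the notebook's variables; only its printed output comes back.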

You can call your script too.
!python script.py
But you need to put the script there, probably by git clone or direct uploading.

As Wayne mentions in a comment on korakot's answer, you can use the magic command
%run 'script.py'
This also lets you, for example, run the script in the notebook's namespace by using the -i parameter:
%run -i 'script.py'
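To make the difference concrete, here is a rough analogy using the standard-library runpy module. It is only an analogy and not how IPython implements %run; the script contents and the variable learning_rate are invented for the example.

```python
import runpy
import tempfile
from pathlib import Path

# Hypothetical script that expects a name `learning_rate` to exist already.
script = Path(tempfile.mkdtemp()) / "script.py"
script.write_text("doubled = learning_rate * 2\n")

# Roughly like `%run -i`: the script can see names supplied by the caller.
shared = runpy.run_path(str(script), init_globals={"learning_rate": 0.5})
print(shared["doubled"])

# Roughly like plain `%run`: a fresh namespace, so `learning_rate` is undefined.
try:
    runpy.run_path(str(script))
    isolated_error = None
except NameError as exc:
    isolated_error = exc
print(type(isolated_error).__name__)
```

So -i is the option to reach for when the script relies on variables you have already defined in earlier cells.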

Related

How to use Google Colab to run .py files from Github repositories?

I would like to use Google Colab to accelerate the process of calculation of my deep learning model. However, I cannot run the model directly from the .ipynb file in the Google Colaboratory, since it has several .py functions written separately and then the main program will call them together in another .py file.
One of the proposed solutions consisted of following three steps:
Commit the code on Github
Clone on Colab
run this command: !python model_Trainer.py on Colab
I have done steps 1 and 2 successfully, however, I still cannot run the third step. And I am getting the following error: python3: can't open file 'model_trainer.py': [Errno 2] No such file or directory.
Does anyone have a solution to this problem?
I look forward to hearing from you.

When you clone a GitHub repo in Colab, it will create a directory for that repo.
To run your Python script, change into that directory with %cd before calling !python.
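A minimal sketch of why the directory matters: a relative name such as model_Trainer.py is looked up in the current working directory, so the script is only found when the command runs inside the cloned folder. The folder name my_repo and the script contents are placeholders, not taken from the question.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Stand-in for a cloned repository: <workdir>/my_repo/model_Trainer.py
workdir = Path(tempfile.mkdtemp())
repo = workdir / "my_repo"
repo.mkdir()
(repo / "model_Trainer.py").write_text("print('training started')\n")

command = [sys.executable, "model_Trainer.py"]

# Run from the parent directory: the relative file name cannot be resolved.
outside = subprocess.run(command, cwd=workdir, capture_output=True, text=True)

# Run from inside the repository, which is what `%cd my_repo` arranges.
inside = subprocess.run(command, cwd=repo, capture_output=True, text=True)

print(outside.returncode, inside.returncode, inside.stdout.strip())
```

The first run fails with the same kind of "can't open file" error as in the question, while the second one succeeds.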

I have the same problem.
You can use !pwd to see the current directory and cd (the %cd magic) to change to the directory where your Python files are located.
If you want to run the .py file, just use !python example.py.
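If you are not sure where the file ended up after cloning, you can also search for it. The sketch below is one way to do that, assuming a made-up directory layout (my_repo/src) and a hypothetical helper find_scripts. It ignores case on purpose, because file names on Colab's Linux file system are case-sensitive and model_trainer.py is not the same file as model_Trainer.py.

```python
import tempfile
from pathlib import Path


def find_scripts(root, name):
    """Return paths under `root` whose file name matches `name`, ignoring case."""
    wanted = name.lower()
    return sorted(p for p in Path(root).rglob("*.py") if p.name.lower() == wanted)


# Stand-in directory tree; `my_repo` and `src` are hypothetical names.
root = Path(tempfile.mkdtemp())
(root / "my_repo" / "src").mkdir(parents=True)
(root / "my_repo" / "src" / "model_Trainer.py").write_text("print('ok')\n")

matches = find_scripts(root, "model_trainer.py")
for match in matches:
    print(match.relative_to(root))
```

The printed path tells you both which directory to %cd into and the exact spelling to pass to !python.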

How to execute Kaggle Api commands on windows system?

I'm referring to https://github.com/Kaggle/kaggle-api
I tried executing the sample commands listed on the page in windows CMD and Python's IDLE. Not sure where it should be executed or how can I go to Kaggle CLI?
Eg. command: kaggle datasets list -s demographics
Windows CMD says: 'kaggle' is not recognized as an internal or external command,
operable program or batch file.

Assuming the Kaggle API has been successfully installed using pip, and the Python install location along with the location of the Scripts\ folder have been added to the PATH, running kaggle directly within the Windows command prompt (CMD) should work.
To check that Python and the Scripts\ folder have been added to the PATH, run WHERE python3 followed by WHERE kaggle.
If either of the two commands above produces output like INFO: Could not find files for the given pattern(s), manually modify the PATH, following the directions in Excursus: Setting environment variables, to add both the Python install location and the location of the Scripts\ folder.
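The same check can be done from Python with the standard-library function shutil.which, which searches the PATH much like WHERE does. This is only a sketch; the helper name locate is made up for the example.

```python
import os
import shutil
import sys


def locate(command, search_path=None):
    """Return the full path of `command` if it is on the search path, else None."""
    return shutil.which(command, path=search_path)


# Report the two commands the answer checks with WHERE.
for name in ("python3", "kaggle"):
    print(name, "->", locate(name) or "not found on PATH")

# Sanity check: the running interpreter is found when its own folder is searched.
interpreter_dir = os.path.dirname(sys.executable)
found_interpreter = locate(os.path.basename(sys.executable), interpreter_dir)
missing = locate("no-such-command-for-this-demo")
```

If kaggle comes back as not found, the Scripts\ folder is most likely the part that is missing from the PATH.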

You can run Bash commands on Windows using the Bash shell, which is a little tricky to launch the first time. You can find instructions on how to do that here: https://www.windowscentral.com/how-install-bash-shell-command-line-windows-10
Hope that helps! :)

How to run a new Jupyter Notebook file that's not part of a pre-built docker image in docker?

I am new to Docker. In order to take the Udacity Deep Learning course, I had to set up TensorFlow on my Windows machine using Docker. (Although TensorFlow is now available on Windows, it only supports Python 3.5, however the Udacity course material requires Python 2.7. Therefore, I have to stick with the Docker way of using TensorFlow.)
To work on the assignments, I followed the instructions here as detailed below:
First, I installed docker toolbox.
Then, I launch Docker using the Docker Quickstart Terminal. For the first time, I ran:
docker run -p 8888:8888 --name tensorflow-udacity -it gcr.io/tensorflow/udacity-assignments:0.6.0
Each time after, I just run this in my docker terminal:
docker start -ai tensorflow-udacity
Finally, entering http://192.168.99.100:8888 in the address bar brings up the assignment Jupyter notebooks.
However, what I want now is to run the final project of the course which is not part of the pre-built Udacity docker image. How can I do that? The final project can be found here, with the "digit_recognition.ipynb" specifically being the file to run in docker.
Any guidance is much appreciated.

First of all, you need a way to get this Jupyter notebook (final project) on your Docker instance.
What are the easy ways to copy a file into a Docker container? There are only a few:
We could attach a volume.
We could rewrite the Dockerfile to include the final project.
We could also enter the Docker container and download the file.
I am going to detail the last one, but don't forget that a problem usually has more than one solution.
How do we enter the Docker container?
docker exec -it [container-id] bash
How can we get the [container-id] ?
docker ps
It will show you a list of containers; pick the one you want to enter.
Once you're in your container, how can we download the file we want?
We should first check whether wget or curl is available to download the file. If neither is, we have to install one with whatever package manager is available (try apt-get; if it works, run apt-get install wget).
Once we have a tool for downloading files from the Internet, we have to find out where the notebooks are stored. That is the difficult part.
Look for any folder that might contain them. There is probably a one-liner using find that would do it, but I don't have one at hand.
Let's assume you are in the right folder.
wget https://raw.githubusercontent.com/udacity/machine-learning/master/projects/digit_recognition/digit_recognition.ipynb
That's all! Reload your page and you should see the notebook displayed.
Side-note: You might also need to install extra dependencies in the container.
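For the "find out where the notebooks are stored" step above, here is one possible sketch in Python, assuming a Python interpreter is available inside the container (likely, since it serves Jupyter). The helper notebook_dirs and the sample file names are made up; the code sticks to os.walk so it does not depend on newer library features.

```python
import os
import tempfile


def notebook_dirs(root):
    """Return the sorted directories under `root` that contain .ipynb files."""
    found = set()
    for current, _subdirs, files in os.walk(root):
        if any(name.endswith(".ipynb") for name in files):
            found.add(current)
    return sorted(found)


# Stand-in tree for the demo; inside the container you would pass a real
# starting directory such as "/" or the home directory instead.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "notebooks"))
os.makedirs(os.path.join(root, "data"))
open(os.path.join(root, "notebooks", "assignment.ipynb"), "w").close()

dirs = notebook_dirs(root)
print(dirs)
```

Whichever directory it prints is the one to cd into before running the wget command.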

An alternative and much easier way is this:
Just start up your container like this: $ docker start -ai tensorflow-udacity
Then, click the upload button, locate the final project IPython Notebook file, and upload it.
That's it. Whatever changes you make will be retained and you'll be able to see the new file in the container going forward!

how to use spark with python or jupyter notebook

I am trying to work with 12GB of data in Python, for which I desperately need to use Spark, but I can't figure out the command line on my own or from what I find online, which is why I'm turning to SO.
So far I have downloaded Spark and unzipped the tar file (sorry for the language, I am feeling stupid and lost), but now I can't see where to go next. I have read the instructions in the Spark documentation, which say:
Spark also provides a Python API. To run Spark interactively in a Python interpreter, use bin/pyspark
But where do I do this? Please help.
Edit: I am using Windows 10.
Note: I have always had problems when trying to install something, mainly because I can't seem to understand the Command Prompt.

If you are more familiar with Jupyter Notebook, you can install Apache Toree, which integrates PySpark, Scala, SQL, and SparkR kernels with Spark.
To install Toree:
pip install toree
jupyter toree install --spark_home=path/to/your/spark_directory --interpreters=PySpark
If you want to install other kernels, you can use:
jupyter toree install --interpreters=SparkR,SQL,Scala
Now run
jupyter notebook
In the UI, when creating a new notebook, you should see the following kernels available:
Apache Toree-Pyspark
Apache Toree-SparkR
Apache Toree-SQL
Apache Toree-Scala

When you unzip the file, a directory is created.
Open a terminal.
Navigate to that directory with cd.
Run ls to see its contents. A bin directory should be among them.
Execute bin/pyspark or maybe ./bin/pyspark.
Of course, in practice it's not always that simple. You may need to set some paths, as described on TutorialsPoint, but there are plenty of such guides out there.
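If you are unsure whether you are in the right directory, a small check like the one below can help. It is a sketch that assumes the usual layout of a Spark download, with a bin folder holding pyspark and, for Windows, pyspark.cmd. The helper name pyspark_launcher is made up, and the demo builds a fake directory rather than touching a real install.

```python
import os
import tempfile


def pyspark_launcher(spark_dir, windows=(os.name == "nt")):
    """Return the path of the pyspark launcher in `spark_dir`, or None if absent."""
    name = "pyspark.cmd" if windows else "pyspark"
    candidate = os.path.join(spark_dir, "bin", name)
    return candidate if os.path.isfile(candidate) else None


# Fake Spark directory for the demo; point this at your unzipped folder instead.
spark_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(spark_dir, "bin"))
for name in ("pyspark", "pyspark.cmd"):
    open(os.path.join(spark_dir, "bin", name), "w").close()

launcher = pyspark_launcher(spark_dir)
print(launcher)
```

A result of None means the folder you passed is not the top of the unzipped Spark directory.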

I understand that you have already installed Spark on Windows 10.
You will need to have winutils.exe available as well. If you haven't already done so, download the file from http://public-repo-1.hortonworks.com/hdp-win-alpha/winutils.exe and place it in, say, C:\winutils\bin
Set up environment variables
HADOOP_HOME=C:\winutils
SPARK_HOME=C:\spark or wherever.
PYSPARK_DRIVER_PYTHON=ipython (or jupyter)
PYSPARK_DRIVER_PYTHON_OPTS=notebook
Now navigate to the C:\Spark directory in a command prompt and type "pyspark"
Jupyter notebook will launch in a browser.
Create a Spark context and run a count command.
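Before typing pyspark, it can save some head-scratching to confirm that the four variables listed above are actually set. Here is a minimal sketch; the helper missing_variables is made up for illustration, and the example values simply repeat the ones from this answer.

```python
import os

REQUIRED = (
    "HADOOP_HOME",
    "SPARK_HOME",
    "PYSPARK_DRIVER_PYTHON",
    "PYSPARK_DRIVER_PYTHON_OPTS",
)


def missing_variables(env=None):
    """Return the names from REQUIRED that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name)]


# Example environment with one variable deliberately left out.
example_env = {
    "HADOOP_HOME": r"C:\winutils",
    "SPARK_HOME": r"C:\spark",
    "PYSPARK_DRIVER_PYTHON": "ipython",
}
print(missing_variables(example_env))

# Call it with no argument to check the real environment of this process.
print(missing_variables())
```

Note that a command prompt opened before the variables were set will not see them, so run the check from a fresh one.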

RPM Post Install Execute with Python

I'm working on a deployment process for work and have run into a bit of a snag. It's more of a quality of life thing than anything else. I've been following Hynek Schlawack's excellent guide and have gotten pretty far. The long and short of what I'm trying to do is install a Python application along with a deployment of the Python version I'm currently using. I'm using fpm to build an RPM that will then be sent to the site and installed there.
As part of my deployment, I'd like to run some post-install scripts, which I can specify in fpm using "--post-install {SCRIPT_NAME}". This works well when the script is an actual Linux shell script. However, I'd really like to run a Python script as my post-install. I can specify an executable Python script, but it fails because I believe it is trying to execute the script as: bash my_python_script.py
Does anyone know if there is a way to execute a python script post-install of an RPM?
Thanks in advance!

In the spec file you can specify what interpreter the %post script is for by using the -p parameter, e.g. %post -p /usr/bin/perl .
