I have been given access to a zip file/folder stored in my Google Drive under "Shared with me".
How can I download it to my laptop through the terminal, using "wget", Python, or anything similar?
The URL for the whole folder containing it is https://drive.google.com/drive/folders/13cx4SBFLTX8CqIqjjec9-pcadGaJ0kNj,
and the shareable link to the zip file is https://drive.google.com/open?id=1PMJEk3hT-_ziNhSPkU9BllLYASLzN7TL.
Since the files are 12 GB in total, downloading them by clicking is quite tiresome when working with a Jupyter notebook.
To download the whole folder:
!pip uninstall --yes gdown # After running this line, restart Colab runtime.
!pip install gdown -U --no-cache-dir
import gdown
url = r'https://drive.google.com/drive/folders/1sWD6urkwyZo8ZyZBJoJw40eKK0jDNEni'
gdown.download_folder(url)
You can check my answer here (Updated March 2018):
https://stackoverflow.com/a/49444877/4043524
There is one little issue with your case: the format of the URL is different from the one mentioned in the link above.
But don't worry, you just have to copy the ID (the random-looking string that follows the id key in the URL) and replace the FILEIDENTIFIER in the script with it.
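For example, with gdown installed as above, a minimal sketch for the zip file in the question (the file ID is taken from your shareable link; the output filename is just a placeholder I chose):
import gdown
file_id = "1PMJEk3hT-_ziNhSPkU9BllLYASLzN7TL"  # the value of the id parameter in the shareable link
url = "https://drive.google.com/uc?id={0}".format(file_id)
gdown.download(url, "archive.zip", quiet=False)  # "archive.zip" is a placeholder name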
Related
When finding an interesting Python Jupyter Notebook, such as 02.00-Introduction-to-NumPy.ipynb, I usually have to:
download it locally
open a shell in the same folder (tip: use SHIFT + right click > "Open command window here" to save 30 seconds of browsing through folders) and run jupyter notebook
select the right .ipynb file, and finally run the code
Isn't there an easier way to do this?
What is the natural way to open a .ipynb notebook which is online, and run the code, without having to manually download the .ipynb?
Note: the notebook is visible here: https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/02.00-Introduction-to-NumPy.ipynb but we can't run the code
jakevdp has built in a nice way to do that; see here. In short, on each page he has an "Open in Google Colab" button:
#GoogleColab can open any #ProjectJupyter notebook directly from #github!
To run the notebook, just replace "https://github.com/" with "https://colab.research.google.com/github/" in the notebook URL, and it will be loaded into Colab.
Example: 02.00-Introduction-to-NumPy.ipynb becomes: https://colab.research.google.com/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/02.00-Introduction-to-NumPy.ipynb
By default, the code will run on Colab's remote server, but it's also possible to run it locally by clicking Connect to local runtime... at the top right.
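That substitution is easy to script; a minimal sketch in Python, using the notebook URL from the question:
# Turn a GitHub notebook URL into a Colab URL by swapping the host prefix.
github_url = "https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/02.00-Introduction-to-NumPy.ipynb"
colab_url = github_url.replace("https://github.com/", "https://colab.research.google.com/github/")
print(colab_url)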
I personally prefer the MyBinder project as a route. It will open temporary, active sessions with the contents of any GitHub repo, GitHub Gist, GitLab repo, Zenodo archive, Dataverse repo, Datashare archive, Figshare archive, and others. Many repositories already include the necessary configuration files and even put a launch binder button on them. For those that don't, you can go to the form at the MyBinder project and generate a session. That form will also generate a URL that you can use to tell the public MyBinder system to open a session later.
For example, this person posted the link to open a session for all of Jake's notebooks: you just go to the URL https://mybinder.org/v2/gh/jakevdp/PythonDataScienceHandbook/master?filepath=notebooks%2FIndex.ipynb to tell MyBinder to start a session. Then, from the index page that comes up, you can click on the link you listed above and run it. Jake included configuration files that MyBinder recognizes. Note that for some repositories or archives you point MyBinder at, the necessary configuration files won't be there; in that case you can run %pip install <package_name_here> or %conda install <package_name_here> in the current session and continue running code.
Limitations include that you have to be careful not to share anything you wouldn't want to be public, resources are limited, and FTP is not allowed, to avoid abuse.
Some others to get you started:
A Gallery of Popular Binders (You'll note the one you referenced is listed in the number one position under Featured Projects there.)
Analyze CMS Open Data in Jupyter Notebooks using Binder
Tidal constituent database mapped with Datashader
Sample Binder Repositories (for example, the first one listed there includes the seaborn library in the environment it launches, and uses it to plot a figure)
I've been working on this Alexa skill and I'm almost done, but I'm stuck on this part:
Finish configuring your function. Click on your function's name
(you'll find it in the middle) and scroll to the bottom of the page,
you'll see a Cloud9 code editor.
We have provided the code for this skill here. To properly upload this
code to Lambda, you'll need to perform the following:
This skill uses the ASK SDK for Python for development. The skill code
is provided in the lambda_function.py, and the dependencies are
mentioned in requirements.txt. Download the contents of the lambda/py
folder.
On your system, navigate to the lambda folder and install the
dependencies in a new folder called “skill_env” using the following
command:
pip install -r py/requirements.txt -t skill_env
Copy the contents of
the lambda/py folder into the skill_env folder.
cp -r py/* skill_env/
Zip the contents of the skill_env folder.
Remember to zip the contents of the folder and NOT the folder itself.
pip install -r py/requirements.txt -t skill_env <--- this part doesn't work for me in cmd
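A common cause (an assumption on my part, since no error message is shown) is that pip isn't on the PATH in cmd; invoking pip through the Python interpreter usually gets around that:
python -m pip install -r py/requirements.txt -t skill_env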
I need to use Python 3 to download multiple .SAO files over HTTP. Each file has a distinct URL that differs only by a number (for example, https://www.1.SAO, https://www.2.SAO, etc.). How do I achieve this, and how do I control where the files are downloaded to?
Downloading a file is pretty simple using wget, and changing the URL can be done by modifying the string before calling the download() function.
To install wget, run pip install wget, which will install it into your default Python instance. After that, just import it and you're good to go!
For example, if I wanted to download the .sao files from 1 to 10:
import wget

for i in range(1, 11):
    url = "https://www.{0}.SAO".format(i)  # substitutes the value of i at {0}
    wget.download(url)
This will download all the files between https://www.1.SAO and https://www.10.SAO and save them to the script's current working directory.
If you wanted to change the directory, wget.download() has an optional second argument. For example, if I wanted to save my file to a directory called downloads which is located in the same directory as the script, I could call:
wget.download(url,"downloads/")
It would then save all my files in that subdirectory instead of the working directory. If my directory is in an entirely different part of the system (say I wanted to download them to /usr/bin/ or something), I can specify that as well, the same way as a normal Unix file path:
wget.download(url,"/usr/bin/")
Hopefully this helps you get started.
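If you'd rather not depend on a third-party package, the standard library can do the same thing; a minimal sketch with urllib, using the same hypothetical URL pattern from the question:
import urllib.request

for i in range(1, 11):
    url = "https://www.{0}.SAO".format(i)
    # Save each file under a numbered name in the current directory;
    # change the directory part of the second argument to control the destination.
    urllib.request.urlretrieve(url, "{0}.SAO".format(i))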
I'm using Google Drive to keep a copy of my code projects in case my computer dies (I'm also using GitHub, but not for some private projects).
However, when I try to create a virtual environment using virtualenv, I get the following error:
PS C:\users\fchatter\google drive> virtualenv env
New python executable in C:\users\fchatter\google drive\env\Scripts\python.exe
ERROR: The executable "C:\users\fchatter\google drive\env\Scripts\python.exe" could not be run: [Error 5] Access is denied
Things I've tried:
I thought it was because the path to the venv included blank spaces, but the command works in other paths with blank spaces. I also tried installing the win32api library, as recommended in the virtualenv docs, but it didn't work.
running PowerShell as an administrator.
Any ideas on how to solve this? My workaround at the moment is to create the venv outside of the Google Drive, which works but is inconvenient.
After running into the same issue and playing with it for several hours, it doesn't seem possible. It has nothing to do with spaces in file/folder names; I've tested that. It seems that Google Drive Stream performs some action on the file/folder after a period of time that makes Python lose its path to the files. For instance, you can clone a Python module into a Google Drive Stream folder, do a pip install -e ./, and it will work in the virtualenv for a few minutes, for instance importing it into a Python shell. But after a few minutes you can no longer import the module. I suspect that Google Drive Stream is simply not fully compatible with all filesystem calls, one of which is being used by Python.
Don't set up a virtual env in a cloud-synced folder, and don't run a Python script from such a folder either. It's a bad idea; those folders are not meant for version control. Write access (modifying files) is unreliable because Google Drive periodically syncs the folder, which prevents exclusive write access almost all the time.
TL;DR: you can't reliably modify files while they are being synced.
I suggest you stick to git for version control.
Using NLTK 2.0.4 installed for EPD's Python 2.7.3 (not Canopy) on Ubuntu 12.10. In the terminal I type:
In [96]: nltk.download_shell()
NLTK Downloader
---------------------------------------------------------------------------
d) Download l) List u) Update c) Config h) Help q) Quit
---------------------------------------------------------------------------
Downloader> d
Download which package (l=list; x=cancel)?
Identifier> punkt
Downloading package 'punkt' to /home/espears/nltk_data...
And then it freezes. The relevant punkt.zip file is written to the stated directory, but the download interface never relinquishes control.
This example is with IPython, but I tried the same with the regular Python 2.7.3 interpreter and got the same result.
When I try to use unzip to unzip the file directly, I see errors saying that the proper central zip-file code is not found within the file and that it cannot be unzipped. See below:
espears@computer ~/nltk_data/tokenizers $ unzip punkt.zip
Archive: punkt.zip
End-of-central-directory signature not found. Either this file is not
a zipfile, or it constitutes one disk of a multi-part archive. In the
latter case the central directory and zipfile comment will be found on
the last disk(s) of this archive.
unzip: cannot find zipfile directory in one of punkt.zip or
punkt.zip.zip, and cannot find punkt.zip.ZIP, period.
This happens with both nltk.download() and nltk.download_shell() in the same way.
I can inspect the .zip file using du to see that initially its size grows from 0 MB to about 2.7 MB, so it is actually downloading something and the file is not empty. But it stops at 2.7 MB (which may or may not correspond to the expected full size of the file) and then the Python shell downloader freezes.
I had the same problem and downloaded the necessary items manually from the following link:
http://nltk.org/nltk_data/
Not the desired solution, but it will work until this is fixed.
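If you go the manual route, NLTK just needs the unzipped packages placed in one of its data directories; a quick sketch to see where it looks (assuming a standard NLTK install):
import nltk

# NLTK searches these directories for data packages; put the manually
# downloaded (and unzipped) package, e.g. tokenizers/punkt, inside one of them.
print(nltk.data.path)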
UPDATE:
I was actually able to run nltk.download() to install cmudict. Maybe this issue only affects certain packages?
I had the same problem with nltk 3.0.01b. I downloaded the "book" package and monitored the download from the task manager's network display while at the same time checking the size of the target folder (AppData\Roaming\nltk_data on my Windows 7 system). The network traffic ceased and the folder stopped growing at a size of 379 MB. But the Python shell was locked. The following was the last message displayed:
showing info http://nltk.github.com/nltk_data/
However, if you close the Tk window that shows which download items are available, the nltk.download() command will terminate and the shell prompt will come back.
Most probably it is not stuck; it may still be downloading. It downloads at a much slower rate even if you have good internet connectivity. I kept checking the folder size in a loop and it slowly kept increasing, and the download was finally successful. It would have worked if you had waited. Unzipping might have failed because you tried to unzip before the entire file had downloaded.
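A minimal sketch of that kind of progress check (the target path is the nltk_data directory from the question; adjust it for your system):
import os
import time

def dir_size(path):
    # Total size in bytes of all files under path.
    total = 0
    for root, _, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total

target = os.path.expanduser("~/nltk_data")
for _ in range(20):  # poll every 30 seconds for 10 minutes
    print(dir_size(target), "bytes")
    time.sleep(30)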