Download a complete folder hierarchy with the Google Drive API in Python

How can I download a complete folder hierarchy using the Python Google Drive API? Do I have to query each file in the folder and then download it? Doing it that way, the folder hierarchy is lost. Is there a proper way to do this? Thanks

You can achieve this using Google GAM.
gam all users show filelist >filelist.csv
gam all users show filetree >filetree.csv
I got all the answers from this site. I found it very useful.
https://github.com/jay0lee/GAM/wiki

The default maximum number of results per query is 100. You must use pageToken/nextPageToken to page through the full list.
See: Python Google Drive API - list the entire drive file tree
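To keep the hierarchy with the Drive API itself, you can list each folder's children (paging with nextPageToken) and recurse into subfolders. Here is a minimal sketch, assuming an authenticated v3 `service` object built with googleapiclient; the helper names (`list_children`, `download_tree`) are my own:

```python
import io
import os

FOLDER_MIME = 'application/vnd.google-apps.folder'

def list_children(service, folder_id):
    """Yield every child of a Drive folder, following nextPageToken."""
    page_token = None
    while True:
        resp = service.files().list(
            q=f"'{folder_id}' in parents and trashed = false",
            fields="nextPageToken, files(id, name, mimeType)",
            pageToken=page_token).execute()
        yield from resp.get('files', [])
        page_token = resp.get('nextPageToken')
        if page_token is None:
            break

def download_tree(service, folder_id, local_dir):
    """Mirror a Drive folder (and its subfolders) into local_dir."""
    os.makedirs(local_dir, exist_ok=True)
    for item in list_children(service, folder_id):
        path = os.path.join(local_dir, item['name'])
        if item['mimeType'] == FOLDER_MIME:
            # subfolder: recurse, recreating the directory locally
            download_tree(service, item['id'], path)
        else:
            # lazy import so the listing helper works without the client lib
            from googleapiclient.http import MediaIoBaseDownload
            request = service.files().get_media(fileId=item['id'])
            with io.FileIO(path, 'wb') as fh:
                downloader = MediaIoBaseDownload(fh, request)
                done = False
                while not done:
                    _, done = downloader.next_chunk()
```

Because each recursion re-creates the folder under the parent's local path, the Drive hierarchy is preserved on disk.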

Related

To read the names of all files (or to get links to them) inside a Google Drive folder using Python

How to read the names of all files, or get links to all files, inside a Google Drive folder using Python?
You can check out the Google Drive API. The documentation gives an example that retrieves the list of files and folders in your Drive.
Example
You can do more by checking out the API Reference.
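As a sketch of that idea, adapted to one folder's names and links (assuming an authenticated v3 `service`; the function name and `folder_id` are placeholders of mine):

```python
def list_names_and_links(service, folder_id):
    """Return (name, webViewLink) pairs for the files in one Drive folder."""
    resp = service.files().list(
        q=f"'{folder_id}' in parents and trashed = false",
        fields="files(name, webViewLink)").execute()
    return [(f['name'], f.get('webViewLink')) for f in resp['files']]
```

The `fields` parameter keeps the response small by requesting only the name and link of each file.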

Google cloud storage cycling through a directory

I have a storage bucket in google cloud. I have a few directories which I created with files in them.
I know that if I want to cycle through all the files in all the directories, I can use the following command:
for file in list(source_bucket.list_blobs()):
    file_path = f"gs://{file.bucket.name}/{file.name}"
    print(file_path)
Is there a way to only cycle through one of the directories?
I suggest studying the Cloud Storage list API in more detail; it looks like you have only experimented with the most basic use of list_blobs(). As you can see from the linked API documentation, you can pass a prefix parameter to limit the scope of the listing to some path: source_bucket.list_blobs(prefix="path")
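A short sketch of that, wrapped in a helper (the function name and paths are my own); passing delimiter='/' additionally stops the listing from descending into deeper "subdirectories":

```python
def list_directory(bucket, directory):
    """Yield gs:// URIs for the blobs directly under one 'directory' prefix."""
    prefix = directory.rstrip('/') + '/'
    # prefix narrows the listing; delimiter='/' stops at the next level
    for blob in bucket.list_blobs(prefix=prefix, delimiter='/'):
        yield f"gs://{blob.bucket.name}/{blob.name}"

# usage, assuming google-cloud-storage and an authenticated client:
# from google.cloud import storage
# bucket = storage.Client().bucket('my-bucket')
# for uri in list_directory(bucket, 'images'):
#     print(uri)
```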

using tf.keras.utils.get_file() for google drive

I am trying to use tf.keras.utils.get_file("URL from google drive")
When I use a URL for a file smaller than 33MB it works well.
However, when I try to download a file larger than 33MB it doesn't work.
How can I solve this problem?
_URL = 'URL FROM GOOGLE DRIVE'
path_to_zip = tf.keras.utils.get_file("file_name.zip", origin=_URL, extract=True)
PATH = os.path.join(os.path.dirname(path_to_zip), 'art_filename')
I am following https://www.tensorflow.org/tutorials/images/classification for practice, and I am trying to use my own data.
That example uses a URL on storage.googleapis.com and a large amount of data.
I want to use this code to download large files from Google Drive.
Is there any way to solve this problem?
I also tried mounting Google Drive, but since I want to access the folders and files directly, I am not comfortable working with the mount.
Thanks
Files above a certain size trigger a notification from Drive warning that the file cannot be scanned for viruses, which must be accepted before the file can download. By appending "&confirm=t" to the end of the download URL, you can bypass that message and download your files.
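A minimal sketch of that fix; the helper name is my own, and FILE_ID stands for the id taken from your share link:

```python
def drive_download_url(file_id):
    """Build a direct-download URL that skips Drive's virus-scan prompt."""
    return ("https://drive.google.com/uc?export=download"
            f"&id={file_id}&confirm=t")

# usage with get_file, e.g.:
# path_to_zip = tf.keras.utils.get_file(
#     "file_name.zip", origin=drive_download_url(FILE_ID), extract=True)
```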

Using Custom Libraries in Google Colab without Mounting Drive

I am using Google Colab and I would like to use my custom libraries / scripts, that I have stored on my local machine. My current approach is the following:
# (Question 1)
from google.colab import drive
drive.mount("/content/gdrive")
# Annoying chain of granting access to Google Colab
# and entering the OAuth token.
And then I use:
# (Question 2)
!cp /content/gdrive/My\ Drive/awesome-project/*.py .
Question 1:
Is there a way to avoid mounting the drive entirely? Whenever the execution context changes (e.g. when I select "Hardware Acceleration = GPU", or when I wait an hour), I have to re-generate and re-enter the OAuth token.
Question 2:
Is there a way to sync files between my local machine and my Google Colab scripts more elegantly?
Partial (not very satisfying answer) regarding Question 1: I saw that one could install and use Dropbox. Then you can hardcode the API Key into the application and mounting is done, regardless of whether or not it is a new execution context. I wonder if a similar approach exists based on Google Drive as well.
Question 1.
Great question, and yes, there is. I have been using this workaround, which is particularly useful if you are a researcher and want others to be able to re-run your code, or just want to 'colab'orate when working with larger datasets. The method below has worked well for our team, since there are challenges when each person keeps their own version of the datasets.
I have used this regularly on 30+ GB of image files downloaded and unzipped to the Colab runtime.
The file id is in the link provided when you share a file from Google Drive.
You can also select multiple files, share them all, and then generate, for example, a .txt or .json file which you can parse to extract the file ids.
from google_drive_downloader import GoogleDriveDownloader as gdd

# some file id / list of file ids parsed from file urls
google_fid_id = '1-4PbytN2awBviPS4Brrb4puhzFb555g2'
destination = 'dir/dir/fid'
# if it is a zip file, add the kwarg unzip=True
gdd.download_file_from_google_drive(file_id=google_fid_id,
                                    dest_path=destination,
                                    unzip=True)
A url parsing function to get file ids from a list of urls might look like this:
def parse_urls():
    with open('/dir/dir/files_urls.txt', 'r') as fb:
        txt = fb.readlines()
    return [url.split('/')[-2] for url in txt[0].split(',')]
One health warning is that you can only repeat this a small number of times in a 24 hour window for the same files.
Here's the gdd git repo:
https://github.com/ndrplz/google-drive-downloader
Here is a working example (my own) of how it works inside a bigger script:
https://github.com/fdsig/image_utils
Question 2.
You can connect to a local runtime, but this also means using local resources (GPU/CPU, etc.).
Really hope this helps :-).
F~
If your code isn't secret, you can use git to sync your local code to GitHub, then git clone it into Colab with no need for any authentication.

Access docs on Gdrive via Python

I am looking for a way to access a .csv document that I have stored on Drive in order to perform data analysis. The idea would be to have something similar to pandas' read_csv but accessing a remote file, not a local one. Note that I don't want to access a Google Sheets document: it's a .csv file that I have shared on Google Drive. Ideally, I'd like to be able to save it to Drive as well.
Thank you for the help,
Best,
You will want to use Google's File Stream to do this. What it does is basically mount the drive to your computer so that you can access it from anywhere.
So on my Windows computer I can open a terminal and then access anything on my drive. (Or if you have a Mac, you will find it mounted under /Volumes.)
>>>ls /mnt/g/
$RECYCLE.BIN My Drive Team Drives
>>>ls /mnt/g/My\ Drive/
test.csv
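If you would rather skip mounting entirely, a shared .csv can also be read over HTTP by its file id. A sketch, assuming the file is shared with "anyone with the link" (the helper name is my own; pandas is the library the question mentions):

```python
def drive_csv_url(file_id):
    """Direct-download URL for a file shared on Google Drive."""
    return f"https://drive.google.com/uc?export=download&id={file_id}"

# usage:
# import pandas as pd
# df = pd.read_csv(drive_csv_url("FILE_ID"))  # FILE_ID from the share link
```

Note that saving back to Drive this way is not possible; for writes you would still need the mount or the Drive API.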
