access to shared with me drive folder in Google Colab - python

I'm new to Google Colaboratory.
My team is doing a mini-project together, so my partner created a Drive folder and shared it with me. The problem is that her code points to the file in her own 'My Drive',
while she shared only the "miniproject" folder with me, so when I run the code on the file in it, I get an error because the path is wrong.
Her code:
df = pandas.read_csv("/content/drive/MyDrive/ColabNotebooks/miniproject/zoo6.csv")
The code I need to run on my account:
df = pandas.read_csv("/content/drive/MyDrive/miniproject/zoo6.csv")
(since I added a shortcut to the folder in my own My Drive)
How can I run the code from my Drive account on her Drive folder?

There are currently some workarounds that involve adding the files to your own Drive, though this is less than ideal. You can check out this answer.
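One way to let the same notebook run on both accounts is to try several candidate locations and use the first one that exists. The sketch below assumes the two paths from the question; `first_existing_path` is a hypothetical helper name, not part of Colab or pandas.

```python
from pathlib import Path

# Candidate locations of the shared folder, taken from the question.
# Which one exists depends on whose Drive is mounted at /content/drive.
CANDIDATE_DIRS = [
    "/content/drive/MyDrive/ColabNotebooks/miniproject",  # owner's layout
    "/content/drive/MyDrive/miniproject",                 # shortcut in my own My Drive
]


def first_existing_path(candidates):
    """Return the first candidate that exists on disk as a Path, or None."""
    for candidate in candidates:
        path = Path(candidate)
        if path.exists():
            return path
    return None


project_dir = first_existing_path(CANDIDATE_DIRS)
if project_dir is None:
    print("Shared folder not found; is Drive mounted and the shortcut added?")
else:
    print("Using", project_dir / "zoo6.csv")
```

When a folder is found, `project_dir / "zoo6.csv"` can be passed to `pandas.read_csv` unchanged on either account.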

Related

Can't mount Google drive folder in colab notebook

I'm trying to mount a directory from https://drive.google.com/drive/folders/my_folder_name for use in a google colab notebook.
The instructions for mounting a folder show an example for a directory starting with /content/drive:
from google.colab import drive
drive.mount('/content/drive')
but my directory doesn't start with /content/drive, and the following things I've tried have all resulted in ValueError: Mountpoint must be in a directory that exists:
drive.mount("/content/drive/folders/my_folder_name")
drive.mount("content/drive/folders/my_folder_name")
drive.mount("drive/folders/my_folder_name")
drive.mount("https://drive.google.com/drive/folders/my_folder_name")
How can I mount a google drive location which doesn't start with /content/drive?
The path in drive.mount('/content/drive') is the mount point: the location inside the virtual machine running your notebook where Google Drive is attached (see 'mount point' in Unix/Linux). It is not the path of the folder you are trying to access within your Google Drive.
Leave "/content/drive" intact and work like this instead:
from google.colab import drive
drive.mount("/content/drive") # Don't change this.
my_path = "/path/in/google_drive/from/root" # Your path
gdrive_path = "/content/drive" + "/My Drive" + my_path # Change according to your locale, if needed.
# "/content/drive/My Drive/path/in/google_drive/from/root"
And modify my_path to your desired folder located in GDrive (I don't know whether "/My Drive/" changes according to your locale). Colab saves notebooks by default in "/Colab Notebooks", so in my case the root of my GDrive is actually gdrive_path = "/content/drive/My Drive" (and I'm guessing yours is too).
This leaves us with:
import pandas as pd
from google.colab import drive
drive.mount("/content/drive") # Don't change this.
my_path = "/folders/my_folder_name" # THIS is your GDrive path
gdrive_path = "/content/drive" + "/My Drive" + my_path
# /content/drive/My Drive/folders/my_folder_name
sample_input_file = gdrive_path + "/input.csv" # The specific file you are trying to access
rawdata = pd.read_csv(sample_input_file)
# /content/drive/My Drive/folders/my_folder_name/input.csv
When mounting, you will be asked to paste a validation code after you have granted permissions to the drive.mount API.
Update: Colab no longer requires copying and pasting the code; instead, you simply confirm your identity via the usual Google login page.
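Joining the path components with pathlib instead of string concatenation avoids having to track separators by hand. This is a minimal sketch: `drive_path` is a hypothetical helper, and the `drive_root` default of "My Drive" is an assumption taken from this answer (some runtimes expose the folder as "MyDrive", as in the first question above).

```python
from pathlib import PurePosixPath


def drive_path(*parts, mount_point="/content/drive", drive_root="My Drive"):
    """Build an absolute path inside a mounted Drive from relative components."""
    # Strip stray slashes so "/folders" and "folders/" both join cleanly
    # and a leading "/" cannot reset the path to the filesystem root.
    cleaned = [p.strip("/") for p in parts if p.strip("/")]
    return str(PurePosixPath(mount_point, drive_root, *cleaned))


print(drive_path("/folders/my_folder_name", "input.csv"))
# /content/drive/My Drive/folders/my_folder_name/input.csv
```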
You can try this way
drive.mount('/gdrive')
Now access your file from this path
/gdrive/'My Drive'/folders/my_folder_name
In my case, this is what worked. I think this is what Katardin suggested, except that I had to first add these subfolders (that I was given access to via a link) to My Drive:
Right-click on the subfolders in the Google Drive link I was given and select "Add to My Drive."
Log into my Google Drive and add the subfolders to a new folder there, my_folder_name.
Then I could access the contents of those subfolders from colab with the following standard code:
import os
from google.colab import drive
drive.mount('/content/drive')
data_dir = '/content/drive/My Drive/my_folder_name'
os.listdir(data_dir) # shows the subfolders I had shared with me
I have found that the reason one can't mount one's own Google Drive for these things is a race condition on Google's side. It was first suggested to change the mount location from /content/gdrive to something else under /content, but this didn't fix it. What I ended up doing was manually copying the files that are copied to Google Drive and then installing the Google Drive desktop application. In Windows 10, I then went to the folder now located on Google Drive, disabled file permission inheritance, and manually granted full control rights on the folder to the Users group and the Authenticated Users group. This seems to have fixed it for me.
At other times I have noticed with these Colabs (not this one in particular) that some of the components used, like the trained models, are missing from the repository, as if they had been removed. The only solution for this is to look around for other sources of these files. This includes searching Google, looking at the Git checkout level to find branches besides master, and looking for projects that cloned the project on GitHub to see whether they still include the files.
Open Google Drive and share the link with everybody or with your own accounts.
Colab part:
from google.colab import drive
drive.mount('/content/drive')
You may want to try the following, though it depends on whether you're doing this in a Pro or a personal account. Google Drive keeps a My Drive folder in the file structure after /content/drive/.
drive.mount('/content/drive/My Drive/folders/my_folder_name')
Copy your Colab document link, open it in a Chrome incognito window, and run the command again ;) It should work with no error.

Using Custom Libraries in Google Colab without Mounting Drive

I am using Google Colab and I would like to use my custom libraries / scripts, that I have stored on my local machine. My current approach is the following:
# (Question 1)
from google.colab import drive
drive.mount("/content/gdrive")
# Annoying chain of granting access to Google Colab
# and entering the OAuth token.
And then I use:
# (Question 2)
!cp /content/gdrive/My\ Drive/awesome-project/*.py .
Question 1:
Is there a way to avoid mounting the drive entirely? Whenever the execution context changes (e.g. when I select "Hardware Acceleration = GPU", or when I wait an hour), I have to re-generate and re-enter the OAuth token.
Question 2:
Is there a way to sync files between my local machine and my Google Colab scripts more elegantly?
Partial (not very satisfying) answer regarding Question 1: I saw that one could install and use Dropbox. Then you can hardcode the API key into the application and mounting is done, regardless of whether or not it is a new execution context. I wonder if a similar approach exists based on Google Drive as well.
Question 1.
Great question, and yes, there is. I have been using this workaround, which is particularly useful if you are a researcher and want others to be able to re-run your code, or just to 'colab'orate when working with larger datasets. The method below has worked well for working as a team, since there are challenges when each person has their own version of the datasets.
I have used this regularly on 30+ GB of image files downloaded and unzipped to the Colab runtime.
The file ID is in the link provided when you share from Google Drive.
You can also select multiple files, select share all, and then generate, for example, a .txt or .json file which you can parse to extract the file IDs.
from google_drive_downloader import GoogleDriveDownloader as gdd
# Some file ID, or a list of file IDs parsed from file URLs.
google_fid_id = '1-4PbytN2awBviPS4Brrb4puhzFb555g2'
destination = 'dir/dir/fid'
# If it is a zip file, add the kwarg unzip=True.
gdd.download_file_from_google_drive(file_id=google_fid_id,
                                    dest_path=destination, unzip=True)
A url parsing function to get file ids from a list of urls might look like this:
def parse_urls():
    with open('/dir/dir/files_urls.txt', 'r') as fb:
        txt = fb.readlines()
    return [url.split('/')[-2] for url in txt[0].split(',')]
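Splitting on '/' depends on the exact shape of the share link. A slightly more tolerant sketch is shown below; it assumes the two common link shapes noted in the comments, and `extract_file_id` is a hypothetical helper, not part of google_drive_downloader.

```python
import re
from urllib.parse import urlparse, parse_qs


def extract_file_id(url):
    """Return the Drive file ID found in a share URL, or None if none is found."""
    parsed = urlparse(url.strip())
    # Assumed form 1: https://drive.google.com/file/d/<ID>/view?usp=sharing
    match = re.search(r"/d/([A-Za-z0-9_-]+)", parsed.path)
    if match:
        return match.group(1)
    # Assumed form 2: https://drive.google.com/open?id=<ID>
    ids = parse_qs(parsed.query).get("id")
    return ids[0] if ids else None


share_url = "https://drive.google.com/file/d/1-4PbytN2awBviPS4Brrb4puhzFb555g2/view?usp=sharing"
print(extract_file_id(share_url))
# 1-4PbytN2awBviPS4Brrb4puhzFb555g2
```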
One health warning is that you can only repeat this a small number of times in a 24 hour window for the same files.
Here's the gdd git repo:
https://github.com/ndrplz/google-drive-downloader
Here is a working example (my own) of how it works inside a bigger script:
https://github.com/fdsig/image_utils
Question 2.
You can connect to a local runtime, but this also means using local resources (GPU, CPU, etc.).
Really hope this helps :-).
F~
If your code isn't secret, you can use git to sync your local code to GitHub. Then, git clone it into Colab with no need for any authentication.
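A minimal sketch of that workflow from inside a notebook cell, assuming a public repository and that git is available on the runtime. `clone_and_add_to_path` is a hypothetical helper, and the URL in the usage comment is a placeholder.

```python
import subprocess
import sys
from pathlib import Path


def clone_and_add_to_path(repo_url, dest):
    """Clone repo_url into dest (if dest does not exist yet) and make it importable."""
    dest = Path(dest)
    if not dest.exists():
        # Shallow clone: only the latest commit is needed to import the code.
        subprocess.run(
            ["git", "clone", "--depth", "1", repo_url, str(dest)],
            check=True,
        )
    resolved = str(dest.resolve())
    if resolved not in sys.path:
        sys.path.insert(0, resolved)
    return dest


# Usage (placeholder URL, not a real repository):
# clone_and_add_to_path("https://github.com/<user>/awesome-project.git", "awesome-project")
# import my_module  # any .py file at the top level of the repository
```

Because the clone is skipped when the directory already exists, re-running the cell in the same session is harmless; after a runtime reset the clone simply runs again.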

Access docs on Gdrive via Python

I am looking for a way to access a .csv document that I have saved on Drive to perform data analysis. The idea would be to have something similar to pandas' read_csv, but to access a remote file, not one saved locally. Note that I don't want to access a Google spreadsheet document: it's a .csv document that I have shared on Google Drive. Ideally, I'd like to be able to save it on Drive as well.
Thank you for the help,
Best,
You will want to use Google's File Stream to do this. What it does is basically mount the drive to your computer so that you can access it from anywhere.
So on my Windows computer I can open a terminal and then access anything on my Drive. (Or if you have a Mac, you will find it mounted under /Volumes.)
>>>ls /mnt/g/
$RECYCLE.BIN My Drive Team Drives
>>>ls /mnt/g/My\ Drive/
test.csv
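Once Drive is mounted like this, the file behaves like any local file. The sketch below uses only the standard library and the path from the listing above; the helper names and the test_copy.csv output name are assumptions for illustration.

```python
import csv
from pathlib import Path


def read_csv_rows(path):
    """Read a CSV file into a list of dicts keyed by the header row."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def write_csv_rows(path, rows):
    """Write a non-empty list of dicts to a CSV file, header first."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


# Path taken from the listing above; adjust to wherever Drive is mounted.
source = Path("/mnt/g/My Drive/test.csv")
if source.exists():
    rows = read_csv_rows(source)
    if rows:
        # Saving next to the original writes the result back onto Drive.
        write_csv_rows(source.with_name("test_copy.csv"), rows)
```

The same mounted path can be handed to pandas' read_csv, and a DataFrame can be saved back with to_csv.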

Download a complete folder hierarchy in google drive api python

How can I download a complete folder hierarchy using the Python Google Drive API? Do I have to query each file in the folder and then download it? Done that way, the folder hierarchy will be lost. Is there a proper way to do it? Thanks
You can achieve this using Google GAM.
gam all users show filelist >filelist.csv
gam all users show filetree >filetree.csv
I got all the answers from this site. I found it very useful.
https://github.com/jay0lee/GAM/wiki
The default maximum number of results per query is 100. You must use pageToken/nextPageToken to request the following pages.
See: Python Google Drive API - list the entire drive file tree
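The paging and the hierarchy can be handled together by listing one folder at a time and recursing into subfolders. This is a sketch under assumptions: `service` is taken to be a Drive v3 client (for example one built with googleapiclient.discovery.build("drive", "v3", credentials=...)), and `list_children` and `walk_drive_tree` are hypothetical helper names.

```python
FOLDER_MIME = "application/vnd.google-apps.folder"  # MIME type Drive uses for folders


def list_children(service, folder_id):
    """Yield every file resource directly inside folder_id, following nextPageToken."""
    page_token = None
    while True:
        response = service.files().list(
            q=f"'{folder_id}' in parents and trashed = false",
            fields="nextPageToken, files(id, name, mimeType)",
            pageSize=100,
            pageToken=page_token,
        ).execute()
        yield from response.get("files", [])
        page_token = response.get("nextPageToken")
        if not page_token:
            break


def walk_drive_tree(service, folder_id, prefix=""):
    """Yield (relative_path, file_resource) for every non-folder item under folder_id."""
    for item in list_children(service, folder_id):
        path = f"{prefix}{item['name']}"
        if item["mimeType"] == FOLDER_MIME:
            # Recurse so nested files keep their folder path.
            yield from walk_drive_tree(service, item["id"], prefix=path + "/")
        else:
            yield path, item
```

Each yielded relative path can be recreated locally before the corresponding file is downloaded, which preserves the hierarchy.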

Delete file from Google Drive using Google Drive API SDK

How do I delete any file from Drive using Python's Google Drive API SDK?
I want to sync my folder with google drive, such that, whenever I delete any file from my local machine, the same file which is uploaded on the drive with same name, should be deleted.
I went through : https://developers.google.com/drive/v2/reference/files/delete
But then, from where do I get fileid?
Any help would be appreciated.
Thanks in advance...
You need to read and understand https://developers.google.com/drive/v2/reference/files#resource and https://developers.google.com/drive/search-parameters and https://developers.google.com/drive/v2/reference/files/list
At the bottom of the last page is a Try It Now feature which you can use to play with the Drive SDK BEFORE you write a single line of code. Do the same with https://developers.google.com/drive/v2/reference/files/delete
Once you understand them, you will know how to trash or delete files from Drive. Personally I prefer trash, as it's easier to undo my mistakes during testing. @martineau Don't worry too much about the disk space; Google isn't about to run out of disk :-)
The only catch to using Trash is you need to remember to qualify any queries with 'trashed=false' and users will need to empty Trash if ever they hit quota.
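The file ID can be obtained by searching for the file by name and then trashing each match. The sketch below is written against Drive API v3 naming (the pages linked above describe v2, where the name field is called title and trashing has its own method); `service` is assumed to be an authorised v3 client, and both helper names are hypothetical.

```python
def find_file_ids(service, name):
    """Return the IDs of non-trashed files whose name matches exactly."""
    # Backslashes and single quotes must be escaped inside a Drive query string.
    escaped = name.replace("\\", "\\\\").replace("'", "\\'")
    ids, page_token = [], None
    while True:
        response = service.files().list(
            q=f"name = '{escaped}' and trashed = false",
            fields="nextPageToken, files(id, name)",
            pageToken=page_token,
        ).execute()
        ids.extend(f["id"] for f in response.get("files", []))
        page_token = response.get("nextPageToken")
        if not page_token:
            return ids


def trash_by_name(service, name):
    """Move every file with this exact name to Trash; return the affected IDs."""
    ids = find_file_ids(service, name)
    for file_id in ids:
        # Trashing is reversible; files().delete(fileId=...) removes permanently.
        service.files().update(fileId=file_id, body={"trashed": True}).execute()
    return ids
```

Drive allows several files with the same name, so a sync tool would usually also restrict the query to the synced parent folder before trashing anything.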
