How to download a directory on Google Drive using Python?

import io
from googleapiclient.http import MediaIoBaseDownload

service = self.auth()
items = self.listFilesInFolder(downLoadFolderKey)
for item in items:
    file_id = item.get('id')
    file_name = item.get('name')
    request = service.files().get_media(fileId=file_id)
    fh = io.BytesIO()
    downloader = MediaIoBaseDownload(fh, request)
    done = False
    while done is False:
        status, done = downloader.next_chunk()
        print("Download %d%% %s" % (int(status.progress() * 100), file_name))
    filepath = fileDownPath + file_name
    with io.open(filepath, 'wb') as f:
        fh.seek(0)
        f.write(fh.read())
I am using Google Drive API v3.
I am trying to download a full directory, but the directory itself contains folders, and when I run this bit of code, this error happens:
<HttpError 403 when requesting https://www.googleapis.com/drive/v3/files/1ssF0XD8pi6oh6DXB1prIJPWKMz9dggm2?alt=media returned "Only files with binary content can be downloaded. Use Export with Google Docs files.">
The error, I figure, is due to the script trying to download the folders within the directory. But how do I download the full directory?
P.S. The directory contents change, so I cannot hard-code file IDs and then download the files.

I believe your situation and goal are as follows.
By items = self.listFilesInFolder(downLoadFolderKey), you have already retrieved the list of all files and folders under the specific folder, including subfolders.
items includes the mimeType of each file and folder.
The error occurs when a folder is processed in the loop.
You want to remove this error.
For this, how about this answer?
Modification points:
Because items from items = self.listFilesInFolder(downLoadFolderKey) includes the mimeType, a folder can be detected by checking it. The mimeType of a folder is application/vnd.google-apps.folder.
From your script, I think the same error also occurs when a Google Docs file (Spreadsheet, Document, Slides and so on) is downloaded with the "Files: get" method.
In order to download Google Docs files, the "Files: export" method is required.
When the above points are reflected in your script, it becomes as follows.
Modified script:
From:
request = service.files().get_media(fileId=file_id)
To:
file_mimeType = item.get('mimeType')
if file_mimeType == 'application/vnd.google-apps.folder':
    continue
if 'application/vnd.google-apps' in file_mimeType:
    request = service.files().export_media(fileId=file_id, mimeType='application/pdf')
else:
    request = service.files().get_media(fileId=file_id)
In this modification, first please confirm again that the mimeType of each file is actually included in the items returned by self.listFilesInFolder(downLoadFolderKey). With it, folders can be skipped, and both Google Docs files and non-Google files can be downloaded according to the value of mimeType.
As a sample, Google Docs files are exported as PDF files here. If you want a different output format, please modify mimeType='application/pdf'.
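For reference, here is a minimal sketch of the whole download loop with both points applied. It assumes self.auth() and self.listFilesInFolder() behave as in your script; appending ".pdf" to the name of an exported file is my own assumption, so that the output gets a sensible extension.
import io
from googleapiclient.http import MediaIoBaseDownload

service = self.auth()
items = self.listFilesInFolder(downLoadFolderKey)
for item in items:
    file_id = item.get('id')
    file_name = item.get('name')
    file_mimeType = item.get('mimeType')
    # Folders have no binary content, so skip them.
    if file_mimeType == 'application/vnd.google-apps.folder':
        continue
    if 'application/vnd.google-apps' in file_mimeType:
        # Google Docs editors files must be exported; here, as PDF.
        request = service.files().export_media(fileId=file_id, mimeType='application/pdf')
        file_name += '.pdf'  # assumption: give the exported file a PDF extension
    else:
        request = service.files().get_media(fileId=file_id)
    fh = io.BytesIO()
    downloader = MediaIoBaseDownload(fh, request)
    done = False
    while not done:
        status, done = downloader.next_chunk()
        print("Download %d%% %s" % (int(status.progress() * 100), file_name))
    with io.open(fileDownPath + file_name, 'wb') as f:
        fh.seek(0)
        f.write(fh.read())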
References:
G Suite and Drive MIME Types
Files: get
Files: export

Related

How to get the URL of a Box file from its local folder path?

I need to create a script that picks up the URL of a Box file given its path on my computer.
I have the Box desktop application installed, so files live under a local path.
Like: C:\Users\Thiago.Oliveira\Box\
I've created this script:
import datetime
import shutil

# File paths inside the locally synced Box folder
origin = r'C:\Users\Thiago.Oliveira\Box\XXXXXXX.xlsx'
target = rf'C:\Users\Thiago.Oliveira\Box\XXXXXXX{datetime.date.today()}.xlsx'

# Copy the file under a new, dated name
shutil.copy(origin, target)
print("Files are copied successfully")
This helps me copy and rename the file inside the Box folder. But I also want to get the URL of the newly created file so I can send it in an e-mail.
I haven't found anything that would help me with this.
Is this possible?
Yes, you can; see the example below.
This example uses the Box Python SDK in a JWT auth application script.
After uploading a file, the Box SDK returns a file object, which has many properties.
I'm not sure what you mean by "pick up the URL": it could be the direct download URL of the file or a shared link. The example below uses the direct download URL.
To get the destination FOLDER_ID, look at the browser URL when you open the folder in the box.com web app.
[image: demo folder URL in the browser]
from datetime import date
import os

from boxsdk import JWTAuth, Client


def main():
    auth = JWTAuth.from_settings_file(".jwt.config.json")
    auth.authenticate_instance()
    client = Client(auth)
    folder_id = "163422716106"

    user = client.user().get()
    print(f"User: {user.id}:{user.name}")

    folder = client.folder(folder_id).get()
    print(f"Folder: {folder.id}:{folder.name}")

    with open("./main.py", "rb") as file:
        basename = os.path.basename(file.name)
        file_name = os.path.splitext(basename)[0]
        file_ext = os.path.splitext(basename)[1]
        file_time = date.today().strftime("%Y%m%d")
        box_file_name = file_name + "_" + file_time + file_ext
        print(f"Uploading {box_file_name} to {folder.name}")
        box_file = folder.upload_stream(file, box_file_name)

    box_file.get()
    print(f"File: {box_file.id}:{box_file.name}")
    print(f"Download URL: {box_file.get_download_url()}")


if __name__ == "__main__":
    main()
    print("Done")
Resulting in:
User: 20344589936:UI-Elements-Sample
Folder: 163422716106:Box UI Elements Demo
Uploading main_20230203.py to Box UI Elements Demo
File: 1130939599686:main_20230203.py
Download URL: https://public.boxcloud.com/d/1/b1!5CgJ...fLc./download
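If a shared link is what you need instead of the direct download URL, the SDK's get_shared_link() method can be called on the same file object. A minimal sketch, assuming your enterprise settings allow "open" shared links (admins can restrict this):
# Hypothetical follow-up to the script above: create a shared link
# for the uploaded file. "open" access may be disabled by admins;
# "company" or "collaborators" are alternatives.
shared_url = box_file.get_shared_link(access="open")
print(f"Shared link: {shared_url}")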

Remove CSVs, add new CSVs with Python in Google API [duplicate]

I have this script written in Python which looks through the folder 'CSVtoGD', lists every CSV there, and sends those CSVs as independent sheets to my Google Drive. I am trying to add a line which will delete the old files when I run the program again. What am I missing here? I am trying to achieve that by using:
sh = gc.del_spreadsheet(filename.split(".")[0]+" TTF")
Unfortunately the script does the same thing after adding this line: it uploads new files but does not delete the old ones.
The whole script looks like this:
import gspread
import os

gc = gspread.oauth(credentials_filename='/users/user/credentials.json')

os.chdir('/users/user/CSVtoGD')
files = os.listdir()
for filename in files:
    if filename.split(".")[1] == "csv":
        folder_id = '19vrbvaeDqWcxFGwPV82APWYTmB'
        sh = gc.del_spreadsheet(filename.split(".")[0] + " TTF")
        sh = gc.create(filename.split(".")[0] + " TTF", folder_id)
        content = open(filename, 'r').read().encode('utf-8')
        gc.import_csv(sh.id, content)
Everything else works fine: the CSVs from the folder are uploaded to Google Drive. My problem is with deleting the old CSVs (which have the same names as the new ones).
When I read the gspread documentation, the argument of the del_spreadsheet method is the file ID. Ref
In your script, you are using the filename as the argument. I thought that this might be the reason for your issue. When this is reflected in your script, it becomes as follows.
From:
sh = gc.del_spreadsheet(filename.split(".")[0]+" TTF")
To:
sh = gc.del_spreadsheet(gc.open(filename.split(".")[0] + " TTF").id)
Note:
When a Spreadsheet named filename.split(".")[0] + " TTF" does not exist, an error occurs. Please be careful about this.
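As a sketch of how to guard against that, assuming gspread's SpreadsheetNotFound exception, which gc.open raises when no Spreadsheet matches the title:
import gspread

try:
    sh = gc.del_spreadsheet(gc.open(filename.split(".")[0] + " TTF").id)
except gspread.SpreadsheetNotFound:
    pass  # nothing to delete on the first run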
Reference:
del_spreadsheet(file_id)
Added:
From your reply "When I try do delete other file using this method from My Drive it is working well.", it was found that my proposed modification can be used for "My Drive", but it seems that it cannot be used for a shared drive.
When I checked the gspread script again, I noticed that its current request cannot search files in a shared drive by filename, and so the Spreadsheet ID cannot be retrieved using gspread alone. For this reason, I would like to propose the following modified script.
Modified script:
import gspread
import os
from googleapiclient.discovery import build

gc = gspread.oauth(credentials_filename='/users/user/credentials.json')
service = build("drive", "v3", credentials=gc.auth)

def getSpreadsheetId(filename):
    q = f"name='{filename}' and mimeType='application/vnd.google-apps.spreadsheet' and trashed=false"
    res = service.files().list(q=q, fields="files(id)", corpora="allDrives", includeItemsFromAllDrives=True, supportsAllDrives=True).execute()
    items = res.get("files", [])
    if not items:
        print("No files found.")
        exit()
    return items[0]["id"]

os.chdir('/users/user/CSVtoGD')
files = os.listdir()
for filename in files:
    fname = filename.split(".")
    if fname[1] == "csv":
        folder_id = '19vrbvaeDqWcxFGwPV82APWYTmB'
        oldSpreadsheetId = getSpreadsheetId(fname[0] + " TTF")
        sh = gc.del_spreadsheet(oldSpreadsheetId)
        sh = gc.create(fname[0] + " TTF", folder_id)
        content = open(filename, "r").read().encode("utf-8")
        gc.import_csv(sh.id, content)
In this modification, googleapis for Python is used in order to retrieve the Spreadsheet ID from the filename in the shared drive. Ref
Note that this assumes you have write permission for the shared drive. Please be careful about this.

How to copy a folder and its content to a new location in SharePoint Online with the Python office365 module

I am trying to copy a folder with all its content from one site to another in SharePoint Online. I wrote code that recursively creates the folder structure.
What I can't do is copy the files. To start, I am testing the following very simple code, which I got from GitHub:
from office365.runtime.auth.client_credential import ClientCredential
from office365.sharepoint.client_context import ClientContext

url = 'https://company.sharepoint.com/sites/MyTeam'
ctx = ClientContext(url).with_credentials(ClientCredential(client_id, client_secret))
source_folder = ctx.web.get_folder_by_server_relative_url('Shared Documents/from')
target_folder = source_folder.copy_to('Shared Documents/to').get().execute_query()
This code doesn't do anything at all. I checked the paths and printed the files in source_folder and target_folder: it prints the files in source_folder and nothing for target_folder. What am I missing here?
You could try using full server-relative URLs, like this:
/sites/MyTeam/Shared Documents/from and /sites/MyTeam/Shared Documents/to
Check if it works.
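Applied to the snippet from the question, that would look like this sketch; the paths are assumptions derived from the site URL https://company.sharepoint.com/sites/MyTeam:
# Same call as in the question, but with server-relative URLs
# that include the site path.
source_folder = ctx.web.get_folder_by_server_relative_url('/sites/MyTeam/Shared Documents/from')
target_folder = source_folder.copy_to('/sites/MyTeam/Shared Documents/to').get().execute_query()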

File not found when I try to upload files to S3 using boto3

I'm following a simple tutorial on YouTube about how to automatically upload files to S3 using Python, and I'm getting this error:
FileNotFoundError: [WinError 2] The system cannot find the file specified: 'age.csv'
This does not make sense to me, because the files are there. This is what my code looks like:
import os
import boto3

client = boto3.client('s3',
                      aws_access_key_id=access_key,
                      aws_secret_access_key=secret_access_key)

path = 'C:/Users/User/Desktop/python/projects/AWS-Data-Processing/example_data'

for file in os.listdir(path):
    upload_file_bucket = 'my-uploaded-data'
    print(file)
    if '.txt' in file:
        upload_file_key_txt = 'txt/' + str(file)
        client.upload_file(file, upload_file_bucket, upload_file_key_txt)
        print("txt")
    elif '.csv' in file:
        upload_file_key_csv = 'csv/' + str(file)
        client.upload_file(file, upload_file_bucket, upload_file_key_csv)
        print("csv")
When I comment out the part where it says:
client.upload_file(file, upload_file_bucket, upload_file_key_txt)
it prints out either "txt" or "csv", and when I comment everything out so it just reads the files, like:
for file in os.listdir(path):
    upload_file_bucket = 'my-uploaded-data'
    print(file)
it successfully prints out the file names. So I don't understand why I get an error saying the file does not exist when it does. It sounds contradictory, and I need some help understanding this error.
I read a post saying I might need to install the AWS CLI, which I did, but it didn't help. I'm guessing the problem lies in the upload_file function, but I just don't understand how there can be no file?
Any advice will be appreciated!
The upload_file function takes a full file path, not just a file name. It cannot guess your directory, so you need to prepend it or iterate over the files in a different way.
Source: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/s3-uploading-files.html
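For example, here is a minimal sketch of the fix, reusing the path, client, and bucket variables from the question and joining the directory onto each file name:
import os

for file in os.listdir(path):
    full_path = os.path.join(path, file)  # upload_file needs the full path
    if file.endswith('.txt'):
        client.upload_file(full_path, upload_file_bucket, 'txt/' + file)
    elif file.endswith('.csv'):
        client.upload_file(full_path, upload_file_bucket, 'csv/' + file)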

How to upload a file to Box using Python

I am new to Python and I want to know if we can upload files from our local system to box.com.
Or can we take help from a mediator like Jenkins to upload these files?
You can use the boxsdk library code below.
def upload_file_to_box(client, folder_id, filename):
    folder = client.folder(folder_id=folder_id)
    items = folder.get_items()
    for item in items:
        if item.name == filename:
            updated_file = client.file(item.id).update_contents(item.name)
            print('File "{0}" has been updated'.format(updated_file.name))
            return
    uploaded_file = folder.upload(filename)
    print('File "{0}" has been uploaded'.format(uploaded_file.name))
This checks a given file name against all file names in the folder and uploads a new version if the file already exists; otherwise it uploads a new file.
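A hypothetical call, assuming a JWT-authenticated client as in the SDK tutorials; the config path, folder ID, and filename are placeholders:
from boxsdk import JWTAuth, Client

auth = JWTAuth.from_settings_file("config.json")  # placeholder config path
client = Client(auth)
upload_file_to_box(client, folder_id="0", filename="report.csv")  # "0" is the root folder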
You can also search for the filename inside a folder with the search API, using the code below. But note that the search API has a time lag of 10 minutes or more.
items = client.search().query(query='"{}"'.format(filename), limit=100, ancestor_folders=[folder])
I don't know if I understood your question correctly, but there is a Python package for connecting to the Box platform through an API: http://opensource.box.com/box-python-sdk/tutorials/intro.html
