I'm trying to delete all files with the extension '.pdf' from a Google Drive folder.
Everything is fine with the API authentication and I can upload files; the problem is the delete.
Here is how I upload:
upload_file = 'Test1.pdf'
gfile = drive.CreateFile({'parents': [{'id': '11SsSKYEATgn_VWzSb-8RjRL-VoIxvamC'}]})
gfile.SetContentFile(upload_file)
gfile.Upload()
Here is how I try to delete:
delfile = drive.CreateFile({'parents': [{'id': '11SsSKYEATgn_VWzSb-8RjRL-VoIxvamC'}]})
filedel = "*.pdf"
delfile.SetContentFile(filedel)
delfile.Delete()
Error:
Traceback (most recent call last):
File "C:/Users/linol/Documents/ProjetoRPA-Python/RPA-TESTE.py", line 40, in <module>
delfile.SetContentFile(filedel)
File "C:\Users\linol\Documents\ProjetoRPA-Python\venv\lib\site-packages\pydrive\files.py", line 175, in SetContentFile
self.content = open(filename, 'rb')
OSError: [Errno 22] Invalid argument: '*.pdf'
I believe your goal and current situation are as follows.
You want to delete the PDF files in a specific folder.
You want to achieve this using PyDrive for Python.
You have already been able to get and put files on Google Drive using the Drive API.
In this case, I would like to propose the following flow:
Retrieve the list of PDF files in the specific folder.
Delete the files using that list.
When this flow is reflected in a script, it becomes the following.
Sample script:
Please replace ### with your folder ID.
# 1. Retrieve the list of PDF files in the specific folder.
fileList = drive.ListFile({'q': "'###' in parents and mimeType='application/pdf'"}).GetList()
# 2. Delete the files using the file list.
for e in fileList:
    drive.CreateFile({'id': e['id']}).Trash()
    # drive.CreateFile({'id': e['id']}).Delete()  # This permanently deletes the files. Please be careful with it.
This sample script retrieves the files by mimeType. If you want to retrieve them by filename instead, you can also use fileList = drive.ListFile({'q': "'###' in parents and title contains '.pdf'"}).GetList().
IMPORTANT: When Delete() is used, the files are permanently deleted from Google Drive. So I recommend using Trash() instead of Delete() while first testing the script; the files are then only moved to the trash rather than deleted, which lets you verify the script safely.
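If you trash files while testing and later want to restore them, recent PyDrive versions also expose UnTrash() on the same file object. A minimal sketch, assuming the same drive object and the placeholder folder ID ### used above:

# Trash matching PDFs (reversible), remembering their IDs.
fileList = drive.ListFile({'q': "'###' in parents and mimeType='application/pdf' and trashed=false"}).GetList()
trashed_ids = []
for f in fileList:
    f.Trash()                      # move to trash; can be undone
    trashed_ids.append(f['id'])

# Restore everything if the wrong files were trashed.
for file_id in trashed_ids:
    drive.CreateFile({'id': file_id}).UnTrash()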
Note:
It seems that PyDrive uses Drive API v2, so queries and metadata use fields such as title rather than the v3 name. Please be aware of this.
Reference:
PyDrive
Related
I wrote a short function in Google Apps script that can make a copy of a specific file that is stored on Google Drive. The purpose of it is that this file is a template and every time I want to create a new document for work I make a copy of this template and just change the title of the document. The code that I wrote to make a copy of the file and store it in the specific folder that I want is very simple:
function copyFile() {
  var file = DriveApp.getFileById("############################################");
  var folder = DriveApp.getFolderById("############################");
  var filename = "Copy of Template";
  file.makeCopy(filename, folder);
}
This function takes a specific file and a specific folder, both based on ID, and puts the copy, entitled "Copy of Template", into that folder.
I have been searching all over and I cannot seem to find this. Is there a way to do the exact same thing, but using Python instead? Or, at the very least, is there a way to have Python call that Apps Script function? I need this to be done in Python because I am writing a script that does many things at once whenever I start a new project for work, such as creating a new document from a template in Google Drive as well as other things that are not related to Google Drive at all and therefore cannot be done in Google Apps Script.
There are a few tutorials around the web that give partial answers. Here is a step-by-step guide of what you need to do.
Open Command prompt and type (without the quotes) "pip install PyDrive"
Follow step one of the instructions here - https://developers.google.com/drive/v3/web/quickstart/python - to set up an account
When that is done, click on Download JSON and a file will be downloaded. Make sure to rename that to client_secrets.json, not client_secret.json as the Quick Start says to do.
Next, make sure to put that file in the same directory as your python script. If you are running the script from a console, that directory might be your username directory.
I assume that you already know the ID of the folder you are placing the file in and the ID of the file you are copying. If you don't, there are tutorials on how to find them using Python, or you can open the file in Docs and find the ID in its URL. Basically, enter the ID of the folder and the ID of the file, and when you run this script it will make a copy of the chosen file and place it in the chosen folder.
One thing to note: while the script is running, a browser window will open and ask for permission; just click accept and the script will complete.
In order for this to work you might have to enable the Google Drive API, which is in the API's section.
Python Script:
## Create a new Document in Google Drive
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
gauth = GoogleAuth()
gauth.LocalWebserverAuth()
drive = GoogleDrive(gauth)
folder = "########"
title = "Copy of my other file"
file = "############"
drive.auth.service.files().copy(fileId=file,
                                body={"parents": [{"kind": "drive#fileLink",
                                                   "id": folder}],
                                      'title': title}).execute()
From https://developers.google.com/drive/v2/reference/files/copy
from apiclient import errors
# ...
def copy_file(service, origin_file_id, copy_title):
    """Copy an existing file.

    Args:
        service: Drive API service instance.
        origin_file_id: ID of the origin file to copy.
        copy_title: Title of the copy.

    Returns:
        The copied file if successful, None otherwise.
    """
    copied_file = {'title': copy_title}
    try:
        return service.files().copy(
            fileId=origin_file_id, body=copied_file).execute()
    except errors.HttpError as error:
        print('An error occurred: %s' % error)
        return None
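For reference, a hypothetical usage of the function above, assuming service is an authorized Drive API v2 service instance built elsewhere (for example via the v2 quickstart) and the file ID is a placeholder:

copied = copy_file(service, 'originFileIdHere', 'Copy of Template')
if copied is not None:
    print('Created copy with id %s' % copied['id'])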
With API v3:
Copy a file to a directory with a different name:
service.files().copy(fileId='PutFileIDHere', body={"parents": ['ParentFolderID'], 'name': 'NewFileName'} ).execute()
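The one-liner above assumes an authorized v3 service object already exists. A minimal sketch of building one with google-api-python-client and google-auth-oauthlib; the client_secrets.json file name and the IDs are placeholders:

from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build

# One-off OAuth flow using the OAuth client file downloaded from the Google Cloud console.
flow = InstalledAppFlow.from_client_secrets_file(
    'client_secrets.json',
    scopes=['https://www.googleapis.com/auth/drive'])
creds = flow.run_local_server(port=0)

service = build('drive', 'v3', credentials=creds)

# v3 uses 'name' (not 'title') and a plain list of parent folder IDs.
copied = service.files().copy(
    fileId='PutFileIDHere',
    body={'parents': ['ParentFolderID'], 'name': 'NewFileName'}
).execute()
print(copied.get('id'))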
For me, the answer by @Rashi worked with a small modification.
instead of:
'name': 'NewFileName'
this worked:
'title': 'NewFileName'
The directory structure on Google Drive is as follows:
Inside mydrive/BTP/BTP-4
I need to get the folder ID for BTP-4 so that I can transfer a specific file from the folder. How do I do it?
fileList = GoogleDrive(self.driveConn).ListFile({'q': "'root' in parents and trashed=false"}).GetList()
for file in fileList:
    if file['title'] == "BTP-4":
        fileID = file['id']
        print(remoteFile, fileID)
        return fileID
Will I be able to give a path like /MyDrive/BTP/BTP-4 and a filename such as "test.csv" and then directly download the file?
Answer:
Unfortunately, this is not possible.
More Information:
Google Drive supports creating multiple files or folders with the same name in the same location.
As a result of this, in some cases providing a file path isn't enough to identify a file or folder uniquely - for example, mydrive/Parent folder/Child folder/Child doc could point to two different files, and mydrive/Parent folder/Child folder/Child folder to five different folders.
You have to either reference the folder directly by its ID, or, to get a folder's or file's ID, search for children recursively through the folders like you are already doing.
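If the folder names along the path are unique in practice, the recursive lookup can be wrapped in a small helper. A sketch using PyDrive (Drive API v2 query syntax, so the filename field is title); resolve_path is a hypothetical helper name and drive is an authenticated GoogleDrive instance:

def resolve_path(drive, path):
    """Walk a path like 'BTP/BTP-4' from the Drive root and return the folder ID."""
    parent_id = 'root'
    for segment in path.strip('/').split('/'):
        query = ("'%s' in parents and title = '%s' and "
                 "mimeType = 'application/vnd.google-apps.folder' and trashed = false"
                 % (parent_id, segment))
        matches = drive.ListFile({'q': query}).GetList()
        if len(matches) != 1:
            raise ValueError('%d folders named %r under parent %s'
                             % (len(matches), segment, parent_id))
        parent_id = matches[0]['id']
    return parent_id

# Hypothetical usage: resolve BTP/BTP-4, then download test.csv from it.
folder_id = resolve_path(drive, 'BTP/BTP-4')
hits = drive.ListFile({'q': "'%s' in parents and title = 'test.csv' and trashed = false" % folder_id}).GetList()
if hits:
    hits[0].GetContentFile('test.csv')  # saves to the local working directory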
I have written the following code to extract zip files in a directory and delete a particular Excel file in the extracted directory:
def extractZipFiles(dest_directory):
    "This function extracts zip files in the destination directory for further processing"
    fileFullPath = dest_directory + '\\'
    extractedDirList = list()
    for file in os.listdir(dest_directory):
        dn = fileFullPath + file
        dn = re.sub(r'\.zip$', "", fileFullPath + file)  # remove the trailing .zip
        extractedDirList.append(dn)
        zf = zipfile.ZipFile(fileFullPath + file, mode='r')
        zf.extractall(dn)  # extract the contents of that zip to the empty directory
        zf.close()
    return extractedDirList
def removeSelectedReports(extractedDirList):
    "This function removes the selected reports from extracted directory"
    for i in range(len(extractedDirList)):
        for filename in os.listdir(extractedDirList[i]):
            if filename.startswith("ABC_8"):
                logger.info("File to be removed::" + filename)
                fullPathName = "%s/%s" % (extractedDirList[i], filename)
                os.remove(fullPathName)
    return
extractedDirList = extractZipFiles(attributionRptDestDir)
logger.info("ZIP FILES EXTRACTED:"+str(extractedDirList))
removeSelectedReports(extractedDirList)
I am getting the following intermittent error even though I have closed the zip file handle.
[WinError 32] The process cannot access the file because it is being used by another process: '\\\\share\\Workingdirectory\\report.20180517.zip'
Can you please help resolve this issue?
You should try to figure out what has the file open. Based on your code, it looks like you are on Microsoft Windows.
I would stop all applications on your workstation, including browsers, run with only a minimum number of apps open, and reproduce the problem. Once it is reproduced, you can use a tool that lists all handles open to a particular file.
A handy utility would be handle.exe, but please use any tool with similar functionality.
Once you find the offending application, you can further investigate why the file is open, and take counter measures.
I would be careful not to close any application which has the file open, until you know it is safe to do so.
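If the lock turns out to be transient (for example an antivirus or indexer briefly scanning the share), one possible counter measure, which goes beyond the steps above, is to retry opening the zip a few times before giving up. A rough sketch:

import time
import zipfile

def open_zip_with_retry(path, attempts=5, delay_seconds=2.0):
    """Try to open a zip file, retrying when Windows reports it as in use.

    This only works around transient locks; it does not replace finding the
    real owner of the handle with a tool such as handle.exe.
    """
    for attempt in range(attempts):
        try:
            return zipfile.ZipFile(path, mode='r')
        except OSError:
            if attempt == attempts - 1:
                raise
            time.sleep(delay_seconds)

# Example: use it in place of zipfile.ZipFile() inside the extraction loop above.
# zf = open_zip_with_retry(fileFullPath + file)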
All the Dropbox API SO questions and official documentation I have seen only give examples of uploading files that are external to Dropbox and then downloading Dropbox files to external files using put_file and get_file respectively.
Is there a way to both read and write files exclusively in the Dropbox file system without creating external files?
You can send strings directly to put_file. It doesn't have to be a file object:
# ... insert example code in OP's SO link to get client object
# uploading
s = 'This is a line\n'
s += 'This is another line'
response = client.put_file('/magnum-opus.txt', s)
And files received using get_file can be accessed directly without creating an external file:
# downloading
f, metadata = client.get_file_and_metadata('/magnum-opus.txt')
for line in f:
    print(line)
f.close()
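The snippet above uses the old v1 Python client (put_file/get_file), which has since been retired. The same in-memory round trip with the current dropbox SDK (v2-style API) looks roughly like this; the access token is a placeholder, files_upload takes bytes, and files_download returns the metadata plus an HTTP response whose content is the file body:

import dropbox

dbx = dropbox.Dropbox('ACCESS_TOKEN')  # placeholder token

# Upload a string without creating a local file.
text = 'This is a line\nThis is another line'
dbx.files_upload(text.encode('utf-8'), '/magnum-opus.txt',
                 mode=dropbox.files.WriteMode.overwrite)

# Download straight into memory.
metadata, response = dbx.files_download('/magnum-opus.txt')
for line in response.content.decode('utf-8').splitlines():
    print(line)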
An application creates a zip file and stores it in Google Storage. I am working on the web server side, which is responsible for grabbing the zip file, extracting the contents, etc. The web service is written in Python on App Engine. So far I have set up the credentials and, by following the tutorials, I am able to list bucket contents. Although I am able to read the metadata and file information, I am not able to get the contents into the zipfile module. Following this
guide, I have ended up with the following, which is not working:
uri = boto.storage_uri(BUCKET_NAME, GOOGLE_STORAGE)
objs = uri.get_bucket()
files = []
for obj in objs:
    object_contents = StringIO.StringIO()
    if obj.get_key():
        obj.get_file(object_contents)
        my_zip = zipfile.ZipFile(object_contents)
        object_contents.close()
        files.append(my_zip)
print(len(files))
I am getting: BadZipfile: File is not a zip file
How can I read the content properly?
There's now an API that lets you manipulate objects in Google Storage directly. Using that, you can get a file-like object for the zipfile, and simply pass that to the zipfile module.
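For example, with the standalone google-cloud-storage client (used here as an assumption; it is a different library from the App Engine-specific one), you can download the object into memory and hand zipfile a seekable buffer:

import io
import zipfile

from google.cloud import storage

client = storage.Client()
bucket = client.bucket('BUCKET_NAME')        # placeholder bucket name
blob = bucket.blob('path/to/archive.zip')    # placeholder object name

# Download the whole object into memory; zipfile needs a seekable file-like object.
buffer = io.BytesIO(blob.download_as_bytes())
with zipfile.ZipFile(buffer) as archive:
    print(archive.namelist())
    data = archive.read(archive.namelist()[0])  # bytes of the first member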