All the Dropbox API SO questions and official documentation examples I have seen only show uploading external files to Dropbox with put_file and downloading Dropbox files to external files with get_file.
Is there a way to both read and write files exclusively in the Dropbox file system without creating external files?
You can send strings directly to put_file. It doesn't have to be a file object:
# ... insert example code in OP's SO link to get client object
# uploading
s = 'This is a line\n'
s += 'This is another line'
response = client.put_file('/magnum-opus.txt', s)
And files received using get_file can be accessed directly, without creating an external file:
# downloading
f, metadata = client.get_file_and_metadata('/magnum-opus.txt')
for line in f:
    print line
f.close()
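For reference, the snippets above use the old v1 Python SDK. A minimal sketch of the same in-memory round trip with the current Dropbox Python SDK (v2), where put_file/get_file were replaced by files_upload and files_download; the access token is a placeholder:
import dropbox

dbx = dropbox.Dropbox('ACCESS_TOKEN')  # hypothetical token

# Upload bytes directly; no local file is involved.
dbx.files_upload(b'This is a line\nThis is another line', '/magnum-opus.txt')

# Download straight into memory; response.content holds the file's bytes.
metadata, response = dbx.files_download('/magnum-opus.txt')
print(response.content.decode('utf-8'))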
I have some PDF files which are uploaded on a remote server. I have a URL for each file, and the PDF files can be downloaded by visiting those URLs.
My question is:
I want to merge all the PDF files into a single file, without storing them in a local directory. How can I do that with the Python module PyPDF2?
Please move to pypdf. It's essentially the same as PyPDF2, but the development will continue there (I'm the maintainer of both projects).
Your question is answered in the docs:
https://pypdf.readthedocs.io/en/latest/user/streaming-data.html
Instead of writing to a file, you write to an io.BytesIO stream:
from io import BytesIO

# e.g. writer = PdfWriter()
# ... do what you want to do with the PDFs
with BytesIO() as bytes_stream:
    writer.write(bytes_stream)
    bytes_stream.seek(0)
    data = bytes_stream.read()  # this is now the "bytes" representation
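Putting it together for the original question, here is a minimal sketch that merges PDFs fetched from URLs entirely in memory. It assumes the third-party requests library and a recent pypdf; the URLs are placeholders:
from io import BytesIO

import requests
from pypdf import PdfWriter

urls = [
    'https://example.com/a.pdf',  # hypothetical URLs
    'https://example.com/b.pdf',
]

writer = PdfWriter()
for url in urls:
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    writer.append(BytesIO(response.content))  # append every page of this PDF

with BytesIO() as bytes_stream:
    writer.write(bytes_stream)
    merged_pdf = bytes_stream.getvalue()  # the merged PDF as bytes, never written to disk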
I'm trying to delete all files with the extension '.pdf' from a google drive folder.
Everything is fine with the API authentication, and I can upload the files. The problem is the delete.
Here I upload:
upload_file = 'Test1.pdf'
gfile = drive.CreateFile({'parents': [{'id': '11SsSKYEATgn_VWzSb-8RjRL-VoIxvamC'}]})
gfile.SetContentFile(upload_file)
gfile.Upload()
Here I try to delete:
delfile = drive.CreateFile({'parents': [{'id': '11SsSKYEATgn_VWzSb-8RjRL-VoIxvamC'}]})
filedel = "*.pdf"
delfile.SetContentFile(filedel)
delfile.Delete()
Error:
Traceback (most recent call last):
  File "C:/Users/linol/Documents/ProjetoRPA-Python/RPA-TESTE.py", line 40, in <module>
    delfile.SetContentFile(filedel)
  File "C:\Users\linol\Documents\ProjetoRPA-Python\venv\lib\site-packages\pydrive\files.py", line 175, in SetContentFile
    self.content = open(filename, 'rb')
OSError: [Errno 22] Invalid argument: '*.pdf'
I believe your goal and your current situation are as follows:
You want to delete the PDF files in a specific folder.
You want to achieve this using pydrive for Python.
You have already been able to get and put values to Google Drive using the Drive API.
In this case, I would like to propose the following flow:
Retrieve the list of PDF files in the specific folder.
Delete the files using that list.
When the above flow is reflected in a script, it becomes as follows.
Sample script:
Please modify ### to your folder ID.
# 1. Retrieve the list of PDF files in the specific folder.
fileList = drive.ListFile({'q': "'###' in parents and mimeType='application/pdf'"}).GetList()

# 2. Delete the files using the file list.
for e in fileList:
    drive.CreateFile({'id': e['id']}).Trash()
    # drive.CreateFile({'id': e['id']}).Delete()  # When you use this, the files are completely deleted. Please be careful with this.
This sample script retrieves the files by mimeType. When you want to retrieve the files by filename instead, you can use fileList = drive.ListFile({'q': "'###' in parents and title contains '.pdf'"}).GetList().
IMPORTANT: In this sample script, when Delete() is used, the files are completely deleted from Google Drive. So at first, I would recommend using Trash() instead of Delete() to test the script. That way, the files are not deleted but are moved to the trash, so you can test safely.
Note:
It seems that PyDrive uses Drive API v2. Please be aware of this.
Reference:
PyDrive
I am writing a script to pull XML files from an FTP server, turn them into an .xlsx file, and re-upload it to a different directory on the same FTP server. I want to create the .xlsx file within my script instead of copying the XML data into a template and uploading my local file.
I tried creating a filename for the .xlsx doc, but I realized that I need to save it before I can upload it to the FTP server. My question is: would it be better to create a temporary folder on the server the script is being run on and empty the folder out afterwards, or is there a way to upload the doc without saving it anywhere (preferred)? I will be running the script on a Windows server.
ftps.cwd(ftpExcelDir)
wbFilename = str(orderID + '.xlsx')
savedFile = saving the file somewhere  # this is the part I'm having trouble with
ftps.storline('STOR ' + wbFilename, savedFile)
With the following code, I can get the .xlsx files to save to the FTP server, but I receive an invalid extension/corrupt file error from Excel:
ftps.cwd(ftpExcelDir)
wbFilename = str(orderID + '.xlsx')
inMemoryWB = io.BytesIO()
wb.save(inMemoryWB)
ftps.storbinary('STOR ' + wbFilename, inMemoryWB)
The FTP functions take file objects... but those don't, strictly speaking, need to be files. Python has BytesIO and StringIO objects which act like files but are backed by memory. See: https://stackoverflow.com/a/44672691/8833934
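In the second snippet above, the likely cause of the corrupt-file error is that wb.save() leaves the buffer's position at the end, so storbinary uploads nothing. A sketch with the missing rewind added, reusing the question's own names (ftps, wb, orderID, ftpExcelDir):
import io

ftps.cwd(ftpExcelDir)
wbFilename = str(orderID) + '.xlsx'
inMemoryWB = io.BytesIO()
wb.save(inMemoryWB)  # the workbook is written into the buffer...
inMemoryWB.seek(0)   # ...leaving the position at the end, so rewind first
ftps.storbinary('STOR ' + wbFilename, inMemoryWB)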
I want to manipulate a downloaded PDF using PyPDF and for that, I need a file object.
I use GAE to host my Python app, so I cannot actually write the file to disk.
Is there any way to obtain the file object from URL or from a variable that contains the file contents?
TIA.
Most tools (including urllib) already give you a file-like object, but if you need true random access then you'll need to create a StringIO.StringIO and read the data into it.
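For instance, a minimal sketch in the Python 2 style of the era, assuming the old pyPdf package named in the question; the URL is a placeholder:
import urllib2
import StringIO
from pyPdf import PdfFileReader

response = urllib2.urlopen('http://example.com/some.pdf')  # hypothetical URL
pdf_buffer = StringIO.StringIO(response.read())  # seekable in-memory copy
pdf = PdfFileReader(pdf_buffer)
print pdf.getNumPages()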
In GAE you can use the blobstore to read and write file data, and to upload and download files. You can use the Files API:
Example:
from google.appengine.api import files

_file = files.blobstore.create(mime_type=mimetype, _blobinfo_uploaded_filename='test')
with files.open(_file, 'a') as f:
    f.write(somedata)
files.finalize(_file)
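To read the data back, a hedged sketch: after finalize you can look up the blob key and wrap it in a BlobReader, which behaves like a file object you can hand to any consumer that expects one:
from google.appengine.ext import blobstore

blob_key = files.blobstore.get_blob_key(_file)
reader = blobstore.BlobReader(blob_key)
data = reader.read()
reader.close()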
An application creates a zip file and stores it in Google Storage. I am working on the web server side, which is responsible for grabbing the zip file, extracting the contents, etc. The web service is written in Python on App Engine. So far I have set up the credentials and, by following the tutorials, I am able to list bucket contents. Although I am able to read the metadata and file information, I am not able to get the contents into the zipfile module. Following this guide, I have ended up with the following, which is not working:
uri = boto.storage_uri(BUCKET_NAME, GOOGLE_STORAGE)
objs = uri.get_bucket()
files = []
for obj in objs:
    object_contents = StringIO.StringIO()
    if obj.get_key():
        obj.get_file(object_contents)
        my_zip = zipfile.ZipFile(object_contents)
        object_contents.close()
        files.append(my_zip)
print len(files)
I am getting: BadZipfile: File is not a zip file
How can I read the content properly?
There's now an API that lets you manipulate objects in Google Storage directly. Using that, you can get a file-like object for the zipfile, and simply pass that to the zipfile module.
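For example, a hedged sketch assuming the App Engine Google Cloud Storage client library (cloudstorage); the bucket and object names are placeholders:
import zipfile
import cloudstorage as gcs

gcs_file = gcs.open('/my-bucket/archive.zip')  # a readable, seekable file-like object
my_zip = zipfile.ZipFile(gcs_file)
print my_zip.namelist()  # list the archive's contents without extracting to disk
gcs_file.close()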