Download a file from Google Drive with Python

How do I download a file from Google Drive?
I am using PyDrive with the following link:
#https://drive.google.com/open?id=DWADADDSASWADSCDAW
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
gauth = GoogleAuth()
drive = GoogleDrive(gauth)
gdrive_file = drive.CreateFile({'id': 'id=DWADADDSASWADSCDAW'})
gdrive_file.GetContentFile('DWADSDCXZCDWA.zip') # Download content file.
Error:
Traceback (most recent call last):
File "C:\Users\Hoxton\AppData\Local\Continuum\miniconda3\lib\site-packages\oauth2client\clientsecrets.py", line 121, in _loadfile
with open(filename, 'r') as fp:
FileNotFoundError: [Errno 2] No such file or directory: 'client_secrets.json'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\Hoxton\AppData\Local\Continuum\miniconda3\lib\site-packages\pydrive\auth.py", line 386, in LoadClientConfigFile
client_type, client_info = clientsecrets.loadfile(client_config_file)
File "C:\Users\Hoxton\AppData\Local\Continuum\miniconda3\lib\site-packages\oauth2client\clientsecrets.py", line 165, in loadfile
return _loadfile(filename)
File "C:\Users\Hoxton\AppData\Local\Continuum\miniconda3\lib\site-packages\oauth2client\clientsecrets.py", line 125, in _loadfile
exc.strerror, exc.errno)
oauth2client.clientsecrets.InvalidClientSecretsError: ('Error opening file', 'client_secrets.json', 'No such file or directory', 2)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/Hoxton/123/pyu_test.py", line 8, in <module>
gdrive_file.GetContentFile('PyUpdater+App-win-1.0.zip') # Download content file.
File "C:\Users\Hoxton\AppData\Local\Continuum\miniconda3\lib\site-packages\pydrive\files.py", line 210, in GetContentFile
self.FetchContent(mimetype, remove_bom)
File "C:\Users\Hoxton\AppData\Local\Continuum\miniconda3\lib\site-packages\pydrive\files.py", line 42, in _decorated
self.FetchMetadata()
File "C:\Users\Hoxton\AppData\Local\Continuum\miniconda3\lib\site-packages\pydrive\auth.py", line 57, in _decorated
self.auth.LocalWebserverAuth()
File "C:\Users\Hoxton\AppData\Local\Continuum\miniconda3\lib\site-packages\pydrive\auth.py", line 113, in _decorated
self.GetFlow()
File "C:\Users\Hoxton\AppData\Local\Continuum\miniconda3\lib\site-packages\pydrive\auth.py", line 443, in GetFlow
self.LoadClientConfig()
File "C:\Users\Hoxton\AppData\Local\Continuum\miniconda3\lib\site-packages\pydrive\auth.py", line 366, in LoadClientConfig
self.LoadClientConfigFile()
File "C:\Users\Hoxton\AppData\Local\Continuum\miniconda3\lib\site-packages\pydrive\auth.py", line 388, in LoadClientConfigFile
raise InvalidConfigError('Invalid client secrets file %s' % error)
pydrive.settings.InvalidConfigError: Invalid client secrets file ('Error opening file', 'client_secrets.json', 'No such file or directory', 2)
Process finished with exit code 1

Try the provided sample code in the documentation.
The Drive API allows you to download files that are stored in Google
Drive. Also, you can download exported versions of Google Documents
(Documents, Spreadsheets, Presentations, etc.) in formats that your
app can handle. Drive also supports providing users direct access to a
file via the URL in the webViewLink property.
Here is the code snippet:
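import io
from googleapiclient.http import MediaIoBaseDownload
# drive_service is assumed to be an already authorized Drive API service object
file_id = '0BwwA4oUTeiV1UVNwOHItT0xfa2M'
request = drive_service.files().get_media(fileId=file_id)
fh = io.BytesIO()  # in-memory buffer that receives the file contents
downloader = MediaIoBaseDownload(fh, request)
done = False
while not done:
    # Each call fetches the next chunk and reports progress
    status, done = downloader.next_chunk()
    print("Download %d%%." % int(status.progress() * 100))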

This works for me:
from google_drive_downloader import GoogleDriveDownloader as gdd
gdd.download_file_from_google_drive(file_id='1z8e2CnvrX8ZSu2kk0QgiFWurOKMr0', dest_path='E:/model.h5')
Source: https://newbedev.com/python-download-files-from-google-drive-using-url
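The snippets above all take a bare file ID, while the code in the question passes 'id=DWADADDSASWADSCDAW', which still carries the `id=` prefix from the link and is probably not what the API expects. Here is a minimal sketch for pulling the ID out of a share link, assuming the link has one of the two shapes named in the docstring. `extract_drive_file_id` is a hypothetical helper, not part of PyDrive or googledrivedownloader.

```python
from urllib.parse import urlparse, parse_qs


def extract_drive_file_id(url):
    """Return the file ID from a Google Drive share link, or None.

    Handles two link shapes (an assumption, not an exhaustive list):
      https://drive.google.com/open?id=<ID>
      https://drive.google.com/file/d/<ID>/view
    """
    parsed = urlparse(url)
    # Shape 1: the ID is the "id" query parameter
    query_ids = parse_qs(parsed.query).get("id")
    if query_ids:
        return query_ids[0]
    # Shape 2: the ID is the path segment right after "d"
    parts = [p for p in parsed.path.split("/") if p]
    if "d" in parts:
        idx = parts.index("d")
        if idx + 1 < len(parts):
            return parts[idx + 1]
    return None


file_id = extract_drive_file_id("https://drive.google.com/open?id=DWADADDSASWADSCDAW")
print(file_id)  # DWADADDSASWADSCDAW
```

The returned string can then be passed as the 'id' value or the file_id argument in the examples above.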

Hey I know it's a bit late to answer, but it might still be helpful to someone.
I had a similar problem with Google Sheets. The problem here is that the file may be available in several formats and you're not specifying which one you want. To do this, add the mimetype parameter to the GetContentFile method, like so:
gdrive_file.GetContentFile('DWADSDCXZCDWA.zip', mimetype = 'application/zip')
Note that there are multiple mimetypes for zip files and that the mimetype and extension need to agree. So you need to know which one to use, or just try out different ones if you don't. Here's a handy list:
application/x-compressed
application/x-zip-compressed
application/zip
multipart/x-zip
Furthermore, if you actually access the metadata of the file you can have a peek at all the types of formats you can export it in under 'exportLinks'. There will be a dict with mimetypes and the associated links.

Related

Converting all word document in a folder to txt using python

I am trying to convert Word documents to txt. Changing the extension doesn't work; I need to open each file in Word and then save it in .txt format.
I am using code from here http://code.activestate.com/recipes/279003-converting-word-documents-to-text/
import fnmatch, os, pythoncom, sys, win32com.client
wordapp = win32com.client.gencache.EnsureDispatch("Word.Application")
try:
    for path, dirs, files in os.walk(sys.argv[1]):
        for doc in [os.path.abspath(os.path.join(path, filename)) for filename in files if fnmatch.fnmatch(filename, '*.doc')]:
            print "processing %s" % doc
            wordapp.Documents.Open(doc)
            docastxt = doc.rstrip('doc') + 'txt'
            wordapp.ActiveDocument.SaveAs(docastxt, FileFormat=win32com.client.constants.wdFormatTextLineBreaks)
            wordapp.ActiveWindow.Close()
finally:
    wordapp.Quit()
However, when I run it, the first file is converted perfectly but the second file gives an error.
Here is the error message:
Traceback (most recent call last):
File "recipe-279003-1.py", line 11, in <module>
wordapp.ActiveDocument.SaveAs(docastxt, FileFormat=win32com.client.constants.wdFormatTextLineBreaks)
File "C:\Users\moxzhang\AppData\Local\Continuum\anaconda3\lib\site-packages\win32com\client\__init__.py", line 474, in __getattr__
return self._ApplyTypes_(*args)
File "C:\Users\moxzhang\AppData\Local\Continuum\anaconda3\lib\site-packages\win32com\client\__init__.py", line 467, in _ApplyTypes_
self._oleobj_.InvokeTypes(dispid, 0, wFlags, retType, argTypes, *args),
pywintypes.com_error: (-2147023179, 'The interface is unknown.', None, None)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "recipe-279003-1.py", line 14, in <module>
wordapp.Quit()
File "C:\Users\moxzhang\AppData\Local\Temp\gen_py\3.7\00020905-0000-0000-C000-000000000046x0x8x7\_Application.py", line 353, in Quit
, OriginalFormat, RouteDocument)
pywintypes.com_error: (-2147023179, 'The interface is unknown.', None, None)
I have tried putting only one Word file in the directory, and it works fine. But once I have two files there, the procedure fails on the second file (either file converts successfully if it is the only file in the folder).
Could you please let me know what this error message means, and what I can do to fix it?

Reading password protected Word Documents with zipfile

I am trying to read a password-protected Word document in Python using zipfile.
The following code works with a non-password protected document, but gives an error when used with a password protected file.
try:
    from xml.etree.cElementTree import XML
except ImportError:
    from xml.etree.ElementTree import XML
import zipfile
psw = "1234"
WORD_NAMESPACE = '{http://schemas.openxmlformats.org/wordprocessingml/2006/main}'
PARA = WORD_NAMESPACE + 'p'
TEXT = WORD_NAMESPACE + 't'
def get_docx_text(path):
    document = zipfile.ZipFile(path, "r")
    document.setpassword(psw)
    document.extractall()
    xml_content = document.read('word/document.xml')
    document.close()
    tree = XML(xml_content)
    paragraphs = []
    for paragraph in tree.getiterator(PARA):
        texts = [node.text
                 for node in paragraph.getiterator(TEXT)
                 if node.text]
        if texts:
            paragraphs.append(''.join(texts))
    return '\n\n'.join(paragraphs)
When running get_docx_text() with a password protected file, I received the following error:
Traceback (most recent call last):
File "<ipython-input-15-d2783899bfe5>", line 1, in <module>
runfile('/Users/username/Workspace/Python/docx2txt.py', wdir='/Users/username/Workspace/Python')
File "/Applications/Spyder-Py2.app/Contents/Resources/lib/python2.7/spyderlib/widgets/externalshell/sitecustomize.py", line 680, in runfile
execfile(filename, namespace)
File "/Applications/Spyder-Py2.app/Contents/Resources/lib/python2.7/spyderlib/widgets/externalshell/sitecustomize.py", line 78, in execfile
builtins.execfile(filename, *where)
File "/Users/username/Workspace/Python/docx2txt.py", line 41, in <module>
x = get_docx_text("/Users/username/Desktop/file.docx")
File "/Users/username/Workspace/Python/docx2txt.py", line 23, in get_docx_text
document = zipfile.ZipFile(path, "r")
File "zipfile.pyc", line 770, in __init__
File "zipfile.pyc", line 811, in _RealGetContents
BadZipfile: File is not a zip file
Does anyone have any advice to get this code to work?
I don't think this is an encryption problem, for two reasons:
Decryption is not attempted when the ZipFile object is created. Methods like ZipFile.extractall, extract, open, and read take an optional pwd parameter containing the password, but the object constructor / initializer does not.
Your stack trace indicates that the BadZipFile is being raised when you create the ZipFile object, before you call setpassword:
document = zipfile.ZipFile(path, "r")
I'd look carefully for other differences between the two files you're testing: ownership, permissions, security context (if you have that on your OS), ... even filename differences can cause a framework to "not see" the file you're working on.
Also --- the obvious one --- try opening the encrypted zip file with your zip-compatible command of choice. See if it really is a zip file.
I tested this by opening an encrypted zip file in Python 3.1, while "forgetting" to provide a password. I could create the ZipFile object (the variable zfile below) without any error, but got a RuntimeError --- not a BadZipFile exception --- when I tried to read a file without providing a password:
Traceback (most recent call last):
File "./zf.py", line 35, in <module>
main()
File "./zf.py", line 29, in main
print_checksums(zipfile_name)
File "./zf.py", line 22, in print_checksums
for checksum in checksum_contents(zipfile_name):
File "./zf.py", line 13, in checksum_contents
inner_file = zfile.open(inner_filename, "r")
File "/usr/lib64/python3.1/zipfile.py", line 903, in open
"password required for extraction" % name)
RuntimeError: File apache.log is encrypted, password required for extraction
I was also able to raise a BadZipfile exception, once by trying to open an empty file and once by trying to open some random logfile text that I'd renamed to a ".zip" extension. The two test files produced identical stack traces, down to the line numbers.
Traceback (most recent call last):
File "./zf.py", line 35, in <module>
main()
File "./zf.py", line 29, in main
print_checksums(zipfile_name)
File "./zf.py", line 22, in print_checksums
for checksum in checksum_contents(zipfile_name):
File "./zf.py", line 10, in checksum_contents
zfile = zipfile.ZipFile(zipfile_name, "r")
File "/usr/lib64/python3.1/zipfile.py", line 706, in __init__
self._GetContents()
File "/usr/lib64/python3.1/zipfile.py", line 726, in _GetContents
self._RealGetContents()
File "/usr/lib64/python3.1/zipfile.py", line 738, in _RealGetContents
raise BadZipfile("File is not a zip file")
zipfile.BadZipfile: File is not a zip file
This stack trace isn't exactly the same as yours (mine has a call to _GetContents, and the pre-3.2 "small f" spelling of BadZipfile), but they're close enough that I think this is the kind of problem you're dealing with.
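The "see if it really is a zip file" check suggested above can also be done from Python. This is a rough sketch, not a definitive diagnosis: it uses zipfile.is_zipfile and, as an assumption about the cause, also looks for the OLE compound file signature, since Word may store a password-protected .docx in that container instead of a plain zip. `describe_container` and the demo file names are hypothetical.

```python
import os
import tempfile
import zipfile

# Signature bytes of an OLE compound file. Assumption: this is the container
# used when a .docx is saved with a password to open.
OLE_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"


def describe_container(path):
    """Classify a file as 'zip', 'ole', or 'unknown' from its contents."""
    if zipfile.is_zipfile(path):
        return "zip"
    with open(path, "rb") as fh:
        if fh.read(len(OLE_MAGIC)) == OLE_MAGIC:
            return "ole"
    return "unknown"


# Demo with throwaway files standing in for real documents
tmpdir = tempfile.mkdtemp()
zip_path = os.path.join(tmpdir, "plain.docx")
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("word/document.xml", "<doc/>")
ole_path = os.path.join(tmpdir, "protected.docx")
with open(ole_path, "wb") as fh:
    fh.write(OLE_MAGIC + b"\x00" * 64)

print(describe_container(zip_path))  # zip
print(describe_container(ole_path))  # ole
```

If the real file comes back as anything other than 'zip', zipfile.ZipFile will raise BadZipfile on it no matter which password is set afterwards.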

Getting an EOFError when getting large files with Paramiko

I'm trying to write a quick python script to grab some logs with sftp. My first inclination was to use Pysftp, since it seemed like it made it very simple. It worked great, until it got to a larger file. I got an error while getting any file over about 13 MB. I then decided to try writing what I needed directly in Paramiko, rather than relying on the extra layer of Pysftp. After figuring out how to do that, I ended up getting the exact same error. Here's the Paramiko code, as well as the trace from the error I get. Does anyone have any idea why this would have an issue pulling any largish files? Thanks.
# Create transport and connect
transport = paramiko.Transport((host, 22))
transport.connect(username=username, password=password)
sftp = paramiko.SFTPClient.from_transport(transport)
# List of the log files in c:
files = sftp.listdir('c:/logs')
# Now pull them, logging as you go
for f in files:
    if f[0].lower() == 't' or f[:3].lower() == 'std':
        logger.info('Pulling {0}'.format(f))
        sftp.get('c:/logs/{0}'.format(f), output_dir +'/{0}'.format(f))
# Close the connection
sftp.close()
transport.close()
And here's the error:
No handlers could be found for logger "paramiko.transport"
Traceback (most recent call last):
File "pull_logs.py", line 420, in <module>
main()
File "pull_logs.py", line 410, in main
pull_logs(username, host, password, location)
File "pull_logs.py", line 142, in pull_logs
sftp.get('c:/logs/{0}'.format(f), output_dir +'/{0}'.format(f))
File "/Users/me/my_site/site_packages/paramiko/sftp_client.py", line 676, in get
size = self.getfo(remotepath, fl, callback)
File "/Users/me/my_site/site_packages/paramiko/sftp_client.py", line 645, in getfo
data = fr.read(32768)
File "/Users/me/my_site/site_packages/paramiko/file.py", line 153, in read
new_data = self._read(read_size)
File "/Users/me/my_site/site_packages/paramiko/sftp_file.py", line 152, in _read
data = self._read_prefetch(size)
File "/Users/me/my_site/site_packages/paramiko/sftp_file.py", line 132, in _read_prefetch
self.sftp._read_response()
File "/Users/me/my_site/site_packages/paramiko/sftp_client.py", line 721, in _read_response
raise SSHException('Server connection dropped: %s' % (str(e),))
paramiko.SSHException: Server connection dropped:

wget loop to all the lines (url) in textfile and download Windows

I have a simple task but cannot make my code work. I want to loop over the URLs listed in my text file and download each one using the wget module in Python. Each URL is on a separate line in the text file.
Basically, this is the structure of the list in my textfile:
http://e4ftl01.cr.usgs.gov//MODIS_Composites/MOLT/MOD11C3.005/2000.03.01/MOD11C3.A2000061.005.2007177231646.hdf
http://e4ftl01.cr.usgs.gov//MODIS_Composites/MOLT/MOD11C3.005/2014.12.01/MOD11C3.A2014335.005.2015005235231.hdf
There are about 178 URLs in total. Each file should be saved in the current working directory.
Below is the initial code that I am working:
import os, fileinput, urllib2 as url, wget
os.chdir("E:/Test/dwnld")
for line in fileinput.FileInput("E:/Test/dwnld/data.txt"):
    print line
    openurl = wget.download(line)
The error message is:
Traceback (most recent call last):
File "E:\Python_scripts\General_purpose\download_url_from_textfile.py", line 5, in <module>
openurl = wget.download(line)
File "C:\Python278\lib\site-packages\wget.py", line 297, in download
(fd, tmpfile) = tempfile.mkstemp(".tmp", prefix=prefix, dir=".")
File "C:\Python278\lib\tempfile.py", line 308, in mkstemp
return _mkstemp_inner(dir, prefix, suffix, flags)
File "C:\Python278\lib\tempfile.py", line 239, in _mkstemp_inner
fd = _os.open(file, flags, 0600)
OSError: [Errno 22] Invalid argument: ".\\MOD11C3.A2000061.005.2007177231646.hdf'\n.frbfrp.tmp"
Try to use urllib.urlretrieve. Check the documentation here: https://docs.python.org/2/library/urllib.html#urllib.urlretrieve
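The file name in the error message ends with a quote and a newline, which suggests each line is being passed on with its trailing characters intact. Below is a minimal sketch of the urlretrieve approach that strips each line first. It assumes Python 3, where the function lives at urllib.request.urlretrieve; `read_urls`, `download_all`, and the `fetch` parameter are hypothetical names introduced for illustration.

```python
import os
from urllib.request import urlretrieve


def read_urls(list_path):
    """Return the non-empty lines of a text file, stripped of whitespace and stray quotes."""
    with open(list_path) as fh:
        cleaned = (line.strip().strip("'\"") for line in fh)
        return [url for url in cleaned if url]


def download_all(list_path, dest_dir=".", fetch=urlretrieve):
    """Download every URL listed in list_path into dest_dir and return the saved paths."""
    saved = []
    for url in read_urls(list_path):
        # Name the local file after the last segment of the URL
        target = os.path.join(dest_dir, url.rsplit("/", 1)[-1])
        fetch(url, target)
        saved.append(target)
    return saved
```

Calling download_all("E:/Test/dwnld/data.txt", "E:/Test/dwnld") would then save each file under its original name. The fetch argument exists only so the download step can be swapped out, for example in a dry run.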

Dropbox Python API: File size detection may have failed

I'm attempting to upload a text file to Dropbox using this code:
def uploadFile(file):
    f = open('logs/%s.txt' % file)
    response = client.put_file('/%s.txt' % file, f)
    print "Uploaded log file %s" % file
Connecting to Dropbox works perfectly fine; it's just when I upload files that I receive this error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python27\lib\site-packages\dropbox_python_sdk-1.5.1-py2.7.egg\dropbox\client.py", line 352, in put_file
return self.rest_client.PUT(url, file_obj, headers)
File "C:\Python27\lib\site-packages\dropbox_python_sdk-1.5.1-py2.7.egg\dropbox\rest.py", line 265, in PUT
return cls.IMPL.PUT(*n, **kw)
File "C:\Python27\lib\site-packages\dropbox_python_sdk-1.5.1-py2.7.egg\dropbox\rest.py", line 211, in PUT
return self.request("PUT", url, body=body, headers=headers, raw_response=raw_response)
File "C:\Python27\lib\site-packages\dropbox_python_sdk-1.5.1-py2.7.egg\dropbox\rest.py", line 174, in request
raise util.AnalyzeFileObjBug(clen, bytes_read)
dropbox.util.AnalyzeFileObjBug:
Expected file object to have 18 bytes, instead we read 17 bytes.
File size detection may have failed (see dropbox.util.AnalyzeFileObj)
Google has given me no help with this one.
Sounds like you are a victim of newline unification. The file object reports a file size of 18 bytes ("abcdefghijklmnop\r\n") but you read only 17 bytes ("abcdefghijklmnop\n").
Open the file in binary mode to avoid this:
f = open('logs/%s.txt' % file, 'rb')
The default is to use text mode, which may convert '\n' characters to a platform-specific representation on writing and back on reading.
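The 18 versus 17 byte mismatch can be reproduced with a small experiment. This sketch assumes Python 3, where text mode translates "\r\n" to "\n" on reading on every platform, while the question was on Python 2 on Windows. The temporary file is a stand-in for the log file.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "log.txt")
with open(path, "wb") as fh:
    fh.write(b"abcdefghijklmnop\r\n")  # 18 bytes on disk

size_on_disk = os.path.getsize(path)
with open(path, "r") as fh:    # text mode: "\r\n" is read back as "\n"
    text_len = len(fh.read())
with open(path, "rb") as fh:   # binary mode: bytes come back unchanged
    binary_len = len(fh.read())

print(size_on_disk, text_len, binary_len)  # 18 17 18
```

Only the binary read matches the size reported by the file system, which is the comparison the upload code trips over.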
