Set-up
I have a local folder – access_google_drive – which contains the .py script used to access my Google Drive account via the Google Drive API.
The script looks like this,
def connect_google_drive_api():
    import os
    # use the Google Drive API to access Google Drive
    os.chdir('/Users/my/fol/ders/access_google_drive')

    from pydrive.auth import GoogleAuth
    from pydrive.drive import GoogleDrive

    gauth = GoogleAuth()
    gauth.LocalWebserverAuth()  # client_secrets.json needs to be in the same directory as the script
    drive = GoogleDrive(gauth)
    return drive
In the access_google_drive folder I also have the client_secrets.json file.
Issue
This set-up worked fine until yesterday. Since yesterday, I have been seeing the following error:
Failed to find "code" in the query parameters of the redirect.
Try command-line authentication
Traceback (most recent call last):
  File "<ipython-input-8-48c5ad9148cf>", line 2, in <module>
    gauth.LocalWebserverAuth() # client_secrets.json need to be in the same directory as the script
  File "/opt/anaconda3/lib/python3.7/site-packages/pydrive/auth.py", line 115, in _decorated
    code = decoratee(self, *args, **kwargs)
  File "/opt/anaconda3/lib/python3.7/site-packages/pydrive/auth.py", line 241, in LocalWebserverAuth
    raise AuthenticationError('No code found in redirect')
AuthenticationError: No code found in redirect
Question
I have no idea why I'm seeing this error. Both the script and the JSON file are in the same folder, and I haven't edited either of them.
Did I miss an update? Are the stars not aligned?
Who can help me out?
Related
I am trying to extract text data from images using the Google Cloud Vision API. My initial starting point was here. After enabling the Vision API, creating a service account, and generating the JSON key file, I created a script by referring to this example.
Here's my code
from google.cloud import vision
from google.cloud.vision_v1 import types
import os

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'generated_after_creating_sec_key.json'

image_path = 'images\\image_1.png'
vision_client = vision.ImageAnnotatorClient()
print(vision_client)

image = types.Image()
image.source.image_path = image_path

response = vision_client.text_detection(image=image)
for text in response.text_annotations:
    print(text.description)
The only difference between the example shown on the Google page and my code is that the image in the example is on Google Cloud, while mine happens to be in local storage.
Here's the complete stacktrace.
<google.cloud.vision_v1.ImageAnnotatorClient object at 0x000001DF861D7970>
Traceback (most recent call last):
  File "text_detection.py", line 10, in <module>
    image.source.image_path = image_path
  File "C:\Users\user\ProjectFolder\ProjName\venv\lib\site-packages\proto\message.py", line 677, in __setattr__
    pb_type = self._meta.fields[key].pb_type
KeyError: 'image_path'
What is the root cause of this error? Please help!
Per the Google Cloud Vision docs, you want image.source.image_uri instead; the Image message has no image_path field, which is why the assignment raises KeyError: 'image_path'. Note that image_uri expects an image reachable by URL (e.g. a gs:// path); since your image is in local storage, you would instead read the file's bytes and set image.content.
I'm trying to move a Python Jupyter scraper script (and json cred file) from my laptop to Google Colab.
I've made a connection between Google Colab and Google Drive.
I've stored the (.ipynb) script and credential JSON file on Google Drive.
However, I can't make the connection between the two (the JSON credential file on Google Drive and Colab) work.
Here below the part of the script concerning the credentials handling:
# Sheet key
# 1i1bmMt-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx_d7Eo
import gspread
import pandas as pd
import requests
from bs4 import BeautifulSoup
from oauth2client.service_account import ServiceAccountCredentials
# Access credentials for google sheet and access the google sheet
scope = ["https://spreadsheets.google.com/feeds",
         "https://www.googleapis.com/auth/spreadsheets",
         "https://www.googleapis.com/auth/drive.file",
         "https://www.googleapis.com/auth/drive"]
# Copy your path to your credential JSON file.
PATH_TO_CREDENTIAL = '/Users/user/json-keys/client_secret.json'
# Initiate your credential
credentials = ServiceAccountCredentials.from_json_keyfile_name(PATH_TO_CREDENTIAL, scope)
# Authorize your connection to your google sheet
gc = gspread.authorize(credentials)
I receive a FileNotFoundError and credential errors.
Hope someone can help me with this, thanks!
Try putting the file in the same directory as the script first, to test it. Make sure the file is okay and the script runs successfully.
Here's the source code for reference:
If client_secret.json is in the same directory as the file you're running, then the correct syntax is:
import os
DIRNAME = os.path.dirname(__file__)
credentials = ServiceAccountCredentials.from_json_keyfile_name(os.path.join(DIRNAME, 'client_secret.json'), scope)
If the above test is okay, move the file to your target directory ('/Users/user/json-keys/client_secret.json') and create a symbolic link to client_secret.json in the current directory. Then run the program with the above code again, and make sure there is no problem with the file in that directory. It's a workaround.
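The symlink step above can be sketched in plain Python; all paths here are stand-ins for your own layout, and the credential file is a dummy:

```python
import json
import os
import tempfile

# Hypothetical layout: a keys directory and a separate working directory.
base = tempfile.mkdtemp()
keys_dir = os.path.join(base, 'json-keys')
work_dir = os.path.join(base, 'project')
os.makedirs(keys_dir)
os.makedirs(work_dir)

# A stand-in for the real client_secret.json.
secret = os.path.join(keys_dir, 'client_secret.json')
with open(secret, 'w') as f:
    json.dump({'type': 'service_account'}, f)

# Link it into the working directory, so code that expects a local
# client_secret.json keeps working unchanged.
link = os.path.join(work_dir, 'client_secret.json')
os.symlink(secret, link)
print(os.path.islink(link))  # True
```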
I used this case as a reference:
Django not recognizing or seeing JSON file
Consider the following code that uses the PyDrive module:
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
gauth = GoogleAuth()
gauth.LocalWebserverAuth()
drive = GoogleDrive(gauth)
file = drive.CreateFile({'title': 'test.txt'})
file.Upload()
file.SetContentString('hello')
file.Upload()
file.SetContentString('')
file.Upload() # This throws an exception.
Creating the file and changing its contents works fine until I try to erase the contents by setting the content string to an empty one. Doing so throws this exception:
pydrive.files.ApiRequestError
<HttpError 400 when requesting
https://www.googleapis.com/upload/drive/v2/files/{LONG_ID}?alt=json&uploadType=resumable
returned "Bad Request">
When I look at my Drive, I see the test.txt file successfully created with the text hello in it. However, I expected it to be empty.
If I change the empty string to any other text, the file is changed twice without errors; but that doesn't clear the contents, so it's not what I want.
When I looked up the error on the Internet, I found this issue on the PyDrive GitHub that may be related, though it has remained unsolved for almost a year.
If you want to reproduce the error, you have to create your own project that uses Google Drive API following this tutorial from the PyDrive docs.
How can one erase the contents of a file through PyDrive?
Issue and workaround:
When resumable=True is used, it seems that 0-byte data cannot be uploaded. So in this case, the empty data has to be uploaded without resumable=True. But when I looked at the PyDrive source, it seems that resumable=True is used as the default. Ref. So as a workaround, I propose using the requests module; the access token is retrieved from PyDrive's gauth.
When your script is modified, it becomes as follows.
Modified script:
import io
import requests
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
gauth = GoogleAuth()
gauth.LocalWebserverAuth()
drive = GoogleDrive(gauth)
file = drive.CreateFile({'title': 'test.txt'})
file.Upload()
file.SetContentString('hello')
file.Upload()
# file.SetContentString()
# file.Upload() # This throws an exception.
# I added below script.
res = requests.patch(
    "https://www.googleapis.com/upload/drive/v3/files/" + file['id'] + "?uploadType=multipart",
    headers={"Authorization": "Bearer " + gauth.credentials.token_response['access_token']},
    files={
        'data': ('metadata', '{}', 'application/json'),
        'file': io.BytesIO()
    }
)
print(res.text)
References:
PyDrive
Files: update
I'm trying to access and modify data on Google Spreadsheet using Python. I'm having trouble to open the Google Spreadsheet from Python. I closely followed various tutorials and prepared the following before writing any code.
Enabled Google Sheets API and Google Drive API on GCP Console
Generated and downloaded credentials (JSON file) from GCP Console
Spreadsheet: Shared (edit-access) with the client email found in the JSON file
Installed gspread and oauth2client --> pip install gspread oauth2client
The following is the Python code to interface with Google Sheets. The goal in lines 12 and 13 is to output to the console all of the data found in the linked Google Spreadsheet.
1 import gspread
2 from oauth2client.service_account import ServiceAccountCredentials
3
4 scope = ["https://spreadsheets.google.com/feeds","https://www.googleapis.com/auth/spreadsheets","https://www.googleapis.com/auth/drive.file","https://www.googleapis.com/auth/drive"]
5 creds = ServiceAccountCredentials.from_json_keyfile_name('Py-Sheets.json', scope)
6 client = gspread.authorize(creds)
7
8 print("Hello World")
9
10 sheet = client.open("Test-Sheets").sheet1
11
12 sample = sheet.get_all_records()
13 print(sample)
Everything seems to run fine up to line 10 (above), where I get an error saying SpreadsheetNotFound. Here's the error in full (below).
Traceback (most recent call last):
  File "/home/username/anaconda3/lib/python3.7/site-packages/gspread/client.py", line 119, in open
    self.list_spreadsheet_files(title),
  File "/home/username/anaconda3/lib/python3.7/site-packages/gspread/utils.py", line 97, in finditem
    return next((item for item in seq if func(item)))
StopIteration

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "pysheets.py", line 10, in <module>
    sheet = client.open("Test-Sheets").sheet1
  File "/home/username/anaconda3/lib/python3.7/site-packages/gspread/client.py", line 127, in open
    raise SpreadsheetNotFound
gspread.exceptions.SpreadsheetNotFound
I also received the following error via email.
DNS Error: 15698833 DNS type 'mx' lookup of python-spreadsheets-123456.iam.gserviceaccount.com responded with code NXDOMAIN Domain name not found: python-spreadsheets-123456.iam.gserviceaccount.com
How do I fix the error created after executing Line 10? The code is almost the exact same as what I found in the tutorials. The spreadsheet is named exactly what I put in client.open(). Does the spreadsheet have to be in a specific GDrive directory for it to be located?
An alternative would be opening the spreadsheet by URL on Google Colab:
# Access Google Sheets as a data source.
from google.colab import auth
auth.authenticate_user()
import gspread
from oauth2client.client import GoogleCredentials
gc = gspread.authorize(GoogleCredentials.get_application_default())
# At this point, you will have a link printed in your output which will
# direct you to a sign-in page. Pick the relevant Google account and copy
# the provided link. Paste it in the provided line at the output section.
# Load your dataframe
import pandas as pd
wb = gc.open_by_url('https://docs.google.com/spreadsheets/.....') # A URL of your workbook.
sheet1 = wb.worksheet('Sheet1') # Enter your sheet name.
original_df = sheet1.get_all_values()
df = pd.DataFrame(original_df)
df.head()
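One gotcha with this recipe: get_all_values() returns plain lists including the header row, so pd.DataFrame(original_df) gives you numbered columns with the headers sitting in row 0. A small sketch of promoting the first row to column names (sample data stands in for the sheet's output):

```python
import pandas as pd

# get_all_values() returns a list of rows, header row first;
# this sample list stands in for a real worksheet's output.
original_df = [['name', 'score'], ['alice', '10'], ['bob', '7']]

# Use the first row as column names and the remaining rows as data.
df = pd.DataFrame(original_df[1:], columns=original_df[0])
print(df.columns.tolist())  # ['name', 'score']
print(len(df))              # 2
```

Note that all values arrive as strings, so numeric columns may still need a pd.to_numeric conversion.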
When I upload data using the following code, the data vanishes once I get disconnected.
from google.colab import files
uploaded = files.upload()
for fn in uploaded.keys():
    print('User uploaded file "{name}" with length {length} bytes'.format(
        name=fn, length=len(uploaded[fn])))
Please suggest ways to upload my data so that it remains intact even after days of disconnection.
I keep my data stored permanently in a .zip file in Google Drive, and upload it to the Google Colab VM using the following code.
Paste it into a cell and change the file_id. You can find the file_id in the URL of the file in Google Drive (right-click on the file -> Get shareable link -> take the part of the URL after open?id=).
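Extracting the id from such a shareable link can also be done programmatically; the link below is a made-up example of the open?id= form:

```python
from urllib.parse import parse_qs, urlparse

# A made-up shareable link of the "open?id=..." form.
link = "https://drive.google.com/open?id=1BuM11fJJ1qdZH3VbQ-GwPlK5lAvXiNDv"

# The file id is simply the "id" query parameter.
file_id = parse_qs(urlparse(link).query)["id"][0]
print(file_id)  # 1BuM11fJJ1qdZH3VbQ-GwPlK5lAvXiNDv
```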
#@title uploader
file_id = "1BuM11fJJ1qdZH3VbQ-GwPlK5lAvXiNDv" #@param {type:"string"}
!pip install -U -q PyDrive
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials
# 1. Authenticate and create the PyDrive client.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)
# PyDrive reference:
# https://googledrive.github.io/PyDrive/docs/build/html/index.html
from google.colab import auth
auth.authenticate_user()
from googleapiclient.discovery import build
drive_service = build('drive', 'v3')
# Replace the assignment below with your file ID
# to download a different file.
#
# A file ID looks like: 1gLBqEWEBQDYbKCDigHnUXNTkzl-OslSO
import io
from googleapiclient.http import MediaIoBaseDownload
request = drive_service.files().get_media(fileId=file_id)
downloaded = io.BytesIO()
downloader = MediaIoBaseDownload(downloaded, request)
done = False
while done is False:
    # _ is a placeholder for a progress object that we ignore.
    # (Our file is small, so we skip reporting progress.)
    _, done = downloader.next_chunk()
fileId = drive.CreateFile({'id': file_id})  # file_id is the Drive file id, e.g. 1iytA1n2z4go3uVCwE_vIKouTKyIDjEq
print(fileId['title'])
fileId.GetContentFile(fileId['title'])  # Save the Drive file as a local file
!unzip {fileId['title']}
Keeping data in GDrive is good (#skaem).
If your data contains code, I can suggest simply git cloning your source repository from GitHub (or any other code-versioning service) at the beginning of your Colab notebook.
This way, you can develop offline and perform your experiments in the cloud whenever you need, with up-to-date code.
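The clone step can be sketched as below; the repo here is a throwaway local one so the commands run anywhere, whereas in a Colab cell you would run `!git clone` against your real GitHub URL (any repo name is hypothetical):

```python
import os
import subprocess
import tempfile

# Build a throwaway local repo standing in for your GitHub repository.
src = tempfile.mkdtemp()
subprocess.run(['git', 'init', '-q', src], check=True)
with open(os.path.join(src, 'scraper.py'), 'w') as f:
    f.write("print('hi')\n")
subprocess.run(['git', '-C', src, 'add', '.'], check=True)
subprocess.run(['git', '-C', src, '-c', 'user.email=demo@example.com',
                '-c', 'user.name=demo', 'commit', '-qm', 'init'], check=True)

# The notebook-startup step: clone the repo to get up-to-date code.
# In Colab: !git clone https://github.com/<user>/<repo>.git
dst = os.path.join(tempfile.mkdtemp(), 'clone')
subprocess.run(['git', 'clone', '-q', src, dst], check=True)
print(os.path.exists(os.path.join(dst, 'scraper.py')))  # True
```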