Python pickle.load() not loading correctly

import pickle

from googleapiclient.discovery import build
from httplib2 import Http
from oauth2client import client, file, tools

SCOPES_SHEETS = 'https://www.googleapis.com/auth/spreadsheets'  # gives read/write permissions

def main():
    service_sheets = get_google_service('sheets', 'v4', 'token_sheets.json', SCOPES_SHEETS)
    with open('services.pkl', 'wb') as f:
        pickle.dump(service_sheets, f)
    with open('services.pkl', 'rb') as f:
        service_sheets = pickle.load(f)
    with open('serviceCopy.pkl', 'wb') as f:
        pickle.dump(service_sheets, f)

def get_google_service(type, version, token, scope):
    store = file.Storage(token)
    creds = store.get()
    if not creds or creds.invalid:
        flow = client.flow_from_clientsecrets('credentials.json', scope)
        creds = tools.run_flow(flow, store)
    return build(type, version, http=creds.authorize(Http()))
I have a program that I want to run as a service in the background. It involves reading/writing things in a Google Sheet. For this I have to create a Google service object, but I don't want to create it every time the code runs, so I'm trying to store the service object in a file instead. For some reason the file services.pkl is different from serviceCopy.pkl. I've tried changing the encoding with pickle.load(file, encoding='utf8'), but I keep getting files that don't match.
To my understanding they should be exactly the same.
I think the issue is with loading the saved file, but I'm not sure what's causing it.
I'm using Python 3.6.
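For what it's worth, the discovery service object wraps live HTTP state (sockets, handlers), so a dump→load→dump cycle is not guaranteed to reproduce the file byte for byte, and the unpickled object may not even be usable. A common workaround, sketched here against the question's oauth2client setup, is to persist only the credentials (which file.Storage already does via the token file) and rebuild the service on each run. The filename creds.pkl below is hypothetical:

import pickle

from googleapiclient.discovery import build
from httplib2 import Http
from oauth2client import file

# Rebuild the service from the already-stored token instead of unpickling it.
store = file.Storage('token_sheets.json')  # same token file as in the question
creds = store.get()
service_sheets = build('sheets', 'v4', http=creds.authorize(Http()))

# If you do want a pickle, pickle the credentials object, not the service:
with open('creds.pkl', 'wb') as f:  # 'creds.pkl' is a hypothetical name
    pickle.dump(creds, f)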

Related

How to get Google Drive API response in Python when using Batch requests?

So I am using the Google Drive API to extract metadata from a Drive folder, and then storing that data in a CSV file. I need to use batch processing to get data for multiple files, but I cannot understand how to use the batch method.
This is the code I am using right now.
service = build('drive', 'v3', credentials=creds)

# Call the Drive v3 API
results = service.files().list(q=f"parents = '{drive_folder_id}'", pageSize=20,
                               fields=drive_data_fields).execute()
# gets 4 keys in the response: kind, nextPageToken, files, incompleteSearch
items = results.get('files', [])
if not items:
    print('No files found.')
    return
print('Files:')
df = pd.DataFrame(items)
df.to_csv(drive_data_file)
The call service.files().list(q=f"parents = '{drive_folder_id}'", pageSize=5, fields=drive_data_fields) returns the metadata of the files in a Drive folder. What I need is to somehow send this as a batch request that returns the metadata as well. However, batch.add does not give a response, and result is None.
This is how I want to use the code:
service = build('drive', 'v3', credentials=creds)

# Call the Drive v3 API
batch = service.new_batch_http_request()
batch.add(service.files().list(q=f"parents = '{drive_folder_id}'", pageSize=5,
                               fields=drive_data_fields))
result = batch.execute()
# The above should return a response that I can write to a CSV file.
The problem is that I have to get the data so that I can write it, but here result is None, and I can't understand how batch responses work.
I read the docs but couldn't get an answer; any help will be appreciated.
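For reference, responses from a batch in google-api-python-client are delivered to a per-request callback; batch.execute() itself returns None by design. A minimal sketch, reusing the service, drive_folder_id, and drive_data_fields names from the question (drive_data_fields is assumed to include the files field):

def collect_files(request_id, response, exception):
    # Each batched request lands here with its own response (or exception).
    if exception is not None:
        print(f'Request {request_id} failed: {exception}')
    else:
        all_items.extend(response.get('files', []))

all_items = []
batch = service.new_batch_http_request(callback=collect_files)
batch.add(service.files().list(q=f"parents = '{drive_folder_id}'",
                               pageSize=5, fields=drive_data_fields))
batch.execute()  # returns None; results arrive via collect_files
df = pd.DataFrame(all_items)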

How to check if a Google Sheet exists? Python

"""
BEFORE RUNNING:
---------------
1. If not already done, enable the Google Sheets API
and check the quota for your project at
https://console.developers.google.com/apis/api/sheets
2. Install the Python client library for Google APIs by running
`pip install --upgrade google-api-python-client`
"""
# TODO: Change placeholder below to generate authentication credentials. See
# https://developers.google.com/sheets/quickstart/python#step_3_set_up_the_sample
#
# Authorize using one of the following scopes:
# 'https://www.googleapis.com/auth/drive'
# 'https://www.googleapis.com/auth/drive.file'
# 'https://www.googleapis.com/auth/spreadsheets'
SCOPES = ['https://www.googleapis.com/auth/spreadsheets']
creds = None
if os.path.exists('google.json'):
creds = Credentials.from_authorized_user_file('google.json', SCOPES)
if not creds or not creds.valid:
if creds and creds.expired and creds.refresh_token:
creds.refresh(Request())
else:
flow = InstalledAppFlow.from_client_secrets_file(
'CLIENT.json',SCOPES)
creds = flow.run_local_server(port=0)
with open('google.json', 'w') as token:
token.write(creds.to_json())
service = discovery.build('sheets', 'v4', credentials=creds)
spreadsheet_body = {
'sheets': [{
'properties': {
'title': str(files[0])
}
}]
}
request = service.spreadsheets().create(body=spreadsheet_body)
if request == str(files[0]):
pass
else:
response = request.execute()
pprint(response)
How can I create the condition? If a Google Sheet with that name already exists, don't proceed to create it. I read the documentation and didn't see a possible answer, or I am just misunderstanding the documentation. Please help, thank you.
I believe your goal is as follows.
You want to check whether a file (Google Spreadsheet) exists in Google Drive using a filename.
You want to achieve this using googleapis for Python.
In this case, how about the following sample script? In order to search for the file by its filename, the Drive API is used.
Sample script:
filename = str(files[0])
service = build("drive", "v3", credentials=creds)
results = service.files().list(
    pageSize=1,
    fields="files(id, name)",
    q="name='" + filename + "' and mimeType='application/vnd.google-apps.spreadsheet' and trashed=false",
).execute()
files = results.get("files", [])
if not files:
    # Runs when no file with that filename is found.
    print("No files were found.")
else:
    # Runs when a file with that filename is found.
    print("Files were found.")
When this script is run, you can check whether a file with that filename exists in Google Drive.
In this case, please add the scope "https://www.googleapis.com/auth/drive.metadata.readonly" as follows. And please reauthorize the scopes: remove the file google.json and run the script again.
SCOPES = [
    "https://www.googleapis.com/auth/spreadsheets",
    "https://www.googleapis.com/auth/drive.metadata.readonly",
]
From your question, I couldn't tell whether you are trying to use the script in a shared drive. As written, this modification cannot be used with a shared drive. If you want to use it in a shared drive, please include corpora="allDrives", includeItemsFromAllDrives=True, supportsAllDrives=True in the request, as sketched below.
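For completeness, a sketch of that shared-drive variant of the same filename search (the three added parameters are standard files.list options):

results = service.files().list(
    pageSize=1,
    fields="files(id, name)",
    q="name='" + filename + "' and mimeType='application/vnd.google-apps.spreadsheet' and trashed=false",
    corpora="allDrives",
    includeItemsFromAllDrives=True,
    supportsAllDrives=True,
).execute()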
Reference:
Files: list

Argparse - AttributeError: 'Namespace' object has no attribute

I know there are dozens of posts about these errors already, but I still can't seem to figure mine out. I'm using the Google Analytics Reporting API (v4) and combining scripts I've found from Google and this site.
The code in question just tries to initialize Google Analytics and creates a CSV file with the script's arguments in the name, to which data is saved further along in the code. Everything worked fine until I added the lines that create the file (starting at basic_url).
The code runs as: filename.py 'website_url.com' 'start_date' 'end_date'
import argparse
import csv
import sys

from apiclient.discovery import build
import httplib2
from oauth2client import client
from oauth2client import file
from oauth2client import tools
from googleapiclient import sample_tools
from googleapiclient import errors

# Declare command-line flags.
argparser = argparse.ArgumentParser(add_help=False)
argparser.add_argument('property_uri', type=str,
                       help=('Site or app URI to query data for (including '
                             'trailing slash).'))
argparser.add_argument('start_date', type=str,
                       help=('Start date of the requested date range in '
                             'YYYY-MM-DD format.'))
argparser.add_argument('end_date', type=str,
                       help=('End date of the requested date range in '
                             'YYYY-MM-DD format.'))

SCOPES = ['https://www.googleapis.com/auth/analytics.readonly']
# Path to client_secrets.json file for initializing google analytics object
CLIENT_SECRETS_PATH = 'analytics_client_secrets.json'
VIEW_ID = 'XXXXXXX'

def initialize_analyticsreporting(argv):
    """Initializes the analyticsreporting service object.

    Returns:
        analytics: an authorized analyticsreporting service object.
    """
    # Parse command-line arguments.
    parser = argparse.ArgumentParser(
        formatter_class=argparse.RawDescriptionHelpFormatter,
        parents=[tools.argparser])
    flags = parser.parse_args([])
    # Set up a Flow object to be used if we need to authenticate.
    flow = client.flow_from_clientsecrets(
        CLIENT_SECRETS_PATH, scope=SCOPES,
        message=tools.message_if_missing(CLIENT_SECRETS_PATH))
    # Prepare credentials, and authorize HTTP object with them.
    # If the credentials don't exist or are invalid run through the native client
    # flow. The Storage object will ensure that if successful the good
    # credentials will get written back to a file.
    storage = file.Storage('analyticsreporting.dat')
    credentials = storage.get()
    if credentials is None or credentials.invalid:
        credentials = tools.run_flow(flow, storage, flags)
    http = credentials.authorize(http=httplib2.Http())
    # Build the service object.
    analytics = build('analyticsreporting', 'v4', http=http)

    basic_url = flags.property_uri
    basic_url = basic_url.replace("://", "_")
    basic_url = basic_url.replace(".", "-")
    # Create blank csv file
    f = open("./Landingpage_analytics_" + flags.start_date + "_" + flags.end_date
             + "_" + basic_url + ".csv", 'wt')
    writer = csv.writer(f)
    writer.writerow(('Landingpage', 'Sessions', 'Bounces',
                     'avg. Session Duration', 'avg. Pages/Session'))
    f.close()
    return analytics, flags
I'm especially confused because I use the Google Search Console API in the same script, initialize a flags variable with the following lines, and have no trouble accessing the arguments there, e.g., flags.start_date and flags.property_uri:
service, flags = sample_tools.init(
    argv, 'searchconsole', 'v1', __doc__, __file__, parents=[argparser],
    scope='https://www.googleapis.com/auth/webmasters.readonly')
Here is the traceback:
Traceback (most recent call last):
  File "/Users/gabrielh/Documents/Python/EVE Python/Search/attempt3.py", line 483, in <module>
    main()
  File "/Users/gabrielh/Documents/Python/EVE Python/Search/attempt3.py", line 478, in main
    analytics, flags = initialize_analyticsreporting(sys.argv)
  File "/Users/gabrielh/Documents/Python/EVE Python/Search/attempt3.py", line 108, in initialize_analyticsreporting
    basic_url = flags.property_uri
AttributeError: 'Namespace' object has no attribute 'property_uri'
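The traceback is consistent with the parser setup inside initialize_analyticsreporting: it lists only tools.argparser as a parent (not the module-level argparser that defines property_uri) and parses an empty list instead of the real arguments, so the resulting Namespace has no property_uri. A sketch of a possible fix, reusing the names from the question:

# Include the module-level argparser (which defines property_uri, start_date,
# end_date) as a parent, and parse the actual command-line arguments.
parser = argparse.ArgumentParser(
    formatter_class=argparse.RawDescriptionHelpFormatter,
    parents=[tools.argparser, argparser])
flags = parser.parse_args(argv[1:])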

Cannot retrieve thumbnailLink from Google Drive API

I am trying to get the thumbnailLink for files from a shared drive using Python and the Google Drive API; however, the file information does not include the thumbnailLink (although for most files the hasThumbnail field, which I do get for each file, has a value of true).
I have looked around a lot and none of the solutions I have found seem to work (although this is my first Python project as well as my first Google Drive API project, so I might just be ignorant of what I am doing).
What I have tried:
- setting the scope to 'https://www.googleapis.com/auth/drive' (it was ..drive.metadata.readonly before)
- using a wildcard, as in results = drive.files().list(pageSize=10, fields="*", blablabla...). If I instead try fields="thumbnailLink", it doesn't find any files.
- after getting the list, using the id of each file from that list to do file = service.files().get(fileId=item_id, supportsAllDrives=True, fields="*").execute(), but the same happens: I get many fields, including hasThumbnail set to true, yet no thumbnail link.
- using the "Try this API" console on the official website, where I did in fact get the thumbnailLink (with the same parameters as above). So I do not understand why this is missing when requested from my application.
Edit (code):
I have one method like so:
import os
import pickle

from google.auth.transport.requests import Request
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build

SCOPES = ['https://www.googleapis.com/auth/drive']

def getDrive():
    """Shows basic usage of the Drive v3 API.
    Prints the names and ids of the first 10 files the user has access to.
    """
    creds = None
    # The file token.pickle stores the user's access and refresh tokens, and is
    # created automatically when the authorization flow completes for the first
    # time.
    if os.path.exists('token.pickle'):
        with open('token.pickle', 'rb') as token:
            creds = pickle.load(token)
    # If there are no (valid) credentials available, let the user log in.
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                'credentials.json', SCOPES)
            creds = flow.run_local_server(port=53209)
        # Save the credentials for the next run
        with open('token.pickle', 'wb') as token:
            pickle.dump(creds, token)
    service = build('drive', 'v3', credentials=creds)
    return service
then I call it from here and also get the files:
def getFiles(request):
    drive = getDrive()
    # Call the Drive v3 API
    results = drive.files().list(
        pageSize=10, fields="*", driveId="blabla", includeItemsFromAllDrives=True,
        corpora="drive", supportsAllDrives=True).execute()
    items = results.get('files', [])
    getItems = []
    for item in items:
        item_id = item['id']
        getItems.append(drive.files().get(fileId=item_id, supportsAllDrives=True,
                                          fields="*").execute())
    if not items:
        print('No files found.')
    else:
        print('Files:')
        print(getItems)
        for item in items:
            # print(u'{0} ({1})'.format(item['name'], item['id']))
            print(item)
    return render(request, "index.html", {'files': getItems})
Also, yes, I do use a service account; I can retrieve all the files I need, just not the thumbnailLink.
I don't think it makes sense to call list() and then also get(), but I had read that the problem could be solved through the get() method, which in my case did not work.
The issue is in the structure of the response.
If you specify fields="*", the response will be something like:
{
  "kind": "drive#fileList",
  ...
  "files": [
    {
      "kind": "drive#file",
      ...
      "hasThumbnail": true,
      "thumbnailLink": "XXX",
      "thumbnailVersion": "XXX",
      ...
    },
    ...
  ]
}
So, thumbnailLink is nested inside of files. In order to retrieve it, specify:
fields='files(id, thumbnailLink)'
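Applied to the list() call from the question, a minimal sketch (keeping the question's placeholder driveId) would be:

results = drive.files().list(
    pageSize=10,
    fields="files(id, name, hasThumbnail, thumbnailLink)",
    driveId="blabla",
    includeItemsFromAllDrives=True,
    corpora="drive",
    supportsAllDrives=True).execute()
for f in results.get('files', []):
    print(f.get('name'), f.get('thumbnailLink'))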

Google Client API v3 - update a file on drive using Python

I'm trying to update the content of a file from a Python script using the Google client API. The problem is that I keep receiving error 403:
An error occurred: <HttpError 403 when requesting https://www.googleapis.com/upload/drive/v3/files/...?alt=json&uploadType=resumable returned "The resource body includes fields which are not directly writable.">
I have tried to remove metadata fields, but it didn't help.
The function to update the file is the following:
# File: utilities.py
from googleapiclient import errors
from googleapiclient.http import MediaFileUpload
from googleapiclient.discovery import build
from httplib2 import Http
from oauth2client import file, client, tools

def update_file(service, file_id, new_name, new_description, new_mime_type,
                new_filename):
    """Update an existing file's metadata and content.

    Args:
        service: Drive API service instance.
        file_id: ID of the file to update.
        new_name: New name for the file.
        new_description: New description for the file.
        new_mime_type: New MIME type for the file.
        new_filename: Filename of the new content to upload.
    Returns:
        Updated file metadata if successful, None otherwise.
    """
    try:
        # First retrieve the file from the API.
        file = service.files().get(fileId=file_id).execute()
        # File's new metadata.
        file['name'] = new_name
        file['description'] = new_description
        file['mimeType'] = new_mime_type
        file['trashed'] = True
        # File's new content.
        media_body = MediaFileUpload(
            new_filename, mimetype=new_mime_type, resumable=True)
        # Send the request to the API.
        updated_file = service.files().update(
            fileId=file_id,
            body=file,
            media_body=media_body).execute()
        return updated_file
    except errors.HttpError as error:
        print('An error occurred: %s' % error)
        return None
And here is the whole script to reproduce the problem.
The goal is to replace a file, retrieving its id by name.
If the file does not exist yet, the script will create it by calling insert_file (this function works as expected).
The problem is update_file, posted above.
from __future__ import print_function

from utilities import *
from googleapiclient import errors
from googleapiclient.http import MediaFileUpload
from googleapiclient.discovery import build
from httplib2 import Http
from oauth2client import file, client, tools

def get_authenticated(SCOPES, credential_file='credentials.json',
                      token_file='token.json', service_name='drive',
                      api_version='v3'):
    # The file token.json stores the user's access and refresh tokens, and is
    # created automatically when the authorization flow completes for the first
    # time.
    store = file.Storage(token_file)
    creds = store.get()
    if not creds or creds.invalid:
        flow = client.flow_from_clientsecrets(credential_file, SCOPES)
        creds = tools.run_flow(flow, store)
    service = build(service_name, api_version, http=creds.authorize(Http()))
    return service

def retrieve_all_files(service):
    """Retrieve a list of File resources.

    Args:
        service: Drive API service instance.
    Returns:
        List of File resources.
    """
    result = []
    page_token = None
    while True:
        try:
            param = {}
            if page_token:
                param['pageToken'] = page_token
            files = service.files().list(**param).execute()
            result.extend(files['files'])
            page_token = files.get('nextPageToken')
            if not page_token:
                break
        except errors.HttpError as error:
            print('An error occurred: %s' % error)
            break
    return result

def insert_file(service, name, description, parent_id, mime_type, filename):
    """Insert new file.

    Args:
        service: Drive API service instance.
        name: Name of the file to insert, including the extension.
        description: Description of the file to insert.
        parent_id: Parent folder's ID.
        mime_type: MIME type of the file to insert.
        filename: Filename of the file to insert.
    Returns:
        Inserted file metadata if successful, None otherwise.
    """
    media_body = MediaFileUpload(filename, mimetype=mime_type, resumable=True)
    body = {
        'name': name,
        'description': description,
        'mimeType': mime_type
    }
    # Set the parent folder.
    if parent_id:
        body['parents'] = [{'id': parent_id}]
    try:
        file = service.files().create(
            body=body,
            media_body=media_body).execute()
        # Uncomment the following line to print the File ID
        # print('File ID: %s' % file['id'])
        return file
    except errors.HttpError as error:
        print('An error occurred: %s' % error)
        return None

# If modifying these scopes, delete the file token.json.
SCOPES = 'https://www.googleapis.com/auth/drive'

def main():
    service = get_authenticated(SCOPES)
    # Call the Drive v3 API
    results = retrieve_all_files(service)
    target_file_descr = 'Description of deploy.py'
    target_file = 'deploy.py'
    target_file_name = target_file
    target_file_id = [file['id'] for file in results if file['name'] == target_file_name]
    if len(target_file_id) == 0:
        print('No file called %s found in root. Create it:' % target_file_name)
        file_uploaded = insert_file(service, target_file_name, target_file_descr, None,
                                    'text/x-script.phyton', target_file_name)
    else:
        print('File called %s found. Update it:' % target_file_name)
        file_uploaded = update_file(service, target_file_id[0], target_file_name,
                                    target_file_descr, 'text/x-script.phyton',
                                    target_file_name)
    print(str(file_uploaded))

if __name__ == '__main__':
    main()
In order to try the example, it is necessary to enable the Google Drive API from https://console.developers.google.com/apis/dashboard, then save the file credentials.json and pass its path to get_authenticated(). The file token.json will be created after the first authentication and API authorization.
The problem is that the metadata field 'id' cannot be changed when updating a file, so it should not be in the body. Just delete it from the dict:
# File's new metadata.
del file['id']  # 'id' has to be deleted
file['name'] = new_name
file['description'] = new_description
file['mimeType'] = new_mime_type
file['trashed'] = True
I tried your code with this modification and it works.
I also struggled a little bit with this function and found that if you don't have to update the metadata, you can just remove it in the update call, like: updated_file = service.files().update(fileId=file_id, media_body=media_body).execute()
At least that worked for me.
The problem is "The resource body includes fields which are not directly writable." So try removing all of the metadata properties and then adding them back one by one. The one I would be suspicious about is trashed. Even though the API docs say this is writable, it shouldn't be: trashing a file has side effects beyond setting a boolean, and updating a file while setting it to trashed at the same time is somewhat unusual. Are you sure that's what you intend?
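Combining the answers above, a minimal sketch that sidesteps the error is to send a fresh body containing only writable fields, instead of echoing back the resource returned by get():

# Build the metadata from scratch with writable fields only; never include
# read-only fields such as 'id' copied from a previous get() response.
file_metadata = {
    'name': new_name,
    'description': new_description,
    'mimeType': new_mime_type,
}
media_body = MediaFileUpload(new_filename, mimetype=new_mime_type, resumable=True)
updated_file = service.files().update(
    fileId=file_id,
    body=file_metadata,
    media_body=media_body).execute()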
