New to the Google API: I have the following function, where I am trying to create a new spreadsheet in Google Sheets from a dataframe. When I go to share it with myself, it does not error out, but the new spreadsheet does not appear to be shared with me (it's not shown in docs.google.com/spreadsheets). Any idea why?
def create_insert_sheets(nm_sheet, email_share, type_role, df):
    gc = create_credentials()
    sh = gc.create(nm_sheet)
    sh.share(email_share, perm_type='user', role=type_role)
    ##
    update_spreadsheet(0, sh.id, df)
    return sh

create_insert_sheets("api_test_create", ["my-service-account@kebasic=basu-3321113.iam.gserviceaccount.com", "myemail@domain.com"], "owner", df_grade)
OS: Windows 11
Python 3.11, using VS Code
So, I want to use a Python script to autofill a bunch of cells in a Google spreadsheet, and I was using the guide at https://www.makeuseof.com/tag/read-write-google-sheets-python/. My code to access the spreadsheet and make sure I can write is as follows:
# for writing to google sheets, from https://www.makeuseof.com/tag/read-write-google-sheets-python/
import gspread
from oauth2client.service_account import ServiceAccountCredentials
import json
scopes = [
'https://www.googleapis.com/auth/spreadsheets',
'https://www.googleapis.com/auth/drive'
]
credentials = ServiceAccountCredentials.from_json_keyfile_name("[filename].json", scopes)
file = gspread.authorize(credentials)
sheet = file.open("[spreadsheetName]")
sheet = sheet.testSheet
So, I run the code, and get the following error:
Traceback (most recent call last):
  File "[pythonScriptPath]", line 15, in <module>
    sheet = file.open("[spreadsheetName]")
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "[anotherPath]", line 160, in open
    raise SpreadsheetNotFound
gspread.exceptions.SpreadsheetNotFound
Now, I've googled this error and one possible cause was that I didn't share the spreadsheet with the email in the json file. However, I shared it with that email beforehand (it ends in iam.gserviceaccount.com) and set it as editor, and even after double-checking and running the script again I'm still getting the error. Could anyone tell me what I'm missing here, please? The json file and the script are in the same folder.
EDIT: Solved, by changing how I did things and instead using something I found in the gspread documentation (WORKSHEET_NAME is a constant I defined elsewhere in the code; the caps-in-brackets are just the relevant strings that I've removed for anonymization):
credentials = gspread.service_account(filename=r'[JSON_FILE_PATH]')
spreadsheet = credentials.open_by_url('[GOOGLE_SHEET_URL]')
worksheet = spreadsheet.worksheet(WORKSHEET_NAME)
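In case it helps anyone reproducing this: a quick way to confirm the access actually works is a tiny read/write round trip on the worksheet from the snippet above (just a sketch using standard gspread calls):

worksheet.update_acell('A1', 'hello')   # simple write to confirm access
print(worksheet.acell('A1').value)      # should print 'hello'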
The issue is that your service account doesn't have access to the file.
The fastest way to fix it is to open the service account json file and find the service account email address; it's the only value with an @ in it.
Now go to Google Drive and share the file with the service account like you would with any other user.
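A minimal sketch of finding that address programmatically (the file name service_account.json is an assumption; the client_email field is standard in service account key files):

import json

# The key file stores the service account address under "client_email".
with open("service_account.json") as f:
    print(json.load(f)["client_email"])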
Option number two requires that you have a Google Workspace domain and can configure domain-wide delegation.
Similar to this: How to export GCP's Security Center Assets to a Cloud Storage via cloud Function?
I need to export the Findings as seen in the Security Command Center to BigQuery so we can easily filter the data we need and generate custom reports.
Using this documentation as an example (https://cloud.google.com/security-command-center/docs/how-to-api-list-findings#python), I wrote the following:
from google.cloud import securitycenter
from google.cloud import bigquery

JSONPath = "Path to JSON File For Service Account"
client = securitycenter.SecurityCenterClient().from_service_account_json(JSONPath)
BQclient = bigquery.Client().from_service_account_json(JSONPath)
table_id = "project.security_center.assets"
org_name = "organizations/1234567891011"
all_sources = "{org_name}/sources/-".format(org_name=org_name)
finding_result_iterator = client.list_findings(request={"parent": all_sources})
for i, finding_result in enumerate(finding_result_iterator):
    errors = BQclient.insert_rows_json(table_id, finding_result)
    if errors == []:
        print("New rows have been added.")
    else:
        print("Encountered errors while inserting rows: {}".format(errors))
However, that then gave me the error:
"json_rows argument should be a sequence of dicts".
Any help with this would be greatly appreciated :)
Not sure if this existed back then in Q2 of 2021, but now there is documentation explaining how to do this:
https://cloud.google.com/security-command-center/docs/how-to-analyze-findings-in-big-query
You can create exports of SCC findings to BigQuery using this command:
gcloud scc bqexports create BIG_QUERY_EXPORT \
    --dataset=DATASET_NAME \
    --folder=FOLDER_ID | --organization=ORGANIZATION_ID | --project=PROJECT_ID \
    [--description=DESCRIPTION] \
    [--filter=FILTER]
The filter allows you to exclude unwanted findings (they will still be in SCC, but won't be copied to BigQuery).
It's useful if you want to export findings from only one project or only selected categories. (Use -category:CATEGORY to exclude categories; negation works the same on other parameters as well.)
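For illustration, a hypothetical invocation (the export name, project, dataset, and filter values below are made up; --dataset takes the full resource path):

gcloud scc bqexports create scc-findings-export \
    --dataset=projects/my-project/datasets/scc_findings \
    --organization=123456789 \
    --filter='severity="HIGH" AND -category:"OPEN_FIREWALL"'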
I managed to sort this by writing:
for i, finding_result in enumerate(finding_result_iterator):
    rows_to_insert = [
        {u"category": finding_result.finding.category,
         u"name": finding_result.finding.name,
         u"project": finding_result.resource.project_display_name,
         u"external_uri": finding_result.finding.external_uri},
    ]
    # insert_rows_json expects a sequence of dicts, which fixes the earlier error
    errors = BQclient.insert_rows_json(table_id, rows_to_insert)
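If there are many findings, a variation worth considering (my addition, not part of the original answer) is to collect all the rows first and make a single insert_rows_json call instead of one per finding:

rows_to_insert = []
for finding_result in finding_result_iterator:
    # Build one dict per finding, same fields as above.
    rows_to_insert.append({
        u"category": finding_result.finding.category,
        u"name": finding_result.finding.name,
        u"project": finding_result.resource.project_display_name,
        u"external_uri": finding_result.finding.external_uri,
    })

# One streaming-insert call for the whole batch.
errors = BQclient.insert_rows_json(table_id, rows_to_insert)
if errors:
    print("Encountered errors while inserting rows: {}".format(errors))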
Ok, I'm new to Python but... I really like it. I have been trying to figure this out for a while and thought someone who knows a lot more than I do could help.
So what I would like to do is use pygsheets and batch the updates into one API call vs several. I have been searching for examples or ideas and found that if you unlink and link it will do this? I tried it and it sped things up only a little bit; then I saw you could use update_values vs update_value. I have got it to work with something like wk1.update_values('A2:C4',[[1,2,3],[4,5,6],[7,8,9]]), but what if you want the updates to be in specific cell locations vs a range like A2:C4? I appreciate any advice in advance.
https://pygsheets.readthedocs.io/en/latest/worksheet.html#pygsheets.Worksheet.update_values
https://pygsheets.readthedocs.io/en/latest/sheet_api.html?highlight=batch_updates#pygsheets.sheet.SheetAPIWrapper.values_batch_update
import pygsheets
gc = pygsheets.authorize() # This will create a link to authorize
# Open spreadsheet
GS_ID = ''
File_Tab_Name = 'File1'
Main_Topic = 'Main Topic'
Actual_Company_Name = 'Company Name'
Street = 'Street Address'
City_State_Zip = 'City State Zip'
Phone_Number = 'Phone Number'
# 2. Open spreadsheet by key
sh = gc.open_by_key(GS_ID)
sh.title = File_Tab_Name
wk1 = sh[0]
wk1.title = File_Tab_Name
#wk1.update_values('A2:C4',[[1,2,3],[4,5,6],[7,8,9]])
wk1.update_values([['a1'],['h1'],['i3']],[[Main_Topic],[Actual_Company_Name],[Street]]) ### is this possible
#wk1.unlink()
#wk1.title = File_Tab_Name
#wk1.update_value("a1",Main_Topic) ###Topic
#wk1.update_value("h1",Actual_Company_Name) ###Company Name
#wk1.update_value("i3",Street) ###Street Address
#wk1.update_value("i4",City_State_Zip) ###City State Zip
#wk1.update_value("i5",Phone_Number) ### Phone Number
#wk1.link() # will do all the updates
From what I could understand, you want to batch update values. You can use the update_values_batch function.
wks.update_values_batch(['A1:A2', 'B1:B2'], [[[1],[2]], [[3],[4]]])
# or
wks.update_values_batch([((1,1), (2,1)), 'B1:B2'], [[[1,2]], [[3,4]]], 'COLUMNS')
# or
wks.update_values_batch(['A1:A2', 'B1:B2'], [[[1,2]], [[3,4]]], 'COLUMNS')
See the pygsheets docs for details.
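Applied to the single cells from the question, each cell can be passed as its own one-cell range (a sketch, assuming the wk1 and the variables defined in the question):

wk1.update_values_batch(
    ['A1', 'H1', 'I3'],
    [[[Main_Topic]], [[Actual_Company_Name]], [[Street]]]
)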
NB: update pygsheets to the latest version or install it from GitHub:
pip install --upgrade https://github.com/nithinmurali/pygsheets/archive/staging.zip
Unfortunately, pygsheets has no method for updating multiple ranges in batch. Instead, you can use gspread.
gspread has a batch_update method where you can update multiple cells or ranges at once.
Example:
Code:
import gspread
gc = gspread.service_account()
sh = gc.open_by_key("insert spreadsheet key here").sheet1
sh.batch_update([{
    'range': 'A1:B1',
    'values': [['42', '43']],
}, {
    'range': 'A2:B2',
    'values': [['44', '45']],
}])
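One detail worth adding (my note, not from the original answer): the values above are strings, and gspread's batch_update also accepts a value_input_option keyword if you want Sheets to parse them as if typed into the UI:

# USER_ENTERED lets Sheets coerce '42' into a number instead of text.
sh.batch_update([{
    'range': 'A1:B1',
    'values': [['42', '43']],
}], value_input_option='USER_ENTERED')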
References:
gspread:batch_update()
gspread Authentication
I'm trying to authorize a view programmatically in BigQuery and I have the following issue: I tried the code proposed in the Google docs (https://cloud.google.com/bigquery/docs/dataset-access-controls), but when it comes to the part of getting the current access entries for the dataset, the result is always empty. I don't want to overwrite the current configuration. Any idea about this behavior?
def authorize_view(dataset_id, view_name):
    dataset_ref = client.dataset(dataset_id)
    view_ref = dataset_ref.table(view_name)
    source_dataset = bigquery.Dataset(client.dataset('mydataset'))
    access_entries = source_dataset.access_entries  # This returns []
    access_entries.append(
        bigquery.AccessEntry(None, 'view', view_ref.to_api_repr())
    )
    source_dataset.access_entries = access_entries
    source_dataset = client.update_dataset(
        source_dataset, ['access_entries'])  # API request
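One thing that may explain the empty list (my reading, not a confirmed answer): bigquery.Dataset(ref) only builds a local object and never calls the API, so its access_entries start out empty, whereas client.get_dataset() fetches the dataset including its current access list. A sketch of the difference:

# Local-only object: no API call happens here, so access_entries is [].
local_ds = bigquery.Dataset(client.dataset('mydataset'))

# get_dataset() actually fetches the dataset, including access entries.
fetched_ds = client.get_dataset(client.dataset('mydataset'))
print(len(fetched_ds.access_entries))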
I have a mother workbook, viewable here (https://docs.google.com/spreadsheets/d/13UnBY3umkojHMd9K9PYv8zITu253vO_ipHNxLvE5u9Q/edit#gid=521061986), of which I have only kept the sheets 'Saturday' and 'Sunday'. I have copied this spreadsheet to my drive and I am highlighting the events I am interested in (changing the fill color to yellow). I want to write code that creates a new 'schedule' based on the events I highlighted: keeping the yellow events but setting the other events to null and deleting empty rows.
So far, I've successfully connected to my spreadsheet, but I don't know how to identify the cells by their fill color.
import os
import gspread
from oauth2client.service_account import ServiceAccountCredentials
DATA_DIR = '/path/here/'
scope = ['https://spreadsheets.google.com/feeds', 'https://www.googleapis.com/auth/drive',
'https://www.googleapis.com/auth/spreadsheets']
path = os.path.join(DATA_DIR, 'client_secret.json')
creds = ServiceAccountCredentials.from_json_keyfile_name(path, scope)
client = gspread.authorize(creds)
spreadsheet = client.open('dcon').sheet1
sheets = ['Saturday', 'Sunday']
# deleted every other sheet except Saturday & Sunday since that's when I'll be there
def get_sheet_colors(spreadsheet, ranges: list):
    params = {'ranges': ranges,
              'fields': 'sheets(data(rowData(values(effectiveFormat/backgroundColor,formattedValue)),'
                        'startColumn,startRow),properties(sheetId,title))'}
    return spreadsheet.get(**params).execute()

desiredA1NotationRanges = ['Sunday!A1:K2', 'Saturday!B2:D4']
# these are example ranges
get_sheet_colors(spreadsheet, desiredA1NotationRanges)
I tried this function, which I pulled from another StackOverflow question, but I can't call 'get' on a 'worksheet' object. Does anyone have any ideas, or maybe an easier way to identify the events I'm interested in so the code is simpler to write?
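One possible direction (an assumption on my part, not from the thread): keep the Spreadsheet object instead of taking .sheet1, and use its fetch_sheet_metadata method, which forwards params to the same spreadsheets.get endpoint and returns parsed JSON, so no .execute() is needed:

spreadsheet = client.open('dcon')  # keep the Spreadsheet, not .sheet1

def get_sheet_colors(spreadsheet, ranges: list):
    params = {'ranges': ranges,
              'fields': 'sheets(data(rowData(values(effectiveFormat/backgroundColor,'
                        'formattedValue)),startColumn,startRow),properties(sheetId,title))'}
    return spreadsheet.fetch_sheet_metadata(params)

colors = get_sheet_colors(spreadsheet, ['Sunday!A1:K2', 'Saturday!B2:D4'])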