Problem authorizing a view in BigQuery: dataset access entries list is empty (Python)

I'm trying to authorize a view programmatically in BigQuery and I'm running into the following issue: I tried the code proposed in the Google docs (https://cloud.google.com/bigquery/docs/dataset-access-controls), but when it comes to the part of getting the current access entries for the dataset, the result is always empty. I don't want to overwrite the current configuration. Any idea what causes this behavior?
def authorize_view(dataset_id, view_name):
    dataset_ref = client.dataset(dataset_id)
    view_ref = dataset_ref.table(view_name)
    source_dataset = bigquery.Dataset(client.dataset('mydataset'))
    access_entries = source_dataset.access_entries  # This returns []
    access_entries.append(
        bigquery.AccessEntry(None, 'view', view_ref.to_api_repr())
    )
    source_dataset.access_entries = access_entries
    source_dataset = client.update_dataset(
        source_dataset, ['access_entries'])  # API request
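For what it's worth, a likely cause is that bigquery.Dataset(client.dataset('mydataset')) only builds a local object and never calls the API, so its ACL starts out empty. A rough sketch of the approach from the linked docs, which fetches the dataset first (project, dataset and view names below are placeholders):

from google.cloud import bigquery

client = bigquery.Client(project='my-project')  # placeholder project

# get_dataset() actually calls the API, so access_entries reflects the
# dataset's current ACL instead of coming back empty.
source_dataset = client.get_dataset('mydataset')

view_ref = bigquery.TableReference.from_string('my-project.views_dataset.my_view')

entries = list(source_dataset.access_entries)
entries.append(bigquery.AccessEntry(None, 'view', view_ref.to_api_repr()))
source_dataset.access_entries = entries

# Only the access_entries field is written back, so nothing else is overwritten.
source_dataset = client.update_dataset(source_dataset, ['access_entries'])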

Google Cloud Datastore queries

I'm trying to locate data based on its Name/ID field, which is auto-generated within Google Cloud. I want to be able to update the given entity, but I'm finding it hard to work with the data formatting. I have a list of data with an 'Update' button; clicking it gives me the unique Name/ID of that entity, but I can't seem to find a way to also pull the information associated with that Name/ID from Google Cloud.
(Screenshots in the original post: the table with the data, the data inside Google Cloud, and the unique ID that was located but couldn't be used to pull the other data.)
def updateSong():
    songID = request.form['Update']
    # songQuery = datastore_client.query(kind="Song")
    # songs = list(songQuery.fetch())
    query = datastore_client.query(kind='Song', ancestor=songID)
    songData = query.fetch()
    print(songData)

    id_token = request.cookies.get("token")
    error_message = None
    if id_token:
        try:
            user_data = google.oauth2.id_token.verify_firebase_token(
                id_token, firebase_request_adapter)
        except ValueError as exc:
            error_message = str(exc)
    return render_template('UpdateSong.html', user_data=user_data,
                           error_message=error_message, songID=songID)
Is there not a method of querying by the song ID so that I can then use it like this:
song['Title'] = song title
Try this:
query = datastore_client.query()
query.key_filter(datastore_client.key('Song', songID))
song = list(query.fetch())
Source: https://googleapis.dev/python/datastore/latest/_modules/google/cloud/datastore/query.html#Query.key_filter
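One detail worth hedging: request.form returns the ID as a string, while auto-generated Datastore IDs are integers, so the key may need an int() cast. And once you have the key, you can fetch and update the entity directly without a query; a rough sketch (untested):

from google.cloud import datastore

datastore_client = datastore.Client()

# songID comes from request.form['Update'] as in the question; auto-generated
# Datastore IDs are integers, so cast it before building the key.
song_key = datastore_client.key('Song', int(songID))
song = datastore_client.get(song_key)   # returns None if no such entity exists

if song is not None:
    print(song['Title'])        # an Entity behaves like a dict
    song['Title'] = 'New title'
    datastore_client.put(song)  # write the update back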

Listing IBM Cloud Resources using ResourceControllerV2 and pagination issues

I'm using the Python ibm-cloud-sdk in an attempt to iterate over all resources in a particular IBM Cloud account. My trouble is that pagination doesn't appear to work for me: when I pass in the "next_url", I still get the same list coming back from the call.
Here is my test code. I successfully print many of my COS instances, but I only seem to be able to print the first page. Maybe I've been looking at this too long and missed something obvious; does anyone have a clue why I can't retrieve the next page?
try:
    ####### authenticate and set the service url
    auth = IAMAuthenticator(RESOURCE_CONTROLLER_APIKEY)
    service = ResourceControllerV2(authenticator=auth)
    service.set_service_url(RESOURCE_CONTROLLER_URL)

    ####### Retrieve the resource instance listing
    r = service.list_resource_instances().get_result()

    ####### get the row count and resources list
    rows_count = r['rows_count']
    resources = r['resources']

    while rows_count > 0:
        print('Number of rows_count {}'.format(rows_count))
        next_url = r['next_url']
        for i, resource in enumerate(resources):
            type = resource['id'].split(':')[4]
            if type == 'cloud-object-storage':
                instance_name = resource['name']
                instance_id = resource['guid']
                crn = resource['crn']
                print('Found instance id : name - {} : {}'.format(instance_id, instance_name))

        ############### this is SUPPOSED to get the next page
        r = service.list_resource_instances(start=next_url).get_result()
        rows_count = r['rows_count']
        resources = r['resources']
except Exception as e:
    Error = 'Error : {}'.format(e)
    print(Error)
    exit(1)
Looking at the API documentation for listing resource instances, the value of next_url includes the URL path and the start parameter with its token.
To retrieve the next page, you only need to pass in the start parameter with that token as its value. IMHO this is not ideal.
I typically do not use the SDK, but a simple Python request. Then I can use the endpoint (base) URI + next_url as the full URI, as sketched below.
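A rough sketch of that plain-requests approach (the base URL and the way the bearer token is obtained are assumptions, untested):

import requests

BASE_URL = 'https://resource-controller.cloud.ibm.com'  # assumed endpoint

# auth is the IAMAuthenticator from the question; its token manager can hand
# out a bearer token for the Authorization header.
headers = {'Authorization': 'Bearer {}'.format(auth.token_manager.get_token())}

url = BASE_URL + '/v2/resource_instances'
while url:
    r = requests.get(url, headers=headers).json()
    for resource in r['resources']:
        print(resource['name'])
    # next_url is a path plus query string, or None on the last page
    next_url = r.get('next_url')
    url = BASE_URL + next_url if next_url else None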
If you stick with the SDK, use urllib.parse to extract the query parameter. Not tested, but something like:
from urllib.parse import urlparse, parse_qs

o = urlparse(next_url)
q = parse_qs(o.query)
r = service.list_resource_instances(start=q['start'][0]).get_result()
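Putting it together, a pagination loop with the SDK might look roughly like this (untested sketch):

from urllib.parse import urlparse, parse_qs

start = None
while True:
    r = service.list_resource_instances(start=start).get_result()
    for resource in r['resources']:
        print(resource['name'])
    next_url = r.get('next_url')
    if not next_url:
        break
    # next_url looks like '/v2/resource_instances?...&start=<token>';
    # the SDK only wants the token itself.
    start = parse_qs(urlparse(next_url).query)['start'][0]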
Could you use the Search API for listing the resources in your account rather than the resource controller? The search index is set up for exactly that operation, whereas paginating results from the resource controller seems much more brute force.
https://cloud.ibm.com/apidocs/search#search

Can CampaignPerformanceReportRequest return for all campaigns?

I'm trying to use the Bing Ads API to duplicate what I see in the hourly report.
Unfortunately, even though I'm properly authenticated, the data I'm getting back is only for one campaign (one which has about 1 impression per day). I can see the data in the UI just fine, but authenticated as the same user via the API I can only seem to get back the smaller data set. I'm using https://github.com/BingAds/BingAds-Python-SDK and basing my code on the example:
def get_hourly_report(
        account_id,
        report_file_format,
        return_only_complete_data,
        time):
    report_request = reporting_service.factory.create('CampaignPerformanceReportRequest')
    report_request.Aggregation = 'Hourly'
    report_request.Format = report_file_format
    report_request.ReturnOnlyCompleteData = return_only_complete_data
    report_request.Time = time
    report_request.ReportName = "Hourly Bing Report"

    scope = reporting_service.factory.create('AccountThroughCampaignReportScope')
    scope.AccountIds = {'long': [account_id]}
    # scope.Campaigns = reporting_service.factory.create('ArrayOfCampaignReportScope')
    # scope.Campaigns.CampaignReportScope.append()
    report_request.Scope = scope

    report_columns = reporting_service.factory.create('ArrayOfCampaignPerformanceReportColumn')
    report_columns.CampaignPerformanceReportColumn.append([
        'TimePeriod',
        'CampaignId',
        'CampaignName',
        'DeviceType',
        'Network',
        'Impressions',
        'Clicks',
        'Spend'
    ])
    report_request.Columns = report_columns
    return report_request
I'm not super familiar with these ad data APIs, so any insight will be helpful, even if you don't have a solution.
I spent weeks back and forth with Microsoft Support. Here's the result:
You can get logs out of the examples by adding this code:
import logging
logging.basicConfig(level=logging.INFO)
logging.getLogger('suds.client').setLevel(logging.DEBUG)
logging.getLogger('suds.transport').setLevel(logging.DEBUG)
The issue was related to the way the example is built. In the auth_helper.py file there is a method named authenticate that looks like this:
def authenticate(authorization_data):
    # import logging
    # logging.basicConfig(level=logging.INFO)
    # logging.getLogger('suds.client').setLevel(logging.DEBUG)
    # logging.getLogger('suds.transport.http').setLevel(logging.DEBUG)
    customer_service = ServiceClient(
        service='CustomerManagementService',
        version=13,
        authorization_data=authorization_data,
        environment=ENVIRONMENT,
    )

    # You should authenticate for Bing Ads services with a Microsoft Account.
    authenticate_with_oauth(authorization_data)

    # Set to an empty user identifier to get the current authenticated Bing Ads user,
    # and then search for all accounts the user can access.
    user = get_user_response = customer_service.GetUser(
        UserId=None
    ).User
    accounts = search_accounts_by_user_id(customer_service, user.Id)

    # For this example we'll use the first account.
    authorization_data.account_id = accounts['AdvertiserAccount'][0].Id
    authorization_data.customer_id = accounts['AdvertiserAccount'][0].ParentCustomerId
As you can see, at the very bottom it says "For this example we'll use the first account." It turns out that my company has two accounts. This wasn't configurable anywhere, and I had no idea this code was here, but you can add a breakpoint there to see your full list of accounts. We only had two, so I flipped the 0 to a 1 and everything started working.
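If you would rather not rely on the list order, you can select the account explicitly; a rough sketch, where MY_ACCOUNT_ID is a placeholder for the account ID shown in the Bing Ads UI:

# Pick the advertiser account by its ID instead of assuming it is first in
# the list returned by search_accounts_by_user_id.
MY_ACCOUNT_ID = 123456789  # placeholder

account = next(
    a for a in accounts['AdvertiserAccount'] if a.Id == MY_ACCOUNT_ID
)
authorization_data.account_id = account.Id
authorization_data.customer_id = account.ParentCustomerId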

Batch predictions with Google AutoML via Python

I'm pretty new to Stack Overflow as well as to the Google Cloud Platform, so apologies if I'm not asking this question in the right format. I'm currently facing an issue with getting predictions from my model.
I've trained a multilabel AutoML model on the Google Cloud Platform and now I want to use that model to score new data entries.
Since the platform only allows one entry at a time, I want to use Python to do batch predictions.
I've stored my data entries in separate .txt files in a Google Cloud Storage bucket and created a .txt file listing the gs:// references to those files (as recommended in the documentation).
I've exported a .json file with my service account credentials and specified the IDs and paths in my code:
import os
from google.cloud import automl

# import API credentials and specify model / path references
path = 'xxx.json'
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = path
model_name = 'xxx'
model_id = 'TCN1234567890'
project_id = '1234567890'
model_full_id = f"https://eu-automl.googleapis.com/v1/projects/{project_id}/locations/eu/models/{model_id}"
input_uri = f"gs://bucket_name/{model_name}/file_list.txt"
output_uri = f"gs://bucket_name/{model_name}/outputs/"
prediction_client = automl.PredictionServiceClient()
And then I'm running the following code to get the predictions:
# score batch of file_list
gcs_source = automl.GcsSource(input_uris=[input_uri])
input_config = automl.BatchPredictInputConfig(gcs_source=gcs_source)
gcs_destination = automl.GcsDestination(output_uri_prefix=output_uri)
output_config = automl.BatchPredictOutputConfig(
    gcs_destination=gcs_destination
)
response = prediction_client.batch_predict(
    name=model_full_id,
    input_config=input_config,
    output_config=output_config
)
print("Waiting for operation to complete...")
print(
    f"Batch Prediction results saved to Cloud Storage bucket. {response.result()}"
)
However, I'm getting the following error: InvalidArgument: 400 Request contains an invalid argument.
Would anyone have a hint as to what is causing this issue?
Any input would be appreciated! Thanks!
Found the issue!
I needed to point the client at the 'eu' regional endpoint first:
options = ClientOptions(api_endpoint='eu-automl.googleapis.com')
prediction_client = automl.PredictionServiceClient(client_options=options)
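For completeness, a minimal sketch of that setup including the imports it relies on (assuming the google-cloud-automl and google-api-core packages):

# Point the AutoML prediction client at the EU regional endpoint.
from google.cloud import automl
from google.api_core.client_options import ClientOptions

options = ClientOptions(api_endpoint='eu-automl.googleapis.com')
prediction_client = automl.PredictionServiceClient(client_options=options)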

How to copy a BigQuery view via Python SDK?

I have two BigQuery projects and I want to copy a view from Project 1 to Project 2:
from google.cloud import bigquery
proj_1 = bigquery.Client.from_service_account_json(<path>, project='Project 1')
dataset_1 = proj_1.dataset(<dataset_name>)
view_1 = dataset_1.table(<view_name>) # View to copy, already existing
proj_2 = bigquery.Client.from_service_account_json(<path>, project='Project 2')
dataset_2 = proj_2.dataset(<dataset_name>)
view_2 = dataset_2.table(<view_name>) # Destination for copied view
# Start copy job like Google says
# https://cloud.google.com/bigquery/docs/tables#copyingtable
I get the following error:
RuntimeError: [{'message': 'Using table <project>:<dataset>.<view_name> is not allowed for this operation because of its type. Try using a different table that is of type TABLE.', 'reason': 'invalid'}]
I already know that if I set the attribute view_query, view_2 will be recognized as a view. If I set it manually, it works. But the second (automated) solution does not, because the attribute view_1.view_query is always None.
view_2.view_query = 'SELECT * FROM ...' # works
view_2.view_query = view_1.view_query # Won't work, because view_1.view_query is always None
How can I access the query of view_1?
Calling view_1.reload() loads the view_query attribute.
See https://googlecloudplatform.github.io/google-cloud-python/latest/bigquery-usage.html
So
view_1.reload()
view_2.view_query = view_1.view_query
view_2.create() # No need for a copy job, because there is no data copied
does the trick now.
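Note that reload() and create() belong to the older (pre-1.0) google-cloud-bigquery client. With current versions, the equivalent would be roughly the following (untested sketch; project, dataset and view names are placeholders):

from google.cloud import bigquery

proj_1 = bigquery.Client.from_service_account_json('<path>', project='project-1')
proj_2 = bigquery.Client.from_service_account_json('<path>', project='project-2')

# get_table() fetches the view definition, including view_query
source_view = proj_1.get_table('project-1.source_dataset.my_view')

copied_view = bigquery.Table('project-2.dest_dataset.my_view')
copied_view.view_query = source_view.view_query
proj_2.create_table(copied_view)  # creates the view; no data is copied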
