I want to write a Python script that uploads data from a file to a BigQuery table.
Here is the code:
from google.cloud import bigquery
client = bigquery.Client(project=project_id, location='US').from_service_account_json('my-key.json')
dataset_ref = client.dataset(dataset_id)
table_ref = dataset_ref.table(table_name)
client.load_table_from_file(filename, table_ref)
I am using a GCP VM created in the same project as my BigQuery table, and a service account that has the BigQuery Admin role.
I get an error saying the user doesn't have the bigquery.jobs.create permission.
I don't know if this is useful information, but I am able to read my table.
I don't know what to do.
Thanks for your help.
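For reference, the client is usually built in a single step rather than by chaining from_service_account_json onto an already-constructed instance. Below is a minimal sketch of the upload, reusing the key file and the placeholder names (project_id, dataset_id, table_name, filename) from the question, and assuming the source file is CSV:
from google.cloud import bigquery
# Build the client directly from the key file; project and location are
# passed here instead of on a separate constructor call.
client = bigquery.Client.from_service_account_json(
    'my-key.json', project=project_id, location='US'
)
# load_table_from_file expects an open file object, not a path.
job_config = bigquery.LoadJobConfig()
job_config.source_format = bigquery.SourceFormat.CSV  # assumption: CSV input
job_config.autodetect = True
table_ref = bigquery.DatasetReference(project_id, dataset_id).table(table_name)
with open(filename, 'rb') as source_file:
    load_job = client.load_table_from_file(source_file, table_ref, job_config=job_config)
load_job.result()  # wait for the load job to finish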
I am trying to use BigQuery from Python to query a table that is generated from a Google Sheet:
from google.cloud import bigquery
# Prepare connection and query
bigquery_client = bigquery.Client(project="my_project")
query = """
select * from `table-from-sheets`
"""
df = bigquery_client.query(query).to_dataframe()
I can usually query BigQuery tables, but now I am getting the following error:
Forbidden: 403 Access Denied: BigQuery BigQuery: Permission denied while getting Drive credentials.
What do I need to do to access Drive from Python?
Is there another way around this?
You are missing the scopes for the credentials. I'm pasting the code snippet from the official documentation.
In addition, do not forget to give the service account at least Viewer access to the Google Sheet.
from google.cloud import bigquery
import google.auth
# Create credentials with Drive & BigQuery API scopes.
# Both APIs must be enabled for your project before running this code.
credentials, project = google.auth.default(
    scopes=[
        "https://www.googleapis.com/auth/drive",
        "https://www.googleapis.com/auth/bigquery",
    ]
)
# Construct a BigQuery client object.
client = bigquery.Client(credentials=credentials, project=project)
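With a client built this way, the original query from the question should then work, as long as the Sheet is shared with the account the credentials belong to (the table name below is the question's placeholder):
# Query the Sheets-backed table using the Drive-scoped client.
query = """
select * from `table-from-sheets`
"""
df = client.query(query).to_dataframe()
print(df.head())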
I have a BigQuery dataset defined in Google Cloud with my userA account, and I want my colleague userB, who is a member of the same group, to be able to see the dataset that I have defined. Using the bq command-line interface, userB can see the project, but not the dataset. How can I share the dataset created by userA with userB using a Python script?
Another thing you may run into is that you must grant access at the dataset level in BigQuery. Depending on how you have set up user roles in Cloud Platform and BigQuery, you may need to give the service account direct access to the BigQuery dataset.
To do this, go into BigQuery, hover over your dataset, click the down arrow and select 'Share dataset'. A modal will open where you can specify which email addresses and service accounts to share the dataset with, and control their access rights.
Let me know if my instructions are too confusing and I'll upload some images showing exactly how to do this.
Good Luck!!
An example using the Python Client Library. Adapted from here but adding a get_dataset call to get the current ACL policy for already existing datasets:
from google.cloud import bigquery
project_id = "PROJECT_ID"
dataset_id = "DATASET_NAME"
group_name = "google-group-name@google.com"
role = "READER"
client = bigquery.Client(project=project_id)
dataset_info = client.get_dataset(client.dataset(dataset_id))
access_entries = dataset_info.access_entries
access_entries.append(
    bigquery.AccessEntry(role, "groupByEmail", group_name)
)
dataset_info.access_entries = access_entries
dataset_info = client.update_dataset(
    dataset_info, ['access_entries'])
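If it helps to confirm that the entry was added, a quick follow-up check (reusing the client and dataset_id from the snippet above) might be:
# Re-fetch the dataset and print its access entries.
dataset_info = client.get_dataset(client.dataset(dataset_id))
for entry in dataset_info.access_entries:
    print(entry.role, entry.entity_type, entry.entity_id)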
Another way to do it is using the Google Python API Client and the get and patch methods. First, we retrieve the existing dataset ACL, add the group as READER to the response and patch the dataset metadata:
from oauth2client.client import GoogleCredentials
from googleapiclient import discovery
project_id="PROJECT_ID"
dataset_id="DATASET_NAME"
group_name="google-group-name#google.com"
role="READER"
credentials = GoogleCredentials.get_application_default()
bq = discovery.build("bigquery", "v2", credentials=credentials)
response = bq.datasets().get(projectId=project_id, datasetId=dataset_id).execute()
response['access'].append({u'role': u'{}'.format(role), u'groupByEmail': u'{}'.format(group_name)})
bq.datasets().patch(projectId=project_id, datasetId=dataset_id, body=response).execute()
Replace the project_id, dataset_id, group_name and role variables accordingly.
Versions used:
$ pip freeze | grep -E 'bigquery|api-python'
google-api-python-client==1.7.7
google-cloud-bigquery==1.8.1
I am trying to run a simple query on Google BigQuery via a Python script, but am getting the below error that my service account is missing the bigquery.jobs.create permission.
My service account has the following roles applied:
Owner
BigQuery Admin
BigQuery Job User
I've also tried creating a custom role with bigquery.jobs.create and applying that to the service account, but still consistently get this error. What am I doing wrong?
from google.cloud import bigquery
from google.oauth2 import service_account
project_id = "my-test-project"
credentials = service_account.Credentials.from_service_account_file("credentials.json")
client = bigquery.Client(
    credentials=credentials,
    project=project_id
)
print(client.project) # returns "my-test-project"
query = client.query("select 1 as test;")
Access Denied: Project my-test-project: The user my-service-account@my-test-project.iam.gserviceaccount.com does not have bigquery.jobs.create permission in project my-test-project.
Authenticating the client using client = bigquery.Client.from_service_account_json("credentials.json") is the preferred way to avoid these "Access Denied" errors. For one reason or another (I'm not sure why, since BigQuery does use OAuth 2.0 access tokens to authorize requests), setting credentials through google.oauth2.service_account can lead to permission issues.
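A minimal sketch of that approach, reusing the credentials.json file and project ID from the question:
from google.cloud import bigquery
# Build the client directly from the service-account key file.
client = bigquery.Client.from_service_account_json(
    "credentials.json", project="my-test-project"
)
query_job = client.query("select 1 as test;")
print(list(query_job.result()))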
(There are a lot of similar threads here, but unfortunately I couldn't find the answer to my error anywhere here or on Google.)
I'm trying to query a federated table in BigQuery which is pointing to a spreadsheet in Drive.
I've run the following command to create default application credentials for gcloud:
$ gcloud auth application-default login
But this doesn't include Drive into the scope so I'm getting the following error message (which makes sense): Forbidden: 403 Access Denied: BigQuery BigQuery: No OAuth token with Google Drive scope was found.
Then I've tried to auth with explicit Drive scope:
$ gcloud auth application-default login --scopes=https://www.googleapis.com/auth/drive,https://www.googleapis.com/auth/cloud-platform,https://www.googleapis.com/auth/bigquery
After that I'm getting the following error when I try to use bigquery python api:
"Forbidden: 403 Access Denied: BigQuery BigQuery: Access Not Configured. Drive API has not been used in project 764086051850 before or it is disabled. Enable it by visiting https://console.developers.google.com/apis/api/drive.googleapis.com/overview?project=764086051850 then retry. If you enabled this API recently, wait a few minutes for the action to propagate to our systems and retry."
The project number above does not exist in our organisation and the provided link leads to a page which says:
The API "drive.googleapis.com" doesn't exist or you don't have permission to access it
The Drive API is definitely enabled for the default project, so the error message doesn't make much sense. I can also query the table from the terminal using the bq query command.
I'm currently out of ideas on how to debug this further; any suggestions?
Configuration:
Google Cloud SDK 187.0.0
Python 2.7
google-cloud 0.27.0
google-cloud-bigquery 0.29.0
There might be issues when using the default credentials. However, you can use a service account, save the credentials in a JSON file and add the necessary scopes. I did a quick test and this code worked for me:
from google.cloud import bigquery
from google.oauth2.service_account import Credentials
scopes = (
    'https://www.googleapis.com/auth/bigquery',
    'https://www.googleapis.com/auth/cloud-platform',
    'https://www.googleapis.com/auth/drive'
)
credentials = Credentials.from_service_account_file('/path/to/credentials.json')
credentials = credentials.with_scopes(scopes)
client = bigquery.Client(credentials=credentials)
query = "SELECT * FROM dataset.federated_table LIMIT 5"
query_job = client.query(query)
rows = query_job.result()
for row in rows:
    print(row)
If you get a 404 Not Found error, it is because you need to share the spreadsheet with the service account (view permission).
I would like to develop an App Engine application that streams data directly into a BigQuery table.
According to Google's documentation there is a simple way to stream data into BigQuery:
http://googlecloudplatform.blogspot.co.il/2013/09/google-bigquery-goes-real-time-with-streaming-inserts-time-based-queries-and-more.html
https://developers.google.com/bigquery/streaming-data-into-bigquery#streaminginsertexamples
(note: in the above link you should select the Python tab, not Java)
Here is the sample code snippet showing how a streaming insert should be coded:
body = {"rows":[
{"json": {"column_name":7.7,}}
]}
response = bigquery.tabledata().insertAll(
projectId=PROJECT_ID,
datasetId=DATASET_ID,
tableId=TABLE_ID,
body=body).execute()
Although I've downloaded the client API, I didn't find any reference to the "bigquery" module/object used in Google's example above.
Where should the bigquery object (from the snippet) come from?
Can anyone show a more complete way to use this snippet (with the right imports)?
I've been searching for this a lot and found the documentation confusing and incomplete.
Minimal working example (as long as you fill in the right IDs for your project):
import httplib2
from apiclient import discovery
from oauth2client import appengine
_SCOPE = 'https://www.googleapis.com/auth/bigquery'
# Change the following 3 values:
PROJECT_ID = 'your_project'
DATASET_ID = 'your_dataset'
TABLE_ID = 'TestTable'
body = {"rows":[
{"json": {"Col1":7,}}
]}
credentials = appengine.AppAssertionCredentials(scope=_SCOPE)
http = credentials.authorize(httplib2.Http())
bigquery = discovery.build('bigquery', 'v2', http=http)
response = bigquery.tabledata().insertAll(
    projectId=PROJECT_ID,
    datasetId=DATASET_ID,
    tableId=TABLE_ID,
    body=body).execute()
print response
As Jordan says: "Note that this uses the appengine robot to authenticate with BigQuery, so you'll need to add the robot account to the ACL of the dataset. Note that if you also want to use the robot to run queries, not just stream, you need the robot to be a member of the project 'team' so that it is authorized to run jobs."
Here is a working code example from an appengine app that streams records to a BigQuery table. It is open source at code.google.com:
http://code.google.com/p/bigquery-e2e/source/browse/sensors/cloud/src/main.py#124
To find out where the bigquery object comes from, see
http://code.google.com/p/bigquery-e2e/source/browse/sensors/cloud/src/config.py
Note that this uses the appengine robot to authenticate with BigQuery, so you'll need to add the robot account to the ACL of the dataset.
Note that if you also want to use the robot to run queries, not just stream, you need the robot to be a member of the project 'team' so that it is authorized to run jobs.
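For completeness, here is a hedged sketch of granting the appengine robot access to the dataset using the same BigQuery v2 API, reusing PROJECT_ID and DATASET_ID from the snippet above; the robot email below is a placeholder for your app's service account:
from oauth2client.client import GoogleCredentials
from googleapiclient import discovery
# Run this with credentials that already own the dataset (e.g. the application
# default credentials of a project owner), not with the robot itself.
credentials = GoogleCredentials.get_application_default()
bq = discovery.build("bigquery", "v2", credentials=credentials)
# Fetch the current ACL, append the appengine robot as WRITER, and patch it back.
dataset = bq.datasets().get(projectId=PROJECT_ID, datasetId=DATASET_ID).execute()
dataset['access'].append(
    {'role': 'WRITER', 'userByEmail': 'your-app-id@appspot.gserviceaccount.com'}
)
bq.datasets().patch(projectId=PROJECT_ID, datasetId=DATASET_ID, body=dataset).execute()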