Python boto connecting to S3 throwing error - python

Hello, I am trying to connect Python to S3 in the Frankfurt region using boto 2.43, and I want to print the contents of a bucket.
Following is my code:
from boto.s3.connection import S3Connection

hostname = 's3.eu-central-1.amazonaws.com'
conn = S3Connection(aws_access_key_id, aws_secret_access_key, host=hostname)
bucket_name = conn.get_bucket('jd-eu01-isg-analytics-data-from-us01', validate=False)
for key in bucket_name.list(prefix='EU_Scripts_For_Frankfurt/'):
    print key
When I execute it, it throws the following error:
File "/usr/lib/python2.7/site-packages/boto/s3/bucketlistresultset.py", line 34, in bucket_lister
encoding_type=encoding_type)
File "/usr/lib/python2.7/site-packages/boto/s3/bucket.py", line 473, in get_all_keys
'', headers, **params)
File "/usr/lib/python2.7/site-packages/boto/s3/bucket.py", line 399, in _get_all
query_args=query_args)
File "/usr/lib/python2.7/site-packages/boto/s3/connection.py", line 668, in make_request
retry_handler=retry_handler
File "/usr/lib/python2.7/site-packages/boto/connection.py", line 1071, in make_request
retry_handler=retry_handler)
File "/usr/lib/python2.7/site-packages/boto/connection.py", line 927, in _mexe
request.authorize(connection=self)
File "/usr/lib/python2.7/site-packages/boto/connection.py", line 377, in authorize
connection._auth_handler.add_auth(self, **kwargs)
File "/usr/lib/python2.7/site-packages/boto/auth.py", line 727, in add_auth
**kwargs)
File "/usr/lib/python2.7/site-packages/boto/auth.py", line 546, in add_auth
string_to_sign = self.string_to_sign(req, canonical_request)
File "/usr/lib/python2.7/site-packages/boto/auth.py", line 486, in string_to_sign
sts.append(self.credential_scope(http_request))
File "/usr/lib/python2.7/site-packages/boto/auth.py", line 468, in credential_scope
region_name = self.determine_region_name(http_request.host)
File "/usr/lib/python2.7/site-packages/boto/auth.py", line 662, in determine_region_name
return region_name
UnboundLocalError: local variable 'region_name' referenced before assignment
How can I resolve this issue? Is it because of the boto version? Any solutions, please.

Use boto3; it is easier. boto 2 is no longer supported by AWS: it is already deprecated and there is no intention to add new features.
The error pops up because boto cannot determine the region. The boto/boto3 API resolves the region name during service initialization. If you don't specify it, it looks for a default region defined in the credential/config files or in an environment variable (e.g. ~/.aws/config).
This is true even if you explicitly specify the S3 endpoint URL. If you don't want to hard-code the credentials and region name, set up your AWS credentials as described here: credential configuration.
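For reference, a minimal default profile might look like this (the file locations are the standard ones, the values are purely illustrative):

~/.aws/credentials:
[default]
aws_access_key_id = AKIAXXXXXXXXXXXX
aws_secret_access_key = yyyyyyyyyyyyyyyyyyyy

~/.aws/config:
[default]
region = eu-central-1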
Letting boto/boto3 pick up credentials from the credential file or environment variables makes your code cleaner and more flexible; for example, you can even use STS without changing the code:
import boto3

# You can choose between the service resource or the service client.
s3 = boto3.client("s3")
response = s3.list_objects_v2(
    Bucket="jd-eu01-isg-analytics-data-from-us01",
    Prefix="EU_Scripts_For_Frankfurt"
)
for content in response["Contents"]:
    print(content["Key"])
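Note that list_objects_v2 returns at most 1,000 keys per call. If the prefix can hold more than that, a client paginator keeps the loop simple; a minimal sketch reusing the bucket and prefix above:

import boto3

s3 = boto3.client("s3")
paginator = s3.get_paginator("list_objects_v2")
# Each page holds up to 1,000 keys; iterating the paginator walks all of them.
for page in paginator.paginate(Bucket="jd-eu01-isg-analytics-data-from-us01",
                               Prefix="EU_Scripts_For_Frankfurt/"):
    for content in page.get("Contents", []):
        print(content["Key"])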
Nevertheless, you can still hard-code the access key ID, secret key, region name, etc. by passing them as parameters when you initialize boto/boto3 resources (the boto API is similar):
import boto3

s3 = boto3.client(
    "s3",
    region_name="eu-central-1",
    aws_access_key_id="xxxxxxxx",
    aws_secret_access_key="yyyyyyy")

Related

How to authenticate nexmo voice client?

I am trying to make an outbound call using the Nexmo (Vonage) API. To access the API, I am required to authenticate my client using an application id and a private key (given to me as a .key file).
The application id was specified as a string and the private key was specified as a path.
client = nexmo.Client(application_id = 'xxxxxx-xxxxxxx', private_key = "C:\\path\\to\\folder\\private.key")
I get the following error.
File "test_bot.py", line 30, in <module>
'ncco': ncco
File "C:\Users\vishn\AppData\Local\Programs\Python\Python37-32\lib\site-packages\wrapt\wrappers.py", line 606, in __call__
args, kwargs)
File "C:\Users\vishn\AppData\Local\Programs\Python\Python37-32\lib\site-packages\deprecated\classic.py", line 285, in wrapper_function
return wrapped_(*args_, **kwargs_)
File "C:\Users\vishn\AppData\Local\Programs\Python\Python37-32\lib\site-packages\nexmo\__init__.py", line 427, in create_call
return self._jwt_signed_post("/v1/calls", params or kwargs)
File "C:\Users\vishn\AppData\Local\Programs\Python\Python37-32\lib\site-packages\nexmo\__init__.py", line 719, in _jwt_signed_post
self.api_host(), self.session.post(uri, json=params, headers=self._headers())
File "C:\Users\vishn\AppData\Local\Programs\Python\Python37-32\lib\site-packages\nexmo\__init__.py", line 742, in _headers
return dict(self.headers, Authorization=b"Bearer " + token)
TypeError: can't concat str to bytes
Diana from Vonage here.
The fix is already on the way; however, I noticed that this occurs only with PyJWT 2.0.
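Until that fix lands, a hedged workaround is to pin PyJWT below 2.0. The underlying change is that jwt.encode() returned bytes in PyJWT 1.x but returns str in 2.0, which is why the library's b"Bearer " + token concatenation fails. A minimal sketch of normalising the token type (the HS256 secret and claims here are purely illustrative; nexmo actually signs with your application's RSA key):

import jwt  # PyJWT

# Illustrative payload and secret only.
token = jwt.encode({"application_id": "xxxxxx-xxxxxxx"}, "secret", algorithm="HS256")

# PyJWT < 2.0 returns bytes, PyJWT >= 2.0 returns str, so normalise the type
# before building the Authorization header the way the traceback above does.
if isinstance(token, str):
    token = token.encode("utf-8")
header_value = b"Bearer " + token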

client.get_bucket() fails, but only from cloud dataflow (compute engine)

The application has been working normally; now, on a re-deploy, Google Cloud Storage is giving strange errors.
MissingSchema: Invalid URL 'None/storage/v1/b/my-bucket-name?projection=noAcl': No schema supplied. Perhaps you meant http://None/storage/v1/b/my-bucket-name?projection=noAcl?
File "/usr/local/lib/python2.7/dist-packages/lib/file_store.py", line 11, in __init__
self.bucket = self.client.get_bucket(parts[0])
File "/usr/local/lib/python2.7/dist-packages/google/cloud/storage/client.py", line 301, in get_bucket
bucket.reload(client=self)
File "/usr/local/lib/python2.7/dist-packages/google/cloud/storage/_helpers.py", line 130, in reload
_target_object=self,
File "/usr/local/lib/python2.7/dist-packages/google/cloud/_http.py", line 392, in api_request
target_object=_target_object,
File "/usr/local/lib/python2.7/dist-packages/google/cloud/_http.py", line 269, in _make_request
return self._do_request(method, url, headers, data, target_object)
File "/usr/local/lib/python2.7/dist-packages/google/cloud/_http.py", line 298, in _do_request
return self.http.request(url=url, method=method, headers=headers, data=data)
File "/usr/local/lib/python2.7/dist-packages/google/auth/transport/requests.py", line 208, in request
method, url, data=data, headers=request_headers, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 519, in request
prep = self.prepare_request(req)
File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 462, in prepare_request
hooks=merge_hooks(request.hooks, self.hooks),
File "/usr/local/lib/python2.7/dist-packages/requests/models.py", line 313, in prepare
self.prepare_url(url, params)
File "/usr/local/lib/python2.7/dist-packages/requests/models.py", line 387, in prepare_url
raise MissingSchema(error)
MissingSchema: Invalid URL 'None/storage/v1/b/my-bucket-name?projection=noAcl': No schema supplied. Perhaps you meant http://None/storage/v1/b/my-bucket-name?projection=noAcl? [while running 'generatedPtransform-51']
Below is the code causing the error. I can run it locally using the same service account and it works with no error. I am using $env:GOOGLE_APPLICATION_CREDENTIALS to export my service account credentials at deploy time. All other services are working normally.
# My test is:
# fs = FileStore("gs://my-bucket-name/models/", "development", "general")
class FileStore():
    # modelPath - must be a gs:// style google storage resource path
    # containing everything but the file extension
    def __init__(self, modelPath, env, modelName):
        from google.cloud import storage
        parts = modelPath[5:].split('/', 1)
        self.client = storage.Client()
        self.bucket = self.client.get_bucket(parts[0])  # <- error here
Why would the google-cloud-core client fail to build a URL? Based on 'None/storage/v1/b/my-bucket-name?projection=noAcl', the missing part of the URL should be something like "https://www.googleapis.com".
This error is apparently caused by a version mismatch between google-cloud-storage and google-cloud-core. I had specified google-cloud-core >= 1.0.3 in my setup.py, but when I looked at the Docker image on the Compute Engine VM I found it had an earlier version.
After rebuilding my venv from setup.py, I also had to run:
C:\Python27\python.exe -m pipenv install google-cloud-core>=1.0.3 --skip-lock
Then I was able to deploy and the application started working again.
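For a Dataflow deployment, the same pin can go straight into setup.py so the workers install a compatible pair of packages. A minimal sketch, where the package name and the storage lower bound are illustrative:

from setuptools import setup, find_packages

setup(
    name="my-dataflow-pipeline",  # hypothetical package name
    version="0.1.0",
    packages=find_packages(),
    install_requires=[
        "google-cloud-storage>=1.23.0",  # illustrative lower bound
        "google-cloud-core>=1.0.3",      # the pin that resolved the mismatch
    ],
)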

creating bucket error on AWS using python

I want to create a bucket to upload some wav files into it. I am able to create a bucket manually with the required location, but when I try to create a bucket programmatically in Python with the us-west-2 location:
session = boto3.Session(aws_access_key_id=aws_access_key_id, aws_secret_access_key=aws_secret_access_key)
s3 = session.resource('s3')
s3.create_bucket(Bucket='test-asterisk1', CreateBucketConfiguration={'LocationConstraint': 'eu-central-1'})
I got the following error:
Traceback (most recent call last):
File "create_bucket.py", line 10, in <module>
s3.create_bucket(Bucket='asterisk1', CreateBucketConfiguration={'LocationConstraint': 'ap-south-1'})
File "/home/dileep/.local/lib/python2.7/site-packages/boto3/resources/factory.py", line 520, in do_action
response = action(self, *args, **kwargs)
File "/home/dileep/.local/lib/python2.7/site-packages/boto3/resources/action.py", line 83, in __call__ response = getattr(parent.meta.client, operation_name)(**params)
File "/home/dileep/.local/lib/python2.7/site-packages/botocore/client.py", line 314, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/home/dileep/.local/lib/python2.7/site-packages/botocore/client.py", line 612, in _make_api_call
raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (IllegalLocationConstraintException) when calling the CreateBucket operation: The ap-south-1 location constraint is incompatible for the region specific endpoint this request was sent to.
I am on an Indian IP trying to create a bucket against a 'us-west-2' endpoint; is that causing the problem?
So I tried changing the location constraint, one at a time:
"LocationConstraint": "EU"|"eu-west-1"|"us-west-1"|"us-west-2"|"ap-south-1"|"ap-southeast-1"|"ap-southeast-2"|"ap-northeast-1"|"sa-east-1"|"cn-north-1"|"eu-central-1"
but whatever location I try, it gives me the same error.
So I tried creating the bucket with boto instead of boto3:
import boto
from boto.s3.connection import Location
s3 = boto.connect_s3(aws_access_key_id, aws_secret_access_key)
s3.create_bucket('test-asterisk1', location=Location.USWest2)
It throws this error:
File "s2t_amazon.py", line 27, in <module>
s3.create_bucket('test-asterisk2', location=Location.USWest2)
File "/home/dileep/.local/lib/python2.7/site-packages/boto/s3/connection.py", line 623, in create_bucket
response.status, response.reason, body)
boto.exception.S3CreateError: S3CreateError: 409 Conflict
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>BucketAlreadyOwnedByYou</Code><Message>Your previous request to create the named bucket succeeded and you already own it.
</Message><BucketName>test-asterisk2</BucketName>
<RequestId>EAF26BA152FD20A5</RequestId><HostId>ep0WFZEb1mIjEgbYIY4BGGuOTi5HSutYd3XTKgFjWmRMnGG0ajj5TLF4/t1amJQsOZdZQrqGnoE=</HostId></Error>
I have checked whether the bucket was created, and it was not created by either method. Can anyone suggest what the problem could be?
This works:
import boto3

s3_client = boto3.client('s3', region_name='eu-central-1')
s3_client.create_bucket(
    Bucket='my-bucket',
    CreateBucketConfiguration={'LocationConstraint': 'eu-central-1'})
The important thing to realise is that the command must be sent to the region where the bucket is being created. Thus, you'll need to specify the region when creating the client and also when creating the bucket.
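If you re-run the script, the 409 BucketAlreadyOwnedByYou from the earlier boto attempt is harmless. A minimal sketch that tolerates it and then confirms the bucket is reachable (the bucket name is illustrative):

import boto3
from botocore.exceptions import ClientError

region = 'eu-central-1'
s3_client = boto3.client('s3', region_name=region)

try:
    s3_client.create_bucket(
        Bucket='my-bucket',
        CreateBucketConfiguration={'LocationConstraint': region})
except ClientError as e:
    # A 409 BucketAlreadyOwnedByYou simply means an earlier create_bucket
    # call in your own account already succeeded.
    if e.response['Error']['Code'] != 'BucketAlreadyOwnedByYou':
        raise

# head_bucket is a cheap way to confirm the bucket exists and is reachable.
s3_client.head_bucket(Bucket='my-bucket')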

credentials issue in Python while using AWS S3

This is the error I am getting:
ERROR:boto:Unable to read instance data, giving up
Traceback (most recent call last):
File "<ipython-input-62-476f799f9e0f>", line 2, in <module>
conn = boto.connect_s3()
File "/usr/local/lib/python2.7/dist-packages/boto/__init__.py", line 141, in connect_s3
return S3Connection(aws_access_key_id, aws_secret_access_key, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/boto/s3/connection.py", line 191, in __init__
validate_certs=validate_certs, profile_name=profile_name)
File "/usr/local/lib/python2.7/dist-packages/boto/connection.py", line 569, in __init__
host, config, self.provider, self._required_auth_capability())
File "/usr/local/lib/python2.7/dist-packages/boto/auth.py", line 993, in get_auth_handler
'Check your credentials' % (len(names), str(names)))
NoAuthHandlerFound: No handler was ready to authenticate. 1 handlers were checked. ['HmacAuthV1Handler'] Check your credentials
This error message appears while establishing the connection with AWS S3Connection.
I want to connect to AWS S3 and read CSV files.
Please help me out.
I am using Python 2.7.12.
Now I am using the code below:
import boto
import time
from boto.s3.connection import S3Connection
conn = S3Connection('<aws access key>','<aws secret key>')
print conn
from boto.s3.connection import Location
print '\n'.join(i for i in dir(Location) if i[0].isupper())
conn.create_bucket('egp-shared-prod/egp-prod-c2c1/',
location=Location.DEFAULT)
And it shows this error:
File "<ipython-input-69-4b49d719d4ca>", line 15, in <module>
conn.create_bucket('egp-shared-prod/egp-prod-c2c1/', location=Location.DEFAULT)
File "/usr/local/lib/python2.7/dist-packages/boto/s3/connection.py", line 616, in create_bucket
data=data)
File "/usr/local/lib/python2.7/dist-packages/boto/s3/connection.py", line 668, in make_request
retry_handler=retry_handler
File "/usr/local/lib/python2.7/dist-packages/boto/connection.py", line 1071, in make_request
retry_handler=retry_handler)
File "/usr/local/lib/python2.7/dist-packages/boto/connection.py", line 1030, in _mexe
raise ex
gaierror: [Errno -2] Name or service not known
I tried your code and my testing has found that the error is related to your bucket name of egp-shared-prod/egp-prod-c2c1/.
The Bucket Restrictions and Limitations documentation says:
Bucket names can contain lowercase letters, numbers, and hyphens.
Slashes are not permitted; they also seem to be upsetting the boto code.
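In other words, keep the bucket name slash-free and treat the rest of the path as a key prefix. A minimal sketch under that assumption (get_bucket is used because the bucket presumably already exists; the prefix handling is illustrative):

import boto

conn = boto.connect_s3('<aws access key>', '<aws secret key>')
bucket = conn.get_bucket('egp-shared-prod')
# The part after the first slash becomes a key prefix, not part of the name.
for key in bucket.list(prefix='egp-prod-c2c1/'):
    print(key.name)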
Boto (the official AWS Python bindings), which you are using, expects you to save your AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY in environment variables, like so:
export AWS_ACCESS_KEY_ID='AK123'
export AWS_SECRET_ACCESS_KEY='abc123'
You can also pass AWS credentials explicitly, like this:
import boto3

# Connection with S3:
s3 = boto3.resource(
    service_name='s3',
    region_name='us-east-1',
    aws_secret_access_key='',
    aws_access_key_id=''
)
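Since the goal is to read CSV files, here is a minimal follow-up sketch once the credentials resolve (the bucket and key names are hypothetical):

import boto3

s3 = boto3.resource('s3')
# Hypothetical bucket/key; get() returns the object with a streaming body.
obj = s3.Object('egp-shared-prod', 'egp-prod-c2c1/data.csv')
body = obj.get()['Body'].read().decode('utf-8')
for line in body.splitlines():
    print(line.split(','))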

using Google BigQuery through Python script

I want to perform some very simple tasks on BigQuery via a Python script. I found this package, which does not work well. Indeed, when I try this code:
from bigquery import get_client
project_id = 'txxxxxxxxxxxxxxxxxx9'
# Service account email address as listed in the Google Developers Console.
service_account = '7xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.apps.googleusercontent.com'
# PKCS12 or PEM key provided by Google.
key = '/home/fxxxxxxxxxxxx/Dropbox/access_keys/google_storage/xxxxxxxxxxxxxxxxxxxxx.pem'
client = get_client(project_id, service_account=service_account, private_key_file=key, readonly=True)
# Retrieve the table schema.
results = client.get_table_schema('newdataset', 'newtable2')
print(results)
I get this error:
/home/xxxxxx/anaconda3/envs/snakes/bin/python2.7 /home/xxxxxx/Dropbox/Prog/bigQuery_daily_import/src/main.py
Traceback (most recent call last):
File "/home/xxxxxx/Dropbox/Prog/bigQuery_daily_import/src/main.py", line 9, in <module>
client = get_client(project_id, service_account=service_account, private_key_file=key, readonly=True)
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/bigquery/client.py", line 83, in get_client
readonly=readonly)
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/bigquery/client.py", line 101, in _get_bq_service
service = build('bigquery', 'v2', http=http)
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/oauth2client/util.py", line 142, in positional_wrapper
return wrapped(*args, **kwargs)
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/googleapiclient/discovery.py", line 196, in build
cache)
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/googleapiclient/discovery.py", line 242, in _retrieve_discovery_doc
resp, content = http.request(actual_url)
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/oauth2client/client.py", line 565, in new_request
self._refresh(request_orig)
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/oauth2client/client.py", line 835, in _refresh
self._do_refresh_request(http_request)
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/oauth2client/client.py", line 862, in _do_refresh_request
body = self._generate_refresh_request_body()
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/oauth2client/client.py", line 1541, in _generate_refresh_request_body
assertion = self._generate_assertion()
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/oauth2client/client.py", line 1670, in _generate_assertion
private_key, self.private_key_password), payload)
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/oauth2client/_pycrypto_crypt.py", line 121, in from_string
pkey = RSA.importKey(parsed_pem_key)
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/Crypto/PublicKey/RSA.py", line 665, in importKey
return self._importKeyDER(der)
File "/home/xxxxxx/anaconda3/envs/snakes/lib/python2.7/site-packages/Crypto/PublicKey/RSA.py", line 588, in _importKeyDER
raise ValueError("RSA key format is not supported")
ValueError: RSA key format is not supported
Process finished with exit code 1
My question: is there a Python tutorial that shows how to work with BigQuery easily: importing a dataset from Google Storage or S3, querying something, and exporting the result to Google Storage?
A lot depends on your environment, and once you've figured that out, everything should be super simple. The only problem I see in the error log you pasted is authentication.
Python pandas has had support for BigQuery for a while:
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.io.gbq.read_gbq.html
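A minimal sketch of that helper, querying a BigQuery public sample table (the project id is illustrative; the helper handles authentication itself):

import pandas as pd

# Illustrative project id; the query hits a public sample table.
df = pd.io.gbq.read_gbq(
    'SELECT word, word_count FROM [publicdata:samples.shakespeare] LIMIT 10',
    project_id='my-project-id')
print(df.head())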
And I did a video with the creators of the module:
https://www.youtube.com/watch?v=gLeTDUMb7HY
Now, the simplest and fastest way these days to launch a Jupyter notebook with all of the Google Cloud goodies you mention is our new Google Datalab project:
https://cloud.google.com/datalab/
The only Datalab caveat is that it runs on cloud servers, but if you want a fully managed Jupyter/IPython environment that is secure, persistent, and ready to handle BigQuery, storage, etc., try it out.
Meanwhile, if you are writing a web application, look at how other web applications solve this task.
For example, re:dash code to connect to BigQuery:
https://github.com/EverythingMe/redash/blob/master/redash/query_runner/big_query.py
