How to get object from MinIO response? - python

I am using the Python API to save and download a model from MinIO, which is installed on my own server. The data is in binary format.
import io
import pickle

a = 'Hello world!'
a = pickle.dumps(a)
client.put_object(
    bucket_name='my_bucket',
    object_name='my_object',
    data=io.BytesIO(a),
    length=len(a)
)
I can see the object was saved through the command line:
mc cat origin/my_bucket/my_object
Hello world!
However, when I try to get it through the Python API:
response = client.get_object(
    bucket_name='my_bucket',
    object_name='my_object'
)
response is a urllib3.response.HTTPResponse object here.
I am trying to read it as:
response.read()
b''
I get an empty byte string. How can I read this object? It won't be possible for me to know its length at the time of reading.
Here is response.__dict__:
{'headers': HTTPHeaderDict({'Accept-Ranges': 'bytes', 'Content-Length': '27', 'Content-Security-Policy': 'block-all-mixed-content', 'Content-Type': 'application/octet-stream', 'ETag': '"75687-1"', 'Last-Modified': 'Fri, 16 Jul 2021 14:47:35 GMT', 'Server': 'MinIO/DEENT.T', 'Vary': 'Origin', 'X-Amz-Request-Id': '16924CCA35CD', 'X-Xss-Protection': '1; mode=block', 'Date': 'Fri, 16 Jul 2021 14:47:36 GMT'}), 'status': 200, 'version': 11, 'reason': 'OK', 'strict': 0, 'decode_content': True, 'retries': Retry(total=5, connect=None, read=None, redirect=None, status=None), 'enforce_content_length': False, 'auto_close': True, '_decoder': None, '_body': None, '_fp': <http.client.HTTPResponse object at 01e50>, '_original_response': <http.client.HTTPResponse object at 0x7e50>, '_fp_bytes_read': 0, 'msg': None, '_request_url': None, '_pool': <urllib3.connectionpool.HTTPConnectionPool object at 0x790>, '_connection': None, 'chunked': False, 'chunk_left': None, 'length_remaining': 27}

Try with response.data.decode()

The response is a urllib3.response.HTTPResponse object.
See the urllib3 documentation:
Backwards-compatible with http.client.HTTPResponse but the response body is loaded and decoded on-demand when the data property is accessed.
Specifically, you should read the response body like this:
response.data  # the object's bytes; len(response.data) gives its length
Or, if you want to stream the object, there are examples in the minio-py repository: examples/get_objects.
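For completeness, here is a minimal sketch combining both suggestions with the close/release pattern used in the minio-py examples (it assumes the same client, bucket, and object names as in the question):
import pickle

response = client.get_object('my_bucket', 'my_object')
try:
    obj = pickle.loads(response.data)   # 'Hello world!'
finally:
    # free the underlying urllib3 connection, as in the minio-py examples
    response.close()
    response.release_conn()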

SQS sending empty MessageBody

I'm trying to send a message to SQS via a Lambda, but the MessageBody never seems to arrive. Any ideas?
Sender:
import boto3

sqs = boto3.client('sqs')
queue_url = '<queue url>'

# send message to SQS
response = sqs.send_message(
    QueueUrl=queue_url,
    DelaySeconds=10,
    MessageAttributes={
        'Author': {
            'DataType': 'String',
            'StringValue': 'Mia'
        }
    },
    MessageBody=(
        "127.0.0.1"
    )
)
Receiver:
# receive message from SQS
sqs = boto3.client('sqs')
queue_url = '<queue url>'

response = sqs.receive_message(
    QueueUrl=queue_url,
    AttributeNames=[
        'SentTimestamp'
    ],
    MaxNumberOfMessages=1,
    MessageAttributeNames=[
        'All'
    ],
    VisibilityTimeout=0,
    WaitTimeSeconds=0
)

if 'Messages' in response:
    message = response['Messages'][0]
    receipt_handle = message['ReceiptHandle']

    # delete message from queue
    sqs.delete_message(
        QueueUrl=queue_url,
        ReceiptHandle=receipt_handle
    )
    return message
When I print the response variable from the sender I get:
{'MD5OfMessageBody': '4cb7e68d651d2dcac370949dd9d47c6e', 'MD5OfMessageAttributes': 'd260c4584b674fcbf00879268dd2a419', 'MessageId': '4a7d636b-4955-424b-8c62-b70c03bd7983', 'ResponseMetadata': {'RequestId': 'a902c587-9a6a-529d-a016-9fdfa8adc719', 'HTTPStatusCode': 200, 'HTTPHeaders': {'x-amzn-requestid': 'a902c587-9a6a-529d-a016-9fdfa8adc719', 'date': 'Wed, 12 Jan 2022 01:36:08 GMT', 'content-type': 'text/xml', 'content-length': '459'}, 'RetryAttempts': 0}}
and the receiver:
{'ResponseMetadata': {'RequestId': '718d7c1e-8cc4-5e83-9533-2646bbfa63d5', 'HTTPStatusCode': 200, 'HTTPHeaders': {'x-amzn-requestid': '718d7c1e-8cc4-5e83-9533-2646bbfa63d5', 'date': 'Wed, 12 Jan 2022 01:55:10 GMT', 'content-type': 'text/xml', 'content-length': '240'}, 'RetryAttempts': 0}}
Why is there no Messages attribute in the received response?
I had a similar problem. I would post one message to SQS, and I expected my Lambda to trigger, which would consume that one message by calling receive_message. But the message the Lambda consumed off SQS was always empty (no Messages or Body).
I realized that the SQS message was automagically taken off the queue and passed into my Lambda via the event parameter.
def lambda_handler(event, context):
So there was no need to call receive_message in my Lambda. Again, since AWS already took the message off the queue, calling receive_message would come up empty because the queue was empty.
I bet if you print the contents of event, you'll see the message you expect.
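For reference, a minimal sketch of reading the message straight from the event payload (the field names follow the standard SQS event record structure):
def lambda_handler(event, context):
    for record in event.get('Records', []):
        body = record['body']                           # the MessageBody sent by the producer
        attributes = record.get('messageAttributes', {})
        print(body, attributes)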

aws python lambda: reading csv file (iterator should return strings)

I'm getting this error when I try to test my Python 3.8 Lambda function:
Logs are:
soc-connect
contacts.csv
{'ResponseMetadata': {'RequestId': '9D7D7F0C5CB79984', 'HostId': 'wOd6HvIm+BpLOMKF2beRvqLiW0NQt5mK/kzjCjYxQ2kHQZY0MRCtGs3l/rqo4o0r4xAPuV1QpGM=', 'HTTPStatusCode': 200, 'HTTPHeaders': {'x-amz-id-2': 'wOd6HvIm+BpLOMKF2beRvqLiW0NQt5mK/kzjCjYxQ2kHQZY0MRCtGs3l/rqo4o0r4xAPuV1QpGM=', 'x-amz-request-id': '9D7D7F0C5CB79984', 'date': 'Thu, 26 Mar 2020 11:21:35 GMT', 'last-modified': 'Tue, 24 Mar 2020 16:07:30 GMT', 'etag': '"8a3785e750475af3ca25fa7eab159dab"', 'accept-ranges': 'bytes', 'content-type': 'text/csv', 'content-length': '52522', 'server': 'AmazonS3'}, 'RetryAttempts': 0}, 'AcceptRanges': 'bytes', 'LastModified': datetime.datetime(2020, 3, 24, 16, 7, 30, tzinfo=tzutc()), 'ContentLength': 52522, 'ETag': '"8a3785e750475af3ca25fa7eab159dab"', 'ContentType': 'text/csv', 'Metadata': {}, 'Body': <botocore.response.StreamingBody object at 0x7f858dc1e6d0>}
1153
<_csv.reader object at 0x7f858ea76970>
[ERROR] Error: iterator should return strings, not bytes (did you open the file in text mode?)
The code snippet is:
import boto3
import csv

def digest_csv(bucket_name, key_name):
    # Let's use Amazon S3
    s3 = boto3.client('s3')
    print(bucket_name)
    print(key_name)

    s3_object = s3.get_object(Bucket=bucket_name, Key=key_name)
    print(s3_object)

    # read the contents of the file and split it into a list of lines
    lines = s3_object['Body'].read().splitlines(True)
    print(len(lines))
    contacts = csv.reader(lines, delimiter=';')
    print(contacts)

    # now iterate over those contacts
    for contact in contacts:
        # here you get a sequence of dicts
        # do whatever you want with each line here
        print('-*-'.join(contact))
I think the problem is with csv.reader.
I'm passing an array of lines as the first parameter. Should it be modified?
Any ideas?
Instead of using csv.reader, the following worked for me (adjusted for your delimiter and variables):
for line in lines:
    contact = ''.join(line.decode().split(';'))
    print(contact)

Shopify API: delete discounts and price rules

I'm using the Shopify Python library to work with my store's orders, discounts, etc. I'm writing some tests that involve creating and deleting price rules and discounts.
def test_get_discount(self):
    random_number_string = str(randint(0, 10000))
    price_rule = shopify.PriceRule.create({
        'title': 'TEST-PRICERULE-' + random_number_string,
        'target_type': 'line_item',
        'target_selection': 'all',
        'allocation_method': 'across',
        'value_type': 'fixed_amount',
        'value': -1000,
        'once_per_customer': True,
        'customer_selection': 'all',
        'starts_at': datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    })
    discount = shopify.DiscountCode.create({
        'price_rule_id': price_rule.id,
        'code': 'TEST-DISCOUNT-' + random_number_string,
        'usage_count': 1,
        'value_type': 'fixed_amount',
        'value': -1000
    })
    fetched_price_rule = shopify_utils.get_discount_codes('TEST-PRICERULE-' + random_number_string)
    self.assertIsNotNone(fetched_price_rule)
    #deleted_discount = discount.destroy()
    #self.assertIsNone(deleted_discount)
    deleted_price_rule = price_rule.delete('discounts')
    self.assertIsNone(deleted_price_rule)
Everything up to deleting works properly. I'm also able to delete a discount with no problem, but when I try to delete a price rule, it errors out with the following:
deleted_price_rule = price_rule.delete()
TypeError: _instance_delete() missing 1 required positional argument: 'method_name'
I believe it's asking me for the nested resource name, so I've tried passing things like discounts, discount_code, etc., but no luck. I see that pyactiveresource is using that method_name to construct a URL for deleting the related resource, but I don't know what their API expects, as it's not very well documented for cases like this.
When I do specify a (wrong) method name, I get this error:
pyactiveresource.connection.ClientError: Response(code=406, body="b''", headers={'Server': 'nginx', 'Date': 'Sun, 10 Sep 2017 00:57:28 GMT', 'Content-Type': 'application/json', 'Transfer-Encoding': 'chunked', 'Connection': 'close', 'X-Sorting-Hat-PodId': '23', 'X-Sorting-Hat-PodId-Cached': '0', 'X-Sorting-Hat-ShopId': '13486411', 'X-Sorting-Hat-Section': 'pod', 'X-Sorting-Hat-ShopId-Cached': '0', 'Referrer-Policy': 'origin-when-cross-origin', 'X-Frame-Options': 'DENY', 'X-ShopId': '13486411', 'X-ShardId': '23', 'X-Shopify-Shop-Api-Call-Limit': '3/40', 'HTTP_X_SHOPIFY_SHOP_API_CALL_LIMIT': '3/40', 'X-Stats-UserId': '0', 'X-Stats-ApiClientId': '1807529', 'X-Stats-ApiPermissionId': '62125531', 'Strict-Transport-Security': 'max-age=7776000', 'Content-Security-Policy': "default-src 'self' data: blob: 'unsafe-inline' 'unsafe-eval' https://* shopify-pos://*; block-all-mixed-content; child-src 'self' https://* shopify-pos://*; connect-src 'self' wss://* https://*; script-src https://cdn.shopify.com https://checkout.shopifycs.com https://js-agent.newrelic.com https://bam.nr-data.net https://dme0ih8comzn4.cloudfront.net https://api.stripe.com https://mpsnare.iesnare.com https://appcenter.intuit.com https://www.paypal.com https://maps.googleapis.com https://stats.g.doubleclick.net https://www.google-analytics.com https://visitors.shopify.com https://v.shopify.com https://widget.intercom.io https://js.intercomcdn.com 'self' 'unsafe-inline' 'unsafe-eval'; upgrade-insecure-requests; report-uri /csp-report?source%5Baction%5D=error_404&source%5Bapp%5D=Shopify&source%5Bcontroller%5D=admin%2Ferrors&source%5Bsection%5D=admin_api&source%5Buuid%5D=72550a71-d55b-4fb2-961f-f0447c1d04c6", 'X-Content-Type-Options': 'nosniff', 'X-Download-Options': 'noopen', 'X-Permitted-Cross-Domain-Policies': 'none', 'X-XSS-Protection': '1; mode=block; report=/xss-report?source%5Baction%5D=error_404&source%5Bapp%5D=Shopify&source%5Bcontroller%5D=admin%2Ferrors&source%5Bsection%5D=admin_api&source%5Buuid%5D=72550a71-d55b-4fb2-961f-f0447c1d04c6', 'X-Dc': 'ash,chi2', 'X-Request-ID': '72550a71-d55b-4fb2-961f-f0447c1d04c6'}, msg="Not Acceptable")
Any ideas hugely appreciated!
As @Tim mentioned in his response, the destroy method worked for me and successfully deleted all child instances as well as the parent instance.
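A minimal sketch of that cleanup, assuming the price_rule object created in the test above:
# destroy() on the parent price rule also removed its nested discount codes
price_rule.destroy()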

boto3 request_spot_instances method does not work with "UserData"

I'm trying to make an AWS EC2 spot instance request via AWS Lambda, using boto3 to call the EC2 API.
I can create the spot instance, but the "UserData" param is not working.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import boto3
import json
import logging
import base64
import os
from pathlib import Path

def request_spot_instance(access_key, secret_key):
    ec2_client = boto3.client('ec2',
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
        region_name='ap-northeast-1'
    )
    response = ec2_client.request_spot_instances(
        SpotPrice='0.2',
        Type='one-time',
        LaunchSpecification={
            'ImageId': 'ami-be4a24d9',
            'KeyName': 'my_key',
            'InstanceType': 'c4.large',
            'UserData': base64.encodestring(u'#!/bin/sh \n touch foo.txt'.encode('utf-8')).decode('ascii'),
            'Placement': {},
            'SecurityGroupIds': [
                'sg-6bd2780c'
            ]
        }
    )
    return response

def lambda_handler(event, context):
    response = request_spot_instance(os.environ.get('AWS_ACCESS_KEY_ID'), os.environ.get('AWS_SECRET_ACCESS_KEY'))
    print(response)
    return event, context

if __name__ == "__main__":
    lambda_handler("", "")
I'm using this code with Python 2.7.11.
The response of the request method is:
{u'SpotInstanceRequests': [{u'Status': {u'UpdateTime': datetime.datetime(2016, 12, 23, 6, 11, 8, tzinfo=tzutc()), u'Code': 'pending-evaluation', u'Message': 'Your Spot request has been submitted for review, and is pending evaluation.'}, u'ProductDescription': 'Linux/UNIX', u'SpotInstanceRequestId': 'sir-cerib8cg', u'State': 'open', u'LaunchSpecification': {u'Placement': {u'AvailabilityZone': 'ap-northeast-1c'}, u'ImageId': 'ami-be4a24d9', u'KeyName': 'miz_private_key', u'SecurityGroups': [{u'GroupName': 'ssh only', u'GroupId': 'sg-6bd2780c'}], u'SubnetId': 'subnet-32c6626a', u'Monitoring': {u'Enabled': False}, u'InstanceType': 'c4.large'}, u'Type': 'one-time', u'CreateTime': datetime.datetime(2016, 12, 23, 6, 11, 8, tzinfo=tzutc()), u'SpotPrice': '0.200000'}], 'ResponseMetadata': {'RetryAttempts': 0, 'HTTPStatusCode': 200, 'RequestId': '285867eb-9303-4a6d-83fa-5ccfdadd482f', 'HTTPHeaders': {'transfer-encoding': 'chunked', 'vary': 'Accept-Encoding', 'server': 'AmazonEC2', 'content-type': 'text/xml;charset=UTF-8', 'date': 'Fri, 23 Dec 2016 06:11:08 GMT'}}}
This response does not include "UserData", which seems to differ from what is described in the boto3 manual.
I suspect the UserData param is not being accepted.
Any suggestions are welcome.
Self-resolved.
The user data shell script was executed in the root "/" directory, and I simply could not find the generated file.
$ ls -al /foo.txt
-rw-r--r-- 1 root root 0 Dec 23 07:03 /foo.txt
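For reference, a hedged tweak to the UserData value above: writing to an absolute path makes the generated file easier to locate, whatever the working directory is at boot (the /tmp path is just an example):
import base64

user_data = '#!/bin/sh\ntouch /tmp/foo.txt\n'
encoded_user_data = base64.b64encode(user_data.encode('utf-8')).decode('ascii')
# pass encoded_user_data as the 'UserData' value in LaunchSpecification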

AWS S3 image saving loses metadata

I am working with an AWS Lambda function written in Python 2.7.x which downloads an image file, saves it to /tmp, then uploads it back to the bucket.
My image metadata starts out in the original bucket with HTTP headers like Content-Type = image/jpeg, among others.
After saving my image with PIL, all headers are gone and I am left with Content-Type = binary/octet-stream.
From what I can tell, image.save is losing the headers due to the way PIL works. How do I either preserve the metadata or at least apply it to the newly saved image?
I have seen posts suggesting that this metadata is in EXIF, but I tried to get EXIF info from the original file and apply it to the saved file with no luck. I am not clear that it's in EXIF data anyway.
Partial code to give an idea of what I am doing:
def resize_image(image_path):
    with Image.open(image_path) as image:
        image.save(upload_path, optimize=True)

def handler(event, context):
    global upload_path
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        key = urllib.unquote_plus(event['Records'][0]['s3']['object']['key'].encode("utf8"))
        download_path = '/tmp/{}{}'.format(uuid.uuid4(), file_name)
        upload_path = '/tmp/resized-{}'.format(file_name)
        s3_client.download_file(bucket, key, download_path)
        resize_image(download_path)
        s3_client.upload_file(upload_path, '{}resized'.format(bucket), key)
Thanks to Sergey, I changed to using get_object, but the response is missing Metadata:
response = s3_client.get_object(Bucket=bucket,Key=key)
response= {u'Body': , u'AcceptRanges': 'bytes', u'ContentType': 'image/jpeg', 'ResponseMetadata': {'HTTPStatusCode': 200, 'RetryAttempts': 0, 'HostId': 'au30hBMN37/ti0WCfDqlb3t9ehainumc9onVYWgu+CsrHtvG0u/zmgcOIvCCBKZgQrGoooZoW9o=', 'RequestId': '1A94D7F01914A787', 'HTTPHeaders': {'content-length': '84053', 'x-amz-id-2': 'au30hBMN37/ti0WCfDqlb3t9ehainumc9onVYWgu+CsrHtvG0u/zmgcOIvCCBKZgQrGoooZoW9o=', 'accept-ranges': 'bytes', 'expires': 'Sun, 01 Jan 2034 00:00:00 GMT', 'server': 'AmazonS3', 'last-modified': 'Fri, 23 Dec 2016 15:21:56 GMT', 'x-amz-request-id': '1A94D7F01914A787', 'etag': '"9ba59e5457da0dc40357f2b53715619d"', 'cache-control': 'max-age=2592000,public', 'date': 'Fri, 23 Dec 2016 15:21:58 GMT', 'content-type': 'image/jpeg'}}, u'LastModified': datetime.datetime(2016, 12, 23, 15, 21, 56, tzinfo=tzutc()), u'ContentLength': 84053, u'Expires': datetime.datetime(2034, 1, 1, 0, 0, tzinfo=tzutc()), u'ETag': '"9ba59e5457da0dc40357f2b53715619d"', u'CacheControl': 'max-age=2592000,public', u'Metadata': {}}
If I use:
metadata = response['ResponseMetadata']['HTTPHeaders']
metadata = {'content-length': '84053', 'x-amz-id-2': 'f5UAhWzx7lulo3cMVF8hdVRbHnhdnjHWRDl+LDFkYm9pubjL0A01L5yWjgDjWRE4TjRnjqDeA0U=', 'accept-ranges': 'bytes', 'expires': 'Sun, 01 Jan 2034 00:00:00 GMT', 'server': 'AmazonS3', 'last-modified': 'Fri, 23 Dec 2016 15:47:09 GMT', 'x-amz-request-id': '4C69DF8A58EF3380', 'etag': '"9ba59e5457da0dc40357f2b53715619d"', 'cache-control': 'max-age=2592000,public', 'date': 'Fri, 23 Dec 2016 15:47:10 GMT', 'content-type': 'image/jpeg'}
Saving with put_object:
s3_client.put_object(Bucket=bucket+'resized',Key=key, Metadata=metadata, Body=downloadfile)
creates a whole lot of extra metadata in S3; it does not save the content type as image/jpeg but rather as binary/octet-stream, and it creates a metadata entry x-amz-meta-content-type = image/jpeg.
You are confusing S3 metadata, stored by AWS S3 along with an object, and EXIF metadata, stored inside the file itself.
download_file() doesn't get object attributes from S3. You should use get_object() instead: https://boto3.readthedocs.io/en/latest/reference/services/s3.html#S3.Client.get_object
Then you can use put_object() with the same attributes to upload the new file: https://boto3.readthedocs.io/en/latest/reference/services/s3.html#S3.Client.put_object
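A minimal sketch of that get_object/put_object round trip, reusing the s3_client, bucket, and key from the question and copying the content type and user metadata across:
response = s3_client.get_object(Bucket=bucket, Key=key)
body = response['Body'].read()

s3_client.put_object(
    Bucket=bucket + 'resized',
    Key=key,
    Body=body,
    ContentType=response['ContentType'],    # e.g. 'image/jpeg'
    Metadata=response.get('Metadata', {})   # user-defined x-amz-meta-* values only
)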
Content type information is not stored in the file you upload; it has to be guessed or extracted somehow. This is something you must do manually or with tools. With a fairly small dictionary you can guess most file types.
When you upload a file or object, you have the chance to specify its content type. Otherwise S3 defaults to application/octet-stream.
Using the boto3 Python package, for instance:
s3client.upload_file(
    Filename=local_path,
    Bucket=bucket,
    Key=remote_path,
    ExtraArgs={
        "ContentType": "image/jpeg"
    }
)
