I want to upload an image to S3 through Lambda and API Gateway when I submit a form. How can I do it in Python?
Currently I am getting this error while trying to upload the image through Postman:
Could not parse request body into json: Could not parse payload into json: Unexpected character ('-' (code 45))
My code currently is:
import json
import boto3
import base64
s3 = boto3.client('s3')
def lambda_handler(event, context):
    print(event)
    try:
        if event['httpMethod'] == 'POST':
            print(event['body'])
            data = json.loads(event['body'])
            name = data['name']
            image = data['file']
            # strip the data URL prefix (e.g. "data:image/png;base64,")
            image = image[image.find(",") + 1:]
            dec = base64.b64decode(image + "===")
            s3.put_object(Bucket='', Key="", Body=dec)
            return {'statusCode': 200, 'body': json.dumps({'message': 'successful lambda function call'}), 'headers': {'Access-Control-Allow-Origin': '*'}}
    except Exception as e:
        return {
            'body': json.dumps(str(e))
        }
Doing an upload through API Gateway and Lambda has its limitations: you cannot handle large files, and there is an execution timeout of 30 seconds, as I recall.
I would go with creating a presigned URL that the client requests through API Gateway, then using it as the endpoint to PUT the file.
Something like this will go in your Lambda (this is a Node.js example):
const uploadUrl = S3.getSignedUrl('putObject', {
  Bucket: get(aPicture, 'Bucket'),
  Key: get(aPicture, 'Key'),
  Expires: 600,
})
callback(null, { url: uploadUrl })
(NodeJs)
https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#getSignedUrl-property
(Python)
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Client.generate_presigned_url
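For reference, a minimal Python (boto3) sketch of the same idea using generate_presigned_url; the bucket name, key, and expiry below are placeholder assumptions:

import json
import boto3

s3 = boto3.client('s3')

def lambda_handler(event, context):
    # Generate a presigned URL the client can use to PUT the file directly to S3.
    upload_url = s3.generate_presigned_url(
        ClientMethod='put_object',
        Params={'Bucket': 'my-bucket', 'Key': 'uploads/my-image.png'},  # placeholders
        ExpiresIn=600,  # URL stays valid for 10 minutes
    )
    return {
        'statusCode': 200,
        'headers': {'Access-Control-Allow-Origin': '*'},
        'body': json.dumps({'url': upload_url}),
    }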
You don't need Lambda for this. You can proxy S3 API with API Gateway
https://docs.aws.amazon.com/apigateway/latest/developerguide/integrating-api-with-aws-services-s3.html
Oops, that's over-engineering.
Anyway, it seems like you are getting the error from API Gateway. First check the Lambda with a test invocation from the AWS console; if it works fine and you get the response back from the Lambda, then look at the API Gateway side.
I suspect you are using mapping templates in the gateway. AWS uses Velocity templates, which look like JSON but are different; a mapping template on the integration request is what causes this kind of issue.
Related
I created a Lambda function. In it, I scan all the S3 buckets whose names contain the word "log", take those buckets, and apply a lifecycle rule: if objects are older than 30 days, delete every kind of file in those buckets. But when I trigger the code I get the format error below, and I couldn't fix it. Is my code or its format wrong?
This is my Python code:
import logging
import boto3
import datetime, os, json
from botocore.exceptions import ClientError

def lambda_handler(event, context):
    s3 = boto3.client('s3')
    response = s3.list_buckets()

    # Output the bucket names
    print('Existing buckets:')
    for bucket in response['Buckets']:
        #print(f' {bucket["Name"]}')
        if 'log' in bucket['Name']:
            BUCKET = bucket["Name"]
            print(BUCKET)
            try:
                policy_status = s3.put_bucket_lifecycle_configuration(
                    Bucket=BUCKET,
                    LifecycleConfiguration={'Rules': [{'Expiration': {'Days': 30, 'ExpiredObjectDeleteMarker': True}, 'Status': 'Enabled'}]})
            except ClientError as e:
                print("Unable to apply bucket policy. \nReason:{0}".format(e))
The error
Existing buckets:
sample-log-case
sample-bucket-log-v1
template-log-v3-mv
Unable to apply bucket policy.
Reason:An error occurred (MalformedXML) when calling the PutBucketLifecycleConfiguration operation: The XML you provided was not well-formed or did not validate against our published schema
I took the example below from the boto3 site; I just changed the parameters and didn't touch the format, but I get the same format issue with it too:
import boto3
client = boto3.client('s3')
response = client.put_bucket_lifecycle_configuration(
    Bucket='bucket-sample',
    LifecycleConfiguration={
        'Rules': [
            {
                'Expiration': {
                    'Days': 3650,
                },
                'Status': 'Enabled'
            },
        ],
    },
)
print(response)
According to the documentation, the 'Filter' rule is required if the LifecycleRule does not contain a 'Prefix' element (the 'Prefix' element is no longer used). 'Filter', in turn, requires exactly one of 'Prefix', 'Tag' or 'And'.
Adding the following to your 'Rules' in LifecycleConfiguration will solve the problem:
'Filter': {'Prefix': ''}
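Applied to the boto3 example above, the corrected call would look roughly like this ('bucket-sample' is still a placeholder bucket name):

import boto3

client = boto3.client('s3')
response = client.put_bucket_lifecycle_configuration(
    Bucket='bucket-sample',
    LifecycleConfiguration={
        'Rules': [
            {
                # An empty Prefix filter applies the rule to every object in the bucket.
                'Filter': {'Prefix': ''},
                'Expiration': {'Days': 3650},
                'Status': 'Enabled',
            },
        ],
    },
)
print(response)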
I'm calling this Lambda function via API Gateway. My issue is that the image file is malformed, meaning that it does not open.
import boto3
import json
def lambda_handler(event, context):
    print(event)
    # removing all the data around the packet
    # this also results in a malformed png
    start = '<?xpacket end="r"?>'
    end = '\r\n------'
    content = str(event['body'])
    content = content[content.index(start) + len(start):content.index(end)].encode('utf-8')
    bucket_name = "bucket-name"
    file_name = "hello1.png"
    lambda_path = "/tmp/" + file_name
    s3_path = file_name
    s3 = boto3.resource("s3")
    s3.Bucket(bucket_name).put_object(Key=s3_path, Body=content)
    return {
        'statusCode': 200,
        'headers': {
            'Access-Control-Allow-Origin': '*',
        },
        'body': json.dumps(event)
    }
Lambda has a payload limit of 6 MB for synchronous invocations and 256 KB for asynchronous invocations:
https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-limits.html
API Gateway also has a limit of 10 MB for REST APIs and 128 KB for WebSocket messages:
https://docs.aws.amazon.com/apigateway/latest/developerguide/limits.html
That may be the primary reason why part of the file is uploaded and part is not.
Even if you do not hit those limits with smaller files, you pay for Lambda execution time while uploading. It is just a waste of Lambda's time.
There may also be configuration on API Gateway that modifies the payload before pushing it to Lambda. Make sure there is no active mapping template converting the request before it hits Lambda, and check whether "Use Lambda Proxy integration" is enabled for this resource in the API Gateway console.
To upload to S3, it is better to use a pre-signed URL for an Amazon S3 PUT operation:
https://docs.aws.amazon.com/sdk-for-go/v1/developer-guide/s3-example-presigned-urls.html
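As a rough sketch of the client side, assuming the API returns a presigned URL in a 'url' field as in the earlier example (the endpoint URL and file name below are placeholders):

import requests

# Ask the API for a presigned upload URL.
resp = requests.post('https://example.execute-api.us-east-1.amazonaws.com/prod/upload-url')
upload_url = resp.json()['url']

# PUT the file bytes straight to S3, bypassing the Lambda/API Gateway payload limits.
with open('my-image.png', 'rb') as f:
    requests.put(upload_url, data=f)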
So I have this sort of HTTP Cloud Function (Python 3.7):
from google.cloud import storage
def upload_blob(bucket_name, blob_text, destination_blob_name):
    """Uploads a file to the bucket."""
    storage_client = storage.Client()
    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.blob(destination_blob_name)
    blob.upload_from_string(blob_text)
    print('File {} uploaded to {}.'.format(
        source_file_name,
        destination_blob_name))
Now I want to test this function from the Testing tab of the function's details page, where it asks for the triggering event in JSON format.
I tried various formats like
{
"name":[param1,param2,param3]
}
but I always end up getting the error
upload_blob() missing 2 required positional arguments: 'blob_text' and 'destination_blob_name'
I also tried to find documentation but couldn't; can you point me in the right direction?
Your upload_blob function doesn't look like an HTTP Cloud Function. HTTP Cloud Functions take a single parameter, request. For example:
def hello_http(request):
...
See https://cloud.google.com/functions/docs/writing/http#writing_http_helloworld-python for more details.
As suggested by Dustin, you should add an entry point which parses the JSON request.
A code snippet for the following request would be as follows:
{"bucket_name": "xyz", "blob_text": "abcdfg", "destination_blob_name": "sample.txt"}
The entry function will be as follows:
def entry_point_function(request):
    content_type = request.headers['content-type']
    if 'application/json' in content_type:
        request_json = request.get_json(silent=True)
        if request_json and 'bucket_name' in request_json:
            bucket_name = request_json['bucket_name']
            blob_text = request_json['blob_text']
            destination_blob_name = request_json['destination_blob_name']
            # delegate to the original helper
            upload_blob(bucket_name, blob_text, destination_blob_name)
            return 'OK'
        else:
            raise ValueError("JSON is invalid, or missing a 'bucket_name' property")
Then rename the entry point in the function configuration: in the Cloud Console UI, you need to change the "Entry point" field so it points to entry_point_function instead of upload_blob.
Context: I have some log files in an S3 bucket that I need to retrieve. The permissions on the bucket prevent me from downloading them directly from the S3 bucket console. I need a "backdoor" approach to retrieve the files. I have an API Gateway setup to hit a Lambda, which will figure out what files to retrieve and get them from the S3 bucket. However, the files are over 10 MB and the AWS API Gateway has a maximum payload size of 10 MB. Now, I need a way to compress the files and serve them to the client as a downloadable zip.
import json
import boto3
import zipfile
import zlib
import os
S3 = boto3.resource('s3')
BUCKET = S3.Bucket(name="my-bucket")
TEN_MEGA_BYTES = 10000000

def lambda_handler(event, context):
    # utilize Lambda's temporary storage (512 MB)
    retrieved = zipfile.ZipFile("/tmp/retrieved.zip", mode="w", compression=zipfile.ZIP_DEFLATED, compresslevel=9)
    for bucket_obj in BUCKET.objects.all():
        # logic to decide which file I want is done here
        log_file_obj = BUCKET.Object(bucket_obj.key).get()
        # get the object's binary encoded content (a bytes object)
        content = log_file_obj["Body"].read()
        # write the content to a file within the zip
        # writestr() requires a bytes or str object
        retrieved.writestr(bucket_obj.key, content)
    # close the zip
    retrieved.close()
    # visually checking zip size for debugging
    zip_size = os.path.getsize("/tmp/retrieved.zip")
    print("{} bytes".format(zip_size), "{} percent of 10 MB".format(zip_size / TEN_MEGA_BYTES * 100))
    return {
        "header": {
            "contentType": "application/zip, application/octet-stream",
            "contentDisposition": "attachment, filename=retrieved.zip",
            "contentEncoding": "deflate"
        },
        "body": retrieved
    }
    # return retrieved
I have tried returning the zipfile object directly and within a JSON structure with headers that are supposed to be mapped from the integration response to the method response (i.e. the headers I'm setting programmatically in the Lambda should be mapped to the response headers that the client actually receives). In either case, I get a marshal error.
Response:
{
"errorMessage": "Unable to marshal response: <zipfile.ZipFile [closed]> is not JSON serializable",
"errorType": "Runtime.MarshalError"
}
I have done a lot of tinkering in the API Gateway in the AWS Console trying to set different combinations of headers and/or content types, but I am stuck. I'm not entirely sure what would be the correct header/content-type combination.
From the above error message, it appears like Lambda can only return JSON structures but I am unable to confirm either way.
I got to the point of being able to return a compressed JSON payload rather than a downloadable .zip file, but that is good enough for the time being. My team and I are looking into requesting a time-limited pre-signed URL for the S3 bucket, which would let us bypass the permissions for a limited time and access the files directly. Even with compression, we expect to someday reach the point where the compressed payload is too large for API Gateway to handle. We also discovered that Lambda has an even smaller payload limit than API Gateway: 6 MB for synchronous invocations (see Lambda Limits).
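As a rough sketch of that pre-signed URL approach for downloads (the bucket name and key are placeholders), the Lambda could return something like:

import json
import boto3

s3 = boto3.client('s3')

def lambda_handler(event, context):
    # Generate a time-limited URL the client can use to download the object
    # directly from S3, bypassing the API Gateway / Lambda payload limits.
    download_url = s3.generate_presigned_url(
        ClientMethod='get_object',
        Params={'Bucket': 'my-bucket', 'Key': 'logs/app.log'},  # placeholders
        ExpiresIn=300,  # valid for 5 minutes
    )
    return {
        'statusCode': 200,
        'headers': {'Content-Type': 'application/json'},
        'body': json.dumps({'url': download_url}),
    }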
Solution:
This article was a huge help: Lambda Compression. The main thing that was missing was Lambda proxy integration. When configuring the API Gateway integration request, choose "Use Lambda Proxy Integration." That limits your ability to use mapping templates on your method request and means you have to return a specific structure from your Lambda function. Also, ensure content encoding is enabled in the API Gateway settings, with 'application/json' specified as an acceptable binary media type.
Then, when sending the request, use these headers:
Accept-Encoding: application/gzip,
Accept: application/json
import json
import boto3
import gzip
import base64
from io import BytesIO
S3 = boto3.resource('s3')
BUCKET = S3.Bucket(name="my-bucket")
def gzip_b64encode(data):
    compressed = BytesIO()
    with gzip.GzipFile(fileobj=compressed, mode='w') as f:
        json_response = json.dumps(data)
        f.write(json_response.encode('utf-8'))
    return base64.b64encode(compressed.getvalue()).decode('ascii')

def lambda_handler(event, context):
    # event body is a JSON string; this is the request body
    req_body = json.loads(event["body"])
    retrieved_data = {}
    for bucket_obj in BUCKET.objects.all():
        # logic to decide what file to grab based on request body done here
        log_file_obj = BUCKET.Object(bucket_obj.key).get()
        content = log_file_obj["Body"].read().decode("UTF-8")
        retrieved_data[bucket_obj.key] = json.loads(content)
    # integration response format
    return {
        "isBase64Encoded": True,
        "statusCode": 200,
        "headers": {
            "Content-Type": "application/json",
            "Content-Encoding": "gzip",
            "Access-Control-Allow-Origin": "*"
        },
        "body": gzip_b64encode(retrieved_data)
    }
I'm trying to figure out how to receive a file sent by a browser through an API call in Python.
The web client is allowed to send any type of file (say .txt, .docx, .xlsx, ...). I don't know if I should use binary or not.
The idea is then to save the file to S3. I know it's possible to use JS libraries like AWS Amplify to generate a temporary URL, but I'm not too interested in that solution.
Any help appreciated; I've searched extensively for a Python solution but can't find anything that actually works!
My API is private and I'm using Serverless to deploy:
files_post:
  handler: post/post.post
  events:
    - http:
        path: files
        method: post
        cors: true
        authorizer:
          name: authorizer
          arn: ${cf:lCognito.CognitoUserPoolMyUserPool}
EDIT
I have a half solution that works for text files but not for PDF, XLSX, or images; if someone has one, I'd be super happy.
from cgi import parse_header, parse_multipart
from io import BytesIO
import json
import boto3

def post(event, context):
    print event['queryStringParameters']['filename']
    c_type, c_data = parse_header(event['headers']['content-type'])
    c_data['boundary'] = bytes(c_data['boundary']).encode("utf-8")
    body_file = BytesIO(bytes(event['body']).encode("utf-8"))
    form_data = parse_multipart(body_file, c_data)
    s3 = boto3.resource('s3')
    object = s3.Object('storage', event['queryStringParameters']['filename'])
    object.put(Body=form_data['upload'][0])
You are using API Gateway so your lambda event will map to something like this (from Amazon Docs):
{
    "resource": "Resource path",
    "path": "Path parameter",
    "httpMethod": "Incoming request's method name",
    "headers": {String containing incoming request headers},
    "multiValueHeaders": {List of strings containing incoming request headers},
    "queryStringParameters": {query string parameters},
    "multiValueQueryStringParameters": {List of query string parameters},
    "pathParameters": {path parameters},
    "stageVariables": {Applicable stage variables},
    "requestContext": {Request context, including authorizer-returned key-value pairs},
    "body": "A JSON string of the request payload.",
    "isBase64Encoded": "A boolean flag to indicate if the applicable request payload is Base64-encoded"
}
You can pass the file as a base64 value in the body and decode it in your Lambda function. Take the following Python snippet:

import json
import base64
import boto3

s3 = boto3.client('s3')

def lambda_handler(event, context):
    data = json.loads(event['body'])
    # Let's say we use a regular <input type='file' name='uploaded_file'/>
    encoded_file = data['uploaded_file']
    decoded_file = base64.b64decode(encoded_file)
    # now save it to S3 ('my-bucket' and the key name are placeholders)
    s3.put_object(Bucket='my-bucket', Key='uploaded_file', Body=decoded_file)
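For completeness, a rough sketch of the matching client side in Python; the endpoint URL, file name, and field name here are assumptions matching the snippet above:

import base64
import json
import requests

# Read the file and base64-encode it so it survives the JSON body.
with open('report.xlsx', 'rb') as f:
    encoded = base64.b64encode(f.read()).decode('ascii')

# POST it to the API Gateway endpoint (URL is a placeholder).
resp = requests.post(
    'https://example.execute-api.eu-west-1.amazonaws.com/dev/files',
    data=json.dumps({'uploaded_file': encoded}),
    headers={'Content-Type': 'application/json'},
)
print(resp.status_code)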