How to write Airflow error logs into an S3 bucket using Python

When my Airflow DAG fails, the error logs are written to a path like
"/home/ec2-user/software/airflow/logs/dagtest_dag/trigger_load/2019-10-10T06:01:33.342433+00:00/1.log"
How can I push these logs to an S3 bucket?

Configure this as a cron job:
import boto3

s3 = boto3.client('s3')  # make sure your client is authenticated

with open('path/to/your/logs.log', 'rb') as data:
    s3.upload_fileobj(data, 'bucketname', 'path/to/your/logs/in/s3.log')
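If you want to ship the whole Airflow log tree rather than a single file, a minimal sketch along the same lines could walk the directory and mirror it into S3; the log root is taken from the question, while the bucket name is a placeholder:
import os
import boto3

LOG_ROOT = '/home/ec2-user/software/airflow/logs'  # local Airflow log directory
BUCKET = 'my-airflow-logs'                         # hypothetical bucket name

s3 = boto3.client('s3')

for dirpath, _, filenames in os.walk(LOG_ROOT):
    for name in filenames:
        if not name.endswith('.log'):
            continue
        local_path = os.path.join(dirpath, name)
        # keep the same relative layout (dag_id/task_id/execution_date/try.log) as the S3 key
        key = os.path.relpath(local_path, LOG_ROOT)
        s3.upload_file(local_path, BUCKET, key)
Note that Airflow also supports shipping task logs to S3 natively through its remote logging settings, which can replace the cron approach entirely.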

Related

S3 AWS Session Expired

I am new to AWS S3. I did a lot of googling around how to connect to S3 using Python and found that everyone uses Boto, so that's what I'm using as the client. I used PowerShell to log in and create the .aws/credentials file. In that file I was able to get the aws_access_key_id, aws_secret_access_key AND the aws_session_token needed to establish the session. I understand that the session is only valid for about 8 hours, so the next day, when my Python script runs to connect to S3, the session has obviously expired. How can I overcome this and establish a new session daily? Below is my code.
import boto3
from io import StringIO

s3_client = boto3.client(
    "s3",
    aws_access_key_id=id_,
    aws_secret_access_key=secret,
    aws_session_token=token,
    region_name='r'
)

# Test it on a service (yours may be different)
# s3 = session.resource('s3')
# for bucket in s3.buckets.all():
#     print(bucket.name)

bucket = 'automated-reports'  # already created on S3
csv_buffer = StringIO()
all_active_scraper_counts_df.to_csv(csv_buffer, index=False)
# s3_resource = boto3.resource('s3')
put_response = s3_client.put_object(Bucket=bucket, Key="all_active_scrapers.csv", Body=csv_buffer.getvalue())
status = put_response.get("ResponseMetadata", {}).get("HTTPStatusCode")

if status == 200:
    print(f"Successful S3 put_object response. Status - {status}")
else:
    print(f"Unsuccessful S3 put_object response. Status - {status}")
You have three options:
1. Run a script that refreshes the credentials before your Python script runs. You can use the AWS CLI's sts assume-role command to obtain a new set of temporary credentials.
2. Add a try/except block to your code that handles the expired-credentials error, generates new credentials, and re-initializes the S3 client (see the sketch after this list).
3. Instead of using a role, use an IAM user. User credentials can be valid indefinitely, so you won't need to refresh them.
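As an illustration of the second option, here is a minimal sketch that catches an expired-token failure and rebuilds the client from a fresh sts assume-role call. The role ARN, session name, and the exact error codes checked are assumptions you would adapt to your setup:
import boto3
from botocore.exceptions import ClientError

ROLE_ARN = 'arn:aws:iam::123456789012:role/my-s3-role'  # hypothetical role ARN

def fresh_s3_client():
    # ask STS for new temporary credentials and build an S3 client from them
    creds = boto3.client('sts').assume_role(
        RoleArn=ROLE_ARN, RoleSessionName='daily-report'
    )['Credentials']
    return boto3.client(
        's3',
        aws_access_key_id=creds['AccessKeyId'],
        aws_secret_access_key=creds['SecretAccessKey'],
        aws_session_token=creds['SessionToken'],
    )

s3_client = fresh_s3_client()
try:
    s3_client.put_object(Bucket='automated-reports', Key='all_active_scrapers.csv', Body='...')
except ClientError as err:
    if err.response['Error']['Code'] in ('ExpiredToken', 'ExpiredTokenException'):
        # credentials went stale: rebuild the client and retry once
        s3_client = fresh_s3_client()
        s3_client.put_object(Bucket='automated-reports', Key='all_active_scrapers.csv', Body='...')
    else:
        raise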

No Credentials Error - Using boto3 and aws s3 bucket

I am using Python and a Jupyter notebook to read files from an AWS S3 bucket, and I am getting the error 'NoCredentialsError: Unable to locate credentials' when running the following code:
import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('my-bucket')
for obj in bucket.objects.all():
    key = obj.key
    body = obj.get()['Body'].read()
I believe I need to put my access key somewhere, but I am not sure where. Thank you!
If you have the AWS CLI installed, just run a simple aws configure; boto3 will pick the credentials up from ~/.aws/credentials automatically, and you will be good to go. If you prefer not to rely on the shared credentials file, you can also pass the keys explicitly, as in the sketch below.
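For completeness, a minimal sketch of the explicit alternative; the key values and region here are placeholders, and in practice you would read them from environment variables or a secrets store rather than hard-coding them:
import boto3

# hypothetical placeholder credentials; never commit real keys to source control
session = boto3.Session(
    aws_access_key_id='AKIA...YOUR_KEY_ID',
    aws_secret_access_key='YOUR_SECRET_KEY',
    region_name='us-east-1',
)

s3 = session.resource('s3')
bucket = s3.Bucket('my-bucket')
for obj in bucket.objects.all():
    body = obj.get()['Body'].read()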

using boto3 to get file list and download files

I was given this S3 URL: s3://file.share.external.bdex.com/Offrs
This URL contains a batch of files I need to download.
I have this code:
import boto3

s3_client = boto3.client(
    's3',
    aws_access_key_id='<<ACCESS KEY>>',
    aws_secret_access_key='<<SECRET_ACCESS_KEY>>'
)
object_listing = s3_client.list_objects_v2(
    Bucket='file.share.external.bdex.com/Offrs',
    Prefix=''
)
print(object_listing)
I have tried:
Bucket='file.share.external.bdex.com', Prefix='Offrs'
Bucket='s3://file.share.external.bdex.com/Offrs/'
Bucket='file.share.external.bdx.com/Offrs', Prefix='Offrs'
and several other combinations, all of which fail with either a bucket-name regex error (because of the slash) or a "not found" error.
What am I missing?
Thank you.
Use the bucket name and the prefix separately:
Bucket='file.share.external.bdex.com'
Prefix='Offrs/'
You can test your access permissions via the AWS CLI:
aws s3 ls s3://file.share.external.bdex.com/Offrs/
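Putting it together, a minimal sketch that lists everything under the prefix and downloads each object; the local target directory is a placeholder, and it assumes the listing fits in a single page (under 1,000 keys):
import os
import boto3

s3_client = boto3.client('s3')

BUCKET = 'file.share.external.bdex.com'
PREFIX = 'Offrs/'
DEST = '/tmp/offrs'  # hypothetical local target directory
os.makedirs(DEST, exist_ok=True)

response = s3_client.list_objects_v2(Bucket=BUCKET, Prefix=PREFIX)
for obj in response.get('Contents', []):
    key = obj['Key']
    if key.endswith('/'):  # skip folder placeholder objects
        continue
    local_path = os.path.join(DEST, os.path.basename(key))
    s3_client.download_file(BUCKET, key, local_path)
For more than 1,000 objects you would wrap the listing in s3_client.get_paginator('list_objects_v2') instead of a single call.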

Problems Enabling S3 Bucket Transfer Acceleration Using boto3

I am attempting to pull information about an S3 bucket using boto3. Here is the setup (bucketname is set to a valid S3 bucket name):
import boto3
s3 = boto3.client('s3')
result = s3.get_bucket_acl(Bucket=bucketname)
When I try, I get this error:
ClientError: An error occurred (InvalidRequest) when calling the
GetBucketAcl operation: S3 Transfer Acceleration is not configured on
this bucket
So, I attempt to enable transfer acceleration:
s3.put_bucket_accelerate_configuration(Bucket=bucketname, AccelerateConfiguration={'Status': 'Enabled'})
But, I get this error, which seems silly, since the line above is attempting to configure the bucket. I do have IAM rights (Allow: *) to modify the bucket too:
ClientError: An error occurred (InvalidRequest) when calling the
PutBucketAccelerateConfiguration operation: S3 Transfer Acceleration
is not configured on this bucket
Does anyone have any ideas on what I'm missing here?
Although I borrowed the code in the original question from the boto3 documentation, this construct is not complete and did not provide the connectivity that I expected:
s3 = boto3.client('s3')
What is really needed are fully-initialized session and client handlers, like this (assuming that the profile variable is set correctly in the ~/.aws/config file and bucketname is a valid S3 bucket):
from boto3 import Session
session = Session(profile_name=profile)
client = session.client('s3')
result = client.get_bucket_acl(Bucket=bucketname)
After doing this (duh), I was able to connect with or without transfer acceleration.
Thanks to the commenters, since those comments led me to the solution.
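For reference, a minimal sketch under the same assumptions (a working profile name in ~/.aws/config and a valid bucketname) that enables transfer acceleration and then reads the status back:
from boto3 import Session

session = Session(profile_name=profile)
client = session.client('s3')

# turn acceleration on for the bucket
client.put_bucket_accelerate_configuration(
    Bucket=bucketname,
    AccelerateConfiguration={'Status': 'Enabled'},
)

# confirm it took effect; should print 'Enabled'
status = client.get_bucket_accelerate_configuration(Bucket=bucketname)
print(status.get('Status'))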

How to execute bash commands in aws lambda

I am trying to schedule a job in AWS Lambda that gets data from a JSON API. I want to transfer the JSON file to Amazon S3 each time. I have set up the S3 bucket and the AWS Lambda function with the proper IAM roles. I am writing the AWS Lambda function in Python. The code works fine on an EC2 instance, but it does not transfer the file to S3 when I run it in AWS Lambda.
import os

def lambda_handler(event, context):
    # change the working directory to /tmp
    os.chdir("/tmp")
    print("loading function")
    # download the file to /tmp
    os.system("wget https://jsonplaceholder.typicode.com/posts/1 -P /tmp")
    # use the AWS CLI to transfer the file to Amazon S3
    os.system("aws s3 sync . s3://targetbucket")
I am new to AWS Lambda. I am not getting any errors, but it's not producing the expected output.
The AWS Lambda runtime does not include the AWS CLI by default.
You can either create a deployment package that bundles awscli, or use the Python boto3 library instead:
import os
import boto3

s3client = boto3.client('s3')

for filename in os.listdir('/tmp'):  # assuming there are no sub-directories
    fpath = os.path.join('/tmp', filename)
    if os.path.isfile(fpath):
        s3client.upload_file(fpath, 'targetbucket', filename)
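A complete handler along these lines, replacing both wget and the aws s3 sync call with the standard library and boto3, might look like the sketch below; the bucket and key names are placeholders:
import urllib.request
import boto3

s3client = boto3.client('s3')

def lambda_handler(event, context):
    # fetch the JSON payload directly instead of shelling out to wget
    url = "https://jsonplaceholder.typicode.com/posts/1"
    body = urllib.request.urlopen(url).read()

    # upload the payload straight to S3 instead of calling the AWS CLI
    s3client.put_object(Bucket='targetbucket', Key='posts/1.json', Body=body)

    return {'status': 'uploaded'}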
