How to write Airflow error logs into an S3 bucket using Python

When my Airflow DAG fails, the error logs are written to a path like
"/home/ec2-user/software/airflow/logs/dagtest_dag/trigger_load/2019-10-10T06:01:33.342433+00:00/1.log"
How can I push these logs to an S3 bucket?

Configure this as a cron job:
import boto3

s3 = boto3.client('s3')  # make sure your client is authenticated

with open('path/to/your/logs.log', 'rb') as data:
    s3.upload_fileobj(data, 'bucketname', 'path/to/your/logs/in/s3.log')
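If you want to ship the whole Airflow log tree rather than a single file, a minimal sketch along the same lines could walk the directory and mirror it into S3; the log root is taken from the question, while the bucket name is a placeholder:
import os
import boto3

LOG_ROOT = '/home/ec2-user/software/airflow/logs'  # local Airflow log directory
BUCKET = 'my-airflow-logs'                         # hypothetical bucket name

s3 = boto3.client('s3')

for dirpath, _, filenames in os.walk(LOG_ROOT):
    for name in filenames:
        if not name.endswith('.log'):
            continue
        local_path = os.path.join(dirpath, name)
        # keep the same relative layout (dag_id/task_id/execution_date/try.log) as the S3 key
        key = os.path.relpath(local_path, LOG_ROOT)
        s3.upload_file(local_path, BUCKET, key)
Note that Airflow also supports shipping task logs to S3 natively through its remote logging settings, which can replace the cron approach entirely.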

Related

S3 AWS Session Expired

I am new to AWS S3. I did a lot of googling around how to connect to S3 using Python and found that everyone uses Boto, so that's what I'm using as the client. I used PowerShell to log in and create the .aws/credentials file. In that file I was able to get the aws_access_key_id, aws_secret_access_key AND the aws_session_token needed to establish the session. I understand that the session is only valid for about 8 hours, so the next day, when my Python script runs to connect to S3, the session has obviously expired. How can I overcome this and establish a new session daily? Below is my code.
import boto3
from io import StringIO

s3_client = boto3.client(
    "s3",
    aws_access_key_id=id_,
    aws_secret_access_key=secret,
    aws_session_token=token,
    region_name='r'
)

# Test it on a service (yours may be different)
# s3 = session.resource('s3')
# for bucket in s3.buckets.all():
#     print(bucket.name)

bucket = 'automated-reports'  # already created on S3
csv_buffer = StringIO()
all_active_scraper_counts_df.to_csv(csv_buffer, index=False)
# s3_resource = boto3.resource('s3')
put_response = s3_client.put_object(Bucket=bucket, Key="all_active_scrapers.csv", Body=csv_buffer.getvalue())
status = put_response.get("ResponseMetadata", {}).get("HTTPStatusCode")

if status == 200:
    print(f"Successful S3 put_object response. Status - {status}")
else:
    print(f"Unsuccessful S3 put_object response. Status - {status}")
You have three options:
1. Run a script that refreshes the credentials before your Python script runs. You can use the AWS CLI's sts assume-role command to obtain a new set of temporary credentials.
2. Add a try/except block to your code that handles the expired-credentials error, generates new credentials, and re-initializes the S3 client (see the sketch after this list).
3. Instead of using a role, use an IAM user. User credentials can be valid indefinitely, so you won't need to refresh them.
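As an illustration of the second option, here is a minimal sketch that catches an expired-token failure and rebuilds the client from a fresh sts assume-role call. The role ARN, session name, and the exact error codes checked are assumptions you would adapt to your setup:
import boto3
from botocore.exceptions import ClientError

ROLE_ARN = 'arn:aws:iam::123456789012:role/my-s3-role'  # hypothetical role ARN

def fresh_s3_client():
    # ask STS for new temporary credentials and build an S3 client from them
    creds = boto3.client('sts').assume_role(
        RoleArn=ROLE_ARN, RoleSessionName='daily-report'
    )['Credentials']
    return boto3.client(
        's3',
        aws_access_key_id=creds['AccessKeyId'],
        aws_secret_access_key=creds['SecretAccessKey'],
        aws_session_token=creds['SessionToken'],
    )

s3_client = fresh_s3_client()
try:
    s3_client.put_object(Bucket='automated-reports', Key='all_active_scrapers.csv', Body='...')
except ClientError as err:
    if err.response['Error']['Code'] in ('ExpiredToken', 'ExpiredTokenException'):
        # credentials went stale: rebuild the client and retry once
        s3_client = fresh_s3_client()
        s3_client.put_object(Bucket='automated-reports', Key='all_active_scrapers.csv', Body='...')
    else:
        raise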

No Credentials Error - Using boto3 and aws s3 bucket

I am using Python and a Jupyter notebook to read files from an AWS S3 bucket, and I am getting the error 'NoCredentialsError: Unable to locate credentials' when running the following code:
import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('my-bucket')
for obj in bucket.objects.all():
    key = obj.key
    body = obj.get()['Body'].read()
I believe I need to put my access key somewhere, but I am not sure where. Thank you!
If you have the AWS CLI installed, just run a simple aws configure; boto3 will pick the credentials up from ~/.aws/credentials automatically, and you will be good to go. If you prefer not to rely on the shared credentials file, you can also pass the keys explicitly, as in the sketch below.
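For completeness, a minimal sketch of the explicit alternative; the key values and region here are placeholders, and in practice you would read them from environment variables or a secrets store rather than hard-coding them:
import boto3

# hypothetical placeholder credentials; never commit real keys to source control
session = boto3.Session(
    aws_access_key_id='AKIA...YOUR_KEY_ID',
    aws_secret_access_key='YOUR_SECRET_KEY',
    region_name='us-east-1',
)

s3 = session.resource('s3')
bucket = s3.Bucket('my-bucket')
for obj in bucket.objects.all():
    body = obj.get()['Body'].read()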

using boto3 to get file list and download files

I was given this S3 URL: s3://file.share.external.bdex.com/Offrs
This URL contains a batch of files I need to download.
I have this code:
import boto3

s3_client = boto3.client(
    's3',
    aws_access_key_id='<<ACCESS KEY>>',
    aws_secret_access_key='<<SECRET_ACCESS_KEY>>'
)
object_listing = s3_client.list_objects_v2(
    Bucket='file.share.external.bdex.com/Offrs',
    Prefix=''
)
print(object_listing)
I have tried:
Bucket='file.share.external.bdex.com', Prefix='Offrs'
Bucket='s3://file.share.external.bdex.com/Offrs/'
Bucket='file.share.external.bdx.com/Offrs', Prefix='Offrs'
and several other combinations, all of which fail with either a bucket-name regex error (because of the slash) or a "not found" error.
What am I missing?
Thank you.
Use the bucket name and the prefix separately:
Bucket='file.share.external.bdex.com'
Prefix='Offrs/'
You can test your access permissions via the AWS CLI:
aws s3 ls s3://file.share.external.bdex.com/Offrs/
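Putting it together, a minimal sketch that lists everything under the prefix and downloads each object; the local target directory is a placeholder, and it assumes the listing fits in a single page (under 1,000 keys):
import os
import boto3

s3_client = boto3.client('s3')

BUCKET = 'file.share.external.bdex.com'
PREFIX = 'Offrs/'
DEST = '/tmp/offrs'  # hypothetical local target directory
os.makedirs(DEST, exist_ok=True)

response = s3_client.list_objects_v2(Bucket=BUCKET, Prefix=PREFIX)
for obj in response.get('Contents', []):
    key = obj['Key']
    if key.endswith('/'):  # skip folder placeholder objects
        continue
    local_path = os.path.join(DEST, os.path.basename(key))
    s3_client.download_file(BUCKET, key, local_path)
For more than 1,000 objects you would wrap the listing in s3_client.get_paginator('list_objects_v2') instead of a single call.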

Problems Enabling S3 Bucket Transfer Acceleration Using boto3

I am attempting to pull information about an S3 bucket using boto3. Here is the setup (bucketname is set to a valid S3 bucket name):
import boto3
s3 = boto3.client('s3')
result = s3.get_bucket_acl(Bucket=bucketname)
When I try, I get this error:
ClientError: An error occurred (InvalidRequest) when calling the
GetBucketAcl operation: S3 Transfer Acceleration is not configured on
this bucket
So, I attempt to enable transfer acceleration:
s3.put_bucket_accelerate_configuration(Bucket=bucketname, AccelerateConfiguration={'Status': 'Enabled'})
But, I get this error, which seems silly, since the line above is attempting to configure the bucket. I do have IAM rights (Allow: *) to modify the bucket too:
ClientError: An error occurred (InvalidRequest) when calling the
PutBucketAccelerateConfiguration operation: S3 Transfer Acceleration
is not configured on this bucket
Does anyone have any ideas on what I'm missing here?
Although I borrowed the code in the original question from the boto3 documentation, this construct is not complete and did not provide the connectivity that I expected:
s3 = boto3.client('s3')
What is really needed are fully-initialized session and client handlers, like this (assuming that the profile variable is set correctly in the ~/.aws/config file and bucketname is a valid S3 bucket):
from boto3 import Session
session = Session(profile_name=profile)
client = session.client('s3')
result = client.get_bucket_acl(Bucket=bucketname)
After doing this (duh), I was able to connect with or without transfer acceleration.
Thanks to the commenters, since those comments led me to the solution.
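For reference, a minimal sketch under the same assumptions (a working profile name in ~/.aws/config and a valid bucketname) that enables transfer acceleration and then reads the status back:
from boto3 import Session

session = Session(profile_name=profile)
client = session.client('s3')

# turn acceleration on for the bucket
client.put_bucket_accelerate_configuration(
    Bucket=bucketname,
    AccelerateConfiguration={'Status': 'Enabled'},
)

# confirm it took effect; should print 'Enabled'
status = client.get_bucket_accelerate_configuration(Bucket=bucketname)
print(status.get('Status'))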

How to execute bash commands in aws lambda

I am trying to schedule a job in AWS Lambda that gets data from a JSON API. I want to transfer the JSON file to Amazon S3 each time. I have set up the S3 bucket and the AWS Lambda function with the proper IAM roles. I am writing the AWS Lambda function in Python. The code works fine on an EC2 instance, but it does not transfer the file to S3 when I run it in AWS Lambda.
import os

def lambda_handler(event, context):
    # change the working directory to /tmp
    os.chdir("/tmp")
    print("loading function")
    # download the file to /tmp
    os.system("wget https://jsonplaceholder.typicode.com/posts/1 -P /tmp")
    # use the AWS CLI to transfer the file to Amazon S3
    os.system("aws s3 sync . s3://targetbucket")
I am new to AWS Lambda. I am not getting any errors, but it's not producing the expected output.
The AWS Lambda runtime does not include the AWS CLI by default.
You can either create a deployment package that bundles awscli, or use the Python boto3 library instead:
import os
import boto3

s3client = boto3.client('s3')

for filename in os.listdir('/tmp'):  # assuming there are no sub-directories
    fpath = os.path.join('/tmp', filename)
    if os.path.isfile(fpath):
        s3client.upload_file(fpath, 'targetbucket', filename)
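A complete handler along these lines, replacing both wget and the aws s3 sync call with the standard library and boto3, might look like the sketch below; the bucket and key names are placeholders:
import urllib.request
import boto3

s3client = boto3.client('s3')

def lambda_handler(event, context):
    # fetch the JSON payload directly instead of shelling out to wget
    url = "https://jsonplaceholder.typicode.com/posts/1"
    body = urllib.request.urlopen(url).read()

    # upload the payload straight to S3 instead of calling the AWS CLI
    s3client.put_object(Bucket='targetbucket', Key='posts/1.json', Body=body)

    return {'status': 'uploaded'}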
