Sending logs from one serverless function to multiple cloudwatch groups - python

I am developing an API for an app with python, using FastAPI, serverless and amazon web services.
We are using CloudWatch for saving our logs.
The thing is, I am required to send different logs to different groups in the same aplication, depending on wethever is it an error, an info, etc.
Let´s say, I have two log groups in CloudWatch: /aws/lambda/firstGroup and /aws/lambda/secondGroup.
An I have this function:
def foo(some_data):
logger.info(f'calling the function with the data: {data}') # this goes to logGroup /aws/lambda/firstGroup
try:
doSomething()
except:
logger.error('ERROR! Something happened') # this goes to logGroup /aws/lambda/secondGroup
How can I configure the serverless.yml file so the logger.info goes to the first group and the logger.error goes to the second group?
Thanks in advance!

I am using this solution with ec2 instance.
create a log group
create log stream
then dump your logs
import boto3
import time
# init clients
clw_client = boto3.client('logs', region_name=REGION)
# print("check else create new log group..")
try:
clw_client.create_log_group(logGroupName=LOG_GROUP)
except clw_client.exceptions.ResourceAlreadyExistsException:
pass
# print("check else create new log stream.....")
LOG_STREAM = '{}-{}'.format(time.strftime("%m-%d-%Y-%H-%M-%S"),'logstream')
try:
clw_client.create_log_stream(logGroupName=LOG_GROUP, logStreamName=LOG_STREAM)
except clw_client.exceptions.ResourceAlreadyExistsException:
pass
def log_update(text):
print(text)
response = clw_client.describe_log_streams(
logGroupName = LOG_GROUP,
logStreamNamePrefix = LOG_STREAM
)
try:
event_log = {
'logGroupName': LOG_GROUP,
'logStreamName': LOG_STREAM,
'logEvents': [{
'timestamp': int(round(time.time() * 1000)),
'message': f"{time.strftime('%Y-%m-%d %H:%M:%S')}\t {text}"
}
],
}
if 'uploadSequenceToken' in response['logStreams'][0]:
event_log.update({'sequenceToken': response['logStreams'][0] ['uploadSequenceToken']})
response = clw_client.put_log_events(**event_log)
except Exception as e:
log_update(e)
Then use call that function inside your app whenever you like. Just don't check for groups and stream again and again for one job, these should run once.
You can update it, add more logic like change log-group name to implement what you wanted in this question with just some if else statements. Good luck!

Related

How to perform async commit when using kafka-python

I'm using kafka-python library for my fastapi consumer app and I'm consuming messages in batch with maximum of 100 records. Since the topic has huge traffic and have only one partition, consuming, processing and committing should be as quick as possible hence I want to use commit_async(), instead of synchronous commit().
But I'm not able to find a good example of commit_async(). I'm looking for an example for commit_async() with callback so that I can log in case of commit failure. But I'm not sure what does that callback function takes as argument and what field those arguments contain.
However the docs related to commit_async mentions the arguments, I'm not completely sure how to use them.
I need help in completing my callback function on_commit(), someone please help here
Code
import logging as log
from kafka import KafkaConsumer
from message_handler_impl import MessageHandlerImpl
def on_commit():
pass
class KafkaMessageConsumer:
def __init__(self, bootstrap_servers: str, topic: str, group_id: str):
self.bootstrap_servers = bootstrap_servers
self.topic = topic
self.group_id = group_id
self.consumer = KafkaConsumer(topic, bootstrap_servers=bootstrap_servers, group_id=group_id, enable_auto_commit=False, auto_offset_reset='latest')
def consume_messages(self, max_poll_records: int,
message_handler: MessageHandlerImpl = MessageHandlerImpl()):
try:
while True:
try:
msg_pack = self.consumer.poll(max_records=max_poll_records)
for topic_partition, messages in msg_pack.items():
message_handler.process_messages(messages)
self.consumer.commit_async(callback=on_commit)
except Exception as e:
log.error("Error while consuming message due to: %s", e, exc_info=True)
finally:
log.error("Something went wrong, closing consumer...........")
self.consumer.close()
if __name__ == "__main__":
kafka_consumer = KafkaMessageConsumer("localhost:9092", "test-topic", "test-group")
kafka_consumer.consume_messages(100)
The docs are fairly clear.
Called as callback(offsets, response) with response as either an Exception or an OffsetCommitResponse struct.
def on_commit(offsets, response):
# or maybe try checking type(response)
if hasattr(response, '<some attribute unique to OffsetCommitResponse>'):
print('committed ' + str(offsets))
else:
print(response) # exception
I'm sure you could look at the source code an maybe find a unit test that covers a full example

Python InfluxDB2 - write_api.write(...) How to check for success?

I need to write historic data into InfluxDB (I'm using Python, which is not a must in this case, so I maybe willing to accept non-Python solutions). I set up the write API like this
write_api = client.write_api(write_options=ASYNCHRONOUS)
The Data comes from a DataFrame with a timestamp as key, so I write it to the database like this
result = write_api.write(bucket=bucket, data_frame_measurement_name=field_key, record=a_data_frame)
This call does not throw an exception, even if the InfluxDB server is down. result has a protected attribute _success that is a boolean in debugging, but I cannot access it from the code.
How do I check if the write was a success?
If you use background batching, you can add custom success, error and retry callbacks.
from influxdb_client import InfluxDBClient
def success_cb(details, data):
url, token, org = details
print(url, token, org)
data = data.decode('utf-8').split('\n')
print('Total Rows Inserted:', len(data))
def error_cb(details, data, exception):
print(exc)
def retry_cb(details, data, exception):
print('Retrying because of an exception:', exc)
with InfluxDBClient(url, token, org) as client:
with client.write_api(success_callback=success_cb,
error_callback=error_cb,
retry_callback=retry_cb) as write_api:
write_api.write(...)
If you are eager to test all the callbacks and don't want to wait until all retries are finished, you can override the interval and number of retries.
from influxdb_client import InfluxDBClient, WriteOptions
with InfluxDBClient(url, token, org) as client:
with client.write_api(success_callback=success_cb,
error_callback=error_cb,
retry_callback=retry_cb,
write_options=WriteOptions(retry_interval=60,
max_retries=2),
) as write_api:
...
if you want to immediately write data into database, then use SYNCHRONOUS version of write_api - https://github.com/influxdata/influxdb-client-python/blob/58343322678dd20c642fdf9d0a9b68bc2c09add9/examples/example.py#L12
The asynchronous write should be "triggered" by call .get() - https://github.com/influxdata/influxdb-client-python#asynchronous-client
Regards
write_api.write() returns a multiprocessing.pool.AsyncResult or multiprocessing.pool.AsyncResult (both are the same).
With this return object you can check on the asynchronous request in a couple of ways. See here: https://docs.python.org/2/library/multiprocessing.html#multiprocessing.pool.AsyncResult
If you can use a blocking request, then write_api = client.write_api(write_options=SYNCRONOUS) can be used.
from datetime import datetime
from influxdb_client import WritePrecision, InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS
with InfluxDBClient(url="http://localhost:8086", token="my-token", org="my-org", debug=False) as client:
p = Point("my_measurement") \
.tag("location", "Prague") \
.field("temperature", 25.3) \
.time(datetime.utcnow(), WritePrecision.MS)
try:
client.write_api(write_options=SYNCHRONOUS).write(bucket="my-bucket", record=p)
reboot = False
except Exception as e:
reboot = True
print(f"Reboot? {reboot}")

Google Cloud Functions randomly retrying on success

I have a Google Cloud Function triggered by a PubSub. The doc states messages are acknowledged when the function end with success.
link
But randomly, the function retries (same execution ID) exactly 10 minutes after execution. It is the PubSub ack max timeout.
I also tried to get message ID and acknowledge it programmatically in Function code but the PubSub API respond there is no message to ack with that id.
In StackDriver monitoring, I see some messages not being acknowledged.
Here is my code : main.py
import base64
import logging
import traceback
from google.api_core import exceptions
from google.cloud import bigquery, error_reporting, firestore, pubsub
from sql_runner.runner import orchestrator
logging.getLogger().setLevel(logging.INFO)
def main(event, context):
bigquery_client = bigquery.Client()
firestore_client = firestore.Client()
publisher_client = pubsub.PublisherClient()
subscriber_client = pubsub.SubscriberClient()
logging.info(
'event=%s',
event
)
logging.info(
'context=%s',
context
)
try:
query_id = base64.b64decode(event.get('data',b'')).decode('utf-8')
logging.info(
'query_id=%s',
query_id
)
# inject dependencies
orchestrator(
query_id,
bigquery_client,
firestore_client,
publisher_client
)
sub_path = (context.resource['name']
.replace('topics', 'subscriptions')
.replace('function-sql-runner', 'gcf-sql-runner-europe-west1-function-sql-runner')
)
# explicitly ack message to avoid duplicates invocations
try:
subscriber_client.acknowledge(
sub_path,
[context.event_id] # message_id to ack
)
logging.warning(
'message_id %s acknowledged (FORCED)',
context.event_id
)
except exceptions.InvalidArgument as err:
# google.api_core.exceptions.InvalidArgument: 400 You have passed an invalid ack ID to the service (ack_id=982967258971474).
logging.info(
'message_id %s already acknowledged',
context.event_id
)
logging.debug(err)
except Exception as err:
# catch all exceptions and log to prevent cold boot
# report with error_reporting
error_reporting.Client().report_exception()
logging.critical(
'Internal error : %s -> %s',
str(err),
traceback.format_exc()
)
if __name__ == '__main__': # for testing
from collections import namedtuple # use namedtuple to avoid Class creation
Context = namedtuple('Context', 'event_id resource')
context = Context('666', {'name': 'projects/my-dev/topics/function-sql-runner'})
script_to_start = b' ' # launch the 1st script
script_to_start = b'060-cartes.sql'
main(
event={"data": base64.b64encode(script_to_start)},
context=context
)
Here is my code : runner.py
import logging
import os
from retry import retry
PROJECT_ID = os.getenv('GCLOUD_PROJECT') or 'my-dev'
def orchestrator(query_id, bigquery_client, firestore_client, publisher_client):
"""
if query_id empty, start the first sql script
else, call the given query_id.
Anyway, call the next script.
If the sql script is the last, no call
retrieve SQL queries from FireStore
run queries on BigQuery
"""
docs_refs = [
doc_ref.get() for doc_ref in
firestore_client.collection(u'sql_scripts').list_documents()
]
sorted_queries = sorted(docs_refs, key=lambda x: x.id)
if not bool(query_id.strip()) : # first execution
current_index = 0
else:
# find the query to run
query_ids = [ query_doc.id for query_doc in sorted_queries]
current_index = query_ids.index(query_id)
query_doc = sorted_queries[current_index]
bigquery_client.query(
query_doc.to_dict()['request'], # sql query
).result()
logging.info(
'Query %s executed',
query_doc.id
)
# exit if the current query is the last
if len(sorted_queries) == current_index + 1:
logging.info('All scripts were executed.')
return
next_query_id = sorted_queries[current_index+1].id.encode('utf-8')
publish(publisher_client, next_query_id)
#retry(tries=5)
def publish(publisher_client, next_query_id):
"""
send a message in pubsub to call the next query
this mechanism allow to run one sql script per Function instance
so as to not exceed the 9min deadline limit
"""
logging.info('Calling next query %s', next_query_id)
future = publisher_client.publish(
topic='projects/{}/topics/function-sql-runner'.format(PROJECT_ID),
data=next_query_id
)
# ensure publish is successfull
message_id = future.result()
logging.info('Published message_id = %s', message_id)
It looks like the pubsub message is not ack on success.
I do not think I have background activity in my code.
My question : why my Function is randomly retrying even when success ?
Cloud Functions does not guarantee that your functions will run exactly once. According to the documentation, background functions, including pubsub functions, are given an at-least-once guarantee:
Background functions are invoked at least once. This is because of the
asynchronous nature of handling events, in which there is no caller
that waits for the response. The system might, in rare circumstances,
invoke a background function more than once in order to ensure
delivery of the event. If a background function invocation fails with
an error, it will not be invoked again unless retries on failure are
enabled for that function.
Your code will need to expect that it could possibly receive an event more than once. As such, your code should be idempotent:
To make sure that your function behaves correctly on retried execution
attempts, you should make it idempotent by implementing it so that an
event results in the desired results (and side effects) even if it is
delivered multiple times. In the case of HTTP functions, this also
means returning the desired value even if the caller retries calls to
the HTTP function endpoint. See Retrying Background Functions for more
information on how to make your function idempotent.

AWS Lambda/SNS Publish ignore invalid endpoints

i'm sending apple push notifications via AWS SNS via Lambda with Boto3 and Python.
from __future__ import print_function
import boto3
def lambda_handler(event, context):
client = boto3.client('sns')
for record in event['Records']:
if record['eventName'] == 'INSERT':
rec = record['dynamodb']['NewImage']
competitors = rec['competitors']['L']
for competitor in competitors:
if competitor['M']['confirmed']['BOOL'] == False:
endpoints = competitor['M']['endpoints']['L']
for endpoint in endpoints:
print(endpoint['S'])
response = client.publish(
#TopicArn='string',
TargetArn = endpoint['S'],
Message = 'test message'
#Subject='string',
#MessageStructure='string',
)
Everything works fine! But when an endpoint is invalid for some reason (at the moment this happens everytime i run a development build on my device, since i get a different endpoint then. This will be either not found or deactivated.) the Lambda function fails and gets called all over again. In this particular case if for example the second endpoint fails it will send the push over and over again to endpoint 1 to infinity.
Is it possible to ignore invalid endpoints and just keep going with the function?
Thank you
Edit:
Thanks to your help i was able to solve it with:
try:
response = client.publish(
#TopicArn='string',
TargetArn = endpoint['S'],
Message = 'test message'
#Subject='string',
#MessageStructure='string',
)
except Exception as e:
print(e)
continue
Aws lamdba on failure retries the function till the event expires from the stream.
In your case since the exception on the 2nd endpoint is not handled, the retry mechanism ensures the reexecution of post to the first endpoint.
If you handle the exception and ensure the function successfully ends even when there is a failure, then the retries will not happen.

Using Tweepy to listen to stream and search for tweets. How to stop previous search and only listen for new stream?

I'm using Flask and Tweepy to search for live tweets. On the front-end I have a user text input, and button called "Search". Ideally, when a user gives a search-term into the input and clicks the "Search" button, the Tweepy should listen for the new search-term and stop the previous search-term stream. When the "Search" button is clicked it executes this function:
#app.route('/search', methods=['POST'])
# gets search-keyword and starts stream
def streamTweets():
search_term = request.form['tweet']
search_term_hashtag = '#' + search_term
# instantiate listener
listener = StdOutListener()
# stream object uses listener we instantiated above to listen for data
stream = tweepy.Stream(auth, listener)
if stream is not None:
print "Stream disconnected..."
stream.disconnect()
stream.filter(track=[search_term or search_term_hashtag], async=True)
redirect('/stream') # execute '/stream' sse
return render_template('index.html')
The /stream route that is executed in the second to last line in above code is as follows:
#app.route('/stream')
def stream():
# we will use Pub/Sub process to send real-time tweets to client
def event_stream():
# instantiate pubsub
pubsub = red.pubsub()
# subscribe to tweet_stream channel
pubsub.subscribe('tweet_stream')
# initiate server-sent events on messages pushed to channel
for message in pubsub.listen():
yield 'data: %s\n\n' % message['data']
return Response(stream_with_context(event_stream()), mimetype="text/event-stream")
My code works fine, in the sense that it starts a new stream and searches for a given term whenever the "Search" button is clicked, but it does not stop the previous search. For example, if my first search term was "NYC" and then I wanted to search for a different term, say "Los Angeles", it will give me results for both "NYC" and "Los Angeles", which is not what I want. I want just "Los Angeles" to be searched. How do I fix this? In other words, how do I stop the previous stream? I looked through other previous threads, and I know I have to use stream.disconnect(), but I'm not sure how to implement this in my code. Any help or input would be greatly appreciated. Thanks so much!!
Below is some code that will cancel old streams when a new stream is created. It works by adding new streams to a global list, and then calling stream.disconnect() on all streams in the list whenever a new stream is created.
diff --git a/app.py b/app.py
index 1e3ed10..f416ddc 100755
--- a/app.py
+++ b/app.py
## -23,6 +23,8 ## auth.set_access_token(access_token, access_token_secret)
app = Flask(__name__)
red = redis.StrictRedis()
+# Add a place to keep track of current streams
+streams = []
#app.route('/')
def index():
## -32,12 +34,18 ## def index():
#app.route('/search', methods=['POST'])
# gets search-keyword and starts stream
def streamTweets():
+ # cancel old streams
+ for stream in streams:
+ stream.disconnect()
+
search_term = request.form['tweet']
search_term_hashtag = '#' + search_term
# instantiate listener
listener = StdOutListener()
# stream object uses listener we instantiated above to listen for data
stream = tweepy.Stream(auth, listener)
+ # add this stream to the global list
+ streams.append(stream)
stream.filter(track=[search_term or search_term_hashtag],
async=True) # make sure stream is non-blocking
redirect('/stream') # execute '/stream' sse
What this does not solve is the problem of session management. With your current setup a search by one user will affect the searches of all users. This can be avoided by giving your users some identifier and storing their streams along with their identifier. The easiest way to do this is likely to use Flask's session support. You could also do this with a requestId as Pierre suggested. In either case you will also need code to notice when a user has closed the page and close their stream.
Disclaimer: I know nothing about Tweepy, but this appears to be a design issue.
Are you trying to add state to a RESTful API? You may have a design problem.
As JRichardSnape answered, your API shouldn't be the one taking care of canceling a request; it should be done in the front-end. What I mean here is in the javascript / AJAX / etc calling this function, add another call, to the new function
#app.route('/cancelSearch', methods=['POST'])
With the "POST" that has the search terms. So long as you don't have state, you can't really do this safely in an async call: Imagine someone else makes the same search at the same time then canceling one will cancel both (remember, you don't have state so you don't know who you're canceling). Perhaps you do need state with your design.
If you must keep using this and don't mind breaking the "stateless" rule, then add a "state" to your request. In this case it's not so bad because you could launch a thread and name it with the userId, then kill the thread every new search
def streamTweets():
search_term = request.form['tweet']
userId = request.form['userId'] # If your limit is one request per user at a time. If multiple windows can be opened and you want to follow this limit, store userId in a cookie.
#Look for any request currently running with this ID, and cancel them
Alternatively, you could return a requestId, which you would then keep in the front-end can call cancelSearch?requestId=$requestId. In cancelSearch, you would have to find the pending request (sounds like that's in tweepy since you're not using your own threads) and disconnect it.
Out of curiosity I just watched what happens when you search on Google, and it uses a GET request. Have a look (debug tools -> Network; then enter some text and see the autofill). Google uses a token sent with every request (every time you type something)). It doesn't mean it's used for this, but that's basically what I described. If you don't want a session, then use a unique identifier.
Well I solved it by using timer method But still I'm looking for pythonic way.
from streamer import StreamListener
def stream():
hashtag = input
#assign each user an ID ( for pubsub )
StreamListener.userid = random_user_id
def handler(signum, frame):
print("Forever is over")
raise Exception("end of time")
def main_stream():
stream = tweepy.Stream(auth, StreamListener())
stream.filter(track=track,async=True)
redirect(url_for('map_stream'))
def close_stream():
# this is for closing client list in redis but don't know it's working
obj = redis.client_list(tweet_stream)
redis_client_list = obj[0]['addr']
redis.client_kill(redis_client_list)
stream = tweepy.Stream(auth, StreamListener())
stream.disconnect()
import signal
signal.signal(signal.SIGALRM, handler)
signal.alarm(300)
try:
main_stream()
except Exception:
close_stream()
print("function terminate")

Categories

Resources