How to group Cloud Function log entries by execution? - python

In Google App Engine, first generation, logs are grouped automatically by request in Logs Viewer, and in the second generation it's easy enough to set up.
In background Cloud Functions I can't find any way of doing it (save manually filtering by executionId in Logs Viewer). From various articles around the web I read that the key is to set the trace argument to the Trace ID when calling the Stackdriver Logging API, and that in HTTP contexts this ID can be found in the X-Cloud-Trace-Context header.
There are no headers in background contexts (for example, called from Pub/Sub or Storage triggers). I've tried setting this to an arbitrary value, such as the event_id from the function context, but no grouping happens.
Here's a minimal representation of how I've tried it:
from google.cloud.logging.resource import Resource
import google.cloud.logging

log_name = 'cloudfunctions.googleapis.com%2Fcloud-functions'
cloud_client = google.cloud.logging.Client()
cloud_logger = cloud_client.logger(log_name)
request_id = None


def log(message):
    labels = {
        'project_id': 'settle-leif',
        'function_name': 'group-logs',
        'region': 'europe-west1',
    }
    resource = Resource(type='cloud_function', labels=labels)
    trace_id = f'projects/settle-leif/traces/{request_id}'
    cloud_logger.log_text(message, trace=trace_id, resource=resource)


def main(_data, context):
    global request_id
    request_id = context.event_id

    log('First message')
    log('Second message')

This isn't currently possible.
It's on our roadmap to provide this support: https://github.com/GoogleCloudPlatform/functions-framework-python/issues/79
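Until then, a workaround along the lines of the manual filtering mentioned in the question: attach the event id to each entry as a label, then filter on that label in the Logs Viewer. This is only a sketch based on the question's log() helper; log_text() accepts a labels argument, and the value used here is just the event id already stored in request_id, so this enables filtering per execution rather than true grouping.
def log(message):
    labels = {
        'project_id': 'settle-leif',
        'function_name': 'group-logs',
        'region': 'europe-west1',
    }
    resource = Resource(type='cloud_function', labels=labels)
    # attach the event id as a label so related entries can be pulled up together
    cloud_logger.log_text(
        message,
        resource=resource,
        labels={'execution_id': request_id},
    )
A Logs Viewer filter such as the following (the execution id value is a placeholder) then shows all entries for one execution:
resource.type="cloud_function"
resource.labels.function_name="group-logs"
labels.execution_id="<EVENT_ID>"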

Related

Sending logs from one serverless function to multiple cloudwatch groups

I am developing an API for an app with Python, using FastAPI, Serverless and Amazon Web Services.
We are using CloudWatch for saving our logs.
The thing is, I am required to send different logs to different groups in the same application, depending on whether it is an error, an info, etc.
Let's say I have two log groups in CloudWatch: /aws/lambda/firstGroup and /aws/lambda/secondGroup.
And I have this function:
def foo(some_data):
    logger.info(f'calling the function with the data: {some_data}')  # this goes to log group /aws/lambda/firstGroup
    try:
        doSomething()
    except Exception:
        logger.error('ERROR! Something happened')  # this goes to log group /aws/lambda/secondGroup
How can I configure the serverless.yml file so the logger.info goes to the first group and the logger.error goes to the second group?
Thanks in advance!
I am using this solution with an EC2 instance:
create a log group,
create a log stream,
then dump your logs.
import boto3
import time

REGION = 'us-east-1'          # placeholder: use your region
LOG_GROUP = 'my-log-group'    # placeholder: use your log group name

# init clients
clw_client = boto3.client('logs', region_name=REGION)

# check, else create a new log group
try:
    clw_client.create_log_group(logGroupName=LOG_GROUP)
except clw_client.exceptions.ResourceAlreadyExistsException:
    pass

# check, else create a new log stream
LOG_STREAM = '{}-{}'.format(time.strftime("%m-%d-%Y-%H-%M-%S"), 'logstream')
try:
    clw_client.create_log_stream(logGroupName=LOG_GROUP, logStreamName=LOG_STREAM)
except clw_client.exceptions.ResourceAlreadyExistsException:
    pass


def log_update(text):
    print(text)
    response = clw_client.describe_log_streams(
        logGroupName=LOG_GROUP,
        logStreamNamePrefix=LOG_STREAM
    )
    try:
        event_log = {
            'logGroupName': LOG_GROUP,
            'logStreamName': LOG_STREAM,
            'logEvents': [{
                'timestamp': int(round(time.time() * 1000)),
                'message': f"{time.strftime('%Y-%m-%d %H:%M:%S')}\t {text}"
            }],
        }
        if 'uploadSequenceToken' in response['logStreams'][0]:
            event_log.update({'sequenceToken': response['logStreams'][0]['uploadSequenceToken']})
        response = clw_client.put_log_events(**event_log)
    except Exception as e:
        log_update(e)
Then call that function inside your app whenever you like. Just don't check for the group and stream again and again for one job; those parts should run once.
You can extend it, for example by changing the log group name with some if/else logic, to implement what you wanted in this question. Good luck!
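Building on that, here is a rough sketch of the "if/else by level" idea expressed as a logging handler, so logger.info and logger.error end up in different groups. This is not something serverless.yml configures for you; the region, stream naming and sequence-token handling below just mirror the snippet above, and the group names are the ones from the question.
import logging
import time
import boto3

clw_client = boto3.client('logs', region_name='us-east-1')  # placeholder region


class CloudWatchLevelHandler(logging.Handler):
    """Send records at or above `level` to one specific log group."""

    def __init__(self, log_group, level=logging.NOTSET):
        super().__init__(level)
        self.log_group = log_group
        self.log_stream = '{}-{}'.format(time.strftime('%m-%d-%Y-%H-%M-%S'), 'logstream')
        try:
            clw_client.create_log_group(logGroupName=log_group)
        except clw_client.exceptions.ResourceAlreadyExistsException:
            pass
        try:
            clw_client.create_log_stream(logGroupName=log_group, logStreamName=self.log_stream)
        except clw_client.exceptions.ResourceAlreadyExistsException:
            pass

    def emit(self, record):
        event_log = {
            'logGroupName': self.log_group,
            'logStreamName': self.log_stream,
            'logEvents': [{
                'timestamp': int(round(time.time() * 1000)),
                'message': self.format(record),
            }],
        }
        response = clw_client.describe_log_streams(
            logGroupName=self.log_group, logStreamNamePrefix=self.log_stream)
        token = response['logStreams'][0].get('uploadSequenceToken')
        if token:
            event_log['sequenceToken'] = token
        clw_client.put_log_events(**event_log)


logger = logging.getLogger('app')
logger.setLevel(logging.INFO)

info_handler = CloudWatchLevelHandler('/aws/lambda/firstGroup', logging.INFO)
info_handler.addFilter(lambda record: record.levelno < logging.ERROR)  # keep errors out of the first group
error_handler = CloudWatchLevelHandler('/aws/lambda/secondGroup', logging.ERROR)
logger.addHandler(info_handler)
logger.addHandler(error_handler)

logger.info('goes to /aws/lambda/firstGroup')
logger.error('goes to /aws/lambda/secondGroup')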

Facebook Marketing API - how to handle rate limit for retrieving *all* ad sets through campaign ids?

I've recently started working with the Facebook Marketing API, using the facebook_business SDK for Python (running v3.9 on Ubuntu 20.04). I think I've mostly wrapped my head around how it works; however, I'm still kind of at a loss as to how I can handle the arbitrary way in which the API is rate-limited.
Specifically, what I'm attempting to do is to retrieve all Ad Sets from all the campaigns that have ever run on my ad account, regardless of whether their effective_status is ACTIVE, PAUSED, DELETED or ARCHIVED.
Hence, I pulled all the campaigns for my ad account. These are stored in a dict called output, keyed by effective_status, like so:
{'ACTIVE': ['******************',
            '******************',
            '******************'],
 'PAUSED': ['******************',
            '******************',
            '******************']}
Then, I'm trying to pull the Ad Set ids, like so:
import pandas as pd
import json
import re
import time
from random import uniform

from facebook_business.api import FacebookAdsApi
from facebook_business.adobjects.adaccount import AdAccount  # account-level info
from facebook_business.adobjects.campaign import Campaign    # campaign-level info
from facebook_business.adobjects.adset import AdSet          # ad-set level info
from facebook_business.adobjects.ad import Ad                # ad-level info

# auth init
app_id = open(APP_ID_PATH, 'r').read().splitlines()[0]
app_secret = open(APP_SECRET_PATH, 'r').read().splitlines()[0]
token = open(APP_ACCESS_TOKEN, 'r').read().splitlines()[0]

# init the connection
FacebookAdsApi.init(app_id, app_secret, token)

campaign_types = list(output.keys())

ad_sets = {}
for status in campaign_types:
    ad_sets_for_status = []
    for campaign_id in output[status]:
        # sleep and wait for a random time
        sleepy_time = uniform(1, 3)
        time.sleep(sleepy_time)
        # pull the ad_sets for this particular campaign
        campaign_ad_sets = Campaign(campaign_id).get_ad_sets()
        for entry in campaign_ad_sets:
            ad_sets_for_status.append(entry['id'])
    ad_sets[status] = ad_sets_for_status
Now, this crashes at different times whenever I run it, with the following error:
FacebookRequestError:
    Message: Call was not successful
    Method:  GET
    Path:    https://graph.facebook.com/v11.0/23846914220310083/adsets
    Params:  {'summary': 'true'}
    Status:  400
    Response:
        {
            "error": {
                "message": "(#17) User request limit reached",
                "type": "OAuthException",
                "is_transient": true,
                "code": 17,
                "error_subcode": 2446079,
                "fbtrace_id": "***************"
            }
        }
I can't reproduce the time at which it crashes; however, it certainly doesn't take ~600 calls (see here: https://stackoverflow.com/a/29690316/5080858), and as you can see, I'm sleeping ahead of every API call. You might suggest that I should just call the get_ad_sets method on the AdAccount endpoint, but this pulls fewer ad sets than the above code does, even before it crashes. For my use case, it's important to pull ads that are long over as well as ads that are ongoing, so it's important that I get as much data as possible.
I'm kind of annoyed with this -- seeing as we are paying for these ads to run, you'd think FB would make it as easy as possible to retrieve info on them via API, and not introduce API rate limits similar to those for valuable data one doesn't necessarily own.
Anyway, I'd appreciate any kind of advice or insights - perhaps there's also a much better way of doing this that I haven't considered.
Many thanks in advance!
The error with 'code': 17 means that you have reached the request limit, and you have to wait before fetching more nodes.
First, I would handle the error this way:
from facebook_business.exceptions import FacebookRequestError
...

for status in campaign_types:
    ad_sets_for_status = []
    for campaign_id in output[status]:
        # keep trying until the request succeeds
        while True:
            try:
                campaign_ad_sets = Campaign(campaign_id).get_ad_sets()
                break
            except FacebookRequestError as error:
                if error.api_error_code() in [17, 80000]:
                    time.sleep(sleepy_time)  # back off for a while, then retry
                else:
                    raise  # don't loop forever on unrelated errors
        for entry in campaign_ad_sets:
            ad_sets_for_status.append(entry['id'])
    ad_sets[status] = ad_sets_for_status
Beyond that, I'd suggest fetching the list of nodes from the account (by passing the desired node as the 'level' entry in params) and using batch calls: I can assure you that this will help a lot and will decrease the program's run time.
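For what it's worth, here is a rough sketch of the batch idea, assuming the SDK's new_batch() / batch= / success= / failure= pattern from the facebook_business docs; the field list and the 50-requests-per-batch chunking are my own choices, and output, app_id, app_secret and token are the variables from the question.
from facebook_business.api import FacebookAdsApi
from facebook_business.adobjects.campaign import Campaign

FacebookAdsApi.init(app_id, app_secret, token)
api = FacebookAdsApi.get_default_api()

ad_sets = {}

def make_success(status):
    def on_success(response):
        # one response per campaign request in the batch (first page only)
        for ad_set in response.json().get('data', []):
            ad_sets.setdefault(status, []).append(ad_set['id'])
    return on_success

def on_failure(response):
    # inspect the error and decide whether to back off and retry
    print(response.error())

for status, campaign_ids in output.items():
    # Graph API batches are limited to 50 requests, so chunk the campaign ids
    for i in range(0, len(campaign_ids), 50):
        batch = api.new_batch()
        for campaign_id in campaign_ids[i:i + 50]:
            Campaign(campaign_id).get_ad_sets(
                fields=['id'],
                batch=batch,
                success=make_success(status),
                failure=on_failure,
            )
        batch.execute()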
I hope I was helpful.

Simplest way to perform logging from Google Cloud Run

I followed this guide https://firebase.google.com/docs/hosting/cloud-run to set up a Cloud Run container.
Then I tried to follow this guide https://cloud.google.com/run/docs/logging to perform simple logging, writing a structured log entry to stdout.
This is my code:
import json

# Build structured log messages as an object.
global_log_fields = {}

trace_header = request.headers.get('X-Cloud-Trace-Context')  # `request` is the framework's incoming request

if trace_header:
    trace = trace_header.split('/')
    global_log_fields['logging.googleapis.com/trace'] = (
        "projects/sp-64d90/traces/" + trace[0])

# Complete a structured log entry.
entry = dict(severity='NOTICE',
             message='This is the default display field.',
             # Log viewer accesses 'component' as jsonPayload.component.
             component='arbitrary-property',
             **global_log_fields)

print(json.dumps(entry))
I cannot see this log in the Cloud Logs Viewer. I do see the HTTP GET logs each time I call the container.
Am I missing anything? I am new to this and wondered what is the simplest way to log information and view it, assuming the container I created followed exactly the steps from the guide (https://firebase.google.com/docs/hosting/cloud-run).
Thanks
I am running into the exact same issue. I did find that flushing stdout causes the logging to appear when it otherwise would not. Looks like a bug in Cloud Run to me.
import sys

print(json.dumps(entry))
sys.stdout.flush()
Output with flushing
For Python/Java:
Using the "google-cloud-logging" module is the easiest way to push container logs to Stackdriver Logging. Configure google-cloud-logging to work with Python's default logging module:
import logging as log
import google.cloud.logging as logging


def doSomething(param):
    logging_client = logging.Client()
    logging_client.setup_logging()
    log.info(f"Some log here: {param}")
Now you should see this log in Stackdriver Logging under the Cloud Run revision.
An easy way to integrate Google Cloud Platform logging into your Python code is to create a subclass from logging.StreamHandler. This way logging levels will also match those of Google Cloud Logging, enabling you to filter based on severity. This solution also works within Cloud Run containers.
Also you can just add this handler to any existing logger configuration, without needing to change current logging code.
import json
import logging
import os
import sys
from logging import StreamHandler

from flask import request


class GoogleCloudHandler(StreamHandler):
    def __init__(self):
        StreamHandler.__init__(self)

    def emit(self, record):
        msg = self.format(record)
        # Get project_id from Cloud Run environment
        project = os.environ.get('GOOGLE_CLOUD_PROJECT')

        # Build structured log messages as an object.
        global_log_fields = {}
        trace_header = request.headers.get('X-Cloud-Trace-Context')

        if trace_header and project:
            trace = trace_header.split('/')
            global_log_fields['logging.googleapis.com/trace'] = (
                f"projects/{project}/traces/{trace[0]}")

        # Complete a structured log entry (including the trace field built above).
        entry = dict(severity=record.levelname, message=msg, **global_log_fields)
        print(json.dumps(entry))
        sys.stdout.flush()
A way to configure and use the handler could be:
def get_logger():
    logger = logging.getLogger(__name__)
    if not logger.handlers:
        gcp_handler = GoogleCloudHandler()
        gcp_handler.setLevel(logging.DEBUG)

        gcp_formatter = logging.Formatter(
            '%(levelname)s %(asctime)s [%(filename)s:%(funcName)s:%(lineno)d] %(message)s')
        gcp_handler.setFormatter(gcp_formatter)
        logger.addHandler(gcp_handler)
    return logger
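Usage is then just standard logging; note that the handler reads request.headers, so it has to run inside a request context (for example, within a Flask view):
logger = get_logger()
logger.info('request received')        # appears with severity INFO
logger.error('something went wrong')   # appears with severity ERROR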
1. Follow the guide you mentioned: Serve dynamic content and host microservices with Cloud Run.
2. Add the following code to index.js:
const {Logging} = require('@google-cloud/logging');
const express = require('express');
const app = express();

app.get('/', (req, res) => {
  console.log('Hello world received a request.');
  const target = process.env.TARGET || 'World';

  const projectId = 'your-project';
  const logging = new Logging({projectId});

  // Selects the log to write to
  const log = logging.log("Cloud_Run_Logs");
  // The data to write to the log
  const text = 'Hello, world!';
  // The metadata associated with the entry
  const metadata = {
    resource: {type: 'global'},
    // See: https://cloud.google.com/logging/docs/reference/v2/rest/v2/LogEntry#logseverity
    severity: 'INFO',
  };

  // Prepares a log entry
  const entry = log.entry(metadata, text);

  async function writeLog() {
    // Writes the log entry
    await log.write(entry);
    console.log(`Logged the log that you just created: ${text}`);
  }
  writeLog();

  res.send(`Hello ${target}!`);
});

const port = process.env.PORT || 8080;
app.listen(port, () => {
  console.log('Hello world listening on port', port);
});
3. Check the logs under Logging/Global.
Edit
For Python:
import os
import logging

import google.cloud.logging
from flask import Flask

app = Flask(__name__)


@app.route('/')
def hello_world():
    target = os.environ.get('TARGET', 'World')
    # Instantiates a client
    client = google.cloud.logging.Client()
    # Connects the logger to the root logging handler; by default this captures
    # all logs at INFO level and higher
    client.setup_logging()
    # The data to log
    text = 'Hello, these are logs from cloud run!'
    # Emits the data using the standard logging module
    logging.warning(text)
    return 'Hello {}!\n'.format(target)
There is support for the Bunyan and Winston Node.js libraries in Google Cloud Logging:
https://cloud.google.com/logging/docs/setup/nodejs#using_bunyan
https://cloud.google.com/logging/docs/setup/nodejs#using_winston
Typically, if you are not looking to do structured logging, all you need to do is print things to stdout/stderr and Cloud Run will pick it up.
This is documented at https://cloud.google.com/run/docs/logging, which has Node.js examples for structured and unstructured logging as well.

get trace ID of sent request

I'm using the Open Tracing Python library for GRPC and am trying to build off of the example script here: https://github.com/opentracing-contrib/python-grpc/blob/master/examples/trivial/trivial_client.py.
Once I have sent a request through the intercepted channel, how do I find the trace-id value for the request? I want to use this to look at the traced data in the Jaeger UI.
I had missed a key piece of documentation. In order to get a trace ID, you have to create a span on the client side. This span will have the trace ID that can be used to examine data in the Jaeger UI. The span has to be added into the GRPC messages via an ActiveSpanSource instance.
import grpc

# opentracing-related imports
from grpc_opentracing import open_tracing_client_interceptor, ActiveSpanSource
from grpc_opentracing.grpcext import intercept_channel
from jaeger_client import Config


# dummy class to hold span data for passing into the GRPC channel
class FixedActiveSpanSource(ActiveSpanSource):
    def __init__(self):
        self.active_span = None

    def get_active_span(self):
        return self.active_span


config = Config(
    config={
        'sampler': {
            'type': 'const',
            'param': 1,
        },
        'logging': True,
    },
    service_name='foo')
tracer = config.initialize_tracer()

# ...
# In the method where GRPC requests are sent
# ...

active_span_source = FixedActiveSpanSource()
tracer_interceptor = open_tracing_client_interceptor(
    tracer,
    log_payloads=True,
    active_span_source=active_span_source)

with tracer.start_span('span-foo') as span:
    print(f"Created span: trace_id:{span.trace_id:x}, span_id:{span.span_id:x}, parent_id:{span.parent_id}, flags:{span.flags:x}")
    # provide the span to the GRPC interceptor here
    active_span_source.active_span = span
    with grpc.insecure_channel(...) as channel:
        channel = intercept_channel(channel, tracer_interceptor)
Of course, you could switch the ordering of the with statements so that the span is created after the GRPC channel. That part doesn't make any difference.
Correct me if I'm wrong. If you mean how to find the trace-id on the server side, you can try to access the OpenTracing span via get_active_span. The trace-id, I suppose, should be one of the tags in it.
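For reference, a rough, untested sketch of that server-side idea, assuming the server was wrapped with grpc_opentracing's open_tracing_server_interceptor via intercept_server and uses a jaeger_client tracer (the RPC method name is made up):
# hypothetical servicer method; the interceptor wraps the servicer context
def GetFoo(self, request, context):
    span = context.get_active_span()
    if span is not None:
        # with jaeger_client, the ids live on the span's context
        print(f'trace_id: {span.context.trace_id:x}')
    ...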

python firebase realtime listener

Hi there, I'm new to Python.
I would like to implement a listener on my Firebase DB.
When I change one or more parameters in the DB, my Python code has to do something.
How can I do it?
Thanks a lot.
My DB is a simple list of data from 001 to 200:
"remote-controller"
001 -> 000
002 -> 020
003 -> 230
My code is:
from firebase import firebase

firebase = firebase.FirebaseApplication('https://remote-controller.firebaseio.com/', None)
result = firebase.get('003', None)
print(result)
It looks like this is supported now (October 2018): although it's not documented in the 'Retrieving Data' guide, you can find the needed functionality in the API reference. I tested it, and it works like this:
def listener(event):
    print(event.event_type)  # can be 'put' or 'patch'
    print(event.path)        # relative to the reference, it seems
    print(event.data)        # new data at /reference/event.path. None if deleted

firebase_admin.db.reference('my/data/path').listen(listener)
As Peter Haddad suggested, you should use Pyrebase for achieving something like that given that the python SDK still does not support realtime event listeners.
import pyrebase

config = {
    "apiKey": "apiKey",
    "authDomain": "projectId.firebaseapp.com",
    "databaseURL": "https://databaseName.firebaseio.com",
    "storageBucket": "projectId.appspot.com"
}

firebase = pyrebase.initialize_app(config)
db = firebase.database()


def stream_handler(message):
    print(message["event"])  # put
    print(message["path"])   # /-K7yGTTEp7O549EzTYtI
    print(message["data"])   # {'title': 'Pyrebase', "body": "etc..."}


my_stream = db.child("posts").stream(stream_handler)
If anybody wants to create multiple listeners using the same listener function and wants more info about the triggered node, you can do it like this.
A normal listener function gets an Event object that only has the data, node name and event type. If you add multiple listeners and want to differentiate between the data changes, you can write your own class and add some info to it while creating the object.
class ListenerClass:
    def __init__(self, appname):
        self.appname = appname

    def listener(self, event):
        print(event.event_type)  # can be 'put' or 'patch'
        print(event.path)        # relative to the reference, it seems
        print(event.data)        # new data at /reference/event.path. None if deleted
        print(self.appname)      # extra data related to the change; add your own member variables
Creating Objects:
listenerObject = ListenerClass(my_app_name + '1')
db.reference('PatientMonitoring', app=obj).listen(listenerObject.listener)

listenerObject = ListenerClass(my_app_name + '2')
db.reference('SomeOtherPath', app=obj).listen(listenerObject.listener)
Full Code:
import firebase_admin
from firebase_admin import credentials
from firebase_admin import db

# Initialising database with credentials
json_path = r'E:\Projectz\FYP\FreshOnes\Python\PastLocations\fyp-healthapp-project-firebase-adminsdk-40qfo-f8fc938674.json'
my_app_name = 'fyp-healthapp-project'
xyz = {
    'databaseURL': 'https://{}.firebaseio.com'.format(my_app_name),
    'storageBucket': '{}.appspot.com'.format(my_app_name),
}
cred = credentials.Certificate(json_path)
obj = firebase_admin.initialize_app(cred, xyz, name=my_app_name)

# Create objects here. You can use loops to create many listeners, but each listener
# runs in its own thread, so don't create irrelevant listeners; it won't work on a
# machine with a thread constraint.
listenerObject = ListenerClass(my_app_name + '1')  # decide your own parameters for differentiating listeners
db.reference('PatientMonitoring', app=obj).listen(listenerObject.listener)

listenerObject = ListenerClass(my_app_name + '2')
db.reference('SomeOtherPath', app=obj).listen(listenerObject.listener)
As you can see on the per-language feature chart on the Firebase Admin SDK home page, Python and Go currently don't have realtime event listeners. If you need that on your backend, you'll have to use the node.js or Java SDKs.
You can use Pyrebase, which is a python wrapper for the Firebase API.
more info here:
https://github.com/thisbejim/Pyrebase
To retrieve data you need to use val(), example:
users = db.child("users").get()
print(users.val())
Python Firebase realtime listener, full code:
import firebase_admin
from firebase_admin import credentials
from firebase_admin import db


def listener(event):
    print(event.event_type)  # can be 'put' or 'patch'
    print(event.path)        # relative to the reference, it seems
    print(event.data)        # new data at /reference/event.path. None if deleted


json_path = r'E:\Projectz\FYP\FreshOnes\Python\PastLocations\fyp-healthapp-project-firebase-adminsdk-40qfo-f8fc938674.json'
my_app_name = 'fyp-healthapp-project'
xyz = {
    'databaseURL': 'https://{}.firebaseio.com'.format(my_app_name),
    'storageBucket': '{}.appspot.com'.format(my_app_name),
}
cred = credentials.Certificate(json_path)
obj = firebase_admin.initialize_app(cred, xyz, name=my_app_name)

db.reference('PatientMonitoring', app=obj).listen(listener)
Output:
put
/
{'n0': '40', 'n1': '71'} # the first time, it fetches the data at the path whether it has changed or not
put # on data change
/n1
725
put # on data change
/n0
401
