Fastest way to log to an external server - python

I'm working on a Python/Flask application and I have my logging handled on a different server. The way I currently have it set up is a function which sends a request to the external server whenever somebody visits a webpage.
This, of course, extends my TTB, because execution only continues after the request to the external server has completed. I've heard about threading, but have read that it also adds a little extra time.
Summary of current code:
import os
import requests
from flask import Flask

app = Flask(__name__)
log_auth_token = os.environ["log_auth"]

def send_log(data):
    post_data = {
        "data": data,
        "auth": log_auth_token
    }
    r = requests.post("https://example.com/log", data=post_data)

@app.route('/log')
def log():
    send_log("/log was just accessed")
    return "OK"
In short:
Intended behavior: User requests webpage -> User receives response -> Request is logged.
Current behavior: User requests webpage -> Request is logged -> User receives response.
What would be the fastest way to achieve my intended behavior?

What would be the fastest way to achieve my intended behavior?
Log locally and periodically ship the log files to the separate server. More specifically, create rotating log files and archive them so you don't end up with one huge file. One way to do this is to configure your reverse proxy (e.g. NGINX) to handle the access logging and rotation.
Alternatively, log locally and build a small application that lets you read the log files remotely.
Sending a log request to a separate server on every page view simply isn't efficient unless another process handles it. Users shouldn't have to wait for your log action to complete.
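If you would rather keep the rotation inside the Flask app than in the proxy, here is a minimal sketch using the standard library's RotatingFileHandler; the file name, size limit, and log format are placeholders, and shipping the archived files to the other server would still be a separate periodic job (cron/rsync, for example).
import logging
from logging.handlers import RotatingFileHandler
from flask import Flask, request

app = Flask(__name__)

# Rotate at ~1 MB and keep 5 archived files (placeholder values).
handler = RotatingFileHandler("access.log", maxBytes=1_000_000, backupCount=5)
handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))

access_logger = logging.getLogger("access")
access_logger.setLevel(logging.INFO)
access_logger.addHandler(handler)

@app.after_request
def log_request(response):
    # A local file write is fast, so the user barely waits for it.
    access_logger.info("%s %s %s", request.method, request.path, response.status_code)
    return response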

Related

Send data from Django to another server

I have an existing Django app. I would like to add a system that sends data from my Django application to another Python application hosted on another server, so that the Python application receives the data from the Django app, ideally in JSON format.
So, for example, I would need to create a view that every N seconds sends the data from a DB table to this application, or that sends the data to this external application whenever a form is submitted.
How can I do this? Is there an example for this particular case? I don't know which tools I'd need to build this system; I only know that I would need Celery to perform asynchronous tasks, but nothing else. Should I use webhooks, maybe? Or Django Channels?
Edit: adding some more context:
I have my Django client. Then I have one or two Python applications running on another server. On my Django client I have some forms. Once a form is submitted, the data is saved to the DB, but I also want this data to be sent instantly to my Python applications. The Python applications should receive the data from Django in JSON format, perform some tasks according to the values submitted by the user, and then send a response back to Django.
Here we go! I'll call your Django app "DjangoApp" and your Python apps, in Flask or another framework, "OtherApp".
First, as you predicted, you will need something capable of running background tasks. **The new Django 3.0 allows this, but I haven't used it yet ... so I will pass on a setup that I am using and that is fully functional with Django 2.8 and Python 3.8.**
On your DjangoApp server you will need to structure the communication with Celery; let's leave the tasks to it. You can read the Celery docs and this post; they cover this architecture well.
Regardless of what your form or Django app looks like, when you want it to trigger a task in Celery, it is basically a call to the function that transmits the data, but run in the background.
from .tasks import send_data
...
form.save()
# Create a function within the form to get the data the way you want it,
# or build the payload however you prefer.
values = form.new_function_serializedata()
send_data.delay(values)  # queue the Celery task
...
Also read the Celery documentation on calling tasks.
In all your other applications you will need a POST route to receive and deserialize this data; you can do this with a lightweight framework like Pyramid (or Flask, as sketched below).
This way, every time a form is submitted, the data is sent to the other server from within the send_data task.
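For the receiving side, here is a minimal sketch of such a POST route, written with Flask since the question mentions the other apps may use it; the /receive path and the processing step are hypothetical placeholders.
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/receive", methods=["POST"])  # hypothetical endpoint name
def receive():
    payload = request.get_json()  # the JSON sent by send_data
    # ... perform whatever task the submitted values call for ...
    return jsonify({"status": "received", "count": len(payload or {})})

if __name__ == "__main__":
    app.run(port=5001)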
In my experience, without knowing much more about your problem, I would use a similar architecture but drive it with Celery Beat.
from celery.schedules import crontab

CELERY_BEAT_SCHEDULE = {
    'send_data': {
        'task': 'your_app.tasks.send_data',
        'schedule': crontab(),  # configure your cron schedule here
    },
}
This isn't just a matter of dropping in the code above, but it is roughly what you would add.
Within your models I would add a boolean field such as sent. Then, every 2 seconds, 10 seconds, or whatever interval you like, filter all objects with sent=False and pass them to the send_data task; a sketch follows.
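To make the pieces above concrete, here is a minimal sketch of what tasks.py could look like; the Submission model, the sent field, the as_dict() helper, and the OtherApp URL are assumptions based on the description above, not code from the answer.
# your_app/tasks.py -- a sketch, not drop-in code
import requests
from celery import shared_task

OTHER_APP_URL = "http://otherapp.example.com/receive"  # hypothetical URL

@shared_task
def send_data(values=None):
    if values is not None:
        # Called from the form view: push one submission immediately.
        requests.post(OTHER_APP_URL, json=values, timeout=10)
        return

    # Called from Celery Beat: flush everything not yet sent.
    from your_app.models import Submission  # hypothetical model with a `sent` flag
    pending = Submission.objects.filter(sent=False)
    for obj in pending:
        requests.post(OTHER_APP_URL, json=obj.as_dict(), timeout=10)  # as_dict() is hypothetical
    pending.update(sent=True)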
That's a lot to explain, so I hope it isn't confusing. I hope this helps and answers your questions.
import requests
from django import http

def view(request):
    url = 'python.app.com'  # replace with the other python app's url or ip
    request_data = {'key': 'value'}  # replace with the data to be sent to the other app
    response = requests.post(url, json=request_data)
    response_data = response.json()  # data returned by the other app
    return http.JsonResponse(response_data)
This is an example of a function-based view that uses the requests library to hit an external service. The requests library takes care of encoding/decoding your data to/from JSON.
Yes, a webhook would be one of the options, but there are other options available too.
-> You can use REST APIs to send data from one app to another, but in that case you need to think about synchronization. It depends on your requirements: if you don't need the data delivered synchronously, you can use RabbitMQ or another async tool. Just push your REST API request into RabbitMQ and RabbitMQ will handle it; a sketch follows.
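Here is one way that push step might look with the pika client; this is a sketch that assumes a local broker and a placeholder queue name, and the consumer on the OtherApp side is not shown.
import json
import pika

# Connect to a local RabbitMQ broker (placeholder host).
connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.queue_declare(queue="django_to_otherapp", durable=True)

payload = {"form_id": 42, "values": {"key": "value"}}  # placeholder data
channel.basic_publish(
    exchange="",
    routing_key="django_to_otherapp",
    body=json.dumps(payload),
    properties=pika.BasicProperties(delivery_mode=2),  # persist the message
)
connection.close()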

Django celery and channels example

I have a Django app and need to generate files that can take up to a minute, so I pass that off to a background worker.
Currently, the process works as follows. I post to the server, which replies with a URL that I can poll. I then poll that URL every 2 seconds, and it either sends back "busy" or the URL of where the file is located in my S3 bucket.
I want to replace this polling with Django Channels, but I'm unsure of the best way to achieve this, as I can't find any examples online. Is Channels even intended for something like this?
My current thoughts are the following:
Start the file generation as soon as the client opens a connection on a specific route (previously this would have been a POST).
The background task gets started as soon as the client connects and receives the channel name as a parameter.
Once it is done, it sends the file path back to the consumer, which in turn sends it to the browser, where I'll use JS to create a download button.
Below is an example:
import json

from asgiref.sync import async_to_sync
from celery import shared_task
from channels.generic.websocket import WebsocketConsumer
from channels.layers import get_channel_layer

@shared_task
def my_bg_task(channel_name):
    # some long running calc here
    channel_layer = get_channel_layer()
    async_to_sync(channel_layer.send)(channel_name, {'type': 'generation_done', 'f_path': 'path/to/s3/bucket'})

class ReloadConsumer(WebsocketConsumer):
    def connect(self):
        my_bg_task.delay(self.channel_name)
        self.accept()

    def generation_done(self, event):
        self.send(text_data=json.dumps(event))
Is this the best way to achieve this?
Obviously, from a security point of view, it should not be accessible to anybody other than the user that opened the connection.

Can I persist an http connection (or other data) across Flask requests?

I'm working on a Flask app which retrieves the user's XML from the myanimelist.net API (sample), processes it, and returns some data. The data returned can be different depending on the Flask page being viewed by the user, but the initial process (retrieve the XML, create a User object, etc.) done before each request is always the same.
Currently, retrieving the XML from myanimelist.net is the bottleneck for my app's performance and adds on a good 500-1000ms to each request. Since all of the app's requests are to the myanimelist server, I'd like to know if there's a way to persist the http connection so that once the first request is made, subsequent requests will not take as long to load. I don't want to cache the entire XML because the data is subject to frequent change.
Here's the general overview of my app:
from flask import Flask
from functools import wraps
import requests

app = Flask(__name__)

def get_xml(f):
    @wraps(f)
    def wrap():
        # Get the XML before each app function
        r = requests.get('page_from_MAL')  # Current bottleneck
        user = User(data_from_r)  # User object
        response = f(user)
        return response
    return wrap

@app.route('/one')
@get_xml
def page_one(user_object):
    return 'some data from user_object'

@app.route('/two')
@get_xml
def page_two(user_object):
    return 'some other data from user_object'

if __name__ == '__main__':
    app.run()
So is there a way to persist the connection like I mentioned? Please let me know if I'm approaching this from the right direction.
I don't think you're approaching this from the right direction, because it turns your app into little more than a proxy for myanimelist.net.
What happens when you have 2000 users? Your app ends up making tons of requests to myanimelist.net, and a malicious user could easily DoS your app (or use it to DoS myanimelist.net).
This is a much cleaner way IMHO:
Server side (a minimal sketch follows this answer):
Create a websocket server (e.g. https://github.com/aaugustin/websockets/blob/master/example/server.py)
When a user connects to the websocket server, add the client to a list; remove it from the list on disconnect.
For every connected user, frequently check myanimelist.net to get the associated XML (maybe lowering the frequency as more users are online).
For every XML document, diff it against your server's local version, and send that diff to the client over the websocket channel (assuming there is a diff).
Client side:
On receiving a diff: update the local XML with the differences.
Disconnect from the websocket after n seconds of inactivity, and when disconnected, add a button to the interface to reconnect.
I doubt you can do much better, assuming myanimelist.net doesn't provide a "push" API.
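Here is a minimal sketch of the server side described above, using the websockets library; the polling interval, the fetch_user_xml helper, and the send-whole-document-on-change shortcut (instead of a real diff) are assumptions for illustration only.
import asyncio
import json
import websockets  # pip install websockets

CONNECTED = {}  # websocket -> last XML snapshot sent to that client

async def handler(ws, path=None):
    CONNECTED[ws] = None            # client connected: start tracking it
    try:
        async for _ in ws:          # keep the connection open; ignore client messages
            pass
    finally:
        CONNECTED.pop(ws, None)     # client disconnected: stop tracking it

async def fetch_user_xml(ws):
    # Hypothetical helper: fetch the myanimelist.net XML for the user
    # bound to this connection.
    return "<xml/>"

async def poll_loop():
    while True:
        for ws, last_xml in list(CONNECTED.items()):
            new_xml = await fetch_user_xml(ws)
            if new_xml != last_xml:
                # Simplification: push the whole document on change
                # instead of computing a real diff.
                await ws.send(json.dumps({"xml": new_xml}))
                CONNECTED[ws] = new_xml
        await asyncio.sleep(10)     # lower the frequency as the user count grows

async def main():
    async with websockets.serve(handler, "localhost", 8765):
        await poll_loop()

asyncio.run(main())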

Flask - Correct signal to subscribe to log after a request was finished?

I want to log requests (i.e. user page views) to a database, but I only want to log the request metadata to the DB after the request has finished and the data has been successfully sent to the client.
Is Flask's request_tearing_down the correct signal to subscribe to? What about request_finished?
It looks like you don't want request_finished. From the docs:
This signal is sent right before the response is sent to the client.
From what I gather, request_tearing_down is also triggered before a response is sent.
I don't think there is a specific signal you can subscribe to in order to do something after a response has been sent. You might be able to modify Flask's code to add one yourself.
Your best option might be to make the logging happen asynchronously so that it doesn't delay the response. You could do this yourself with threads or subprocesses, or you could use a library like Celery to do some of the work for you; a minimal threaded sketch is included below.
Also see this question
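Here is a minimal sketch of the thread-based workaround, assuming a hypothetical log_to_db() helper that writes the metadata to your database; Celery would replace the thread with a proper task queue.
import threading
from flask import Flask, request

app = Flask(__name__)

def log_to_db(metadata):
    # Hypothetical helper: insert `metadata` into your logging database.
    pass

@app.after_request
def queue_logging(response):
    metadata = {"path": request.path, "status": response.status_code}
    # after_request still runs before the response is sent, but handing the
    # write to a daemon thread means the client never waits for the database.
    threading.Thread(target=log_to_db, args=(metadata,), daemon=True).start()
    return response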

Funkload API testing

So I want to use FunkLoad to stress test an API. I have a set of URLs in the test.
The thing is, the authentication is sent via the query string on every request (no cookies involved),
so /abc?auth=token1 would be one user and /abc?auth=token2 another.
I have code similar to this:
class Simple(FunkLoadTestCase):

    def setUp(self):
        # fetch urls from a file ... ending up with something like
        urlList = ['http://localhost/abc?auth=1', 'http://localhost/def?auth=1']
        self.urlList = urlList

    def test_simple(self):
        for url in self.urlList:
            self.get(url, description='Get url')
The problem is that the server relies heavily on memcached, so running the same user concurrently x times only puts the server under proper load on the first request.
I am looking for a way to identify which concurrent user I am running as, so I can vary the authentication token per concurrent user.
Any ideas?
For anyone running into a similar problem: I figured out how to do it using the credential server. Instead of putting username:password entries in passwords.txt, I used name:authkey.
Another method, which doesn't scale to multiple load-generation servers, is that you have access to self.thread_id, so you can tell which thread/user you are running as, as sketched below.
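Here is a minimal sketch of that second, thread_id-based approach; the token list is a placeholder and the import path assumes a standard FunkLoad installation.
from funkload.FunkLoadTestCase import FunkLoadTestCase

class Simple(FunkLoadTestCase):

    def setUp(self):
        # One token per concurrent virtual user (placeholder values).
        tokens = ['token1', 'token2', 'token3']
        # self.thread_id identifies which concurrent user this test runs as,
        # so each virtual user gets its own auth key and its own cache entries.
        token = tokens[self.thread_id % len(tokens)]
        self.urlList = ['http://localhost/abc?auth=%s' % token,
                        'http://localhost/def?auth=%s' % token]

    def test_simple(self):
        for url in self.urlList:
            self.get(url, description='Get url')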
