Distributing socket.io connections between several servers - Python

I'm working on a DB design tool (Python, gevent-socketio). In this tool multiple users can discuss one DB model and receive changes at runtime. To support this feature I'm using socket.io. I'd like to be able to easily scale the number of servers that handle socket.io connections. The simplest way to do this is to configure nginx to choose a server based on the model ID.
I'd like a modulo approach, where the model ID is taken modulo the number of servers. So with 3 nodes, model 1 would be handled by the first, model 2 by the second, model 3 by the third, model 4 by the first again, and so on.
My request for model loading looks like /models/<model_id>, so there is no problem here - the argument can be parsed to find the server that should handle it. But after the model page is loaded, the JS tries to establish a connection:
var socket = io.connect('/models', {
    'reconnection limit': 4000
});
It connects to the default endpoint, so the server receives requests like:
http://example.com/socket.io/1/xhr-polling/111111?=1111111
To handle these, I create the application this way:
SocketIOServer((app.config['HOST'], app.config['PORT']), app, resource='socket.io', transports=transports).serve_forever()
and then
@bp.route('/<path:remaining>')
def socketio(remaining):
    app = current_app._get_current_object()
    try:
        # Hack: pass the app instead of the request to make it available in the namespace.
        socketio_manage(request.environ, {'/models': ModelsNamespace}, app)
    except Exception:
        app.logger.error("Exception while handling socket.io connection", exc_info=True)
    return Response()
I'd like to change it to
http://example.com/socket.io/<model_id>/1/xhr-polling/111111?=1111111
so that I can choose the right server in nginx. How can I do this?
UPDATE
I would also like to check the user's permissions when they try to establish a connection. I'd like to do this in the socketio(remaining) method, but again, I need to know which model they are trying to access.
UPDATE 2
I implemented a permission validator that takes the model_id from HTTP_REFERER. It seems to be the only part of the request that contains the model identifier (example value: http://example.com/models/1/).
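For reference, a rough sketch of what such a check could look like inside the socketio view above (user_can_access and current_user are hypothetical; the rest reuses the imports already shown):

import re

@bp.route('/<path:remaining>')
def socketio(remaining):
    app = current_app._get_current_object()
    # Pull the model id out of the Referer header and verify access first.
    referer = request.environ.get('HTTP_REFERER', '')
    match = re.search(r'/models/(\d+)', referer)
    if match is None or not user_can_access(current_user, int(match.group(1))):
        return Response(status=403)
    try:
        socketio_manage(request.environ, {'/models': ModelsNamespace}, app)
    except Exception:
        app.logger.error("Exception while handling socket.io connection", exc_info=True)
    return Response()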

The first idea is to tell the client side which servers are currently available.
You can also generate the server list for the client side in priority order - just put the servers into a JavaScript-generated array in that order.
This approach assumes that any of your servers can handle any model; you can control server load by changing the order of servers in the list generated for new clients.
I think this is the more flexible way. But if you want, you can parse the query string in nginx and route the request to any underlying server - just keep a table of model-ID-to-server-port mappings.
Upd: Thinking about your task some more, I found another solution. When you generate the client web page you can inline the server count in the JS somewhere. Then, when requesting model updates, just pass an extra parameter computed as
serverId = modelId % serversCount;
which will be the server identifier used for routing in nginx.
Then in the nginx config you can parse the query string and route the request to the server identified by the serverId parameter.
In "metalanguage" it would be:
read the GET parameter serverId into a variable $servPortSuffix
route the request to localhost:80$servPortSuffix
or whatever routing scheme you prefer.
You can add additional GET parameters to socket.io via
io.connect(url, {query: "foo=bar"})
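For example, a rough (untested) nginx sketch of that routing, assuming socket.io backends listening on ports 8000-8002 and a client that appends serverid=modelId % 3 through the query option above:

location /socket.io/ {
    # $arg_serverid holds the value of the "serverid" query parameter
    proxy_pass http://127.0.0.1:800$arg_serverid;
    proxy_set_header Host $host;
}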


Send data from Django to another server

I have an existing Django app. I would like to add a system that sends data from my Django application to another Python application hosted on another server, so that the Python application receives the data from the Django app, ideally in JSON format.
So, for example, I would need to create a view that sends the data from a DB table to this application every few seconds, or that sends the data to this external application whenever a form is submitted.
How can I do this? Is there an example for this particular case? I don't know what tools I'd need to build this system; I only know that I would need Celery to perform asynchronous tasks, but nothing else. Should I use webhooks, maybe? Or Django Channels?
Edit: adding some more context:
I have my Django client. Then I have one or two Python applications running on another server. On my Django client I have some forms. Once a form is submitted, the data is saved to the DB, but I also want this data to be sent instantly to my Python applications. The Python applications should receive the data from Django in JSON format and perform some tasks according to the values submitted by users. Then the application should send a response back to Django.
Let's go! I'll call your Django app "DjangoApp" here and your Python apps (in Flask or another framework) "OtherApp".
First, as you predicted, you will need a framework that is capable of running background tasks. The new Django 3.0 allows this, but I haven't used it yet; I will pass on to you something that I am using and that is fully functional with Django 2.8 and Python 3.8.
On your DjangoApp server you will need to structure the communication around Celery and leave the tasks to it. You can read the Celery docs and this post; they explain this architecture well.
Regardless of what your form or Django app looks like, when you want it to trigger a task in Celery, you basically call the data-transmitting function, but in the background:
from .tasks import send_data
...
form.save()
# Create a function within the form to get the data the way you want it,
# or build the payload however you prefer.
values = form.new_function_serializedata()
send_data.delay(values)  # call the Celery task asynchronously
...
Also read the Celery docs on calling tasks.
In all your other applications you will need a POST route to receive and deserialize this data; you can do this with a lightweight framework like Pyramid.
This way, every time a form is submitted, this data will be sent to the other server by the send_data task.
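For illustration, a minimal sketch of what the send_data task itself might look like (the OtherApp URL and payload handling are placeholders, not part of the original answer):

# tasks.py
import requests
from celery import shared_task

@shared_task
def send_data(values):
    # POST the serialized form data to OtherApp as JSON and fail loudly on errors.
    response = requests.post("http://otherapp.example.com/receive", json=values, timeout=10)
    response.raise_for_status()
    return response.status_code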
In my experience (though without knowing much about your problem), I would use a similar architecture, but with Celery Beat.
from celery.schedules import crontab

CELERY_BEAT_SCHEDULE = {
    'send_data': {
        'task': 'your_app.tasks.send_data',
        'schedule': crontab(),  # CONFIGURE YOUR CRON
    },
}
It's not just a matter of adding the code above; it's something along those lines.
Within your models I would add a field for sent. Then every 2 seconds, 10 seconds, or however often you like, I would filter all objects with sent=False and pass them to the send_data task.
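A rough sketch of that periodic task, assuming a model with a BooleanField named sent (the model and the push helper are illustrative):

from celery import shared_task
from .models import FormEntry  # hypothetical model with a "sent" BooleanField

@shared_task
def send_data():
    # Pick up everything not yet delivered and push it to OtherApp.
    for entry in FormEntry.objects.filter(sent=False):
        push_to_other_app(entry)  # e.g. a requests.post call as sketched earlier
        entry.sent = True
        entry.save(update_fields=["sent"])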
I don't know if this got confusing - it's a lot to explain. But I hope it helps and answers your questions.
import requests
from django import http

def view(request):
    url = 'http://python.app.com'  # replace with the other Python app's URL or IP
    request_data = {'key': 'value'}  # replace with the data to be sent to the other app
    response = requests.post(url, json=request_data)
    response_data = response.json()  # data returned by the other app
    return http.JsonResponse(response_data)
This is an example of a function-based view that uses the requests library to hit an external service. The requests library takes care of encoding/decoding your data to/from JSON.
Yeah, a webhook would be one of the options, but there are other options available too.
You can use REST APIs to send data from one app to another, but in that case you need to think about synchronization. It depends on your requirements: if you don't need the data delivered synchronously, you can use RabbitMQ or other async tools. Just push your REST API request into RabbitMQ and RabbitMQ will handle it.
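As a sketch of that async option, pushing a request onto a RabbitMQ queue with pika might look like this (queue name and connection details are placeholders):

import json
import pika

def push_request(payload):
    connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
    channel = connection.channel()
    channel.queue_declare(queue="api_requests", durable=True)
    # A consumer on the other server reads this message and performs the actual REST call.
    channel.basic_publish(exchange="", routing_key="api_requests", body=json.dumps(payload))
    connection.close()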

How can I configure a Flask API to allow for token-protected user-specific responses?

Say I have a server endpoint that does a simple task, like initializing an API with a token that I generate client side and then printing the user's account info.
Can I initialize() the API globally so the user can do other tasks after printing the account info?
and
How does that affect other users calling initialize() and printing their info if they do it at the same time?
I don't understand how this server works, and any help would be great. Thank you!
If I'm understanding you correctly, it sounds like the easiest way to accomplish what you're trying to do is to use the flask-login module.
You will want to create an endpoint / Flask route (for example, '/login') that the user will send a POST request to with a username (and usually also a password). When the login is successful, the user's browser will have a cookie set with a token that will allow them to access Flask routes that have the @login_required decorator attached.
In your Python code for those routes, you will be able to access the "current_user", which will allow you to tailor your response to the particular user sending the request.
This way of doing things will allow you to deal with your first question, which seems to be about saving a user's identity across requests, and it will also allow you to deal with your second question, which seems to be about the problem of having different users accessing the same endpoint while expecting different responses.
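For concreteness, a minimal flask-login sketch of that flow (the route names, in-memory user lookup and missing password check are placeholders, not a production setup):

from flask import Flask, request, jsonify
from flask_login import LoginManager, UserMixin, login_user, login_required, current_user

app = Flask(__name__)
app.secret_key = "change-me"  # required for the session cookie
login_manager = LoginManager(app)

class User(UserMixin):
    def __init__(self, user_id):
        self.id = user_id

@login_manager.user_loader
def load_user(user_id):
    return User(user_id)  # replace with a real lookup

@app.route("/login", methods=["POST"])
def login():
    data = request.get_json()
    # ... validate data["username"] / data["password"] here ...
    login_user(User(data["username"]))
    return jsonify(status="logged in")

@app.route("/account")
@login_required
def account():
    # current_user lets us tailor the response to the requesting user
    return jsonify(user=current_user.id)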
Nathan's answer points you in the right direction, but I'll make a comment about state and scope because you asked about multiple users.
Firstly, the normal way to do this would be for the server to generate the random token and send it back to the user.
client request 1 > init > create token > store along with init data > return token to user
client request 2 > something(token) > find data related to token > return data to user
When you run Flask (and most webapps) you try to make it so that each request is independent. In your case, the user calls an endpoint and identifies themselves by passing a token. If you make a record of that token somewhere (normally in a database) then you can identify that user's data by looking up the same token they pass on subsequent requests.
Where you choose to store that information is important. As I say a database is the normal recommended approach as it's built to be accessed safely by multiple people at the same time.
You might be tempted to not do the database thing and actually store the token / data information in a global variable inside python. Here's the reason why that's (probably) not going to work:
Flask is a WSGI application, and how it behaves when up and running depends on how it's served. I generally use uWSGI with several different processes. Each process has its own state and cannot see the others'. In this (standard / common) configuration, you don't know which process is going to receive any given request.
request 1 > init(token) > process 1 > store token in global vars
request 2 > other(token) > process 2 > can't find token in global vars
That's why we use a database to store all shared information:
request 1 > init(token) > process 1 > store token in db
request 2 > other(token) > process 2 > can find token in db
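As a small illustration of that flow, a sketch using Redis as the shared store (any database works; endpoint and key names are illustrative):

import secrets
import redis
from flask import Flask, jsonify, request, abort

app = Flask(__name__)
store = redis.Redis()  # shared by all worker processes

@app.route("/init", methods=["POST"])
def init():
    token = secrets.token_hex(16)  # the server generates the random token
    store.set(f"token:{token}", request.get_data() or b"{}", ex=3600)
    return jsonify(token=token)

@app.route("/something")
def something():
    token = request.args.get("token", "")
    data = store.get(f"token:{token}")  # any worker process can find the token
    if data is None:
        abort(401)
    return data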

Can I persist an http connection (or other data) across Flask requests?

I'm working on a Flask app which retrieves the user's XML from the myanimelist.net API (sample), processes it, and returns some data. The data returned can be different depending on the Flask page being viewed by the user, but the initial process (retrieve the XML, create a User object, etc.) done before each request is always the same.
Currently, retrieving the XML from myanimelist.net is the bottleneck for my app's performance and adds on a good 500-1000ms to each request. Since all of the app's requests are to the myanimelist server, I'd like to know if there's a way to persist the http connection so that once the first request is made, subsequent requests will not take as long to load. I don't want to cache the entire XML because the data is subject to frequent change.
Here's the general overview of my app:
from flask import Flask
from functools import wraps
import requests

app = Flask(__name__)

def get_xml(f):
    @wraps(f)
    def wrap():
        # Get the XML before each app function
        r = requests.get('page_from_MAL')  # Current bottleneck
        user = User(data_from_r)  # User object
        response = f(user)
        return response
    return wrap

@app.route('/one')
@get_xml
def page_one(user_object):
    return 'some data from user_object'

@app.route('/two')
@get_xml
def page_two(user_object):
    return 'some other data from user_object'

if __name__ == '__main__':
    app.run()
So is there a way to persist the connection like I mentioned? Please let me know if I'm approaching this from the right direction.
I don't think you're approaching this the right way, because you're positioning your app too much as a proxy for myanimelist.net.
What happens when you have 2000 users? Your app ends up making tons of requests to myanimelist.net, and a malicious user could easily DoS your app (or use it to DoS myanimelist.net).
This is a much cleaner way IMHO :
Server side:
Create a websocket server (e.g. https://github.com/aaugustin/websockets/blob/master/example/server.py)
When a user connects to the websocket server, add the client to a list; remove it from the list on disconnect.
For every connected user, frequently check myanimelist.net to get the associated XML (maybe lower the frequency as more users come online).
For every XML document, compute a diff against your server's local version and send that diff to the client over the websocket channel (assuming there is a diff) - a rough sketch follows at the end of this answer.
Client side:
On receiving a diff: update the local XML with the differences.
Disconnect from the websocket after n seconds of inactivity, and when disconnected add a button to the interface to reconnect.
I doubt you can do anything much better assuming myanimelist.net doesn't provide a "push" API.
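For illustration, a rough sketch of that server-side loop using the websockets package (handler signature as in recent versions; fetch_xml_for and diff_against_cache are hypothetical helpers, and the blocking fetch should really be an async HTTP call):

import asyncio
import websockets

CONNECTED = {}  # websocket -> username

async def handler(websocket):
    username = await websocket.recv()  # client identifies itself on connect
    CONNECTED[websocket] = username
    try:
        await websocket.wait_closed()
    finally:
        del CONNECTED[websocket]

async def poll_loop():
    while True:
        for ws, username in list(CONNECTED.items()):
            xml = fetch_xml_for(username)             # hypothetical: GET from myanimelist.net
            diff = diff_against_cache(username, xml)  # hypothetical diff helper
            if diff:
                await ws.send(diff)
        await asyncio.sleep(30)  # lower the frequency as the user count grows

async def main():
    async with websockets.serve(handler, "localhost", 8765):
        await poll_loop()

asyncio.run(main())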

Django, global variables and tokens

I'm using django to develop a website. On the server side, I need to transfer some data that must be processed on the second server (on a different machine). I then need a way to retrieve the processed data. I figured that the simplest would be to send back to the Django server a POST request, that would then be handled on a view dedicated for that job.
But I would like to add some minimum security to this process: When I transfer the data to the other machine, I want to join a randomly generated token to it. When I get the processed data back, I expect to also get back the same token, otherwise the request is ignored.
My problem is the following: How do I store the generated token on the Django server?
I could use a global variable, but from browsing here and there on the web I got the impression that global variables should not be used, for safety reasons (not that I really understand why).
I could store the token on disk or in a database, but that seems like an unjustified performance cost (even if in practice it would probably not change much).
Is there third solution, or a canonical way to do such a thing using Django?
You can store your token in the Django cache; it will be faster than database or disk storage in most cases.
Another approach is to use Redis.
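For example, with Django's cache framework (a minimal sketch; the key names, timeout and views are illustrative):

import secrets
from django.core.cache import cache
from django.http import HttpResponse, HttpResponseForbidden

# When sending data to the second server, attach a random token and remember it.
def send_for_processing(data):
    token = secrets.token_hex(16)
    cache.set(f"processing:{token}", True, timeout=600)
    # ... POST `data` and `token` to the other machine ...
    return token

# In the view that receives the processed data back, ignore unknown tokens.
def receive_processed(request):
    token = request.POST.get("token", "")
    if cache.get(f"processing:{token}") is None:
        return HttpResponseForbidden("unknown or expired token")
    cache.delete(f"processing:{token}")
    # ... handle the processed data ...
    return HttpResponse("ok")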
You can also calculate your token:
save some shared secret token in the settings of both servers
calculate a token based on the current timestamp rounded to 10 seconds, for example:
import hashlib, time

rounded_timestamp = int(time.time() // 10) * 10
token = hashlib.sha1(secret_token.encode())
token.update(str(rounded_timestamp).encode())
token = token.hexdigest()
If the token generated on the remote server when POSTing the request matches the token generated on the local server when it receives the response, the request is valid and can be processed.
The simple, obvious solution would be to store the token in your database. Other possible solutions are Redis or something similar. Finally, you could have a look at distributed async task queues like Celery...

Querying objects from MySQL with Python

Since I can't explain clearly what I don't understand, I'll use an example.
Let's say I have a client application and a server application. The server waits, and when the client sends some keyword, the server knows what should be queried. Let's say the client requests a product object, so the server queries the database and gets back the row that the client needs as a result set. So every time I need some object, do I need to send it to the client in the form of a string and then instantiate it?
Am I missing something? Isn't it expensive to instantiate objects on every query?
TIA!
Your question is very vague and doesn't really ask a specific question, but I'll try to give you a generic answer about how a server and a client interact.
When a user requests an item in the client, you should provide the client with an API to the server, something like http://example.com/search?param=test. The client will use this API in either an AJAX call or a direct call.
The server should parse the param, connect to the database, retrieve the requested item and return it to the client. The most common formats for this exchange are JSON and plain text.
The client will then parse that data, generate an object from it if required, and finally show the user the requested data.
If this is not what you need, please update your question to ask specifically about the issue you have, and maybe provide some code where you have the issue, and I'll update my answer accordingly.
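To make that concrete, here is a minimal sketch of such a server endpoint using Flask and MySQL Connector/Python (table, columns and credentials are placeholders):

import mysql.connector
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/search")
def search():
    param = request.args.get("param", "")
    conn = mysql.connector.connect(user="app", password="secret", database="shop")
    try:
        cur = conn.cursor(dictionary=True)  # rows come back as dicts
        cur.execute("SELECT id, name FROM products WHERE name = %s", (param,))
        rows = cur.fetchall()
    finally:
        conn.close()
    return jsonify(rows)  # the client parses this JSON and builds its objects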
MySQL Server uses a custom protocol over TCP. If you don't want to use any library, you will have to parse those TCP messages yourself. MySQL Connector/Python does exactly that - you can look at its source code if you wish.
