Keeping Events Synced Between Websocket and REST - python

Is there a best practice for keeping data in sync (and auditing it) between a websocket feed and a REST API? My current setup relied on websockets and was working, but I'm starting to see feed issues and disconnects during busy times on the far end, so I need to look at keeping the websocket and REST data in sync without risking any locking, as the volume is pretty heavy.
Current Setup:
A Postgres database cluster that stores the readings I need to track. The main feed comes from a pub/sub connection where a primary and a secondary websocket connector each receive a timestamped event with a few fields, which I then write to the database. I see about 50-100 events a second.
The problem is that the sockets have started to fail on the far end (both the active and standby connections drop), and it can potentially be 30 minutes before I get data again, which leaves me with a gaping hole in my series.
Is there a better way to architect this so I can consume both REST and websocket data, or am I worrying too much about the database and any locking issues?
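One common approach is to keep writing the live websocket events as they arrive and run a separate reconciliation job that scans the timestamped series for gaps and backfills them from the REST endpoint with an idempotent insert, so the two sources never fight over a row. A minimal sketch, where the readings table, its unique ts column, the REST URL and fetch_events_rest are all assumptions about your setup:
import datetime
import json
import psycopg2
import requests

EXPECTED_GAP = datetime.timedelta(seconds=30)  # assumed: anything larger is a hole in the feed

def fetch_events_rest(start, end):
    # hypothetical REST call that returns the events for the missing window
    resp = requests.get("https://api.example.com/events",  # placeholder URL
                        params={"start": start.isoformat(), "end": end.isoformat()})
    resp.raise_for_status()
    return resp.json()

def reconcile(conn):
    with conn.cursor() as cur:
        # find adjacent readings that are further apart than the expected cadence
        cur.execute("""
            SELECT ts, next_ts FROM (
                SELECT ts, lead(ts) OVER (ORDER BY ts) AS next_ts
                FROM readings
                WHERE ts > now() - interval '1 day'
            ) gaps
            WHERE next_ts - ts > %s
        """, (EXPECTED_GAP,))
        for start, end in cur.fetchall():
            for event in fetch_events_rest(start, end):
                # assumes a unique index on ts; ON CONFLICT keeps the backfill
                # idempotent if the websocket recovers mid-window
                cur.execute(
                    "INSERT INTO readings (ts, payload) VALUES (%s, %s) "
                    "ON CONFLICT (ts) DO NOTHING",
                    (event["ts"], json.dumps(event)))
    conn.commit()

reconcile(psycopg2.connect("dbname=readings"))  # e.g. run from cron every few minutes
Because the insert is ON CONFLICT DO NOTHING, the reconciler and the live websocket writers only contend on ordinary row-level locks, so the backfill shouldn't block the hot path.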

Related

Update single database value on a website with many users

For this question, I'm particularly struggling with how to structure this:
User accesses website
User clicks button
Value x in database increments
My issue is that multiple people could be on the website at the same time and click the button. I want each user to be able to click the button, update the value, and read the incremented value, but I don't know how to avoid synchronisation/concurrency issues.
I'm using flask to run my website backend, and I'm thinking of using MongoDB or Redis to store my single value that needs to be updated.
Please comment if there is any lack of clarity in my question, but this is a problem I've really been struggling with how to solve.
Thanks :)
Redis: I think you can use the Redis HINCRBY command, or create a distributed lock to make sure there is only one writer at a time and only the lock-holding writer can make the update from your Flask app. Make sure you release the lock after a certain period of time, or once the writer is done with it.
MySQL: you can start a transaction, make the update, and commit the change to make sure the data is right.
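As a minimal sketch of the Redis option with Flask and redis-py (the hash name counters and field clicks are made-up names), note that HINCRBY is atomic on the Redis server, so concurrent clicks can't lose an increment and no application-side lock is needed for the simple case:
import redis
from flask import Flask, jsonify

app = Flask(__name__)
r = redis.Redis(host="localhost", port=6379)

@app.route("/click", methods=["POST"])
def click():
    # HINCRBY is atomic in Redis, so concurrent requests cannot lose an increment;
    # it also returns the new value, so each user reads the count their click produced
    new_value = r.hincrby("counters", "clicks", 1)
    return jsonify({"clicks": new_value})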
To solve this problem I would suggest you follow a microservice architecture.
A service called worker would handle the Flask route that's called when the user clicks on the link/button on the website. It would generate a message to be sent to another service called queue manager that maintains a queue of increment/decrement messages from the worker service.
There can be multiple worker service instances running concurrently, but the queue manager is a singleton service that takes the messages from each worker and adds them to the queue. If the queue manager is busy, the worker service will either time out and retry or return a failure message to the user. If the queue is full, a response is sent back to the worker telling it to retry up to n times, counting n down with each attempt.
A third service called storage manager runs whenever the queue is not empty; this service sends the messages to the storage solution (whether Mongo, Redis, or good ol' SQL) and ensures the increment/decrement messages are handled in the order they were received in the queue. You could also include a timestamp from the worker service in the message if you wanted to use that to sort the queue.
Generally, whatever hosting environment you use for Flask will run gunicorn as the production web server and support multiple concurrent worker instances to handle the HTTP requests, and these would naturally be your worker service.
How you build and coordinate the queue manager and storage manager is down to implementation preference; for instance, you could use something like Google Cloud's Pub/Sub system to send messages between the different deployed services, but that's just off the top of my head. There are plenty of different ways to do it, and you're in the best position to decide.
Without knowing more about what you're trying to achieve and what your requirements for concurrent traffic are, I can't go into greater detail, but that's roughly how I've approached this type of problem in the past. If you need to handle more concurrent users on the website, you can pick a hosting solution with more concurrent workers. If you need the queue to be longer, you can pick a host with more memory, or else write the queue to intermediate storage; this will slow things down but will make recovering from a crash easier.
You also need to consider how to handle messages that fail between services, and how to recover from a service crashing or the queue filling up.
EDIT: I've been thinking about this over the weekend, and a much simpler solution is to just insert a new record into a table directly from the Flask route that handles user clicks. Then to get your total you just take a count of this table. Your bottlenecks will be how many concurrent workers your Flask hosting environment supports and how many concurrent connections your storage supports; both can be addressed by throwing more resources at them.
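A minimal sketch of that simpler approach with Flask and SQLite (table and file names are made up; any SQL store would do the same):
import sqlite3
from flask import Flask, jsonify

app = Flask(__name__)
DB = "clicks.db"

def get_db():
    conn = sqlite3.connect(DB)
    conn.execute("CREATE TABLE IF NOT EXISTS clicks "
                 "(id INTEGER PRIMARY KEY, ts TEXT DEFAULT CURRENT_TIMESTAMP)")
    return conn

@app.route("/click", methods=["POST"])
def click():
    # every click is its own row, so concurrent users never contend on one counter row
    with get_db() as conn:
        conn.execute("INSERT INTO clicks DEFAULT VALUES")
    return total()

@app.route("/total")
def total():
    with get_db() as conn:
        (count,) = conn.execute("SELECT COUNT(*) FROM clicks").fetchone()
    return jsonify({"clicks": count})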

Choosing the right paradigm to implement a specific programming task

I have the following architecture:
The Main Control Unit (MCU) here must: run a TCP/IP server for communication with the robot; run the Redis database; and be able to run several data processing programs (which use data sent by the robot or obtained from Redis). The External Control Unit is connected to the Redis DB.
I have to write the program for the MCU. Ideally it must be able to perform the following tasks asynchronously:
Get a request from the robot and pass it to the Redis DB, so the External Control Unit can react to the signal's appearance and start acquiring data from its sensor (then publish this data to the Redis DB).
Get a request from the robot to start receiving data from the robot (then publish this data to the Redis DB).
React to the appearance of data from the External Control Unit in the Redis DB. This must cause the Main Control Unit to start a data processing program using the obtained sensor data.
Get a request from the robot to send the resulting data back to the robot.
This is a simplified version of the system, since there will be more External Control Units with different sensors, but most of the MCU's tasks are described.
For now I can handle data transmission between the MCU and the robot, and I'm pretty familiar with Redis's publish/subscribe mechanism.
But I'm struggling to choose the proper technology/paradigm to program the MCU: asynchronous, multithreading, or multiprocessing programming? Where should I dig?
Addition to question:
Which paradigm (asynchronous/multithreading/multiprocessing programming) is best for implementing the following MCU behaviour:
The MCU receives a request from the robot to start a computer vision routine (it takes around 30-40 seconds to finish). The routine can be started only if the necessary data is found in the Redis DB, so the MCU may or may not have to wait until the External Control Unit finishes publishing this data to the DB. After the routine has started, another request from the robot can arrive during those 30-40 seconds, and it must be processed while the CV routine is running.
Today I studied Python's asyncio module, and to my mind it's not suitable for implementing what I want. It is ideal for processing multiple client requests on a server, or for one client fetching data from several servers: a coroutine's waiting is essential, so the program can do something else while the coroutine waits for something. But my CV routine doesn't wait for anything; it just runs.
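For what it's worth, a blocking, CPU-bound routine can still coexist with asyncio if it is handed to a process pool: the event loop only awaits the future, so other robot requests keep being handled while the CV job runs elsewhere. A minimal sketch, with cv_routine and the request strings as made-up placeholders:
import asyncio
import time
from concurrent.futures import ProcessPoolExecutor

def cv_routine(sensor_data):
    # placeholder for the real 30-40 s computer vision job (CPU bound, never awaits)
    time.sleep(5)
    return "cv result for " + sensor_data

async def handle_robot_request(request, pool):
    loop = asyncio.get_running_loop()
    if request == "start_cv":
        # the blocking routine runs in a separate process; this coroutine only
        # awaits the future, so the event loop stays free for other requests
        result = await loop.run_in_executor(pool, cv_routine, "sensor-data")
        print("CV finished:", result)
    else:
        print("handled", request)  # served immediately, even while the CV job runs

async def main():
    with ProcessPoolExecutor(max_workers=2) as pool:
        await asyncio.gather(
            handle_robot_request("start_cv", pool),
            handle_robot_request("status", pool))

if __name__ == "__main__":
    asyncio.run(main())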

How much data can a websocket consumer handle?

I've built a simple application using Django Channels where a consumer receives data from an external service; that consumer then sends the data to subscribers on another consumer.
I'm new to websockets and I have a little concern: the consumer is receiving a lot of data, on the order of 100 (or more) JSON records per second. At what point should I be worried about this service crashing or running into performance issues? Is there some sort of limit for what I'm doing?
There is no explicit limit; however, it is worth noting that each instance (open connection) of the consumer can only process one WS message at a time.
So if you have a single websocket connection and are sending lots and lots of WS messages down it, and the consumer does work on each message (e.g. writes it to the db), the queue of messages might fill up and you will get an error.
There are a few solutions to this:
Open multiple ws connections and share out the load
In your consumer, before doing any work that will take time, put it onto a work queue and have some background task (that you do not await) consume it.
For this second option it is probably a good idea to create this background queue in your connect method and handle shutting it down (and waiting for it to flush everything) in the disconnect method.
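A minimal sketch of that second option with Django Channels' AsyncWebsocketConsumer, where handle() is a placeholder for your real per-message work (e.g. the db write):
import asyncio
from channels.generic.websocket import AsyncWebsocketConsumer

class FeedConsumer(AsyncWebsocketConsumer):
    async def connect(self):
        self.queue = asyncio.Queue(maxsize=1000)   # buffer of pending messages
        self.worker = asyncio.create_task(self._worker())
        await self.accept()

    async def receive(self, text_data=None, bytes_data=None):
        # enqueue immediately so the consumer is free for the next WS message
        await self.queue.put(text_data)

    async def _worker(self):
        # background task: drain the queue and do the slow work off the hot path
        while True:
            message = await self.queue.get()
            await self.handle(message)             # e.g. write to the db
            self.queue.task_done()

    async def handle(self, message):
        pass                                       # placeholder for the real work

    async def disconnect(self, close_code):
        await self.queue.join()                    # flush whatever is still buffered
        self.worker.cancel()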
--
If you are expecting a massive amount of data and don't want to fork out for a costly (high-memory) VM, you might be better off using a server that is not written in Python.
My suggestion for a Python developer would be https://docs.vapor.codes/4.0/websockets/. This is a Swift server framework; linguistically, Swift is very close to type-annotated Python, so it is easier for a Python dev to pick up than other high-performance options.

Data buffering/storage - Python

I am writing an embedded application that reads data from a set of sensors and uploads it to a central server. This application is written in Python and runs on a Raspberry Pi unit.
The data needs to be collected every minute; however, the Internet connection is unstable, so I need to buffer the data to non-volatile storage (SD card, etc.) whenever there is no connection. The buffered data should be uploaded as and when the connection comes back.
Presently, I'm thinking about storing the buffered data in an SQLite database and writing a cron job that can continuously read the data from this database and upload it.
Is there a Python module that can be used for such a feature?
Is there a Python module that can be used for such a feature?
I'm not aware of any readily available module, however it should be quite straightforward to build one. Given your requirement:
the Internet connection is unstable, so I need to buffer the data to non-volatile storage (SD card, etc.) whenever there is no connection. The buffered data should be uploaded as and when the connection comes back.
The algorithm looks something like this (pseudo code):
# buffering module
data = read(sensors)
db.insert(data)

# upload module
# e.g. scheduled every 5 minutes via cron
data = db.read(created > last_successful_upload)
success = upload(data)
if success:
    last_successful_upload = max(data.created)
The key is to separate the buffering and uploading concerns. That is, when reading data from the sensor, don't attempt to upload immediately; always upload from the scheduled module. This keeps the two modules simple and stable.
There are, however, a few edge cases you need to concern yourself with to make this work reliably:
inserting data while an upload is in progress
SQLite doesn't handle being accessed from multiple processes well
To solve this, you might want to consider another database, or create multiple SQLite databases or even flat files for each batch of uploads.
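Fleshing out the pseudocode above with sqlite3 and requests (the endpoint URL and the calling schedule are placeholders, and an uploaded flag is used instead of the last_successful_upload timestamp so no clock comparison is needed):
import sqlite3
import requests

DB = "buffer.db"

def init_db():
    with sqlite3.connect(DB) as conn:
        conn.execute("""CREATE TABLE IF NOT EXISTS readings (
                            id INTEGER PRIMARY KEY,
                            created TEXT DEFAULT CURRENT_TIMESTAMP,
                            payload TEXT,
                            uploaded INTEGER DEFAULT 0)""")

def buffer_reading(payload):
    # called every minute by the sensor-reading module
    with sqlite3.connect(DB) as conn:
        conn.execute("INSERT INTO readings (payload) VALUES (?)", (payload,))

def upload_pending():
    # called every few minutes by cron; rows are marked only after a successful upload
    with sqlite3.connect(DB) as conn:
        rows = conn.execute(
            "SELECT id, payload FROM readings WHERE uploaded = 0 ORDER BY id").fetchall()
        if not rows:
            return
        try:
            resp = requests.post("https://example.com/ingest",      # placeholder endpoint
                                 json=[payload for _, payload in rows],
                                 timeout=10)
            resp.raise_for_status()
        except requests.RequestException:
            return                                                  # connection still down
        conn.executemany("UPDATE readings SET uploaded = 1 WHERE id = ?",
                         [(row_id,) for row_id, _ in rows])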
If you mean a module to work with an SQLite database, check out SQLAlchemy.
If you mean a module which can do what cron does, check out sched, a Python event scheduler.
However, this looks like a perfect place to implement a task queue, using either a dedicated task broker (RabbitMQ, Redis, ZeroMQ, ...) or Python's threads and queues. In general, you submit an upload task, a worker thread picks it up and executes it, and the task broker handles retries and failures. All this happens asynchronously, without blocking your main app.
UPD: Just to clarify, you don't need the database if you use a task broker, because a task broker stores the tasks for you.
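As a minimal sketch of the threads-and-queues variant (read_sensors and upload are placeholders for the real sensor read and HTTP upload):
import queue
import threading
import time

upload_queue = queue.Queue()

def read_sensors():
    # placeholder for the real sensor read
    return {"ts": time.time(), "value": 42}

def upload(batch):
    # placeholder: send the batch to the central server, return True on success
    return True

def uploader():
    # background worker: pull readings off the queue and retry until each uploads
    while True:
        batch = upload_queue.get()
        while not upload(batch):
            time.sleep(60)              # connection is down, retry in a minute
        upload_queue.task_done()

threading.Thread(target=uploader, daemon=True).start()

while True:                             # main loop: one reading per minute
    upload_queue.put(read_sensors())
    time.sleep(60)
Note that an in-memory queue is lost if the process crashes, which is where a broker or the SQLite buffer from the other answers comes back in.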
This is purely database work. You can create master and slave databases in different locations, and if one is not on the network, it will run with the last synced info.
And when the connection comes back, it merges all the data.
Take a look at this answer and search for master and slave database.

ZeroMQ is too fast for database transaction

Inside a web application (Pyramid) I create certain objects on POST which need some work done on them (mainly fetching something from the web). These objects are persisted to a PostgreSQL database with the help of SQLAlchemy. Since these tasks can take a while, the work is not done inside the request handler but rather offloaded to a daemon process on a different host. When the object is created I take its ID (which is a client-side generated UUID) and send it via ZeroMQ to the daemon process. The daemon receives the ID, fetches the object from the database, does its work and writes the result to the database.
Problem: the daemon can receive the ID before its creating transaction is committed. Since we are using pyramid_tm, all database transactions are committed when the request handler returns without an error, and I would rather leave it this way. On my dev system everything runs on the same box, so ZeroMQ is lightning fast. On the production system this is most likely not an issue since the web application and the daemon run on different hosts, but I don't want to count on this.
This problem only recently manifested itself, since we previously used MongoDB with a write_concern of 2. Having only two database servers, the write on the entity always blocked the web request until the entity was persisted (which obviously is not the greatest idea).
Has anyone run into a similar problem?
How did you solve it?
I see multiple possible solutions, but most of them don't satisfy me:
Flushing the transaction manually before triggering the ZMQ message. However, I currently use SQLAlchemy's after_created event to trigger it, and this is really nice since it decouples the process completely and thus eliminates the risk of "forgetting" to tell the daemon to work. I also think that I would still need a READ UNCOMMITTED isolation level on the daemon side; is this correct?
Adding a timestamp to the ZMQ message, causing the worker thread that receives the message to wait before processing the object. This obviously limits throughput.
Ditch ZMQ completely and simply poll the database. Noooo!
I would just use PostgreSQL's LISTEN and NOTIFY functionality. The worker can connect to the SQL server (which it already has to do) and issue the appropriate LISTEN; PostgreSQL will then let it know when the relevant transactions have finished. Your trigger for generating the notifications in the SQL server could probably even send the entire row in the payload, so the worker doesn't even have to request anything:
CREATE OR REPLACE FUNCTION magic_notifier() RETURNS trigger AS $$
BEGIN
PERFORM pg_notify('stuffdone', row_to_json(new)::text);
RETURN new;
END;
$$ LANGUAGE plpgsql;
With that, as soon as the worker knows there is work to do it already has the necessary information, so it can begin without another round trip.
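On the worker side, a minimal psycopg2 listener sketch might look like the following; the channel name stuffdone matches the trigger above, while the DSN and the CREATE TRIGGER wiring onto your table are assumptions:
import json
import select
import psycopg2

# assumed DSN; the trigger function above still needs to be attached to your table, e.g.
#   CREATE TRIGGER notify_new_row AFTER INSERT ON my_table
#   FOR EACH ROW EXECUTE PROCEDURE magic_notifier();
conn = psycopg2.connect("dbname=mydb user=worker")
conn.autocommit = True                      # LISTEN/NOTIFY needs no open transaction

cur = conn.cursor()
cur.execute("LISTEN stuffdone;")

while True:
    # block until the connection's socket is readable (5 s timeout for housekeeping)
    if select.select([conn], [], [], 5) == ([], [], []):
        continue
    conn.poll()
    while conn.notifies:
        notify = conn.notifies.pop(0)
        row = json.loads(notify.payload)    # the row sent by row_to_json() in the trigger
        print("work to do:", row)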
This comes close to your second solution:
Create a buffer, drop the IDs from your ZeroMQ messages in there, and let your worker poll this ID pool regularly. If it fails to retrieve an object for an ID from the database, let the ID sit in the pool until the next poll; otherwise remove the ID from the pool.
You have to deal somehow with the asynchronous behaviour of your system. If the IDs constantly arrive before the object is persisted in the database, it doesn't matter whether pooling the IDs (and re-polling the same ID) reduces throughput, because the bottleneck is earlier.
An upside is that you could run multiple frontends in front of this.
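A minimal sketch of that ID pool with pyzmq, where fetch_object stands in for the SQLAlchemy query and the endpoint is made up:
import time
import zmq

def fetch_object(object_id):
    # placeholder: look the object up (e.g. a SQLAlchemy query); return None if
    # the creating transaction has not committed yet
    return None

context = zmq.Context()
socket = context.socket(zmq.PULL)
socket.bind("tcp://*:5555")              # assumed endpoint for the web app's PUSH socket

pending = set()                          # the ID pool

while True:
    # drain any newly arrived IDs without blocking
    while True:
        try:
            pending.add(socket.recv_string(flags=zmq.NOBLOCK))
        except zmq.Again:
            break

    # poll the pool; IDs whose rows are not visible yet stay for the next round
    for object_id in list(pending):
        obj = fetch_object(object_id)
        if obj is not None:
            print("processing", object_id)
            pending.discard(object_id)

    time.sleep(1)                        # poll interval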
