I have a problem as follows:
Server process 1: constantly sends out updates that occur to a datastore.
Server process 2: clients contact the server, which queries the datastore and returns a result.
The thing is, the results that process 1 and process 2 are sending back to the client are totally different and unrelated.
How does one decompose this?
Do you just have one process constantly sending data, and define the protocol so that each message carries a flag saying whether it is a type-1 update or a type-2 query result (a small sketch of this follows below)?
Do you have two processes? How do they share the datastore then (it is just a structure, not a database)?
Thanks!
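If you go the single-connection route, one way to tell the two kinds of traffic apart is a small type-tagged envelope around every message. This is only a sketch under made-up names (encode, dispatch, the "kind" values); the real framing and field names are up to you.

    import json

    def encode(kind, payload, **extra):
        """Wrap an outgoing message in a small envelope carrying a type tag."""
        return json.dumps({"kind": kind, "payload": payload, **extra}).encode()

    def dispatch(raw):
        """Route an incoming frame by its tag on the client side."""
        msg = json.loads(raw)
        if msg["kind"] == "update":        # unsolicited push from server process 1
            print("live update:", msg["payload"])
        elif msg["kind"] == "response":    # reply to a query handled by server process 2
            print("answer to query", msg["request_id"], ":", msg["payload"])

    dispatch(encode("update", {"sensor": 3, "value": 17}))
    dispatch(encode("response", [1, 2, 3], request_id=7))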
It sounds like you want to stream your series of ints "somewhere" and also collect them in a datastore. In my system I am streaming sensor readings into a database and also allowing them to go directly to web clients, giving them live power readings. I've written a blog entry on why a database is not suitable for live data - though it is perfect for saving the data for later analysis.
I'd have the first server process be a Twisted server that uses txamp to stream the ints to RabbitMQ. Any clients that want live data can subscribe to the stream in RabbitMQ, also using txamp. Web browser clients can use Orbited; here is a worked example.
In your design, server 1 saves to the database. You could instead have server 3 collect data from RabbitMQ and stream it to the database. I plan to have a server that collects chunks of data and renders graphs to store on a central fileshare.
Don't create your own messaging system; RabbitMQ is well tested, scalable, and can persist your "messages" (raw data) if something goes wrong.
If you can restrict yourself to Twisted, I recommend using Perspective Broker. It's essentially an RPC system, and doesn't care much about the notion of "client" and "server" - either the initiator of a TCP connection or the responder can start RPC calls in PB.
So server 1 would accept registration calls with a callback object, and call the callback whenever it has new data available. Server 2 provides various RPC operations as clients require them. If they operate on the very same data, I would put both servers into a single process.
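For concreteness, here is a minimal Perspective Broker sketch of that registration idea; the class and method names (DataSource, remote_register, "update") are invented for illustration.

    from twisted.internet import reactor, task
    from twisted.spread import pb

    class DataSource(pb.Root):
        """Server 1: accepts callback registrations and pushes new data to them."""
        def __init__(self):
            self.subscribers = []

        def remote_register(self, callback):
            # 'callback' arrives as a pb.RemoteReference to an object on the client
            self.subscribers.append(callback)

        def publish(self, value):
            for cb in self.subscribers:
                cb.callRemote("update", value)   # invokes remote_update() on the client

    source = DataSource()
    reactor.listenTCP(8789, pb.PBServerFactory(source))
    task.LoopingCall(source.publish, 42).start(5.0)   # push a dummy value every 5 seconds
    reactor.run()

On the client side you would connect with pb.PBClientFactory, fetch the root object, and pass it a pb.Referenceable whose remote_update method receives each value.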
Why not use a database instead of "just a structure"? Both relational and non-relational DBs offer many practical advantages (separate processes can use them, they take care of replication and/or snapshots, backups, ..., and they offer rich functionality if you need it for the "queries", and so on, and so forth).
Worst case, the "just a structure" can be handled by a third process that's entirely dedicated to it (basically mimicking what any DB engine would offer -- though the engine would probably do it better and faster;-), allowing you to at least keep a good decomposition (with the two server processes both interacting with the "datastore process").
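If you do stay with "just a structure", a sketch of that dedicated datastore process using multiprocessing.managers might look like the following (the port, authkey, and registered name are made up); both server processes connect to it and manipulate the shared dict through ordinary method calls on the proxy.

    # datastore_process.py -- owns the shared structure and nothing else
    from multiprocessing.managers import BaseManager

    store = {}

    class StoreManager(BaseManager):
        pass

    StoreManager.register("get_store", callable=lambda: store)

    if __name__ == "__main__":
        manager = StoreManager(address=("127.0.0.1", 50000), authkey=b"not-a-secret")
        manager.get_server().serve_forever()

And in either server process:

    from multiprocessing.managers import BaseManager

    class StoreManager(BaseManager):
        pass

    StoreManager.register("get_store")               # no callable on the client side

    manager = StoreManager(address=("127.0.0.1", 50000), authkey=b"not-a-secret")
    manager.connect()
    store = manager.get_store()                      # proxy to the dict held by the datastore process
    store.update({"latest": 42})                     # proxies expose the dict's methods
    print(store.get("latest"))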
I am working on a project whereby I have a couple of remote IoT devices that send messages via UDP. I am looking to make a server that can receive this constant flow of UDP messages and can store them in a database. Additionally I would like to make a (REST) API which allows the information from this database to be accessed (every 15/30 minutes or so) from other applications.
Does anyone have any suggestions for how to do this (preferably in python)?
So far I am able to do the following (in python):
I know how to make a UDP client and server, and send messages between them using "socket". This link provided a useful explanation.
I know how to create a Flask server, store random data in a database using SQLAlchemy, and make the database content available via an API that can be accessed via Postman. This link showed me how.
What I am not able to do:
Tying everything together is where the problem arises. Specifically, I don't know how to combine the above methods so that everything works at the same time (in the same loop, so to speak). Both Flask and the UDP server run their own loops and listen for events, so I don't see how those processes would work simultaneously.
One thing that I was considering is to run the UDP server + database insertion in one terminal, and the Flask/API server from another terminal. That would mean that the database is being opened and accessed by multiple programs at the same time. Is that possible? It would be like opening a single Excel sheet multiple times (which is not permitted, I would think).
I also came across this library, which allows you to combine Flask with Flask-Sockets, but that doesn't seem to support UDP as far as I understand.
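For what it's worth, one way to tie the pieces together is to run the UDP listener in a background thread of the same process as Flask, each side using its own SQLite connection; the port numbers, file name, and table layout below are made up. (SQLite also allows several processes to open the same file, so the two-terminal variant would work too; it is not like the single Excel sheet case.)

    import socket
    import sqlite3
    import threading

    from flask import Flask, jsonify

    DB_PATH = "readings.db"          # hypothetical database file

    def udp_listener():
        # this thread owns its own SQLite connection
        conn = sqlite3.connect(DB_PATH)
        conn.execute("CREATE TABLE IF NOT EXISTS readings (payload TEXT)")
        conn.commit()
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.bind(("0.0.0.0", 9999))
        while True:
            data, _addr = sock.recvfrom(4096)
            conn.execute("INSERT INTO readings (payload) VALUES (?)", (data.decode(),))
            conn.commit()

    app = Flask(__name__)

    @app.route("/readings")
    def readings():
        # a fresh connection per request keeps things simple and thread-safe
        conn = sqlite3.connect(DB_PATH)
        rows = conn.execute("SELECT payload FROM readings").fetchall()
        conn.close()
        return jsonify({"readings": [r[0] for r in rows]})

    if __name__ == "__main__":
        threading.Thread(target=udp_listener, daemon=True).start()
        app.run(port=5000)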
Many thanks!
I want to be able to schedule delivery of a lightweight message from a server to a client. This is new territory to me so I'd appreciate some advice on the possible approaches available.
The client is running on a Raspberry Pi using node.js (because I'm using node libraries to control a piece of attached hardware). Eventually there will be multiple clients like it.
The server could be anything, though I'm most familiar with Python, django and node.
I want to be able to access the server from a browser and cause it to schedule a future message to the client, effectively a push notification with a tiny bit of data.
I'm looking at pub-sub and messaging systems to do this; I started writing a system that uses node on both ends and sockets, but the approach I want is more fire-and-forget occasional messages, not constant realtime data exchange. I'm also not a huge fan of node-cron style scheduling; I'd like to be able to retrieve and alter scheduled events, and it felt somewhat heavy-handed to layer this on top of a cron system.
My current solution uses Python on the server (so I can write a Django web interface) with Celery and RabbitMQ, using a named queue per client. The client subscribes to that specific queue using node-amqp, and off we go. This also allows me to create queues that multiple clients can be interested in, which is a neat bonus.
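The publishing side of a per-client queue can be very small; here is a sketch using the plain pika client rather than Celery, purely to show the shape of it (queue name and payload are invented). The node-amqp consumer on the Pi subscribes to the same queue name.

    import json
    import pika

    # one durable queue per client; the Pi's node-amqp consumer reads from it
    connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = connection.channel()
    channel.queue_declare(queue="client-pi-01", durable=True)
    channel.basic_publish(
        exchange="",
        routing_key="client-pi-01",
        body=json.dumps({"action": "blink-led", "at": "2024-06-01T08:00:00Z"}),
        properties=pika.BasicProperties(delivery_mode=2),   # survive a broker restart
    )
    connection.close()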
This answer makes me think I'm doing the right thing -- but as I'm new to this stuff, it feels like I might be missing something. Are there alternatives I should consider in the world of server-client messaging?
Since you are already using Python, you could take a look at Python Remote Objects (Pyro).
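A rough idea of what that looks like with Pyro4 (class name, object id, port, and method are made up):

    # server
    import Pyro4

    @Pyro4.expose
    class Scheduler:
        def schedule(self, when, payload):
            print("would schedule", payload, "for", when)
            return True

    daemon = Pyro4.Daemon(host="0.0.0.0", port=9090)
    uri = daemon.register(Scheduler(), objectId="scheduler")
    print("serving", uri)
    daemon.requestLoop()

And from the caller:

    import Pyro4

    scheduler = Pyro4.Proxy("PYRO:scheduler@server-host:9090")
    scheduler.schedule("2024-06-01T08:00:00", {"led": "on"})

Note that Pyro calls run from caller to callee, so for push-style delivery the Pi would host the daemon and the server would hold the proxy (or the Pi would poll).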
I am using MySQLdb for my database currently, and I need to integrate a messaging feature that works in real time. The chat demo that Tornado provides does not implement a database (whereas the blog demo does).
This messaging service will also double as email in the future (like how Facebook's message service works; the chat platform is also email). Regardless, I would like to make sure that my current, first chat version will be able to be expanded to function as email, and overall, I need to store messages in a database.
Is something like this as simple as: for every chat message sent, query the database and display the message on the users' screens? Or is this method prone to high server load and poor performance? How exactly should I structure the "infrastructure" to make this work?
(I apologize for some of the inherent subjectivity in this question; however, I prefer to "measure twice, code once.")
Input, examples, and resources appreciated.
Regards.
Tornado is a single-threaded, non-blocking server.
What this means is that if you make any blocking calls on the main thread, you will eventually kill performance. You might not notice this at first, because each database call might only block for 20 ms. But once you are making more than 200 database calls per second (at 20 ms each, that is four seconds of blocking work arriving every second), your application will effectively be locked up.
However that's quite a few DB calls. In your case that would be 200 people hitting send on their chat message in the same second.
What you probably want to do is use a queue with a non-blocking API. So Tornado receives a chat message. You put it on the queue to be saved to the database by another process, then you send the chat message back out to the other chat members.
When someone connects to a chat session you also need to send off a request to the queue for all the previous messages; when the queue responds, you send those off to the newly connected user.
That's how I would approach the problem anyway.
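Here is a rough sketch of that hand-off using tornado.queues; the handler, URL, and the "persist" step are placeholders, and the real database write should go through a separate process or a non-blocking driver, as discussed above.

    import tornado.ioloop
    import tornado.web
    from tornado.queues import Queue

    pending = Queue()   # chat messages waiting to be written to the database

    class ChatHandler(tornado.web.RequestHandler):
        async def post(self):
            message = self.get_body_argument("message")
            await pending.put(message)          # non-blocking hand-off
            self.write({"status": "queued"})
            # ...broadcast the message to the other connected chat members here...

    async def db_writer():
        while True:
            message = await pending.get()
            # hand the message to another process / non-blocking driver here;
            # a blocking MySQLdb call at this point would stall the IOLoop
            print("would persist:", message)
            pending.task_done()

    if __name__ == "__main__":
        app = tornado.web.Application([(r"/chat", ChatHandler)])
        app.listen(8888)
        tornado.ioloop.IOLoop.current().spawn_callback(db_writer)
        tornado.ioloop.IOLoop.current().start()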
Also see this question and answer: Any suggestion for using non-blocking MySQL api on Tornado in Python3?
Just remember, Tornado is single-threaded. It's amazing and can handle thousands of simultaneous connections. But if the code handling one of those connections blocks for one second, then NOTHING else will be done for any other connection during that second.
This question is related to this project.
I have to send device information in JSON format to a server. However, I'm concerned about what will happen if I can't connect to the remote server. I don't want to lose data, so I thought that each datum could be put in a queue. A connection thread could work through the queue and send the data to the server. In my opinion this is a better solution than having a connection thread which sends data directly. Am I correct?
Something like a queue is always suitable for decoupling things, especially when the processing of data can be deferred. A queue implementation like RabbitMQ is transactional, so it will integrate nicely into a system where transactional integrity is a must.
Right, you might want persistent queue-like storage between the application components. Depending on your requirements it might be anything ranging from a simple file to a fully fledged transactional store.
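As a starting point, the in-memory version of the queue-plus-connection-thread idea the question describes looks roughly like this; the endpoint URL and payload are made up, and for crash safety you would swap the queue.Queue for one of the persistent options mentioned above.

    import json
    import queue
    import threading
    import time
    import urllib.request

    outgoing = queue.Queue()

    def enqueue(device_info):
        outgoing.put(json.dumps(device_info))

    def sender(endpoint):
        while True:
            payload = outgoing.get()
            while True:
                try:
                    req = urllib.request.Request(
                        endpoint,
                        data=payload.encode(),
                        headers={"Content-Type": "application/json"},
                    )
                    urllib.request.urlopen(req, timeout=5)
                    break                      # delivered; move on to the next item
                except OSError:
                    time.sleep(10)             # server unreachable; keep the datum and retry
            outgoing.task_done()

    threading.Thread(target=sender, args=("http://example.invalid/ingest",), daemon=True).start()
    enqueue({"device": "sensor-1", "temp": 21.5})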
I want to implement a lightweight Message Queue proxy. Its job is to receive messages from a web application (PHP) and send them to the Message Queue server asynchronously. The reason for this proxy is that the MQ isn't always available and is sometimes lagging, or even down, but I want to make sure the messages are delivered, and the web application returns immediately.
So, PHP would send the message to the MQ proxy running on the same host. That proxy would save the messages to SQLite for persistence, in case of crashes. At the same time it would send the messages from SQLite to the MQ in batches when the connection is available, and delete them from SQLite.
Now, the way I understand, there are these components in this service:
1. message listener (listens for messages from PHP and writes them to an Incoming Queue)
2. DB flusher (reads messages from the Incoming Queue and saves them to the database; a separate component due to SQLite's single-threadedness)
3. MQ connection handler (keeps the connection to the MQ server alive by reconnecting)
4. message sender (collects messages from the SQLite db, sends them to the MQ server, then removes them from the db)
I was thinking of using Twisted for #1 (TCPServer), but I'm having problems integrating it with the other points, which aren't event-driven. Intuition tells me that each of these points should be running in a separate thread, because all are IO-bound and independent of each other, but I could easily put them in a single thread. Even so, I couldn't find any good and clear (to me) examples of how to implement such a worker thread alongside Twisted's main loop.
The example I've started with is chatserver.py, which uses service.Application and internet.TCPServer objects. If I start my own thread prior to creating the TCPServer service, it runs a few times, but then it stops and never runs again. I'm not sure why this is happening, but it's probably because I'm not using threads with Twisted correctly.
Any suggestions on how to implement a separate worker thread and keep Twisted? Do you have any alternative architectures in mind?
You're basically considering writing an ad-hoc extension to your messaging server, the job of which is to provide whatever reliability guarantees you've asked of it.
Instead, perhaps you should take the hardware where you were planning to run this new proxy and run another MQ node on it. The new node should take care of persisting and relaying messages that you deliver to it while the other nodes are overloaded or offline.
Maybe it's not the best bang for your buck to use a separate thread in Twisted to get around a blocking call, but sometimes the least evil solution is the best. Here's a link that shows you how to integrate threading into Twisted:
http://twistedmatrix.com/documents/10.1.0/core/howto/threading.html
Sometimes in a pinch easy-to-implement is faster than hours/days of research which may all turn out to be for nought.
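In that spirit, the blocking SQLite work can be pushed onto Twisted's thread pool with deferToThread, roughly like this (the file, table, and data are invented):

    import sqlite3

    from twisted.internet import reactor, threads

    def flush_to_sqlite(rows):
        # blocking work: runs in a pool thread, never in the reactor thread
        conn = sqlite3.connect("buffer.db")
        conn.execute("CREATE TABLE IF NOT EXISTS outbox (body TEXT)")
        conn.executemany("INSERT INTO outbox (body) VALUES (?)", [(r,) for r in rows])
        conn.commit()
        conn.close()
        return len(rows)

    def on_saved(count):
        print("buffered", count, "messages")
        reactor.stop()

    d = threads.deferToThread(flush_to_sqlite, ["msg-1", "msg-2"])
    d.addCallback(on_saved)
    reactor.run()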
A neat solution to this problem would be to use the key-value store Redis. It's a high-speed persistent data store with plenty of clients; it has a PHP and a Python client (if you want to use a timed/batch process to process messages). It saves you creating a database and also deals with your persistence story. It runs fine on Cygwin/Windows and POSIX environments.
PHP Redis client is here.
Python client is here.
Both have a very clean and simple API. Redis also offers a publish/subscribe mechanism, should you need it, although it sounds like it would be of limited value here, since the MQ you're publishing to isn't consistently available.
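On the Python side, the list commands are enough to treat Redis as that buffer; the queue name and message below are invented, and the PHP application would do the equivalent RPUSH with its own client.

    import json

    import redis

    r = redis.Redis(host="localhost", port=6379)

    # producer: append a message to the buffer
    r.rpush("outbox", json.dumps({"event": "signup", "user": 42}))

    # consumer: block until something is available, then forward it to the real MQ
    _key, raw = r.blpop("outbox")
    print("would forward to the MQ:", json.loads(raw))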