I have a requirement where I receive asynchronous data (JSON) from a client over HTTP. I have to process it whenever it arrives, send it to a device over a TCP connection, and get the response back (one or two responses, depending on the request). I then have to send that response back to the client.
My question is: how can I do that? Should I run a while True: loop that waits for data to be put on the queue, checking whether it is empty, and once data arrives, take it from the queue and send it over TCP? But how should the second loop (receiving data from the TCP connection) run? Another while True: loop waiting for the TCP response? And once the response is received, how do I send it back to the HTTP client?
If so, how would it work? Can someone provide an example? I thought of running two processes, one for write_to_queue and one for read_from_queue, but I still can't see how to implement it or how it would work.
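One way to sketch the pattern described above is a single worker thread that drains a request queue, talks to the device, and puts replies on a response queue. This is a toy illustration, not a full HTTP server: a socketpair stands in for the real TCP device, and the "device" just echoes the request uppercased.

```python
import queue
import socket
import threading

def device_worker(sock, requests, responses):
    """Forward queued requests to the TCP device and queue each reply."""
    while True:
        item = requests.get()
        if item is None:            # sentinel: shut down
            break
        sock.sendall(item)
        reply = sock.recv(4096)     # toy: assumes one reply fits in one recv()
        responses.put(reply)

# socketpair stands in for the real TCP connection to the device
client_side, device_side = socket.socketpair()

requests, responses = queue.Queue(), queue.Queue()
worker = threading.Thread(target=device_worker,
                          args=(client_side, requests, responses), daemon=True)
worker.start()

def fake_device():
    # stand-in device: echoes the request back uppercased
    data = device_side.recv(4096)
    device_side.sendall(data.upper())

threading.Thread(target=fake_device, daemon=True).start()

requests.put(b'{"cmd": "status"}')
reply = responses.get(timeout=5)    # reply from the "device"
print(reply)                        # -> b'{"CMD": "STATUS"}'
requests.put(None)
worker.join()
```

In a real service the HTTP handler would put the request on `requests` and block on `responses` (or a per-request future) until the device replies.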
Is there an alternative to a timeout for ending communication with a server once all the messages have been received?
Let me explain, I want to communicate through an SSL socket with a server (XMPP) unknown to me. The communication is done through XML messages.
For each request I can receive one or more response messages of different types, or a single message that is sent in multiple chunks. Therefore I created a loop to receive all messages related to a request from the server; I do not know the number and size of messages for each request a priori. Once the messages are finished, the client stops listening only when no more responses arrive from the server (i.e., it waits for the timeout). However, this results in a long wait (for each request I have to wait 2 s, which by the end of the communication can add up to a minute across all requests).
Is there a faster way to stop listening when we have received all the messages for each request?
I attach here my code:
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock = ssl.wrap_socket(s, ssl_version=ssl.PROTOCOL_TLSv1_2)
sock.connect((hostname, port))
sock.settimeout(2)

# example for a generic unique message request
sock.sendall(message.encode())
data = ""
while True:
    try:
        response = sock.recv(1024)
        if not response:
            break
        data += response.decode()
    except socket.timeout:
        break
You know how much data to expect because of the protocol.
The XMPP protocol says (section 4.2) what the start of a stream looks like; section 4.3 describes the stream negotiation process, which the <stream:features> tag is used for. Don't understand them yet? Section 4.1 explains what a stream is and how it works.
In one comment, you complained about receiving a response in multiple chunks. This is absolutely normal in TCP because TCP does not care about chunks. TCP can split up a 1kB message into 1000 1-byte chunks if it feels like it (usually it doesn't feel like it). So you must keep receiving chunks until you see the end of the message.
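A minimal sketch of "keep receiving until you see the end of the message", using a newline as the end-of-message marker (a simplification: real XMPP framing requires parsing the XML, not scanning for a delimiter). The socketpair here just simulates a peer:

```python
import socket

def recv_until(sock, delimiter):
    """Accumulate recv() chunks until the delimiter appears."""
    buf = b""
    while delimiter not in buf:
        chunk = sock.recv(1024)
        if not chunk:                    # peer closed before the delimiter
            raise ConnectionError("connection closed mid-message")
        buf += chunk
    message, _, rest = buf.partition(delimiter)
    return message                       # real code would keep 'rest' buffered

a, b = socket.socketpair()
b.sendall(b"<iq type='result' id='42'/>\n")  # may arrive in 1 or many chunks
msg = recv_until(a, b"\n")
print(msg)                                   # -> b"<iq type='result' id='42'/>"
```

Whether the kernel delivers the stanza in one chunk or twenty, the loop produces the same message.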
Actually, XMPP is not even a request-response protocol. The server can send you messages at any time, which are not responses to something you sent, for example it could tell you that someone said "hello" to you. So you might send an IQ request, but the next message you receive might not be an IQ response, it might be a message saying that your friend is now online. You know which message is the response because: the message type is IQ; the 'type' is 'result' or 'error'; and the 'id' is the same as in the request. Section 8.2.3.
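To illustrate matching an IQ response among unrelated stanzas, here is a simplified sketch (it ignores XML namespaces and streaming parsing, which a real XMPP client must handle; the stanza strings are made up):

```python
import xml.etree.ElementTree as ET

def find_iq_response(stanzas, request_id):
    """Skip unrelated stanzas until the IQ result/error matching request_id arrives."""
    for raw in stanzas:
        el = ET.fromstring(raw)
        if (el.tag == "iq" and el.get("id") == request_id
                and el.get("type") in ("result", "error")):
            return el
        # anything else (presence, chat messages, other IQs) is not our response

stream = [
    "<presence from='friend@example.com'/>",   # unsolicited push from the server
    "<message><body>hello</body></message>",
    "<iq type='result' id='42'><query/></iq>",
]
match = find_iq_response(stream, "42")
print(match.get("id"))   # -> 42
```

The presence and message stanzas are simply not the response, so they are passed over (a real client would dispatch them to their own handlers rather than discard them).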
Do python-socketio or underlying python-engineio have any kind of confirmation that specific message was completely delivered to other side, similar to what TCP does to ensure all data was successfully transferred to other side?
I have a kind of pub/sub service built on a python-socketio server, which sends back an ok/error status when a request has been processed. But in my python-socketio client I sometimes just need to fire and forget a message to pubsub, yet I have to make sure it was completely delivered before I terminate my application.
So, my naive code:
await sio.emit("publish", {my message})
It seems the await above just schedules the send over the wire with asyncio, but does not wait for the send to complete. I suppose it's by design. I just need to know whether it is possible to tell when the send is complete.
Socket.IO has ACK packets that can be used for the receiving side to acknowledge receipt of an event.
When using the Python client and server, you can replace the emit() with call() to wait for the ack to be received. The return value of call() is whatever data the other side returned in the acknowledgement.
But note that for this to work, the other side also needs to be extended to send these ACK packets. If the other side is also Python, an event handler can issue an ACK simply by returning something from the handler function; the data that you return is included in the ACK packet. If the other side is JavaScript, you get a callback function passed as the last argument into your handler. The handler needs to call this function, passing any data it wants to send back as the response.
I am new to the multithreading web server programming
Now I am writing a server program that:
Receive messages (in self-defined data format) from tcp socket
Process these messages (which takes time)
Send corresponding responses to the socket
Provide an ACK mechanism for receiving messages and sending responses: every message contains a unique seq number, and I should include the ack (same as the seq) in the corresponding response. The other side implements the same mechanism. If I do not receive an ACK from the other side within 5 minutes, I should re-send the message for which I expected the ACK.
My thought was to use a while loop to receive messages from the socket, then process the messages and send responses.
The problem is that processing messages takes time, and I may receive multiple messages in a short period. So if I call the process_message() function inside this while loop and wait for it to finish, it will block and I will definitely waste time. I need a non-blocking way.
I have done some research. I suppose I could use two common techniques: a thread pool and a message queue.
For the thread pool, my idea goes like the following pseudocode:
def process_message(message):
    # ... process the message (takes time) ...
    send_response(socket)

while True:
    message = recv(socket)
    thread_pool.submit(process_message, message)
For the message queue, I am not sure, but my idea would be to have a producer thread and a consumer thread:
def consumer():
    # only one consumer thread?
    while True:
        message = queue.get()
        process_message(message)
        send_response(socket)

# producer -- only one producer thread?
while True:
    message = recv(socket)
    queue.put(message)
Hope my idea is clear. Can anyone provide some typical solution?
Then, the trickier part: any thoughts on how to implement the ACK mechanism?
Thank you!
This is rather broad, because there is a lot to implement.
The general idea is indeed to implement:
a TCP server, that will receive incoming messages and write them (including the socket from which they were received) in a queue
a pool of worker threads that will get a message from the queue, process the message, and pass the response to an object in charge of sending the message and wait for the acknowledgement
an object that will send the responses and store the sequence number, the socket, and the message until the response has been acknowledged. A thread would be handy to process the list of messages waiting for acknowledgement and send them again when the timeout expires.
But each part requires a considerable amount of work and can be implemented in different ways (select, TCPServer, or threads processing accepted sockets for the first; which pool implementation for the second; which data structure to store the messages waiting for acknowledgement for the third). I have done some tests and realized that a complete answer would be far beyond what is expected on this site. IMHO, you'd better break the question into smaller answerable pieces, keeping this one as the general context.
You should also say whether the incoming messages should be immediately acknowledged when received or will be implicitly acknowledged by the response.
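To make the third part concrete, here is a minimal sketch of the object that tracks unacknowledged messages and re-sends them on timeout. All names (AckTracker, send_fn) are invented for illustration; the question used a 5-minute timeout, shortened here so the demo runs instantly:

```python
import threading
import time

class AckTracker:
    """Keeps unacknowledged messages and re-sends them after a timeout."""

    def __init__(self, send_fn, timeout=300.0):
        self.send_fn = send_fn       # called as send_fn(seq, message) to re-send
        self.timeout = timeout
        self.pending = {}            # seq -> (message, deadline)
        self.lock = threading.Lock()

    def sent(self, seq, message):
        """Record a message that now awaits an ACK."""
        with self.lock:
            self.pending[seq] = (message, time.monotonic() + self.timeout)

    def acked(self, seq):
        """Drop a message once its ACK arrives."""
        with self.lock:
            self.pending.pop(seq, None)

    def resend_due(self):
        """Re-send overdue messages; call periodically from a timer thread."""
        now = time.monotonic()
        with self.lock:
            due = [(s, m) for s, (m, d) in self.pending.items() if d <= now]
            for seq, message in due:
                self.pending[seq] = (message, now + self.timeout)
        for seq, message in due:     # send outside the lock
            self.send_fn(seq, message)

resent = []
tracker = AckTracker(lambda seq, msg: resent.append(seq), timeout=0.01)
tracker.sent(1, b"hello")
tracker.sent(2, b"world")
tracker.acked(2)                     # 2 is acknowledged; only 1 should be re-sent
time.sleep(0.05)
tracker.resend_due()
print(resent)                        # -> [1]
```

In the full design, the worker pool would call sent() after writing a response, the receive loop would call acked() when an ACK arrives, and a dedicated thread would call resend_due() on a schedule.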
I'm making a proxy which sits between the browser and the web. There's a snippet of code I can't seem to get to work.
#send request to web server
web_client.send(request)
#signal client is done with sending
web_client.shutdown(1)
If I use shutdown(1), the proxy has a great improvement in performance and speed.
However, some web servers do not send responses if I use shutdown. Console output:
request sent to host wix.com
got response packet of len 0
got response packet of len 0
breaking loop
and the browser displays
The connection was reset
The connection to the server was reset while the page was loading.
However, if I remove shutdown(1), there are no problems of this sort. Console output:
got response packet of len 1388
got response packet of len 1388
got response packet of len 1388
got response packet of len 989
got response packet of len 0
got response packet of len 0
breaking loop
and the browser normally displays the website.
Why is this happening? This is only happening on certain hosts.
From https://docs.python.org/2/library/socket.html#socket.socket.shutdown
Depending on the platform, shutting down one half of the connection
can also close the opposite half (e.g. on Mac OS X, shutdown(SHUT_WR)
does not allow further reads on the other end of the connection)
This may not be the problem because you say that only some web servers are affected, but is your proxy running on Mac OS X?
The TCP/IP stack will do a graceful connection close only if there is no data pending to be sent on the socket. Completion of send only indicates that the data has been pushed into the kernel buffer and is ready for sending. Here, shutdown is invoked immediately after send, while some of the data is still pending in the TCP stack, so the stack sends a reset to the other end because it decides the application does not wish to complete the send. To do a graceful close, invoke select on the socket and wait for the socket to become writable, which means all the data has been pushed out of the stack; then invoke shutdown and close the socket.
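A small sketch of the suggested sequence, using a socketpair in place of the web server. Note that writability only means the kernel can accept more data into its buffer, not that the peer has received everything, so this reduces rather than eliminates the risk described above:

```python
import select
import socket

def shutdown_when_flushed(sock):
    """Wait until the socket is writable before shutting down the write side."""
    select.select([], [sock], [])      # block until the socket is writable
    sock.shutdown(socket.SHUT_WR)      # then signal "done sending"

a, b = socket.socketpair()             # stands in for proxy <-> web server
a.sendall(b"GET / HTTP/1.0\r\n\r\n")
shutdown_when_flushed(a)

data1 = b.recv(64)
data2 = b.recv(64)
print(data1)                           # the request still arrives intact
print(data2)                           # b'' -> the peer has finished sending
```

The peer still sees a normal FIN (recv() returning an empty bytes object) rather than a reset, and the write side can keep reading the response after the shutdown.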
As far as I understand the basics of the client-server model, generally only client may initiate requests; server responds to them. Now I've run into a system where the server sends asynchronous messages back to the client via the same persistent TCP connection whenever it wants. So, a couple of questions:
Is it a right thing to do at all? It seems to really overcomplicate implementation of a client.
Are there any nice patterns/methodologies I could use to implement a client for such a system in Python? Changing the server is not an option.
Obviously, the client has to watch both the local request queue (i.e. requests to be sent to the server), and the incoming messages from the server. Launching two threads (Rx and Tx) per connection does not feel right to me. Using select() is a major PITA here. Do I miss something?
When dealing with asynchronous I/O in Python, I typically use a library such as gevent or eventlet. The objective of these libraries is to allow applications written in a synchronous style to be multiplexed by a back-end reactor.
This basic example demonstrates the launching of two green threads/co-routines/fibers to handle either side of the TCP duplex. The send side of the duplex is listening on an asynchronous queue.
This is all performed within a single hardware thread. Both gevent and eventlet have more substantive examples in their documentation than what I have provided below.
If you run nc -l -p 8000 you will see "012" printed out. As soon netcat is exited, this code will be terminated.
from eventlet import connect, sleep, GreenPool
from eventlet.queue import Queue

def handle_i(sock, queue):
    while True:
        data = sock.recv(8)
        if data:
            print(data)
        else:
            queue.put(None)  # <- signal send side of duplex to exit
            break

def handle_o(sock, queue):
    while True:
        data = queue.get()
        if data:
            sock.send(data)
        else:
            break

queue = Queue()
sock = connect(('127.0.0.1', 8000))

gpool = GreenPool()
gpool.spawn(handle_i, sock, queue)
gpool.spawn(handle_o, sock, queue)

for i in range(0, 3):
    queue.put(str(i).encode())  # send() needs bytes on Python 3
    sleep(1)

gpool.waitall()  # <- waits until nc exits
I believe what you are trying to achieve is a bit similar to JSONP. When sending to the client, wrap the data in a call to a callback method that you know exists in the client.
For example, if you are sending "some data xyz", send it like server.send("callback('some data xyz')");. This suggestion applies to JavaScript, because it executes the returned code as if it were called through that method; I believe you can port this idea to Python with some difficulty, but I am not sure.
Yes, this is very normal, and the server can also send messages to the client after the connection is made. For example, when you initiate a connection to a telnet server, it sends you a message for the capability exchange and after that asks you for your username and password.
You could very well use select(), or, if I were in your shoes, I would spawn a separate thread to receive the asynchronous messages from the server and leave the main thread free for further processing.
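The separate-receiver-thread idea can be sketched as follows; a socketpair simulates the persistent connection, and the Rx thread pushes everything the server sends (solicited or not) into a queue for the main thread to consume:

```python
import queue
import socket
import threading

def reader(sock, inbox):
    """Dedicated Rx thread: pushes every server message into a queue
    so the main thread stays free for other work."""
    while True:
        data = sock.recv(4096)
        if not data:
            inbox.put(None)        # server closed the connection
            break
        inbox.put(data)

client, server = socket.socketpair()   # stands in for the real connection
inbox = queue.Queue()
threading.Thread(target=reader, args=(client, inbox), daemon=True).start()

server.sendall(b"unsolicited server notice")  # server may push at any time
msg = inbox.get(timeout=5)
print(msg)                                    # -> b'unsolicited server notice'
```

The main (Tx) thread sends requests directly on the socket and polls the inbox, deciding for each incoming message whether it is a response to a pending request or an unsolicited notification.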