How can I create a non-HTTP proxy with Twisted? I would like to do it for the Terraria protocol, which is made entirely of binary data. I see that Twisted has a built-in proxy for HTTP connections, but this application needs to act more like an entry point that is forwarded to a set server (almost like a BNC on IRC).
I can't figure out how to read the data off of one connection and send it to the other connection.
I have already tried using a socket for this task, but the blocking recv and send methods do not work well as two connections need to be live at the same time.
There are several different ways to create proxies in Twisted. The basic technique is built on peering: taking two different protocols, on two different ports, and gluing them together so that they can exchange data with each other.
The simplest proxy is a port-forwarder, and Twisted ships with one; see http://twistedmatrix.com/documents/current/api/twisted.protocols.portforward.html for the (underdocumented) ProxyClient and ProxyServer classes, although the actual source at http://twistedmatrix.com/trac/browser/tags/releases/twisted-11.0.0/twisted/protocols/portforward.py might be more useful to read through. From there, we can see the basic technique of proxying in Twisted:
def dataReceived(self, data):
    self.peer.transport.write(data)
When a proxying protocol receives data, it puts it out to the peer on the other side. That's it! Quite simple. Of course, you'll usually need some extra setup...
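If plain byte-for-byte pass-through is all you need, the stock port-forwarder can be used directly. A minimal sketch, where the host and port are placeholders for the real server:

from twisted.internet import reactor
from twisted.protocols import portforward

# Accept connections locally on 7777 and forward each one,
# byte for byte, to the real server (placeholder address).
factory = portforward.ProxyFactory('terraria.example.com', 7777)
reactor.listenTCP(7777, factory)
reactor.run()

Usually, though, you'll want more control. Let's look at a couple of proxies I've written before.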
This is a proxy for Darklight, a little peer-to-peer system I wrote. It is talking to a backend server, and it wants to only proxy data if the data doesn't match a predefined header. You can see that it uses ProxyClientFactory and endpoints (fancy ClientCreator, basically) to start proxying, and when it receives data, it has an opportunity to examine it before continuing, either to keep proxying or to switch protocols.
class DarkServerProtocol(Protocol):
    """
    Shim protocol for servers.
    """

    peer = None
    buf = ""

    def __init__(self, endpoint):
        self.endpoint = endpoint
        print "Protocol created..."

    def challenge(self, challenge):
        log.msg("Challenged: %s" % challenge)
        # ...omitted for brevity...
        return is_valid(challenge)

    def connectionMade(self):
        pcf = ProxyClientFactory()
        pcf.setServer(self)
        d = self.endpoint.connect(pcf)
        d.addErrback(lambda failure: self.transport.loseConnection())
        self.transport.pauseProducing()

    def setPeer(self, peer):
        # Our proxy passthrough has succeeded, so we will be seeing data
        # coming through shortly.
        log.msg("Established passthrough")
        self.peer = peer

    def dataReceived(self, data):
        self.buf += data
        # Examine whether we have received a challenge.
        if self.challenge(self.buf):
            # Excellent; change protocol.
            p = DarkAMP()
            p.factory = self.factory
            self.transport.protocol = p
            p.makeConnection(self.transport)
        elif self.peer:
            # Well, go ahead and send it through.
            self.peer.transport.write(data)
This is a rather complex chunk of code which takes two StatefulProtocols and glues them together rather forcefully. It is from a VNC proxy (https://code.osuosl.org/projects/twisted-vncauthproxy, to be precise), which needs its protocols to do a lot of pre-authentication work before they are ready to be joined. This kind of proxy is the worst case: for speed, you don't want to interact with the data going over the proxy, but you need to do some setup beforehand.
def start_proxying(result):
    """
    Callback to start proxies.
    """

    log.msg("Starting proxy")

    client_result, server_result = result

    success = True
    client_success, client = client_result
    server_success, server = server_result

    if not client_success:
        success = False
        log.err("Had issues on client side...")
        log.err(client)

    if not server_success:
        success = False
        log.err("Had issues on server side...")
        log.err(server)

    if not success:
        log.err("Had issues connecting, disconnecting both sides")
        if not isinstance(client, Failure):
            client.transport.loseConnection()
        if not isinstance(server, Failure):
            server.transport.loseConnection()
        return

    server.dataReceived = client.transport.write
    client.dataReceived = server.transport.write

    # Replay last bits of stuff in the pipe, if there's anything left.
    data = server._sful_data[1].read()
    if data:
        client.transport.write(data)
    data = client._sful_data[1].read()
    if data:
        server.transport.write(data)

    server.transport.resumeProducing()
    client.transport.resumeProducing()

    log.msg("Proxying started!")
So, now that I've explained that...
I also wrote Bravo. As in, http://www.bravoserver.org/. So I know a bit about Minecraft, and thus about Terraria. You will probably want to parse the packets coming through your proxy on both sides, so your actual proxying might start out looking like the sketch below, but it will quickly evolve as you begin to understand the data you're proxying.
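As a hedged starting point (the server address is a placeholder, and the packet parsing is left as a stub), you could subclass the stock ProxyServer so every chunk from the client can be examined before it is forwarded:

from twisted.internet import reactor
from twisted.protocols import portforward

class TerrariaProxyServer(portforward.ProxyServer):
    def dataReceived(self, data):
        # Examine or parse the raw Terraria packets here before
        # passing them through to the real server.
        print "client -> server: %d bytes" % len(data)
        portforward.ProxyServer.dataReceived(self, data)

factory = portforward.ProxyFactory('terraria.example.com', 7777)
factory.protocol = TerrariaProxyServer
reactor.listenTCP(7777, factory)
reactor.run()

Hopefully this is enough to get you started!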
Short description:
The client sends the server data via a TCP socket. The data varies in length and consists of strings broken up by the delimiter "~~~*~~~".
For the most part it works fine... for a while. After a few minutes, data winds up all over the place. So I started tracking the problem, and data is ending up in the wrong place because the full message has not been passed.
Everything comes into the server script, is split on a different delimiter, *-New*Data-*, and placed into a Queue. (Yes, I know the buffer size is huge. No, I don't send data anywhere near that size in one go, but I was toying around with it.) This is the code:
class service(SocketServer.BaseRequestHandler):
    def handle(self):
        data = 'dummy'
        #print "Client connected with ", self.client_address
        while len(data):
            data = self.request.recv(163840000)
            #print data
            BigSocketParse = []
            BigSocketParse = data.split('*-New*Data-*')
            print "Putting data in queue"
            for eachmatch in BigSocketParse:
                #print eachmatch
                q.put(str(eachmatch))
        #print data
        #self.request.send(data)
        #print "Client exited"
        self.request.close()

class ThreadedTCPServer(SocketServer.ThreadingMixIn, SocketServer.TCPServer):
    pass

t = ThreadedTCPServer(('', 500), service)
t.serve_forever()
I then have a thread running on while not q.empty(): which parses the data by the other delimiter "~~~*~~~"
So this works for a while. An example of the kind of data I'm sending:
2016-02-23 18:01:24.140000~~~*~~~Snowboarding~~~*~~~Blue Hills~~~*~~~Powder 42
~~~*~~~Board Rental~~~*~~~15.0~~~*~~~1~~~*~~~http://bigshoes.com
~~~*~~~No Wax~~~*~~~50.00~~~*~~~No Ramps~~~*~~~2016-02-23 19:45:00.000000~~~*~~~-15
But things started to break, so I took some control data and sent it in a loop. It would work for a while, then results started winding up in the wrong place. And this turned up in my queue:
2016-02-23 18:01:24.140000~~~*~~~Snowboarding~~~*~~~Blue Hills~~~*~~~Powder 42
~~~*~~~Board Rental~~~*~~~15.0~~~*~~~1~~~*~~~http://bigshoes.com
~~~*~~~No Wax~~~*~~~50.00~~~*~~~No Ramps~~~*~~~2016-02-23 19:45:00.000000~~~*~
Cutting out the last "~~-15".
So the exact same data works, then later doesn't. That suggests some kind of overflow to me.
The client connects like this:
class Connect(object):
def connect(self):
host = socket.gethostname() # Get local machine name
#host = "127.0.0.1"
port = 500 # Reserve a port for your service.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
#print('connecting to host')
sock.connect((host, port))
return sock
def send(self, command):
sock = self.connect()
#recv_data = ""
#data = True
#print('sending: ' + command)
sock.sendall(command)
sock.close()
return
It doesn't wait for a response because I don't want it hanging around waiting for one. But it closes the socket, and (as far as I understand) I don't need to flush the socket buffer or anything; it should just clear itself when the connection closes.
Would really appreciate any help on this one. It's driving me a little spare at this point.
Updates:
I'm running this on both my local machine and a pretty beefy server, and I'd be hard pushed to believe it's a hardware issue. The server and client both run locally, and sockets are used as a way for them to communicate, so I don't believe latency would be the cause.
I've been reading into the issues with TCP communication. It's an area where I feel I'll quickly be out of my depth, but I'm starting to wonder if it's not an overflow but just some kind of congestion.
If sendall on the client does not ensure everything is sent, maybe I need some kind of timer/check on the server side to make sure nothing more is coming.
The basic issue is that your:
data = self.request.recv(163840000)
line is not guaranteed to receive all the data at once (regardless of how big you make the buffer).
In order to function properly, you have to handle the case where you don't get all the data at once (you need to track where you are, and append to it). See the relevant example in the Python docs on using a socket:
Now we come to the major stumbling block of sockets - send and recv operate on the network buffers. They do not necessarily handle all the bytes you hand them (or expect from them), because their major focus is handling the network buffers. In general, they return when the associated network buffers have been filled (send) or emptied (recv). They then tell you how many bytes they handled. It is your responsibility to call them again until your message has been completely dealt with.
As mentioned, you are not receiving the full message even though you have a large buffer size. You need to keep receiving until you get zero bytes. You can write your own generator that takes the request object and yields the parts. The nice side is that you can start processing messages while some are still coming in:
def recvblocks(request):
    buf = ''
    while 1:
        newdata = request.recv(10000)
        if not newdata:
            if buf:
                yield buf
            return
        buf += newdata
        parts = buf.split('*-New*Data-*')
        buf = parts.pop()
        for part in parts:
            yield part
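A hypothetical sketch of wiring that generator into your handler, replacing the original recv loop:

class service(SocketServer.BaseRequestHandler):
    def handle(self):
        # Each yielded part is one complete '*-New*Data-*' block,
        # no matter how TCP split the stream into recv() chunks.
        for part in recvblocks(self.request):
            q.put(part)
        self.request.close()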
But you need a fix on your client also. You need to shutdown the socket before close to really close the TCP connection:

sock.sendall(command)
sock.shutdown(socket.SHUT_RDWR)
sock.close()
Twisted has two data reception modes, a line mode and a raw mode, and we can switch between them using the setRawMode() and setLineMode() functions.
The line mode detects an end of line and then calls the lineReceived() function.
From Twisted doc:
def rawDataReceived(self, data):
    Override this for when raw data is received.
How can Twisted detect the end of raw data and then call rawDataReceived()?
EDIT:
I'll add this to complete my question.
I'm using this Qt function to send data to the Twisted server:
qint64 QIODevice::write(const QByteArray & byteArray)
I thought that calling write() two times meant that the Twisted server would trigger the rawDataReceived() function two times too:
write( "raw1" );
write( "raw2" );
but the data is received all in one go.
You asked:
How can Twisted detect the end of raw data and then call rawDataReceived()?
In short, when you turn on raw mode you're asking Twisted not to detect.
... but let me explain
When you talk about 'detecting the end of data' inside of a connection (i.e. if you're not closing the connection at the end of data), you're talking about an idea that is normally referred to as framing.
Framing is one of the primary issues you have to keep in mind when you're doing application-level network programming, because most (networking) protocols don't guarantee data framing to the application.
Confusingly, many networking protocols (of which TCP is one of the most notorious) will often, but not always, present data to the receiver in the same way as it is transmitted (i.e. as though it had framing, where each write causes exactly one read - but only in cases of slow use and low load). Because of this maybe-it-will-work-maybe-it-won't behavior, the best practice is to always explicitly build in some sort of framing.
The most common way to add application-level framing in TCP/serial/keyboard-style interfaces is to use line breaks as end-of-frame markers, which is what line mode is about.
Turning on raw mode in Twisted is like saying 'I want to write my own framing', but I doubt that's really what you're after.
Instead you probably want to look at some of the other helper protocols (netstrings, prefixed message lengths) that Twisted offers that will do binary framing for you (also see SO: Fragmented data in Twisted dataReceived, by Twisted's author Glyph).
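For example, here is a minimal sketch (the class name and port are illustrative) using Twisted's built-in Int32StringReceiver, which prefixes every message with a 4-byte length so that each sendString on one side arrives as exactly one stringReceived call on the other:

from twisted.internet import reactor, protocol
from twisted.protocols.basic import Int32StringReceiver

class FramedEcho(Int32StringReceiver):
    def stringReceived(self, data):
        # 'data' is one complete framed message, however TCP
        # happened to chunk the underlying stream.
        self.sendString(data)

factory = protocol.ServerFactory()
factory.protocol = FramedEcho
reactor.listenTCP(8000, factory)
reactor.run()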
Twisted does not detect the end of the raw data. It just calls rawDataReceived as it receives data.
The following is the relevant part of the Twisted code (protocols/basic.py):
def dataReceived(self, data):
    """
    Protocol.dataReceived.
    Translates bytes into lines, and calls lineReceived (or
    rawDataReceived, depending on mode.)
    """
    if self._busyReceiving:
        self._buffer += data
        return

    try:
        self._busyReceiving = True
        self._buffer += data
        while self._buffer and not self.paused:
            if self.line_mode:
                ....
            else:
                data = self._buffer
                self._buffer = b''
                why = self.rawDataReceived(data)  # <--------
                if why:
                    return why
    finally:
        self._busyReceiving = False
I have a Client that currently does the following:
connects
collects some data locally
sends that data to a server
repeats
if disconnected, reconnects and continues the above (not shown)
Like this:
def do_send(self):
    def get_data():
        # do something
        return data

    def send_data(data):
        self.sendMessage(data)

    return deferToThread(get_data).addCallback(send_data)

def connectionMade(self):
    WebSocketClientProtocol.connectionMade(self)
    self.sender = task.LoopingCall(self.do_send)
    self.sender.start(60)
However, when disconnected, I would like the data collection to continue, probably queuing and writing to file at a certain limit. I have reviewed the DeferredQueue object which seems like what I need, but I can't seem to crack it.
In pseudo-code, it would go something like this:
queue = DeferredQueue()

# in a separate class from the client protocol
def start_data_collection(self):
    self.collecter = task.LoopingCall(self.get_data)
    self.collecter.start(60)

def get_data(self):
    # do something
    queue.put(data)
Then have the client protocol check the queue, which is where I get lost. Is DeferredQueue what I need, or is there a better way?
A list would work just as well. You'll presumably get lost in the same place - how do you have the client protocol check the list?
Either way, here's one answer:
queued = []

...

connecting = endpoint.connect(factory)

def connected(protocol):
    if queued:
        sending = protocol.sendMessage(queued.pop(0))
        sending.addCallback(sendNextMessage, protocol)
        sending.addErrback(reconnect)

connecting.addCallback(connected)
The idea here is that at some point an event happens: your connection is established. This example represents that event as the connecting Deferred. When the event happens, connected is called. This example pops the first item from the queue (a list) and sends it. It waits for the send to be acknowledged and then sends the next message. It also implies some logic about handling errors by reconnecting.
Your code could look different. You could use the Protocol.connectionMade callback to represent the connection event instead. The core idea is the same - define callbacks to handle certain events when they happen. Whether you use an endpoint's connect Deferred or a protocol's connectionMade doesn't really matter.
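If you do want DeferredQueue specifically, a minimal sketch (sendMessage and the consume wiring are assumptions carried over from your pseudo-code) could look like this. queue.get() returns a Deferred that fires as soon as an item is available, so the consumer simply re-arms itself after every send:

from twisted.internet import defer

queue = defer.DeferredQueue()

def consume(protocol):
    def send(data):
        protocol.sendMessage(data)
        consume(protocol)  # wait for the next queued item
    queue.get().addCallback(send)

# Call consume(protocol) from connectionMade; the collector keeps
# calling queue.put(data) whether or not a connection exists.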
I am creating a sort of a client-server implementation, and I'd like to make sure that every sent message gets a response. So I want to create a timeout mechanism, which doesn't check if the message itself is delivered, but rather checks if the delivered message gets a response.
IE, for two computers 1 and 2:
1: send successfully: "hello"
2: <<nothing>>
...
1: Didn't get a response for my "hello" --> timeout
I thought of doing it by creating a big boolean array with an id for each message, which will hold an 'in progress' flag that is set when the message's response is received.
I was wondering whether there is a better way of doing that.
Thanks,
Ido.
There is a better way, which funnily enough I myself just implemented here. It uses the TimeoutMixin to achieve the timeout behaviour you need, and a DeferredLock to match up the correct replies with what was sent.
from twisted.internet import defer
from twisted.protocols.policies import TimeoutMixin
from twisted.protocols.basic import LineOnlyReceiver

class PingPongProtocol(LineOnlyReceiver, TimeoutMixin):
    # Assumed value; pick whatever suits your protocol.
    DEFAULT_TIMEOUT = 10

    def __init__(self):
        self.lock = defer.DeferredLock()
        self.deferred = None

    def sendMessage(self, msg):
        result = self.lock.run(self._doSend, msg)
        return result

    def _doSend(self, msg):
        assert self.deferred is None, "Already waiting for reply!"
        self.deferred = defer.Deferred()
        self.deferred.addBoth(self._cleanup)
        self.setTimeout(self.DEFAULT_TIMEOUT)
        self.sendLine(msg)
        return self.deferred

    def _cleanup(self, res):
        self.deferred = None
        return res

    def lineReceived(self, line):
        if self.deferred:
            self.setTimeout(None)
            self.deferred.callback(line)
        # If not, we've timed out or this is a spurious line

    def timeoutConnection(self):
        # 'Timeout' is whatever exception class you use for this.
        self.deferred.errback(
            Timeout("Some informative message"))
I haven't tested this; it's more of a starting point. There are a few things you might want to change here to suit your purposes:
I use a LineOnlyReceiver — that's not relevant to the problem itself, and you'll need to replace sendLine/lineReceived with the appropriate API calls for your protocol.
This is for a serial connection, so I don't deal with connectionLost etc. You might need to.
I like to keep state directly in the instance. If you need extra state information, set it up in _doSend and clean it up in _cleanup. Some people don't like that — the alternative is to create nested functions inside _doSend that close over the state information that you need. You'll still need that self.deferred there though, otherwise lineReceived (or dataReceived) has no idea what to do.
How to use it
Like I said, I created this for serial communications, where I don't have to worry about factories, connectTCP, etc. If you're using TCP communications, you'll need to figure out the extra glue you need.
# Create the protocol somehow. Maybe this actually happens in a factory,
# in which case, the factory could have wrapper methods for this.
protocol = PingPongProtocol()

d = protocol.sendMessage("Hi there!")
d.addCallbacks(gotHiResponse, noHiResponse)
Is there any way to force self.transport.write(response) to write immediately to its connection, so that the next call to self.transport.write(response) does not get buffered into the same write?
We have a client with legacy software we cannot amend. It reads the first message and then starts reading again, and the problem I have is that Twisted joins the two writes together, which breaks the client. Any ideas? I have tried looking into deferreds, but I don't think they will help in this case.
Example:
self.transport.write("|123|") # amount of messages to follow
a loop to generate next message
self.transport.write("|message 1 text here|")
Expected:
|123|
|message 1 text here|
Result:
|123||message 1 text here|
I was having a somewhat related problem using down-level Python 2.6. The host I was talking to was expecting a single ACK character, and THEN a separate data buffer, and they all came at once. On top of this, it was a TLS connection. However, if you reference the socket DIRECTLY, you can invoke sendall(), changing:

self.transport.write(Global.ACK)

to:

self.transport.getHandle().sendall(Global.ACK)

... and that should work. This does not seem to be a problem on Python 2.7 with Twisted on x86, just Python 2.6 on a SheevaPlug ARM processor.
Can you tell us which transport you are using? For most implementations, this is the typical approach:
def write(self, data):
    if data:
        if self.writeInProgress:
            self.outQueue.append(data)
        else:
            ....
Based on these details, the behavior of the write function can be changed to do as desired.
Maybe you can register your protocol as a pull producer to the transport:

self.transport.registerProducer(self, False)

and then create a write method in your protocol whose job is to buffer the data until the transport calls your protocol's resumeProducing method to fetch the data one piece at a time:
def write(self, data):
    self._buffers.append(data)

def resumeProducing(self):
    # pop(0) keeps the queued messages in FIFO order.
    data = self._buffers.pop(0)
    self.transport.write(data)
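Putting that together, a hedged sketch of the whole idea (the class and attribute names are illustrative, not a definitive implementation): each queued chunk is handed to the transport in its own resumeProducing call, so each one is flushed as a separate write.

from twisted.internet import protocol

class ChunkedWriteProtocol(protocol.Protocol):
    def connectionMade(self):
        self._buffers = []
        # Register as a pull (non-streaming) producer; the transport
        # calls resumeProducing whenever its send buffer drains.
        self.transport.registerProducer(self, False)

    def write(self, data):
        was_empty = not self._buffers
        self._buffers.append(data)
        if was_empty:
            # Restart the pump if the queue had gone idle.
            self.resumeProducing()

    def resumeProducing(self):
        if self._buffers:
            self.transport.write(self._buffers.pop(0))

    def stopProducing(self):
        self._buffers = []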