Streaming uploads using Python Requests

I want to stream an "infinite" (i.e. continuous) amount of data using HTTP Post. Basically, I want to send the POST request header and then stream the content (where the content length is unknown). I looked through http://docs.python-requests.org/en/latest/user/advanced/ and it seems to have the facility. The one question I have is it says in the document " To stream and upload, simply provide a file-like object for your body". What does "file-like" object mean? The data I wish to stream comes from a sensor. How do I implement a "file-like" object which will read data from the sensor and pass it to the caller?
Sorry about my ignorance here but I am feeling my way through python (i.e. learning as I go along. hmm.. looks like a snake. It feels slithery. Trying to avoid the business end of the critter... :-) ).
Thank you in advance for your help.
Ranga.

Just wanted to give you an answer so you could close your question:
It sounds like what you're really looking for is WebSockets in Python. Internally, you make an HTTP request to upgrade the connection to a WebSocket, and after the handshake you are free to stream data both ways. Python makes this easy, for example:
from flask import Flask
from flask_sockets import Sockets

app = Flask(__name__)
sockets = Sockets(app)

@sockets.route('/echo')
def echo_socket(ws):
    while True:
        message = ws.receive()
        ws.send(message)

@app.route('/')
def hello():
    return 'Hello World!'
WebSockets support full-duplex communication, but you seem to be interested only in the server-to-client part. In that case you can just stream data using ws.send(). I'm not sure if this is what you're looking for, but it should provide a solution.

A file-like object is an object with a "read" method that accepts a size and returns a binary data buffer for the next chunk of data.
One example is, indeed, the file object, if you want to read from the filesystem.
Another common case is the StringIO class, which reads from and writes to an in-memory buffer.
In your case, you would need to implement a "file-like object" yourself, one that simply reads from the sensor:
class Sensor(object):
    def __init__(self, sensor_thing):
        self.sensor_thing = sensor_thing

    def read(self, size):
        # size is ignored here: one sensor reading is returned per call.
        return self.convert_to_binary(self.sensor_thing.read_from_sensor())

    def convert_to_binary(self, sensor_data):
        ...  # sensor-specific conversion goes here
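Besides file-like objects, the Requests documentation also describes chunk-encoded requests: if the body is a generator (or any iterator without a length), it is sent with chunked transfer encoding, which fits data of unknown length. Below is a minimal sketch of that route. The names sensor_chunks, stream_to_server and read_sample are hypothetical, and packing each reading as an 8-byte float is only an assumption about what the sensor produces.

```python
import struct
from typing import Callable, Iterator, Optional


def sensor_chunks(read_sample: Callable[[], float],
                  max_samples: Optional[int] = None) -> Iterator[bytes]:
    """Yield one binary chunk per sensor reading.

    read_sample is a hypothetical callable standing in for the real
    sensor driver; max_samples=None means "keep streaming forever".
    """
    sent = 0
    while max_samples is None or sent < max_samples:
        # Pack each reading as a big-endian 8-byte float (an arbitrary choice).
        yield struct.pack(">d", read_sample())
        sent += 1


def stream_to_server(url: str, chunks: Iterator[bytes]):
    """POST the chunks; Requests uses chunked transfer encoding for generators."""
    import requests  # third-party; imported lazily so the generator works without it
    return requests.post(url, data=chunks)
```

With a real driver this might be called as stream_to_server(url, sensor_chunks(driver.read_from_sensor)), where driver is whatever object wraps the sensor. The server has to accept chunked request bodies for this to work.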

Related

Save data sent by the client to the server in a REST API

My knowledge of RESTful services is almost 0 and I've been struggling with this issue for a couple of days now.
What I'm trying to achieve is having a client communicate with other clients through REST, by modifying a specific variable saved in the API file so that, later on, another client can request that variable after it has been changed.
Something akin to this:
from flask import Flask

app = Flask(__name__)
aString = ""

@app.route("/")
def home():
    return "<h1>THE WORST REST API</h1>"

@app.route("/write")
def write():
    aString = "chop"
    return home()

@app.route("/read")
def read():
    return aString
The client calls /read, receives ""
The client calls /write, aString changes into "chop"
Another client calls /read, receives "chop"
Now, this isn't possible with this code (though I'm not sure why; I guess the reason is that every REST request reloads the API, so changes made to aString are lost immediately). However, I need some way to achieve this.
As I said, I am absolutely ignorant when it comes to this stuff, but I absolutely have to make the 2 clients communicate using this string one way or another, while keeping things as simple as possible.
Also, I'm a tad bit restrained when it comes to tools I can download/install due to the fact I'm coding in a work environment and I can't download too much stuff just to make this small thing work. Any solution is appreciated as long as all the libraries/modules I need can simply be retrieved using "pip install".
Thanks a lot to any of you who's willing to answer!

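To start with, HTTP is stateless: by default, neither the server nor the client saves state between requests.
That is why you should not rely on an in-memory variable such as aString to carry data from one request to the next. In the code shown there is also a Python scoping issue: the assignment inside write() creates a new local variable, so the module-level aString is never modified.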
You have several options:
either store the string in a text file or a database
make use of sessions (e.g. save a cookie or with similar techniques)
Which way you choose depends on your requirements.
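As a rough illustration of the first option (storing the string in a text file), here is a minimal sketch that uses only the standard library. The names STATE_FILE, write_value and read_value are hypothetical, and the file location is an assumption; any path the server process can write to would do.

```python
import os
import tempfile

# Hypothetical location for the shared value.
STATE_FILE = os.path.join(tempfile.gettempdir(), "shared_string.txt")


def write_value(value: str, path: str = STATE_FILE) -> None:
    """Persist the value so that later requests can see it."""
    # Write to a temporary file first, then rename it into place,
    # so a concurrent reader never sees a half-written file.
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as fh:
        fh.write(value)
    os.replace(tmp_path, path)


def read_value(path: str = STATE_FILE) -> str:
    """Return the stored value, or an empty string if nothing was written yet."""
    try:
        with open(path, encoding="utf-8") as fh:
            return fh.read()
    except FileNotFoundError:
        return ""
```

In the question's handlers, /write would call write_value("chop") and /read would return read_value(). Because the value lives outside the Python process, it does not depend on how the server handles individual requests.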

How to efficiently send data back from a server to a client with Python gRPC

I would like to know if there is a recommended way to return data from a server to a client in gRPC Python.
Currently, I have a dedicated server RPC call that blocks on every client call: it loops on a data queue (which blocks when empty) to get the data and sends it back to the client. The server implementation of this call:
def GetData(self, request, context):
    while self._is_initialized:
        data = self._processed_data_queue.get()
        yield data
    print('client out')
It seems super awkward, non-scalable, and obviously slows down the communication.
In the NodeJS, C#, and C++ implementations, it is much easier to accomplish this.
But with the Python implementation it doesn't seem possible to accomplish this efficiently. I really hope I'm missing something.
In addition, the server currently accepts data from a client, adds it to a queue, and then returns it to the client (without any processing). Again, with the code above, my performance drops dramatically even without any processing!
Thanks,
Mike
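One detail of the handler above is that a blocking get() with no timeout never notices that the client has gone away. The sketch below shows one possible variation, not a confirmed fix for the performance problem: poll the queue with a timeout and stop once the call is no longer active. The helper name drain_queue and the poll interval are assumptions; in a gRPC Python servicer, context.is_active is the callable that would be passed as is_active.

```python
import queue
from typing import Callable, Iterator


def drain_queue(data_queue: "queue.Queue",
                is_active: Callable[[], bool],
                poll_seconds: float = 0.5) -> Iterator[object]:
    """Yield items from data_queue until is_active() returns False."""
    while is_active():
        try:
            item = data_queue.get(timeout=poll_seconds)
        except queue.Empty:
            # Nothing arrived in time; loop around and re-check is_active().
            continue
        yield item
```

Inside GetData this could be used as: for data in drain_queue(self._processed_data_queue, context.is_active): yield data.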

How to check if a Python module is sending my data when I use it?

The title pretty much says it.
I need to make sure that, while I am working with Python modules, there isn't any malicious code in the module, specifically the kind that scrapes data from the machine running the code and sends it elsewhere.
Is there a way to do that with Python?
Can I be certain of this even when I am using modules like requests for sending and receiving HTTP GET/POST requests?
I mean, is there a way to check this without reading every line of code in the module?

Your question is not really about Python; it is more about security risk. Python is a dynamic language, so checking whether any given module behaves correctly is nearly impossible. However, what you can do is set up a virtual machine as a sandbox, run your program with some fake data, and check whether the guest machine tries to make any strange connections. You can then inspect where the data is being sent and in what format, and trace it back to the malicious code fragment in one of the modules.
EDIT
The only other option applies if you are sure which method or function the malicious code will use. If it is, for example, the requests library, you could patch the post() method to check the destination or the payload that is being sent. However, the malicious code could use its own implementation, so you cannot be 100% sure.
A link on how to patch post() method
How to unit test a POST method in python?
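The idea of patching a sending function to check the destination can be sketched with the standard library alone. guard_destinations and the allow-list are hypothetical names, and the wrapper assumes the URL is passed as the first positional argument. As noted above, code that bypasses the patched function is not caught.

```python
from functools import wraps
from urllib.parse import urlsplit


def guard_destinations(send_func, allowed_hosts):
    """Wrap send_func(url, ...) so that it refuses unexpected hosts."""
    allowed = set(allowed_hosts)

    @wraps(send_func)
    def guarded(url, *args, **kwargs):
        host = urlsplit(url).hostname
        if host not in allowed:
            raise PermissionError("blocked request to unexpected host: %r" % host)
        # Destination is on the allow-list: delegate to the real function.
        return send_func(url, *args, **kwargs)

    return guarded
```

With the requests library this might be applied as requests.post = guard_destinations(requests.post, {"api.example.com"}), before the suspect module is imported. A module that has already done "from requests import post" keeps its reference to the original function.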

It's better to have a global approach using tools like Wireshark for example that lets you sniff the packets sent/received by your machine.
With that said, in python, you could overwrite some methods that you're suspicious about. Here's the idea
import requests

def write_to_logs(message):
    print(message)  # Or you could store it in a log file

original_get = requests.get

def mocked_get(*args, **kwargs):
    write_to_logs('get method triggered with args = {}, kwargs= {}'.format(args, kwargs))
    # Return the real response so that callers keep working.
    return original_get(*args, **kwargs)

requests.get = mocked_get
response = requests.get('http://google.com')
Output :
get method triggered with args = ('http://google.com',), kwargs= {}

Twisted, FTP, and "streaming" large files

I'm attempting to implement what can best be described as "an FTP interface to an HTTP API". Essentially, there is an existing REST API that can be used to manage a user's files for a site, and I'm building a mediator server that re-exposes this API as an FTP server. So you can login with, say, Filezilla and list your files, upload new ones, delete old ones, etc.
I'm attempting this with twisted.protocols.ftp for the (FTP) server, and twisted.web.client for the (HTTP) client.
The thing I'm running up against is, when a user tries to download a file, "streaming" that file from an HTTP response to my FTP response. Similar for uploading.
The most straightforward approach would be to download the entire file from the HTTP server, then turn around and send the contents to the user. The problem with this is that any given file could be many gigabytes large (think drive images, ISO files, etc). With this approach, though, the contents of the file would be held in memory between the time I download it from the API and the time I send it to the user - not good.
So my solution is to try to "stream" it - as I get chunks of data from the API's HTTP response, I just want to turn around and send those chunks along to the FTP user. Seems straightforward.
For my "custom FTP functionality", I'm using a subclass of ftp.FTPShell. The reading method of this, openForReading, returns a Deferred that fires with an implementation of IReadFile.
Below is my (initial, simple) implementation for "streaming HTTP". I use the fetch function to setup an HTTP request, and the callback I pass in gets called with each chunk I get from the response.
I thought I could use some sort of two-ended buffer object to transport the chunks between the HTTP and FTP, by using the buffer object as the file-like object required by ftp._FileReader, but that's quickly proving not to work, as the consumer from the send call almost immediately closes the buffer (because it's returning an empty string, because there's no data to read yet, etc). Thus, I'm "sending" empty files before I even start receiving the HTTP response chunks.
Am I close, but missing something? Am I on the wrong path altogether? Is what I want to do really impossible (I highly doubt that)?
from twisted.web import client
import urlparse

class HTTPStreamer(client.HTTPPageGetter):
    def __init__(self):
        self.callbacks = []

    def addHandleResponsePartCallback(self, callback):
        self.callbacks.append(callback)

    def handleResponsePart(self, data):
        for cb in self.callbacks:
            cb(data)
        client.HTTPPageGetter.handleResponsePart(self, data)

class HTTPStreamerFactory(client.HTTPClientFactory):
    protocol = HTTPStreamer

    def __init__(self, *args, **kwargs):
        client.HTTPClientFactory.__init__(self, *args, **kwargs)
        self.callbacks = []

    def addChunkCallback(self, callback):
        self.callbacks.append(callback)

    def buildProtocol(self, addr):
        p = client.HTTPClientFactory.buildProtocol(self, addr)
        for cb in self.callbacks:
            p.addHandleResponsePartCallback(cb)
        return p

def fetch(url, callback):
    parsed = urlparse.urlsplit(url)
    f = HTTPStreamerFactory(parsed.path)
    f.addChunkCallback(callback)
    from twisted.internet import reactor
    reactor.connectTCP(parsed.hostname, parsed.port or 80, f)
As a side note, this is only my second day with Twisted - I spent most of yesterday reading through Dave Peticolas' Twisted Introduction, which has been a great starting point, even if based on an older version of twisted.
That said, I may be doing things wrong.

I thought I could use some sort of two-ended buffer object to transport the chunks between the HTTP and FTP, by using the buffer object as the file-like object required by ftp._FileReader, but that's quickly proving not to work, as the consumer from the send call almost immediately closes the buffer (because it's returning an empty string, because there's no data to read yet, etc). Thus, I'm "sending" empty files before I even start receiving the HTTP response chunks.
Instead of using ftp._FileReader, you want something that will do a write whenever a chunk arrives from your HTTPStreamer to a callback it supplies. You never need/want to do a read from a buffer on the HTTP, because there's no reason to even have such a buffer. As soon as HTTP bytes arrive, write them to the consumer. Something like...
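from zope.interface import implements
from twisted.protocols.ftp import IReadFile

class FTPStreamer(object):
    implements(IReadFile)

    def __init__(self, url):
        self.url = url

    def send(self, consumer):
        # Forward each HTTP chunk straight to the FTP consumer.
        fetch(self.url, consumer.write)
        # You also need a Deferred to return here, so the
        # FTP implementation knows when you're done.
        return someDeferred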
You may also want to use Twisted's producer/consumer interface to allow the transfer to be throttled, as may be necessary if your connection to the HTTP server is faster than your user's FTP connection to you.

Reading socket buffer using asyncore

I'm new to Python (I have been programming in Java for multiple years now though), and I am working on a simple socket-based networking application (just for fun). The idea is that my code connects to a remote TCP end-point and then listens for any data being pushed from the server to the client, and perform some parsing on this.
The data being pushed from server -> client is UTF-8 encoded text, and each line is delimited by CRLF (\x0D\x0A). You probably guessed: the idea is that the client connects to the server (until cancelled by the user), and then reads and parses the lines as they come in.
I've managed to get this to work, however, I'm not sure that I'm doing this quite the right way. So hence my actual questions (code to follow):
Is this the right way to do it in Python (ie. is it really this simple)?
Any tips/tricks/useful resources (apart from the reference documentation) regarding buffers/asyncore?
Currently, the data is being read and buffered as follows:
def handle_read(self):
    self.ibuffer = b""
    while True:
        self.ibuffer += self.recv(self.buffer_size)
        if ByteUtils.ends_with_crlf(self.ibuffer):
            self.logger.debug("Got full line including CRLF")
            break
        else:
            self.logger.debug("Buffer not full yet (%s)", self.ibuffer)
    self.logger.debug("Filled up the buffer with line")
    print(str(self.ibuffer, encoding="UTF-8"))
The ByteUtils.ends_with_crlf function simply checks the last two bytes of the buffer for \x0D\x0A. The first question is the main one (answer is based on this), but any other ideas/tips are appreciated. Thanks.

TCP is a stream, so there is no guarantee that your buffer will not contain the end of one message and the beginning of the next.
So, checking for \r\n only at the end of the buffer will not work as expected in all situations. You have to scan the whole buffer for the delimiter.
And, I would strongly recommend that you use Twisted instead of asyncore.
Something like this (from memory, might not work out of the box):
from twisted.internet import reactor, protocol
from twisted.protocols.basic import LineReceiver

class MyHandler(LineReceiver):
    def lineReceived(self, line):
        # Called once per complete line, with the delimiter stripped.
        print("Got line:", line)

f = protocol.ClientFactory()
f.protocol = MyHandler
reactor.connectTCP("127.0.0.1", 4711, f)
reactor.run()
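The framing problem described above (one recv may hold part of a line, or several lines) can also be sketched without any framework. LineBuffer is a hypothetical helper, not part of asyncore or Twisted: it keeps leftover bytes between reads and returns only complete CRLF-terminated lines.

```python
class LineBuffer:
    """Accumulate raw bytes and hand back complete delimiter-terminated lines."""

    def __init__(self, delimiter: bytes = b"\r\n"):
        self._delimiter = delimiter
        self._pending = b""

    def feed(self, data: bytes) -> list:
        """Add newly received bytes; return the lines completed by them."""
        self._pending += data
        # The last element is whatever follows the final delimiter
        # (possibly empty); keep it for the next call.
        *lines, self._pending = self._pending.split(self._delimiter)
        return [line.decode("utf-8") for line in lines]
```

In the question's handle_read, each call would do a single recv and pass the result to feed(), then process whatever lines come back, instead of looping until the buffer happens to end with CRLF.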

It's even simpler -- look at asynchat and its set_terminator method (and other helpful tidbits in that module). Twisted is orders of magnitude richer and more powerful, but, for sufficiently simple tasks, asyncore and asynchat (which are designed to interoperate smoothly) are indeed very simple to use, as you've started observing.
