Pushing data once a URL is requested - python

Given, when a user requests /foo on my server, I send the following HTTP response (not closing the connection):
Content-Type: multipart/x-mixed-replace; boundary=-----------------------
-----------------------
Content-Type: text/html
foo
When the user goes to /bar (which will send 204 No Content so the view doesn't change), I want to send the following data in the initial response.
-----------------------
Content-Type: text/html
bar
How would I get the second request to trigger this from the initial response? I'm planning on possibly creating a fancy [engines that support multipart/x-mixed-replace (currently only Gecko)]-only email webapp that does server-push and Ajax effects without any JavaScript, just for fun.

No complete answer, but:
In your question, you're describing a Comet-style architecture. Regarding support of Comet-style techniques in Python/WSGI, there is a StackOverflow question, which talks about various Python servers with support for long-running requests a la Comet.
Also interesting is this mail thread in the Python Web-SIG: "Could WSGI handle Asynchronous response?". In May 2008, there was a broad discussion in the Web-SIG about the topic of asynchronous requests in WSGI.
A recent development is evserver, a lightweight WSGI server, which implements the Asynchronous WSGI extension proposed by Christopher Stawarz in the Web-SIG in May 2008.
Finally, the Tornado web server supports non-blocking asynchronous requests. It has a chat example application using long polling, which has similarities with your requirements.

If the problem is to pass some command from /bar application to /foo application and you are using some servlet-like approach (the Python code is loaded once and not for each request as in CGI), you can just change some class property of the /foo application and be ready to react to the change in the /foo instance (by checking the property state).
Obviously, the /foo application should not return right after the first request; instead it should keep the response open and yield content piece by piece.
Though this is just theory; I have not tried it myself.

I have created some small example (just for fun, you know :))
import threading

num = 0
cond = threading.Condition()

def app(environ, start_response):
    global num
    cond.acquire()
    num += 1
    cond.notifyAll()
    cond.release()
    start_response("200 OK", [("Content-Type", "multipart/x-mixed-replace; boundary=xxx")])
    while True:
        n = num
        s = "--xxx\r\nContent-Type: text/html\r\n\r\n%s\n" % n
        yield s
        # wait for num change:
        cond.acquire()
        while num == n:
            cond.wait()
        cond.release()

from cherrypy.wsgiserver import CherryPyWSGIServer
server = CherryPyWSGIServer(("0.0.0.0", 3000), app)
try:
    server.start()
except KeyboardInterrupt:
    server.stop()

# Now whenever you visit http://127.0.0.1:3000/, the number increases.
# It also automatically increases in all previously opened windows/tabs.
The idea of a shared variable and thread synchronization (using a condition variable) relies on the fact that CherryPyWSGIServer is a threaded WSGI server: each request is handled by its own worker thread, so a new request can wake up the ones that are still waiting.

Not sure if this is quite what you're looking for, but there is a fairly old way of doing server push using the MIME content type multipart/x-mixed-replace.
Basically, you compose the response as a MIME object with content type multipart/x-mixed-replace and send the first "version" of a document down. The browser will keep the socket open.
Then as the server decides to push more data, a new "version" of the document gets sent from the server, and the browser will intelligently replace (within whatever frame/iframe contains the content) the content.
This was an early way of doing webcams, where the server would send down (push) image after image, and the browser would just keep replacing the image in the document over and over. This is also a way of doing a "Loading..." message over a single HTTP request.
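To make the framing described above concrete, here is a minimal sketch of how each "version" could be serialized as one part of a multipart/x-mixed-replace body. The helper names (format_part, push_versions) and the boundary "xxx" are made up for illustration, and the Content-Length header is an optional addition that the answer does not mention.

```python
def format_part(boundary, html):
    """Return one multipart/x-mixed-replace part as bytes."""
    body = html.encode("utf-8")
    head = (
        "--{}\r\n"
        "Content-Type: text/html; charset=utf-8\r\n"
        "Content-Length: {}\r\n"
        "\r\n"
    ).format(boundary, len(body)).encode("ascii")
    # Each part ends with CRLF so the next boundary starts on its own line.
    return head + body + b"\r\n"


def push_versions(boundary, versions):
    """Yield each document version as its own part, oldest first."""
    for html in versions:
        yield format_part(boundary, html)


if __name__ == "__main__":
    # The response itself would carry the header
    # Content-Type: multipart/x-mixed-replace; boundary=xxx
    for chunk in push_versions("xxx", ["<p>foo</p>", "<p>bar</p>"]):
        print(chunk)
```

A WSGI application could yield these chunks one at a time, waiting between them until there is something new to push.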

Related

Contract testing with Kafka in Python environment?

I am working with multiple applications that communicate asynchronously using Kafka. These applications are managed by several departments and contract testing is appropriate to ensure that the messages used during communication follow the expected schema and will evolve according to the contract specification.
It sounded like the pact library for Python is a good fit because it helps create contract tests for HTTP and message integrations.
What I wanted to do is to send an HTTP request and listen on the appropriate, dedicated Kafka topic immediately after. But it seems that the test is forcing me to specify an HTTP code even though what I am expecting is a message from a queue, without an HTTP status code. Furthermore, it seems that the HTTP request is being sent before the consumer is listening. Here is some sample code.
import atexit
import unittest

from pact.consumer import Consumer as p_Consumer
from pact.provider import Provider as p_Provider
from confluent_kafka import Consumer as k_Consumer

pact = p_Consumer('Consumer').has_pact_with(p_Provider('Provider'))
pact.start_service()
atexit.register(pact.stop_service)

config = {'bootstrap.servers':'server', 'group.id':0, 'auto.offset.reset':'latest'}
consumer = k_Consumer(config)
consumer.subscribe(['usertopic'])

def user():
    # Block until a message arrives on the topic, then return its payload.
    while True:
        msg = consumer.poll(timeout=1)
        if msg is None:
            continue
        else:
            return msg.value().decode()

class ConstractTesting(unittest.TestCase):
    # The test method name is illustrative; the snippet did not show one.
    def test_get_user(self):
        expected = {
            'username': 'UserA',
            'id':123,
            'groups':['Editors']
        }
        (pact.given('UserA exists and is not an administrator')
         .upon_receiving('a request for UserA')
         .with_request(method='GET',path='/user/')
         .will_respond_with(200, body=expected))
        with pact:
            result = user()
        self.assertEqual(result,expected)
How would I carry out contract testing in Python using Kafka? It feels like I am going through a lot of hoops to carry out this test.
With Pact's message support, you write tests against a different API. You don't use the standard HTTP one; in fact, the transport itself is ignored altogether, and it's just the payload (the message) that we're interested in capturing and verifying. This allows us to test any queue without having to build specific interfaces for each.
See this example: https://github.com/pact-foundation/pact-python/blob/02643d4fb89ff7baad63e6436f6a929256c6bf12/examples/message/tests/consumer/test_message_consumer.py#L65
You can read more about message pact testing here: https://docs.pact.io/getting_started/how_pact_works#non-http-testing-message-pact
And finally here are some Kafka examples for other languages that may be helpful: https://docs.pactflow.io/docs/examples/kafka/js/consumer
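Because the transport is ignored, message contract tests tend to be easiest when the code that interprets the payload is kept apart from the Kafka polling loop. The following is only a sketch of that split and does not use the Pact API at all. The function names and the required fields are assumptions, taken from the expected dictionary in the question.

```python
import json


def handle_user_message(payload):
    """Validate and normalise a decoded user message (transport-agnostic)."""
    for key in ("username", "id", "groups"):
        if key not in payload:
            raise ValueError("missing field: {}".format(key))
    return {
        "username": str(payload["username"]),
        "id": int(payload["id"]),
        "groups": list(payload["groups"]),
    }


def handle_raw_kafka_value(raw_bytes):
    """Thin adapter: decode the bytes a Kafka client hands over, then delegate."""
    return handle_user_message(json.loads(raw_bytes.decode("utf-8")))


if __name__ == "__main__":
    sample = b'{"username": "UserA", "id": 123, "groups": ["Editors"]}'
    print(handle_raw_kafka_value(sample))
```

A message pact test would then exercise something like handle_user_message directly with the payload defined in the contract, while the Kafka consumer only calls the adapter.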

How to loop GETs until a certain response is received

I'm looking for some advice, or a relevant tutorial regarding the following:
My task is to set up a flask route that POSTs to API endpoint X, receives a new endpoint Y in X's response, then GETs from endpoint Y repeatedly until it receives a certain status message in the body of Y's response, and then returns Y's response.
The code below (irrelevant data redacted) accomplishes that goal in, I think, a very stupid way. It returns the appropriate data occasionally, but not reliably. (It times out 60% of the time.) When I console log very thoroughly, it seems as though I have bogged down my server with multiple while loops running constantly, interfering with each other.
I'll also receive this error occasionally:
SIGPIPE: writing to a closed pipe/socket/fd (probably the client disconnected) on request /book
import sys, requests, time, json
from flask import Flask, request

# create the Flask app
app = Flask(__name__)

# main booking route
@app.route('/book', methods=['POST']) #GET requests will be blocked
def book():
    # defining the api-endpoints
    PRICING_ENDPOINT = ...
    # data to be sent to api
    data = {...}
    # sending post request and saving response as response object
    try:
        r_pricing = requests.post(url = PRICING_ENDPOINT, data = data)
    except requests.exceptions.RequestException as e:
        # a view has to return a response, not the exception object itself
        return str(e), 502
    # extracting the poll endpoint from the response headers
    POLL_ENDPOINT = r_pricing.headers['location']
    # setting data for poll
    data_for_poll = {...}
    r_poll = requests.get(POLL_ENDPOINT, data = data_for_poll)
    # poll loop, looking for 'UpdatesComplete'
    j = 1  # poll attempt counter
    poll_json = r_poll.json()
    update_status = poll_json['Status']
    while update_status == 'UpdatesPending':
        time.sleep(2)
        j += 1
        r_poll = requests.get(POLL_ENDPOINT, data = data_for_poll)
        poll_json = r_poll.json()
        update_status = poll_json['Status']
    return r_poll.text
This is more of an architectural issue than a Flask issue. Long-running tasks in Flask views are always a poor design choice. In this case, the route's response depends on two endpoints of another server. In effect, on top of the responsibility of your own app, you are also carrying the responsibility of another server.
Since the application seems to be designed as a proxy for another service, I would recommend building the proxy properly. Just as book() proxies the PRICING_ENDPOINT POST request, create another route that proxies the POLL_ENDPOINT GET request, and move the polling logic to the client code (JS).
Update:
If for some reason you cannot trust the client (browser -> JS) with the POLL_ENDPOINT information, as in a hidden-proxy situation, then consider moving the polling to a task runner like Celery or Python RQ. Although it will introduce additional components to your application, it would be the right way to go.
You probably get that error because the HTTP connection to your server times out while it is still looping. HTTP connections have a time limit, and the loop took longer than the connection allows. The first (straightforward) solution is to "play" with the Apache configs and increase the HTTP connection timeout for your WSGI application. You can also open a socket connection, check the update status over it, and close it once the goal is achieved. Or you can move your logic to the client side.
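Wherever the polling ends up living (client, task runner, or a worker), it helps to bound the loop so that it cannot outlive the caller's patience. Below is a minimal sketch of such a bounded poll. The name poll_until and its parameters are hypothetical, and fetch_status stands in for something like requests.get(POLL_ENDPOINT).json().

```python
import time


def poll_until(fetch_status, done_when, timeout=30.0, interval=2.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until done_when(result) is true or timeout elapses."""
    deadline = clock() + timeout
    while True:
        result = fetch_status()
        if done_when(result):
            return result
        # Give up if another wait would take us past the deadline.
        if clock() + interval > deadline:
            raise TimeoutError("status did not settle within {}s".format(timeout))
        sleep(interval)


if __name__ == "__main__":
    # Stand-in for the remote endpoint: pending twice, then complete.
    replies = iter([{"Status": "UpdatesPending"},
                    {"Status": "UpdatesPending"},
                    {"Status": "UpdatesComplete"}])
    final = poll_until(lambda: next(replies),
                       lambda r: r["Status"] != "UpdatesPending",
                       interval=0.01)
    print(final)
```

Passing sleep and clock as parameters is a design choice that keeps the loop testable without real waiting.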

pika for rabbitMQ crashing while using flask server

So we have a single thread flask server running where we receive requests from a python app client. In this flask server we use rabbitMQ with pika library to distribute messages to other clients.
What is happening is that in the get function the program is crashing with the error:
pika.exceptions.ConnectionClosed: (505, 'UNEXPECTED_FRAME - expected
content header for class 60, got non content header frame instead')
I've searched a lot of topics about this on Stack Overflow and elsewhere, but they all address problems with multithreading, which is not the case here. Flask should only serve with one thread unless it is started with app.run(threaded=True).
The program normally crashes when multiple messages are sent in a short interval (e.g. 5 per second) and it's also important to note that messages are being received every second with a request to this function:
@app.route('/api/users/getMessages', methods=['POST'])
def get_Messages():
    data = json.loads(request.data)
    token = data['token']
    payload = jwt.decode(token, 'SECRET', algorithms=['HS256'])
    istid = payload['istid']
    print('istid: '+istid)
    messages = []
    queue = channel.queue_declare(queue=istid)
    for i in range(queue.method.message_count):
        method_frame, header_frame, body = channel.basic_get(queue=istid, no_ack=True)
        if method_frame:
            #print(method_frame, header_frame, body)
            messages.append(body)
        else:
            print('No message returned')
    res = {'messages':messages, 'error':0}
    return jsonify(res)
In this code it crashes normally in the line:
queue = channel.queue_declare(queue=istid)
But we also tried to change the code to use a while instead of a for where it ends when the body is None and it crashes in the line:
method_frame, header_frame, body = channel.basic_get(queue=istid, no_ack=True)
in that case.
Also important, the crashes are random and it can work a few times and then randomly crashes after a get request while messages are being sent. If anyone knows anything related to this we would appreciate any help.
Another note, we thought about using basic_consume with callback instead of basic_get but we didn't find a way in which this would work since we have to send the messages back and have several user making requests to this same function.
EDIT #1:
In the RabbitMQ docs, if you search for the function "def basic_get", you will notice there are some TODO comments and also this note:
Due to implementation details, this cannot be called a second time
until the callback is executed.
So I suspected that this could be what was happening, but even if it is, I don't know how it could be solved.
For anyone interested in the solution, as it is in the other comments, the program was not thread safe since flask as of version 1.0 uses threaded = True as default.
The solution is either:
1) running flask with app.run(threaded = False)
2) Making the program thread safe by implementing locks whenever accessing the channel /connection with pika.
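As a rough illustration of option 2, every access to the shared channel can be wrapped in a single lock, so that two request threads never interleave frames on the same connection. This is a sketch, not the poster's code: drain_queue, channel_lock and FakeChannel are hypothetical names, and FakeChannel only imitates the (method, properties, body) tuple that pika's basic_get returns. The auto_ack keyword is the pika 1.x name for what the question's code passes as no_ack.

```python
import threading

channel_lock = threading.Lock()  # one lock per shared pika channel/connection


def drain_queue(channel, queue_name, limit=100):
    """Fetch up to `limit` message bodies while holding the channel lock."""
    bodies = []
    with channel_lock:
        for _ in range(limit):
            method_frame, _properties, body = channel.basic_get(
                queue=queue_name, auto_ack=True)
            if method_frame is None:
                break  # queue is empty
            bodies.append(body)
    return bodies


class FakeChannel:
    """Hypothetical stand-in for a pika channel, for illustration only."""

    def __init__(self, bodies):
        self._bodies = list(bodies)

    def basic_get(self, queue, auto_ack=False):
        if not self._bodies:
            return None, None, None
        return object(), None, self._bodies.pop(0)


if __name__ == "__main__":
    print(drain_queue(FakeChannel([b"a", b"b"]), "some-user"))
```

Any publishing code that uses the same channel would need to take the same lock for this to help.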

Sending a flask request within a flask request

I am implementing an endpoint in my Flask application that receives a collection of HTTP requests, and returns a collection of the corresponding HTTP responses. In order to accomplish this, I need my endpoint to call other endpoints in order to construct the result. However, because Flask is blocking while processing the original request, it cannot process the nested requests and the application gets deadlocked.
Is there any way to issue a request within a request in flask in a way that doesn't result in a deadlock?
I included a segment of my code which I believe should be enough to illustrate the problem without overwhelming you. If you would like to see more of it please let me know and I'll share.
from requests import Session, Request

def split(request):
    multipart = request.stream.read()
    boundary = request.content_type.split(';')[1]
    prefix = ' boundary"'
    suffix = '"'
    delimiter = '--%s' % boundary[len(prefix)+1:-len(suffix)]
    subrequests = [s.lstrip() for s in multipart.split(delimiter)]
    for sub in subrequests:
        status_line, _, more_lines = sub.partition('\n')
        method, path, version = status_line.split()
        headers, _, body = more_lines.partition('\n\n')
        url = 'http://localhost:3000' + path
        # yield, so that every subrequest is produced rather than only the first
        yield Request(method, url, headers=headers, data=body)

@app.route('/batch', methods=["GET", "POST"])
def batch():
    subrequests = split(request)
    session = Session()
    responses = []
    for sub in subrequests:
        responses.append(session.send(sub.prepare())) # Deadlock!
There are two candidate solutions that I considered, both of which I found unsatisfactory:
1) Don't issue a full request. Instead, just call the function that is mapped to the endpoint of interest (url_for). I am unsatisfied with this approach because the nested requests have their own headers and cookies, which it neglects. Furthermore, code in the 'before_request' and 'after_request' handlers won't get called automatically.
2) Run multiple instances of the application. This will solve the problem, but expose my service to a pretty simple DoS attack. If I have X instances running, all an attacker would need to do is hit my service with X different requests to cause a deadlock.
Thank you.
Keeping in mind that the internal Flask server is not production-ready, when you use it only for development, pass the threaded=True parameter to app.run:
app.run(debug=True, threaded=True)
This happens because you're using the Flask dev server. It's not for production use.
In a production environment you would use an application server (uWSGI, Gunicorn, Tornado, ...) with or without a web server layer (NGINX, Apache, ...) to proxy/balance connections to the workers, which protects against DoS attacks (not completely, but in a lot of environments it's acceptable).
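The effect of a threaded server on nested requests can be seen without Flask at all. The sketch below uses only the standard library: the /outer handler calls /inner on the same server, which works because ThreadingHTTPServer handles each request in its own thread. The paths and helper names are made up for the demonstration, and this illustrates the mechanism rather than a production setup.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def get(port, path):
    """Issue a plain GET against the local server and return the body."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", path)
        return conn.getresponse().read()
    finally:
        conn.close()


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/outer":
            # Nested request back into the same server.
            port = self.server.server_address[1]
            body = b"outer saw: " + get(port, "/inner")
        else:
            body = b"inner"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch_outer():
    """Start a threaded server on a free port and request /outer once."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return get(server.server_address[1], "/outer").decode()
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch_outer())
```

With a single-threaded server, the nested call would wait on a server that is still busy with the outer request, which is the deadlock described in the question.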

How do I receive Github Webhooks in Python

Github offers to send Post-receive hooks to an URL of your choice when there's activity on your repo.
I want to write a small Python command-line/background (i.e. no GUI or webapp) application running on my computer (later on a NAS), which continually listens for those incoming POST requests, and once a POST is received from Github, it processes the JSON information contained within. Processing the json as soon as I have it is no problem.
The POST can come from a small number of IPs given by github; I plan/hope to specify a port on my computer where it should get sent.
The problem is, I don't know enough about web technologies to deal with the vast number of options you find when searching.. do I use Django, Requests, sockets,Flask, microframeworks...? I don't know what most of the terms involved mean, and most sound like they offer too much/are too big to solve my problem - I'm simply overwhelmed and don't know where to start.
Most tutorials about POST/GET I could find seem to be concerned with either sending or directly requesting data from a website, but not with continually listening for it.
I feel the problem is not really a difficult one, and will boil down to a couple of lines, once I know where to go/how to do it. Can anybody offer pointers/tutorials/examples/sample code?
First thing is, web is request-response based. So something will request your link, and you will respond accordingly. Your server application will be continuously listening on a port; that you don't have to worry about.
Here is the similar version in Flask (my micro framework of choice):
from flask import Flask, request
import json

app = Flask(__name__)

@app.route('/',methods=['POST'])
def foo():
    data = json.loads(request.data)
    print("New commit by: {}".format(data['commits'][0]['author']['name']))
    return "OK"

if __name__ == '__main__':
    app.run()
Here is a sample run, using the example from github:
Running the server (the above code is saved in sample.py):
burhan@lenux:~$ python sample.py
* Running on http://127.0.0.1:5000/
Here is a request to the server, basically what github will do:
burhan@lenux:~$ http POST http://127.0.0.1:5000 < sample.json
HTTP/1.0 200 OK
Content-Length: 2
Content-Type: text/html; charset=utf-8
Date: Sun, 27 Jan 2013 19:07:56 GMT
Server: Werkzeug/0.8.3 Python/2.7.3
OK # <-- this is the response the client gets
Here is the output at the server:
New commit by: Chris Wanstrath
127.0.0.1 - - [27/Jan/2013 22:07:56] "POST / HTTP/1.1" 200 -
Here's a basic web.py example for receiving data via POST and doing something with it (in this case, just printing it to stdout):
import web

urls = ('/.*', 'hooks')
app = web.application(urls, globals())

class hooks:
    def POST(self):
        data = web.data()
        print('')
        print('DATA RECEIVED:')
        print(data)
        print('')
        return 'OK'

if __name__ == '__main__':
    app.run()
I POSTed some data to it using hurl.it (after forwarding 8080 on my router), and saw the following output:
$ python hooks.py
http://0.0.0.0:8080/
DATA RECEIVED:
test=thisisatest&test2=25
50.19.170.198:33407 - - [27/Jan/2013 10:18:37] "HTTP/1.1 POST /hooks" - 200 OK
You should be able to swap out the print statements for your JSON processing.
To specify the port number, call the script with an extra argument:
$ python hooks.py 1234
I would use:
https://github.com/carlos-jenkins/python-github-webhooks
You can configure a web server to use it, or if you just need a process running there without a web server you can launch the integrated server:
python webhooks.py
This will allow you to do everything you said you need. It, nevertheless, requires a bit of setup in your repository and in your hooks.
Late to the party and shameless autopromotion, sorry.
If you are using Flask, here's a very minimal code to listen for webhooks:
from flask import Flask, request, Response

app = Flask(__name__)

@app.route('/webhook', methods=['POST'])
def respond():
    print(request.json)  # Handle webhook request here
    return Response(status=200)
And the same example using Django:
import json

from django.http import HttpResponse
from django.views.decorators.http import require_POST

@require_POST
def example(request):
    # Django's HttpRequest has no .json attribute; parse the raw body instead
    print(json.loads(request.body))  # Handle webhook request here
    return HttpResponse('Hello, world. This is the webhook response.')
If you need more information, here's a great tutorial on how to listen for webhooks with Python.
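Since anyone who learns the URL can POST to handlers like the ones above, they are often paired with a signature check. GitHub can sign each delivery with a shared secret and send the result in the X-Hub-Signature-256 header as "sha256=" followed by the hex HMAC-SHA256 of the raw request body. The sketch below shows only that comparison; the function names and the secret are placeholders, not part of any framework.

```python
import hashlib
import hmac


def sign(secret, body):
    """Compute the value expected in the X-Hub-Signature-256 header."""
    digest = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def is_valid_signature(secret, body, header_value):
    """Constant-time comparison of the expected and received signatures."""
    if not header_value:
        return False
    return hmac.compare_digest(sign(secret, body), header_value)


if __name__ == "__main__":
    secret = b"placeholder-secret"  # assumption: the secret set on the webhook
    body = b'{"zen": "Keep it logically awesome."}'
    print(is_valid_signature(secret, body, sign(secret, body)))
```

In a Flask view, the body would be request.data and the header would come from request.headers; the check has to run on the raw bytes, before any JSON parsing.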
If you're looking to watch for changes in any repo...
1. If you own the repo that you want to watch
In your repo page, Go to settings
click webhooks, new webhook (top right)
give it your ip/endpoint and setup everything to your liking
use any server to get notified
2. Not your Repo
take the url you want i.e https://github.com/fire17/gd-xo/
add /commits/master.atom to the end such as:
https://github.com/fire17/gd-xo/commits/master.atom
Use any library you want to get that page's content, like:
response = requests.get("https://github.com/fire17/gd-xo/commits/master.atom").text
filter out the keys you want, for example the <updated> element:
response.split("<updated>")[1].split("</updated>")[0]
'2021-08-06T19:01:53Z'
make a loop that checks this every so often, and if this string has changed, you can initiate a clone/pull request or do whatever you like
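The "has it changed" check from the last step could look roughly like this. It is a sketch with hypothetical helper names. It works on feed text that has already been downloaded, so the fetching and the sleep between checks are left to the caller.

```python
import re


def latest_updated(feed_text):
    """Return the first <updated> timestamp in an Atom feed, or None."""
    match = re.search(r"<updated>([^<]+)</updated>", feed_text)
    return match.group(1) if match else None


def detect_change(previous, feed_text):
    """Return (changed, current) comparing against the last seen timestamp."""
    current = latest_updated(feed_text)
    changed = current is not None and current != previous
    return changed, current


if __name__ == "__main__":
    sample = "<feed><updated>2021-08-06T19:01:53Z</updated></feed>"
    print(detect_change(None, sample))
```

A polling loop would keep the returned timestamp and pass it back in as previous on the next check.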
