Apache does not Compress/Deflate/GZIP responses - python

I have set up an Apache .htaccess file so that it should compress the JSON responses my application produces. The application is written in Python and is linked to Apache via CGI scripts.
<ifmodule mod_deflate.c>
AddOutputFilterByType DEFLATE text/text text/html text/plain text/xml text/css application/x-javascript application/javascript text/javascript text/json application/json
</ifmodule>
The JSON responses are still not gzipped, although static files are. Any ideas or thoughts on how I can fix this?
I am using Apache 2.

.htaccess doesn't affect script output, so you will have to handle gzip compression from the Python app. For example, in the case of a Django app you can use GZipMiddleware.
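As a rough sketch of doing the compression in the application itself, a plain CGI script (the payload and script name below are made up, not the asker's code) could check the client's Accept-Encoding and gzip its own body before printing the CGI headers:

#!/usr/bin/env python3
# gzip_cgi_sketch.py - illustrative only
import gzip
import json
import os
import sys

payload = json.dumps({"status": "ok"}).encode("utf-8")

headers = [("Content-Type", "application/json")]
if "gzip" in os.environ.get("HTTP_ACCEPT_ENCODING", ""):
    payload = gzip.compress(payload)
    headers.append(("Content-Encoding", "gzip"))
headers.append(("Content-Length", str(len(payload))))

# CGI output: headers, a blank line, then the (possibly compressed) body.
for name, value in headers:
    sys.stdout.write("%s: %s\r\n" % (name, value))
sys.stdout.write("\r\n")
sys.stdout.flush()
sys.stdout.buffer.write(payload)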

Related

How to compress a JSON payload from Django Rest API [duplicate]

I was wondering: would it be possible to compress the response payload in Django REST?
At the moment, the response payloads are plain JSON data. However, there's quite a lot of data to bounce back and forth so I was wondering if compressing the data would help with the bandwidth issues.
HTTP response compression will most likely not be handled by Django but by your HTTP server using the gzip or deflate algorithms.
You just need to make sure your HTTP server is configured to compress HTTP responses whose Content-Type header is set to application/json.
How to enable gzip compression for nginx: https://rtcamp.com/tutorials/nginx/enable-gzip/
The following worked for me.
I actually turned gzip on at the nginx level, not within Django or Django Rest Framework.
/etc/nginx/nginx.conf file:
http {
    # ... other settings ...
    ##
    # Gzip Settings
    ##
    gzip on;
    gzip_disable "msie6";
    gzip_vary on;
    gzip_proxied any;
    gzip_comp_level 6;
    gzip_buffers 16 8k;
    gzip_http_version 1.1;
    gzip_types text/plain text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript;
}
This leaves the compression up to the nginx server, and since most modern browsers know how to decompress gzip automatically, I didn't need to do anything on the client side - even when receiving JSON data inside an Angular SPA.
My 1.3 MB JSON payload turned into about a 180 KB payload.
A pretty quick and easy way to save MBs of data.
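If you want to confirm compression is actually happening end to end, a quick check with curl (the URL here is just a placeholder) is to request gzip explicitly and look for the Content-Encoding header in the response:

curl -s -o /dev/null -D - -H "Accept-Encoding: gzip" https://example.com/api/endpoint | grep -i content-encoding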
If you are using the Django / DRF built-in web server rather than Apache or nginx, it runs its own WSGI server, so those methods won't work for you.
However, Django does have a built-in gzip middleware which you should be able to use, as described in these answers:
https://stackoverflow.com/a/1864377/2540707
https://stackoverflow.com/a/14821684/2540707
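For reference, enabling it is a one-line settings change; a minimal sketch, assuming a Django version recent enough to use the MIDDLEWARE setting:

# settings.py (sketch)
MIDDLEWARE = [
    'django.middleware.gzip.GZipMiddleware',  # keep this near the top of the list
    # ... the rest of your middleware ...
]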
That being said, for production use you should be using a real web server rather than Django's built-in one.

uwsgi/nginx configuration for chunked response

I have two endpoints like below:
GET on /api/v1/foo
POST on /api/v1/foo
I need the POST implementation to send back chunked responses using HTTP/1.1 chunked transfer encoding, while the GET endpoint should return plain JSON.
My setup is nginx -> uwsgi -> Flask.
I see some of my chunks currently getting truncated at a hex size of 1000 (4 KB) and not at the same size my Flask layer sent them, probably because I'm missing some nginx or uwsgi configuration.
uwsgi configuration(uwsgi.ini):
[uwsgi]
route = ^/api/v1/foo$ goto:dochunked
route-run = last:
route-label = dochunked
route-if = equal:${REQUEST_METHOD};POST goto:dopostchunked
route-run = last:
route-label = dopostchunked
route-run = chunked:
nginx configuration:
location / {
    uwsgi_pass unix:var/uwsgi.sock;
    uwsgi_read_timeout 600;
    include uwsgi_params;
}
location /api/v1/foo {
    uwsgi_pass unix:var/uwsgi.sock;
    uwsgi_read_timeout 600;
    include uwsgi_params;
    if ($request_method = "POST") {
        set $chunked_transfer_encoding on;
        add_header X-Accel-Buffering no;
    }
}
curl response headers
HTTP/1.1 200 OK
Server: nginx/1.10.1
Date: Wed, 03 Jan 2018 00:06:50 GMT
Content-Type: text/plain
Transfer-Encoding: chunked
Connection: keep-alive
X-Frame-Options: deny
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block
X-Accel-Buffering: no
Chunking is about Transfer-Encoding; plain JSON is about Content-Type. The two are not related.
The transfer encoding is just about the transport mechanics used between the two HTTP/1.1 endpoints (the server and the client), much as gzip compression would be. Using chunked transmission avoids the Content-Length header and allows the response to be sent in multiple pieces, of course. But on the other side, once the response is received, the chunks are reassembled, and you should not see any difference between a response sent via Content-Length plus one big body and a body sent in multiple chunks.
I say should because you may experience problems with bad HTTP/1.1 libraries which do not wait for the end of the message (the last-chunk marker) before firing something like a response-received event to the application.
Usually, whether chunks are used is the responsibility of the HTTP server, and you have little control over it, because chunked support is a required feature of HTTP/1.1. By playing with the size of the response body and the size of the buffers used by the HTTP server, you may see differences in the way chunks are made. If there are multiple actors in the chain (here Flask and nginx), each actor can decide to reorganize the chunks, merge some of them (buffering), or not.
But as I said, you should not have to care about it, unless the client side of your application has bugs with chunked encoding, which would mean that side of the HTTP communication doesn't really understand HTTP/1.1.
Finally, if you really need to avoid chunks (though you shouldn't), I see 3 options:
You could enforce an HTTP/1.0 response. There are no chunks with HTTP/1.0, but that's a very, very old version of the protocol. To do that you'll have to ask for HTTP/1.0 on the request side; you'll get an HTTP/1.1 response from nginx, but without the advanced features of HTTP/1.1 (like chunks).
You could use the nginx chunked_transfer_encoding directive. It is on by default, so usually you use it to set it to off in a specific location. Your current way of using it does nothing. This option was made specifically for bad HTTP clients, as stated:
It may come in handy when using a software failing to support chunked
encoding despite the standard’s requirement.
You could maybe also try playing with buffering (proxy_buffering off); that may work, I'm unsure. A rough sketch combining these last two ideas follows below.
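As a sketch only (I haven't tested it against your exact setup), the location block with those two ideas applied might look like this:

location /api/v1/foo {
    include uwsgi_params;
    uwsgi_pass unix:var/uwsgi.sock;
    uwsgi_read_timeout 600;

    # chunked_transfer_encoding is a directive, not a variable to "set";
    # it is on by default, so you would only ever switch it off here.
    chunked_transfer_encoding off;

    # For a uwsgi_pass upstream the buffering directive is uwsgi_buffering
    # (proxy_buffering applies to proxy_pass); turning it off streams the
    # response through instead of collecting it into nginx's buffers first.
    uwsgi_buffering off;
}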

Python Flask CORS - API always allows any origin

I've looked through many SO answers, and can't seem to find this issue. I have a feeling that I'm just missing something obvious.
I have a basic Flask API, and I've implemented both the flask_cors extension and the custom @crossdomain Flask decorator from Armin Ronacher (http://flask.pocoo.org/snippets/56/). Both show the same issue.
This is my example app:
from flask import Flask, jsonify, request
from flask_cors import CORS, cross_origin

application = Flask(__name__,
                    static_url_path='',
                    static_folder='static')
CORS(application)
application.config['CORS_HEADERS'] = 'Content-Type'

@application.route('/api/v1.0/example')
@cross_origin(origins=['http://example.com'])
# @crossdomain(origin='http://example.com')
def api_example():
    print(request.headers)
    response = jsonify({'key': 'value'})
    print(response.headers)
    return response
(EDIT 3 inserted):
When I make a GET request to that endpoint from JS in a browser (from 127.0.0.1), it always returns 200, whereas I would expect to see:
Response to preflight request doesn't pass access control check: No 'Access-Control-Allow-Origin' header is present on the requested resource. Origin 'http://127.0.0.1:5000' is therefore not allowed access. The response had HTTP status code 403.
CURL:
ACCT:ENVIRON user$ curl -i http://127.0.0.1:5000/api/v1.0/example
HTTP/1.0 200 OK
Content-Type: application/json
Content-Length: 20
Access-Control-Allow-Origin: http://example.com
Server: Werkzeug/0.11.4 Python/2.7.11
Date: [datetime]
{
"key": "value"
}
LOG:
Content-Length:
User-Agent: curl/7.54.0
Host: 127.0.0.1:5000
Accept: */*
Content-Type:
Content-Type: application/json
Content-Length: 20
127.0.0.1 - - [datetime] "GET /api/v1.0/example HTTP/1.1" 200 -
I'm not even seeing all of the proper headers in the response, and it doesn't seem to care what the origin is in the request.
Any ideas what I'm missing? Thanks!
EDIT:
As a side note, looking at the documentation example here (https://flask-cors.readthedocs.io/en/v1.7.4/#a-more-complicated-example), it shows:
@app.route("/")
def helloWorld():
    '''
    Since the path '/' does not match the regular expression r'/api/*',
    this route does not have CORS headers set.
    '''
    return '''This view is not exposed over CORS.'''
...which is rather interesting since I already have the root path (and others) exposed without any CORS decoration, and they are working fine from any origin. So it seems that there is something fundamentally wrong with this setup.
Along those lines, the tutorial here (https://blog.miguelgrinberg.com/post/designing-a-restful-api-with-python-and-flask) seems to indicate that Flask APIs are naturally exposed without protection (I assume that's just because the CORS extension hasn't been applied), but my application is basically operating as if the CORS extension doesn't even exist (other than a few notes in the log that you can see).
EDIT 2:
My comments were unclear, so I created three example endpoints on AWS API Gateway with different CORS settings. They are GET method endpoints that simply return "success":
1) CORS not enabled (default):
Endpoint: https://t9is0yupn4.execute-api.us-east-1.amazonaws.com/prod/cors-default
Response:
XMLHttpRequest cannot load
https://t9is0yupn4.execute-api.us-east-1.amazonaws.com/prod/cors-default.
Response to preflight request doesn't pass access control check: No
'Access-Control-Allow-Origin' header is present on the requested
resource. Origin 'http://127.0.0.1:5000' is therefore not allowed
access. The response had HTTP status code 403.
2) CORS enabled - Origin Restricted:
Access-Control-Allow-Headers: 'Content-Type'
Access-Control-Allow-Origin: 'http://example.com'
Endpoint: https://t9is0yupn4.execute-api.us-east-1.amazonaws.com/prod/cors-enabled-example
Response:
XMLHttpRequest cannot load
https://t9is0yupn4.execute-api.us-east-1.amazonaws.com/prod/cors-enabled-example.
Response to preflight request doesn't pass access control check: The
'Access-Control-Allow-Origin' header has a value 'http://example.com'
that is not equal to the supplied origin. Origin
'http://127.0.0.1:5000' is therefore not allowed access.
3) CORS enabled - Origin Wildcard:
Access-Control-Allow-Headers: 'Content-Type'
Access-Control-Allow-Origin: '*'
Endpoint: https://t9is0yupn4.execute-api.us-east-1.amazonaws.com/prod/cors-enabled-wildcard
Response:
"success"
I'm not that experienced with infrastructure, but my expectation was that enabling the Flask CORS extension would cause my api endpoints to mimic this behavior depending on what I set at the origins= setting. What am I missing in this Flask setup?
SOLUTION EDIT:
Alright, so given that something on my end was obviously not normal, I stripped down my app and re-implemented some very basic APIs for each variation of CORS origin restriction. I've been using AWS Elastic Beanstalk to host the test environment, so I re-uploaded those examples and ran a JS AJAX request against each. It's now working.
I'm getting the Access-Control-Allow-Origin error on naked endpoints. It appears that when I configured the app for deployment I was uncommenting CORS(application, resources=r'/api/*'), which was obviously allowing all origins for the naked endpoints!
I'm not sure why my route with a specific restriction (origins=[]) was also allowing everything, but that must have been some type of typo or something small, because it's working now.
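For completeness, here's a sketch of the two usual ways to scope flask-cors to specific origins, as I understand them (the origin value is just an example):

from flask import Flask
from flask_cors import CORS, cross_origin

application = Flask(__name__)

# App-wide: only attach CORS headers to /api/* routes, and only for this origin.
CORS(application, resources={r"/api/*": {"origins": ["http://example.com"]}})

# Per-route: the decorator form, restricted to one origin.
@application.route('/api/v1.0/example')
@cross_origin(origins=['http://example.com'])
def api_example():
    return 'ok'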
A special thanks to sideshowbarker for all the help!
From your question as-is, it’s not completely clear what behavior you’re expecting. But as far as how the CORS protocol works, it seems like your server is already behaving as expected.
Specifically, the curl response cited in the question shows this response header:
Access-Control-Allow-Origin: http://example.com
That indicates a server that's already configured to tell browsers: only allow cross-origin requests from frontend JavaScript code if that code is running at the origin http://example.com.
If the behavior you’re expecting is that the server will now refuse requests from non-browser clients such as curl, then CORS configuration on its own isn’t going to cause a server to do that.
The only thing a server does differently when you configure it with CORS support is just to send the Access-Control-Allow-Origin response header and other CORS response headers. That’s it.
Actual enforcement of CORS restrictions is done only by browsers, not by servers.
So no matter what server-side CORS configuration you make, the server still goes on accepting requests from all clients and origins it would otherwise; in other words, all clients from all origins still keep on getting responses from the server just as they would otherwise.
But browsers will only expose responses from cross-origin requests to frontend JavaScript code running at a particular origin if the server the request was sent to opts in to permitting the request by responding with an Access-Control-Allow-Origin header that allows that origin.
That’s the only thing you can do using CORS configuration. You can’t make a server only accept and respond to requests from particular origins just by doing any server-side CORS configuration. To do that, you need to use something other than just CORS configuration.
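If you do want the server itself to turn away requests from other origins, that has to be application logic rather than CORS configuration. A rough Flask sketch (the allowed origin is just an example, and non-browser clients can omit or forge the Origin header, so this is not a security boundary):

from flask import Flask, request, abort

application = Flask(__name__)

ALLOWED_ORIGINS = {'http://example.com'}  # example value

@application.before_request
def reject_unknown_origins():
    # Browsers send Origin on cross-origin requests; curl usually sends none.
    origin = request.headers.get('Origin')
    if origin is not None and origin not in ALLOWED_ORIGINS:
        abort(403)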

Flask-CORS not working for POST, but working for GET

I'm running a Flask-Restful API locally and sending a POST request containing JSON from a different port. I'm getting the error
No 'Access-Control-Allow-Origin' header is present on the requested resource.
However, when I run
curl --include -X OPTIONS http://localhost:5000/api/comments/3 \
  --header Access-Control-Request-Method:POST \
  --header Access-Control-Request-Headers:Content-Type \
  --header Origin:http://localhost:8080
I get
HTTP/1.0 200 OK
Content-Type: text/html; charset=utf-8
Allow: HEAD, GET, POST, OPTIONS
Access-Control-Allow-Origin: http://localhost:8080
Access-Control-Allow-Methods: DELETE, GET, HEAD, OPTIONS, PATCH, POST, PUT
Vary: Origin
Access-Control-Allow-Headers: Content-Type
Content-Length: 0
which shows that an "Access-Control-Allow-Origin" header is being returned for the preflight. GET works fine; it's just POST that gives this error. What could be going wrong? If relevant, for the frontend I'm using React and requesting through axios.
You have to add CORS(app, resources={r"/*": {"origins": "*"}}) into your flask app.
Hope that solves the issue.
The Flask-CORS docs explain why this might happen:
"When using JSON cross origin, browsers will issue a pre-flight OPTIONS request for POST requests. In order for browsers to allow POST requests with a JSON content type, you must allow the Content-Type header. The simplest way to do this is to simply set the CORS_HEADERS configuration value on your application: e.g."
https://flask-cors.readthedocs.io/en/1.9.0/
app.config['CORS_HEADERS'] = 'Content-Type'
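Putting those two pieces together, a minimal sketch (the route and port are just examples) looks like this:

from flask import Flask, jsonify
from flask_cors import CORS

app = Flask(__name__)

# Allow the Content-Type header so the preflight for JSON POSTs passes,
# and enable CORS on every route (tighten the origins list as needed).
app.config['CORS_HEADERS'] = 'Content-Type'
CORS(app, resources={r"/*": {"origins": "*"}})

@app.route('/api/comments/<int:comment_id>', methods=['GET', 'POST'])
def comments(comment_id):
    return jsonify({'id': comment_id})

if __name__ == '__main__':
    app.run(port=5000)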
In my case, the CORS error was raised because of an internal error. An error completely unrelated to CORS, one that should have returned a 500, was causing this.

Django Response always Chunked with text/html cannot set Content-Length

In my Django application's views.py, I return an HttpResponse object after attempting to set the following HTTP header fields:
# Create a Response Object with the content to return
response = HttpResponse("%s"%(output_display),mimetype='text/html')
response['Cache-Control'] = 'must-revalidate, max-age=20'
response['Vary'] = 'Accept-Encoding'
response['Transfer-Encoding'] = 'gzip'
#response['Content-Encoding'] = 'gzip'
response['Connection'] = 'close'
#response['Content-Type'] = 'text/html'
response['Content-Length'] = '%s'%(len(output_display))
return response
I then capture the output using the Live HTTP Headers plugin with FireFox, and it looks like:
HTTP/1.1 200 OK
Date: Sun, 10 Mar 2013 14:55:09 GMT
Server: Apache/2.2.22 (Ubuntu)
Transfer-Encoding: gzip, chunked <---------- Why 'chunked'?
Vary: Accept-Encoding
Connection: close
Cache-Control: must-revalidate, max-age=20
Content-Encoding: gzip
Content-Type: text/html <---------------------- No Content-Length even though I set it?
X-Pad: avoid browser bug
I am trying to cache using Apache2's mem_cache, so I need the Content-Length to be set and cannot have 'chunked' for Transfer-Encoding.
My Apache2 mem_cache.conf looks like (large numbers just for testing):
<IfModule mod_mem_cache.c>
    CacheEnable mem /
    MCacheSize 10000
    MCacheMaxObjectCount 10000000
    MCacheMinObjectSize 1
    MCacheMaxObjectSize 10000000
    MCacheMaxStreamingBuffer 10000000
</IfModule>
But even though I explicitly set the Content-Length and Transfer-Encoding in my response code, 'chunked' is inserted automatically and therefore my Content-Length is not honored. Why is this? How can I fix this to get the desired response? Thanks -
I came across a similar issue recently with a mod_wsgi application; I was trying to update an apache configuration that was using its built-in disk cache, to use socache/memcache instead.
The disk cache was working, but switching to memcache or shmcb didn't work. If I issued a request for a resource I wanted cached, it wouldn't be stored in the cache (CacheDetailHeader is helpful for checking this). Checking the logs at debug level, I found the message:
[Wed Dec 05 18:52:16.571002 2018] [cache_socache:debug] \
[pid 884:tid 140422596777728] mod_cache_socache.c(389): \
[client 127.0.0.1:56576] AH02346: URL 'http://127.0.1.1:80/cacheme/c?' \
had no explicit size, ignoring, referer: http://127.0.0.1/
It seems that socache doesn't like objects that don't have explicit sizes. I tried setting the newer socache equivalents of those mod_mem_cache settings, CacheSocacheMaxSize and CacheSocacheReadSize, to sufficiently large values.
I know that the Content-Length header was being set and made it through to somewhere; it showed up in the mod_wsgi logs when I deliberately miscalculated it.
A few things I found:
Don't set the Transfer-Encoding header yourself, as this is forbidden by the WSGI specification:
Who set the Transfer-Encoding: chunked header?
Even though you're setting the Content-Length header yourself, the response is also being gzipped by Apache. This changes the length; when Apache doesn't know what the length will be, it switches to chunked and drops the Content-Length header.
I found that with:
Content-Type: text/html
Content-Length set to my utf-8 encoding size
set in the python/mod_wsgi application, and:
SetEnv no-gzip 1
set in the Apache configuration, the object made it into a shmcb cache.
It looks like when Apache gzips an object, it changes the headers so that it isn't accepted by socache.
I looked around for ways to make them compatible, but couldn't find too much on this issue. There is some mention of reordering the cache/deflate filters in the mod_cache documentation:
https://httpd.apache.org/docs/2.4/mod/mod_cache.html#finecontrol
This worked if I put in a directive to reorder the cache/deflate filters:
# within a directory
SetOutputFilter CACHE;DEFLATE
Curiously, on a cache miss, the server returned gzipped content, but on a cache hit, the server returned unencoded text/html. This looks odd, but I haven't understood the FilterChain directives well enough to try those out.
I also found some mention of this in a related issue with php/content-length:
https://serverfault.com/questions/183843/content-length-not-sent-when-gzip-compression-enabled-in-apache
The answer there found that if they set DeflateBufferSize to a large enough value, then Content-Length would be set.
I couldn't get this to work.
So it looks like one is stuck choosing between cached and gzipped.
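If you can live with compressing in the application instead of in Apache, one workaround sketch (untested here; the view name and payload are illustrative, and it assumes mod_deflate is kept away from these responses, e.g. via SetEnv no-gzip 1) is to gzip the body in the Django view so Content-Encoding and an accurate Content-Length are known before Apache sees the response:

import gzip
from django.http import HttpResponse

def cached_page(request):  # hypothetical view name
    output_display = "<html><body>hello</body></html>"  # illustrative payload
    body = output_display.encode('utf-8')

    # Compress in the application when the client supports it, so the
    # length handed to Apache is already the final length.
    if 'gzip' in request.META.get('HTTP_ACCEPT_ENCODING', ''):
        body = gzip.compress(body)
        response = HttpResponse(body, content_type='text/html')
        response['Content-Encoding'] = 'gzip'
    else:
        response = HttpResponse(body, content_type='text/html')

    response['Vary'] = 'Accept-Encoding'
    response['Cache-Control'] = 'must-revalidate, max-age=20'
    response['Content-Length'] = str(len(body))
    return response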
