I am attempting to write a web application using the Twisted framework for Python.
I want the application to work if run as a standalone server (à la twistd), or if Apache reverse proxies to it. E.g.
Apache https://example.com/twisted/ --> https://internal.example.com/
After doing some research, it seemed like I needed to use the vhost.VHostMonsterResource to make this work. So I set up Apache with the following directive:
ProxyPass /twisted https://localhost:8090/twisted/https/127.0.0.1:443
Here is my basic SSL server:
from twisted.web import server, resource, static
from twisted.internet import reactor
from twisted.application import service, internet
from OpenSSL import SSL  # pyOpenSSL; twisted.internet.ssl does not export SSL
from twisted.web import vhost
import sys
import os.path
from textwrap import dedent
PORT = 8090
KEY_PATH = "/home/waldbiec/projects/python/twisted"
PATH = "/home/waldbiec/projects/python/twisted/static_files"
class Index(resource.Resource):
    def render_GET(self, request):
        html = dedent("""\
            <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
            <html>
            <head>
            <title>Index</title>
            </head>
            <body>
            <h1>Index</h1>
            <ul>
            <li>Files</li>
            </ul>
            </body>
            </html>
            """)
        return html
class ServerContextFactory:
    def getContext(self):
        """
        Create an SSL context.
        Similar to twisted's echoserv_ssl example, except the private key
        and certificate are in separate files.
        """
        ctx = SSL.Context(SSL.SSLv23_METHOD)
        ctx.use_privatekey_file(os.path.join(KEY_PATH, 'serverkey.pem'))
        ctx.use_certificate_file(os.path.join(KEY_PATH, 'servercert.pem'))
        return ctx
class SSLService(internet.SSLServer):
    def __init__(self):
        root = resource.Resource()
        root.putChild("", Index())
        root.putChild("twisted", vhost.VHostMonsterResource())
        root.putChild("files", static.File(PATH))
        site = server.Site(root)
        internet.SSLServer.__init__(self, PORT, site, ServerContextFactory())

application = service.Application("SSLServer")
ssl_service = SSLService()
ssl_service.setServiceParent(application)
It almost works -- but the "files" link on the index page does not behave how I want it to when using Apache as a reverse proxy, because it is an absolute link.
My main question is, other than using a relative link, is there some way to compute what the full URL path of the link ought to be in such a way that the link still works in standalone server mode?
A second question would be, am I using VHostMonsterResource correctly? I did not find much documentation, and I pieced together my code from examples I found on the web.
This seems like too much work. Why use VHostMonsterResource at all? You may have very specific reasons for wanting some of this, but most times:
Have Apache handle the SSL. Apache then passes off to your Twisted app, which serves non-SSL goodies back to Apache. There is documentation all over the net on the Apache config side.
You can still add another server on an SSL port if you really want to.
Haven't tested, but structure it more like:
root = resource.Resource()
root.putChild("", Index())
root.putChild("files", static.File(PATH))
http = internet.TCPServer(8090, server.Site(root))
# change this port # to 443 if no apache
https = internet.SSLServer(8443, server.Site(root), ServerContextFactory())
application = service.Application("http_https_Server")
http.setServiceParent(application)
https.setServiceParent(application)
Dev tip:
During development, for the cost of a couple of extra lines you can add a manhole server so that you can ssh into the running web server and inspect variables and other state. Way cool.
ssl = internet.TCPServer(8022, getManholeFactory(globals(), waldbiec='some non-system waldbiec password'))
ssl.setServiceParent(application)
Configure the Twisted application so that it knows its own root location. It can use that information to generate URLs correctly.
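A minimal sketch of that idea (the function name and the configuration mechanism are assumptions, not part of Twisted's API): store the externally visible root prefix as configuration and build every link from it, so "/" works in standalone mode and "/twisted" works behind the proxy.

```python
from urllib.parse import urljoin

def build_link(root_prefix, rel_path):
    """Join the configured external root (e.g. "/twisted/") with a
    site-relative path so links work behind a reverse proxy too."""
    if not root_prefix.endswith("/"):
        root_prefix += "/"
    return urljoin(root_prefix, rel_path.lstrip("/"))

# Standalone mode:        build_link("/", "files")        -> "/files"
# Behind Apache /twisted: build_link("/twisted", "files") -> "/twisted/files"
```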
So after digging into the vhost.VHostMonsterResource source, I determined I could create another resource that could let the reverse proxied URL prefix be specified by an additional marker in the Apache ProxyPass URL.
Firstly, I finally figured out that vhost.VHostMonsterResource is supposed to be a special URL in your back end web site that figures out the reverse proxy host and port from data encoded in the URL path. The URL path (sans scheme and net location) looks like:
/$PATH_TO_VHMONST_RES/$REV_PROXY_SCHEME/$REV_PROXY_NETLOC/real/url/components/
$PATH_TO_VHMONST_RES : Path in the (internal) Twisted site that corresponds to the VHostMonsterResource resource.
$REV_PROXY_SCHEME : http or https, whichever is being used by the reverse proxy (Apache).
$REV_PROXY_NETLOC : The net location (host and port) of the reverse proxy (Apache).
So you can control the configuration from the reverse proxy by encoding this information in the URL. The result is that the twisted site will understand the HTTP request came from the reverse proxy.
However, if you are proxying a subtree of the external site as per my original example, this information is lost. So my solution was to create an additional resource that can decode the extra path information. The new proxy URL path becomes:
/$PATH_TO_MANGLE_RES/$REV_PROXY_PATH_PREFIX/$VHOSTMONST_MARKER/$REV_PROXY_SCHEME/$REV_PROXY_NETLOC/real/url/components/
$PATH_TO_MANGLE_RES : The path to the resource that decodes the reverse proxy path info.
$REV_PROXY_PATH_PREFIX : The subtree prefix of the reverse proxy.
$VHOSTMONST_MARKER : A path component (e.g. "vhost") that signals a VHostMonster Resource should be used to further decode the path.
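To make the path scheme concrete, here is a hedged sketch of the decoding step only (pure path parsing, not the actual Twisted resource; the marker name "vhost" is the example from above):

```python
def decode_proxy_path(path):
    """Split a reverse-proxied URL path of the form
    /<mangle-res>/<prefix>/vhost/<scheme>/<netloc>/real/components
    into the proxy path prefix, scheme, netloc and the real path."""
    parts = [p for p in path.split("/") if p]
    mangle_res, prefix = parts[0], parts[1]
    assert parts[2] == "vhost"  # the VHostMonster marker component
    scheme, netloc = parts[3], parts[4]
    real_path = "/" + "/".join(parts[5:])
    return prefix, scheme, netloc, real_path

# decode_proxy_path("/mangle/twisted/vhost/https/example.com:443/files")
# -> ("twisted", "https", "example.com:443", "/files")
```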
First of all, I don't speak English very well, but anyway...
I know I need to use allowed_hosts, but I need to allow all hosts ("*"), and a Host header attack could turn something like:
<script src="mysite.com/js/script.js"></script>
into
<script src="attacker.com/js/script.js"></script>
or
mysite.com/new_password=blabla&token=blabla8b10918gd91d1b0i1
into
attacker.com/new_password=blabla&token=blabla8b10918gd91d1b0i1
But all static files are loaded from a Node.js server at "cdn.mysite.com", and all domains are stored in the database, so I always take the domain from the database to compare with the request header, and use the database domain when sending anything to the client:
views.py:
def Index(request):
    url = request.META['HTTP_HOST']
    cf = Config.objects.first()
    if cf.domain == url:
        form = regForm()
        return render(request, 'page/site/home.html', {'form': form})
    elif cf.user_domain == url:
        ur = request.user.is_authenticated
        if ur:
            config = {'data': request.user}
            lojas = 'json.loads(request.user.user_themes)'
            return render(request, 'app/home.html', {"config": config, "lojas": lojas})
        else:
            forml = loginForm()
            return render(request, 'page/users/login/login.html', {'form': forml})
    else:
        return redirect("//" + cf.domain)
Would it still be unsafe to use it this way?
You do not need to build yet another bike shed. The ALLOWED_HOSTS setting is entirely enough to prevent spoofing of the host name in the request (see Practical HTTP Host header attacks for how host name spoofing works).
ALLOWED_HOSTS is a list of domains like ['YourSite.com', 'www.YourSite.com', '*.YourSite.com'] -- the domain names on which your site should operate (not the ones from which your site can load external scripts).
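For reference, a minimal settings.py fragment (the domain names are placeholders):

```python
# settings.py -- hosts this site is allowed to serve; "*" disables the check
ALLOWED_HOSTS = [
    "mysite.com",
    "www.mysite.com",
    ".mysite.com",  # a leading dot matches any subdomain, e.g. cdn.mysite.com
]
```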
And use HTTP/2 instead of HTTP/1.1 on the server, because:
according to the HTTP/1.1 protocol specification, when an absolute path of a resource is specified, the Host header value is ignored and the host from the resource path is used instead. This means that even a securely configured web server will accept a request with a spoofed Host value, and a web application that uses HOST instead of SERVER_NAME is vulnerable to this attack.
So if you do use SERVER_NAME, this kind of attack does not affect you.
If you wish to control possible spoofing of scripts on public CDNs, use:
the Content-Security-Policy HTTP header;
Subresource Integrity (supported in Safari starting with Safari 12).
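As a hedged illustration (the header value and helper name are examples, not a library API), a CSP header restricting script sources to your own origin and CDN can be attached to a Django-style response like this:

```python
def add_csp(response):
    """Attach a Content-Security-Policy header allowing scripts only
    from this origin and the trusted CDN (example values)."""
    response["Content-Security-Policy"] = (
        "default-src 'self'; script-src 'self' https://cdn.mysite.com"
    )
    return response
```

Django's HttpResponse supports item assignment for headers, so any dict-like response works with this sketch.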
Please help me, people!
I have set up an xCAT server so that I can manage my many nodes from it. I want to stop running Python scripts directly from within my xCAT server; I figured it would be simpler to create a web page as my interface and use Python as the server-side script on the xCAT server.
I am finding that my underlying Python script is not really doing everything I want it to do. For example, my script is unable to power my nodes defined on the xCAT server up or down. To illustrate: my node (hs22n12) is defined on my xCAT server (xcatmn5). I am able to use "nodels | grep hs22n12" to locate that node on xcatmn5 and operate it whichever way I see fit, such as power up ("rpower hs22n12 on") or power down ("rpower hs22n12 off"). However, when I build these commands into my Python scripts so that they run on input from the HTML form,
the operation is not successful.
Some specs are indicated here:
I am using Apache and I have confirmed that it is running
My Python scripts are in /var/www/cgi-bin and I am able to run them
My HTML files are located in /var/www/html
Please find my code snippets below
First, the HTML code (which is currently okay for me and is working well)
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Node Provisioning Application</title>
</head>
<body>
<form action='cgi-bin/powerOff11.py' method="post">
Enter Node: <input type="text" name="Node"/>
<input type="submit" value="submit">
</form>
</body>
</html>
Here's my Python code
#!/usr/bin/python
import cgi
import cgitb
import subprocess
import sys

cgitb.enable()

print "Content-type: text/html\n"

form = cgi.FieldStorage()
Node = form.getvalue('Node')
print "<p>%s</p>" % Node

if Node is None:
    print "<p>No node provided</p>"
else:
    # grep exits 0 when the node is found, non-zero when it is not
    find_node = subprocess.call('nodels | grep ' + Node, shell=True)
    if find_node != 0:
        print "<p>Node not defined yet!</p>"
    else:
        print "<p>%s</p>" % Node
        p_on = subprocess.call('rpower ' + Node + ' on', shell=True)
        print "<p>%s powering on...</p>" % Node
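As a side note, interpolating the form value into a shell=True command string allows shell injection through the Node field. A hedged sketch of a safer variant (argument lists instead of a shell pipeline; the helper names are mine, the commands are the xCAT ones from the question):

```python
import subprocess

def node_defined(node, lister=lambda: subprocess.check_output(['nodels'])):
    """Return True if `node` appears in the `nodels` output, without
    invoking a shell (avoids injection through the form field)."""
    out = lister()
    return node in out.decode().split()

def power_cmd(node, state):
    """Build the `rpower <node> on|off` argument list; no shell interpolation."""
    return ['rpower', node, state]

# subprocess.call(power_cmd('hs22n12', 'on')) would run the command safely
```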
Update:
After some digging around, I was able to enable the HTTPS protocol for the REST API and also enable the HTTPS certificate by following the instructions here: https://xcat-docs.readthedocs.io/en/stable/advanced/restapi/restapi_setup/restapi_setup.html
After I did this, I was actually able to access different resources, such as the repository of my xCAT server, via HTTPS. However, the original problem has become clearer. The new response I get from httpd whenever I try to run a command that needs root privilege is:
"Error: Permission denied for request warning: the client certificates under /usr/share/httpd/.xcat/ are not setup correctly, please run '/opt/xcat/share/xcat/scripts/setup-local-client.sh ' as 'root' to generate the client certificates; otherwise, the SSL connection between xcat client and xcatd will be setup without certificate verification and open to Man-In-The-Middle attacks."
This leads me to believe that my problem is how to configure /etc/httpd/conf/httpd.conf to be able to request root access and also make requests from there. Mind you, I am able to get responses to all binary commands (ls, cd, etc.) that are in the /usr/bin directory (these commands do not require root privilege). Can someone point me to how to configure httpd.conf so that my request can legitimately be made from /root to xCAT? Thank you all for your help.
This is xCAT2, I assume?
If so, did you give correct permission to the apache process to execute xCAT commands?
xCAT policy table serves as ACL (access control list) for xCAT.
You can do something like:
mkdef -t policy -o 7.0 name=apache rule=allow
7.0 is just an example -- use any other number as long as it doesn't conflict with a preexisting rule; tabdump policy is convenient for checking, as there shouldn't be too many lines in the policy table.
For name, use the Apache process owner.
By default, unless limited with the commands= flag, all commands are allowed by this new policy rule.
Thank you again for taking the time to respond to my issue. I have tried it out; it appears this is not the problem, but you have nudged me in the right direction of thinking beyond the actual code to the entire web service configuration of httpd and how it makes requests to xcatd. After some adjustment I noticed that I am able to get responses from xcatd to all binary commands in /usr/bin (e.g. ls), but I am unable to get responses to commands that need root access, like lsdef. This leads me to believe that the problem is that I am not permitted to run root-level commands from the httpd client. I need someone who has done this web service configuration to look into it; I believe the problem is most likely in httpd.conf (still reading up on it). But thanks again.
I'm trying to set up Google sign-in using Flask-Dance for a Flask-based website:
from flask_dance.contrib.google import make_google_blueprint, google
blueprint = make_google_blueprint(
    client_id="CLIENT_ID",
    client_secret="CLIENT_SECRET",
    scope=[
        "https://www.googleapis.com/auth/plus.me",
        "https://www.googleapis.com/auth/userinfo.email",
    ]
)
app.register_blueprint(blueprint, url_prefix="/google_login")
And as the documentation suggests, I have the view set up like this:
@app.route('/google_login')
def google_login():
    if not google.authorized:
        return redirect(url_for("google.login"))
    resp = google.get("/oauth2/v2/userinfo")
    assert resp.ok, resp.text
    return "You are {email} on Google".format(email=resp.json()["email"])
When I was testing, I set the environment variable OAUTHLIB_INSECURE_TRANSPORT to 1 using
export OAUTHLIB_INSECURE_TRANSPORT=1
And now, even after I've removed the environment variable, Flask-Dance seems to always resolve the URI to http instead of https.
This is evident from the redirect URI mismatch error I'm getting (here "website" refers to the domain name):
The redirect URI in the request,
http://"website"/google_login/google/authorized, does not match
the ones authorized for the OAuth client.
And here are the authorized redirect URIs I've set up in my Google cloud console:
https://"website"/google_login/google/authorized
https://www."website"/google_login/google/authorized
I tried unsetting the environment variable using this command:
unset OAUTHLIB_INSECURE_TRANSPORT
What am I missing here? Any help would be appreciated.
If Flask-Dance is generating http URLs instead of https, that indicates that Flask (not Flask-Dance, but Flask itself) is confused about whether the incoming request is an https request or not. Flask-Dance has some documentation about how to resolve this problem, and the most likely cause is a proxy server that handles the HTTPS separately from your application server.
The fix is to use a middleware like werkzeug's ProxyFix to teach Flask that it's behind a proxy server. Here's how you can use it:
from werkzeug.middleware.proxy_fix import ProxyFix
app.wsgi_app = ProxyFix(app.wsgi_app, x_for=1, x_proto=1)
I had the same problem and in my case adding this to my Apache VirtualHost config solved it:
RequestHeader set X-Forwarded-Proto "https"
My Flask app is running behind an Apache proxy, but Nginx could potentially have similar issues.
I am developing a service similar to banatag. During new feature development I found unexplainable behaviour in Gmail (as I think).
I'll try to explain my question in pictures:
Create a tag (the image that will be requested). Right now nobody requests it.
Add it by URL to an email. The URL of the image is http://eggplant-tag.appspot.com/request?FT1R3WECWNTM2ZGUDXRMA8VOXJ4F6TI4
There are two new AJAX requests from the page, but none of them go to my domain.
Looking at my service, there is a request from my IP, with a Google User-Agent.
What requests this image (tag)?
I see two possibilities:
The page makes AJAX requests to my service; that's why I see my IP. But in that case, why can't I see this request in the Network tab of the Developer Console?
The Google Image Proxy service makes requests to my service; but in that case, why is my IP in the request?
My IP:
[UPD]
Add part of class that handles requests to image(tag):
...
request.remoteAddress = str(self.request.remote_addr)  # save the remote address
request.put()
...
self.response.headers['Content-Type'] = 'image/png'
self.response.headers['Cache-Control'] = 'no-cache, no-store, must-revalidate'
self.response.headers['Pragma'] = 'no-cache'
self.response.headers['Expires'] = '0'
self.response.write(simpleImageData)  # write the binary data of a 1x1 transparent image to the body
[UPD 2]
I used Wireshark to look for requests to my service, but there aren't any. That's why the main question is: how does Google User Content end up showing my IP address?
The workings of Google Image Proxy have been thoroughly analyzed on the web, e.g at https://litmus.com/blog/gmail-adds-image-caching-what-you-need-to-know and https://blog.filippo.io/how-the-new-gmail-image-proxy-works-and-what-this-means-for-you/ -- and the googleusercontent site is the cache/cdn used (among other things) by GIP.
The only relevance of Google App Engine might be how you've configured your app.yaml, which you don't show us -- i.e., is that image served as a static file, or via logic in your application code? And, if the latter, does your code make any logging calls when it serves the image? From the limited data you show, I'd guess the former (so the file lives on, and is served by, Google's static file servers, not next to your app's code on your own instances), which would remove any mystery...
I deployed my app on my notebook, then tried to repeat my actions.
The result confirmed my guess that the Google proxy and App Engine work together: when the Google proxy server requests my app running on App Engine, I see my IP.
In my notebook experiment I saw the IP of the Google proxy.
How do I get App Engine to generate the URL of the server it is currently running on?
If the application is running on development server it should return
http://localhost:8080/
and if the application is running on Google's servers it should return
http://application-name.appspot.com
You can get the URL that was used to make the current request from within your webapp handler via self.request.url, or you can piece it together using the self.request.environ dict (which you can read about in the WebOb docs -- request inherits from WebOb).
You can't "get the URL for the server" itself, as many URLs could be used to point to the same instance.
If your aim is really just to discover whether you are in development or production, then use:
'Development' in os.environ['SERVER_SOFTWARE']
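A small sketch of that check as a reusable helper (the example SERVER_SOFTWARE values mimic what the dev appserver and production set; treat them as assumptions):

```python
import os

def is_development(environ=None):
    """True when running on the local dev appserver, which sets
    SERVER_SOFTWARE to something like 'Development/2.0'."""
    environ = environ if environ is not None else os.environ
    return environ.get('SERVER_SOFTWARE', '').startswith('Development')

# is_development({'SERVER_SOFTWARE': 'Development/2.0'})       -> True
# is_development({'SERVER_SOFTWARE': 'Google App Engine/1.9'}) -> False
```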
Here is an alternative answer.
from google.appengine.api import app_identity
server_url = app_identity.get_default_version_hostname()
On the dev appserver this would show:
localhost:8080
and on appengine
your_app_id.appspot.com
If you're using webapp2 as your framework, chances are you're already using URI routing in your web application.
http://webapp2.readthedocs.io/en/latest/guide/routing.html
app = webapp2.WSGIApplication([
webapp2.Route('/', handler=HomeHandler, name='home'),
])
When building URIs with webapp2.uri_for(), just pass the _full=True argument to generate an absolute URI including the current domain, port and protocol according to the current runtime environment.
uri = uri_for('home')
# /
uri = uri_for('home', _full=True)
# http://localhost:8080/
# http://application-name.appspot.com/
# https://application-name.appspot.com/
# http://your-custom-domain.com/
This function can be used in your Python code or directly from templating engine (if you register it) - very handy.
Check webapp2.Router.build() in the API reference for a complete explanation of the parameters used to build URIs.