Heroku router logging format

Heroku router logging format - python

I've recently deployed a Flask app on Heroku. It provides an API on top of an existing API and requires a confidential API key for the original service from the user. The app is really just a few forms, the values of which are passed with ajax to a specific URL on the server. Nothing fancy. I take steps to not store confidential information in the app and want no traces of it anywhere within the app.
Looking at the logs from heroku logs --source heroku, the heroku router process stores all HTTP requests for the app, including those requests that include the confidential information.
Is there a way to specify the log format for the heroku process so as to not store the URL served?

As other commenters mentioned, it is a bad practice to put confidential info in a URL. These could get cached or logged by a number of systems (e.g. routers, proxy servers, caches) on the roundtrip to the server. There are a couple ways to solve this:
Put them in a the Authorization header. This is probably the most common way authentication is handled for REST-based APIs.
Put them in the POST body. This works to get it out of the URL, but is a little weird semantically to say that your are POSTing the credentials to some resource (if this a REST API), unless it is a login call.

Related

Eliminating nuisance Instance starts

My GCP app has been abused by some users. To stop their usage I have attempted to eliminate features that can be abused, and have employed firewall rules to block certain users. But bad users continue to try to access my app via certain legacy URLs such as myapp.appspot.com/badroute. Of course, I still want users to use the default URL myapp.appspot.com .
I have altered my code in the following manner, but I am still getting Instances to start from them, and I do not want Instances in such cases. What can I do differently to avoid the bad Instances starting OR is there anything I can do to force such Instances to stop quickly instead of after about 15 minutes?
class Dummy(webapp2.RequestHandler):
def get(self):
logging.info("Dummy: " )
self.redirect("/")
app = webapp2.WSGIApplication(
[('/', MainPage),
('/badroute', Dummy)],debug=True)
(I may be referring to Instances when I should be referring to Requests.)

So whats the objective? you want users that visit /badroute to be redirected to some /goodroute ? or you want /badroute to not hit GAE and incur cost?
Putting a google cloud load balancer in front could help.
For the first case you could setup a redirect rule (although you can do this directly within App Engine too, like you did in your code example).
If you just want it to not hit app engine you could setup google cloud load balancer to have the /badroute route to some file in a GCS bucket instead of your GAE service
https://cloud.google.com/load-balancing/docs/https/ext-load-balancer-backend-buckets
However you wouldnt be able to use your *.appsot.com base url. You'd get a static IP which you should then map a custom domain to it

DISCLAIMER: I'm not 100% sure if this would work.
Create a new service dummy.
Create and deploy a dispatch.yaml (GAE Standard // GAE Flex)
Add the links you want to block to the dispatch.yaml and point them to the dummy service.
Set up the Identity Aware Proxy (IAP) and enable it for the dummy service.
???
Profit
The idea is that the IAP will block the requests before they hit the dummy service. Since the requests never actually get forwarded to the service dummy you will not have an instance start. The bots will get a nice 403 page from Google's own infrastructure instead.
EDIT: Be sure to create the dummy service with 0 instances as the idea is to not have it cost money.
EDIT2:
So let me expand a bit on this answer.
You can have multiple GAE services running within one GCP project. Each service is it's own app. You can have one service running a python Flask app and another running a Java Springboot app. You can have each be either GAE Standard or GAE Flex. See this doc.
Normally all traffic gets routed to the default service. Using dispatch.yaml you can make request to certain endpoints go to a specific service.
If you create the dummy service as a GAE Standard app, and you don't actually need it to do anything, you can then route all the endpoints that get abused to this dummy service using the dispatch.yaml. Using GAE Standard you can have the service use 0 instances (and 0 costs).
Using the IAP you can then make sure only your own Google account can access this app (which you won't do). In effect this means that the abusers cannot really access the service, as the IAP blocks it before hitting the service, as you've set it up so only your Google account can access it.
Note, the dispatch.yaml is separate from any services, it's one of the per-project configuration files for GAE. It's not tied to a specific service.
As stated, the dummy app doesn't actually need to do anything, but you need to deploy it once though, as this basically creates the service.

Consider using cloudflare to mitigate bot abuse, customize firewall rules regarding route access, rate limit ips, etc. This can be combined with Google cloud load balancer, if you’d like—as mentioned in https://stackoverflow.com/a/69165767/806876.
References
Cloudflare GCP integration: https://www.cloudflare.com/integrations/google-cloud/

There is a little information I did not provide in my question about my app.yaml:
handlers:
- url: /.*
script: mainapp.app
By simply removing .* from the url specification, no Instance start is created. The user gets Error: Not Found, though. So that satisfies my needs.
Edo Akse's Answer pushed me to this answer by reading here, so I am accepting his answer. I am still not clear how to implement Edo's Answer, though.

How do I secure a Python API with in-the-open authorization tokens?

I have setup a simple REST API server (using Django REST framework) that responds to POST requests by doing some processing on an image uploaded to the server in the request. Previously I used it to power my own frontend (http://lips.kyleingraham.com) as a demonstration but I would now like to open the API to other users.
I would like for an end-user to be able to sign up and, from a dashboard, generate a token based on their credentials that could then be hard-coded into their web app. The sign-up part I believe I can handle but I am unclear on how to restrict a generated token to a user's web app domain. I know that the code for a web app is easily inspected so any API token I provide would need to be policed on my backend.
How can I restrict an authorization token to a users' web app domain so that even if the token was leaked, another user would not be able to utilize it?

If you want to hard-code url into user web app, in that way you can't guarantee that if someone get the token, he won't be able to use it.
The only idea is to set some time limit for each token

How to secure my Azure WebApp with the built-in authentication mechanism

I created a Flask-Webservice with Python that runs independently inside a docker container. I then uploaded the docker image to an Azure Container Registry. From there I can create a WebService (for Containers) with some few clicks in the Azure Portal, that runs this container. So far so good. It behaves just as I want it to.
But of course I don't want anyone to access the service. So I need some kind if authentication. Luckily (or so I thought) there is a built-in authentication-mechanism (I think it is based on OAuth ... I am not that well versed in security issues). Its documentation is a bit sparse on what actually happens and also concentrates on solutions in C#.
I first created a project with Google as described here and then configured the WebApp-Authentication with the Client-Id and Secret. I of course gave Google a java script source and callback-url, too.
When I now log off my Google account and try a GET-Request to my Webservice in the Browser (the GET should just return a "hello world"-String), I am greeted with a Login Screen ... just as I expected.
When I now login to Google again, I am redirected to the callback-url in the browser with some kind of information in the parameters.
a token perhaps? It looks something like this:
https://myapp.azurewebsites.net/.auth/login/google/callback?state=redirxxx&code=xxx&authuser=xxx&session_state=xxx&prompt=xxx).
Here something goes wrong, because an error appears.
An error occurred.
Sorry, the page you are looking for is currently unavailable.
Please try again later.
If you are the system administrator of this resource then you should check the error log for details.
Faithfully yours, nginx.
As far as I now, nginx is a server software that hosts my code. I can imagine that it also should handle the authentication process. It obviously lets all requests through to my code when authentication is turned off, but blocks un-authenticated accesses otherwise and redirects to the google login. Google then checks if your account is authorized for the application and redirects you to the callback with the access token along with it. This then returns a cookie which should grant my browser access to the app. (I am just reproducing the documentation here).
So my question is: What goes wrong. Does my Browser not accept the cookie. Did I something wrong when configuring Google+ or the Authentication in the WebApp. Do I have to use a certain development stack to use the authentication. Is it not supported for any of the technologies I use (Python, Flask...).
EDIT
#miknik:
In Microsofts documentation of the authentication/authorization it says
The authentication and authorization module runs in the same sandbox
as your application code. When it's enabled, every incoming HTTP
request passes through it before being handled by your application
code.
...
The module runs separately from your application code and is
configured using app settings. No SDKs, specific languages, or changes
to your application code are required.
So while you are probably right that the information in the callback-redirect is the authorization grant/code and that after that this code should now be used to get an access token from Google, I don't quite understand how this would work in my situation.
As far as I can see it Microsofts WebApp for Container-Resource on Azure should take care of getting the token automatically and return it as part of the response to the callback-request. The documentation states 4 steps:
Sign user in: Redirects client to /.auth/login/.
Post-authentication: Provider redirects client to /.auth/login//callback.
Establish authenticated session: App Service adds authenticated cookie to response.
Serve authenticated content: Client includes authentication cookie in subsequent requests (automatically handled by browser).
It seems to me that step 2 fails and that that would be exactly what you wrote: that the authorization grant is to be used by the server to get the access token but isn't.
But I also don't have any control over that. Perhaps someone could clear things up by correcting me on some other things:
First I can't quite figure out which parts of my problem represent which role in the OAuth-scheme.
I think I am the Owner, and by adding users to the list in the Google+-Project I authorize them to use my service.
Google is obviously the authorization server
my WebService (or better yet my WebApp for Containers) is the resource server
and finally an application or postman that does the requests is the Client
In the descriptions of OAuth I read the problematic step boils down to: the resource server gets the access token from the authorization server and passes it along to the client. And Azures WebApps Resource is prompted (and enabled) to do so by being called with the callback-url. Am I right somewhere in this?
Alas, I agree that I don't quite understand the whole protocol. But I find most descriptions on the net less than helpful because they are not specific to Azure. If anyone knows a good explanation, general or Azure-specific, please make a comment.

I found a way to make it work and I try to explain what went wrong as good as I can. Please correct me if I go wrong or use the wrong words.
As I suspected the problem wasn't so much that I didn't understand OAuth (or at least how Azure manages it) but the inner workings of the Azure WebApp Service (plus some bad programming on my part). Azure runs an own Server and is not using the built-in server of flask. The actual problem was that my flask-program didn't implement a WSGI-Interface. As I could gather this is another standard for python scripts to interact with any server. So while rudimentary calls from the server (I think Azure uses nginx) were possible, more elaborate calls, like the redirect to the callback url went to dev/null.
I build a new app following this tutorial and then secured it by following the authentication/authorization-tutorial and everything worked fine. The code in the tutorial implements WSGI and is probably more conform to what Azure expects. My docker solution was too simple.
My conclusion: read up on this WSGI-standard that flask always warned me about and I didn't listen and implement it in any code that goes beyond fiddeling around in development.

How do I protect urls I use internally for push queues in Google App Engine?

I'm running Flask on GAE, and I'm working on implementing a push queue to run tasks for me in the background. Because GAE's push queues work by scheduling and sending http requests to my flask server, I'm concerned about my users guessing the urls I designated for internal use with my push queue. I thought about having the push queue send a secret key along with the requests, and have my server only execute the job if the key included in the request is correct, something like this:
taskqueue.add(url='/worker', params={'super_secret_key': 12345})
But I'm wondering if there's a more secure / better way to do this?
Thanks!

You can protect your task urls by configuring them in app.yaml to use admin login
- url: /worker
......
login: admin

here is another way to do it that i think is more efficient.
you can take advantage that appengine removes some request headers from external (ie users) requests. but it doesnt if the request is internal:
http://googlecloudplatform.blogspot.com/2015/07/Unit-Testing-cron-handlers-in-Google-App-Engine.html
look where it says: "Instead, the Cron Service sets a special request header -- X-AppEngine-Cron: true. This is a header that application code can fully trust, since App Engine removes such headers if they’re set in an external request."
you should be able to use the same principle when making yours calls. see these request headers that google sets on taskqueue calls:
https://cloud.google.com/appengine/docs/python/taskqueue/overview-push#task_request_headers
you wont need the admin login anymore, and note that its even more secure because it will only be possible to call from a task queue (thus would require a code change to tamper)

Python - Facebook connect without browser based-redirect

title might be a bit of a misnomer.
I have two subdomains setup in a web application, one for the application frontend at www. (which is using a loose PHP router coupled with backbone, require, and jquery for UI) and a data tier setup at data. (built entire in python utilizing Werkzeug)
We decided to go to this kind of architecture since at some point mobile app will be integrated into the equation, and they can just as easily send HTTP connections to the data subdomain. The data subdomain renders all its responses in JSON, which the JS frontend expects universally as a response be it a valid query response or an error.
This feels like the most organized arrangement I've ever had on a project, until Facebook connect appeared on the manifesto.
Ideally, I'd like to keep as much of the 'action' of the site behind the scenes on the data subdomain, especially authentication stuff, so that the logic is available to the phones and mobile devices as well down the line if need be. Looking into the facebook docs, it appears however that the browser and its session object are critical components in their OAuth flow.
User authentication and app authorization are handled at the same time
by redirecting the user to our OAuth Dialog. When invoking this
dialog, you must pass in your app id that is generated when you create
your application in our Developer App (the client_id parameter) and
the URL that the user's browser will be redirected back to once app
authorization is completed (the redirect_uri parameter). The
redirect_uri must be in the path of the Site URL you specify in
Website section of the Summary tab in the Developer App. Note, your
redirect_uri can not be a redirector.
I've done this before in PHP and am familiar with the process, and it's a trivial matter of about 3 header redirects when the logic in question has direct access to the browser. My problem is that the way our new app is structured, all the business logic is sequestered on the other subdomain. What are my options to simulate the redirect? can I have the python subdomain send the HTTP data, receiving it from the www domain using CURL like all the other requests? Can I return a json object to the user's browser that instructs on doing the redirects? Will Facebook even accept requests from a data subdomain, whose redirect_uri is at subdomain www? I find that last sentence in the quoted section probably discounts that as a possibility, but I've looked through their docs and it doesn't explicity say that this would be treated as a violation.
Has anyone had any experience setting up something like this with Facebook with a similar architecture? Perhaps I should just fold and put the FB login logic directly into PHP and handle it there.
Thanks!

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.