I am testing out a very basic Pub/Sub subscription. I have the push endpoint set to an app deployed as a Python Flex service on App Engine. The service is in a project with Identity-Aware Proxy enabled, and IAP is configured to allow through users authenticated with our domain.
I do not see any of the push requests being processed by my app.
When I turn off the IAP protection, I see that the requests are processed; when I turn it back on, they are no longer processed.
I had similar issues with IAP when trying to get a Cron service running; that issue resolved itself after I deployed a new test app in the same project.
Has anyone had success with configuring a push subscription through IAP? I also experimented with putting different service accounts on the IAP access list and none of them worked.
I'm not aware of a way to get Pub/Sub push subscriptions + Flex + IAP working. I wonder... it might work if the subscriber is on Standard.
Some other potential workarounds:
- Switch to a Pull subscriber.
- Set up a Cloud Function as your Pub/Sub subscriber -- https://cloud.google.com/functions/docs/writing/background -- and then in that function pass the request on to the GAE app, using https://cloud.google.com/iap/docs/authentication-howto to authenticate as a service account (see the sketch below).
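A minimal sketch of that second option, assuming a Pub/Sub-triggered background function; IAP_CLIENT_ID (the OAuth client ID used by IAP) and APP_URL are placeholders you would fill in:

    import base64

    import requests
    import google.auth.transport.requests
    import google.oauth2.id_token

    IAP_CLIENT_ID = "1234567890-xxxx.apps.googleusercontent.com"  # placeholder
    APP_URL = "https://myapp.appspot.com/pubsub-handler"          # placeholder

    def forward_to_gae(event, context):
        """Background function triggered by Pub/Sub; relays the message to GAE."""
        payload = base64.b64decode(event["data"]).decode("utf-8")
        # Mint an OIDC token for the function's service account; IAP accepts
        # it when the audience is IAP's OAuth client ID.
        auth_request = google.auth.transport.requests.Request()
        token = google.oauth2.id_token.fetch_id_token(auth_request, IAP_CLIENT_ID)
        resp = requests.post(
            APP_URL,
            data=payload,
            headers={"Authorization": "Bearer " + token},
        )
        resp.raise_for_status()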
Sorry, I wish I had a better answer for you, but AFAIK those are the options that work today.
--Matthew, IAP engineering lead
I had a pretty similar issue: a GAE second-generation standard application in project A, sitting behind IAP, that cannot receive the Pub/Sub messages pushed from project B.
My workaround is:
- Set up an HTTP-triggered Cloud Function in project A.
- Configure the project B Pub/Sub subscription to push messages to the above Cloud Function's endpoint.
- The Cloud Function works like a proxy: it filters (needed in my case, YMMV) and forwards each Pub/Sub message in an HTTP request to the GAE app.
- Since the Cloud Function is in the same project as the GAE app, you only need to add the IAP authentication to that HTTP request (fetching the token issued to the specific service account).
- Project A's service account needs to be set up in project B's IAM, with at least the Pub/Sub Subscriber and Pub/Sub Viewer roles; a gcloud sketch follows below.
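A hypothetical gcloud sketch of that cross-project wiring (project IDs, topic, subscription, and function names are placeholders):

    # Let project A's service account consume the project B topic:
    gcloud pubsub topics add-iam-policy-binding my-topic \
        --project=project-b \
        --member=serviceAccount:proxy-fn@project-a.iam.gserviceaccount.com \
        --role=roles/pubsub.subscriber

    # Push subscription in project B pointing at the Cloud Function in project A:
    gcloud pubsub subscriptions create my-sub \
        --project=project-b \
        --topic=my-topic \
        --push-endpoint=https://REGION-project-a.cloudfunctions.net/proxy-fn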
Hope this could be an option for your case.
My GCP app has been abused by some users. To stop their usage I have attempted to eliminate features that can be abused, and have employed firewall rules to block certain users. But bad users continue to try to access my app via certain legacy URLs such as myapp.appspot.com/badroute. Of course, I still want users to use the default URL myapp.appspot.com.
I have altered my code in the following manner, but these requests still cause Instances to start, and I do not want Instances in such cases. What can I do differently to avoid the bad Instances starting, OR is there anything I can do to force such Instances to stop quickly instead of after about 15 minutes?
    import logging
    import webapp2

    class Dummy(webapp2.RequestHandler):
        def get(self):
            # Log the abusive request, then bounce it to the home page.
            logging.info("Dummy: ")
            self.redirect("/")

    # MainPage is defined elsewhere in the app.
    app = webapp2.WSGIApplication(
        [('/', MainPage),
         ('/badroute', Dummy)], debug=True)
(I may be referring to Instances when I should be referring to Requests.)
So what's the objective? Do you want users who visit /badroute to be redirected to some /goodroute, or do you want /badroute to not hit GAE and incur cost?
Putting a Google Cloud load balancer in front could help.
For the first case you could set up a redirect rule (although you can do this directly within App Engine too, like you did in your code example).
If you just want it to not hit App Engine, you could set up the load balancer to route /badroute to a file in a GCS bucket instead of your GAE service:
https://cloud.google.com/load-balancing/docs/https/ext-load-balancer-backend-buckets
However, you wouldn't be able to use your *.appspot.com base URL. You'd get a static IP, to which you should then map a custom domain.
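A hypothetical gcloud sketch of that routing, assuming an existing HTTPS load balancer with a URL map named my-lb-map and a GAE backend my-gae-backend (all names are placeholders):

    # Serve /badroute from a GCS bucket instead of App Engine:
    gcloud compute backend-buckets create badroute-bucket \
        --gcs-bucket-name=my-static-bucket

    gcloud compute url-maps add-path-matcher my-lb-map \
        --path-matcher-name=badroute-matcher \
        --default-service=my-gae-backend \
        --backend-bucket-path-rules="/badroute=badroute-bucket"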
DISCLAIMER: I'm not 100% sure if this would work.
Create a new service dummy.
Create and deploy a dispatch.yaml (GAE Standard // GAE Flex)
Add the links you want to block to the dispatch.yaml and point them to the dummy service.
Set up the Identity Aware Proxy (IAP) and enable it for the dummy service.
???
Profit
The idea is that IAP will block the requests before they hit the dummy service. Since the requests never actually get forwarded to the dummy service, you will not have an instance start. The bots will get a nice 403 page from Google's own infrastructure instead.
EDIT: Be sure to create the dummy service with 0 instances as the idea is to not have it cost money.
EDIT2:
So let me expand a bit on this answer.
You can have multiple GAE services running within one GCP project. Each service is its own app. You can have one service running a Python Flask app and another running a Java Spring Boot app. Each can be either GAE Standard or GAE Flex. See this doc.
Normally all traffic gets routed to the default service. Using dispatch.yaml you can make requests to certain endpoints go to a specific service.
If you create the dummy service as a GAE Standard app, and you don't actually need it to do anything, you can then route all the endpoints that get abused to this dummy service using the dispatch.yaml (see the sketch below). Using GAE Standard you can have the service use 0 instances (and incur 0 cost).
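A minimal dispatch.yaml sketch; the service name dummy and the routes are placeholders:

    dispatch:
      - url: "*/badroute"
        service: dummy
      - url: "*/badroute/*"
        service: dummy

Deploy it with gcloud app deploy dispatch.yaml.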
Using IAP you can then make sure only your own Google account can access this service (which you won't actually do). In effect this means that the abusers cannot reach the service at all: IAP blocks the request before it hits the service, since you've set it up so only your Google account can access it.
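A hypothetical sketch of that IAP setup with gcloud (the service and account names are placeholders, and the OAuth consent screen must already be configured):

    gcloud iap web enable --resource-type=app-engine --service=dummy

    gcloud iap web add-iam-policy-binding \
        --resource-type=app-engine --service=dummy \
        --member=user:you@example.com \
        --role=roles/iap.httpsResourceAccessor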
Note, the dispatch.yaml is separate from any services, it's one of the per-project configuration files for GAE. It's not tied to a specific service.
As stated, the dummy app doesn't actually need to do anything, but you need to deploy it once though, as this basically creates the service.
Consider using Cloudflare to mitigate bot abuse, customize firewall rules regarding route access, rate limit IPs, etc. This can be combined with Google Cloud Load Balancer if you'd like, as mentioned in https://stackoverflow.com/a/69165767/806876.
References
Cloudflare GCP integration: https://www.cloudflare.com/integrations/google-cloud/
There is some information about my app.yaml that I did not provide in my question:
handlers:
- url: /.*
  script: mainapp.app
By simply removing .* from the url specification, requests to the bad routes no longer start an Instance. The user gets Error: Not Found, though, and that satisfies my needs.
Edo Akse's answer pushed me to this solution after reading here, so I am accepting his answer. I am still not clear how to implement his answer, though.
We have created a Flutter Web app that fetches BigQuery data through the BigQuery API from a Cloud Function. We were using a service account for authentication, but as we want to make our application public, we need to use end-user OAuth with OAuth credentials.
I have tried to deploy the code from this link for testing on a Cloud Function, but the function keeps running and then shuts down because of the timeout. I checked the logs and found the reason: the Cloud Function cannot open a browser for authentication the way the code would when run locally.
Logs:
Function execution started
Please visit this URL to authorize this application: https://accounts.google.com/o/oauth2/auth?response_type=code&client_id=XXXXXXXXXXXXXXXX&redirect_uri=http%3A%2F%2Flocalhost%3A8080%2F&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fbigquery&state=IXYIkeZUTaisTMGUGkVbgnohlor7Jx&access_type=offline.
Function execution took 180003 ms, finished with status: 'timeout'
I am confused as to how I can now authenticate and authorize a user once and reuse those credentials for every other BigQuery API call in our web app.
I think you are missing the point of the use of Cloud Functions. The documentation you shared clearly states:
This guide explains how to authenticate by using user accounts for access to the BigQuery API when your app is installed onto users' machines.
This is never the case for a Cloud Function, since it is hosted in a Google Cloud Server and available for you to use via an HTTP request or a background process.
Because of that, a Cloud Function will interact with other GCP products by using service accounts, and if you want to set up authentication you will have to set it up at the Cloud Function layer. For that, I recommend you take a look at this documentation, which explains the principles of authentication with Cloud Functions.
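A minimal sketch of the service-account approach, assuming the function's default service account has the needed BigQuery roles (the function name and query are placeholders):

    from google.cloud import bigquery

    def query_bigquery(request):
        # No explicit credentials: the client picks up the Cloud Function's
        # service account through Application Default Credentials.
        client = bigquery.Client()
        rows = client.query("SELECT 1 AS n").result()
        return {"rows": [dict(row) for row in rows]}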
I have an architecture where a Cloud Function gets triggered whenever a file is uploaded to a bucket, and it sends a task to an API built on Flask and deployed on App Engine.
I want to make this process internal so that only the Cloud Function can access the App Engine endpoints, but I am struggling with the process.
As these two services are serverless, I can't just filter the traffic in the App Engine firewall since the Cloud Function will have a different IP each time a new instance is created.
I have tried to follow this guide, which recommends associating all the function's egress traffic with a Serverless VPC Connector assigned to a subnet, and then controlling all the traffic of that subnet with a NAT, assigning it a static IP address. This way I could filter in my App Engine firewall by the NAT IP, which will always be the same.
After following all the steps, I am still not able to filter the traffic. With this configuration done, if I open the traffic to everyone and print the IP route given by the App Engine header X-Forwarded-For when I send a simple GET request from the Cloud Function, it returns 0.0.0.0, 169.254.1.1 (it is a list, since this header records the client IP and the proxies involved along the route). The static IP address assigned to my NAT is 34.78.XX.XX, so it seems that the function is not using the NAT to route its traffic.
I have read somewhere that when the destination IP is hosted on Google Cloud, the traffic does not go through the NAT gateway, so maybe this solution won't work for my use case.
Any idea what I am doing wrong, or whether there are alternatives for making this process private?
There are 2 solutions to solve this problem. And the choice depends on what you believe in!
Network based solution
If you want to keep your App Engine app internal only, I mean at the network level, you can set the ingress control to internal-only to accept only traffic coming from the VPCs of your project.
From there, you need to deploy your Cloud Function with a VPC connector (to route the traffic to the VPC) and set the egress control to all, to route all traffic, public and private, through the VPC.
Indeed, even if you set your App Engine app to ingress internal mode, the service is still publicly exposed, but there is an additional check on the request origin to be sure it comes from the project's VPCs. Therefore, when you call App Engine from your Cloud Function, you call a public endpoint, and you need to route that public traffic through your VPC for it to be accepted by App Engine's internal-only ingress.
This solution works only with VPCs in the same project. A cross-project setup is impossible.
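A hypothetical gcloud sketch of that wiring, assuming a Serverless VPC connector named my-connector already exists (the service, function, and runtime values are placeholders):

    # Accept only VPC-originated traffic on the App Engine service:
    gcloud app services update default --ingress=internal-only

    # Route ALL of the function's egress, public and private, through the VPC:
    gcloud functions deploy my-function \
        --runtime=python39 --trigger-http --entry-point=handler \
        --vpc-connector=my-connector \
        --egress-settings=all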
Identity based solution
Google says: Don't trust the network
So, based on that, Google trusts the identity behind the traffic and the request. You can keep your service private (not accessible by anyone except authorized identities) solely by controlling the authentication of the connection.
For that, you need to activate IAP on your App Engine service and authorize only the service account of your Cloud Function.
Then, in your Cloud Function, you need to generate an identity token manually and add it to the header of your request.
Be careful, there is a trap here: the audience is the IAP Client ID (which you can find on the APIs & Services -> Credentials page).
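A sketch of minting that token from inside the function via the metadata server; IAP_CLIENT_ID and the App Engine URL are placeholders:

    import requests

    IAP_CLIENT_ID = "1234567890-xxxx.apps.googleusercontent.com"  # IAP OAuth client ID
    METADATA_URL = (
        "http://metadata.google.internal/computeMetadata/v1/instance/"
        "service-accounts/default/identity?audience=" + IAP_CLIENT_ID
    )

    def call_app_engine(request):
        # The metadata server signs an OIDC token for the function's service account.
        token = requests.get(METADATA_URL, headers={"Metadata-Flavor": "Google"}).text
        resp = requests.get(
            "https://myapp.appspot.com/endpoint",
            headers={"Authorization": "Bearer " + token},
        )
        return resp.text, resp.status_code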
Only the valid requests, checked by IAP, will reach your App Engine service. In case of attack, IAP will absorb the bad traffic, not App Engine.
So now, what do you trust?
I am creating a web app for my company. I don't want to add a new sign-up process and store the creds for our employees. We already use OpenShift, and everyone with OpenShift creds can log in to our OpenShift cluster. I want to reuse those creds to log in to my web app.
I came to know that OpenShift supports OAuth 2.0, but most of the methods available on the internet use other identity providers, such as Google, for auth in OpenShift. No one explains how to use OpenShift as the identity provider for a web app. Any leads will be appreciated.
Based on what I'm seeing in OpenShift 4.1's documentation on Configuring the internal OAuth Server it looks like it may be possible to use the /oauth/authorize endpoint of the control-plane api.
The OpenShift Container Platform master includes a built-in OAuth server. Users obtain OAuth access tokens to authenticate themselves to the API.
When a person requests a new OAuth token, the OAuth server uses the configured identity provider to determine the identity of the person making the request.
It then determines what user that identity maps to, creates an access token for that user, and returns the token for use.
The intention of this endpoint is to grant OAuth tokens specifically for use with the OpenShift cluster, not for third party applications.
Even if it ends up being possible, you'll still probably want to use the OAuth/OIDC mechanisms directly against the upstream authentication provider OpenShift is using, if possible, as that will provide better support and be more intuitive from an application architecture standpoint.
You can use the OpenShift user API to access the identity of the user who requested an access token.
The API to call is <api_root>/apis/user.openshift.io/v1/users/~ with an Authorization: Bearer <token> header.
This will give you the k8s user object containing the username and groups of the user.
You can also do this from within an OpenShift pod using https://kubernetes.default.svc as the api_root; this requires you to add the CA in the pod to set up a secure connection.
The CA is mounted in any pod at /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
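A minimal sketch of that call from inside a pod (the token argument is the access token you already obtained):

    import requests

    API_ROOT = "https://kubernetes.default.svc"
    CA_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"

    def whoami(token):
        resp = requests.get(
            API_ROOT + "/apis/user.openshift.io/v1/users/~",
            headers={"Authorization": "Bearer " + token},
            verify=CA_PATH,  # validate the cluster cert with the mounted CA
        )
        resp.raise_for_status()
        user = resp.json()  # the k8s User object
        return user["metadata"]["name"], user.get("groups", [])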
You can use the OAuth mechanism provided by OpenShift to retrieve an access token using the authorization code grant.
The documentation for the OpenShift OAuth internals is sketchy at best; I found it helpful to look up the correct URLs and parameters in the Dex OpenShift connector source code: here and here
There's a built-in way to set up OAuth on the App Engine side, and it works great for requests coming from my local machine with a token generated by GoogleCredentials.get_application_default(), but it does not work for requests from Compute Engine, which fail with a NotAllowedError exception on the App Engine side.
I made multiple attempts to configure the request scopes to include https://www.googleapis.com/auth/userinfo.email, as it is required, but no luck.
It turned out that when you create your instance with "Allow API access to all Google Cloud services in the same project", it does not include the required User Info scope.
To include the User Info scope, you have to uncheck "Allow API access to all Google Cloud services in the same project", go to the Access & Security tab, and explicitly enable the User Info scope.
UPDATE 2018-11-15
The correct way to set the email scope now is with the gcloud command:

    gcloud compute instances set-service-account INSTANCE-ID \
        --zone=us-central1-f \
        --service-account=PROJECT-ID-compute@developer.gserviceaccount.com \
        --scopes=https://www.googleapis.com/auth/userinfo.email,cloud-platform