Google Cloud Scheduler issues using python sdk 2.6.0 - python

I am using Google's Python SDK (v2.6.0) to create jobs on Google Cloud Scheduler that have an HTTP target. Jobs are getting created; however, I am facing a couple of other issues:
The HTTP target doesn't receive the X-Cloudscheduler-Scheduletime header, which I use to check the scheduled execution time. If I create the job manually in the Cloud Console with the same HTTP target, that header is present in the request.
Jobs created with the Python SDK cannot be updated. This is because the time_zone property is not being set properly for jobs created via the SDK. In the console these jobs don't show any time zone, and if I try to update any other property, the update is rejected with a message that 'time_zone' can't be left blank. That shouldn't be possible, since scheduler jobs cannot be created without a time zone, so this looks like an issue with the SDK.
I have seen another post where someone using the Node.js SDK faced the same problem. Their solution was to downgrade the SDK version, but that doesn't seem to work with the Python counterpart.
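For context, here is a minimal sketch of how such a job might be created with google-cloud-scheduler 2.x, setting time_zone explicitly on the Job (the project, location, job ID, and target URL below are placeholders, not values from the question):

    from google.cloud import scheduler_v1

    # Placeholder values; substitute your own project, region, job ID, and URL.
    project = "my-project"
    location = "us-central1"
    parent = f"projects/{project}/locations/{location}"

    job = scheduler_v1.Job(
        name=f"{parent}/jobs/my-http-job",
        schedule="*/10 * * * *",
        time_zone="Europe/London",  # set explicitly; relevant to the blank time_zone symptom
        http_target=scheduler_v1.HttpTarget(
            uri="https://example.com/handler",
            http_method=scheduler_v1.HttpMethod.POST,
        ),
    )

    client = scheduler_v1.CloudSchedulerClient()
    created = client.create_job(parent=parent, job=job)
    print(created.name, created.time_zone)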

Related

Is there a way to check on past Pivotal Cloud Foundry (PCF) CLI buildpacks?

I'm currently trying to stop using a web proxy that provides internet access from an AWS Virtual Private Cloud, as it will soon no longer be in use. I also use that internet access to fetch data from an API endpoint that exposes past buildpack data, such as the name and version of the buildpack itself (https://buildpacks.cloudfoundry.org/#/buildpacks). For general context, I'm currently using Python and AWS for this.
Despite my research, I haven't been able to find a CLI command that gets this data without using the PCF API. Is there any way to do this without internet access?

Dataflow jobs fail because beam-nuggets references sqlalchemy

We have created an ETL in GCP which reads data from MySQL and migrates it to BigQuery. To read data from MySQL, we use the beam-nuggets library. This library is passed as an extra package ('--extra_package=beam-nuggets-0.17.1.tar.gz') to the Dataflow job. Cloud Functions were used to create the Dataflow job. The code was working fine: the Dataflow job got created and the data migration was successful.
After the latest version of sqlalchemy (1.4) was released, we were unable to deploy the cloud function. The deployment failed with the exception mentioned below.
To fix this, we pinned the previous version of sqlalchemy (1.3.23) in the requirements.txt file of the cloud function. This resolved the issue and the cloud function deployed successfully. But when we triggered the Dataflow job from the cloud function, we got the same error as mentioned above.
This happens because the beam-nuggets library references sqlalchemy internally at runtime, so the job fails with the same error. Is it possible to force beam-nuggets to pick a specific version of sqlalchemy?
Try passing a specific version of sqlalchemy via the extra_package flag as well.
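For example (a rough sketch only, assuming you download the SQLAlchemy 1.3.23 source tarball from PyPI and stage it alongside the beam-nuggets tarball; the file names, project, and bucket below are illustrative), the Cloud Function could pass both tarballs when building the pipeline options:

    from apache_beam.options.pipeline_options import PipelineOptions

    # Both tarballs are staged with the function's source; names and paths are illustrative.
    options = PipelineOptions(
        flags=[
            "--extra_package=beam-nuggets-0.17.1.tar.gz",
            "--extra_package=SQLAlchemy-1.3.23.tar.gz",  # pin sqlalchemy on the Dataflow workers too
        ],
        runner="DataflowRunner",
        project="my-project",
        region="us-central1",
        temp_location="gs://my-bucket/temp",
    )

The intent is for the workers to install the pinned sqlalchemy rather than resolving the latest release when beam-nuggets is installed.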

Where to host a Pub/Sub publisher on GCP?

I'm looking to create a publisher that streams and sends tweets containing a certain hashtag to a pub/sub topic.
The tweets will then be ingested with Cloud Dataflow and loaded into a BigQuery database.
In the following article, they do something similar, with the publisher hosted in a Docker image on a Google Compute Engine instance.
Can anyone recommend alternative Google Cloud resources that could host the publisher code more simply, avoiding the need to create a Dockerfile, etc.?
The publisher would need to run constantly. Would Cloud Run, for example, be a suitable alternative?
There are some workarounds I can think of:
A quick way to avoid a container architecture is to run the on_data method inside a loop, for example with something like while(True), or to start a Stream as explained in Create your Python script, and run the code on a Compute Engine instance in the background with nohup python -u myscript.py. Alternatively, follow the steps described in Script on GCE to capture tweets, which uses tweepy.Stream to start the streaming. A minimal sketch of this approach follows this list.
You might want to reconsider the Dockerfile option, since its configuration may not be that difficult. See Tweets & pipelines, where a script reads the data and publishes to Pub/Sub; only 9 lines are used for the Dockerfile and it is deployed to App Engine using Cloud Build. Another implementation with a Dockerfile, which requires more steps and more configuration, is twitter-for-bigquery, in case it helps.
Cloud Functions is another option; in the guide Serverless Twitter with Google Cloud you can check the Design section to see whether it fits your use case.
Airflow with Twitter Scraper could work for you, since Cloud Composer is a managed service for Airflow and you can create an Airflow environment quickly. It uses the Twint library; check the Technical section in the link for more details.
Stream Twitter Data into BigQuery with Cloud Dataprep is a workaround that sets aside complex configuration. In this case the job won't run constantly, but it can be scheduled to run every few minutes.
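As a rough illustration of the first option (a minimal sketch only, assuming Tweepy 4.x; the credentials, project, topic, and hashtag are placeholders), a long-running publisher could look like this:

    import tweepy
    from google.cloud import pubsub_v1

    # Placeholder project and topic; substitute your own.
    publisher = pubsub_v1.PublisherClient()
    topic_path = publisher.topic_path("my-project", "tweets")

    class PubSubStream(tweepy.Stream):
        def on_data(self, raw_data):
            # Forward the raw tweet JSON to Pub/Sub as bytes.
            data = raw_data if isinstance(raw_data, bytes) else raw_data.encode("utf-8")
            publisher.publish(topic_path, data=data)

    # Placeholder Twitter credentials.
    stream = PubSubStream("CONSUMER_KEY", "CONSUMER_SECRET", "ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")
    stream.filter(track=["#myhashtag"])  # blocks and keeps streaming until interrupted

Run it under nohup (or a systemd unit) on the Compute Engine instance so it keeps publishing after you log out.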

How to update a firebase server after making a change?

I've set up a Flask server hosted on Firebase and integrated with Cloud Run. I'm only making changes to HTML at the moment and using the command "firebase serve" with my localhost; however, when I refresh the window, and even when I stop the server and restart it, my changes still don't show up. I must be googling wrong because I can't find what I'm looking for: is there some sort of update command, or do I need to re-build and re-deploy every time?
If the Firebase emulator suite isn't proxying the request to Cloud Run in the way you expect, you should open an issue on the firebase-tools GitHub and provide reproduction steps so they can diagnose. You should make sure that your installation of firebase-tools is fully up to date.
Note that the CLI will not deploy any new code to Cloud Run. You still have to run gcloud to update the backend.
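For reference, redeploying the backend typically looks something like the following (a sketch only; the service name, project, and region are placeholders, and the commands are run from the directory containing your Dockerfile):

    # Build a new container image and deploy it as a new Cloud Run revision.
    gcloud builds submit --tag gcr.io/MY_PROJECT/my-flask-service
    gcloud run deploy my-flask-service --image gcr.io/MY_PROJECT/my-flask-service --region us-central1 --platform managed

Firebase Hosting then routes rewritten requests to the newly deployed revision.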

Running One Instance of Google App Engine with frontend in nodejs and backend server in python

I'm getting my feet wet with GCP and GAE, as well as Node.js, Python, and networking (I know).
[+] What I have:
Basically, I have some Node.js code that takes in some input and is then supposed to send that input to some Python code that will do more work on it. My first idea was to deploy the Node.js code via GAE, host the Python code on a Python server, and then make POST requests from the Node.js frontend to the Python server backend.
[+] What I would like to be able to do:
Just deploy both my Node.js code and my Python code in the same GAE project and instance, so that Node.js is the frontend that people see, while the Python server also runs in the same environment and can communicate with the Node.js code without sending anything over the public internet.
[+] What I have read
https://www.netguru.co/blog/use-node-js-backend
Google App Engine - Front and Backend Web Development
and countless other google searches for this type of setup but to no avail.
If anyone can point me in the right direction I would really appreciate it.
You can't have both Python and Node.js running in the same instance, but they can run as separate services, each with their own instance(s), inside the same GAE app/project. See Service isolation and maybe Deploying different language services to the same Application [Google App Engine].
Using post requests can work pretty well, but will likely take some effort to ensure no outside access.
Since you intend to use the Node.js service as the frontend, you're limited to the flexible environment for it, which limits the inter-service communication options: you can't use push queues (properly supported only in the standard environment), which IMHO would be a better/more secure solution than POST requests.
Another secure communication option would be for the Node.js service to place the data into the Datastore and have the Python service pick it up from there; the Datastore is shared by all instances/versions/services inside the same GAE app. It's also more loosely coupled IMHO: each service can function (at least for a while) without the other being alive, which isn't possible when using POST requests.
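To sketch the Python side of that Datastore handoff (a minimal illustration only; the kind name, fields, and handle() function are made up, and the Node.js frontend would write equivalent entities with its own client library):

    from google.cloud import datastore

    client = datastore.Client()

    # Poll for entities the Node.js frontend has written but the Python service
    # hasn't processed yet. "IncomingRequest" and "processed" are hypothetical names.
    query = client.query(kind="IncomingRequest")
    query.add_filter("processed", "=", False)

    for entity in query.fetch():
        handle(entity["payload"])      # hypothetical processing function
        entity["processed"] = True
        client.put(entity)             # mark as done so it isn't picked up again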
Maybe of interest: How to tell if a Google App Engine documentation page applies to the standard or the flexible environment
UPDATE:
Node.js is now available in the standard environment as well, so you can use those features; see:
Now, you can deploy your Node.js app to App Engine standard environment
Google App Engine Node.js Standard Environment Documentation
