Google cloud run sudden latency spikes and container instance time increase

We are having recurring problems with our Python container instances running on Cloud Run. We currently have 20 services deployed, which run fine for weeks at a time and then show sudden spikes in request latency, along with failing ping checks and rising container instance time. We cannot see any added traffic in our systems during these spells of higher latency. Shared dependencies such as the database and cache all seem normal.
The region is europe-west1.
Does anyone have any tips on what to check? Or has anyone experienced similar problems?
[Graph: request latency]
[Graph: container instance time]

I had to buy Google Cloud support to get a good answer to this. They told me to make adjustments to my Cloud Run service instances, but none had any effect. They later admitted that this was due to a problem on their end. It is a shame that users do not get any feedback on problems like these when using Google Cloud Platform. A simple notification in the Google Cloud console for affected users would be a great help, but I think they may prefer to cover these things up so as not to worsen the service availability numbers.

Related

Advice on uploading trading bot .exe files to a VPS to run 24/7

I've made a trading bot that uses a C++ .exe for the backend (computing the predictions) and a Python .exe for the frontend (UI, placing trades, keeping track of trades, fetching market data, etc.). Currently I'm simply running it on my laptop. The backend only uses ~1 MB of process memory at any point, while the frontend uses ~72 MB. (The Python memory is measured using this code:
import os
import psutil

process = psutil.Process(os.getpid())  # handle to the current process
while Process_is_running:  # flag maintained elsewhere in the bot
    print(process.memory_info().rss)  # resident set size, in bytes
)
I have never worked with web based applications (besides the python-binance api I guess) or any VPS type service. I am a self taught programmer of only 7 months, roughly.
I just want a basic nudge in the right direction, hopefully somewhere I can read up on the best way to do this.
The details of the program are as follows:
The frontend automatically logs in to Binance. If it runs 24/7 this will of course only happen once, but if something goes wrong and it has to restart, it would log in by itself. I don't mind receiving a webhook notification or something of the sort to notify me of an event like this so I can log in manually.
The frontend simply sends "commands" and market data to the backend, and the backend sends back the prediction and the current state of the algorithm (i.e. "is predicting", "on stand by", "is training").
The reason for doing this is that my location has a very unreliable power supply and not very good internet, so my laptop often has to reboot, and if it stays offline for too long, I might lose money or the program might lose track of the latest trades.
So in summary: can anyone point me in the right direction where I can look for information on this topic, specifically related to my situation? Normally I would spend the time myself, but I am on a massive time constraint here, so any help will be appreciated :)

I'm also implementing a bot. So cool that you are doing so as well. I think that it's really the way to go, making emotionless, data-driven trades.
Anyways, if I were you, I would start an AWS instance. Either Linux or Windows.
If you can run your software on Linux, that would be cheaper, as you won't have to pay the (somewhat small) overhead of Windows licensing.
Windows instances are fine, though. Here are the docs on getting started with AWS windows instances.
I know that you're just getting started, and you probably have multiple things that you want to do with this project. One suggestion for a direction that you could take is to go serverless. Of course there will still be some server, but AWS can abstract that away from you. This can make it both cheaper to run your bot and simpler to manage.
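On any VPS, the restart-and-notify behaviour the question asks for can be approximated with a small supervisor script. The following is only a minimal sketch: `supervise` and its parameters are hypothetical names, and `notify` is a placeholder for whatever notification mechanism you use (a webhook call, an email, or simply `print`).

```python
import subprocess
import time
from typing import Callable, Sequence


def supervise(
    command: Sequence[str],
    notify: Callable[[str], None],
    max_restarts: int = 5,
    delay_seconds: float = 5.0,
) -> int:
    """Run `command`, restarting it whenever it exits with an error.

    Returns the number of restarts that were performed.
    """
    restarts = 0
    while True:
        exit_code = subprocess.call(command)
        if exit_code == 0:
            # Clean shutdown: do not restart.
            return restarts
        if restarts >= max_restarts:
            notify(f"giving up after {restarts} restarts (exit code {exit_code})")
            return restarts
        restarts += 1
        notify(f"process exited with code {exit_code}; restart {restarts}")
        # Short pause so a crash loop does not spin at full speed.
        time.sleep(delay_seconds)
```

It could be started with something like `supervise(["frontend.exe"], notify=print)`, where the executable name is an assumption. On Linux, a process manager such as systemd can do the same job without custom code.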

Why does colab disconnect?

I am trying to load my dataset. I have used Colab's TPU many times, but Colab gets disconnected every time. I have tried all the methods to keep it connected, and it still doesn't work. I have been training for more than 10 hours and Colab still gets disconnected. What do I do?

There are several possible reasons why your session is crashing.
There is a time limit for the free tier in Google Colab. If your execution runs past it, the session disconnects.
Also check the RAM usage; if it exceeds the limit, the session will crash.
The storage limit might be exceeded.
Run your code and keep an eye on these factors. Try to optimise the code, or use AWS for training.
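To keep an eye on RAM and storage from inside the notebook, a small helper along these lines could be called between training steps. This is a rough sketch using only the standard library: `resource_report` is a hypothetical name, and the RAM figures rely on `os.sysconf` names that exist on Linux (which Colab runs on) but not on every platform, so they fall back to `None`.

```python
import os
import shutil


def resource_report(path: str = "/") -> dict:
    """Return disk usage for `path` and, where the OS exposes it, RAM figures in GiB."""
    gib = 1024 ** 3
    disk = shutil.disk_usage(path)
    report = {
        "disk_total_gib": disk.total / gib,
        "disk_free_gib": disk.free / gib,
        "ram_total_gib": None,
        "ram_free_gib": None,
    }
    try:
        page = os.sysconf("SC_PAGE_SIZE")
        report["ram_total_gib"] = page * os.sysconf("SC_PHYS_PAGES") / gib
        # Free pages only; memory used for reclaimable cache is not counted,
        # so this figure is conservative.
        report["ram_free_gib"] = page * os.sysconf("SC_AVPHYS_PAGES") / gib
    except (AttributeError, ValueError, OSError):
        # os.sysconf or one of these names is unavailable on this platform.
        pass
    return report
```

Printing `resource_report()` every few hundred steps makes it easier to tell whether a disconnect coincides with memory or disk running out.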

As @HarshitRuwali mentioned, there are several reasons why this could happen.
Regarding the question "what do I do?" - you can purchase a subscription for Colab Pro, which eliminates/relaxes different limitations, including the time limit.

Azure infrastructure for a Python script triggered by a http request

I'm a bit lost in the jungle of documentation, offers, and services. I'm wondering what the infrastructure should look like, and it would be very helpful to get a nudge in the right direction.
We have a Python script that uses PyTorch to run a prediction. The script has to be triggered by an HTTP request. Preferably, the samples to run the prediction on should come from the same requester. It has to return the prediction as fast as possible.
What is the best / easiest / fastest way of doing this?
We have the script sitting in a Container Registry for now. Can we use it? Azure Kubernetes Service? Azure Container Instances (is this fast enough)?
And as for the trigger, should we use an Azure Function or a Logic App?
Thank you!

Azure Functions V2 has just launched a private preview for writing Functions using Python. You can find some instructions for how to play around with it here. This would probably be one of the most simple ways to execute this script with an HTTP request. Note that since it is in private preview, I would hesitate to recommend using it in a production scenario.
Another caveat to note with Azure Functions is that there will be a cold start whenever we create a new instance of your function application. This should be on the order of 2-4 seconds, and should only happen on the first request after the application has not seen much traffic for a while, or if a new instance has been created to scale up your application to receive more traffic. You can avoid this cold start by running your function on a dedicated App Service Plan, but at that point you are losing a lot of the benefits of Azure Functions.
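Whichever hosting option is chosen, the request shape described in the question (samples in, prediction out) can be sketched with the standard library alone. This is not the Azure Functions API: `load_model`, `handle_body`, `PredictHandler` and the JSON field names are all assumptions made for illustration, and the "model" is a stand-in for the real PyTorch model. The point it shows is that the model is loaded once per process, so only a cold start pays for loading.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def load_model():
    """Hypothetical stand-in for loading a PyTorch model from disk."""
    # A real service would load the trained model here instead.
    return lambda samples: [sum(row) for row in samples]


# Loaded once at import time, so only the first (cold) start pays for it.
MODEL = load_model()


def handle_body(body: bytes) -> tuple:
    """Turn a JSON request body into an (HTTP status, JSON response body) pair."""
    try:
        samples = json.loads(body)["samples"]
        predictions = MODEL(samples)
    except (ValueError, KeyError, TypeError):
        error = {"error": "expected a JSON object with a 'samples' list"}
        return 400, json.dumps(error).encode()
    return 200, json.dumps({"predictions": predictions}).encode()


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes the client announced.
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_body(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8080) -> None:
    """Serve predictions until the process is stopped."""
    HTTPServer(("", port), PredictHandler).serve_forever()
```

Calling `serve()` inside the container starts the endpoint. A production deployment would more likely use a proper web framework and server, but the load-once structure stays the same.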

Python web scraping script resources needed for cloud computing

I have a Python web scraping script that is impossible to run on my computer due to hardware limitations. I was wondering about running it in the cloud with Google App Engine or Heroku, using, if possible, the free resources both provide. The first questions that come to mind are:
How could I know whether my script stays within the free-tier limits?
How could I know the hardware resources needed?
Thanks.

For Google Cloud, there's a free tier so if you stay under that, it's free.
I am not sure why you need to "know" whether your app stays in the free range; you can simply try it and see. It would be too difficult to guess whether you would stay in the free range without knowing your application and its expected traffic. What you can do is try it and set a very low or zero budget to prevent being charged too much.
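For the second question, one way to get a rough idea of the resources a script needs is to run a small, representative piece of the work locally and measure it. The sketch below uses the standard library; `measure` is a hypothetical helper name. Note that `tracemalloc` only tracks memory allocated by Python itself, so memory used inside native extensions is not included.

```python
import time
import tracemalloc


def measure(func, *args, **kwargs):
    """Run func and return (result, elapsed seconds, peak Python heap in bytes)."""
    tracemalloc.start()
    started = time.perf_counter()
    try:
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - started
        # get_traced_memory() returns (current, peak) sizes in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, elapsed, peak
```

For example, `measure(scrape_page, url)`, with `scrape_page` standing in for one unit of your own scraping work, gives a per-page time and memory figure that can be multiplied by the expected number of pages and compared against the published free-tier quotas.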

I think in that context you can go with Alibaba Cloud. With the Alibaba Cloud free trial you can subscribe to an ECS instance for a month, which avoids charges. Please have a look at the official documentation, which provides a step-by-step process for creating and enrolling a free account and subscribing to an ECS instance. You can try the Python script there without any restrictions or limits for a month.

Google App Engine Application Extremely slow

I created a Hello World website in Google App Engine. It is using Django 1.1 without any patch.
Even though it is just a very simple web page, it takes a long time to respond and often times out.
Any suggestions to solve this?
Note: It is responding fast after the first call.

Google has now added a paid option, "Always On", which costs $0.30 a day.
Using this feature, your application will not have to cold start any more.
Always On
While warmup requests help your application scale smoothly, they do not help if your application has very low amounts of traffic. For high-priority applications with low traffic, you can reserve instances via App Engine's Always On feature.
Always On is a premium feature which reserves three instances of your application, never turning them off, even if the application has no traffic. This mitigates the impact of loading requests on applications that have small or variable amounts of traffic. Additionally, if an Always On instance dies accidentally, App Engine automatically restarts the instance with a warmup request. As a result, Always On applications should be sure to do as much initialization as possible during warmup requests.
Even after enabling Always On, your application may experience loading requests if there is a sudden increase in traffic.
To enable Always On, go to the Billing Settings page in your application's Admin Console, and click the Always On checkbox.
http://code.google.com/intl/de-DE/appengine/docs/adminconsole/instances.html
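The advice in the quoted documentation to do as much initialization as possible during warmup requests can be illustrated with a minimal WSGI sketch. App Engine sends warmup requests to the path `/_ah/warmup`; everything else here (`STATE`, `initialize`, `app`) is a hypothetical stand-in, not the Django setup from the question.

```python
STATE = {"ready": False}


def initialize():
    """Hypothetical stand-in for expensive startup work (imports, caches, connections)."""
    STATE["ready"] = True


def app(environ, start_response):
    """Minimal WSGI app that does its initialization on a warmup request."""
    if environ.get("PATH_INFO") == "/_ah/warmup":
        initialize()
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"warmed up"]
    if not STATE["ready"]:
        # A loading request: no warmup happened first, so initialize now.
        initialize()
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello World"]
```

With this structure, a user request only pays the initialization cost when it happens to be the very first request an instance receives.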

This is a horrible suggestion but I'll make it anyway:
Build a little client application or just use wget with cron to periodically access your app, maybe once every 5 minutes or so. That should keep Google from putting it into a dormant state.
I say this is a horrible suggestion because it's a waste of resources and an abuse of Google's free service. I'd expect you to do this only during a short testing/startup phase.
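For completeness, the "little client application" could be as small as the following sketch, which requests the app once and reports the status. The function name `ping` and its parameters are assumptions; the `opener` argument exists only so the function can be exercised without network access.

```python
import urllib.error
import urllib.request


def ping(url: str, timeout: float = 10.0, opener=urllib.request.urlopen):
    """Request `url` once; return the HTTP status code, or None on failure."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status
    except urllib.error.URLError:
        # Covers connection failures and HTTP error responses (HTTPError).
        return None
```

Run from cron with a schedule such as `*/5 * * * *`, a script calling `ping` on your app's URL would hit the app every five minutes, with the same caveats about wasted resources as above.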

To summarize this thread so far:
Cold starts take a long time
Google discourages pinging apps to keep them warm, but people do not know the alternative
There is an issue filed to pay for a warm instance (for Java)
There is an issue filed for Python. Among other things, .py files are not precompiled.
Some apps are disproportionately affected (can't find Google Groups ref or issue)
March 2009 thread about Python says <1s (!)
I see less talk about Python on this issue.

If it's responding quickly after the first request, it's probably just a case of getting the relevant process up and running. Admittedly it's slightly surprising that it takes so long that it times out. Is this after you've updated the application and verified that the AppEngine dashboard shows it as being ready?
"First hit slowness" is quite common in many web frameworks. It's a bit of a pain during development, but not a problem for production.

One more tip which might improve the response time.
Enabling billing increases the quotas and, in my personal experience, improves the overall responsiveness of an application as well, probably because Google gives billing-enabled applications higher priority. For instance, an app with billing disabled can send up to 5-10 emails/request, while an app with billing enabled easily copes with 200 emails/request.
Just be sure to set low billing levels - you never know when Slashdot, Digg or HackerNews notices your site :)

I encountered the same with a Pylons-based app. I have the initial page served as static, with a dummy AJAX call in it to bring the app up before the user types in credentials. It is usually enough to avoid a lengthy response... Just an idea that you might use before you actually have a million users ;).

I used pingdom for obvious reasons - no cold starts is a bonus. Of course the customers will soon come flocking and it will be a non-issue

You may want to try CloudUp. It pings your google apps periodically to keep them active. It's free and you can add as many apps as you want. It also supports azure and heroku.
