Long polling with GMail API

I'm building an installation that will run for several days and needs to get notifications from a GMail inbox in real time. The Gmail API is great for many of the features I need, so I'd like to use it. However, it has no IDLE command like IMAP.
Right now I've created a GMail API implementation that polls the mailbox every couple of seconds. This works great, but times out after a while (I get "connection reset by peer"). So, is it reasonable to turn off the session and restart it every half an hour or so to keep it active (like with IDLE)? Is that a terrible, terrible hack that will have Google busting down my door in the middle of the night?
Would the proper solution be to log in with IMAP as well and use IDLE to notify my GMail API module to start up and pull in changes when they occur? Or should I just suck it up and create an IMAP only implementation?

I would definitely recommend against IMAP. Note that even with the IMAP IDLE command it isn't real time: it's just polling every few (5?) seconds under the covers and then pushing out to the connection. (Experiment yourself and see the delay there.)
Querying history.list() frequently is quite cheap and should be fine. If this is for a sizeable number of users you may want to do a little bit of optimization, like intelligent backoff for inactive mailboxes (e.g. every time there are no updates, back off by an extra 5s, up to some maximum like a minute or two).
Google will definitely not bust down your door or likely even notice unless you're doing it every second with 1M users. :)
Real push notifications for the API are definitely something that's called for.
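The backoff suggested above could be sketched roughly as follows. This is only an illustration: `check_for_updates` is a hypothetical callable that would wrap your own history.list() call and return True when something changed, and the 5s base, 5s step and 120s cap are taken from the "extra 5s up to a minute or two" suggestion, not from any Gmail requirement.

```python
import time
from typing import Callable, List


def next_delay(current: float, had_updates: bool,
               base: float = 5.0, step: float = 5.0,
               max_delay: float = 120.0) -> float:
    """Return how long to wait before the next poll."""
    if had_updates:
        return base                          # active mailbox: go back to the base rate
    return min(current + step, max_delay)    # idle mailbox: wait a bit longer, capped


def poll(check_for_updates: Callable[[], bool], max_polls: int,
         sleep: Callable[[float], None] = time.sleep) -> List[float]:
    """Poll max_polls times and return the delays that were used."""
    delay = 5.0
    used: List[float] = []
    for _ in range(max_polls):
        # Hypothetical wrapper around your history.list() request.
        had_updates = check_for_updates()
        delay = next_delay(delay, had_updates)
        used.append(delay)
        sleep(delay)
    return used
```

Passing `sleep` as a parameter is just a convenience so the loop can be exercised without actually waiting; a long-running installation would loop indefinitely rather than stop after `max_polls`.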

You are getting connection reset by peer because you are exceeding Google quota. Every GMail API request has quota defined here.

Related

Advice on uploading trading bot .exe files to a VPS to run 24/7

I've made a trading bot that uses a C++ .exe for the backend (computing the predictions) and a Python .exe for the frontend (UI, placing trades, keeping track of trades, fetching market data, etc.). Currently I'm simply running it on my laptop. The backend only uses ~1 MB of process memory at any point, while the frontend uses ~72 MB at any point. (The Python memory is calculated using this code:
    import os, psutil
    while Process_is_running:
        process = psutil.Process(os.getpid())
        print(process.memory_info().rss)  # resident set size, in bytes
)
I have never worked with web based applications (besides the python-binance api I guess) or any VPS type service. I am a self taught programmer of only 7 months, roughly.
I just want a basic nudge in the right direction, hopefully somewhere I can read up on the best way to do this.
The details of the program are as follows:
The frontend automatically logs in to Binance. Of course, if it runs 24/7 this will only happen once, but if something goes wrong and it has to restart it would log in by itself, though I don't mind receiving a webhook notification or something of the sort to notify me of an event like this so I can log in manually.
The frontend simply sends "commands" and market data to the backend, and the backend simply sends back the prediction and the current state of the algorithm (i.e. "is predicting", "on stand by", "is training").
The reason for doing this is that my location has a very unreliable power supply and not very good internet, so the bot often has to reboot, and if it stays offline for too long I might lose money or the program might lose track of the latest trades.
So in Summary: Can anyone just point me in the right direction where I can look for information on this topic, specifically related to my situation? Normally I would spend the time myself, but I am on a massive time constraint here so any help will be appreciated :)

I'm also implementing a bot. So cool that you are doing so as well. I think that it's really the way to go, making emotionless, data-driven trades.
Anyways, if I were you, I would start an AWS instance. Either Linux or Windows.
If you can run your software on Linux, that would be cheaper, as you won't have to pay the (somewhat small) overhead of Windows licensing.
Windows instances are fine, though. Here are the docs on getting started with AWS windows instances.
I know that you're just getting started, and you probably have multiple things that you want to do with this project. One suggestion for a direction that you could take is to go serverless. Of course there will be some server, but AWS can abstract that away from you. This can make it both cheaper to run your bot and simpler to manage.

What is the best way to handle multiple user requests when lot of back end calculation is involved?

Hi I am quite new to web application development. I have been designing an application where a user uploads a file, some calculation is done and an output table will be shown. This process takes approximately 5-6 seconds.
I am saving my data in sessions like this:
request.session['data'] = resultDATA
And loading the data whenever I need from sessions like this:
resultDATA = request.session['data']
I don't need the data once the user is signed out. So is this approach correct to save user data (not involving passwords)?
My biggest problem is: if n users upload their files at the exact same moment, does the last user have to wait n*6 seconds for his calculation to complete? If yes, is there any solution for this?
Right now I am using django built-in web server.
Do I have to use a different server to solve this problem?

There are quite a few questions in this question; however, I think they are related enough and concise enough to deserve an answer:
So is this approach correct to save user data (not involving passwords)?
I don't see any problem with this approach, since it's volatile data and it's not sensitive.
My biggest problem is: if n users upload their files at the exact same moment, does the last user have to wait n*6 seconds for his calculation to complete?
This shouldn't be an issue as you put it. Obviously, if your server is handling huge amounts of traffic it will slow down, and it will take a bit longer than your usual 5-6 seconds. However, it won't be n*6; the server should be able to handle multiple requests at once.
Do I have to use a different server to solve this problem?
No, but kind of yes... What I mean is that in development the built-in server is great. It does everything you need it to do. However, when you decide to push the app into production, you'll need a proper server for it.
As a side note, try to see if you can improve the data collection time, because right now everything is running on your own PC, which means it will probably be faster than when you push it to production. When you "upload" a file to localhost it takes a lot less time than when you upload it to an actual server over the internet, so that's a thing to keep in mind.
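To make the "it won't be n*6" point concrete, here is a small stand-alone sketch, not Django code. It assumes each request is handled by its own worker thread and uses `time.sleep` as a stand-in for the calculation; `handle_upload` and `serve_concurrently` are made-up names. Note that in CPython, CPU-bound pure-Python work in threads is limited by the GIL, which is one reason production setups usually run several worker processes.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle_upload(user_id: int, work_seconds: float) -> int:
    """Stand-in for one request: pretend to calculate, then return."""
    time.sleep(work_seconds)
    return user_id


def serve_concurrently(n_users: int, work_seconds: float = 0.2) -> float:
    """Handle n_users requests at once and return the total elapsed seconds."""
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=n_users) as pool:
        futures = [pool.submit(handle_upload, u, work_seconds)
                   for u in range(n_users)]
        results = [f.result() for f in futures]   # wait for every request
    assert results == list(range(n_users))
    return time.monotonic() - start
```

With four simulated users and 0.2s of work each, the total should be close to 0.2s rather than 0.8s, because the requests overlap instead of queueing behind one another.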

Time out issues with chrome and flask

I have a web application which acts as an interface to an offsite server which runs a very long task. The user enters information and hits submit and then chrome waits for the response, and loads a new webpage when it receives it. However depending on the network, input of the user, the task can take a pretty long time and occasionally chrome loads a "no data received page" before the data is returned (though the task is still running).
Is there a way to put either a temporary page while my task is thinking or simply force chrome to continue waiting? Thanks in advance

While you could change your timeout on the server or other tricks to try to keep the page "alive", keep in mind that there might be other parts of the connection that you have no control over that could timeout the request (such as the timeout value of the browser, or any proxy between the browser and server, etc). Also, you might need to constantly up your timeout value if the task takes longer to complete (becomes more advanced, or just slower because more people use it).
In the end, this sort of problem is typically solved by a change in your architecture.
Use a Separate Process for Long-Running Tasks
Rather than submitting the request and running the task in the handling view, the view starts the running of the task in a separate process, then immediately returns a response. This response can bring the user to a "Please wait, we're processing" page. That page can use one of the many push technologies out there to determine when the task was completed (long-polling, web-sockets, server-sent events, an AJAX request every N seconds, or the dead-simplest: have the page reload every 5 seconds).
Have your Web Request "Kick Off" the Separate Process
Anyway, as I said, the view handling the request doesn't do the long action: it just kicks off a background process to do the task for it. You can create this background process dispatch yourself (check out this Flask snippet for possible ideas), or use a library like Celery or RQ.
Once the task is complete, you need some way of notifying the user. This will be dependent on what sort of notification method you picked above. For a simple "ajax request every N seconds", you need to create a view that handles the AJAX request that checks if the task is complete. A typical way to do this is to have the long-running task, as a last step, make some update to a database. The requests for checking the status can then check this part of the database for updates.
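A framework-agnostic sketch of the "kick off, then check status" pattern described above. To stay self-contained it uses a background thread and an in-memory dict as stand-ins for the separate process and the database; `start_task` and `get_status` are hypothetical helpers that two Flask views (the submit view and the AJAX status view) might call. A real deployment would more likely use Celery or RQ with a persistent store.

```python
import threading
import time
import uuid
from typing import Any, Callable, Dict

# In-memory stand-in for the database row the long task updates when it finishes.
_tasks: Dict[str, Dict[str, Any]] = {}
_lock = threading.Lock()


def start_task(func: Callable[[], Any]) -> str:
    """Kick off func in the background and return a task id immediately."""
    task_id = uuid.uuid4().hex
    with _lock:
        _tasks[task_id] = {"state": "running", "result": None, "finished_at": None}

    def runner() -> None:
        try:
            outcome = {"state": "done", "result": func()}
        except Exception as exc:            # record failures instead of hanging forever
            outcome = {"state": "failed", "result": repr(exc)}
        outcome["finished_at"] = time.time()
        with _lock:
            _tasks[task_id] = outcome       # the task's "last step": publish its status

    threading.Thread(target=runner, daemon=True).start()
    return task_id


def get_status(task_id: str) -> Dict[str, Any]:
    """What the status-check request would return for a given task id."""
    with _lock:
        return dict(_tasks.get(task_id, {"state": "unknown", "result": None}))
```

The submit view would call `start_task(...)` and redirect to the "please wait" page with the returned id; that page would then request the status every N seconds until the state is no longer "running".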
Advantages and Disadvantages
Using this method (rather than trying to fit the long-running task into a request) has a few benefits:
1.) Handling long-running web requests is a tricky business due to the fact that there are multiple points that could time out (besides the browser and server). With this method, all your web requests are very short and much less likely to timeout.
2.) Flask (and other frameworks like it) is designed to only support a certain number of threads that can respond to web queries. Assume it has 8 threads: if four of them are handling the long requests, that only leaves four threads to actually handle more typical requests (like a user getting their profile page). Half of your web server could be tied up doing something that is not serving web content! At worst, you could have all eight threads running a long process, meaning your site is completely unable to respond to web requests until one of them finishes.
The main drawback: there is a little more set up work in getting a task queue up and running, and it does make your entire system slightly more complex. However, I would highly recommend this strategy for long-running tasks that run on the web.

I believe this is due to your web server (Apache in most cases) having a timeout that is too small. Try to increase this number.
For Apache, have a look at the timeout option.
EDIT: I don't think you can set this timeout in Chrome (see this topic on Google forums, even though it's really old).
In firefox, on the about:config page, type timeout and you'll have some options you can set. I have no idea about Internet Explorer.

Let's assume:
This is not a server issue, so we don't have to go fiddle with Apache, nginx, etc. timeout settings.
The delay is minutes, not hours or days, just to make the scenario manageable.
You control the web page on which the user hits submit, and from which user interaction is managed.
If those obtain, I'd suggest not using a standard HTML form submission, but rather have the submit button kick off a JavaScript function to oversee processing. It would put up a "please be patient...this could take a little while" style message, then use jQuery.ajax, say, to call the long-time-taking server with a long timeout value. jQuery timeouts are measured in milliseconds, so 60000 = 60 seconds. If it's longer than that, increase your specified timeout accordingly. I have seen reports that not all clients will allow super-extra-long timeouts (e.g. Safari on iOS apparently has a 60-second limitation). But in general, this will give you a platform from which to manage the interactions (with your user, with the slow server) rather than being at the mercy of simple web form submission.
There are a few edge cases here to consider. The web server timeouts may indeed need to be adjusted upward (Apache defaults to 300 seconds aka 5 minutes, and nginx less, IIRC). Your client timeouts (on iOS, say) may have maximums too low for the delays you're seeing. Etc. Those cases would require either adjusting at the server, or adopting a different interaction strategy. But an AJAX-managed interaction is where I would start.

Timed Quiz: How to consider internet interruptions?

I am preparing a test or quiz in Django. The quiz needs to be completed in a certain time frame, say 30 minutes for 40 questions. I can always start a clock at the start of the test, and then calculate the time taken when the quiz is completed. However, it's likely that during the attempt there may be issues such as internet connection drops, system crashes, power outages, etc.
I need a strategy to figure out when such an accident happened, and stop the clock, then let the user take the test again from where it stopped, and start the clock again.
What is the right strategy? Any help, including sample code/examples/ideas, is most welcome.

Your strategy should depend on the importance of the test and the ability to retake the whole test.
Is the test/quiz for fun or for checking competence/knowledge?
Are you dealing with logged-in users?
Are tests generated randomly from a large pool of available questions?
These are the questions you need to answer yourself first.
Remember that:
malicious user CAN simulate connection outage / power failure,
only clock you can trust is one on server side,
everything on browser side can be manipulated (think firebug/console js injection)
My approach would be:
Inform users that TIME is an important factor and that connection issues may not be taken into account when the grade is given,
Serve only one question, wait for the answer, then serve the next one,
Whole test time should be calculated as the SUM of each answer time:
save the "question sent" / "answer received" timestamps for each question and calculate the answer time from them,
time between questions wouldn't count,
you'd get extra insight into which questions were harder / took longer to answer.
Add some kind of heartbeat to your question page (like an AJAX request every X seconds). When the heartbeat stops you can (depending on the options you have):
invalidate the question and notify the user via a dialog that they have connection issues and have to refresh to get a new question instead, if you have a larger pool of questions to use,
pause the time on the server side (and, for example, dim the question page so the user cannot answer until the connection is restored); IMO only for games/fun quizzes/tests,
save information on the server side about each interruption, which would later ease the decision to allow a retake of the whole test, e.g. the user was fine until the 20th question and then kept dropping on 3-4 easy questions in a row...
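The "sum of each answer time" idea above might look something like this minimal sketch. `QuizTimer` and its method names are invented for illustration; the `now` values are assumed to come from the server clock (e.g. `time.time()` in the view), never from the browser.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class QuizTimer:
    """Tracks per-question answer time using server-side timestamps only."""
    sent_at: Dict[int, float] = field(default_factory=dict)
    answer_seconds: Dict[int, float] = field(default_factory=dict)

    def question_sent(self, question_id: int, now: float) -> None:
        # Called when the server serves a question.
        self.sent_at[question_id] = now

    def answer_received(self, question_id: int, now: float) -> float:
        # Called when the answer arrives; time between questions is never counted.
        elapsed = now - self.sent_at.pop(question_id)
        self.answer_seconds[question_id] = elapsed
        return elapsed

    def total_seconds(self) -> float:
        # Whole test time is the sum of the individual answer times.
        return sum(self.answer_seconds.values())
```

For example, if question 1 is sent at t=100 and answered at t=130, and question 2 is sent at t=500 and answered at t=520, the total is 50 seconds; the gap between the two questions does not count. The per-question values also show which questions took longest.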

The simplest way would be to add a timestamp when the person starts the quiz and then compare that to when they submit. Of course, this doesn't take into account connection drops, crashes, etc... like you mentioned.
To account for these issues I'd probably use something like node.js. Each client would "check in" when they connect to the quiz, then check in again at regular intervals (every 1s, 10s, 1m, etc.). If the client doesn't check in at one of these intervals, you can assume their connection has dropped. You could keep track of when they connect again and restart the timer from where they left off.
This is my initial thought on how to keep track of connection drops and crashes. The same could be done with a front-end ajax call to a Django view.

Either you do the clock on the client side, in which case they can always cheat somehow, or you do it on the server side, and then you aren't taking into account these interruptions.
To reduce cheating somewhat and still allow for interruptions, you could do a 'keep alive'.
Here the client side code announces to the server that it is still there every so often, say every 5 seconds. The server side notes when it stops getting these messages, and pauses/stops the clock. However it still has the start and end time, so you know how long it really took in wall time, and also how long it took while the client was supposedly there.
With these two pieces of information you could very easily track down odd behaviour and blacklist people. Blacklisted people might not be aware that they are blacklisted, but their quiz scores don't show up for other users of your quiz system.
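A rough sketch of the keep-alive bookkeeping described above, tracking both wall time and "client was there" time. `HeartbeatClock` is a made-up name, and `max_gap` (how long a silence counts as an interruption) is an assumption you would tune against your heartbeat interval; timestamps are again expected to come from the server clock.

```python
class HeartbeatClock:
    """Accumulates active time from keep-alive messages; long gaps are not counted."""

    def __init__(self, started_at: float, max_gap: float = 10.0) -> None:
        self.started_at = started_at
        self.last_seen = started_at
        self.max_gap = max_gap          # assumed threshold for "connection dropped"
        self.active_seconds = 0.0

    def beat(self, now: float) -> None:
        """Record one keep-alive message received at server time `now`."""
        gap = now - self.last_seen
        if gap <= self.max_gap:
            self.active_seconds += gap  # client was present for this whole gap
        # A longer gap is treated as an interruption: the clock was paused.
        self.last_seen = now

    def wall_seconds(self, now: float) -> float:
        """Real elapsed time since the quiz started, interruptions included."""
        return now - self.started_at
```

With beats at 5, 10, 70 and 75 seconds and a 10-second `max_gap`, the active time is 15 seconds while the wall time is 75 seconds; a large difference between the two is the kind of odd behaviour worth reviewing.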

The problem with pausing the clock when the connection to the user drops, is that the user could just disconnect their computer from the internet each time they received a new question, and then reconnect once they had worked out the answer.
One thing you could do, is give the user a certain amount of time for each question.
The clock is started when the user successfully receives the question to their browser, and if the user submits an answer before the time limit, it is accepted, otherwise it is void.
That would mean if a user lost connection it would only affect the question they are currently on. But it would also mean that the user would have no flexibility in how much time they want to allot to each question, you decide for them.
I was thinking you could do something like removing the question from the screen unless the connection to the server was still alive, but the user could always just screen-shot the question before disconnecting.
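The per-question time limit could be enforced on the server with a check along these lines. This is a sketch under assumptions: `grade_answer` is a hypothetical helper, and `sent_at` is the server's timestamp for when the question was served, which only approximates the moment the browser actually received it.

```python
from typing import Optional


def grade_answer(sent_at: float, received_at: float, limit_seconds: float,
                 answer: str, correct: str) -> Optional[bool]:
    """Return None if the answer is void (too late), else whether it is correct."""
    elapsed = received_at - sent_at
    if elapsed < 0 or elapsed > limit_seconds:
        return None                 # outside the allowed window: answer is void
    return answer == correct
```

A dropped connection then only costs the user the question they were on: a late answer comes back as None (void), and the next question starts with a fresh limit.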

Google App Engine Application Extremely slow

I created a Hello World website in Google App Engine. It is using Django 1.1 without any patch.
Even though it is just a very simple web page, it takes a long time to respond and often times out.
Any suggestions to solve this?
Note: It is responding fast after the first call.

Google has since added a paid option, "Always On", which costs $0.30 a day.
Using this feature, your application will not have to cold start any more.
Always On
While warmup requests help your application scale smoothly, they do not help if your application has very low amounts of traffic. For high-priority applications with low traffic, you can reserve instances via App Engine's Always On feature.
Always On is a premium feature which reserves three instances of your application, never turning them off, even if the application has no traffic. This mitigates the impact of loading requests on applications that have small or variable amounts of traffic. Additionally, if an Always On instance dies accidentally, App Engine automatically restarts the instance with a warmup request. As a result, Always On applications should be sure to do as much initialization as possible during warmup requests.
Even after enabling Always On, your application may experience loading requests if there is a sudden increase in traffic.
To enable Always On, go to the Billing Settings page in your application's Admin Console, and click the Always On checkbox.
http://code.google.com/intl/de-DE/appengine/docs/adminconsole/instances.html

This is a horrible suggestion but I'll make it anyway:
Build a little client application or just use wget with cron to periodically access your app, maybe once every 5 minutes or so. That should keep Google from putting it into a dormant state.
I say this is a horrible suggestion because it's a waste of resources and an abuse of Google's free service. I'd expect you to do this only during a short testing/startup phase.

To summarize this thread so far:
Cold starts take a long time
Google discourages pinging apps to keep them warm, but people do not know the alternative
There is an issue filed to pay for a warm instance (of the Java)
There is an issue filed for Python. Among other things, .py files are not precompiled.
Some apps are disproportionately affected (can't find Google Groups ref or issue)
March 2009 thread about Python says <1s (!)
I see less talk about Python on this issue.

If it's responding quickly after the first request, it's probably just a case of getting the relevant process up and running. Admittedly it's slightly surprising that it takes so long that it times out. Is this after you've updated the application and verified that the AppEngine dashboard shows it as being ready?
"First hit slowness" is quite common in many web frameworks. It's a bit of a pain during development, but not a problem for production.

One more tip which might improve the response time.
Enabling billing does increase the quotas and, in my personal experience, improves the overall responsiveness of an application as well, probably because Google gives billing-enabled applications higher priority. For instance, an app with billing disabled can send up to 5-10 emails/request, while an app with billing enabled easily copes with 200 emails/request.
Just be sure to set low billing levels - you never know when Slashdot, Digg or HackerNews notices your site :)

I encountered the same thing with a Pylons-based app. I have the initial page served as static, with a dummy AJAX call in it to bring the app up before the user types in credentials. It is usually enough to avoid a lengthy response... Just an idea that you might use before you actually have a million users ;).

I used pingdom for obvious reasons - no cold starts is a bonus. Of course the customers will soon come flocking and it will be a non-issue

You may want to try CloudUp. It pings your google apps periodically to keep them active. It's free and you can add as many apps as you want. It also supports azure and heroku.
