Feedback on availability with Google App Engine - python

We've had some good experiences building an app on Google App Engine. This first app's target audience is Google Apps users, so there are no issues in terms of it being hosted on Google infrastructure.
We like it so much that we would like to investigate using it for another app. However, this next project is for a client who is not really interested in what technology it sits on; they just want it to work, and work all of the time.
In this scenario, given that we have the technology applicability and capability side covered, are there any concerns that this stuff is still relatively new and that we may not be as much "in control" as if we had it done with traditional hosting?

You are correct: you are not in as much control vs. traditional hosting. However, hopefully the gains outweigh the negatives. App Engine is extremely scalable -- it runs on the same hardware that runs Google itself. How often have you visited http://google.com and had that page or a search result fail?
Although you are letting Google run your code, the code is still yours to do with as you please. With new projects like django-nonrel, you can create and run native Django apps directly on top of App Engine, and if it doesn't meet your needs down the line, it's fairly easy to take that app to an ISP that hosts Django apps (and there are plenty of those). More on this project below.
You don't have to worry about hardware, operating systems, coming up with a machine image, databases, web servers, front-end load balancers, CDNs/edge caching, software/package upgrades, license fees, etc. All these things are tangential to the web or other application that you have or will create to solve a particular problem. All this additional infrastructure is required whether you like it or not; but with App Engine, you only need to think about your app/solution and none of this extra stuff.
Obviously another thing you lose is some of your execution environment. To ensure that you're playing nicely with your cloud neighbors (resource hogging, security issues, etc.), you must execute in a sandbox, meaning your app cannot create local files, open network sockets, etc. However, App Engine provides a rich set of APIs and product features so that you at least can create meaningful apps:
scalable distributed object datastore (see below)
Memcache
URLFetch
images service (resize, crop, etc.)
users service/authentication
task queues for background processing
Django web templating
blobstore for large files
denial-of-service blacklisting
transactional tasks
datastore cursors
sending (and/or receiving) of email
sending (and/or receiving) of chat/IM/instant messages via XMPP
You also have a full dashboarded administration console which will let you monitor your app's usage, your billing settings and history, a full dump of your quota usage, and even your application logs which you can view or download.
To address the "main sore points" from @Anurag:
1a. the free quotas are fairly generous... enough to power a website that gets 5MM views/month. also, if you trust Google to give them your credit card, they will bump up the free quota levels even higher. look at their quota page and refer to the numbers in both the "Free Default Quota" and "Billing Enabled Default Quota" columns... here are some examples: a) # of Requests: 1.3MM default, 43MM w/billing enabled (wBE), b) Datastore API calls: 10MM default, 140MM wBE, c) URL Fetches: 657K default, 46MM wBE
1b. 30s max for requests: this is more security for you, because your app is now in a playground with others. Google has to ensure that all cloud neighbors play nicely with each other and not hog the CPU. However, the App Engine team is working on a way to allow for longer running background tasks... there's no timetable yet, but it is on the public roadmap.
1c. writing a chat server on App Engine is not only possible, but it is quite simple. here's one created using App Engine's XMPP API -- it's pretty dumb and just echoes back to the sender what they transmitted to us (be aware that you must have already invited the user to chat):
from google.appengine.api import xmpp
from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app

class XMPPHandler(webapp.RequestHandler):
    def post(self):
        msg = xmpp.Message(self.request.POST)
        msg.reply("I got your msg: '%s'" % msg.body)

application = webapp.WSGIApplication([
    ('/_ah/xmpp/message/chat/', XMPPHandler),
], debug=True)

def main():
    run_wsgi_app(application)

if __name__ == '__main__':
    main()
1d. another item on the public roadmap is future "[support] for Browser Push (Comet) communication", so that's coming too.
2a. "not SQL" is one of Google App Engine's greatest strengths! relational databases don't scale and must be sharded at some point to keep an RDBMS from falling over. it is true however, that it is slightly more difficult to port because it is not traditional! Based on Google Bigtable, you can think of the App Engine datastore as a scalable distributed object database. App Engine lets you query the datastore using a Query object model, or if you insist, they also provide a SQL-like GqlQuery interface.
2b. with new avant-garde projects like django-nonrel, if you create a Django app and use its ORM, you can take a pure Django app and run it directly on top of App Engine. likewise, you can take it off of App Engine and move it directly to a more traditional ISP vendor that hosts Django applications. the queries stay exactly the same, and you don't have to care if it does SQL or not.
3a. long-running processes are already addressed in 1b above. Google is aware of this need and are working on it.
3b. the TaskQueue API supports 100k calls, but that's bumped to 1MM wBE... and this is on a daily basis.
3c. Google strongly encourages breaking up tasks into multiple subtasks. low latency apps are seen not to "hog the system" and are given better treatment than those which are slow and consume more resources from their cloud neighbors.
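As a rough illustration of 3c, here is a minimal sketch in plain Python of breaking one large job into small batches that can each finish quickly. It does not use any App Engine API: the `deque` is only a stand-in for a task queue, and `split_into_batches`, `process_batch`, and `run_all` are hypothetical names chosen for this example.

```python
from collections import deque


def split_into_batches(items, batch_size):
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def process_batch(batch):
    """Hypothetical unit of work; here it just sums the numbers in the batch."""
    return sum(batch)


def run_all(items, batch_size):
    """Stand-in for a task queue: run one small task per batch, in order."""
    pending = deque(split_into_batches(items, batch_size))
    results = []
    while pending:
        results.append(process_batch(pending.popleft()))
    return results


if __name__ == "__main__":
    print(run_all(list(range(10)), batch_size=4))
```

On App Engine each batch would presumably be enqueued as its own task rather than run in a loop, so that every request stays well under the time limit.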

Yes, you would not be in as much control as with traditional hosting. Main sore points of GAE are
Quotas etc, 30 sec max for a request, so comet/reverse ajax etc out of window or very difficult. Try writing a chat server on google app engine.
Not Sql database, so difficult to port to other server if need be and sometime limitation with google database e.g. try sorting a query which has comparison on different column other than the sorted one.
Long running process, there is a Task api but that doesn't suffice if you want to do long running background processing, otherwise you will have to break your task in subtasks, so things get complicated and there are even quotas on how many tasks per sec you can run.
GAE is good if your app can be modeled as request-response registry, with little background processing.
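For the sorting limitation mentioned above, one workaround sometimes used is to apply the comparison first and then sort the (hopefully small) result in application code. A minimal sketch in plain Python, with no App Engine APIs involved; the `rows` data and the `filter_then_sort` name are made up for illustration:

```python
def filter_then_sort(rows, min_age, sort_key):
    """Keep rows with age > min_age, then sort them by a different column."""
    # Step 1: the comparison (the part a datastore query would perform).
    matching = [row for row in rows if row["age"] > min_age]
    # Step 2: sort in memory by another column.
    return sorted(matching, key=lambda row: row[sort_key])


rows = [
    {"name": "carol", "age": 41},
    {"name": "alice", "age": 35},
    {"name": "bob", "age": 19},
]

if __name__ == "__main__":
    print([row["name"] for row in filter_then_sort(rows, 20, "name")])
```

This only makes sense when the filtered result set is small enough to hold in memory within a single request.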
See this too
Feedback on using Google App Engine?

Related

How do I deploy this app for my job: EC2, Elastic Beanstalk, something else entirely?

I'm tasked with creating a web app (I think?) for my job that will track something in our system. It'll be an internal tool that staff uses to keep track of the status of one of the things we do. It should look like Trello, with cards that drag from step to step. That frontend exists, but my job is to make the system update when the cards are dragged. This requires using an API in Python and isn't that complicated to grab from/update. I have no idea how to put all of this together. My job is almost completely nontechnical and there's no one internally who knows what I'm doing except for me. I'm in so over my head here and have no idea where to begin. Is this something I should deploy on Elastic Beanstalk? EC2? How do I tie this together and put it somewhere?
Are you trying to pull in live data from Trello or from your company's own internal project management tool?
An EC2 might be useful, but honestly, it may be completely unnecessary if your company has its own servers. An EC2 is basically just a collection of rental computers to help with scaling. I have never used beanstalk so my input would be useless there.
From what I can assume from the question, you could have a python script running to pull from the API and make the changes without an EC2.
The first thing you should do is gather as much information as possible about what the end product should look like. From your question, I have the feeling that you have only a vague idea of what the stakeholders want. Don't be afraid to ask for more clarification about an unclear task. It's better to spend 30 minutes discussing and taking notes than to show the end product after a month and realize that's not what your boss/team wanted.
Question I would Ask
Who is going to be using this app? (technical or non-technical person)
For what purpose is this being developed?
Does it need to be on the web or can it be used locally?
How many users need to have access to this application?
Are we handling sensitive information with this application?
Will this need to be augmented with other functionality at some point?
This is just a sample of what I would ask, during the conversation with the stakeholder a lot more will pop up for sure.
What I think you have to do
You need to make a monitoring system for the tasks that need to be done by your development team (like a Kanban)
What I think you already have
A frontend with cards that are draggable to each bin. I also assume that you can create a new card and delete one in the frontend. The frontend is most likely written in React, Angular or Vue.js. You might also have no frontend framework (a mix of jQuery and vanilla JS), but usually frontend developers end up picking a framework of some sort to help the development.
A backend API in Python (in Flask or with Django-rest-framework most likely) that is communicating with a SQL database like PostgreSQL or a document database like MongoDB.
I'm making a lot of assumptions here, but your aim should be to understand the technology you will be working with in order to check which hosting would work best. For instance, if the database that is set up is a MySQL database, you might have some trouble with some hosting providers.
What I think you are missing
Currently the frontend and the backend don't communicate with each other. When you drag a card, the change won't persist if you refresh the page. Also, all of this is sitting on your computer and cannot be used by anyone on your staff. You first need to connect the frontend with the backend so that the application has persistence. Then you need to deploy this application somewhere so that it is reachable by your staff.
What I would do is first work locally to make sure that the persistence layer is working. This implies having the API server, the frontend server and the database server running simultaneously on your computer during development. You should then fetch data from the API to know which cards are in the database and then create them visually in your frontend at the right spot.
Dropping a card in a new spot after dragging it should trigger a POST request to your API server in order to update the status of this particular card (look at the documentation of your API to check what you need to send).
The server should send back an updated version of the card's status if the POST request was successful, so your application should then just redraw the card at the right spot (it won't make a difference for you since the card is already at the right spot, and your frontend framework most likely won't act on this change since the state hasn't changed). That's all I would do for that part.
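To make the drag-and-drop round trip a bit more concrete, here is a minimal sketch of what the backend side of that POST could look like, written with only the Python standard library. Everything in it is an assumption for illustration: the status names, the `card_id`/`status` fields in the request body, and the function names are hypothetical, and your real API will define its own.

```python
import json

VALID_STATUSES = ("todo", "in_progress", "done")  # hypothetical column names


def move_card(cards, card_id, new_status):
    """Return a new dict of cards with one card moved to `new_status`."""
    if new_status not in VALID_STATUSES:
        raise ValueError("unknown status: %r" % (new_status,))
    if card_id not in cards:
        raise KeyError(card_id)
    updated = dict(cards)
    # Copy the card so the caller's original data is left untouched.
    updated[card_id] = dict(cards[card_id], status=new_status)
    return updated


def handle_move_request(cards, body):
    """Parse a JSON request body and return (new_cards, JSON response text)."""
    payload = json.loads(body)
    new_cards = move_card(cards, payload["card_id"], payload["status"])
    # The response carries the updated card so the frontend can redraw it.
    return new_cards, json.dumps(new_cards[payload["card_id"]], sort_keys=True)


if __name__ == "__main__":
    cards = {"c1": {"title": "Write docs", "status": "todo"}}
    cards, response = handle_move_request(
        cards, '{"card_id": "c1", "status": "done"}')
    print(response)
```

In a real Flask or Django REST framework app, `handle_move_request` would be the body of the view handling the POST, and `cards` would live in the database rather than in a dict.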
I would then move to the deployment phase to make sure that whatever you did locally can still work online. I would use Heroku to start instead of jumping directly to AWS. Heroku is a service built on top of AWS which manages a lot of the complexity of AWS for you. This is great for prototyping and it means that when your stuff is ready you can migrate to AWS easily and be confident that a setup exists to make your app work. You might also be tied to your company servers, which is another thing I would ask the stakeholder (i.e. where can I put this application and where can't I put it).
The flow for a frontend + API + database application on Heroku is usually as follows. You create a GitHub repo for your frontend (make it private) and you create an app on Heroku that will watch this repository for changes. It will re-deploy the application for you at a specific subdomain of Heroku hosting when it sees a change. You will need to configure some Procfiles that will tell Heroku what to do with a given application type. This is where you need to double check what frontend you are using since that might change the Procfiles used. It's most likely a node.js based frontend (React, Angular or Vue) so head over here for the documentation of how to put that online.
You will need to make a repo for the backend also that is separate from the frontend, these two entities are distinct and they only communicate through HTTP request (frontend->backend) and JSON (backend->frontend). You will need to follow the same idea as with the frontend to deploy, head over here.
Once you have these two online, you need to create a database on Heroku. This is done by adding a datastore to your api, head over here. There are some framework specific configuration you need to do to make the API talk to an online database, but then you will need to find that configuration on the framework documentation. The database could also be already up and living on your server, if this is the case you just need to configure your online backend to talk to that particular database at a particular address.
Once all of the above is done, re-test your application to check that you get the same behavior as before. This is a usable MVP, however there is no layer of security. Anyone with the right URL could just fetch your frontend and start messing around with your data.
There is more engineering that needs to be done to make this a viable end product. This leads us to my final remark: why are you not using a product like Trello, Jira, or even GitHub Projects? If it is to save some money by not paying for a subscription, I think you should factor in the cost of development, security and maintenance of this application.
Hope it helps!
One simple option is Heroku for deploying your API and your frontend application.

Flask login together with client authentication methods for RESTful service

Here is the situation:
We use Flask for website application development. On the website server, we also host a RESTful service. We use Flask-Login as the authentication tool for BOTH the web application access and the RESTful service (accessing the RESTful service from browsers).
Later, we found that we also need to access the RESTful service from client calls (Python), so NO session, cookies, etc. This gives us a headache regarding the current authentication of the RESTful service.
On the web, there exists a whole bunch of ways to secure a RESTful service against client calls. But there seems to be no easy way for them to live together with our current Flask-Login tool without changing our web application a lot.
So here is the question:
Is there an easy way (framework) for the RESTful services to support multiple authentication methods (protocols) at the same time? Is this even good practice?
Many thanks!
So, you've officially bumped into one of the most difficult questions in modern web development (in my humble opinion): web authentication.
Here's the theory behind it (I'll answer your question in a moment).
When you're building complicated apps with more than a few users, particularly if you're building apps that have both a website AND an API service, you're always going to bump into authentication issues no matter what you're doing.
The ideal way to solve these problems is to have an independent auth service on your network. Some sort of internal API that EXCLUSIVELY handles user creation, editing, and deletion. There are a number of benefits to doing this:
You have a single authentication source that all of your application components can use: your website can use it to log people in behind the scenes, your API service can use it to authenticate API requests, etc.
You have a single service which can smartly manage user caching -- it's pretty dangerous to implement user caching all over the place (which is what typically happens when you're dealing with multiple authentication methods: you might cache users for the API service, but fail to cache them with the website, and stuff like this causes problems).
You have a single service which can be scaled INDEPENDENTLY of your other components. Think about it this way: what piece of application data is accessed more than any other? In most applications, it's the user data. For every request user data will be needed, and this puts a strain on your database / cache / whatever you're doing. Having a single service which manages users makes it a lot nicer for you to scale this part of the application stack easily.
Overall, authentication is really hard.
For the past two years I've been the CTO at OpenCNAM, and we had the same issue (a website and API service). For us to handle authentication properly, we ended up building an internal authentication service like described above, then using Flask-Login to handle authenticating users via the website, and a custom method to authenticate users via the API (just an HTTP call to our auth service).
This worked really well for us, and allowed us to scale from thousands of requests to billions (by isolating each component in our stack, and focusing on user auth as a separate service).
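To give a rough idea of how two authentication methods can live side by side, here is a minimal sketch in plain Python. It is not our actual code: the in-memory `SESSIONS` and `API_KEYS` dicts stand in for the auth service, and the `ApiKey <client_id>:<key>` header format is just an assumption made up for this example.

```python
import hmac

# Hypothetical stores; a real app would ask its auth service or database.
SESSIONS = {"session-abc": "alice"}
API_KEYS = {"client-1": "s3cr3t-key"}


def authenticate(headers, cookies):
    """Return a user or client identifier, or None if no method succeeds."""
    # 1. Browser path: a session cookie, as the website login would set.
    session_id = cookies.get("session")
    if session_id in SESSIONS:
        return SESSIONS[session_id]

    # 2. Client path: "Authorization: ApiKey <client_id>:<key>" header.
    scheme, _, credentials = headers.get("Authorization", "").partition(" ")
    if scheme == "ApiKey":
        client_id, _, key = credentials.partition(":")
        expected = API_KEYS.get(client_id)
        # compare_digest avoids leaking key contents via timing differences.
        if expected is not None and hmac.compare_digest(
                key.encode("utf-8"), expected.encode("utf-8")):
            return client_id

    return None


if __name__ == "__main__":
    print(authenticate({}, {"session": "session-abc"}))
    print(authenticate({"Authorization": "ApiKey client-1:s3cr3t-key"}, {}))
```

With Flask-Login, the `request_loader` callback is one place where a header check like the second branch could be plugged in, so the website login flow itself would not need to change.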
Now, I wouldn't recommend this for apps that are very simple, or apps that don't have many users, because it's more hassle than it's worth.
If you're looking for a third party solution, Stormpath looks pretty promising (just google it).
Anyhow, hope that helps! Good luck.

Deploying an app to users' appspot

I am working on a Python App, which runs on App Engine. Is there a way I can publish the app on each customers' appSpot account, so that the App uses the users' cloud storage? Instead of running the App on my AppSpot account and all the users storing the data on my Cloud space?
Yes, absolutely.
You just need to have each client create an App Engine account with an application to which you have administrator access. You can adjust the settings on the application to forbid downloads of your code by the other administrators if that's appropriate for your agreement with the client. This also allows the clients to be billed directly for their instances' usage, and makes it completely impossible for data to leak between different clients' instances.
Using multiple applications for multiple clients who are licensing your application almost certainly does not violate part 4.4 of the TOS, although don't take this as legal advice.
No, you cannot do that. The app is hosted and run in the administrator's account which would be you. What you can do is, release the source code and point your users do install it in their appspot account, just like creating a new application.
I suppose it's not exactly what you need. But it can give you an idea where to go. Please check DryDrop project. There is small Python application you can ask each user to install on their account, then they can configure it to fetch your site files from your GitHub repo through webhooks functionality. I didn't try it, but, theoretically, you update your site, commit it to your repo, and all users get your updated application automatically. You can share your thoughts if that works for you.
Maybe. If it's an open source app that you're giving away, you can publish the source and instruct users to upload it to their own accounts.
If you're selling the app, displaying ads or otherwise trying to monetize the service, you probably want to stick with one instance. Using multiple instances to avoid paying for quota usage is direct violation of the App Engine TOS:
4.4. You may not develop multiple Applications to simulate or act as a single Application or otherwise access the Service in a manner intended to avoid incurring fees.
No. Writing an application that deploys other applications is in violation of the terms of service.
Note we don't have any 'hard' limits - those limits that aren't billing enabled can be increased on application to us if you provide a reasonable use-case.

Migrating Django Application to Google App Engine?

I'm developing a web application and considering Django, Google App Engine, and several other options. I wondered what kind of "penalty" I will incur if I develop a complete Django application assuming it runs on a dedicated server, and then later want to migrate it to Google App Engine.
I have a basic understanding of Google's data store, so please assume I will choose a column based database for my "stand-alone" Django application rather than a relational database, so that the schema could remain mostly the same and will not be a major factor.
Also, please assume my application does not maintain a huge amount of data, so that migration of tens of gigabytes is not required. I'm mainly interested in the effects on the code and software architecture.
Thanks
Most (all?) of Django is available in GAE, so your main task is to avoid basing your designs around a reliance on anything from Django or the Python standard libraries which is not available on GAE.
You've identified the glaring difference, which is the database, so I'll assume you're on top of that. Another difference is the tie-in to Google Accounts and hence that if you want, you can do a fair amount of access control through the app.yaml file rather than in code. You don't have to use any of that, though, so if you don't envisage switching to Google Accounts when you switch to GAE, no problem.
I think the differences in the standard libraries can mostly be deduced from the fact that GAE has no I/O and no C-accelerated libraries unless explicitly stated, and my experience so far is that things I've expected to be there, have been there. I don't know Django and haven't used it on GAE (apart from templates), so I can't comment on that.
Personally I probably wouldn't target LAMP (where P = Django) with the intention of migrating to GAE later. I'd develop for both together, and try to ensure if possible that the differences are kept to the very top (configuration) and the very bottom (data model). The GAE version doesn't necessarily have to be perfect, as long as you know how to make it perfect should you need it.
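As a sketch of what keeping the differences at the very bottom could look like, the application logic below talks only to a small storage interface, so a GAE-backed store and a dedicated-server store could be swapped in behind it. The class and function names are hypothetical, and the in-memory store is just a placeholder for whichever backend you end up with.

```python
class InMemoryNoteStore:
    """Hypothetical backend used for local development and tests."""

    def __init__(self):
        self._notes = {}
        self._next_id = 1

    def save(self, text):
        """Store the text and return its newly assigned integer ID."""
        note_id = self._next_id
        self._next_id += 1
        self._notes[note_id] = text
        return note_id

    def get(self, note_id):
        """Return the stored text, or None if the ID is unknown."""
        return self._notes.get(note_id)


def create_note(store, text):
    """Application logic: depends only on the store's save() method."""
    text = text.strip()
    if not text:
        raise ValueError("note text must not be empty")
    return store.save(text)


if __name__ == "__main__":
    store = InMemoryNoteStore()
    note_id = create_note(store, "  remember the milk  ")
    print(note_id, store.get(note_id))
```

A datastore-backed class and an SQL-backed class would each implement the same `save()` and `get()` methods, and `create_note` would not need to know which one it was given.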
It's not guaranteed that this is faster than writing and then porting, but my guess is it normally will be. The easiest way to spot any differences is to run the code, rather than relying on not missing anything in the GAE docs, so you'll likely save some mistakes that need to be unpicked. The Python SDK is a fairly good approximation to the real App Engine, so all or most of your tests can be run locally most of the time.
Of course if you eventually decide not to port then you've done unnecessary work, so you have to think about the probability of that happening, and whether you'd consider the GAE development to be a waste of your time if it's not needed.
Basically, you will change the data model base class and some APIs if you use them (PIL, urllib2, etc).
If your goal is app-engine, I would use the app engine helper http://code.google.com/appengine/articles/appengine_helper_for_django.html. It can run it on your server with a file based DB and then push it to app-engine with no changes.
It sounds like you have awareness of the major limitation in building/migrating your app -- that AppEngine doesn't support Django's ORM.
Keep in mind that this doesn't just affect the code you write yourself -- it also limits your ability to use a lot of existing Django code. That includes other applications (such as the built-in admin and auth apps) and ORM-based features such as generic views.
There are a few things that you can't do on the App Engine that you can do on your own server, like uploading files. On the App Engine you kinda have to upload a file and store it in the datastore, which can cause a few problems.
Other than that it should be fine from the Presentation part. There are a number of other little things that are better on your own dedicated server but I think eventually a lot of those things will be in the App Engine

Is Google App Engine right for me?

I am thinking about using Google App Engine. It is going to be a huge website. In that case, what is your advice on using Google App Engine? I heard GAE has restrictions: we cannot store images or files larger than 1MB (they are going to change this, from what I read in the GAE roadmap), and queries are limited to 1000 results. I am also going to use web2py with GAE. So I would like to know your comments.
Thanks
Having developed a smallish site with GAE, I have some thoughts
If you mean "huge" like "the next YouTube", then GAE might be a great fit, because of the previously mentioned scaling.
If you mean "huge" like "massively complex, with a whole slew of screens, models, and features", then GAE might not be a good fit. Things like unit testing are hard on GAE, and there's not a built-in structure for your app that you'd get with something like (famously) (Ruby on) Rails, or (Python powered) Turbogears.
ie: there is no staging environment: just your development copy of the system and production. This may or may not be a bad thing, depending on your situation.
Additionally, it depends on the other Python modules you intend to pull in: some Python modules just don't run on GAE (because you can't talk to hardware, or because there are just too many files in the package).
Hope this helps
using web2py on Google App Engine is a great strategy. It lets you get up and running fast, and if you do outgrow the restrictions of GAE then you can move your web2py application elsewhere.
However, keeping this portability means you should stay away from the advanced parts of GAE (Task Queues, Transactions, ListProperty, etc).
The AppEngine uses BigTable as its datastore backend. Don't try to write a traditional relational-database driven application. BigTable is much better suited for use as a highly-scalable key-value store. Avoid joins if at all possible.
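One common way to avoid a join in a key-value style store is to copy the few fields you need onto the record that gets read most often. Here is a minimal sketch in plain Python, using dicts as stand-ins for stored entities; the `authors`/`posts` data and the function names are invented for this example.

```python
# Plain dicts keyed by ID, standing in for two kinds of stored entities.
authors = {"a1": {"name": "Ada"}, "a2": {"name": "Grace"}}
posts = {}


def add_post(post_id, author_id, title):
    """Store a post with the author's name copied in (denormalized)."""
    posts[post_id] = {
        "title": title,
        "author_id": author_id,
        # Duplicated on purpose so that listing posts needs no second lookup.
        "author_name": authors[author_id]["name"],
    }


def list_posts():
    """Build display lines from the posts alone, without touching authors."""
    return ["%s by %s" % (post["title"], post["author_name"])
            for _, post in sorted(posts.items())]


if __name__ == "__main__":
    add_post("p1", "a1", "Notes on engines")
    add_post("p2", "a2", "Compilers")
    print(list_posts())
```

The trade-off is that renaming an author means updating every post that carries the copied name.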
I wouldn't worry about any of this. After having played with Google App Engine for a while now, I've found that it scales quite well for large data sets. If your data elements are large (i.e. photos), then you'll need to integrate with another service to handle them, but that's probably going to be true no matter what with data of that size. Also, I've found BigTable relatively easy to work with having come from a background entirely in relational databases. Finally, Django is a somewhat hidden, but awesome, "feature" of Google App Engine. If you've never used it, it's a really nice, elegant web framework that makes a lot of common tasks trivial (forms come to mind here).
Google has just released version 1.3.0 of the SDK with support with a new Blobstore API for storage of files up to 50MB. See the post "App Engine SDK 1.3.0 Released Including Support for Larger User Uploads".
What about Google Wave? It's being built on appengine, and once live, real-time translatable chat reaches the corporate sector... I could see it hitting top 1000th... But then again, that's an internal project that gets to do special stuff other appengine apps can't.... Like hanging threads; I think... And whatever else Wave has under the hood...
If you are planning on a 'huge' website, then don't use App Engine. Simple as that. The App Engine is not built to deliver the next top 1000th website.
Allow me to also ask what do you mean by 'huge', how many simultaneous users? Queries per second? DB load?
