NeedIndexError at Google App Engine forever - python

I deployed and ran my app on GAE a few hours ago. It still fails because it needs to order certain datastore items, and the index needed for that has still not been generated by GAE. So at the point of .order() it throws me a NeedIndexError. How long is this gonna take?
I've been doing this same procedure for ~10 GAE apps in the past and it has never, to my memory, been this slow. (Ok, it has been slow...)
The page "Datastore Indexes" in the old console just says "You have not created indexes for this application.".
The new console says nothing. It just displays a "blue alert" as if I'm not waiting myself to death already. The message in the alert is:
Cloud Datastore queries are powered by indexes, scalable data
structures that are updated in real time as property values change.
Your project's datastore index configuration specifies the indexes it
needs to support its queries. Cloud Datastore builds new indexes as
needed when you deploy index configuration. You can inspect the ready
state of your app's indexes using this console.
(ie a joke)
What am I supposed to do?
Update:
here's the index.yaml:
indexes:
# AUTOGENERATED
- kind: Mjquizinfo
  ancestor: yes
  properties:
  - name: version
    direction: desc
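For context, the query that trips the NeedIndexError is shaped roughly like the sketch below; the model definition and the ancestor key are illustrative reconstructions from the index definition above, not code from the app itself:
from google.appengine.ext import ndb

class Mjquizinfo(ndb.Model):
    # Assumed model shape; only the 'version' property is implied by index.yaml.
    version = ndb.IntegerProperty()

parent_key = ndb.Key('Quiz', 'some-parent')  # hypothetical ancestor key
# An ancestor query ordered by version descending is exactly the
# (ancestor + sort order) combination that needs the composite index above.
latest = Mjquizinfo.query(ancestor=parent_key).order(-Mjquizinfo.version).fetch(1)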

FWIW, in some cases (multi-module apps for example) the plain appcfg.py update used to deploy the app code might not update the index.yaml file.
Try specifically updating the index using appcfg.py update_indexes - you should be able to see the index info in the Developer Console right away (it may still take a while for indexing to be performed and become effective).
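For reference, the invocation looks something like this (the path is a placeholder for the directory that contains your app.yaml and index.yaml):
appcfg.py update_indexes myapp/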

Related

How Can I get the location of a user app from another user of same app

I'm trying to build a simple pickup and delivery app. I want a user to submit a pickup location and a delivery location.
Then, as soon as the user submits, I want to be able to get the locations of the riders' apps, but I don't know how to go about it. It's just like a normal Uber-style app that searches the drivers' locations and calculates the nearest one.
Calculating the nearest one is not the issue, as I can do that with the Google Maps API, but how can I get the riders' app locations from the backend? Thank you in advance.
There is a good plugin that does a lot of the heavy lifting here:
https://pub.dev/packages/location
The nice thing about this plugin is that it handles a lot of the permission stuff for you (although you still have to add NSLocationWhenInUseUsageDescription and NSLocationAlwaysUsageDescription to the Info.plist for iOS).
If, then, you want to read data from Device A on Device B, the data flowpath would look like this:
- Device A gets own location
- Device A uploads location to database (such as firebase)
- Device B queries, or directly receives data from firebase
- Device B reads location data from Device A.
I would recommend using firebase for this transfer, since it's free to get it off the ground. A good place to start would be here: https://www.youtube.com/watch?v=m9uVcubnVfc
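To make that flow concrete, here is a rough sketch of the database side using the firebase_admin Python SDK (on the devices themselves you would use the Flutter Firebase packages instead); the credential file, database URL, and the riders/... path layout are all assumptions, not something from the question:
import firebase_admin
from firebase_admin import credentials, db

# Initialize with a service account; the path and URL are placeholders.
cred = credentials.Certificate('serviceAccount.json')
firebase_admin.initialize_app(cred, {'databaseURL': 'https://your-project.firebaseio.com'})

# "Device A uploads location to database": write a rider's current position.
db.reference('riders/rider_123/location').set({'lat': 6.5244, 'lng': 3.3792})

# "Device B queries ... from firebase": read all rider positions, then pick
# the nearest one with your own distance calculation (e.g. via the Maps API).
riders = db.reference('riders').get() or {}
for rider_id, info in riders.items():
    print(rider_id, info.get('location'))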

Why is App Engine Returning the Wrong Application ID?

The App Engine Dev Server documentation says the following:
The development server simulates the production App Engine service. One way in which it does this is to prepend a string (dev~) to the APPLICATION_ID environment variable. Google recommends always getting the application ID using get_application_id.
In my application, I use different resources locally than I do on production. As such, I have the following for when I startup the App Engine instance:
import logging
from google.appengine.api.app_identity import app_identity
# ...
# other imports
# ...
DEV_IDENTIFIER = 'dev~'
application_id = app_identity.get_application_id()
is_development = DEV_IDENTIFIER in application_id
logging.info("The application ID is '%s'", application_id)
if is_development:
    logging.warning("Using development configuration")
    # ...
    # set up application for development
    # ...
# ...
Nevertheless, when I start my local dev server via the command line with dev_appserver.py app.yaml, I get the following output in my console:
INFO: The application ID is 'development-application'
WARNING: Using development configuration
Evidently, the dev~ identifier that the documentation claims will be prepended to my application ID is absent. I have also tried using the App Engine Launcher UI to see if that changed anything, but it did not.
Note that 'development-application' is the name of my actual application, but I expected it to be 'dev~development-application'.
Google recommends always getting the application ID using get_application_id
But, that's if you cared about the application ID -- you don't: you care about the partition. Check out the source -- it's published at https://code.google.com/p/googleappengine/source/browse/trunk/python/google/appengine/api/app_identity/app_identity.py .
get_application_id uses os.getenv('APPLICATION_ID') then passes that to the internal function _ParseFullAppId -- which splits it on _PARTITION_SEPARATOR = '~' (thus removing again the dev~ prefix that dev_appserver.py prepended to the environment variable). That's returned as the "partition" to get_application_id (which ignores it, only returning the application ID in the strict sense).
Unfortunately, there is no architected way to get just the partition (which is in fact all you care about).
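If you really do want the partition, a non-architected workaround is to look at the raw environment variable yourself; a rough sketch (mirroring what _ParseFullAppId does, and purely illustrative) would be:
import os

# APPLICATION_ID is 'dev~your-app-id' under dev_appserver and something like
# 's~your-app-id' in production; everything before the '~' is the partition.
raw_id = os.getenv('APPLICATION_ID', '')
partition, _, app_id = raw_id.rpartition('~')
is_development = (partition == 'dev')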
I would recommend that, to distinguish whether you're running locally or "in production" (i.e, on Google's servers at appspot.com), in order to access different resources in each case, you take inspiration from the way Google's own example does it -- specifically, check out the app.py example at https://cloud.google.com/appengine/docs/python/cloud-sql/#Python_Using_a_local_MySQL_instance_during_development .
In that example, the point is to access a Cloud SQL instance if you're running in production, but a local MySQL instance instead if you're running locally. But that's secondary -- let's focus instead on, how does Google's own example tell which is the case? The relevant code is...:
if (os.getenv('SERVER_SOFTWARE') and
        os.getenv('SERVER_SOFTWARE').startswith('Google App Engine/')):
    ...snipped: what to do if you're in production!...
else:
    ...snipped: what to do if you're in the local server!...
So, this is the test I'd recommend you use.
Well, as a Python guru, I'm actually slightly embarrassed that my colleagues are using this slightly-inferior Python code (with two calls to os.getenv) -- me, I'd code it as follows...:
in_prod = os.getenv('SERVER_SOFTWARE', '').startswith('Google App Engine/')
if in_prod:
    ...whatever you want to do if we're in production...
else:
    ...whatever you want to do if we're in the local server...
but, this is exactly the same semantics, just expressed in more elegant Python (exploiting the second optional argument to os.getenv to supply a default value).
I'll be trying to get this small Python improvement into that example and to also place it in the doc page you were using (there's no reason anybody just needing to find out if their app is being run in prod or locally should ever have looked at the docs about Cloud SQL use -- so, this is a documentation goof on our part, and, I apologize). But, while I'm working to get our docs improved, I hope this SO answer is enough to let you proceed confidently.
That documentation seems wrong; when I run the commands locally it just spits out the name from app.yaml.
That being said, we use
import os
os.getenv('SERVER_SOFTWARE', '').startswith('Dev')
to check if it is the dev appserver.

A Realtime Dashboard For Logs

We have a number of Python services, many of which use Nginx as a reverse proxy. Right now, we examine requests in real time by tailing the logs found in /var/log/nginx/access.log. I want to make these logs publicly readable in aggregate on a webserver so people don't have to SSH into individual machines.
Our current infrastructure has fluentd (a tool similar to logstash I'm told) sending logs up to a centralized stats server, which has Elasticsearch and kibana installed, with the idea being that kibana would serve as the frontend for viewing our logs.
I know nothing about these services. If I wanted to view our logs in realtime, would this stack even be feasible? Can Elasticsearch provide realtime data with a mere second's delay? Does Kibana have out-of-the-box functionality for automatically updating the page as new log data comes in (i.e., does it have a socket connection with Elasticsearch)? Or am I falling into the wrong toolset?
Kibana is just an interface on top of Elasticsearch. It talks directly to Elasticsearch, so the data on it is as realtime as the data you are feeding into Elasticsearch. In other words, it's only as good as your collectors (fluentd in your case).
It works by having you define time series which it uses to query data from Elasticsearch, and then you can have it always search for keywords and visualize that data.
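As a rough illustration, the kind of time-range query Kibana ends up issuing against Elasticsearch looks something like the sketch below (the index pattern, field names, and use of the elasticsearch-py client are assumptions for the example, not details from this setup):
from elasticsearch import Elasticsearch

es = Elasticsearch(['http://localhost:9200'])

# Fetch the newest log lines from the last 15 minutes -- the same shape of
# query a Kibana time-series panel runs on each refresh.
resp = es.search(
    index='logstash-*',
    body={
        'query': {'range': {'@timestamp': {'gte': 'now-15m', 'lte': 'now'}}},
        'sort': [{'@timestamp': {'order': 'desc'}}],
        'size': 50,
    },
)
for hit in resp['hits']['hits']:
    print(hit['_source'].get('message'))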
If by "realtime" you mean that you want the graphs to move/animate - this is also possible (it's called "streaming dashboards"); but that's not the real power of Kibana - the real power is a very rich query engine: drill down into time series, do calculations (top x over period y).
If all you want is a nice visual/moving thing to put on the wall TV - this is possible with Kibana, but keep in mind you'll have to store everything in Elasticsearch, so unless you plan on doing some other analysis, you'll have to adjust your configuration. For example, have a really short TTL for the messages so once they are visualized, they are no longer available; or filter fluentd to only send across those events that you want to plot. Otherwise you'll have a disk space problem.
If that is all you want, it would be easier to grab some JavaScript charting library and use that in your portal.
I have the "access.log (or other logs) - logstash (or other ES indexer) - Kibana" pipeline set up for a number of services and logs and it works well. In our case it has more than a second of delay, but that's because of buffering in the logs or the ES indexer, not because of Kibana/ES itself.
You can setup Kibana to show only the last X minutes of data and refresh every Y seconds, which gives a decent real-time feel - and looks good on TVs ;)
Keep in mind that Kibana can sometimes issue pretty bad queries which can crash your ES cluster (although this seems to have vastly improved in more recent ES and Kibana versions), so do not rely on this as a permanent data store for your logs, and do not share the ES cluster you use for Kibana with apps that have stronger HA requirements.
As Burhan Khalid pointed out, this setup also gives us the ability to drill down and study specific patterns in details, which is super useful ("What's this spike on this graph?" - zoom in, add a couple filters, look at a few example log lines, filter again - mystery solved). I think saving us from having to dig somewhere else to get more details when we see something weird is actually the best part of this solution.

django session.objects.filter gives static data

I have been debugging this half a day now... anybody have ideas?
I wrote a python script to monitor active sessions, found this:
from datetime import datetime
from django.contrib.auth.models import User
from django.contrib.sessions.models import Session

sessions = Session.objects.filter(expire_date__gte=datetime.now())
for session in sessions:
    data = session.get_decoded()
    id = data.get('_auth_user_id', None)
    ses = session.session_key
    if id:
        name = User.objects.get(id=id)
Gives a nice list... ok. But -- if a user logs out or in, the above code does not reflect the change. It just keeps repeating the original, outdated list.
Is there a caching issue? Think not -- disabled memcached, and no change.
Tried file- and memcache-based session storage -- strange result: the code above still seems to read the db-based session storage.
So, I suspect the initialization is not correct for 1.4.3 -- as there seem to have been various ways to initialize the environment. I believe 1.4 does not require anything but the environment variable DJANGO_SETTINGS_MODULE to be set to the app's settings.
Next, if this does not resolve... must use file-based session storage and poll the directory... that seems to be alive and kicking in realtime :)
Your problem is caused by transaction isolation. By default, each connection to the db runs inside a transaction. Usually, that equates to a view, and the transaction middleware takes care of opening and closing the transaction. In a standalone script, you'll need to manage that yourself.
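A minimal sketch of handling that in a standalone monitoring loop, assuming Django 1.4-era APIs and that DJANGO_SETTINGS_MODULE is already set (the polling interval and the exact cleanup call are illustrative, not from the answer above):
import time
from datetime import datetime

from django.db import transaction
from django.contrib.sessions.models import Session

while True:
    # End the implicit transaction left open by the previous query so the
    # next query sees fresh rows instead of a stale snapshot.
    transaction.commit_unless_managed()
    active = Session.objects.filter(expire_date__gte=datetime.now()).count()
    print('active sessions: %d' % active)
    time.sleep(10)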

GAE development server keep full text search indexes after restart?

Is there any way of forcing the GAE dev server to keep full text search indexes after a restart? I am finding that the index is lost whenever the dev server is restarted.
I am already using a static datastore path when I launch the dev server (the --datastore_path option).
This functionality was added a few releases ago (in either 1.7.1 or 1.7.2, I think). If you're using an SDK from the last few months it should be working. You can try explicitly setting the --search_indexes_path flag on dev_appserver.py; it's possible that the default location (/tmp/) isn't writable. Could you post the first few lines of the logs from when you start dev_appserver?
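For example, an invocation that pins both paths might look like this (the paths are placeholders; --search_indexes_path and --datastore_path are the flags referred to above):
dev_appserver.py --datastore_path=/home/me/myapp.datastore --search_indexes_path=/home/me/myapp.searchindexes app.yaml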
In case anyone else comes looking for this, it looks like the simple solution is now to specify
--storage_path=/not/the/tmp/dir
You can still override this with --datastore_path etc.
https://developers.google.com/appengine/docs/python/tools/devserver
(at the bottom of the page)
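A full command might look something like this (the path is just a placeholder for any writable, non-temporary directory):
dev_appserver.py --storage_path=/home/me/myapp-storage app.yaml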
Looks like this is not an issue anymore, according to the documentation (and my tests):
"The development web server simulates the App Engine datastore using a
file on your computer. This file persists between invocations of the
web server, so data you store will still be available the next time
you run the web server."
Please let me know if it is otherwise and I will follow up on that.
