For testing purposes I want to start two instances of a GAE app locally. However, the second instance fails to start because the first instance already holds a lock on the local database.
INFO 2014-09-28 05:14:22,751 admin_server.py:117] Starting admin server at: http://localhost:8081
OperationalError('database is locked',)
Traceback (most recent call last):
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/lib/cherrypy/cherrypy/wsgiserver/wsgiserver2.py", line 1302, in communicate
req.respond()
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/lib/cherrypy/cherrypy/wsgiserver/wsgiserver2.py", line 831, in respond
self.server.gateway(self).respond()
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/lib/cherrypy/cherrypy/wsgiserver/wsgiserver2.py", line 2115, in respond
response = self.req.server.wsgi_app(self.env, self.start_response)
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/tools/devappserver2/wsgi_server.py", line 266, in __call__
return app(environ, start_response)
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/tools/devappserver2/module.py", line 1431, in __call__
return self._handle_request(environ, start_response)
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/tools/devappserver2/module.py", line 641, in _handle_request
module=self._module_configuration.module_name)
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/api/apiproxy_stub.py", line 165, in WrappedMethod
return method(self, *args, **kwargs)
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/api/logservice/logservice_stub.py", line 172, in start_request
host, start_time, method, resource, http_version, module))
OperationalError: database is locked
Is there any way I can specify an alternative data store location in the second instance of my app?
It depends on how you start your application.
If you're using Java, you might want to look at this answer.
But keep in mind your two apps won't be talking to the same datastore, so if you need data to persist between your instances, this won't work.
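If you're launching the Python dev server directly, dev_appserver.py can point each instance at its own storage location and ports; something like this should work (paths and port numbers here are arbitrary):
dev_appserver.py --port=8080 --admin_port=8001 --storage_path=/tmp/gae-instance-1 app.yaml
dev_appserver.py --port=9080 --admin_port=9001 --storage_path=/tmp/gae-instance-2 app.yaml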
On a small Flask web server running on a Raspberry Pi with about 10-20 clients, we periodically get this error:
Error on request:
Traceback (most recent call last):
File "/home/pi/3D_printer_control/env/lib/python3.7/site-packages/werkzeug/serving.py", line 270, in run_wsgi
execute(self.server.app)
File "/home/pi/3D_printer_control/env/lib/python3.7/site-packages/werkzeug/serving.py", line 258, in execute
application_iter = app(environ, start_response)
File "/home/pi/3D_printer_control/env/lib/python3.7/site-packages/flask/app.py", line 2309, in __call__
return self.wsgi_app(environ, start_response)
File "/home/pi/3D_printer_control/env/lib/python3.7/site-packages/flask_socketio/__init__.py", line 43, in __call__
start_response)
File "/home/pi/3D_printer_control/env/lib/python3.7/site-packages/engineio/middleware.py", line 47, in __call__
return self.engineio_app.handle_request(environ, start_response)
File "/home/pi/3D_printer_control/env/lib/python3.7/site-packages/socketio/server.py", line 360, in handle_request
return self.eio.handle_request(environ, start_response)
File "/home/pi/3D_printer_control/env/lib/python3.7/site-packages/engineio/server.py", line 291, in handle_request
socket = self._get_socket(sid)
File "/home/pi/3D_printer_control/env/lib/python3.7/site-packages/engineio/server.py", line 427, in _get_socket
raise KeyError('Session is disconnected')
KeyError: 'Session is disconnected'
The error is generated automatically from inside python-socketio. What does this error really mean and how can I prevent or suppress it?
As far as I can tell, this usually means the server can't keep up with supplying data to all of the clients.
Some possible mitigation techniques include disconnecting inactive clients, reducing the amount of data sent where possible, sending live data in larger chunks, or upgrading the server. If you need a lot of data throughput, there may also be a better option than Socket.IO.
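For the first of those (disconnecting inactive clients), here's a rough sketch; the idle bookkeeping and the 60-second threshold are my own invention, and flask_socketio exposes the underlying python-socketio server as socketio.server:

import time
from flask import Flask, request
from flask_socketio import SocketIO

app = Flask(__name__)
socketio = SocketIO(app)

last_seen = {}     # sid -> timestamp of last activity (my own bookkeeping)
IDLE_LIMIT = 60    # seconds of silence before we drop a client (arbitrary)

@socketio.on('connect')
def on_connect():
    last_seen[request.sid] = time.time()

@socketio.on('disconnect')
def on_disconnect():
    last_seen.pop(request.sid, None)

@socketio.on('message')
def on_message(msg):
    last_seen[request.sid] = time.time()

def reap_idle_clients():
    while True:
        socketio.sleep(IDLE_LIMIT)
        cutoff = time.time() - IDLE_LIMIT
        for sid, seen in list(last_seen.items()):
            if seen < cutoff:
                last_seen.pop(sid, None)
                # The underlying python-socketio server can disconnect a
                # session by sid, freeing it server-side.
                socketio.server.disconnect(sid)

socketio.start_background_task(reap_idle_clients)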
I have been able to reproduce it by setting a really high ping rate and a low timeout in the SocketIO constructor:
from flask_socketio import SocketIO

# Ping every 5 seconds and drop clients that take longer than 5 seconds to
# answer -- aggressive settings that make the KeyError easy to trigger.
socketio = SocketIO(engineio_logger=True, ping_timeout=5, ping_interval=5)
This means the server has to do a lot of messaging to all of the clients and they don't have long to respond. I then open around 10 clients and I start to see the KeyError.
Further debugging of our server turned up a process that was posting lots of live data; it ran fine with only a few clients but started to raise the occasional KeyError once I got up to about a dozen.
I am deploying containers to GKE that contain Python apps and encountering an error when I try to use OpenCensus to send trace messages:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/opencensus/metrics/transport.py", line 59, in func
return self.func(*aa, **kw)
File "/usr/local/lib/python3.7/site-packages/opencensus/metrics/transport.py", line 113, in export_all
export(itertools.chain(*all_gets))
File "/usr/local/lib/python3.7/site-packages/opencensus/ext/stackdriver/stats_exporter/__init__.py", line 162, in export_metrics
self.client.project_path(self.options.project_id), ts_batch)
File "/usr/local/lib/python3.7/site-packages/google/cloud/monitoring_v3/gapic/metric_service_client.py", line 1024, in create_time_series
request, retry=retry, timeout=timeout, metadata=metadata
File "/usr/local/lib/python3.7/site-packages/google/api_core/gapic_v1/method.py", line 143, in __call__
return wrapped_func(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/google/api_core/retry.py", line 273, in retry_wrapped_func
on_error=on_error,
File "/usr/local/lib/python3.7/site-packages/google/api_core/retry.py", line 182, in retry_target
return target()
File "/usr/local/lib/python3.7/site-packages/google/api_core/timeout.py", line 214, in func_with_timeout
return func(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/google/api_core/grpc_helpers.py", line 59, in error_remapped_callable
six.raise_from(exceptions.from_grpc_error(exc), exc)
File "<string>", line 3, in raise_from
google.api_core.exceptions.InvalidArgument: 400 One or more TimeSeries could not be written: The set of resource labels is incomplete. Missing labels: (container_name namespace_name).: timeSeries[0-199]
The interesting part seems to be this sentence: Missing labels: (container_name namespace_name).
When I run the exact same code locally, I do not receive any errors and I do see my tracing appearing in Stackdriver Metrics Explorer, so the problem appears to be related specifically to running inside a container in GKE.
Is there something specific that is required to get OpenCensus working in a GKE container?
The answer is that you need to manually set two environment variables in your container: CONTAINER_NAME and NAMESPACE. I believe GKE should be setting these and isn't, and so OpenCensus can't find the expected values. A sample fix would involve including those two variables in the podspec:
spec:
  containers:
    - env:
        - name: NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: CONTAINER_NAME
          value: {{ APP }}-collectors-{{ NAME }}
More details: https://github.com/census-instrumentation/opencensus-python/issues/796#issuecomment-539109321
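To confirm the variables actually reach the running container, something like this should work (the pod name is hypothetical):
kubectl exec my-pod -- env | grep -E 'NAMESPACE|CONTAINER_NAME'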
I'm using Invenio 2.0 and am trying to replace the old SQLAlchemy 0.8.7 with the latest version, 0.9.7.
The utility to automatically create the db works (inveniomanage database recreate --yes-i-know).
But when I start the tests with: python setup.py test
it returns an error:
test_fisrt_blueprint (invenio.testsuite.test_ext_template.TemplateLoaderCase) ... --------------------------------------------------------------------------------
ERROR in wrappers [/home/vagrant/.virtualenvs/invenio2/src/invenio/invenio/ext/logging/wrappers.py:310]:
--------------------------------------------------------------------------------
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/invenio2/src/invenio/invenio/ext/legacy/__init__.py", line 124, in __call__
response = self.app.full_dispatch_request()
File "/home/vagrant/.virtualenvs/invenio2/local/lib/python2.7/site-packages/flask/app.py", line 1470, in full_dispatch_request
self.try_trigger_before_first_request_functions()
File "/home/vagrant/.virtualenvs/invenio2/local/lib/python2.7/site-packages/flask/app.py", line 1497, in try_trigger_before_first_request_functions
func()
File "/home/vagrant/.virtualenvs/invenio2/src/invenio/invenio/modules/messages/views.py", line 264, in invoke_email_alert_register
email_alert_register()
File "/home/vagrant/.virtualenvs/invenio2/src/invenio/invenio/modules/messages/models.py", line 202, in email_alert_register
event.listen(MsgMESSAGE, 'after_insert', email_alert)
File "/home/vagrant/.virtualenvs/invenio2/local/lib/python2.7/site-packages/sqlalchemy/event/api.py", line 63, in listen
_event_key(target, identifier, fn).listen(*args, **kw)
File "/home/vagrant/.virtualenvs/invenio2/local/lib/python2.7/site-packages/sqlalchemy/event/registry.py", line 187, in listen
self.dispatch_target.dispatch._listen(self, *args, **kw)
File "/home/vagrant/.virtualenvs/invenio2/local/lib/python2.7/site-packages/sqlalchemy/orm/events.py", line 547, in _listen
event_key.base_listen(**kw)
File "/home/vagrant/.virtualenvs/invenio2/local/lib/python2.7/site-packages/sqlalchemy/event/registry.py", line 226, in base_listen
for_modify(target.dispatch).append(self, propagate)
File "/home/vagrant/.virtualenvs/invenio2/local/lib/python2.7/site-packages/sqlalchemy/event/attr.py", line 328, in append
event_key.append_to_list(self, self.listeners)
File "/home/vagrant/.virtualenvs/invenio2/local/lib/python2.7/site-packages/sqlalchemy/event/registry.py", line 237, in append_to_list
_stored_in_collection(self, owner)
File "/home/vagrant/.virtualenvs/invenio2/local/lib/python2.7/site-packages/sqlalchemy/event/registry.py", line 74, in _stored_in_collection
assert dispatch_reg[owner_ref] == listen_ref
AssertionError
In /home/vagrant/.virtualenvs/invenio2/src/invenio/invenio/modules/messages/views.py (line 264):
# Registration of email_alert invoked from blueprint
# in order to use before_app_first_request.
# Reading config CFG_WEBMESSAGE_EMAIL_ALERT
# required app context.
@blueprint.before_app_first_request
def invoke_email_alert_register():
    email_alert_register()
In /home/vagrant/.virtualenvs/invenio2/src/invenio/invenio/modules/messages/models.py (line 202):
# Registration of email_alert invoked from blueprint
# in order to use before_app_first_request.
# Reading config CFG_WEBMESSAGE_EMAIL_ALERT
# required app context.
def email_alert_register():
    if cfg['CFG_WEBMESSAGE_EMAIL_ALERT']:
        from sqlalchemy import event
        # Register after insert callback.
        event.listen(MsgMESSAGE, 'after_insert', email_alert)
Can someone help me?
Installed:
-e git+https://github.com/mitsuhiko/flask-sqlalchemy@c7eccba63314f3ea77e2c6217d3d3c8b0d2552fd#egg=Flask_SQLAlchemy-2.0
MySQL-python==1.2.5
SQLAlchemy==0.9.7
SQLAlchemy-Utils==0.23.5
With help from Google (today) I found what I suspect is the solution (I'm not an Invenio user).
I suspect an SQLAlchemy update will fix your issue:
https://bitbucket.org/zzzeek/sqlalchemy/issue/3199/deduplication-of-events-doesnt-work-for
-->
https://bitbucket.org/zzzeek/sqlalchemy/commits/9ae4db27b993
-->
Fixed in SQLA 0.9.8 (supposedly)
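If upgrading isn't immediately an option, a possible workaround (just a sketch, reusing the names from your models.py) is to make the registration idempotent with SQLAlchemy's public event.contains():

from sqlalchemy import event

def email_alert_register():
    if cfg['CFG_WEBMESSAGE_EMAIL_ALERT']:
        # event.contains() reports whether this exact listener is already
        # registered, so calling this function twice becomes a no-op.
        if not event.contains(MsgMESSAGE, 'after_insert', email_alert):
            event.listen(MsgMESSAGE, 'after_insert', email_alert)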
I am wondering if anyone knows a way to use a service account to authenticate if I want to access data in Cloud Storage by:
1. Using the boto library (and gcs_oauth2_boto_plugin)
2. Running in Google App Engine (GAE)
Following https://developers.google.com/storage/docs/gspythonlibrary I am using boto and gcs_oauth2_boto_plugin to authenticate and perform actions against Cloud Storage (upload/download files). I am using a service account to authenticate so that we don't have to authenticate with a Google account periodically (the thought being that if we run this in GCE, it'll run with the GCE service account -- haven't actually done that yet). Locally, I've set up my boto config file to use the service account and point to a p12 key file. This runs fine locally.
I would like to use the same code to interact with Cloud Storage from within Google App Engine (GAE). We are running a lightweight ETL process that transforms the data and loads it into BigQuery. We want to run this code in an App Engine task queue (the task is triggered by an Object Change Notification from Cloud Storage).
Since we're currently relying on the boto config (~/.boto), I adapted http://thurloat.com/2010/06/07/google-storage-and-app-engine to put the relevant config items for a service account.
When I finally run the code from App Engine (dev_appserver.py), I get the below stack trace:
Traceback (most recent call last):
File "/home/some-user/google-cloud-sdk/platform/google_appengine/lib/webapp2-2.5.1/webapp2.py", line 1536, in __call__
rv = self.handle_exception(request, response, e)
File "/home/some-user/google-cloud-sdk/platform/google_appengine/lib/webapp2-2.5.1/webapp2.py", line 1530, in __call__
rv = self.router.dispatch(request, response)
File "/home/some-user/google-cloud-sdk/platform/google_appengine/lib/webapp2-2.5.1/webapp2.py", line 1278, in default_dispatcher
return route.handler_adapter(request, response)
File "/home/some-user/google-cloud-sdk/platform/google_appengine/lib/webapp2-2.5.1/webapp2.py", line 1102, in __call__
return handler.dispatch()
File "/home/some-user/google-cloud-sdk/platform/google_appengine/lib/webapp2-2.5.1/webapp2.py", line 572, in dispatch
return self.handle_exception(e, self.app.debug)
File "/home/some-user/google-cloud-sdk/platform/google_appengine/lib/webapp2-2.5.1/webapp2.py", line 570, in dispatch
return method(*args, **kwargs)
File "/home/some-user/dev/myApp/main.py", line 247, in post
gs.download(fname, fp)
File "/home/some-user/dev/myApp/cloudstorage.py", line 107, in download
bytes = src_uri.get_key().get_contents_to_file(fp)
File "/home/some-user/dev/myApp/boto/storage_uri.py", line 336, in get_key
bucket = self.get_bucket(validate, headers)
File "/home/some-user/dev/myApp/boto/storage_uri.py", line 181, in get_bucket
conn = self.connect()
File "/home/some-user/dev/myApp/boto/storage_uri.py", line 140, in connect
**connection_args)
File "/home/some-user/dev/myApp/boto/gs/connection.py", line 47, in __init__
suppress_consec_slashes=suppress_consec_slashes)
File "/home/some-user/dev/myApp/boto/s3/connection.py", line 190, in __init__
validate_certs=validate_certs, profile_name=profile_name)
File "/home/some-user/dev/myApp/boto/connection.py", line 568, in __init__
host, config, self.provider, self._required_auth_capability())
File "/home/some-user/dev/myApp/boto/auth.py", line 929, in get_auth_handler
ready_handlers.append(handler(host, config, provider))
File "/home/some-user/dev/myApp/gcs_oauth2_boto_plugin/oauth2_plugin.py", line 56, in __init__
cred_type=oauth2_client.CredTypes.OAUTH2_SERVICE_ACCOUNT)
File "/home/some-user/dev/myApp/gcs_oauth2_boto_plugin/oauth2_helper.py", line 48, in OAuth2ClientFromBotoConfig
token_cache = oauth2_client.FileSystemTokenCache()
File "/home/some-user/dev/myApp/gcs_oauth2_boto_plugin/oauth2_client.py", line 175, in __init__
tempfile.gettempdir(), 'oauth2_client-tokencache.%(uid)s.%(key)s')
File "/home/some-user/google-cloud-sdk/platform/google_appengine/google/appengine/dist/tempfile.py", line 61, in PlaceHolder
raise NotImplementedError("Only tempfile.TemporaryFile is available for use")
NotImplementedError: Only tempfile.TemporaryFile is available for use
Looks like the problem is just with gcs_oauth2_boto_plugin trying to use a temporary directory when caching the oauth credentials (App Engine only supports tempfile.TemporaryFile).
Rather than try and patch gcs_oauth2_boto_plugin, is there potentially another solution? Can we use a service account with gcs_oauth2_boto_plugin/boto on App Engine to access Cloud Storage resources?
Or, am I using the wrong authentication method here?
This doesn't quite answer the question directly, but instead of using boto and gcs_oauth2_boto_plugin, I am using the "Google Cloud Storage Python Client Library", GoogleAppEngineCloudStorageClient from pip.
https://developers.google.com/appengine/docs/python/googlecloudstorageclient/
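On App Engine the library authenticates as the app's default service account automatically, so there's no boto config to manage. A minimal sketch (bucket and object names are placeholders):

import cloudstorage as gcs

def download_to(fp, bucket, filename):
    # cloudstorage paths take the form /bucket/object
    with gcs.open('/%s/%s' % (bucket, filename)) as gcs_file:
        fp.write(gcs_file.read())

def upload(bucket, filename, data, content_type='application/octet-stream'):
    with gcs.open('/%s/%s' % (bucket, filename), 'w',
                  content_type=content_type) as gcs_file:
        gcs_file.write(data)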
I have to make some heavy queries in my datastore to obtain some high-level information. When a query reaches 60 seconds I get an error that I suppose is a timeout:
Traceback (most recent call last):
File "/base/python27_runtime/python27_lib/versions/1/google/appengine/runtime/wsgi.py", line 207, in Handle
result = handler(dict(self._environ), self._StartResponse)
File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1505, in __call__
rv = self.router.dispatch(request, response)
File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1253, in default_dispatcher
return route.handler_adapter(request, response)
File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1077, in __call__
return handler.dispatch()
File "/base/python27_runtime/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 545, in dispatch
return method(*args, **kwargs)
File "/base/python27_runtime/python27_lib/versions/1/google/appengine/ext/admin/__init__.py", line 140, in xsrf_required_decorator
method(self)
File "/base/python27_runtime/python27_lib/versions/1/google/appengine/ext/admin/__init__.py", line 348, in post
exec(compiled_code, globals())
File "<string>", line 28, in <module>
File "/base/python27_runtime/python27_lib/versions/1/google/appengine/ext/db/__init__.py", line 2314, in next
return self.__model_class.from_entity(self.__iterator.next())
File "/base/python27_runtime/python27_lib/versions/1/google/appengine/ext/db/__init__.py", line 1442, in from_entity
return cls(None, _from_entity=entity, **entity_values)
File "/base/python27_runtime/python27_lib/versions/1/google/appengine/ext/db/__init__.py", line 958, in __init__
if isinstance(_from_entity, datastore.Entity) and _from_entity.is_saved():
File "/base/python27_runtime/python27_lib/versions/1/google/appengine/api/datastore.py", line 814, in is_saved
self.__key.has_id_or_name())
File "/base/python27_runtime/python27_lib/versions/1/google/appengine/api/datastore_types.py", line 565, in has_id_or_name
elems = self.__reference.path().element_list()
DeadlineExceededError
This is not an application query; I am interacting with my app through the Interactive Console, so this is not a live problem. My problem is that I have to iterate over all my application users, checking large amounts of data that I need to retrieve for each of them. I could do it one by one by hard-coding their user_id, but that would be slow and inefficient.
Can you think of any way I could do this faster? Is there any way of selecting the users five at a time, like LIMIT=5 to get only the first 5 users, then the next 5, and so on, iterating over all of them but with lighter queries? Can I set a longer timeout?
Any other way you can think of to deal with this problem?
You could use a cursor, in conjunction with a limit, to pick up your search where you left off:
Returns a base64-encoded cursor string denoting the position in the query's result set following the last result retrieved. The cursor string is safe to use in HTTP GET and POST parameters, and can also be stored in the Datastore or Memcache. A future invocation of the same query can provide this string via the start_cursor parameter or the with_cursor() method to resume retrieving results from this position.
https://developers.google.com/appengine/docs/python/datastore/queryclass#Query_cursor
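A sketch of what that looks like with the old db API (it assumes a User db.Model and a per-user process() function of your own):

from google.appengine.ext import db

def process_all_users(batch_size=5):
    query = User.all()                  # your existing model
    cursor = None
    while True:
        if cursor:
            query.with_cursor(cursor)   # resume where the last batch ended
        users = query.fetch(limit=batch_size)
        if not users:
            break
        for user in users:
            process(user)               # your per-user work (hypothetical)
        cursor = query.cursor()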
I'd write a simple request handler to do the task.
Either write it in a way that it can be run on mapreduce, or launch a backend to run your handler.
First, fetching your entities in batches will significantly reduce your application's communication time with the datastore. For details on this, take a look at 10 things you (probably) didn't know about App Engine.
Then, you can assign this procedure to Task Queues, which enable you to execute tasks for up to 10 minutes (see the sketch below). For more information on Task Queues, take a look at The Task Queue Python API.
Finally, for tasks that need more time you can also consider the use of Backends. For more information you can take a look at Backends (Python).
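Combining the cursor approach with the task queue, a rough sketch of chaining tasks (the /tasks/process_users route and the process() function are hypothetical, and User is your own db.Model):

from google.appengine.api import taskqueue
from google.appengine.ext import db
import webapp2

class ProcessUsersTask(webapp2.RequestHandler):
    def post(self):
        query = User.all()
        cursor = self.request.get('cursor')
        if cursor:
            query.with_cursor(cursor)   # resume from the previous task's batch
        users = query.fetch(limit=100)
        if users:
            for user in users:
                process(user)           # your per-user work
            # Re-enqueue to continue where this batch ended; each task stays
            # far below the 10-minute task queue deadline.
            taskqueue.add(url='/tasks/process_users',
                          params={'cursor': query.cursor()})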
Hope this helps.