OrientDB Gremlin server not working in Python

I am using OrientDB with Gremlin Server from Python. Gremlin Server starts successfully, but when I try to add a vertex to OrientDB through Gremlin code I get an error.
query = """graph.addVertex(label, "Test", "title", "abc", "title", "abc")"""
Following is the traceback:
/usr/bin/python3.6 /home/admin-12/Documents/bitbucket/ecodrone/ecodrone/test/test1.py
Traceback (most recent call last):
File "/home/admin-12/Documents/bitbucket/ecodrone/ecodrone/test/test1.py", line 27, in <module>
result = execute_query("""graph.addVertex(label, "Test", "title", "abc", "title", "abc")""")
File "/home/admin-12/Documents/bitbucket/ecodrone/ecodrone/GremlinConnector.py", line 21, in execute_query
results = future_results.result()
File "/usr/lib/python3.6/concurrent/futures/_base.py", line 432, in result
return self.__get_result()
File "/usr/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
raise self._exception
File "/home/admin-12/.local/lib/python3.6/site-packages/gremlin_python/driver/resultset.py", line 81, in cb
f.result()
File "/usr/lib/python3.6/concurrent/futures/_base.py", line 425, in result
return self.__get_result()
File "/usr/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
raise self._exception
File "/usr/lib/python3.6/concurrent/futures/thread.py", line 56, in run
result = self.fn(*self.args, **self.kwargs)
File "/home/admin-12/.local/lib/python3.6/site-packages/gremlin_python/driver/connection.py", line 77, in _receive
self._protocol.data_received(data, self._results)
File "/home/admin-12/.local/lib/python3.6/site-packages/gremlin_python/driver/protocol.py", line 106, in data_received
"{0}: {1}".format(status_code, data["status"]["message"]))
gremlin_python.driver.protocol.GremlinServerError: 599: Error during serialization: Infinite recursion (StackOverflowError) (through reference chain: com.orientechnologies.orient.core.id.ORecordId["record"]->com.orientechnologies.orient.core.record.impl.ODocument["schemaClass"]->com.orientechnologies.orient.core.metadata.schema.OClassImpl["document"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"]->com.orientechnologies.orient.core.record.impl.ODocument["owners"])
Process finished with exit code 1

First of all, I very much recommend that you do not use the Graph API to make mutations. Prefer the Traversal API for that and do:
g.addV('Test').
property('title1', 'abc').
property('title2', 'abc')
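Since the original question submits Gremlin from Python, a minimal gremlin_python sketch of the same mutation might look like this (the endpoint and the traversal source name 'g' are assumptions; adjust them to your Gremlin Server configuration):
from gremlin_python.structure.graph import Graph
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection

# Assumed WebSocket endpoint and traversal source name; match these to your server's config
connection = DriverRemoteConnection('ws://localhost:8182/gremlin', 'g')
g = Graph().traversal().withRemote(connection)

# Same mutation as above, expressed through the Traversal API
v = g.addV('Test').property('title1', 'abc').property('title2', 'abc').next()
connection.close()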
Second, I think that your error is occurring because you are returning a Vertex which contains an ORecordId (the vertex identifier), and Gremlin Server doesn't know how to handle that. I don't know if OrientDB has serializers built to handle that, but if they do then you would want to add them to the Gremlin Server configuration, which is described in a bit more detail here - basically, you would want to know if OrientDB exposes a TinkerPop IORegistry for all their custom classes that might be sent back over the wire.
If they do not, then you would want to avoid returning those or convert them yourself. TinkerPop already recommends that you not return full Vertex objects and only return the data that you need. So, rather than g.V() you would want to convert that Vertex into a Map with g.V().valueMap('title') or something similar (perhaps use the project() step). If you definitely need the vertex identifier then you would need to convert it to something the TinkerPop serializers understand. That might mean something as simple as:
g.V().has("title1","abc").id().next().toString()
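In gremlin_python, a hedged sketch of that pattern (the property keys are the ones used above and are assumptions about your schema):
# Return plain maps keyed by property name, which serialize cleanly over the wire
titles = g.V().hasLabel('Test').valueMap('title1', 'title2').toList()

# If you only need the identifier, convert it to a string on the client side
vid = str(g.V().has('title1', 'abc').id().next())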

Related

Removing an entire tree in Neptune using Gremlin

I'm new to using Gremlin + Neptune. I'm writing a small Python program to clean up some edges + nodes in a Neptune DB. To start, I want to delete all nodes + edges rooted at a particular node.
After looking around, I came across How to delete all child nodes when parent node deleted using the gremlin query?, and I tried the following:
g.V().has("id", <insert id>).union(__(), __.repeat(__.out()).emit()).drop()
However, the node with "id" still exists in the DB. When I try running the above command straight on the python3 console, I get the following output:
>>> g.V().has("id", <insert id>).union(__(), __.repeat(__.out()).emit()).drop()
[['V'], ['has', 'id', '5c4266a3a44649ddb24ce6cf4f831300'], ['union', <gremlin_python.process.graph_traversal.__ object at 0x7fa51c967fd0>, [['repeat', [['out']]], ['emit']]], ['drop']]
What's going on here? It almost looks like it's listing out the steps it plans to take, but not executing them.
Edit: Following the answer below, I added .iterate() to the end. I now get this error:
Traceback (most recent call last):
...
tg.g.V().has("id", <id>).union(__(), __.repeat(__.out()).emit()).drop().iterate()
File "/home/vagrant/.cache/bazel/_bazel_vagrant/f9910d98673307a31f928c448bd4acd0/execroot/project/bazel-out/k8-fastbuild/bin/path/to/directory/python_code.runfiles/pip_deps/pypi__gremlinpython/gremlin_python/process/traversal.py", line 66, in iterate
try: self.nextTraverser()
File "/home/vagrant/.cache/bazel/_bazel_vagrant/f9910d98673307a31f928c448bd4acd0/execroot/project/bazel-out/k8-fastbuild/bin/path/to/directory/python_code.runfiles/pip_deps/pypi__gremlinpython/gremlin_python/process/traversal.py", line 71, in nextTraverser
self.traversal_strategies.apply_strategies(self)
File "/home/vagrant/.cache/bazel/_bazel_vagrant/f9910d98673307a31f928c448bd4acd0/execroot/project/bazel-out/k8-fastbuild/bin/path/to/directory/python_code.runfiles/pip_deps/pypi__gremlinpython/gremlin_python/process/traversal.py", line 573, in apply_strategies
traversal_strategy.apply(traversal)
File "/home/vagrant/.cache/bazel/_bazel_vagrant/f9910d98673307a31f928c448bd4acd0/execroot/project/bazel-out/k8-fastbuild/bin/path/to/directory/python_code.runfiles/pip_deps/pypi__gremlinpython/gremlin_python/driver/remote_connection.py", line 149, in apply
remote_traversal = self.remote_connection.submit(traversal.bytecode)
File "/home/vagrant/.cache/bazel/_bazel_vagrant/f9910d98673307a31f928c448bd4acd0/execroot/project/bazel-out/k8-fastbuild/bin/path/to/directory/python_code.runfiles/pip_deps/pypi__gremlinpython/gremlin_python/driver/driver_remote_connection.py", line 55, in submit
result_set = self._client.submit(bytecode)
File "/home/vagrant/.cache/bazel/_bazel_vagrant/f9910d98673307a31f928c448bd4acd0/execroot/project/bazel-out/k8-fastbuild/bin/path/to/directory/python_code.runfiles/pip_deps/pypi__gremlinpython/gremlin_python/driver/client.py", line 111, in submit
return self.submitAsync(message, bindings=bindings).result()
File "/usr/lib/python3.7/concurrent/futures/_base.py", line 428, in result
return self.__get_result()
File "/usr/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
raise self._exception
File "/home/vagrant/.cache/bazel/_bazel_vagrant/f9910d98673307a31f928c448bd4acd0/execroot/project/bazel-out/k8-fastbuild/bin/path/to/directory/python_code.runfiles/pip_deps/pypi__gremlinpython/gremlin_python/driver/connection.py", line 66, in cb
f.result()
File "/usr/lib/python3.7/concurrent/futures/_base.py", line 428, in result
return self.__get_result()
File "/usr/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
raise self._exception
File "/usr/lib/python3.7/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "/home/vagrant/.cache/bazel/_bazel_vagrant/f9910d98673307a31f928c448bd4acd0/execroot/project/bazel-out/k8-fastbuild/bin/path/to/directory/python_code.runfiles/pip_deps/pypi__gremlinpython/gremlin_python/driver/protocol.py", line 74, in write
request_id, request_message)
File "/home/vagrant/.cache/bazel/_bazel_vagrant/f9910d98673307a31f928c448bd4acd0/execroot/project/bazel-out/k8-fastbuild/bin/path/to/directory/python_code.runfiles/pip_deps/pypi__gremlinpython/gremlin_python/driver/serializer.py", line 132, in serialize_message
message = self.build_message(request_id, processor, op, args)
File "/home/vagrant/.cache/bazel/_bazel_vagrant/f9910d98673307a31f928c448bd4acd0/execroot/project/bazel-out/k8-fastbuild/bin/path/to/directory/python_code.runfiles/pip_deps/pypi__gremlinpython/gremlin_python/driver/serializer.py", line 142, in build_message
return self.finalize_message(message, b"\x21", self.version)
File "/home/vagrant/.cache/bazel/_bazel_vagrant/f9910d98673307a31f928c448bd4acd0/execroot/project/bazel-out/k8-fastbuild/bin/path/to/directory/python_code.runfiles/pip_deps/pypi__gremlinpython/gremlin_python/driver/serializer.py", line 145, in finalize_message
message = json.dumps(message)
File "/usr/lib/python3.7/json/__init__.py", line 231, in dumps
return _default_encoder.encode(obj)
File "/usr/lib/python3.7/json/encoder.py", line 199, in encode
chunks = self.iterencode(o, _one_shot=True)
File "/usr/lib/python3.7/json/encoder.py", line 257, in iterencode
return _iterencode(o, 0)
File "/usr/lib/python3.7/json/encoder.py", line 179, in default
raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type __ is not JSON serializable
Unlike the Gremlin Console, the Python console doesn't automatically iterate your traversals for you. Just as in your application code, you must iterate them yourself. When you do
g.V().has("id", <insert id>).union(__.identity(), __.repeat(__.out()).emit()).drop()
that simply spawns a Traversal object but does not execute it. You must therefore iterate it in some way to exhaust the elements within it - in your case, the appropriate terminator to use would be iterate(). Note also that the anonymous child traversal should be spelled __.identity() rather than __(), which is what produces the "Object of type __ is not JSON serializable" error shown in your edit:
g.V().has("id", <insert id>).union(__.identity(), __.repeat(__.out()).emit()).drop().iterate()
The semantics around drops in TinkerPop aren't always consistent, unfortunately. TinkerPop tried to preserve flexibility for providers in how they implement that, but it sometimes causes confusion because a query will work fine in TinkerGraph and then behave slightly differently when executed on a different provider. If the above approach only drops the root, you could try to realize the results before the drop:
g.V().has("id", <insert id>).
union(__.identity(),
__.repeat(__.out()).emit()).
fold().unfold().
drop()
It looks a bit dumb, but it forces all the vertices you wish to drop to be traversed into a list before they are dropped. That way you won't kill the repeat() by dropping the parent and its edges first and leaving the child paths untraversed.
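Putting that together in gremlin_python, a sketch might look like the following (the "id" property value is the one from your console output; the fold().unfold() barrier is the workaround described above rather than guaranteed Neptune behavior):
from gremlin_python.process.graph_traversal import __

(g.V().has('id', '5c4266a3a44649ddb24ce6cf4f831300')
  .union(__.identity(),
         __.repeat(__.out()).emit())
  .fold().unfold()
  .drop()
  .iterate())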

Neptune InternalFailureException: Can not get the attachable from the host vertex

I am using Neptune's graph database with Gremlin queries through Python to store addresses in a database. Most of the queries execute fine, but once I try the following query, Neptune returns an internal failure exception:
g.V(address).outE('isPartOf').inV().
dedup().as_('groupNode').
inE('isPartOf').outV().dedup().as_('children').
addE('isPartOf').to(group).
select('groupNode').drop().
fold().
coalesce(__.unfold(),
g.V(address).addE('isPartOf').to(group)).next()
Every address has the possibility to belong to a group. When the address is already assigned to a group, I try to take all addresses assigned to that group and assign them to a new group, while deleting the old group. If the address is not yet assigned to a group, I simply want to assign the address to the new group immediately.
If I try this query on its own, everything executes perfectly (although it is a bit of a slow query). However, once I try to execute this query in parallel on more addresses, it fails with the following error:
Traceback (most recent call last):
File "/usr/lib64/python2.7/threading.py", line 804, in __bootstrap_inner
self.run()
File "gremlinExample.py", line 30, in run
processTx(self.tx, self.g, self.parentBlock)
File "gremlinExample.py", line 152, in processTx
g.V(address).outE('isPartOf').inV().dedup().as_('groupNode').inE('isPartOf').outV().dedup().as_('children').select('children').addE('isPartOf').to(group).select('groupNode').drop().fold().coalesce(__.unfold(), g.V(address).addE('isPartOf').to(group)).next()
File "/home/ec2-user/.local/lib/python2.7/site-packages/gremlin_python/process/traversal.py", line 70, in next
return self.__next__()
File "/home/ec2-user/.local/lib/python2.7/site-packages/gremlin_python/process/traversal.py", line 43, in __next__
self.traversal_strategies.apply_strategies(self)
File "/home/ec2-user/.local/lib/python2.7/site-packages/gremlin_python/process/traversal.py", line 346, in apply_strategies
traversal_strategy.apply(traversal)
File "/home/ec2-user/.local/lib/python2.7/site-packages/gremlin_python/driver/remote_connection.py", line 143, in apply
remote_traversal = self.remote_connection.submit(traversal.bytecode)
File "/home/ec2-user/.local/lib/python2.7/site-packages/gremlin_python/driver/driver_remote_connection.py", line 54, in submit
results = result_set.all().result()
File "/usr/lib/python2.7/site-packages/concurrent/futures/_base.py", line 405, in result
return self.__get_result()
File "/usr/lib/python2.7/site-packages/concurrent/futures/_base.py", line 357, in __get_result
raise type(self._exception), self._exception, self._traceback
GremlinServerError: 500: {"requestId":"a42015b7-6b22-4bd1-9e7d-e3252e8f3ab6","code":"InternalFailureException","detailedMessage":"Can not get the attachable from the host vertex: v[64b32957-ef71-be47-c8d7-0109cfc4d9fd]-/->neptunegraph[org.apache.commons.configuration.PropertiesConfiguration#6db0f02]"}
To my knowledge, execution in parallel shouldn't be the problem, since every query simply gets queued at the database (exactly for this reason I tried to create a query which executes the whole task at once).
Apologies for any bad English; it is not my native language.
For anyone else who's looking for an update here - the OP was able to resolve the issue by replacing .next() with .iterate(). Some follow-ups were needed to understand the query and data better, but the OP has abandoned the project and moved to another solution.
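In other words, the only change was the terminal step. A sketch of the adjusted call, keeping the OP's traversal as quoted above (address and group are assumed to be bound already):
from gremlin_python.process.graph_traversal import __

(g.V(address).outE('isPartOf').inV()
  .dedup().as_('groupNode')
  .inE('isPartOf').outV().dedup().as_('children')
  .addE('isPartOf').to(group)
  .select('groupNode').drop()
  .fold()
  .coalesce(__.unfold(),
            g.V(address).addE('isPartOf').to(group))
  .iterate())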

Understanding Dask Distributed behavior with regards to DataFrame operations

I would like to better understand how dask.distributed works. I have a simple CSV that I read into a Dask dataframe as shown below. This operation executes fine and returns some integer value representing the length of the dataframe, which is the behavior I would expect.
import dask.dataframe as dd
gdf = dd.read_csv(filepath)
len(gdf)
# returns some int value
But once I introduce an instance of Client from dask.distributed, I receive the following error:
distributed.utils - ERROR - 'LocalFileSystem' object has no attribute 'cwd'
Here is an example code block:
from dask.distributed import Client
import dask.dataframe as dd
client_db = Client(remote_addr)
gdf = dd.read_csv(filepath)
len(gdf)
# throws the above error
I'm confused - does Client "inject itself" into all Dask operations once it has been instantiated? I thought I would need to do something like gdf = client_db.persist(gdf) to ask that Client connection to manage operations on that dataframe.
Some context on what's happening here would be much appreciated! I can see from the traceback that it has to do with Tornado, which is a Python web framework that allows for web sockets, long polling, etc. I assume that it is attempting to store something... somewhere... but my familiarity drops off here.
If needed, the traceback:
Traceback (most recent call last):
File "/.../geopandas_opt/venv/lib/python3.6/site-packages/distributed/utils.py", line 223, in f
result[0] = yield make_coro()
File "/.../geopandas_opt/venv/lib/python3.6/site-packages/tornado/gen.py", line 1015, in run
value = future.result()
File "/.../geopandas_opt/venv/lib/python3.6/site-packages/tornado/concurrent.py", line 237, in result
raise_exc_info(self._exc_info)
File "<string>", line 3, in raise_exc_info
File "/.../geopandas_opt/venv/lib/python3.6/site-packages/tornado/gen.py", line 1021, in run
yielded = self.gen.throw(*exc_info)
File "/.../geopandas_opt/venv/lib/python3.6/site-packages/distributed/client.py", line 1156, in _gather
traceback)
File "/.../geopandas_opt/venv/lib/python3.6/site-packages/six.py", line 685, in reraise
raise value.with_traceback(tb)
File "/usr/local/lib/python3.5/site-packages/dask/bytes/core.py", line 212, in read_block_from_file
File "/usr/local/lib/python3.5/site-packages/dask/bytes/core.py", line 314, in __enter__
File "/usr/local/lib/python3.5/site-packages/dask/bytes/local.py", line 64, in open
File "/usr/local/lib/python3.5/site-packages/dask/bytes/local.py", line 36, in _trim_filename
AttributeError: 'LocalFileSystem' object has no attribute 'cwd'
Yes, when you create a Client it registers itself as the default global scheduler. You can avoid this behavior with the set_as_default= keyword argument:
client = Client(..., set_as_default=False)
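With set_as_default=False you then direct work at that cluster explicitly. A minimal sketch, assuming the same gdf as above:
# Route this computation through the explicit client rather than the default scheduler;
# map_partitions(len).sum() computes the total row count as a lazy scalar
nrows = client.compute(gdf.map_partitions(len).sum()).result()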
Regarding the exception you've run into, I suspect that it is a version mismatch between your client and the workers. You might want to upgrade with either conda or pip:
conda install dask distributed
or
pip install dask distributed
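If you want to confirm whether versions line up before reinstalling, distributed can compare them for you. A small sketch, reusing the remote_addr from the question:
from dask.distributed import Client

client = Client(remote_addr)
# With check=True this raises an error (and reports the mismatches) if package
# versions differ across the client, scheduler, and workers
client.get_versions(check=True)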

ndb.get_multi returning AssertionError

Over the last 48 hours or so my small Python GAE app has started getting AssertionErrors from ndb.get_multi calls.
The full traceback is appended. The errors are being generated on the production server in _BaseValue's __init__ on line 734 of /base/data/.../ndb/model.py, and the failing assertion is b_val is not None with the message "Cannot wrap None".
The error doesn't appear to be related to a particular entity or entities, but I've only seen it with one entity type so far (yet to test others).
The get_multi call is for only up to a dozen keys, and the error is intermittent so that repeating it will sometimes succeed. Or not...
I'm not seeing this error via the remote shell, but I note that my local install is 1.9.23 while the log entry says the production server is 1.9.25 (GoogleAppEngineLauncher says my local install is up to date).
I'm adding a workaround to catch the exception and iterate through the keys to get them individually, but I'm still seeing an upstream warning about a "suspended generator get" on line 744 of context.py.
The warning appears on the first get of this entity type from the list, for at least 2 different lists of keys (as well as preceding the AssertionError).
I don't want to have to wrap all get_multi calls in this way.
What's going on?
TRACEBACK:
Cannot wrap None
Traceback (most recent call last):
File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 570, in dispatch
return method(*args, **kwargs)
File "/base/data/home/apps/s~thegapnetball/115.386356111937586421/handlers/assess.py", line 50, in get
rs = ndb.get_multi(t.players)
File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/model.py", line 3905, in get_multi
for future in get_multi_async(keys, **ctx_options)]
File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/tasklets.py", line 326, in get_result
self.check_success()
File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/tasklets.py", line 372, in _help_tasklet_along
value = gen.send(val)
File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/context.py", line 751, in get
pbs = entity._to_pb(set_key=False).SerializePartialToString()
File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/model.py", line 3147, in _to_pb
prop._serialize(self, pb, projection=self._projection)
File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/model.py", line 2379, in _serialize
projection=projection)
File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/model.py", line 1405, in _serialize
values = self._get_base_value_unwrapped_as_list(entity)
File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/model.py", line 1175, in _get_base_value_unwrapped_as_list
wrapped = self._get_base_value(entity)
File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/model.py", line 1163, in _get_base_value
return self._apply_to_values(entity, self._opt_call_to_base_type)
File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/model.py", line 1335, in _apply_to_values
value[:] = map(function, value)
File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/model.py", line 1217, in _opt_call_to_base_type
value = _BaseValue(self._call_to_base_type(value))
File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/model.py", line 734, in __init__
assert b_val is not None, "Cannot wrap None"
AssertionError: Cannot wrap None
Tim Hoffman and Patrick Costello put me on the right track to solve this.
I incremented the app version to protect some changes, but they took longer to finish than I expected.
One change added a repeated StructuredProperty to a model derived from ndb.Model, and I put several entities with the extra property (about 30 out of 1100 total).
The previous version without the extra property was still the default and was being lightly used, so the entities became just inconsistent enough to produce the intermittent AssertionError.
The main lesson is to take note of the recommendations in Google's schema update article, particularly changing the underlying parent to Expando and/or disabling datastore edits until any migration is complete.
https://cloud.google.com/appengine/articles/update_schema
The fix was to add the property to the previous version, get all the entities and then put them.
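A rough sketch of that kind of re-put pass using ndb (the model class passed in is whichever kind produced the error; the batch size is arbitrary):
from google.appengine.ext import ndb

def reput_all(model_cls, batch_size=100):
    # Fetch every entity of the kind and write it back, so entities stored under
    # the old schema pick up the newly added property
    cursor = None
    more = True
    while more:
        entities, cursor, more = model_cls.query().fetch_page(
            batch_size, start_cursor=cursor)
        if entities:
            ndb.put_multi(entities)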
Thanks Tim and Patrick for the pointer!

Putting models in a callback function from ctypes library

I am trying to set up an application based on Google App Engine using the Managed VM feature.
I am using a shared library written in C++, loaded via ctypes:
cdll.LoadLibrary('./mylib.so')
which registers a callback function
CB_FUNC_TYPE = CFUNCTYPE(None, eSubscriptionType)
cbFuncType = CB_FUNC_TYPE(scrptCallbackHandler)
in which I want to save data to the ndb datastore:
def scrptCallbackHandler(arg):
    model = Model(name=str(arg.data))
    model.put()
I am registering a callback function in which I want to take the data from the C++ program and put it in the ndb datastore. This results in an error. On the dev server it behaves slightly differently; the following is from a production server:
suspended generator _put_tasklet(context.py:343) raised BadRequestError(Application Id (app) format is invalid: '_')
LOG 2 1429698464071045 suspended generator put(context.py:810) raised BadRequestError(Application Id (app) format is invalid: '_')
Traceback (most recent call last):
File "_ctypes/callbacks.c", line 314, in 'calling callback function'
File "/home/vmagent/app/isw_cloud_client.py", line 343, in scrptCallbackHandler
node.put()
File "/home/vmagent/python_vm_runtime/google/appengine/ext/ndb/model.py", line 3380, in _put
return self._put_async(**ctx_options).get_result()
File "/home/vmagent/python_vm_runtime/google/appengine/ext/ndb/tasklets.py", line 325, in get_result
self.check_success()
File "/home/vmagent/python_vm_runtime/google/appengine/ext/ndb/tasklets.py", line 368, in _help_tasklet_along
value = gen.throw(exc.__class__, exc, tb)
File "/home/vmagent/python_vm_runtime/google/appengine/ext/ndb/context.py", line 810, in put
key = yield self._put_batcher.add(entity, options)
File "/home/vmagent/python_vm_runtime/google/appengine/ext/ndb/tasklets.py", line 368, in _help_tasklet_along
value = gen.throw(exc.__class__, exc, tb)
File "/home/vmagent/python_vm_runtime/google/appengine/ext/ndb/context.py", line 343, in _put_tasklet
keys = yield self._conn.async_put(options, datastore_entities)
File "/home/vmagent/python_vm_runtime/google/appengine/ext/ndb/tasklets.py", line 454, in _on_rpc_completion
result = rpc.get_result()
File "/home/vmagent/python_vm_runtime/google/appengine/api/apiproxy_stub_map.py", line 613, in get_result
return self.__get_result_hook(self)
File "/home/vmagent/python_vm_runtime/google/appengine/datastore/datastore_rpc.py", line 1827, in __put_hook
self.check_rpc_success(rpc)
File "/home/vmagent/python_vm_runtime/google/appengine/datastore/datastore_rpc.py", line 1342, in check_rpc_success
raise _ToDatastoreError(err)
google.appengine.api.datastore_errors.BadRequestError: Application Id (app) format is invalid: '_'
The start of the C++ program is triggered by a call to a request handler, but it runs in the background and accepts incoming data which should be processed in the callback.
Update: As Tim already pointed out, it seems that the context of the WSGI handler is lost. Most likely the solution here would be to create the application context somehow.
I am only guessing at what my problem was, but I want to describe what I did to solve it.
The execution context of the callback function is somewhat different from the rest of the Python application. Any asynchronous operation in the callback fails: I tried doing an HTTP call and saving to the datastore, but the operations never finish, and after 60 s the application reports that they crashed. I guess this comes down to how Python manages the execution and the corresponding memory allocation.
I was able to execute the callback in an object's context by wrapping it in a closure within a class. This wasn't really the problem, but the solution can be found in this answer: How can I get methods to work as callbacks with python ctypes?
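For reference, a rough sketch of that closure-within-a-class wrapping (registerCallback is a hypothetical registration function on the library, and c_int stands in for the eSubscriptionType argument used above):
from ctypes import cdll, CFUNCTYPE, c_int

class CallbackWrapper(object):
    def __init__(self, lib_path='./mylib.so'):
        self._lib = cdll.LoadLibrary(lib_path)
        # The closure binds self; keeping a reference on the instance prevents the
        # CFUNCTYPE object from being garbage-collected while C still holds the pointer
        def _handler(arg):
            self.on_data(arg)
        self._cb = CFUNCTYPE(None, c_int)(_handler)
        self._lib.registerCallback(self._cb)  # hypothetical registration call

    def on_data(self, arg):
        # Hand the work off to a background thread / cloud endpoint as described below
        pass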
For my solution I am now using a combination of Cloud Endpoints on another module and background threads on the ctypes module.
Within the C callback I start a background thread, which is able to do asynchronous work:
# Start a background thread using the background thread service from GAE
background_thread.start_new_background_thread(putData, [name, value])
And here is the simple task it executes:
# Here i call my cloud-endpoints
def putData(name, value):
    body = {
        'name': name,
        'value': int(value)
    }
    res = service.objects().create(body=body).execute()
Of course I need to do error handling and additional stuff, but for me this is a good solution.
Note: Adding models to the datastore in the background thread failed because the environment in the background thread is different from the application's and the app ID was not set.
