Appengine Search API - InvalidRequest - python

I'm using Search API (https://cloud.google.com/appengine/docs/standard/python/search/)
on Google App Engine Python Standard Environment.
In my app I have several search indexes and query them with cursor pagination (offset-based paging can't return more than 1000 results):
https://cloud.google.com/appengine/docs/standard/python/search/cursorclass
Sometimes I randomly get the following exception:
InvalidRequest: Failed to execute search request "<search query>"
with no further details. If it happens in the middle of a paginated query, it reproduces forever with that cursor.
The item set is immutable between requests (no items are added or deleted).
The first page of the query is always OK (the exception occurs only when I query with a cursor).
I'm not using any sorting.
The cursors are exactly the same as those received from the previous query and are not corrupted between requests.
Any ideas how to solve this?
related:
https://issuetracker.google.com/issues/35898069
https://issuetracker.google.com/issues/35895008
https://groups.google.com/forum/#!topic/google-appengine/tBHkZLeYTOI
What does this error message mean in appengine?

Finally, we were able to find out what was causing the random errors:
Make a first request with "A=1 AND B=2" and receive <cursor>.
Make a second request with <cursor> and "A=1 AND B=2" - works OK.
Make a second request with <cursor> and "B=2 AND A=1" - the same request, but the parameter order does not match the original query - InvalidRequest with no explanation. A minimal sketch of the working pattern follows below.
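Here is a minimal sketch of the working pattern, assuming a hypothetical index named "items"; the key point is reusing the query string byte-for-byte on every page:

from google.appengine.api import search

index = search.Index(name="items")   # hypothetical index name
query_string = "A=1 AND B=2"         # must stay identical on every page

def fetch_page(cursor=None):
    # Reordering the clauses (e.g. "B=2 AND A=1") made the cursor fail
    # with InvalidRequest, so we always pass the original string.
    options = search.QueryOptions(limit=100, cursor=cursor or search.Cursor())
    results = index.search(search.Query(query_string=query_string, options=options))
    return results, results.cursor

results, cursor = fetch_page()
while cursor:
    results, cursor = fetch_page(cursor)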

Related

Flask and best way to reduce mysql queries, maybe celery?

After researching for three days and playing with Redis and Celery, I'm no longer sure what the right solution to my problem is.
It's a simple problem: I have a simple Flask app returning the data of a MySQL query, but I don't want to query the database for every request made, as there might be 100 requests per second. I want to set up a daemon that queries my database independently every five seconds; if someone makes a request in between, it should return the data of the previous query, and once those 5 seconds pass it should return the data from the latest query. All users receive the same data. Is Celery the solution?
The easiest way is to use Flask-Caching.
Just set a cache timeout of 5 seconds on your view and it will return a cached response containing the result of the query made on the first request, for every other request in the next 5 seconds. When the timeout expires, the next request will regenerate the cache by running the query and the rest of your view.
If your view function takes arguments, use memoization instead of the cache decorator so that caching uses your arguments to generate the cache key. For example, if you return a detail page and you don't use memoization, you will return the same detail page to every user, regardless of the id/slug argument.
The Flask-Caching documentation explains all of this better than I can.
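A minimal sketch of both decorators, assuming a hypothetical get_data() helper that runs the MySQL query:

from flask import Flask, jsonify
from flask_caching import Cache

app = Flask(__name__)
cache = Cache(app, config={"CACHE_TYPE": "SimpleCache"})

def get_data(slug=None):
    # hypothetical helper that runs the MySQL query
    return {"slug": slug, "rows": []}

@app.route("/data")
@cache.cached(timeout=5)  # all requests within 5 seconds share one cached response
def data():
    return jsonify(get_data())

@app.route("/page/<slug>")
@cache.memoize(timeout=5)  # the cache key includes the view's arguments (slug)
def page(slug):
    return jsonify(get_data(slug))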

How to page through results using Shopify Python API wrapper

I want to page through the results from the Shopify API using the Python wrapper. The API recently (2019-07) switched to "cursor-based pagination", so I cannot just pass a "page" query parameter to get the next set of results.
The Shopify API docs have a page dedicated to cursor-based pagination.
The API response supposedly includes a Link response header with the information needed to make the next request, but I cannot figure out how to access it. As far as I can tell, the response from the wrapper is a standard Python list that has no headers.
I think I could make this work without using the python API wrapper, but there must be an easy way to get the next set of results.
import shopify

shopify.ShopifyResource.set_site("https://example-store.myshopify.com/admin/api/2019-07")
shopify.ShopifyResource.set_user(API_KEY)
shopify.ShopifyResource.set_password(PASSWORD)

products = shopify.Product.find(limit=5)

# This works fine
for product in products:
    print(product.title)

# None of these work for accessing the headers referenced in the docs
print(products.headers)
print(products.link)
print(products['headers'])
print(products['link'])

# This throws an error saying that "page" is not an acceptable parameter
products = shopify.Product.find(limit=5, page=2)
Can anyone provide an example of how to get the next page of results using the wrapper?
As mentioned by @babis21, this was a bug in the Shopify Python API wrapper. The library was updated in January 2020 to fix it.
For anyone stumbling upon this, here is an easy way to page through all results. The same pattern works for other API objects such as Products.
orders = shopify.Order.find(since_id=0, status='any', limit=250)
for order in orders:
    # Do something with the order
    ...

while orders.has_next_page():
    orders = orders.next_page()
    for order in orders:
        # Do something with the remaining orders
        ...
Using since_id=0 will fetch ALL orders because order IDs are guaranteed to be greater than 0.
If you don't want to repeat the code that processes the order objects, you can wrap it all in an iterator like this:
def iter_all_orders(status='any', limit=250):
    orders = shopify.Order.find(since_id=0, status=status, limit=limit)
    for order in orders:
        yield order

    while orders.has_next_page():
        orders = orders.next_page()
        for order in orders:
            yield order

for order in iter_all_orders():
    # Do something with each order
    ...
If you are fetching a large number of orders or other objects (for offline analysis like I was), you will find that this is slow compared to your other options. The GraphQL API is faster than the REST API, but performing bulk operations with the GraphQL API was by far the most efficient.
You can find the response headers with the code below:
resp_header = shopify.ShopifyResource.connection.response.headers["link"]
Then you can split the Link header string on ',', strip the '<' and '>' around each URL, and take the one marked rel="next" as the next-page URL.
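A rough sketch of that parsing, assuming a Link header in the usual '<url>; rel="next"' format and that a request has already been made through the wrapper:

def extract_next_url(link_header):
    # e.g. '<https://...page_info=abc>; rel="previous", <https://...page_info=def>; rel="next"'
    for part in link_header.split(','):
        url, _, rel = part.partition(';')
        if 'rel="next"' in rel:
            return url.strip().strip('<>')
    return None

link = shopify.ShopifyResource.connection.response.headers.get("link", "")
next_url = extract_next_url(link)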
I am not familiar with Python, but I think it will work. You can also review these links:
https://community.shopify.com/c/Shopify-APIs-SDKs/Python-API-library-for-shopify/td-p/529523
https://community.shopify.com/c/Shopify-APIs-SDKs/Trouble-with-pagination-when-fetching-products-API-python/td-p/536910
Thanks
@rseabrook
I have exactly the same issue, it seems others do as well and someone has raised this: https://github.com/Shopify/shopify_python_api/issues/337
where I see there is an open PR for this: https://github.com/Shopify/shopify_python_api/pull/338
I guess it should be ready soon, so an alternative would be to wait a bit and use the 2019-04 API version (which supports the page parameter for pagination).
UPDATE: It seems this has been released now: https://github.com/Shopify/shopify_python_api/pull/352

flask-restless use pagination or get full response

Sometimes I want to get the full response from a resource and sometimes I want it paginated. Until now I was only able to use one or the other.
Isn't there a way to make Flask-Restless do both, depending on the parameters I pass in the GET request?
If I want to disable pagination for a specific resource I change the settings like this:
manager.create_api(someresource, methods=['GET'], results_per_page=None)
But now pagination is completely disabled, and that's not the behaviour I want.
And if pagination is enabled as the default, it returns only the first page.
Isn't there a way to tell Flask-Restless to return only the first page if I specifically pass page=1 in the query string, like so:
GET http://someaddress/resource?page=1
I was actually able to work around the problem using a loop, but I don't think it is a nice solution because I have to make multiple requests.
I requested the resource, read total_pages, and then looped up to total_pages, passing each iteration as the page argument in the query string of a new request to fetch each page:
i = 1
while i <= response.total_pages:
    page_response = requests.get("http://someurl/someresource?page=" + str(i))
    ...
    i += 1
But I don't think this is a nice way to solve the issue. If there is a way to change the settings on Flask-Restless so it fetches only the first page when it is passed as an argument in the query string, I would be more than happy, but if there is another way to get both behaviours, that is also fine.
You can get the behaviour you want by disabling pagination with:
manager.create_api(someresource, methods=['GET'], results_per_page=0)
And then query the API with the results_per_page parameter like so:
GET http://someaddress/resource?results_per_page=2
The results_per_page parameter has to be a positive integer and will be your new page size. The parameter is documented further here.
Getting the full response without pagination is straightforward with this configuration. Just omit the results_per_page parameter:
GET http://someaddress/resource
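A small client-side sketch of both calls, assuming the requests library and the configuration above:

import requests

BASE = "http://someaddress/resource"

# Paginated: the results_per_page query parameter acts as the page size
page = requests.get(BASE, params={"results_per_page": 2}).json()

# Full response: omit results_per_page (the API was created with results_per_page=0)
everything = requests.get(BASE).json()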

Send form data to MongoDB using Flask

I'm having trouble sending my form data to MongoDB with a Flask setup. The form looks something like this: https://jsfiddle.net/8gqLtv7e
On the client side, I see no errors. But when I submit the form, I receive a 500 Internal Server Error, and I'm having a hard time finding the solution. The problem is the last line of the view below, in my views.py file:
@app.route('/recordReport', methods=['POST'])
def recordReport():
    homeReportsCollection = db["reports"]
    address = request.form.get('address')
    rgbImage = request.form.get('rgb')
    homeReportsCollection.insert({'address': address, 'rgb': rgbImage})
Because if I replace that line with return json.dumps({'status': 'OK', 'address': 'address', 'rgb': 'rgbImage'}), I can see the correct data in my browser. I'm just not able to send it to a collection in MongoDB.
This answer is a summary of the comments (where a solution was found).
Have you tried typecasting address and rgbImage to String before inserting?
Type mismatches are the root of many common bugs in DB operations.
There used to be a bug in Mongo back in 2013: the data would be inserted into the collection, but Mongo would not return a correct status response, which led to servers returning 500. Have you verified that the data was indeed inserted into the collection?
Additionally, run your Flask app in debug=True mode. That might give additional information.
Flask has very good debug traceback support. This is generally a good idea; in fact, it should be the first thing to do when encountering an error.
So this is weird: I turned on debug=True and I get the following error: ValueError: View function did not return a response. BUT the data did actually get sent to the DB via the homeReportsCollection.insert({'address': address, 'rgb': rgbImage}) line; I can see it in my collection. How do I fix the error? Because the user is redirected to /recordReport.
So the data was indeed inserted into the collection. It is probably a Flask-only issue; the traceback says it all. Flask requires that something is returned by a view function.
Try returning something from your recordReport() function. Return anything you want: an OK message, or the _id of the document you just inserted into the collection. Just don't return None. Try this.
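For example, a minimal fix could look like the sketch below (the jsonify response and the success payload are just illustrative):

from flask import jsonify, request

@app.route('/recordReport', methods=['POST'])
def recordReport():
    homeReportsCollection = db["reports"]
    address = request.form.get('address')
    rgbImage = request.form.get('rgb')
    inserted_id = homeReportsCollection.insert({'address': address, 'rgb': rgbImage})
    # Returning a response (anything but None) fixes the ValueError
    return jsonify({'status': 'OK', 'id': str(inserted_id)})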
This behaviour is documented in this SO question.
Yeah, I returned an HTML template and no error now.
This is indeed the solution. Return something other than None from your Flask view functions. This also matches the behaviour observed by the asker in the question:
Because if I replace that with return json.dumps({'status':'OK', 'address':'address', 'rgb':'rgbImage'}), I can see the correct data in my browser.

AppEngine NDB Query return different Results

I have a query in my live app that has gone "odd"...
Running the 1.8.4 SDK on a 1.8.5 live instance, using Python 2.7.
Measurement is an NDB model with a string property called status and a key property called asset.
(Deep in my handler code.... )
cursor=None
limit=10
asset_key = <a key to an actual asset>
qry = Measurement.query(
    Measurement.status == 'PENDING',
    Measurement.asset == asset_key)
results, cursor, more = qry.fetch_page(page_size=limit, start_cursor=cursor)
Now the weird thing is that if I run this, sometimes I get 4 items and sometimes only 1 (the right answer is 4).
The dump of the query is exactly the same, cursor is set to None, limit is always the same, same handler, same query, and no new records in between queries. Fresh instance (e.g. first run, no other users).
The queries are only seconds apart, yet the results differ.
Am I missing something here? Has anyone else experienced this? Is it some sort of corrupt index? (It is a relatively large "table" with 482,911 items.) Is NDB caching a cursor variable?
Very, very odd.
Queries do not look up values in any cache. However, query results are written back to the in-context cache if the cache policy says so (as per the docs): https://developers.google.com/appengine/docs/python/ndb/cache#incontext
Perhaps review the caching policy for the entity in question. However, from your snippet I can't tell whether your query is strongly consistent; that is more likely the cause of this issue: https://developers.google.com/appengine/docs/python/datastore/structuring_for_strong_consistency
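If eventual consistency is indeed the cause, one common remedy (sketched below, under the assumption that each Measurement is written with its asset's key as parent) is an ancestor query, which is strongly consistent:

# Assumes Measurement entities are created with the asset key as their parent,
# e.g. Measurement(parent=asset_key, status='PENDING', ...).put()
qry = Measurement.query(Measurement.status == 'PENDING', ancestor=asset_key)
results, cursor, more = qry.fetch_page(10)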
