I am trying to use the Krakenex Python library to query the order book for multiple currency pairs at once. When I do it for a single currency pair it works, like this:
con = krakenex.API()
con.load_key('kraken.key')
con.query_public('Depth', {'pair':'GNOETH'})
However, if I do:
con = krakenex.API()
con.load_key('kraken.key')
con.query_public('Depth', {'pair':['GNOETH', 'GNOEUR']})
I get {'error': ['EQuery:Unknown asset pair']}. I assume that the syntax is incorrect but can't figure out the correct one. This is the first time I've used an API, and the examples provided don't cover enough info yet.
Unfortunately you cannot query the Depth of multiple asset pairs with a single request. I asked Kraken's support the same question: their reason for not allowing it is the high computational cost.
By contrast, querying e.g. the AssetPairs endpoint in the same manner works.
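If you just need the order books for several pairs, a simple workaround is to issue one Depth request per pair and merge the results. Here is a minimal sketch based on the single-pair call from the question (the pair list is only an example):
import krakenex

con = krakenex.API()
con.load_key('kraken.key')  # only needed for private endpoints, kept here to match the question

# Query each pair separately and collect the order books in one dict.
pairs = ['GNOETH', 'GNOEUR']
order_books = {}
for pair in pairs:
    resp = con.query_public('Depth', {'pair': pair})
    if resp.get('error'):
        print('Error for %s: %s' % (pair, resp['error']))
        continue
    order_books.update(resp['result'])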
Spent a lot of time trying different combos, finally figured it out.
try con.query_public('Depth', {'pair':'GNOETH, GNOEUR'})
I want to page through the results from the Shopify API using the Python wrapper. The API recently (2019-07) switched to "cursor-based pagination", so I cannot just pass a "page" query parameter to get the next set of results.
The Shopify API docs have a page dedicated to cursor-based pagination.
The API response supposedly includes a link in the response headers that includes info for making another request, but I cannot figure out how to access it. As far as I can tell, the response from the wrapper is a standard Python list that has no headers.
I think I could make this work without using the python API wrapper, but there must be an easy way to get the next set of results.
import shopify
shopify.ShopifyResource.set_site("https://example-store.myshopify.com/admin/api/2019-07")
shopify.ShopifyResource.set_user(API_KEY)
shopify.ShopifyResource.set_password(PASSWORD)
products = shopify.Product.find(limit=5)
# This works fine
for product in products:
    print(product.title)
# None of these work for accessing the headers referenced in the docs
print(products.headers)
print(products.link)
print(products['headers'])
print(products['link'])
# This throws an error saying that "page" is not an acceptable parameter
products = shopify.Product.find(limit=5, page=2)
Can anyone provide an example of how to get the next page of results using the wrapper?
As mentioned by @babis21, this was a bug in the shopify python api wrapper. The library was updated in January of 2020 to fix it.
For anyone stumbling upon this, here is an easy way to page through all results. This same format also works for other API objects like Products.
orders = shopify.Order.find(since_id=0, status='any', limit=250)
for order in orders:
    pass  # Do something with the order

while orders.has_next_page():
    orders = orders.next_page()
    for order in orders:
        pass  # Do something with the remaining orders
Using since_id=0 will fetch ALL orders because order IDs are guaranteed to be greater than 0.
If you don't want to repeat the code that processes the order objects, you can wrap it all in an iterator like this:
def iter_all_orders(status='any', limit=250):
    orders = shopify.Order.find(since_id=0, status=status, limit=limit)
    for order in orders:
        yield order

    while orders.has_next_page():
        orders = orders.next_page()
        for order in orders:
            yield order
for order in iter_all_orders():
    pass  # Do something with each order
If you are fetching a large number of orders or other objects (for offline analysis like I was), you will find that this is slow compared to your other options. The GraphQL API is faster than the REST API, but performing bulk operations with the GraphQL API was by far the most efficient.
You can find the response header with the code below:
resp_header = shopify.ShopifyResource.connection.response.headers["link"]
Then you can split the header string on ',', strip the '<' and '>' around each URL, and get the next link URL.
I am not familiar with Python, but I think it will work. You can also review the links below:
https://community.shopify.com/c/Shopify-APIs-SDKs/Python-API-library-for-shopify/td-p/529523
https://community.shopify.com/c/Shopify-APIs-SDKs/Trouble-with-pagination-when-fetching-products-API-python/td-p/536910
thanks
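In Python, the manual Link-header approach described above might look like this minimal sketch; the regular expression is an assumption based on the header format in Shopify's pagination docs:
import re
import shopify

products = shopify.Product.find(limit=5)

# The raw headers of the last response live on the underlying connection.
headers = shopify.ShopifyResource.connection.response.headers
link_header = headers["link"] if "link" in headers else ""

# The header looks roughly like:
#   <https://example-store.myshopify.com/admin/api/2019-07/products.json?limit=5&page_info=abc>; rel="next"
# Pull out the URL tagged rel="next", if any.
match = re.search(r'<([^>]+)>;\s*rel="next"', link_header)
next_url = match.group(1) if match else None
print(next_url)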
@rseabrook
I have exactly the same issue; it seems others do as well, and someone has raised it here: https://github.com/Shopify/shopify_python_api/issues/337
where I see there is an open PR for this: https://github.com/Shopify/shopify_python_api/pull/338
I guess it should be ready soon, so an alternative idea would be to wait a bit and use the 2019-04 version (which supports the page parameter for pagination).
UPDATE: It seems this has been released now: https://github.com/Shopify/shopify_python_api/pull/352
I am working on a Python script to get data out of Salesforce. Everything seems to be working fine: I am passing a custom SOQL query to get the data, but the challenge is that the query returns only the first 500 rows, while there are close to 653,000 results in my object.
"data= sf.query('SELECT {} from abc'.format(','.join(column_names_list)))"
I tried using querymore() and queryall(), but those don't seem to work either.
In the end, my intent is to load all this information into a dataframe, push it to a table, and keep looking for new records by scheduling this code. Is there a way to achieve this?
To retrieve all the records with a single method, just try:
data = sf.query_all('SELECT {} from abc'.format(','.join(column_names_list)))
Refer following link
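Separately, if the end goal is a dataframe as the question mentions, here is a minimal sketch assuming the simple-salesforce client (which is what provides query_all); the credentials, object name, and column list are placeholders:
import pandas as pd
from simple_salesforce import Salesforce

# Placeholder credentials and column list; replace with your own.
sf = Salesforce(username='user@example.com', password='secret', security_token='token')
column_names_list = ['Id', 'Name']

data = sf.query_all('SELECT {} FROM abc'.format(','.join(column_names_list)))

# query_all returns a dict; the rows are under 'records', each carrying an
# extra 'attributes' entry that we drop before building the DataFrame.
rows = [{k: v for k, v in rec.items() if k != 'attributes'} for rec in data['records']]
df = pd.DataFrame(rows)
print(len(df))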
This morning my GAE application generated several error logs: "too much contention on these datastore entities. please try again." In my mind, this type of error only happens when multiple requests try to modify the same entity or entities in the same entity group.
When I got this error, my code was inserting new entities. I'm confused. Does this mean there is a limit on how fast we can create new entities?
My model definition and calling sequence are shown below:
# model definition
class ExternalAPIStats(ndb.Model):
    uid = ndb.StringProperty()
    api = ndb.StringProperty()
    start_at = ndb.DateTimeProperty(auto_now_add=True)
    end_at = ndb.DateTimeProperty()

# calling sequence
stats = ExternalAPIStats(uid=current_uid, api="eapi:hr:get_by_id", start_at=start_at, end_at=end_at)
stats.put()  # **too much contention** happens here
That's pretty mysterious to me. I am wondering how I should deal with this problem. Please let me know if you have any suggestions.
Without seeing how the calls are made (you show the calling code, but how often is it called: in a loop, or by many pages issuing the same put at the same time?), I believe the issue is better explained here. In particular:
You will also see this problem if you create new entities at a high rate with a monotonically increasing indexed property like a timestamp, because these properties are the keys for rows in the index tables in Bigtable.
with 'start_at' being the culprit. This article explains it in more detail.
Possibly (though untested) try doing your puts in batches. Do you run queries on the 'start_at' field? If not, removing its index will also fix the issue.
How are the puts called (i.e. what I was asking above: in a loop, or from multiple pages)? With that it might be easier to narrow down the issue.
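Here is a sketch of both suggestions, assuming you never query or sort on start_at/end_at (the pending_stats list is hypothetical, standing in for however you accumulate the records):
from google.appengine.ext import ndb

class ExternalAPIStats(ndb.Model):
    uid = ndb.StringProperty()
    api = ndb.StringProperty()
    # indexed=False means no index rows are written for the monotonically
    # increasing timestamp, which is what creates the hotspot. The trade-off
    # is that you can no longer query or sort on these properties.
    start_at = ndb.DateTimeProperty(auto_now_add=True, indexed=False)
    end_at = ndb.DateTimeProperty(indexed=False)

# Batch the writes instead of calling put() once per record.
stats_batch = [
    ExternalAPIStats(uid=uid, api=api, end_at=end_at)
    for uid, api, end_at in pending_stats  # hypothetical accumulated records
]
ndb.put_multi(stats_batch)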
Here is everything you need to know about Datastore Contention and how to avoid it:
https://developers.google.com/appengine/articles/scaling/contention?hl=en
UPDATE:
You are reaching the writes-per-second limit on the same entity group. By default it is 1 write per second.
https://cloud.google.com/datastore/docs/concepts/limits
Source: https://stackoverflow.com/a/47800087/1034622
After enabling Appstats and profiling my application, I went on a panic rage trying to figure out how to reduce costs by any means. A lot of my cost per request came from queries, so I set out to eliminate querying as much as possible.
For example, I had one query where I wanted to get a User's StatusUpdates after a certain date X. I used a query to fetch: statusUpdates = StatusUpdates.query(StatusUpdates.date > X).
So I thought I might outsmart the system and avoid a query, incurring higher write costs for the sake of lower read costs: every time a user writes a Status, I would store the key to that Status in a list property on the user. Then, instead of querying, I would just do ndb.get_multi(user.list_of_status_keys).
The question is, what is the difference for the system between these two approaches? Sure I avoid a query with the second case, but what is happening behind the scenes here? Is what I'm doing in the second case, where I'm collecting keys, just me doing a manual indexing that GAE would have done for me with queries?
In general, what is the difference between get_multi(keys) and a query? Which is more efficient? Which is less costly?
Check the docs on billing:
https://developers.google.com/appengine/docs/billing
It's pretty straightforward. Reads are $0.07/100k, smalls are $0.01/100k, so you want to do smalls.
A query is 1 read + 1 small / entity
A get is 1 read. If you are getting more than 1 entity back with a query, it's cheaper to do a query than reading entities from keys.
Query is likely more efficient too. The only benefit from doing the gets is that they'll be fully consistent (whereas a query is eventually consistent).
Storing the keys does not save you the fetch, as you cannot do anything with just the keys; you will still have to fetch the Status objects themselves. Also, since you want to filter on the date of the Status object, you would need to fetch all the Status objects into memory and compare their dates yourself. If you use a query, App Engine will fetch only the Status entities with the required date. Since you fetch less, your read costs will be lower.
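To make the comparison concrete, here is a minimal sketch of the two approaches from the question (StatusUpdates, user, and X are the names used there):
from google.appengine.ext import ndb

# Approach 1: the query uses the index on `date`, so the datastore returns
# only the StatusUpdates newer than X.
recent = StatusUpdates.query(StatusUpdates.date > X).fetch()

# Approach 2: keys stored on the user. Every referenced StatusUpdate is
# read (and billed), and the date filter has to happen in Python.
all_statuses = ndb.get_multi(user.list_of_status_keys)
recent = [s for s in all_statuses if s is not None and s.date > X]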
As this is basically the same question as you have posed here, I suggest that you look at the answer I gave there.
I'm making an app that has a need for reverse searches. By this, I mean that users of the app will enter search parameters and save them; then, when any new objects get entered onto the system, if they match the existing search parameters that a user has saved, a notification will be sent, etc.
I am having a hard time finding solutions for this type of problem.
I am using Django and thinking of building the searches and pickling them using Q objects as outlined here: http://www.djangozen.com/blog/the-power-of-q
The way I see it, when a new object is entered into the database, I will have to load every single saved query from the db and somehow run it against this one new object to see if it would match that search query... This doesn't seem ideal - has anyone tackled such a problem before?
At the database level, many databases offer 'triggers'.
Another approach is to have timed jobs that periodically fetch all items from the database that have a last-modified date since the last run; these then get filtered and alerts are issued. You can perhaps put some of the filtering into the query statement in the database. However, this is a bit trickier if notifications also need to be sent when items get deleted.
You can also put triggers manually into the code that submits data to the database, which is perhaps more flexible and certainly doesn't rely on specific features of the database.
A nice way for the triggers and the alerts to communicate is through message queues - queues such as RabbitMQ and other AMQP implementations will scale with your site.
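As a minimal sketch of the timed-job approach mentioned above, under the Django setup from the question (the Item and SavedSearch models and the send_alert callable are hypothetical; SavedSearch is assumed to hold a pickled Q object as described in the linked article):
import pickle

from django.utils import timezone

from myapp.models import Item, SavedSearch  # hypothetical models

def run_saved_searches(last_run, send_alert):
    """Run every saved search against items modified since the last run."""
    new_items = Item.objects.filter(modified__gte=last_run)
    for saved in SavedSearch.objects.all():
        q = pickle.loads(saved.pickled_q)  # the pickled Q object from the question
        for item in new_items.filter(q):
            send_alert(saved.owner, item)  # hypothetical notification callable
    return timezone.now()  # store this as the next run's last_run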
The amount of effort you use to solve this problem is directly related to the number of stored queries you are dealing with.
Over 20 years ago we handled stored queries by treating them as minidocs and indexing them based on all of the must-have and may-have terms. A new doc's term list was used as a sort of query against this "database of queries", and that built a list of possibly interesting searches to run; then only those searches were run against the new docs. This may sound convoluted, but when there are more than a few stored queries (say anywhere from 10,000 to 1,000,000 or more) and you have a complex query language that supports a hybrid of Boolean and similarity-based searching, it substantially reduced the number we had to execute as full-on queries -- often no more than 10 or 15 queries.
One thing that helped was that we were in control of the horizontal and the vertical of the whole thing. We used our query parser to build a parse tree and that was used to build the list of must/may have terms we indexed the query under. We warned the customer away from using certain types of wildcards in the stored queries because it could cause an explosion in the number of queries selected.
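A toy sketch of that idea (purely illustrative, not the original system): index each stored query by the terms it requires, then use a new document's terms to shortlist which stored queries are worth running in full:
from collections import defaultdict

# Hypothetical stored searches, each reduced to the set of terms it mentions.
stored_queries = {
    1: {'python', 'django'},
    2: {'kraken', 'api'},
    3: {'django', 'signals'},
}

# Inverted index: term -> ids of the stored queries that mention it.
term_index = defaultdict(set)
for qid, terms in stored_queries.items():
    for term in terms:
        term_index[term].add(qid)

def candidate_queries(doc_terms):
    """Return ids of stored queries that share at least one term with the doc."""
    hits = set()
    for term in doc_terms:
        hits |= term_index.get(term, set())
    return hits  # only these need to be executed as full queries against the new doc

print(candidate_queries({'python', 'tutorial', 'django'}))  # -> {1, 3}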
Update for comment:
Short answer: I don't know for sure.
Longer answer: We were dealing with a custom-built text search engine, and part of its query syntax allowed slicing the doc collection in certain ways very efficiently, with special emphasis on date_added. We played a lot of games because we were ingesting 4-10,000,000 new docs a day and running them against up to 1,000,000+ stored queries on DEC Alphas with 64MB of main memory. (This was in the late 80's/early 90's.)
I'm guessing that filtering on something equivalent to date_added could be done in combination with the date of the last time you ran your queries, or maybe the highest id at the last query run time. If you need to re-run the queries against a modified record, you could use its id as part of the query.
For me to get any more specific, you're going to have to get a lot more specific about exactly what problem you are trying to solve and the scale of the solution you are trying to accomplish.
If you stored the type(s) of object(s) involved in each stored search as a generic relation, you could add a post-save signal to all involved objects. When the signal fires, it looks up only the searches that involve its object type and runs those. That probably will still run into scaling issues if you have a ton of writes to the db and a lot of saved searches, but it would be a straightforward Django approach.
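A rough sketch of that signal-based approach (the SavedSearch model with its content_type field and load_q method, the StatusUpdate model, and the notify_owner helper are all hypothetical stand-ins):
from django.contrib.contenttypes.models import ContentType
from django.db.models.signals import post_save
from django.dispatch import receiver

from myapp.models import SavedSearch, StatusUpdate  # hypothetical models
from myapp.notifications import notify_owner  # hypothetical notification hook

@receiver(post_save, sender=StatusUpdate)
def check_saved_searches(sender, instance, created, **kwargs):
    if not created:
        return
    ct = ContentType.objects.get_for_model(sender)
    # Only load the searches registered for this object type.
    for saved in SavedSearch.objects.filter(content_type=ct):
        q = saved.load_q()  # hypothetical: unpickles the stored Q object
        # Run the saved query against just this one new object.
        if sender.objects.filter(q, pk=instance.pk).exists():
            notify_owner(saved, instance)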