How to page through results using Shopify Python API wrapper - python

I want to page through the results from the Shopify API using the Python wrapper. The API recently (2019-07) switched to "cursor-based pagination", so I cannot just pass a "page" query parameter to get the next set of results.
The Shopify API docs have a page dedicated to cursor-based pagination.
The API response supposedly includes a link in the response headers that includes info for making another request, but I cannot figure out how to access it. As far as I can tell, the response from the wrapper is a standard Python list that has no headers.
I think I could make this work without the Python API wrapper, but there must be an easier way to get the next set of results.
import shopify
shopify.ShopifyResource.set_site("https://example-store.myshopify.com/admin/api/2019-07")
shopify.ShopifyResource.set_user(API_KEY)
shopify.ShopifyResource.set_password(PASSWORD)
products = shopify.Product.find(limit=5)
# This works fine
for product in products:
    print(product.title)
# None of these work for accessing the headers referenced in the docs
print(products.headers)
print(products.link)
print(products['headers'])
print(products['link'])
# This throws an error saying that "page" is not an acceptable parameter
products = shopify.Product.find(limit=5, page=2)
Can anyone provide an example of how to get the next page of results using the wrapper?

As mentioned by @babis21, this was a bug in the Shopify Python API wrapper. The library was updated in January of 2020 to fix it.
For anyone stumbling upon this, here is an easy way to page through all results. The same pattern works for other API objects such as Products as well.
orders = shopify.Order.find(since_id=0, status='any', limit=250)
for order in orders:
    print(order.id)  # Do something with the order

while orders.has_next_page():
    orders = orders.next_page()
    for order in orders:
        print(order.id)  # Do something with the remaining orders
Using since_id=0 will fetch ALL orders because order IDs are guaranteed to be greater than 0.
If you don't want to repeat the code that processes the order objects, you can wrap it all in an iterator like this:
def iter_all_orders(status='any', limit=250):
    orders = shopify.Order.find(since_id=0, status=status, limit=limit)
    for order in orders:
        yield order

    while orders.has_next_page():
        orders = orders.next_page()
        for order in orders:
            yield order

for order in iter_all_orders():
    print(order.id)  # Do something with each order
If you are fetching a large number of orders or other objects (for offline analysis, as I was), you will find this slow compared to your other options. The GraphQL API is faster than the REST API, but performing bulk operations with the GraphQL API was by far the most efficient.
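For reference, here is a rough sketch of what a bulk operation looks like through the wrapper's GraphQL client; the query, polling interval, and lack of error handling are illustrative, not the exact code I used:
import json
import time
import urllib.request
import shopify

# Ask Shopify to build a JSONL file with every order (illustrative query).
BULK_QUERY = '''
mutation {
  bulkOperationRunQuery(
    query: """
    { orders { edges { node { id createdAt } } } }
    """
  ) {
    bulkOperation { id status }
    userErrors { field message }
  }
}
'''
shopify.GraphQL().execute(BULK_QUERY)

# Poll until the result file is ready.
while True:
    poll = json.loads(shopify.GraphQL().execute(
        '{ currentBulkOperation { status url } }'))
    op = poll['data']['currentBulkOperation']
    if op['status'] == 'COMPLETED':
        break
    time.sleep(5)

# The result is JSONL: one object per line.
with urllib.request.urlopen(op['url']) as resp:
    orders = [json.loads(line) for line in resp]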

You can find the response header with the code below:
resp_header = shopify.ShopifyResource.connection.response.headers["link"]
Then you can split the string on ',', strip the '<' and '>' characters from the link part, and extract the next-page URL.
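For example, a minimal sketch of that parsing (untested, assuming the standard RFC 5988 Link format that Shopify returns):
next_url = None
for part in resp_header.split(','):
    url_part, _, rel_part = part.partition(';')
    if 'rel="next"' in rel_part:
        next_url = url_part.strip().strip('<>')  # drop whitespace and <>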
I am not familiar with Python, but I think this approach will work. You can also review the links below:
https://community.shopify.com/c/Shopify-APIs-SDKs/Python-API-library-for-shopify/td-p/529523
https://community.shopify.com/c/Shopify-APIs-SDKs/Trouble-with-pagination-when-fetching-products-API-python/td-p/536910
Thanks.

@rseabrook
I have exactly the same issue, it seems others do as well and someone has raised this: https://github.com/Shopify/shopify_python_api/issues/337
where I see there is an open PR for this: https://github.com/Shopify/shopify_python_api/pull/338
I guess it should be ready soon, so an alternative would be to wait a bit and use the 2019-04 API version (which supports the page parameter for pagination).
UPDATE: It seems this has been released now: https://github.com/Shopify/shopify_python_api/pull/352

Related

Fetching entire changelog for an issue in JIRA using jira-python

Using jira-python, I want to retrieve the entire changelog for a JIRA issue:
issues_returned = jira.search_issues(args.jql, expand='changelog')
I discovered that for issues with more than 100 entries in their changelog, I am only receiving the first 100.
My question is how do I specify a startAt and make another call to get subsequent pages of the changelog (using python-jira)?
From this thread at Atlassian I see that API v3 provides an endpoint to get the change log directly:
/rest/api/3/issue/{issueIdOrKey}/changelog
but this doesn't seem to be accessible via jira-python. I'd like to avoid having to do the REST call directly and authenticate separately. Barring a way to do it directly via jira-python, is there a way to make a 'raw' REST API call from jira-python?
In instances where more than 100 results are present, you'll need to edit the 'startAt' parameter when searching issues:
issues_returned = jira.search_issues(args.jql, expand='changelog', startAt=100)
You'll need to set up a statement that compares the total and maxResults values, then run another query with a different startAt parameter if the total is higher, and append the results together.
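A minimal sketch of that compare-and-append loop (the server URL, credentials, and JQL string are illustrative):
from jira import JIRA

jira = JIRA(server='https://example.atlassian.net',
            basic_auth=('user', 'api_token'))

start_at = 0
all_issues = []
while True:
    batch = jira.search_issues('project = FOO', expand='changelog',
                               startAt=start_at, maxResults=100)
    all_issues.extend(batch)
    start_at += len(batch)
    if not batch or start_at >= batch.total:  # total comes back with each page
        break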

Is there a way to search for an edx course using their API?

I have an edX client ID and secret ID, and I'm able to get the list of all edX online courses, but I want to be able to search through all of the edX courses with a query string like "robotics". (Unlike this question, I already understand how to get the list of courses, so this is not a duplicate question...) I tried
import requests
# I also tried with search_query instead of search
edx_course_search_response = requests.get(
    'https://api.edx.org/catalog/v1/catalogs/' + edx_course_catalog_id + '/courses/?search=robotics',
    headers=headers)
I know the catalog id and headers information is correct because I can get the list of edX courses. Unfortunately, this does not work and instead returns the first 20 online courses in edX's database as usual.
If I can't, I think I'm going to use whoosh.
I don't think so.
query is a response value from the GET /catalog/v1/catalogs/{id}/ endpoint (the one for retrieving info about a specific catalog). It is not a parameter the user passes in; the only user-supplied parameter there is the catalog id.
You might use this endpoint instead: /catalog/v1/catalogs/{id}/courses/
That endpoint, as mentioned in the comments, has an answer covering its pagination. You could then filter the returned dataset with Python according to your query term. The results array in the response from that endpoint has fields such as short description, long description, and title, which might be useful to target.
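A rough sketch of that client-side filtering, assuming the endpoint is paginated with 'results' and 'next' fields; the exact field names here are assumptions:
import requests

def search_courses(catalog_id, query, headers):
    url = 'https://api.edx.org/catalog/v1/catalogs/%s/courses/' % catalog_id
    query = query.lower()
    matches = []
    while url:
        data = requests.get(url, headers=headers).json()
        for course in data.get('results', []):
            text = ' '.join(filter(None, [course.get('title'),
                                          course.get('short_description'),
                                          course.get('full_description')]))
            if query in text.lower():
                matches.append(course)
        url = data.get('next')  # follow pagination until exhausted
    return matches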

Krakenex API multiple pairs query

I am trying to use the Krakenex python library to query the order book for multiple currency pairs at once. When I do it for a single currency pair it works, like this:
con = krakenex.API()
con.load_key('kraken.key')
con.query_public('Depth', {'pair':'GNOETH'})
However, if I do:
con = krakenex.API()
con.load_key('kraken.key')
con.query_public('Depth', {'pair':['GNOETH', 'GNOEUR']})
I get {'error': ['EQuery:Unknown asset pair']}. I assume the syntax is incorrect but can't figure out the correct one. This is the first time I have used an API, and the examples provided don't cover this.
Unfortunately you cannot query the Depth of multiple asset pairs with a single request. I asked Kraken's support the same question: their reason for not allowing it is the high computational cost.
By contrast, querying e.g. the AssetPairs endpoint in the same manner works.
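Given that constraint, a simple workaround is to query each pair separately and merge the results; a minimal sketch (untested):
import krakenex

con = krakenex.API()
books = {}
for pair in ['GNOETH', 'GNOEUR']:
    resp = con.query_public('Depth', {'pair': pair})
    books.update(resp.get('result', {}))  # results are keyed by pair name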
Spent a lot of time trying different combos, finally figured it out.
Try:
con.query_public('Depth', {'pair': 'GNOETH, GNOEUR'})

How to get/put Yahoo contacts in bulk

I'm adding contacts one by one using the Yahoo REST API and Python. The set of contacts can be relatively big (around 500), but the API doesn't seem to provide any method to add larger batches in one request (for example, 100 items at once). Does anyone know another way to add multiple contacts at once?
Thanks
Looking at the page linked in your question, the request URL has a count parameter. Have you tried this?

How to optimize for Django's paginator module

I have a question about how Django's paginator module works and how to optimize it. I have a list of around 300 items from information that I get from different APIs on the internet. I am using Django's paginator module to display the list for my visitors, 10 items at a time. The pagination does not work as well as I want it to. It seems that the paginator has to get all 300 items before pulling out the ten that need to be displayed each time the page is changed. For example, if there are 30 pages, then going to page 2 requires my website to query the APIs again, put all the information in a list, and then access the ten that the visitor's browser requests. I do not want to keep querying the APIs for the same information that I already have on each page turn.
Right now, my view has a function that looks at the GET request and queries the APIs for information based on the query. Then it puts all that information into a list and passes it on to the template. So this function runs every time someone turns the page, which means querying the APIs again.
How should I fix this?
Thank you for your help.
The paginator will in this case need the full list in order to do its job.
My advice would be to update a cache of the feeds at a regular interval, and then use that cache as the input to the paginator module. Doing an intensive or lengthy task on each and every request is always a bad idea. If not for the page load times your users will experience, think of the vulnerability of your server to attack.
You may want to check out Django's low level cache API which would allow you to store the feed result in a globally accessible place under a key, which you can later use to retrieve the cache and paginate for each page request.
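A minimal sketch of that approach; fetch_feed_items() is a hypothetical stand-in for your slow API calls:
from django.core.cache import cache
from django.core.paginator import Paginator

def feed_list(request):
    items = cache.get('feed_items')
    if items is None:
        items = fetch_feed_items()  # hypothetical: the expensive API queries
        cache.set('feed_items', items, 60 * 15)  # refresh at most every 15 min
    paginator = Paginator(items, 10)  # 10 items per page
    page = paginator.page(request.GET.get('page', 1))
    # pass `page` to the template as before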
ORMs do not load data until the rows are actually used:
query_results = Foo.objects.filter(id=1)  # No SQL executed yet, just stored.
foo = query_results[0]  # now it fires
or
for foo in query_results:
    foo.bar()  # SQL fires
If you are using a custom data source that loads all its results on initialization, then pagination will not work as expected, since all feeds will be fetched at once. You may want to implement __getitem__ or __iter__ to do the actual fetch lazily; that coincides with the way Django expects the results to be loaded.
Pagination also needs to know how many results there are, to do things like has_next(). In SQL it is usually inexpensive to get a count(*) using an index, so you would also want a cheap way to know how many results there will be (or just an estimate, if an exact count is too expensive).
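A rough sketch of such a lazy sequence; fetch_window() and count_items() are hypothetical wrappers around your API calls:
from django.core.paginator import Paginator

class LazyFeed:
    def __init__(self):
        self._count = None

    def __len__(self):  # Paginator uses this for page math and has_next()
        if self._count is None:
            self._count = count_items()  # hypothetical cheap count or estimate
        return self._count

    def __getitem__(self, key):  # Paginator slices one page at a time
        if isinstance(key, slice):
            return fetch_window(key.start, key.stop)  # fetch only this window
        return fetch_window(key, key + 1)[0]

paginator = Paginator(LazyFeed(), 10)  # fetches only the requested page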
