Speeding up dynamic web page generation in Python Django

I am learning Django now and I am trying to build a price comparison web app utilizing the Amazon advertising API. My overall workflow is as below:
1. The user submits a keyword for the query.
2. The keyword is searched on Amazon to list some items, currently 10.
3. Each item is assigned a unique number by Amazon, and that number is submitted to Amazon again to extract additional information about the item, such as a thumbnail and price. This is done in a for-loop.
4. Information from the first and second queries to Amazon is listed on the web page and returned to the user.
The flow is now working, but I found the web app to be quite slow on the Django development server: I search for a book, and it takes more than 10 seconds to extract information for the first 10 items found and display all the pictures at once.
I am aware of Amazon's requirement that we cannot make more than 1 request per second, but given that other price comparison websites work much faster, I am wondering if there is any programmatic means of optimizing it.
I have thought of several ways:
1. caching product information
2. speeding up the for-loop in step 3
3. displaying the text on the page to the user first, and loading the pictures later
Method 1 is doable, but I am not sure whether method 2 or 3 is feasible. For method 2, the sketch below is roughly what I have in mind.
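A rough sketch of method 2 (lookup_item is a placeholder for my per-item Amazon call, and I know I would still have to respect the 1-request-per-second limit somehow):

from concurrent.futures import ThreadPoolExecutor

def lookup_item(item_id):
    # placeholder: the real call that fetches thumbnail, price, etc.
    # for one Amazon item number
    raise NotImplementedError

def fetch_details(item_ids):
    # run the per-item lookups concurrently instead of one after another;
    # each request spends most of its time waiting on the network
    with ThreadPoolExecutor(max_workers=5) as pool:
        return list(pool.map(lookup_item, item_ids))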
Can anyone give any hints?

Related

Number of orders matching parameters?

I'm trying to set up some automation in my life here and I've hit a bit of a snag.
I'm using the WooCommerce API and I've utilized the woocommerce python library.
I'm able to get a max of 100 orders for a given date range. So I've done the following:
wcapi.get("orders?per_page=100&before=2023-02-01T23:59:59&after=2023-01-01T00:00:00").json()
It appears 100 is the max you can set for "per_page". I can then utilize the "page" argument to get page 2 and so on. I'm aware of how to loop through the pages, however I can't find anything in the documentation that can tell me how many orders fit my before/after parameters. So I'm left with paging through until I receive an error?
reports/orders/totals appears to ignore any parameters given to it, so there is no way to do any date filtering.
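One option worth trying, sketched under two assumptions: WooCommerce follows the WordPress REST API convention of sending X-WP-Total and X-WP-TotalPages headers on collection responses, and wcapi.get() returns a requests-style response object (the .json() call above suggests it does):

response = wcapi.get("orders?per_page=100&before=2023-02-01T23:59:59&after=2023-01-01T00:00:00")
total_orders = int(response.headers.get("X-WP-Total", 0))      # orders matching the filters
total_pages = int(response.headers.get("X-WP-TotalPages", 0))  # pages at per_page=100

orders = response.json()
for page in range(2, total_pages + 1):
    page_url = ("orders?per_page=100&page=%d"
                "&before=2023-02-01T23:59:59&after=2023-01-01T00:00:00" % page)
    orders += wcapi.get(page_url).json()

That reads the total up front instead of paging until an error comes back.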

Trying to use YouTube API to check views of a video every hour. Can Python be used to store the data or is PHP needed?

I apologize if this is a naive question, as I'm a learning developer. I'm trying to come up with a project to work on in the summer to build my portfolio as a Python developer, and as of right now I'm in the planning stages and just want to know if I'm on the right track.
Without going into the specifics, I essentially want to get the metadata from one YouTube video every hour between the time it was uploaded and the current time, make a time vs. views/hour plot using the YouTube API, and display it on a website.
I've looked into web scrapers, but from my understanding, those scripts can only capture one snapshot at a time.
Would I need to learn PHP as well in order to store the data from each instance the API is called? Or is there a way to do what I'm thinking of natively in Python?
If so, would I need to know it extensively?
Part 1 - YouTube API only returns a total viewCount. You won't be able to ask for a video's viewCount from 6 hours ago, for example. However, you can call the API every hour for the video viewCount and store the numbers yourself going forward.
Part 2 - PHP vs Python doesn't matter. You can store the data with either language.
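A minimal sketch of that hourly polling approach, assuming the google-api-python-client library and SQLite for storage (API_KEY and VIDEO_ID are placeholders you would fill in):

import sqlite3
import time
from googleapiclient.discovery import build

API_KEY = "..."   # placeholder: a YouTube Data API v3 key
VIDEO_ID = "..."  # placeholder: the video being tracked

def fetch_view_count():
    # ask the Data API for the video's current cumulative statistics
    youtube = build("youtube", "v3", developerKey=API_KEY)
    response = youtube.videos().list(part="statistics", id=VIDEO_ID).execute()
    return int(response["items"][0]["statistics"]["viewCount"])

def record_once(db):
    # append a (timestamp, total views) row; views/hour is the difference
    # between consecutive rows when you build the plot
    db.execute("CREATE TABLE IF NOT EXISTS views (ts INTEGER, count INTEGER)")
    db.execute("INSERT INTO views VALUES (?, ?)", (int(time.time()), fetch_view_count()))
    db.commit()

# run this once an hour, e.g. from cron
record_once(sqlite3.connect("views.db"))

No PHP needed: the same Python process calls the API and stores the result.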

Using Python to extract LinkedIn information [duplicate]

I'm trying to visit, say, a set of 8,000 LinkedIn profiles that belong to people who have a certain first name (just for example, let's say "Larry"), and then extract the kinds of jobs each user has held in the past. Is there an efficient way to do this? I would need each Larry to be picked independently of the others; basically, traversing someone's network isn't a good way to do this. Is there a way to completely randomize how the Larrys are picked?
Don't even know where to start. Thanks.
To Start:
Trying to crawl the response LinkedIn gives your browser would be almost suicidal.
Check their APIs (particularly the People Search API) and their code samples.
Important disclaimer found in the People Search API:
People Search API is a part of our Vetted API Access Program. You must apply here and get LinkedIn's approval before using this API.
MAYBE with that in mind you'll be able to write a script that queries and parses those APIs. For instance, retrieving users with Larry as a first name: http://api.linkedin.com/v1/people-search?first-name=Larry
Once you get approved by LinkedIn, have retrieved some data from their APIs, and have tried some JSON or XML parsing (whatever the APIs return), you will have something more specific to ask.
If you still want to crawl the HTML returned by LinkedIn when you hit https://www.linkedin.com/pub/dir/?first=Larry&last=&search=Search, take a look at BeautifulSoup. A bare-bones sketch follows.
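Purely illustrative (the URL comes from above, the selector is a guess, and LinkedIn's real pages require authentication, change often, and forbid scraping in their terms):

import requests
from bs4 import BeautifulSoup

# the public directory search page mentioned above
url = "https://www.linkedin.com/pub/dir/?first=Larry&last=&search=Search"
html = requests.get(url, headers={"User-Agent": "Mozilla/5.0"}).text

soup = BeautifulSoup(html, "html.parser")
# placeholder selector: inspect the live page to find the real profile links
for link in soup.select("a[href*='/pub/']"):
    print(link.get("href"))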

Amazon Advertising API ItemSearch: Get more than 10 pages / 100 results?

I'm trying to use the Python wrapper for the Amazon Advertising API (http://pypi.python.org/pypi/python-amazon-product-api/), but when I do an ItemSearch and try to access the 11th page, I get something along the lines of:
The value you specified for ItemPage is invalid. ... between 1 and 10
I know that to avoid this problem I could just perform another search query, but is there a way to start the search query on a certain page? Or is there a way (for books) to set a boundary on the publishing year? I just need a way to make my search results smaller so that I don't run into this error. Right now this is how I'm calling it:
results = api.item_search('Books', ResponseGroup='Images,Large',
                          AssociateTag='qwerty', Publisher=kwd)
Where kwd is just a publisher name obtained from a file.
Amazon will only give you the first 10 pages (as of API version 2011-08-01). You could run slightly different searches to get more results, but each of those would also be restricted to its first 10 pages.
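One way to slice the result set into several smaller searches, sketched under the assumption that the wrapper passes unknown keywords straight through to ItemSearch and that the books-only Power search field accepts a pubdate clause:

# one query per publication year keeps each result set smaller, so more
# of it fits inside the 10-page window than with a single broad search
for year in range(2000, 2012):
    results = api.item_search('Books', ResponseGroup='Images,Large',
                              AssociateTag='qwerty', Publisher=kwd,
                              Power='pubdate: during %d' % year)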

How to get number of visitors of a page on GAE?

I need to get the number of unique visitors (say, over the last 5 minutes) that are currently looking at an article, so I can display that number and sort the articles by most popular.
e.g., similar to how most forums display 'There are n people viewing this thread'.
How can I achieve this on Google App Engine? I am using Python 2.7.
Please try to explain in a simple way because I recently started learning programming and I am working on my first project. I don't have lots of experience. Thank you!
Create a counter (a property within an entity) and increase it transactionally for every page view, as in the sketch below. If you have more than a few page views a second, you need to look into sharded counters.
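A minimal sketch of that entity counter, assuming the ndb datastore API (ArticleCounter is a made-up model name):

from google.appengine.ext import ndb

class ArticleCounter(ndb.Model):
    # one counter entity per article; the entity id is the article id
    views = ndb.IntegerProperty(default=0)

@ndb.transactional
def count_view(article_id):
    # the transaction makes the read-increment-write atomic, but it also
    # caps throughput; shard the counter if traffic grows
    counter = ArticleCounter.get_by_id(article_id) or ArticleCounter(id=article_id)
    counter.views += 1
    counter.put()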
There is no way to tell when someone stops viewing a page unless you use JavaScript to inform the server when that happens. Forums etc. typically assume that someone has stopped viewing a page after n minutes of inactivity, and base their figures on that.
For minimal resource use, I would suggest using memcache exclusively here (see the sketch below). If the value gets evicted, the count will be incorrect, but the consequences of that are minimal, and other solutions will use a lot more resources.
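A sketch of that memcache-only approach, using five-minute buckets so stale counts fall out of scope on their own (the key scheme is made up for illustration, and this counts views in the current window rather than strictly unique visitors):

import time
from google.appengine.api import memcache

BUCKET_SECONDS = 300  # five-minute window

def record_view(article_id):
    # one counter key per article per five-minute bucket
    bucket = int(time.time()) // BUCKET_SECONDS
    key = "views:%s:%d" % (article_id, bucket)
    # incr with initial_value creates the key if it does not exist yet
    memcache.incr(key, initial_value=0)

def current_viewers(article_id):
    bucket = int(time.time()) // BUCKET_SECONDS
    return memcache.get("views:%s:%d" % (article_id, bucket)) or 0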
Did you consider the Google Analytics service for getting statistics? Read this article about real-time monitoring using that service. Please note: a special script must be embedded on every page you want to monitor.
