What is a search query in Google Custom Search Engine? (Python)

This question is not a request for technical help, but I need to understand what counts as a search query under the Google Custom Search API. If I am not mistaken, a search query is what I type into the Google search box, isn't it?
If so, the Google Custom Search API documentation says I can make 100 queries a day. Keeping that in mind, I was careful about making queries, and my total came to 54.
After those 54 queries, I received the error below. It says: This API requires billing to be enabled on the project. Visit https://console.developers.google.com/billing?project=236852110619 to enable billing. Why is that?
Does that mean that after enabling billing I can still use the 46 queries remaining in the free quota?

Quota for this API seems to be pro-rated if you start in the middle of a day. Are you able to get the full 100 queries the next day?
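To make the accounting concrete, here is a minimal sketch of how requests to the Custom Search JSON API map onto the quota. The key and engine id below are placeholders, and the point to note is that every HTTP request counts as one query, including pagination requests for the same search term:

```python
from urllib.parse import urlencode

# Each HTTP request to this endpoint counts as one "query" against the
# free tier's 100-queries-per-day quota -- including pagination requests
# (start=11, start=21, ...), which is an easy way to burn through quota
# faster than expected. MY_KEY and MY_CX below are placeholder values.

ENDPOINT = "https://www.googleapis.com/customsearch/v1"

def build_search_url(api_key: str, cx: str, query: str, start: int = 1) -> str:
    """Build the request URL; fetching it (e.g. with requests.get) is one query."""
    params = urlencode({"key": api_key, "cx": cx, "q": query, "start": start})
    return f"{ENDPOINT}?{params}"

# Fetching results 1-10 and 11-20 of the SAME search term uses TWO queries:
url_page1 = build_search_url("MY_KEY", "MY_CX", "python", start=1)
url_page2 = build_search_url("MY_KEY", "MY_CX", "python", start=11)
```

So 54 "searches" can correspond to more than 54 billable queries if any of them fetched a second page of results.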

Related

Twitter API v2 - Rate Limit

I'm accessing Twitter's API v2 and have Academic Research access.
I am interested in pulling as much data as possible, but I am getting rate limited.
I am using tweepy in Python to extract the data, via the call search_all_tweets.
I loop the extraction for each day and limit the number of tweets extracted per day. By using time.sleep() I can control how many tweets I extract per 15 minutes.
Twitter has published this page to answer my question, but I am still unsure which category, and therefore which limit, applies to me:
https://developer.twitter.com/en/docs/twitter-api/rate-limits#v2-limits
Can anybody clarify how many tweets I can extract per 15 minutes before getting rate limited?
Thanks in advance
Go to the Twitter developer portal, select your app within your project and, at the bottom of the Settings tab, check whether you have activated Read and Write permissions under OAuth 1.0a authentication. If not, activate them and regenerate your tokens.
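As for pacing, the rate-limits page linked above lists a per-app cap for full-archive search (300 requests per 15-minute window at the time of writing; verify the current numbers there before relying on them). A small sketch of turning such a cap into a sleep interval between calls:

```python
import time

# Pacing helper, assuming a documented cap of 300 requests per 15-minute
# window for full-archive search -- i.e. one request every 3 seconds stays
# safely under it. Check the rate-limits page for the current figures.

WINDOW_SECONDS = 15 * 60
REQUESTS_PER_WINDOW = 300

def min_delay_between_requests(window_seconds: int = WINDOW_SECONDS,
                               requests_per_window: int = REQUESTS_PER_WINDOW) -> float:
    """Smallest sleep between consecutive requests that stays under the cap."""
    return window_seconds / requests_per_window

def paced_calls(calls, delay: float):
    """Invoke each zero-argument callable, sleeping `delay` seconds between them."""
    results = []
    for i, call in enumerate(calls):
        if i:
            time.sleep(delay)
        results.append(call())
    return results
```

With tweepy, each callable would wrap a client.search_all_tweets(...) invocation; alternatively, constructing the client with tweepy.Client(..., wait_on_rate_limit=True) makes tweepy sleep through 429 responses for you instead of raising.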

How to extract information (citation, h-index, currently working institution etc) about all professors in a specific field from Google scholar?

I want to compare different information (citation counts, h-index, etc.) about professors in a specific field at different institutions all over the world, using data mining and analysis techniques. But I have no idea how to extract these data for hundreds (or even thousands) of professors, since Google does not provide an official API for it. So I am wondering whether there are any other ways to do that?
This Google Code tool will calculate an individual's h-index. If you use it on demand for a limited number of people in a particular field, you will not break the terms of use - they don't specifically refer to limits on access, but they do refer to disruption of service (e.g. bulk requests may potentially cause this). The export questions state:
I wrote a program to download lots of search results, but you blocked my computer from accessing Google Scholar. Can you raise the limit?
Err, no, please respect our robots.txt when you access Google Scholar using automated software. As the wearers of crawler's shoes and webmaster's hat, we cannot recommend adherence to web standards highly enough.
Web of Science does have an API available, and a collaboration agreement with Google Scholar, but Web of Science access is only granted to certain individuals.
A solution could be to request the user's Web of Science credentials (or use your own) to return the information on demand - perhaps for the top people in the field - and then store it as you planned. Google Scholar only updates a few times per week, so this would not be excessive use.
The other option is to request permission from Google, which is mentioned in the terms of use, although it seems unlikely to be granted.
I've done a project on exactly this.
You provide the script with an input text file containing the names of the professors you'd like to retrieve information about, and the script crawls Google Scholar and collects the information you are interested in.
The project also provides functionality for automatically downloading the profile pictures of the researchers/professors.
To respect the constraints imposed by the portal, you can set a delay between requests. If you have more than 1,000 profiles to crawl it might take a while, but it works.
A concurrency-enabled script has also been implemented, and it runs much faster than the basic sequential approach.
Note: in order to specify the information you need, you have to know either the id or the name of the CSS class in the HTML generated by Google Scholar.
Good luck!
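The "know the class name" step above can be sketched with nothing but the standard library. The class names used here ("gsc_prf_in" for the profile name, "gsc_rsb_std" for the citation-table cells) are what Scholar profile pages used at the time of writing - Google can rename them at any point, so treat them as assumptions to verify against the page source. The HTML below is a hand-made stand-in, not a real response:

```python
from html.parser import HTMLParser

# Extract the text content of every tag carrying a given CSS class.
# This is the parsing step of the crawl; fetching the page (with a polite
# delay between requests, per robots.txt) is deliberately left out.

class ClassTextExtractor(HTMLParser):
    def __init__(self, wanted_class):
        super().__init__()
        self.wanted_class = wanted_class
        self.depth = 0          # >0 while inside a wanted tag
        self.texts = []

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class", "").split()
        if self.wanted_class in classes:
            self.depth += 1
            self.texts.append("")
        elif self.depth:
            self.depth += 1     # track nesting inside a wanted tag

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.texts[-1] += data

def texts_by_class(html: str, cls: str):
    parser = ClassTextExtractor(cls)
    parser.feed(html)
    return [t.strip() for t in parser.texts]

# Hand-made stand-in for a profile page; assumed class names, see above.
SAMPLE = """
<div class="gsc_prf_in">Ada Lovelace</div>
<td class="gsc_rsb_std">12345</td>
<td class="gsc_rsb_std">42</td>
"""
```

In a real run you would feed the fetched profile HTML to texts_by_class once per field of interest (name, citation count, h-index) and sleep between page fetches.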

Can the search service on Google App Engine be used with dynamic queries?

I read through the documentation for the Search Service on Google App Engine (Python).
In App Engine applications, all complex queries that your application will perform must be included in your index.yaml file. An App Engine query is like a Mad Libs template, where the structure is always the same but the individual "blank spaces" are filled in by your request. This makes App Engine's query system unsuitable for arbitrary queries that may include user-selected AND, OR, or other modifiers. I was wondering whether this is also the case for the queries handled by the above-linked Search API?
Simply put, can I just throw random complex queries at the search API without having to have an index for that exact query built beforehand?
The Search API isn't like the datastore. It doesn't require any special index to be maintained on your part. In other words, a user can put in any query they want (according to the rules) and it will work.
This doesn't come for free - the search indexes have limitations that the datastore doesn't. For example, there is a maximum size for any given search index and, IIRC, more stringent quotas.
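Since the Search API takes a free-form query string at call time (something like search.Index(name="articles").search('author:"smith" AND year:"2013"')), user-selected AND/OR can simply be spliced into that string - no index.yaml entry needed. Below is a sketch of assembling such a string safely from user input; the field names are invented for illustration:

```python
# Build a Search API query string from user-chosen clauses and operators.
# Values are quoted so embedded spaces or quotes stay literal instead of
# being parsed as extra query syntax. Field names here are hypothetical.

ALLOWED_OPS = {"AND", "OR", "NOT"}

def quote_term(term: str) -> str:
    """Quote a user-supplied value, escaping any embedded double quotes."""
    return '"' + term.replace('"', '\\"') + '"'

def build_query(clauses, ops):
    """clauses: [(field, value), ...]; ops: one operator between each pair."""
    if len(ops) != len(clauses) - 1:
        raise ValueError("need exactly one operator between each pair of clauses")
    parts = [f"{clauses[0][0]}:{quote_term(clauses[0][1])}"]
    for op, (field, value) in zip(ops, clauses[1:]):
        if op not in ALLOWED_OPS:
            raise ValueError(f"unsupported operator: {op}")
        parts.append(op)
        parts.append(f"{field}:{quote_term(value)}")
    return " ".join(parts)

q = build_query([("author", "smith"), ("year", "2013")], ["AND"])
```

The resulting string is what you would pass to the index's search() call; no per-query index definition exists anywhere.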

BigQuery vs Custom Search for High-throughput Google (Scholar) Searches?

For a list of ~30 thousand keywords, I'm trying to find out how many Google search hits exist for each keyword, similar to this website but on a larger scale: http://azich.org/google/.
I am using Python to query and was originally planning to use pygoogle. Unfortunately, Google has a limit of ~100 searches a day for a free account. I am willing to use a paid service, but I am not sure which Google service makes more sense: BigQuery or Custom Search. BigQuery seems to be for searches over a provided set of data, whereas Custom Search seems to be a website search solution for a small "slice" of the internet.
Would someone refer me to the appropriate service that will allow me to perform the above task? It doesn't need to be a free service - I am willing to pay.
Two more things, if possible: I'd like the searches to be from Google Scholar, but this is not necessary. Second, I'd like to save the text from the front page, such as the blurbs from each search result, to text-mine the front-page results.
BigQuery is not a tool for interacting with Google Search in any way. BigQuery is a tool you feed your own data into, and then run analytical queries over that data - but you first need to ingest the data.
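For the hit-count use case, the Custom Search JSON API (not BigQuery) returns an estimated total and per-result snippets in its response body. A sketch of pulling both out, shown against a hand-made response dict rather than a live call - verify the field names against the API reference before relying on them:

```python
# Extract the estimated hit count and the front-page snippets from a
# Custom Search JSON API response. sample_response is a hand-made
# stand-in for requests.get(...).json(); field names per the JSON API
# response format at the time of writing.

def total_hits(response: dict) -> int:
    """Estimated total result count; totalResults arrives as a string."""
    return int(response["searchInformation"]["totalResults"])

def front_page_blurbs(response: dict):
    """Per-result snippets, for the text-mining part of the question."""
    return [item.get("snippet", "") for item in response.get("items", [])]

sample_response = {
    "searchInformation": {"totalResults": "1230000"},
    "items": [{"title": "Example result", "snippet": "blurb text ..."}],
}
```

Note that at ~30,000 keywords you would be paying for every query past the free daily allowance, and the totals are estimates, not exact counts.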

Google app engine excessive datastore small operations

My site has about 50 users and I am getting excessive small datastore operations. I am aggressively memcaching, don't have that many records, and still I get millions of small datastore operations. Appstats says the cost is 0, yet the real cost is not 0.
I basically know where the small datastore operations might occur.
Keys-only operations: I do these, but I memcache the results until the data changes. Plus, most of my keys-only operations have limit=100 (the maximum), so to get 12 million operations I would need to make 120,000 calls (I am assuming fetching 1 key is 1 small operation). As I get about 60-70 visits a day, that seems a bit excessive.
I just can't figure out what is causing that many operations. Appstats is giving me no clue.
This is the dashboard.
This is the appstats.
Are you using lots of counts? It seems this can be a cause of excessive datastore small operations.
I don't have your code, but this answer has some suggestions for optimizing your code when experiencing this problem.
Also, take a look at a similar question - Google app engine excessive small datastore operations - for similar answers.
I notice this old question isn't solved yet, so based on your info, here is another potential cause.
Running my GAE SDK on a very fresh public Azure VM instance (xxx.cloudapp.net), I've noticed a lot of bot traffic coming in, trying to find the admin page of a common open-source CMS or cart. I believe this is due to the bots either using AXFR requests or brute-force detection of subdomains.
Make sure that you are blocking any unwanted bot traffic and not serving bots a dynamic page that hits your datastore.
The same condition can also be caused by a rogue AJAX request looping on every page that those 50 users request.
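A back-of-envelope check of the numbers in the question supports this: under the old GAE billing model, each key returned by a keys-only query cost one small operation, so the number of calls needed for the observed totals is easy to estimate. The figures below come straight from the question:

```python
# Each key returned by a keys-only (limit=100) fetch is assumed to cost
# one datastore small operation, as the asker does. The 12M total and
# ~65 visits/day are the question's numbers.

def calls_needed(total_small_ops: int, keys_per_call: int = 100) -> int:
    """How many limit=100 keys-only fetches produce this many small ops."""
    return -(-total_small_ops // keys_per_call)   # ceiling division

calls = calls_needed(12_000_000)   # 120,000 calls, matching the asker's arithmetic
```

If those 12 million operations accrued over only a handful of days, that is thousands of keys-only fetches per human visit - far more than 50 users can plausibly trigger, which is why bot hits on dynamic pages or a looping AJAX request are the plausible culprits.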
