How to search on Google using Python and Google API? - python

I've searched the whole afternoon but I'm still stuck.
I need to google keywords, and save the ranks of a given domain name for each keyword.
I tried several libraries: xgoogle, google, and pygoogle. However, pygoogle just doesn't work, and google and pygoogle always end up raising "HTTP Error : Service Unavailable".
So I suppose I should use the Google API, which is accessed with the urllib2 and simplejson libraries and the URL "http://ajax.googleapis.com/ajax/services/search/web?v=1.0".
I have several questions:
How do I choose the top-level domain?
How do I choose the language of the results?
How do I choose how many results are shown?
Are the results ranked the same way as in my own Google search? I ask because I'm under the impression that they are not.
Are image URLs taken into account?
How do I choose the starting result? Is it possible to start from the 10th result?
Thank you for your help,
Sebi81
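Whichever way the result URLs are obtained, the "rank of a given domain for each keyword" part can be sketched separately. The following is a minimal Python 3 sketch, assuming the results for each keyword are already available as an ordered list of URLs; the names domain_rank and ranks_by_keyword are hypothetical helpers, not part of any of the libraries mentioned above.

```python
from urllib.parse import urlparse


def domain_rank(result_urls, domain):
    """Return the 1-based position of the first URL hosted on `domain`, or None."""
    domain = domain.lower()
    for position, url in enumerate(result_urls, start=1):
        host = (urlparse(url).hostname or "").lower()
        # Match the domain itself or any of its subdomains (e.g. www.example.com).
        if host == domain or host.endswith("." + domain):
            return position
    return None


def ranks_by_keyword(results_by_keyword, domain):
    """Map each keyword to the rank of `domain` in that keyword's result list."""
    return {
        keyword: domain_rank(urls, domain)
        for keyword, urls in results_by_keyword.items()
    }
```

A keyword for which the domain never appears maps to None, which keeps "not ranked" distinct from any real position.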

Related

Getting the number of results of a google search using python in 2020?

Solutions were posted before, but they no longer work:
extract the number of results from google search
For example, the code from the question above no longer works because the number of results doesn't seem to be in the response at all: there is no resultStats ID. In my browser the count is in an element with the ID "result-status", but that ID doesn't exist in the response.
I don't want to use the Google API because of its tight daily search limit, and I need to search for thousands of words daily. What is the solution for me?
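Since element IDs change over time, one option is to look for the counter text itself instead of a specific ID. The sketch below is only that: it assumes the counter reads something like "About 1,230,000 results (0.45 seconds)", which is an assumption about the page text, and it cannot help if the counter is absent from the response, as described above. The name parse_result_count is hypothetical.

```python
import re

# Assumed shape of the counter text, e.g. "About 1,230,000 results (0.45 seconds)".
_COUNT_RE = re.compile(r"(\d[\d,.\s]*)\s+results?", re.IGNORECASE)


def parse_result_count(text):
    """Return the result count found in `text` as an int, or None if absent."""
    match = _COUNT_RE.search(text)
    if match is None:
        return None
    # Drop thousands separators (commas, dots, spaces) before converting.
    digits = re.sub(r"\D", "", match.group(1))
    return int(digits) if digits else None
```

The function works on plain text, so it can be applied to whatever text is extracted from the response, independent of the markup around it.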

Unable to get desired google search result using module "google"

I have been trying to scrape Google search data.
Let me explain what I have done so far.
I used the google module, together with Beautiful Soup, to get the search results. Here is a sample search I made:
>>> from google import search
>>>
>>> for i in search("tom and jerry", tld="co.in",num=10,stop=1): print i
https://www.youtube.com/watch?v=mugo5LoG8Ws
https://en.wikipedia.org/wiki/Tom_and_Jerry
http://www.dailymail.co.uk/debate/article-2390792/How-sense-humour-censor-Tom-Jerry-racist-By-Mail-TV-critic-CHRISTOPHER-STEVENS.html
http://edp.wikia.com/wiki/Tom_and_Jerry
https://www.youtube.com/watch?v=gSK5curwV_o
https://www.youtube.com/watch?v=xb8jTvSwJbw
https://www.youtube.com/watch?v=Kj8VuTr5q9g
https://www.youtube.com/watch?v=iIprJoPTJoI
https://www.youtube.com/watch?v=UaX3hvrZDJA
http://www.cartoonnetwork.com/games/tomjerry/
https://www.facebook.com/TomandJerry/
http://www.dailymotion.com/video/x2mn36a
http://www.dailymotion.com/video/x2p0k8j
>>>
But this result differs from the manual search result.
How exactly does it differ? If we change the __init__.py file of the google library, can we get better results?
Please suggest a possible way to sort this out.
Thanks in advance.
[Note]: I already searched previous discussions on Stack Overflow. If this is a duplicate, I apologize. :)
EDIT 1: I also get duplicate links sometimes. The first link is repeated a few times in the generator output I get from the google.search(*arg) call. Please advise me how to get rid of this.
I found out where the duplicates come from: they are the sublinks shown for popular websites on the Google search page.
Sorry, the pixel was too small. :)
I am researching the API output and the way it is parsed further. Thanks to everyone who thought of helping me. :)
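The duplicates mentioned in EDIT 1 can also be filtered on the caller's side, without touching the library. This is a minimal Python 3 sketch, assuming the search results arrive as any iterable of URL strings (such as the generator shown above); unique_urls is a hypothetical helper name.

```python
def unique_urls(urls):
    """Yield each URL once, in the order of its first appearance."""
    seen = set()
    for url in urls:
        if url not in seen:
            seen.add(url)
            yield url
```

Because it is itself a generator, it can wrap the search call directly and still be consumed lazily. Note that dropping repeats shifts the positions of later results, which matters if ranks are being recorded.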

Twitter search api parameters

In the Twitter search API, I found a new parameter, src = 'typd' or src = 'sprv', and I get different results for each src value. But I can't figure out what the terms 'typd' and 'sprv' mean.
For example:
https://twitter.com/search?q=Technology&src=typd
https://twitter.com/search?q=Technology&src=sprv
'sprv' and 'typd' relate to Twitter's spelling correction system. As Leb said, 'typd' indicates results from a query that was typed-in and may be incorrect; while 'sprv' is a clear "no, I really meant this".
For example, if I type 'flayrah' into the search bar I get results for 'flayra' at URL https://twitter.com/search?q=flayrah&src=typd and the text "Showing results for flayra. Search for flayrah instead."
Clicking the link brings me to https://twitter.com/search?q=flayrah&src=sprv with results for 'flayrah'.
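To check that two such URLs differ only in src, the query string can be parsed with the standard library. A small Python 3 sketch; search_params is a hypothetical helper name.

```python
from urllib.parse import parse_qs, urlparse


def search_params(url):
    """Return the (q, src) pair from a twitter.com/search URL."""
    params = parse_qs(urlparse(url).query)
    # parse_qs returns a list per key; take the first value or None.
    return params.get("q", [None])[0], params.get("src", [None])[0]
```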
I'm not sure what sprv means, but those two links aren't giving me different results; they're exactly the same.
typd means that you actually typed the query into the search yourself.
Also note that searching through the previous link and searching through the REST API (per your tag) are two different things.
The Twitter Search API is part of Twitter’s v1.1 REST API. It allows queries against the indices of recent or popular Tweets and behaves similarly to, but not exactly like the Search feature available in Twitter mobile or web clients, such as Twitter.com search.
https://dev.twitter.com/rest/public/search

Advanced Tweepy Search User Queries, and Their Results?

I am running Tweepy and trying to run an advanced query through the "search_users" API. I am noticing a big difference between the results returned by the API and those of the people search on the Twitter website, even when the exact same query is used. Any thoughts?
Example
query = 'mustang AND near:"New Haven" AND within:15mi'
tweepy.search_users(q=query)
Is there a difference? Is there another API call I should look at?
After some research I have discovered that the "USER SEARCH" does not currently support advanced search parameters. This is unfortunate. If this changes I will come back and update this answer.

Crawler for Twitter social graph via python

I am sorry for asking, but I am new to writing crawlers.
I would like to crawl Twitter for users and the follow relationships among them, using Python.
Are there any recommended starting points, such as tutorials?
Thank you very much in advance.
I'm a big fan of Tweepy myself - https://github.com/tweepy/tweepy
You'll have to refer to the Twitter docs for the API methods that you're going to need. As far as I know, Tweepy wraps all of them, but I recommend looking at Twitter's own docs to find out which ones you need.
To construct a following/follower graph, you're going to need some of these:
GET followers/ids - grab followers (in IDs) for a user
GET friends/ids - grab followings (in IDs) for a user
GET users/lookup - grab up to 100 users, specified by IDs
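The crawl itself can be sketched independently of the client library. The following is a minimal Python 3 sketch of a breadth-first crawl, assuming a caller-supplied function fetch_friend_ids(user_id) that wraps GET friends/ids (through Tweepy or otherwise). That function, crawl_follow_graph, and max_users are hypothetical names for illustration, and rate limiting is not handled here.

```python
from collections import deque


def crawl_follow_graph(seed_id, fetch_friend_ids, max_users=100):
    """Breadth-first crawl; returns {user_id: [IDs that user follows]}."""
    graph = {}
    queue = deque([seed_id])
    queued = {seed_id}  # IDs already scheduled, so no user is fetched twice
    while queue and len(graph) < max_users:
        user_id = queue.popleft()
        # fetch_friend_ids is assumed to return an iterable of user IDs.
        friends = list(fetch_friend_ids(user_id))
        graph[user_id] = friends
        for friend_id in friends:
            if friend_id not in queued:
                queued.add(friend_id)
                queue.append(friend_id)
    return graph
```

Passing the fetch function in as an argument keeps the traversal testable with a plain dictionary lookup before any real API calls are made.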
Besides reading the Twitter API docs?
A good starting point would be the great Python Twitter library by Mike Verdone, which I personally think is the best one (there is also an introduction here).
Also see this question on Stack Overflow.
