I am writing a Python module that will query Google's Custom Search API and return all listings for the domain 'example.com'.
I have been reading the instructions at https://code.google.com/apis/customsearch/v1/getting_started.html and am a little stumped at the moment.
Are my assumptions listed below correct?
For example, to search for results that have 'example.com' in the URL, the query is:
*'https://www.googleapis.com/customsearch/v1?key=my_key&cx=017576662512468239146:omuauf_lfve&q=site:example.com'*
*key=my_key:* value of key given by google
cx=017576662512468239146: name of the search engine (google)? Is this correct?
*omuauf_lfve:* I have no idea what this is
q=site:example.com: This should return all results with 'example.com'; e.g. www.a.example.com, b.example.com, example.com
Although this question is quite old and the author doesn't seem to be very responsive, Google still ranks this page highly and many people may land here, so I am posting my answer.
Searching with Google Custom Search is described in this answer to a similar question.
Parameters are as follows:
key - yes, it is the API key for your Google account. To obtain it, go to the APIs console, switch on Custom Search API on the Services tab, and find the actual API key on the API Access tab.
cx - yes again, it is the unique ID of the search engine. Note that this ID has the form "123456:abcdef", so "omuauf_lfve" is part of the cx value, not a separate parameter.
q - the actual search query. "site:example.com" is part of Google's query language. See search tips for details.
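To make the role of the three parameters concrete, here is a minimal sketch that builds the request URL with the standard library only. The helper name build_search_url is my own, and the key and cx values are the placeholders from the question; parsing the URL back shows that cx is a single value that contains the colon.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"

def build_search_url(api_key, cse_id, query):
    """Return a Custom Search request URL with properly encoded parameters."""
    params = {"key": api_key, "cx": cse_id, "q": query}
    return API_ENDPOINT + "?" + urlencode(params)

# Placeholder values taken from the question, not real credentials.
url = build_search_url("my_key", "017576662512468239146:omuauf_lfve", "site:example.com")

# Parse the query string back to confirm how the parameters are split.
parsed = parse_qs(urlsplit(url).query)
print(parsed["cx"])  # ['017576662512468239146:omuauf_lfve']
```

Using urlencode rather than string concatenation also takes care of escaping the colon and any spaces in the query.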
Related
I'm trying to use the Vimeo API with Python, but I'm stuck trying to find videos using keywords.
What I have is this, after successfully registering with Vimeo:
import json
import vimeo

client = vimeo.VimeoClient(
    token='my_token',
    key='my_key',
    secret='my_secret_key'
)
# Fetch only the uri and name fields of the authenticated user
about_me = client.get('/me', params={'fields': 'uri,name'})
json.loads(about_me.text)
This returns my user details. Now I want to use a similar approach to get videos using query keywords, like on their page. But I cannot get it to work.
So, I want to have JSON returned with movies based on keywords (like 'interstellar trailer', and not the URI or ID of a Vimeo video).
But for some reason, I can't get the query keyword to work, and from the linked page above, I cannot figure out how to implement it in my code.
27-02-19: UPDATE: I figured it out, and will update my solution here, soon.
The /me endpoint only returns the user object for the authenticated user.
To get and search for videos, use the /me/videos or /videos endpoint (depending on whether you are searching the videos on your own account or public videos from other Vimeo users).
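A keyword search against /videos could look roughly like the sketch below. It assumes client is an authenticated vimeo.VimeoClient as created in the question; the helper name search_videos and the chosen fields are my own, while query, per_page and fields are request parameters of the Vimeo API.

```python
def search_videos(client, keywords, per_page=10):
    """Search public Vimeo videos by keyword and return the list of matches.

    `client` is assumed to be an authenticated vimeo.VimeoClient (or any
    object whose get() returns a response with a json() method).
    """
    response = client.get(
        '/videos',
        params={
            'query': keywords,          # free-text search terms
            'per_page': per_page,       # number of results per page
            'fields': 'uri,name,link',  # trim the response to what we need
        },
    )
    # Vimeo wraps list results in a 'data' array.
    return response.json().get('data', [])

# Usage, assuming `client` was created as in the question:
# for video in search_videos(client, 'interstellar trailer'):
#     print(video['name'], video['link'])
```

Swapping '/videos' for '/me/videos' would restrict the search to your own account.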
I have a snippet of code using the pygoogle Python module that allows me to programmatically search for some term in Google succinctly:
g = pygoogle(search_term)
g.pages = 1
results = g.get_urls()[0:10]
I just found out that this has unfortunately been discontinued and replaced by something called Google Custom Search. I looked at the other related questions on SO but didn't find anything I could use. I have two questions:
1) Does google custom search allow me to do exactly what I am doing in the three lines above?
2) If yes - where can I find example code to do exactly what I am doing above? If no then what is the alternative to do what I did using pygoogle?
It is possible to do this. The setup is... not very straightforward, but the end result is that you can search the entire web from Python with a few lines of code.
There are 3 main steps in total.
1st step: get Google API key
The pygoogle's page states:
Unfortunately, Google no longer supports the SOAP API for search, nor
do they provide new license keys. In a nutshell, PyGoogle is pretty
much dead at this point.
You can use their AJAX API instead. Take a look here for sample code:
http://dcortesi.com/2008/05/28/google-ajax-search-api-example-python-code/
... but you actually can't use the AJAX API either. You have to get a Google API key: https://developers.google.com/api-client-library/python/guide/aaa_apikeys. For simple experimental use I suggest a "server key".
2nd step: set up a Custom Search Engine so that you can search the entire web
Indeed, the old API is not available. The best new API that is available is Custom Search. It seems to support only searching within specific domains. However, after following this SO answer you can search the whole web:
From the Google Custom Search homepage ( http://www.google.com/cse/ ), click Create a Custom Search Engine.
Type a name and description for your search engine.
Under Define your search engine, in the Sites to Search box, enter at least one valid URL (for now, just put www.anyurl.com to get past this screen; more on this later).
Select the CSE edition you want and accept the Terms of Service, then click Next. Select the layout option you want, and then click Next.
Click any of the links under the Next steps section to navigate to your Control panel.
In the left-hand menu, under Control Panel, click Basics.
In the Search Preferences section, select Search the entire web but emphasize included sites.
Click Save Changes.
In the left-hand menu, under Control Panel, click Sites.
Delete the site you entered during the initial setup process.
This approach is also recommended by Google: https://support.google.com/customsearch/answer/2631040
3rd step: install Google API client for Python
pip install google-api-python-client, more info here:
repo: https://github.com/google/google-api-python-client
more info: https://developers.google.com/api-client-library/python/apis/customsearch/v1
complete docs: https://api-python-client-doc.appspot.com/
4th step (bonus): do the search
So, after setting this up, you can follow the code samples from a few places:
simple example: https://github.com/google/google-api-python-client/blob/master/samples/customsearch/main.py
cse() function docs: https://google-api-client-libraries.appspot.com/documentation/customsearch/v1/python/latest/customsearch_v1.cse.html
and end up with this:
from googleapiclient.discovery import build
import pprint

my_api_key = "Google API key"
my_cse_id = "Custom Search Engine ID"

def google_search(search_term, api_key, cse_id, **kwargs):
    # Build a client for the Custom Search API and run a single query
    service = build("customsearch", "v1", developerKey=api_key)
    res = service.cse().list(q=search_term, cx=cse_id, **kwargs).execute()
    # 'items' is absent from the response when there are no results
    return res.get('items', [])

results = google_search(
    'stackoverflow site:en.wikipedia.org', my_api_key, my_cse_id, num=10)
for result in results:
    pprint.pprint(result)
After some tweaking you could write some functions that behave exactly like your snippet, but I'll skip this step here.
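For what it's worth, a minimal sketch of that last step might look like this. The helper name get_urls is my own (borrowed from the pygoogle snippet), and the sample list is hypothetical data shaped like the 'items' list the API returns, where each result carries its URL under the 'link' key.

```python
def get_urls(items, limit=10):
    """Return up to `limit` result URLs from a list of Custom Search items."""
    # Each item in the API response carries its URL under the 'link' key.
    return [item['link'] for item in items if 'link' in item][:limit]

# Hypothetical sample shaped like the 'items' list returned by google_search().
sample_items = [
    {'title': 'Stack Overflow - Wikipedia',
     'link': 'https://en.wikipedia.org/wiki/Stack_Overflow'},
    {'title': 'Stack overflow - Wikipedia',
     'link': 'https://en.wikipedia.org/wiki/Stack_overflow'},
]
urls = get_urls(sample_items)
print(urls)
```

With the function above in place, get_urls(google_search(search_term, my_api_key, my_cse_id, num=10)) would then play the role of g.get_urls()[0:10].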
@mbdevpl's response helped me a lot, so all credit goes to them.
But there have been a few changes in the UI, so here is an update:
A. Install google-api-python-client
If you don't already have a Google account, sign up.
If you have never created a Google APIs Console project, read the Managing Projects page and create a project in the Google API Console.
Install the library.
B. To create an API key:
Navigate to the APIs & Services→Credentials panel in Cloud Console.
Select Create credentials, then select API key from the drop-down menu.
The API key created dialog box displays your newly created key.
You now have an API_KEY
C. Set up a Custom Search Engine so you can search the entire web
Create a custom search engine at this link.
In Sites to search, add any valid URL (e.g. www.stackoverflow.com).
That’s all you have to fill in; the rest doesn’t matter. In the left-side menu, click Edit search engine → {your search engine name} → Setup
Set Search the entire web to ON.
Remove the URL you added from the list of Sites to search.
Under Search engine ID you’ll find the search-engine-ID.
Search example
from googleapiclient.discovery import build

my_api_key = "AIbaSyAEY6egFSPeadgK7oS/54iQ_ejl24s4Ggc"  # The API_KEY you acquired
my_cse_id = "012345678910111213141:abcdef10g2h"  # The search-engine-ID you created

def google_search(search_term, api_key, cse_id, **kwargs):
    service = build("customsearch", "v1", developerKey=api_key)
    res = service.cse().list(q=search_term, cx=cse_id, **kwargs).execute()
    # 'items' is absent from the response when there are no results
    return res.get('items', [])

results = google_search('"god is a woman" "thank you next" "7 rings"', my_api_key, my_cse_id, num=10)
for result in results:
    print(result)
Important! On the first run, you might have to enable the API in your account. The error message should contain the link for enabling it. It will be something like:
https://console.developers.google.com/apis/api/customsearch.googleapis.com/overview?project={your project name}.
You’ll be asked to create a service name (It doesn’t matter what it is), and give it Roles.
I gave it Role Viewer and Service Usage Admin and it works.
Answer from 2020
Google no longer provides its old general web search API, but the search-engine-parser project (https://github.com/bisoncorps/search-engine-parser) is a Python package for scraping Google results.
Installation
pip install search-engine-parser
Usage
from search_engine_parser import GoogleSearch

def google(query):
    search_args = (query, 1)
    gsearch = GoogleSearch()
    gresults = gsearch.search(*search_args)
    return gresults['links']

google('Is it illegal to scrape google results')
I don't know how legal this is, but as long as you aren't commercializing your product I think you can get away with it. Besides, Google hasn't really sued anyone for scraping its results; it has just banned their IP addresses.
For more information, see "Is it ok to scrape data from Google results?"
I'm trying to figure out how to authenticate and create an entry on QuickBooks Online through Python. Currently, when I try to click the auth link in their API Explorer, I get a 404 page.
What I'm trying to do is create an invoice through Python. However, it seems like their documentation is not complete. I contacted their support, and I haven't heard from them yet.
The python-quickbooks library is probably the correct choice now, as it is a "complete rework of quickbooks-python". It has pretty comprehensive instructions on getting and using the auth keys, though I wouldn't call it "simple", since the process is by definition somewhat complex. The instructions are "for Django", but the Django-specific code simply gets parameters out of a URL string.
We're using it to great effect, because the syntax is as easy as:
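from intuitlib.client import AuthClient
from quickbooks import QuickBooks
from quickbooks.objects.account import Account

def get_account_balance(qbid):
    # qbid can be retrieved from the AccountList
    # CLIENT_ID, CLIENT_SECRET, REDIRECT_URI, REFRESH_TOKEN and COMPANY_ID
    # are your own settings, defined elsewhere
    auth_client = AuthClient(
        client_id=CLIENT_ID,          # from QB website
        client_secret=CLIENT_SECRET,  # from QB website
        environment='sandbox',        # or 'production'
        redirect_uri=REDIRECT_URI,
    )
    client = QuickBooks(
        auth_client=auth_client,
        refresh_token=REFRESH_TOKEN,
        company_id=COMPANY_ID,
    )
    account = Account.get(qbid, qb=client)
    return account.CurrentBalance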
This library will get the job done https://github.com/HaPsantran/quickbooks-python
It works in JSON so you would construct the Invoice based off of docs at https://developer.intuit.com/docs/0025_quickbooksapi/0050_data_services/030_entity_services_reference/invoice using the JSON examples.
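As a rough illustration of what such an invoice body looks like, here is a sketch that builds a minimal payload as a plain dict: one sales line plus a customer reference. The helper name build_invoice_payload is my own, and the IDs and amount are hypothetical placeholders; check the field names against the invoice reference linked above before relying on them.

```python
import json

def build_invoice_payload(customer_id, item_id, amount, description=None):
    """Build a minimal QuickBooks Online invoice body as a plain dict."""
    line = {
        "Amount": amount,
        "DetailType": "SalesItemLineDetail",
        "SalesItemLineDetail": {"ItemRef": {"value": str(item_id)}},
    }
    if description:
        line["Description"] = description
    return {
        "CustomerRef": {"value": str(customer_id)},
        "Line": [line],
    }

# The IDs and amount below are hypothetical placeholders.
payload = build_invoice_payload(customer_id=1, item_id=1, amount=100.0,
                                description="Consulting")
print(json.dumps(payload, indent=2))
```

The resulting dict is what you would hand to the library (or serialize yourself) when creating the invoice.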
The library doesn't support sandbox mode**, so if you are going to use the development consumer key and secret, then you would need to change this code:
base_url_v3 = "https://quickbooks.api.intuit.com/v3"
to
base_url_v3 = "https://sandbox-quickbooks.api.intuit.com/v3"
while in that mode.
** Sandbox mode only applies currently to U.S. QBO
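If you would rather not edit the value back and forth by hand, one option is to pick the base URL from a flag. This is only a sketch: the function and constant names are my own, and the library itself does not provide them; the two URLs are the ones quoted above.

```python
PRODUCTION_BASE_URL_V3 = "https://quickbooks.api.intuit.com/v3"
SANDBOX_BASE_URL_V3 = "https://sandbox-quickbooks.api.intuit.com/v3"

def base_url_v3(sandbox=False):
    """Return the v3 API base URL for sandbox or production use."""
    return SANDBOX_BASE_URL_V3 if sandbox else PRODUCTION_BASE_URL_V3

print(base_url_v3(sandbox=True))
```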
Having written a lot of the module @Minimul mentions (with a very helpful start by simonv3, who figured out how to get it working first, after which I just built on it), I am fairly confident that this will not support the OAuth workflow of getting the request token, prompting the user to authenticate out of band, and then getting and storing the access token. It presumes you already have an access token.
Simon (or another Python developer) may be able to comment on how he gets the access token with Python, and if so, it'd be great if he (or they) could add it to the module for all to enjoy.
I had this same problem. I just figured it out and posted the step-by-step process here:
python with Quickbooks Online API v3
Hope this helps.
I looked at the existing Python clients for QuickBooks and found them to be either outdated or missing features. So I created a new Python client for QuickBooks, which can be found at https://pypi.python.org/pypi/quickbooks-py
Is there a way to use a Simple API Access key (developer key) instead of an OAuth2 key with Google Cloud Endpoints?
Extra fields in your protorpc request object that aren't part of the definition are still stored with the request.
If you wanted to use a key field as a query parameter, you could access it via
request.get_unrecognized_field_info('key')
even if key is not a field in your message definition.
This is done in users_id_token.py (the Auth part of the endpoints library) to allow sending bearer_token or access_token as query parameters instead of as header values.
Unfortunately, the nice quota checking and other associated pieces that a "Simple API Access" key gives are not readily available. However, you could issue your own keys and manually check a key against your list and potentially check against quotas that you have defined.
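A do-it-yourself check along those lines could be sketched as follows. Everything here is my own illustration (class name, in-memory storage, fixed time window); in a real Endpoints method you would pass in the key value you pulled from the request, for example the first element of what request.get_unrecognized_field_info('key') returns, assuming it hands back a (value, variant) pair.

```python
import time

class ApiKeyRegistry:
    """Tracks self-issued API keys and a per-key request quota per time window."""

    def __init__(self, quotas, window_seconds=86400, clock=time.time):
        self._quotas = dict(quotas)   # key -> max requests per window
        self._window = window_seconds
        self._clock = clock           # injectable for testing
        self._usage = {}              # key -> (window_start, count)

    def check(self, key):
        """Return True and count the request if `key` is known and under quota."""
        if key not in self._quotas:
            return False
        now = self._clock()
        start, count = self._usage.get(key, (now, 0))
        if now - start >= self._window:  # window expired: start a new one
            start, count = now, 0
        if count >= self._quotas[key]:
            return False
        self._usage[key] = (start, count + 1)
        return True

registry = ApiKeyRegistry({"demo-key": 2})
print([registry.check("demo-key") for _ in range(3)])  # [True, True, False]
print(registry.check("unknown-key"))                   # False
```

On App Engine the counters would have to live in memcache or the datastore rather than in process memory, since instances come and go.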
For those looking to use @bossylobster's answer in Java, see the SO answer here:
Getting raw HTTP Data (Headers, Cookies, etc) in Google Cloud Endpoints
P.S.
I tried to make this a comment on @bossylobster's answer, but I don't have the reputation to do that. Feel free to clean up this answer so that others can follow the path.
I am attempting to read the raw text/content of a Google Doc (just a plain document, not a spreadsheet or presentation) from within a Python script, but so far have had little success.
Here's what I've tried:
import gdata.docs.service
client = gdata.docs.service.DocsService()
client.ClientLogin('email', 'password')
q = gdata.docs.service.DocumentQuery()
q.AddNamedFolder('email', 'Folder Name')
feed = client.Query(q.ToUri())
doc = feed.entry[0] # extract one of the documents
However, this variable doc, which is of type gdata.docs.DocumentListEntry, doesn't seem to contain any content, just meta information about the document.
Am I doing something wrong here? Can somebody point me in the right direction? Thank you!
UPDATE (Mar 2019) Good news! The Google Docs REST API is now available. More info about it from my SO answer to a similar question, but to get you going, here's the official Python "quickstart" sample showing you how to get the title of a Google Doc in plain text.
Both the Apps Script and Drive REST API solutions originally answered below are still valid and are alternate ways to get the contents of a Google Doc. (The Drive API works on both Python 2 & 3, but Apps Script is JavaScript-only.)
Bottom-line: if you want to download the entire Doc in plain text, the Drive API solution is best. If you want to programmatically CRUD different parts of a Doc, then you must use either the Docs API or Apps Script.
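If you do go the Docs API route, the text sits inside the document resource rather than being returned as one string. Here is a small sketch of walking that structure. It assumes the dict comes from service.documents().get(documentId=...).execute(); the helper name and the sample document are my own, and only top-level paragraphs are handled (tables and other elements are skipped).

```python
def read_paragraph_text(document):
    """Concatenate the text runs of all top-level paragraphs in a Docs API document."""
    chunks = []
    for element in document.get('body', {}).get('content', []):
        paragraph = element.get('paragraph')
        if not paragraph:
            continue  # skip section breaks, tables, etc. in this simple sketch
        for part in paragraph.get('elements', []):
            text_run = part.get('textRun')
            if text_run:
                chunks.append(text_run.get('content', ''))
    return ''.join(chunks)

# Hypothetical, heavily trimmed document resource for illustration.
sample_doc = {
    'title': 'Hello World',
    'body': {'content': [
        {'sectionBreak': {}},
        {'paragraph': {'elements': [{'textRun': {'content': 'Hello, '}},
                                    {'textRun': {'content': 'world!\n'}}]}},
    ]},
}
print(read_paragraph_text(sample_doc))
```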
(Feb 2017) The code in the OP and the only other answer are both now out-of-date, as ClientLogin authentication was deprecated back in 2012(!) and GData APIs are the previous generation of Google APIs. While not all GData APIs have been deprecated, none of the newer Google APIs use the Google Data protocol.
There isn't a REST API available (at this time) for Google Docs documents, although there is an "API-like" service provided by Google Apps Script, the JavaScript-in-the-cloud solution which provides programmatic access to Google Docs (via its DocumentService object), including Docs add-ons.
To read plain text from a Google Doc, considered file-level access, you would use the Google Drive API instead. Examples of using the Drive API:
Exporting a Google Sheet as CSV (blog post)
"Poor man's plain text to PDF" converter (blog post) (*)
(*) - TL;DR: upload plain text file to Drive, import/convert to Google Docs format, then export that Doc as PDF. Post above uses Drive API v2; this follow-up post describes migrating it to Drive API v3, and here's a developer video combining both "poor man's converter" posts.
The solution to the OP is to perform operations similar to those in both posts above, but ensure you're using the text/plain export MIME type. For other import/export formats to/from Drive, see this SO answer to a related question as well as the downloading files from Drive docs page. Here's some pseudocode that searches for Google Docs documents called "Hello World" in my Drive folder and displays the contents of the first matching file found on-screen (assuming DRIVE is your API service endpoint):
from __future__ import print_function

NAME = 'Hello World'
MIME = 'text/plain'

# using Drive API v3; if using v2, change 'pageSize' to 'maxResults',
# 'name=' to 'title=', and ".get('files')" to ".get('items')"
res = DRIVE.files().list(q="name='%s'" % NAME, pageSize=1).execute().get('files')
if res:
    fileID = res[0]['id']  # 1st matching "Hello World" name
    res = DRIVE.files().export(fileId=fileID, mimeType=MIME).execute()
    if res:
        print(res.decode('utf-8'))  # decode bytes for Py3; NOP for Py2
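One refinement worth sketching: the query above matches on name only, so a non-Docs file with the same name could come back first, and a name containing a single quote would break the query string. The helper below (its name is my own) escapes the name and adds the Google Docs MIME type to the q string.

```python
GOOGLE_DOC_MIME = 'application/vnd.google-apps.document'

def docs_name_query(name):
    """Build a Drive `q` string matching Google Docs files with this exact name."""
    # Drive query strings need backslashes and single quotes escaped.
    escaped = name.replace('\\', '\\\\').replace("'", "\\'")
    return "name='%s' and mimeType='%s'" % (escaped, GOOGLE_DOC_MIME)

print(docs_name_query('Hello World'))
print(docs_name_query("Valentine's Day"))
```

The result can be passed as q= in the files().list() call in place of the inline "name='%s'" % NAME expression.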
If you need more than this, see these videos on how to set up and use Google APIs, OAuth2 authorization, and creating a Drive service endpoint to list your Drive files, plus a corresponding blog post for all three.
To learn more about how to use Google APIs with Python in general, check out my blog as well as a variety of Google developer videos (series 1 and series 2) I'm producing.
A DocumentQuery doesn't return you all the documents with their contents—that would take forever. It just returns a list of documents, with metadata about each. (Actually, IIRC you can get a preview page this way, so if your document is only one page that might be enough…)
You then need to download the content in a separate request. The content element has a type (the MIME type) and a src (the URL to the actual data). You can just download that src, and parse it. However, you can override the default type by adding an exportFormat parameter, so you don't need to do any parsing.
See the section Downloading documents and files in the docs, which has an example showing how to download a document and specify a format. (It's in .NET rather than Python, and it uses HTML rather than plain text, but you should be able to figure it out.)
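In Python, adding the exportFormat parameter to the content src URL could be sketched like this, using only the standard library. The helper name and the sample URL are hypothetical; the actual download still has to go through your authenticated client.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def with_export_format(src_url, export_format='txt'):
    """Return `src_url` with its exportFormat query parameter set to `export_format`."""
    parts = urlsplit(src_url)
    # Drop any existing exportFormat value, then append the requested one.
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != 'exportFormat']
    query.append(('exportFormat', export_format))
    return urlunsplit(parts._replace(query=urlencode(query)))

# Hypothetical content src URL, for illustration only.
src = 'https://docs.google.com/feeds/download/documents/Export?docID=abc123'
print(with_export_format(src))
```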