I am trying to use the Google Web Risk API (beta) from my Python code. Please see the sample code:
import requests

URI = 'http://www.amazongroupco.org'  # known bad URL
key = 'key=<mykey>'
threat = '&threatTypes=MALWARE'
queryurl = 'https://webrisk.googleapis.com/v1beta1/uris:search?'
requeststring = queryurl + key + threat
header = {"Content-Type": "application/json"}
payload = {'uri': URI}
try:
    req = requests.get(requeststring, headers=header, params=payload)
    print(req.url)
    if req.status_code == 200:
        print(req)
    else:
        print("ERROR:", req)
except Exception as e:
    print("Google API returned error:", e, req.url)
The above code always returns a successful status code ("Response [200] OK") with an empty JSON response {}. Given that this is a malicious site, I was expecting it to return something in the JSON response. I tried other malicious sites as well and get the same result: an empty JSON object with status 200 OK.
Am I missing something?
I understand that some sites may not host malware but are social engineering sites, which is a different threat type. So I am wondering whether there is a general-purpose, all-in-one threatTypes value I can use that will return a JSON object no matter what the threat is, as long as it is a threat.
Just a side note: anyone trying this needs a GCP account to generate a key.
Any guidance here will be much appreciated.
I have also checked the Web Risk API and it works, and I have reproduced your issue and get the same result. The URL you are checking is simply not considered a MALWARE threat by Google. Honestly, I have tried various threat types for that specific URL and it does not seem to be on Google's lists.
Here you can find a list of all the threat types you can use. There is a type for the situation you describe, THREAT_TYPE_UNSPECIFIED, but it always returns an error JSON (invalid argument), and this is intended behaviour.
I should also note that, as stated in the official documentation, when you use the REST API the URI must be encoded:
The URL must be valid (see RFC 2396) but it doesn't need to be canonicalized.
If you use the REST API, you must encode GET parameters, like the URI.
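If you want to cover more than one category in a single request, a minimal sketch along these lines may help. It assumes the uris:search endpoint accepts a repeated threatTypes query parameter, lets requests do the URI encoding, and uses the placeholder key and URL from the question:

import requests

# Sketch: pass the key, the URI, and several threat types as query parameters
# and let requests URL-encode them. Assumes threatTypes may be repeated.
API_KEY = '<mykey>'                       # placeholder from the question
uri_to_check = 'http://www.amazongroupco.org'

params = {
    'key': API_KEY,
    'uri': uri_to_check,                  # requests encodes this for us
    'threatTypes': ['MALWARE', 'SOCIAL_ENGINEERING', 'UNWANTED_SOFTWARE'],
}
resp = requests.get('https://webrisk.googleapis.com/v1beta1/uris:search', params=params)
print(resp.status_code, resp.json())      # {} still means "not on any requested list"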
I have a website that requires login and then exposes a public search API. It is returning error 405. How do I pass parameters in requests.post? The following code isn't working:
import requests

response = requests.post(
    'https://www.somewebsite.com/api/public/',
    auth=('asct1#gmail.com', 'Cons')
)
As you can see here, 405 means that the HTTP method you chose (in your case .get(...)) is not allowed on that endpoint.
https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/405
You also wrote that you should be making a POST request, so give requests.post(...) a try!
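A rough sketch of what that could look like; the endpoint, field names, and credentials are placeholders, since the question doesn't show which parameters the API actually expects:

import requests

# Hypothetical example: send the search parameters in the POST body.
resp = requests.post(
    'https://www.somewebsite.com/api/public/',
    auth=('user@example.com', 'password'),   # HTTP basic auth, if the API uses it
    json={'query': 'search term'},           # or data={...} for a form-encoded body
)
print(resp.status_code, resp.text)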
Happy Coding, T!
Good morning,
I want to query households (my first query, and generally my first experience with the Sonos API) and have authenticated successfully. I got an access token and query the Control API like this:
headers={"Content-Type" : "application/json",
"Authorization" : "Bearer " + token["access_token"]}
resp = re.get('http://api.ws.sonos.com/control/api/v1/househoulds', headers=headers)
It returns a response with error code "503: Service Unavailable":
Service Unavailable
Service Unavailable - Zero size object
The server is temporarily unable to service your request. Please try again
later.
Reference XXXXX
(I cut out the reference because I am not sure whether it contains credentials.) I remember that when I intentionally changed my access token to a wrong one yesterday, I got an error back saying I was not authorized. But now, when I change it to an invalid one, I still get this same page back (503: Service Unavailable).
Does anyone have the same problem? Might it be some security mechanism because I authorized many times in a short period, or is the Control API just currently down? I tried yesterday and today and don't see a blog post announcing downtime.
I see two issues with the code snippet you provided:
Issue 1: Your API URL has a typo. You used "househoulds" instead of
"households".
Issue 2: Your URL needs to use https://, not http://
If you fix those two issues and are indeed using a valid access token, your request should work.
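For reference, a sketch of the request with both fixes applied; it assumes token is the dict from the question and uses requests directly:

import requests

# Both fixes from the answer: https:// and the corrected "households" spelling.
headers = {"Content-Type": "application/json",
           "Authorization": "Bearer " + token["access_token"]}
resp = requests.get('https://api.ws.sonos.com/control/api/v1/households', headers=headers)
print(resp.status_code, resp.json())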
UPDATE: Solved. I was indeed making a basic mistake, among other things. My usage of the session.headers.update method was incorrect; instead of session.headers.update = {foo: bar}, I needed to call session.headers.update({foo: bar}). Additionally, I changed the following section of code:
From this:
payload = urllib.parse.urlencode({
    "grant_type": "urn:ietf:params:oauth:grant-type:jwt-bearer",
    "assertion": str(token)
})
To this:
payload = "grant_type=urn%3Aietf%3Aparams%3Aoauth%3Agrant-type%3Ajwt-bearer&assertion=" + token.decode("utf-8")
The code now works as intended.
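Putting the two fixes together, the helper ends up looking roughly like this (a sketch based on the description above; createJWT() and the v4 token endpoint come from the original question below):

import requests

def gAuthenticate(session):
    token = createJWT()  # assumed helper from the original question
    # Call headers.update(...) instead of assigning to it.
    session.headers.update({"Content-Type": "application/x-www-form-urlencoded"})
    # Manually encoded form body, as described in the update above.
    payload = ("grant_type=urn%3Aietf%3Aparams%3Aoauth%3Agrant-type%3Ajwt-bearer"
               "&assertion=" + token.decode("utf-8"))
    response = session.post("https://www.googleapis.com/oauth2/v4/token", data=payload)
    session.headers.update({"Authorization": "Bearer " + response.json()["access_token"]})
    return session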
Original question below
I've seen several hits on SO and Google about this problem; none of them have helped, although I've certainly made sure to double-check my code to make sure I'm not guilty of the same problems they detail. The problems people tend to have involve passing the POST data as parameters or POSTing to the wrong URL, which I'm not doing, as far as I can tell. Additionally, most of the hits I've found involve 3-legged OAuth2 involving users; I've found comparatively few hits pertaining to service accounts and JWTs, which differ enough from the user flow that I'm concerned about how relevant they are to my problem.
I'm trying to get an access token from the Google Authentication server for a service account. I've generated my JWT and now want to POST to the server to receive back my access token. I've set the headers according to the documentation described here, under "Making the access token request," and as far as I can tell, my request is up to spec, but Google responds back with a 400 response, and the following JSON:
{'error': 'invalid_request', 'error_description': 'Required parameter is missing: grant_type'}
Here's the code causing the problem:
# Returns the session, now with the Host and Authorization headers set.
def gAuthenticate(session):
    token = createJWT()
    session.headers.update = {
        "Host": "www.googleapis.com",
        "Content-Type": "application/x-www-form-urlencoded"
    }
    payload = urllib.parse.urlencode({
        "grant_type": "urn:ietf:params:oauth:grant-type:jwt-bearer",
        "assertion": str(token)
    })
    response = session.post("https://www.googleapis.com/oauth2/v4/token", data=payload)
    session.headers.update = {"Authorization": "Bearer " + response.json()["access_token"]}
    return session
I'm having a lot of strange issues with this code. First of all, if I don't urllib.parse.urlencode my dictionary (i.e. simply payload = {dictionary}), I get only a Bad Request / 'invalid_request' error, which I assume from the less specific error message means that this is less acceptable than what I'm currently doing. Why do I have to do this? Isn't Requests supposed to encode my data for me? I've never had this problem when POSTing with Requests before.
Second, examining the prepared request before it's sent reveals that my headers aren't being correctly set, despite the header update. Neither of the headers I've added to the request are being transmitted.
I've examined the request body and it looks to be identical (except of course the content of the JWT) to the one that Google provides as an example in the documentation.
All of this leads me to believe that I'm making a very basic error somewhere, but I haven't had any success finding it. What am I doing wrong here? Links to any helpful documentation would be extremely appreciated; thanks for your time and attention.
Try "grant_type": "authorization_code". And add grant type as header.
Also, check this link - Accessing Google oAuth giving invalid grant_type (Bad request)
We need to implement a bot which posts new sections on Wikipedia Talk pages.
As a matter of efficiency, we prefer to use Python HTTP POST requests against the MediaWiki API rather than the available MediaWiki libraries.
We have not requested approval for the bot; we are just trying to implement a trial version to test it on our own Talk pages.
For this purpose, I went through the following steps:
1- As discussed at https://en.wikipedia.org/wiki/Wikipedia:Creating_a_bot:
Create an account for your bot. Click here when logged in to create the account, linking it to yours. (If you do not create the bot account while logged in, it is likely to be blocked as a possible sockpuppet or unauthorised bot until you verify ownership)
Create a user page for your bot. Your bot's edits must not be made under your own account. Your bot will need its own account with its own username and password.
So, I logged in to my own Wikipedia account, and created a new account (for the bot).
2- As discussed at the "API:Login" page: (Sorry, because I have less than 10 reputation, I am not able to add more than 2 links.)
Logging in through the API requires two requests. For the first request, I wrote the following code in Python:
import json
import requests

def logInRequestToWikipedia():
    # Add required parameters to the request.
    request = {'action': 'login'}
    request['lgname'] = 'BotName'
    request['lgpassword'] = '*************'
    url = 'https://en.wikipedia.org/w/api.php'
    headers = {'content-type': 'application/x-www-form-urlencoded'}
    r = requests.post(url, data=json.dumps(request), headers=headers)
The response starts with an error as follows:
<error code="help" info="" xml:space="preserve">
And continues with the API documentation.
3- As discussed at "API:Edit_-_Create%26Edit_pages" page:
Note: In this example, all parameters are passed in a GET request just for the sake of simplicity. However, action=edit requires POST requests; GET requests will cause an error. Do not forget to set the Content-Type header of your request to application/x-www-form-urlencoded. The token that you received is terminated with +\, this needs to be urlencoded (so it will end with %2B%5C) before it is passed back.
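For what it's worth, URL-encoding a token string that ends in +\ does produce the %2B%5C suffix mentioned above; a quick check with a made-up token value:

import urllib.parse

# Hypothetical token value; only the trailing "+\" matters here.
token = 'd41d8cd98f00b204e9800998ecf8427e+\\'
print(urllib.parse.quote_plus(token))  # ends with %2B%5C after encoding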
I added each of the following parameters separately, and then both together, to the request data and tried all three cases, but the response is the same.
request['lgtoken'] = '%2B%5C'
request['Content-Type'] = 'application/x-www-form-urlencoded'
4- I also tried each of the following in my request data, but it returns the same response:
request['format'] = 'json'
request['format'] = 'xml'
5- Moreover, I found the following instruction on the "User-Agent_policy" page:
User agents (browsers or scripts) that do not send a User-Agent header may now encounter an error message like this:
Scripts should use an informative User-Agent string with contact information, or they may be IP-blocked without notice.
User agents that send a User-Agent header that is blacklisted (for example, any User-Agent string that begins with "lwp", whether it is informative or not) may encounter a less helpful error message (lie) like this:
Our servers are currently experiencing a technical problem. This is probably temporary and should be fixed soon. Please try again in a few minutes.
This change is most likely to affect scripts (bots) accessing Wikimedia websites such as Wikipedia automatically, via api.php or otherwise, and command line programs.[3] If you run a bot, please send a User-Agent header identifying the bot and supplying some way of contacting you, e.g.:
User-Agent: MyCoolTool/1.1 (http://example.com/MyCoolTool/; MyCoolTool#example.com) BasedOnSuperLib/1.4
Do not copy a browser's user agent for your bot, as bot-like behavior with a browser's user agent will be assumed malicious.[4] For more information, please refer to the MediaWiki API Documentation
That's why I also tried my script with the following parameter, but the error response did not change:
request['User-Agent'] = "MyCoolTool/1.1 (http://example.com/MyCoolTool/; MyCoolTool#example.com) BasedOnSuperLib/1.4"
Do you think the problem could be related to the fact that we have not requested approval for the bot yet? We are just trying to implement a trial version to test the bot on our own Talk pages, and will apply for approval after making sure everything works.
I'm pretty sure the problem is this line:
request['lgtoken'] = '%2B%5C'
The Login API you linked to doesn't include an lgtoken on the initial login attempt; it's only sent on the second ("Confirm token") step, using the token value from the NeedToken response.
And +\ doesn't look like a valid token.
So it's not surprising that you're getting an error.
Meanwhile, when I test this with my Wikipedia account, I get an error if I include that line, and success if I don't, which validates my suspicion that this is the problem.
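For completeness, a rough sketch of the two-step flow described above, with placeholder credentials and the User-Agent sent as a header; the exact response fields may vary with the MediaWiki version:

import requests

URL = 'https://en.wikipedia.org/w/api.php'
HEADERS = {'User-Agent': 'MyCoolTool/1.1 (http://example.com/MyCoolTool/)'}

session = requests.Session()  # keeps the login cookies between the two requests

# Step 1: no lgtoken yet; the API should answer with result "NeedToken" and a token.
first = session.post(URL, headers=HEADERS, data={
    'action': 'login', 'lgname': 'BotName', 'lgpassword': '***', 'format': 'json',
}).json()

# Step 2: repeat the request, echoing back the token from step 1.
second = session.post(URL, headers=HEADERS, data={
    'action': 'login', 'lgname': 'BotName', 'lgpassword': '***',
    'lgtoken': first['login']['token'], 'format': 'json',
}).json()
print(second['login']['result'])  # "Success" if the credentials are accepted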
I'm trying to query blackle.com for searches, but I get a 403 HTTP error. Can somebody point out what is wrong here?
#!/usr/bin/env python
import urllib2

ss = raw_input('Please enter search string: ')
url = "http://www.google.com/cse?cx=013269018370076798483:gg7jrrhpsy4&cof=FORID:1&q=" + ss + "&sa=Search"
response = urllib2.urlopen(url)
html = response.read()
print html
HTTP 403 means "forbidden" (see here for a good explanation): google.com doesn't want to let you access that resource. Since it does let browsers access it, presumably it's identifying you as a robot (automated code, not interactive user browser), through user agent checking and the like. Have you checked robots.txt to see if you SHOULD be allowed to access such URLs? In http://www.google.com/robots.txt I see one line:
Disallow: /cse?
which means robots are NOT allowed there. See here for explanations of robots.txt, and here for the standard Python library module robotparser, which makes it easy for a Python program to understand a robots.txt file.
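A minimal sketch of that check (using urllib.robotparser, the Python 3 name of the module; on Python 2 it is robotparser):

import urllib.robotparser

# Parse Google's robots.txt and ask whether a generic robot may fetch /cse?...
rp = urllib.robotparser.RobotFileParser()
rp.set_url("http://www.google.com/robots.txt")
rp.read()
print(rp.can_fetch("*", "http://www.google.com/cse?q=test"))  # expected False: disallowed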
You could try fooling google's detection of "robots" vs humans, e.g. by falsifying your user agent header and so on, and maybe you'd get away with it for a while, but do you really want to deliberately violate the terms of use and get into a fight about it with google...?