Python: facebook GraphAPI request function

I saw some code that looks like this:
graph = facebook.GraphAPI(User_Access_Token)
graph.request("search", {'q' : 'social web', 'type' : 'page'})
This seems to fetch all the data containing the keyword 'social web', but I don't understand why such a request works.
I read the documentation from help(graph.request), which says:
request(self, path, args=None, post_args=None, files=None, method=None) method of facebook.GraphAPI instance
Fetches the given path in the Graph API.
It doesn't mention "search" at all.

I had the same question. I assume you installed the library with pip install facebook-sdk, and that your source is Mining the Social Web: Analyzing Data from Facebook, Twitter, LinkedIn, and Other Social Media Sites (Feb 11, 2011) by Matthew A. Russell. The facebook-sdk version is facebook_sdk-2.0.0. I am not sure whether its versioning matches that of Facebook's Graph API, but if it does, that API version is no longer supported. I downloaded the library source, and in /facebook-sdk-2.0.0/facebook/__init__.py you can see this block of code:
def request(
        self, path, args=None, post_args=None, files=None, method=None):
    """Fetches the given path in the Graph API.
    We translate args to a valid query string. If post_args is
    given, we send a POST request to the given path with the given
    arguments.
    """
    args = args or {}

    if post_args is not None:
        method = "POST"

    # Add `access_token` to post_args or args if it has not already been
    # included.
    if self.access_token:
        # If post_args exists, we assume that args either does not exists
        # or it does not need `access_token`.
        if post_args and "access_token" not in post_args:
            post_args["access_token"] = self.access_token
        elif "access_token" not in args:
            args["access_token"] = self.access_token

    try:
        response = requests.request(method or "GET",
                                    FACEBOOK_GRAPH_URL + path,
                                    timeout=self.timeout,
                                    params=args,
                                    data=post_args,
                                    proxies=self.proxies,
                                    files=files)
    except requests.HTTPError as e:
        response = json.loads(e.read())
        raise GraphAPIError(response)

    headers = response.headers
    if 'json' in headers['content-type']:
        result = response.json()
    elif 'image/' in headers['content-type']:
        mimetype = headers['content-type']
        result = {"data": response.content,
                  "mime-type": mimetype,
                  "url": response.url}
    elif "access_token" in parse_qs(response.text):
        query_str = parse_qs(response.text)
        if "access_token" in query_str:
            result = {"access_token": query_str["access_token"][0]}
            if "expires" in query_str:
                result["expires"] = query_str["expires"][0]
        else:
            raise GraphAPIError(response.json())
    else:
        raise GraphAPIError('Maintype was not text, image, or querystring')

    if result and isinstance(result, dict) and result.get("error"):
        raise GraphAPIError(result)
    return result
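In short, request() just appends path to FACEBOOK_GRAPH_URL and sends args as the query string, so "search" is not something the library defines; it is simply the Graph API path being requested. I hope this helps.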
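To make the URL construction in request() concrete, here is a minimal sketch that mimics only the GET case using the standard library. The helper name build_graph_url and the base URL value are assumptions for illustration (the library keeps its own constant, FACEBOOK_GRAPH_URL), and no network call is made.

```python
from urllib.parse import urlencode

# Assumed base URL for illustration; the library uses its own
# module-level constant FACEBOOK_GRAPH_URL.
GRAPH_URL = "https://graph.facebook.com/"


def build_graph_url(path, args=None, access_token=None):
    """Mimic how request() combines path and args into a GET URL."""
    params = dict(args or {})
    # Like request(), add the token only if the caller did not pass one.
    if access_token and "access_token" not in params:
        params["access_token"] = access_token
    query = urlencode(params)
    return GRAPH_URL + path + ("?" + query if query else "")


print(build_graph_url("search", {"q": "social web", "type": "page"}))
```

Any other path string would be appended in the same way, which is why the docstring only talks about "the given path".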

Related

Get license link from Github with specific commit hash

I have a table (a pandas DataFrame) of mostly GitHub repos, for which I need to automatically extract the link to the LICENSE file. The link must not simply point to /blob/master/; it has to point to a specific commit, because the file on master might be updated at some point. I assembled a Python script that does this through the GitHub API, but with the API I can only retrieve the link with the master tag.
I.e. instead of
https://github.com/jsdom/abab/blob/master/LICENSE.md
I want
https://github.com/jsdom/abab/blob/8abc2aa5b1378e59d61dee1face7341a155d5805/LICENSE.md
Any idea if there is a way to automatically get the link to the latest commit for a file, in this case the LICENSE file?
This is the code I have written so far:
def githubcrawl(repo_url, session, headers):
    parts = repo_url.split("/")[3:]
    url_tmpl = "https://api.github.com/repos/{}/license"
    url = url_tmpl.format("/".join(parts))
    try:
        response = session.get(url, headers=headers)
        if response.status_code in [404]:
            return f"404: {repo_url}"
        else:
            data = json.loads(response.text)
            return data["html_url"]  # Returns the html URL to LICENSE file
    except requests.exceptions.RequestException as e:
        # requests raises its own exception types, not urllib.error.HTTPError
        print(repo_url, "-", e)
        return f"http_error: {repo_url}"

token = "mytoken"  # Token for github authentication to get more requests per hour
headers = {"Authorization": "token %s" % token}
session = requests.Session()
lizlinks = []  # List to store the links of the LICENSE files in
# iterate over DataFrame of applications/deps
for idx, row in df.iterrows():
    # if idx < 5:
    if isinstance(row["Homepage"], str):
        repo_url = re.sub(r"\#readme", "", row["Homepage"])
        response = session.get(repo_url, headers=headers)
        repo_url = response.url  # Some URLs are just redirects, so I get the actual repo url here
        if "github" in repo_url and len(repo_url.split("/")) >= 3:
            link = githubcrawl(repo_url, session, headers)
            print(link)
            lizlinks.append(link)
        else:
            print(row["Homepage"], "Not a github Repo")
            lizlinks.append("Not a github repo")
    else:
        print(row["Homepage"], "Not a github Repo")
        lizlinks.append("Not a github repo")
Bonus question: Would parallelizing this task work with the GitHub API? That is, could I send multiple requests at once without being locked out (DoS), or is the for loop a good way to avoid this? It takes quite a while to go through the roughly 1,000 repos in that list.
OK, I found a way to get the unique SHA hash of the current commit. I believe that should always link to the license file as of that point in time.
Using the Python git library (GitPython), I simply run the git ls-remote command and return the HEAD SHA:
def lsremote_HEAD(url):
    g = git.cmd.Git()
    HEAD_sha = g.ls_remote(url).split()[0]
    return HEAD_sha
I can then replace "master", "main", or whatever the branch name is in my githubcrawl function:
token = "token_string"
headers = {"Authorization": "token %s" % token}
session = requests.Session()

def githubcrawl(repo_url, session, headers):
    parts = repo_url.split("/")[3:]
    api_url_tmpl = "https://api.github.com/repos/{}/license"
    api_url = api_url_tmpl.format("/".join(parts))
    try:
        print(api_url)
        response = session.get(api_url, headers=headers)
        if response.status_code in [404]:
            return f"404: {repo_url}"
        else:
            data = json.loads(response.text)
            # Swap the branch segment after /blob/ for the HEAD commit SHA
            commit_link = re.sub(r"/blob/.+?/", rf"/blob/{lsremote_HEAD(repo_url)}/", data["html_url"])
            return commit_link
    except requests.exceptions.RequestException as e:
        # requests raises its own exception types, not urllib.error.HTTPError
        print(repo_url, "-", e)
        return f"http_error: {repo_url}"
Maybe this helps someone, so I'm posting this answer here.
This answer uses the following libraries:
import re
import git
import urllib
import json
import requests
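The URL rewrite step from the answer can be isolated and checked without any network access. The following is a small sketch using the two example URLs from the question; the function name pin_blob_url is hypothetical, and the pattern assumes the branch name is a single path segment (a branch such as "release/1.0" would need different handling).

```python
import re


def pin_blob_url(html_url, sha):
    """Replace the branch segment of a GitHub blob URL with a commit SHA."""
    # [^/]+ matches exactly one path segment after /blob/.
    return re.sub(r"/blob/[^/]+/", f"/blob/{sha}/", html_url, count=1)


url = "https://github.com/jsdom/abab/blob/master/LICENSE.md"
sha = "8abc2aa5b1378e59d61dee1face7341a155d5805"
print(pin_blob_url(url, sha))
```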

How to call a normal python function along with header information without using requests

Details of application:
UI: Angular
Backend: Python Flask (using Swagger)
Database: MongoDB
We have a few backend Python methods that are called from the UI to perform CRUD operations on the database.
Each method has a decorator that checks the header information to ensure that only an authorized user can call it.
When these APIs are called from the UI, the authorization decorator causes no problems and a proper response is returned to the UI (because we also pass the header information with the request).
But now we are writing unit test cases for the APIs. Each test case calls the backend method directly, and because of the authorization decorator I get errors and cannot proceed. How can I handle this issue?
backend_api.py
--------------
from bson import ObjectId
from flask import jsonify
from commonlib.auth import require_auth

@require_auth
def get_records(record_id):
    try:
        record_details = records_coll.find_one({"_id": ObjectId(str(record_id))})
        if record_details is not None:
            resp = jsonify({"msg": "Found Record", "data": str(record_details)})
            resp.status_code = 200
            return resp
        else:
            resp = jsonify({"msg": "Record not found"})
            resp.status_code = 404
            return resp
    except Exception as ex:
        # str(ex) keeps the payload JSON-serializable
        resp = jsonify({"msg": "Exception Occurred", 'Exception Details': str(ex)})
        resp.status_code = 500
        return resp
commonlib/auth.py
-----------------
### some lines of code here
def require_auth(func):
    """
    Decorator that can be added to a function to check for authorization
    """
    def wrapper(*args, **kwargs):
        print(*args,**kwargs)
        username = get_username()
        security_log = {
            'loginId': username,
            'securityProtocol': _get_auth_type(),
        }
        try:
            if username is None:
                raise SecurityException('Authorization header or cookie not found')
            if not is_auth_valid():
                raise SecurityException('Authorization header or cookie is invalid')
        except SecurityException as ex:
            log_security(result='DENIED', message=str(ex))
            unauthorized(str(ex))
        return func(*args, **kwargs)
    return wrapper
test_backend_api.py
-------------------
class TestBackendApi(unittest.TestCase):
    ### some lines of code here
    @mock.patch("pymongo.collection.Collection.find_one", side_effect=[projects_json])
    def test_get_records(self, mock_call):
        from backend_api import get_records
        ret_resp = get_records('61729c18afe7a83268c6c9b8')
        final_response = ret_resp.get_json()
        message1 = "return response status code is not 200"
        self.assertEqual(ret_resp.status_code, 200, message1)
Error snippet :
---------------
E RuntimeError: Working outside of request context.
E
E This typically means that you attempted to use functionality that needed
E an active HTTP request. Consult the documentation on testing for
E information about how to avoid this problem.
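One possible way around the error, sketched here with the standard library only, is to patch the helpers that the decorator calls so that the test never touches the request. Everything in this block is a stand-in: the auth namespace plays the role of commonlib.auth, its helpers merely imitate the "outside of request context" failure, and get_record is a simplified substitute for get_records. It is an illustration of the patching idea, not the project's actual code.

```python
import functools
import types
from unittest import mock


class SecurityException(Exception):
    """Stand-in for the exception raised by the real auth module."""


def _no_request_context():
    # Imitates what happens when request data is read outside a request.
    raise RuntimeError("Working outside of request context.")


# Hypothetical stand-in for commonlib.auth; the real helpers read the request.
auth = types.SimpleNamespace(
    get_username=_no_request_context,
    is_auth_valid=_no_request_context,
)


def require_auth(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Helpers are looked up at call time, so a test can patch them.
        if auth.get_username() is None or not auth.is_auth_valid():
            raise SecurityException("not authorized")
        return func(*args, **kwargs)
    return wrapper


@require_auth
def get_record(record_id):
    return {"msg": "Found Record", "id": record_id}


def call_with_fake_auth(record_id):
    """Call the decorated function with the auth helpers patched out."""
    with mock.patch.object(auth, "get_username", return_value="tester"), \
            mock.patch.object(auth, "is_auth_valid", return_value=True):
        return get_record(record_id)


print(call_with_fake_auth("61729c18afe7a83268c6c9b8"))
```

If the test should exercise the real header check instead of bypassing it, Flask's app.test_request_context(headers={...}) can be used as a context manager around the call, which provides the request that the decorator expects.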

Rename collection with pyArango

I'm trying to rename an ArangoDB collection using pyArango. This is what I have so far:
connection = pyArango.Connection('http://random-address', username='random-username', password='random-password')
test_db = Database(connection, 'test-db')
collection = test_db["new"]
collection.action("PUT", "rename", name="newname")
The code fails in line 4:
{'error': True, 'code': 400, 'errorNum': 1208, 'errorMessage': 'name must be non-empty'}
I'm probably using the action method incorrectly but the documentation does not provide any examples. Anybody got an idea?
A JSON object {"name": "newname"} needs to be passed as the request body. The new name cannot be passed as a URL query parameter. The problem is the implementation of collection.action():
def action(self, method, action, **params) :
    "a generic fct for interacting everything that doesn't have an assigned fct"
    fct = getattr(self.connection.session, method.lower())
    r = fct(self.URL + "/" + action, params = params)
    return r.json()
The keyword arguments end up in a dict called params. This dict is passed to the request function fct() as the named parameter params, which converts it to URL query parameters, e.g. ?name=newname. The server's HTTP API does not accept the new name in that form.
Unfortunately, there is no way to pass a payload via action(). You can write some custom code, however:
from pyArango.connection import *
connection = Connection('http://localhost:8529', username='root', password='')
try:
    connection.createDatabase('test-db')
except CreationError:
    pass
test_db = Database(connection, 'test-db')
try:
    test_db.createCollection(name='new')
except CreationError:
    pass
collection = test_db['new']
r = connection.session.put(collection.URL + '/rename', data='{"name":"newname"}')
print(r.text)
collection = test_db['newname']
You can also use a dict for the payload and transform it to JSON if you want:
import json
...put(..., data=json.dumps({"name": "newname"}))
I've fixed it like this:
import json
import requests
from requests.auth import HTTPBasicAuth

def rename_collection(arango_uri, username, password, database, collection, new_name):
    url = '{}/_db/{}/_api/collection/{}/rename'.format(arango_uri, database, collection)
    params = {"name": new_name}
    response = requests.put(url, data=json.dumps(params), auth=HTTPBasicAuth(username, password))
    return response
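To see the shape of the request that the answers above describe, the sketch below builds (but does not send) the PUT request with the standard library, so the method, URL, and JSON body can be inspected. The URL pattern is taken from the rename_collection answer; the function name build_rename_request and the localhost address are assumptions for illustration, and authentication is left out.

```python
import json
import urllib.request


def build_rename_request(arango_uri, database, collection, new_name):
    """Build a PUT request that carries the new name in the JSON body."""
    url = "{}/_db/{}/_api/collection/{}/rename".format(arango_uri, database, collection)
    body = json.dumps({"name": new_name}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,  # payload goes in the body, not in the query string
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


req = build_rename_request("http://localhost:8529", "test-db", "new", "newname")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```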

HTTP Error 400 Syntax Error

This code is supposed to use the Yelp API to retrieve information about restaurants. Every time I run it, I get an HTTP 400 error, which I think should mean a syntax error in the request. Reviewing the code myself, I cannot find the mistake. I've searched Stack Overflow for a related question and haven't found one.
The return Error message is: Encountered HTTP error 400. Abort program.
# -*- coding: utf-8 -*-
"""
Yelp API v2.0 code sample.
This program demonstrates the capability of the Yelp API version 2.0
by using the Search API to query for businesses by a search term and location,
and the Business API to query additional information about the top result
from the search query.
Please refer to http://www.yelp.com/developers/documentation for the API documentation.
This program requires the Python oauth2 library, which you can install via:
`pip install -r requirements.txt`.
Sample usage of the program:
`python sample.py --term="bars" --location="San Francisco, CA"`
"""
import argparse
import json
import pprint
import sys
import urllib
import urllib2
import oauth2
API_HOST = 'api.yelp.com'
DEFAULT_TERM = 'restaurants'
DEFAULT_LOCATION = 'Ottawa'
SEARCH_LIMIT = '75'
SEARCH_PATH = '/v2/search/'
BUSINESS_PATH = '/v2/business/'
# OAuth credential placeholders that must be filled in by users.
CONSUMER_KEY =########
CONSUMER_SECRET =######
TOKEN =#######
TOKEN_SECRET =#######
def request(host, path, url_params=None):
    """Prepares OAuth authentication and sends the request to the API.
    Args:
        host (str): The domain host of the API.
        path (str): The path of the API after the domain.
        url_params (dict): An optional set of query parameters in the request.
    Returns:
        dict: The JSON response from the request.
    Raises:
        urllib2.HTTPError: An error occurs from the HTTP request.
    """
    url_params = url_params or {}
    encoded_params = urllib.urlencode(url_params)
    url = 'http://{0}{1}?{2}'.format(host, path, encoded_params)
    consumer = oauth2.Consumer(CONSUMER_KEY, CONSUMER_SECRET)
    oauth_request = oauth2.Request('GET', url, {})
    oauth_request.update(
        {
            'oauth_nonce': oauth2.generate_nonce(),
            'oauth_timestamp': oauth2.generate_timestamp(),
            'oauth_token': TOKEN,
            'oauth_consumer_key': CONSUMER_KEY
        }
    )
    token = oauth2.Token(TOKEN, TOKEN_SECRET)
    oauth_request.sign_request(oauth2.SignatureMethod_HMAC_SHA1(), consumer, token)
    signed_url = oauth_request.to_url()
    print 'Querying {0} ...'.format(url)
    conn = urllib2.urlopen(signed_url, None)
    try:
        response = json.loads(conn.read())
    finally:
        conn.close()
    return response
def search(term, location):
    """Query the Search API by a search term and location.
    Args:
        term (str): The search term passed to the API.
        location (str): The search location passed to the API.
    Returns:
        dict: The JSON response from the request.
    """
    url_params = {
        'term': term,
        'location': location,
        'limit': SEARCH_LIMIT
    }
    return request(API_HOST, SEARCH_PATH, url_params=url_params)
def get_business(business_id):
    """Query the Business API by a business ID.
    Args:
        business_id (str): The ID of the business to query.
    Returns:
        dict: The JSON response from the request.
    """
    business_path = BUSINESS_PATH + business_id
    return request(API_HOST, business_path)
def query_api(term, location):
    """Queries the API by the input values from the user.
    Args:
        term (str): The search term to query.
        location (str): The location of the business to query.
    """
    response = search(term, location)
    businesses = response.get('businesses')
    if not businesses:
        print 'No businesses for {0} in {1} found.'.format(term, location)
        return
    business_id = businesses[0]['id']
    print '{0} businesses found, querying business info for the top result "{1}" ...'.format(
        len(businesses),
        business_id
    )
    response = get_business(business_id)
    print 'Result for business "{0}" found:'.format(business_id)
    pprint.pprint(response, indent=2)
def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('-q', '--term', dest='term', default=DEFAULT_TERM, type=str, help='Search term (default: %(default)s)')
    parser.add_argument('-l', '--location', dest='location', default=DEFAULT_LOCATION, type=str, help='Search location (default: %(default)s)')
    input_values = parser.parse_args()
    try:
        query_api(input_values.term, input_values.location)
    except urllib2.HTTPError as error:
        sys.exit('Encountered HTTP error {0}. Abort program.'.format(error.code))
if __name__ == '__main__':
    main()
Yelp API v2 uses OAuth 1.0a, but you are using oauth2.
Your limit parameter is set to a very high value. Try something like 10 to begin with.
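When tracking down a 400 like this, it can help to print the response body carried by the HTTPError rather than only error.code, since the server may say which parameter it rejected. The sketch below uses Python 3's urllib.error (the question's script is Python 2, where the class lives in urllib2); the function name describe_http_error and the fake error payload are made up for illustration, and nothing is sent over the network.

```python
import io
import json
import urllib.error


def describe_http_error(error):
    """Return (status code, decoded body) for a urllib HTTPError."""
    raw = error.read().decode("utf-8", errors="replace")
    try:
        body = json.loads(raw)
    except ValueError:
        body = raw  # Not JSON; keep the raw text.
    return error.code, body


# A locally constructed error with a made-up payload, for demonstration only.
fake = urllib.error.HTTPError(
    "http://example.invalid/v2/search/", 400, "Bad Request", {},
    io.BytesIO(b'{"error": {"text": "example message"}}'),
)
print(describe_http_error(fake))
```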

learning python with yelp api

I'm just beginning to learn Python/Django. I've been teaching myself PHP and now I would like to learn Python. I'm having issues integrating Yelp's API. I'm getting this error:
Values instance has no attribute 'q'
I have this code:
def search(request):
    parser = optparse.OptionParser()
    parser.add_option('-c', '--consumer_key', dest='my_consumer_key_goes_here', help='OAuth consumer key (REQUIRED)')
    parser.add_option('-s', '--consumer_secret', dest='my_consumer_secret_goes_here', help='OAuth consumer secret (REQUIRED)')
    parser.add_option('-t', '--token', dest='my_token_goes_here', help='OAuth token (REQUIRED)')
    parser.add_option('-e', '--token_secret', dest='my_token_secret_goes_here', help='OAuth token secret (REQUIRED)')
    parser.add_option('-a', '--host', dest='host', help='Host', default='api.yelp.com')
    options, args = parser.parse_args()
    # search stuff?
    if 'q' in request.GET and request.GET['q']:
        q = request.GET['q']
        parser.add_option('-q', '--term', dest=q, help='Search term')
        url_params = {}
        if options.q:
            url_params['term'] = options.q
        # Sign the URL
        consumer = oauth2.Consumer(consumer_key, consumer_secret)
        oauth_request = oauth2.Request('GET', url, {})
        oauth_request.update({'oauth_nonce': oauth2.generate_nonce(),
                              'oauth_timestamp': oauth2.generate_timestamp(),
                              'oauth_token': token,
                              'oauth_consumer_key': consumer_key})
        token = oauth2.Token(token, token_secret)
        oauth_request.sign_request(oauth2.SignatureMethod_HMAC_SHA1(), consumer, token)
        signed_url = oauth_request.to_url()
    else:
        message = 'You submitted an empty form.'
        #return HttpResponse(message)
    print 'Signed URL: %s\n' % (signed_url,)
    # Connect
    try:
        conn = urllib2.urlopen(signed_url, None)
        try:
            response = json.loads(conn.read())
        finally:
            conn.close()
    except urllib2.HTTPError, error:
        response = json.loads(error.read())
    return response
response = request(options.host, '/v2/search', url_params, options.consumer_key, options.consumer_secret, options.token, options.token_secret)
print json.dumps(response, sort_keys=True, indent=2)
As I'm still new to Python, I also wonder whether I'm adding the API consumer_key, secret, etc. in the right places. Is this code even set up correctly? Most of it came from the original Yelp API script. I guess the issue lies in the if options.q: area, but I'm not sure I'm doing this correctly.
Thanks
I would say that calling parser.parse_args() before you declared your q option is probably your immediate problem.
But unless I am missing something about what you are trying to do, Django settings [1] are what you need to use for those first 5 options (don't use optparse at all). And then I don't know what you are trying to do with the rest of the function.
[1] https://docs.djangoproject.com/en/1.4/topics/settings/
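The ordering point in the answer can be shown in isolation: options have to be declared before parse_args() runs, and dest is the attribute name on the resulting object, not the value to search for. This is a small standalone sketch with optparse; the function names are hypothetical, and it deliberately leaves out Django and the Yelp request itself.

```python
import optparse


def build_parser():
    parser = optparse.OptionParser()
    # dest names the attribute that will hold the value after parsing.
    parser.add_option('-q', '--term', dest='term', help='Search term')
    return parser


def parse_term(argv):
    # The option is declared in build_parser() before parse_args() is called.
    options, _args = build_parser().parse_args(argv)
    return options.term


print(parse_term(['-q', 'pizza']))
```

With this setup the parsed object has a term attribute but no q attribute, which matches the kind of error shown in the question when options.q is accessed.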
