HTTP403Forbidden when running Python's "Pattern" module

Any time I try to run the "sort" function from Python's "pattern" module, I get the following error:
File "/usr/local/lib/python2.7/dist-packages/pattern/web/init.py", line 391, in open
if e.code == 403: raise HTTP403Forbidden(src=e, url=url)
pattern.web.HTTP403Forbidden
It's strange because my code used to run fine. Here's all I'm doing:
from pattern.web import sort
import json

search_terms = "chuck norris, mickey mouse"
context = "evil"

results = sort(
    terms=search_terms.split(","),
    context=context,
    prefix=True)

print json.dumps(results)
Has anyone run into this before?

You have exceeded the daily quota of Google searches. That quota is shared among all people who use pattern with the default license key.
You need to specify your own license key, like so:
results = sort(
    terms=search_terms.split(","),
    context=context,
    license="abcd1234999999",
    prefix=True)
References:
http://www.clips.ua.ac.be/pages/pattern-web#sort
http://www.clips.ua.ac.be/pages/pattern-web#services


Python Boto3 Get Parameters from SSM by Path using NextToken

I've been working with boto3 for a while to gather some values from the SSM Parameter Store. This is the code I use, which is very simple:
def get_raw_parameters_group_by_namespace(namespace_path):
    raw_params_response = None
    try:
        if not namespace_path:
            raise Exception('Namespace path should be specified to get data')
        raw_params_response = ssm_ps.get_parameters_by_path(Path=namespace_path)
    except Exception as e:
        raise Exception('An error occurred while trying to get parameters group: ' + str(e))
    return raw_params_response
I used to have around 7 to 10 parameters in SSM and that method worked fine. However, we recently needed to add some parameters, bringing the total to 14, so I tried setting the "MaxResults" property of the boto3 SSM method to 50:
ssm_ps.get_parameters_by_path(Path=namespace_path, MaxResults=50)
but I get the following:
"error": "An error occurred while trying to get parameters group: An error occurred (ValidationException) when calling the GetParametersByPath operation: 1 validation error detected: Value '50' at 'maxResults' failed to satisfy constraint: Member must have value less than or equal to 10."
After talking with the team, increasing the quota on the account is not an option, so I would like to know whether using the "NextToken" property would be a good approach.
I am not sure how it should be used, and I have searched for examples without finding anything useful. Does anyone know how to use NextToken, or have an example of how it is supposed to work?
I tried something like:
raw_params_response = ssm_ps.get_parameters_by_path(Path=namespace_path, NextToken='Token')
but I am not sure how this is meant to be used.
Thanks in advance.
I remember running into this at some point.
You want to use a paginator - https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/ssm.html#SSM.Paginator.GetParametersByPath
This is how I used it:
import boto3

client = boto3.client('ssm', region_name='eu-central-1')
paginator = client.get_paginator('get_parameters_by_path')
response_iterator = paginator.paginate(
    Path='/some/path'
)
parameters = []
for page in response_iterator:
    for entry in page['Parameters']:
        parameters.append(entry)
And you would get a list like [{"Name": "/some/path/param", "Value": "something"}] in parameters, with all the parameters under the path.
Edit: the response entries are much richer than just the Name and Value keys; check the paginator docs!
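If you'd rather handle pagination by hand, NextToken works like a cursor: pass the token from each response into the next call until no token comes back. A minimal sketch, assuming ssm_ps is the boto3 SSM client from the question (the helper name is just for illustration):
def get_all_parameters_by_namespace(namespace_path):
    # Collect every page; SSM caps each page at 10 results.
    parameters = []
    kwargs = {'Path': namespace_path}
    while True:
        response = ssm_ps.get_parameters_by_path(**kwargs)
        parameters.extend(response['Parameters'])
        token = response.get('NextToken')
        if not token:  # no more pages
            break
        kwargs['NextToken'] = token
    return parameters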
Let me suggest using this library (I'm the author): AWStanding
You can achieve this easily, without worrying about pagination:
import os
from awstanding.parameter_store import load_path
load_path('/stripe', '/spotify')
STRIPE_PRICE = os.environ.get('STRIPE_PRICE', 'fallback_value')
STRIPE_WEBHOOK = os.environ.get('STRIPE_WEBHOOK', 'fallback_value')
SPOTIFY_API_KEY = os.environ.get('SPOTIFY_API_KEY', 'fallback_value')
print(f'price: {STRIPE_PRICE}, webhook: {STRIPE_WEBHOOK}, spotify: {SPOTIFY_API_KEY}')
>>> price: price_1xxxxxxxxxxxxxxxxxxxxxxx, webhook: fallback_value, spotify: fallback_value

python-ldap3 is unable to add user to an existing LDAP group

I am able to successfully connect using ldap3 and retrieve my LDAP group members, as below:
from ldap3 import Server, Connection, ALL, SUBTREE
from ldap3.extend.microsoft.addMembersToGroups import ad_add_members_to_groups as addMembersToGroups

>>> conn = Connection(Server('ldaps://ldap.****.com:***', get_info=ALL), check_names=False, auto_bind=False, user="ANT\*****", password="******", authentication="NTLM")
>>> conn.open()
>>> conn.search('ou=Groups,o=****.com', '(&(cn=MY-LDAP-GROUP))', attributes=['cn', 'objectclass', 'memberuid'])
It returns True, and I can see the members by printing conn.entries.
The search says MY-LDAP-GROUP exists and returns True, but addMembersToGroups throws "LDAP group not found" when I try to add a user to the group, as below:
>>> addMembersToGroups(conn, ['myuser'], 'MY-LDAP-GROUP')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/****/anaconda3/lib/python3.7/site-packages/ldap3/extend/microsoft/addMembersToGroups.py", line 69, in ad_add_members_to_groups
raise LDAPInvalidDnError(group + ' not found')
ldap3.core.exceptions.LDAPInvalidDnError: MY-LDAP-GROUP not found
>>>
The search says MY-LDAP-GROUP exists and returns True
Returning True just means that the search succeeded. It doesn't mean that anything was found. Is there anything in conn.entries?
But I suspect your real problem is something different. If this is the source code for ad_add_members_to_groups, then it is expecting the distinguishedName of the group (notice the parameter name group_dn), but you're passing the cn (common name). For example, your code should be something like:
addMembersToGroups(conn, ['myuser'], 'CN=MY-LDAP-GROUP,OU=Groups,DC=example,DC=com')
If you don't know the DN, then ask for the distinguishedName attribute from the search.
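For example (a sketch against the same connection as above; ldap3 exposes each result's DN as entry_dn):
conn.search('ou=Groups,o=****.com', '(cn=MY-LDAP-GROUP)', attributes=['cn'])
group_dn = conn.entries[0].entry_dn  # the group's full distinguishedName
addMembersToGroups(conn, ['myuser'], group_dn)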
A word of warning: that code for ad_add_members_to_groups retrieves all the current members before adding the new member. You might run into performance problems if you're working with groups that have large membership because of that (e.g. if the group has 1000 members, it will load all 1000 before adding anyone). You don't actually need to do that (you can add a new member without looking at the current membership). I think what they're trying to avoid is the error you get when you try to add someone who is already in the group. But I think there are better ways to handle that. It might not matter to you if you're only working with small groups.
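For instance, a single ldap3 modify with MODIFY_ADD adds a member without reading the current membership at all; a sketch with placeholder DNs:
from ldap3 import MODIFY_ADD

conn.modify(
    'CN=MY-LDAP-GROUP,OU=Groups,DC=example,DC=com',
    {'member': [(MODIFY_ADD, ['CN=My User,OU=Users,DC=example,DC=com'])]}
)
# If the user is already a member, the server rejects the duplicate value;
# you can inspect conn.result for that case instead of pre-loading every
# existing member.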
After many trials and errors, I got frustrated and used the older python-ldap library to add existing users, so now my code is a mixture of ldap3 and ldap.
I know this is not what the OP desired, but it may help someone.
Here the user Dinesh Kumar is already part of group group1. I am adding him to another group, group2, which succeeds and does not disturb the existing group membership:
import ldap
import ldap.modlist as modlist

def add_existing_user_to_group(user_name, user_id, group_id):
    """
    :return:
    """
    # ldap expects byte strings
    converted_user_name = bytes(user_name, 'utf-8')
    converted_user_id = bytes(user_id, 'utf-8')
    converted_group_id = bytes(group_id, 'utf-8')
    # Add all the attributes for the new dn
    ldap_attr = {}
    ldap_attr['uid'] = converted_user_name
    ldap_attr['cn'] = converted_user_name
    ldap_attr['uidNumber'] = converted_user_id
    ldap_attr['gidNumber'] = converted_group_id
    ldap_attr['objectClass'] = [b'top', b'posixAccount', b'inetOrgPerson']
    ldap_attr['sn'] = b'Kumar'
    ldap_attr['homeDirectory'] = b'/home/users/dkumar'
    # Establish connection to the server
    conn = ldap.initialize(server_uri, bytes_mode=False)
    bind_resp = conn.simple_bind_s("cn=admin,dc=testldap,dc=com", "password")
    dn_new = "cn={},cn={},ou=MyOU,dc=testldap,dc=com".format('Dinesh Kumar', 'group2')
    ldif = modlist.addModlist(ldap_attr)
    try:
        response = conn.add_s(dn_new, ldif)
    except ldap.LDAPError as e:  # python-ldap's base exception class
        response = e
    print(" The response is ", response)
    conn.unbind()
    return response

Is there a way to access Facebook events using the Graph API?

I have code to look up events but keep getting an error:
File "facebook_events.py", line 8, in
events = graph.request('/search?q=Poetry&type=event')
File "/Users/teomeyerhoff/Desktop/projects/jakecongress/facebookenv/lib/python3.7/site-packages/facebook/init.py", line 313, in request
raise GraphAPIError(result)
facebook.GraphAPIError: (#33) This object does not exist or does not support this action.
Has something changed in Facebook's API? It looks as though I can no longer access events using the query string '/search?q=Poetry&type=event' as a graph request.
import urllib3
import facebook
import requests

token = "EA......"  # not the actual token
graph = facebook.GraphAPI(access_token=token, version="2.8")
events = graph.request('/search?q=Poetry&type=event')
print(events)

eventList = events['data']
eventid = eventList[1]['id']

event1 = graph.get_object(id=eventid, fields='attending_count, can_guests_invite, \
    category, cover, declined_count, description, \
    end_time, guest_list_enabled, interested_count, \
    is_canceled, is_page_owned, is_viewer_admin, \
    maybe_count, noreply_count, owner, parent_group, \
    place, ticket_uri, timezone, type, updated_time')

attenderscount = event1['attending_count']
declinerscount = event1['declined_count']
interestedcount = event1['interested_count']
maybecount = event1['maybe_count']
noreplycount = event1['noreply_count']

attenders = requests.get("https://graph.facebook.com/v2.8/" + eventid +
                         "/attending?access_token=" + token + "&limit=" + str(attenderscount))
attenders_json = attenders.json()

admins = requests.get("https://graph.facebook.com/v2.8/" + eventid + "/admins?access_token=" + token)
admins_json = admins.json()
Thank you for the help.
It looks as though I can no longer access events using the query string: '/search?q=Poetry&type=event' as a graph request.
Yes, that is the case.
https://developers.facebook.com/docs/graph-api/changelog/breaking-changes#search-4-4
Search API
You can no longer use the /search endpoint with the following object types:
event
group
page
user
Also mentioned in the accompanying blog post, https://developers.facebook.com/blog/post/2018/04/04/facebook-api-platform-product-changes/
Search API
Deprecated:
Support for finding pages, groups, events, users using search.

How do I avoid getting a sporadic KeyError: 'data' when using the Reddit API in python?

I have the following Python code, which works OK using Reddit's API to look up the front pages of different subreddits and their rising submissions.
from pprint import pprint
import requests
import json
import datetime
import csv
import time

subredditsToScan = ["Arts", "AskReddit", "askscience", "aww", "books", "creepy", "dataisbeautiful", "DIY", "Documentaries", "EarthPorn", "explainlikeimfive", "food", "funny", "gaming", "gifs", "history", "jokes", "LifeProTips", "movies", "music", "pics", "science", "ShowerThoughts", "space", "sports", "tifu", "todayilearned", "videos", "worldnews"]

ofilePosts = open('posts.csv', 'wb')
writerPosts = csv.writer(ofilePosts, delimiter=',')
ofileUrls = open('urls.csv', 'wb')
writerUrls = csv.writer(ofileUrls, delimiter=',')

for subreddit in subredditsToScan:
    front = requests.get(r'http://www.reddit.com/r/' + subreddit + '/.json')
    rising = requests.get(r'http://www.reddit.com/r/' + subreddit + '/rising/.json')
    front.text
    rising.text
    risingData = rising.json()
    frontData = front.json()
    print(len(risingData['data']['children']))
    print(len(frontData['data']['children']))
    for i in range(0, len(risingData['data']['children'])):
        author = risingData['data']['children'][i]['data']['author']
        score = risingData['data']['children'][i]['data']['score']
        subreddit = risingData['data']['children'][i]['data']['subreddit']
        gilded = risingData['data']['children'][i]['data']['gilded']
        numOfComments = risingData['data']['children'][i]['data']['num_comments']
        linkUrl = risingData['data']['children'][i]['data']['permalink']
        timeCreated = risingData['data']['children'][i]['data']['created_utc']
        writerPosts.writerow([author, score, subreddit, gilded, numOfComments, linkUrl, timeCreated])
        writerUrls.writerow([linkUrl])
    for j in range(0, len(frontData['data']['children'])):
        author = frontData['data']['children'][j]['data']['author'].encode('utf-8').strip()
        score = frontData['data']['children'][j]['data']['score']
        subreddit = frontData['data']['children'][j]['data']['subreddit'].encode('utf-8').strip()
        gilded = frontData['data']['children'][j]['data']['gilded']
        numOfComments = frontData['data']['children'][j]['data']['num_comments']
        linkUrl = frontData['data']['children'][j]['data']['permalink'].encode('utf-8').strip()
        timeCreated = frontData['data']['children'][j]['data']['created_utc']
        writerPosts.writerow([author, score, subreddit, gilded, numOfComments, linkUrl, timeCreated])
        writerUrls.writerow([linkUrl])
It works well and scrapes the data accurately, but it constantly gets interrupted, seemingly at random, with a runtime crash saying:
Traceback (most recent call last):
File "dataGather1.py", line 27, in <module>
for i in range(0, len(risingData['data']['children'])):
KeyError: 'data'
I have no idea why this error occurs on and off rather than consistently. I thought maybe I was calling the API too often and being blocked, so I added a sleep to my code, but that did not help. Any ideas?
When there is no data in the response from the API, there is no 'data' key in the dictionary, so you get a KeyError on some subreddits. You need to use a try/except.
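Something like this (a minimal sketch using the question's own variables):
try:
    children = risingData['data']['children']
except KeyError:
    children = []  # the API returned an error body, so skip this subreddit
for i in range(0, len(children)):
    ...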
The JSON you are parsing doesn't contain the 'data' element, hence the error. I think your hunch is correct, though: it is probably rate limiting, or you're asking for hidden/deleted entries.
Reddit is very strict about accessing their API without playing nice. You should register your app and send a meaningful user-agent with your requests, and you should probably use the Python library for this kind of thing: https://praw.readthedocs.io/en/latest/
Without registering, in my experience the direct REST Reddit API is even stricter than the 1-request-per-2-seconds rule they have (had?).
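As a sketch of the non-PRAW route, send a descriptive User-Agent and back off when Reddit rate-limits you (the UA string below is just an example of the recommended format):
import time
import requests

headers = {'User-Agent': 'script:subreddit-scanner:v1.0 (by /u/yourusername)'}
rising = requests.get('https://www.reddit.com/r/aww/rising/.json', headers=headers)
if rising.status_code == 429:  # too many requests: wait, then retry
    time.sleep(2)
    rising = requests.get('https://www.reddit.com/r/aww/rising/.json', headers=headers)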
Python raises a KeyError whenever a dict() object is requested (using the format a = adict[key]) and the key is not in the dictionary.
It seems like when you are getting this error, your data value is empty.
You might just try to get the length of the dictionary before you execute the for loop. If it’s empty, it will just not run. Some interesting error checking here might help.
size = len(risingData.get('data', {}).get('children', []))
if size:
    for i in range(0, size):
        …

Python 3.5 / Pastebin "Bad API request, invalid api_option"

I'm working on a twitch irc bot and one of the components I wanted to have available was the ability for the bot to save quotes to a pastebin paste on close, and then retrieve the same quotes on start up.
I've started with the saving part, but I've hit a roadblock: I can't seem to get a valid post, and I can't figure out a working method.
#!/usr/bin/env python3
import urllib.parse
import urllib.request

# --------------------------------------------- Pastebin Requisites --------------------------------------------------
pastebin_key = 'my pastebin key'  # developer api key, required. GET: http://pastebin.com/api
pastebin_password = 'password'    # password for pastebin_username
pastebin_postexp = 'N'            # N = never expire
pastebin_private = 0              # 0 = public, 1 = unlisted, 2 = private
pastebin_url = 'http://pastebin.com/api/api_post.php'
pastebin_username = 'username'    # user corresponding with key

# --------------------------------------------- Value clean up --------------------------------------------------
pastebin_password = urllib.parse.quote(pastebin_password, safe='/')
pastebin_username = urllib.parse.quote(pastebin_username, safe='/')

# --------------------------------------------- Pastebin Functions --------------------------------------------------
def post(title, content):  # used for posting a new paste
    pastebin_vars = {'api_option': 'paste', 'api_user_key': pastebin_username, 'api_paste_private': pastebin_private,
                     'api_paste_name': title, 'api_paste_expire_date': pastebin_postexp, 'api_dev_key': pastebin_key,
                     'api_user_password': pastebin_password, 'api_paste_code': content}
    try:
        str_to_paste = ', '.join("{!s}={!r}".format(key, val) for (key, val) in pastebin_vars.items())  # dict to str :D
        str_to_paste = str_to_paste.replace(":", "")    # remove :
        str_to_paste = str_to_paste.replace("'", "")    # remove '
        str_to_paste = str_to_paste.replace(")", "")    # remove )
        str_to_paste = str_to_paste.replace(", ", "&")  # replace dividers with &
        urllib.request.urlopen(pastebin_url, urllib.parse.urlencode(pastebin_vars)).read()
        print('did that work?')
    except:
        print("post submit failed :(")
    print(pastebin_url + "?" + str_to_paste)  # print the output for test

post("test", "stuff")
I'm open to importing more libraries and stuff, not really sure what I'm doing wrong after working on this for two days straight :S
import urllib.parse
import urllib.request

PASTEBIN_KEY = 'xxx'
PASTEBIN_URL = 'https://pastebin.com/api/api_post.php'
PASTEBIN_LOGIN_URL = 'https://pastebin.com/api/api_login.php'
PASTEBIN_LOGIN = 'my_login_name'
PASTEBIN_PWD = 'yyy'

def pastebin_post(title, content):
    login_params = dict(
        api_dev_key=PASTEBIN_KEY,
        api_user_name=PASTEBIN_LOGIN,
        api_user_password=PASTEBIN_PWD
    )
    data = urllib.parse.urlencode(login_params).encode("utf-8")
    req = urllib.request.Request(PASTEBIN_LOGIN_URL, data)
    with urllib.request.urlopen(req) as response:
        pastebin_vars = dict(
            api_option='paste',
            api_dev_key=PASTEBIN_KEY,
            api_user_key=response.read(),
            api_paste_name=title,
            api_paste_code=content,
            api_paste_private=2,
        )
    return urllib.request.urlopen(PASTEBIN_URL, urllib.parse.urlencode(pastebin_vars).encode('utf8')).read()

rv = pastebin_post("This is my title", "These are the contents I'm posting")
print(rv)
Combining two different answers above gave me this working solution.
First, your try/except block is throwing away the actual error. You should almost never use a "bare" except clause without capturing or re-raising the original exception. See this article for a full explanation.
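For example, a narrower handler that keeps the original error visible might look like this (a sketch; body stands for the already-encoded request data):
import urllib.error

try:
    result = urllib.request.urlopen(pastebin_url, body).read()
except urllib.error.URLError as e:
    print('post submit failed:', e)  # show the real reason
    raise  # re-raise instead of swallowing the error
Anything that isn't a URLError (like the TypeError below) now propagates and can be diagnosed.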
Once you remove the try/except, you will see the underlying error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "paste.py", line 42, in post
urllib.request.urlopen(pastebin_url, urllib.parse.urlencode(pastebin_vars)).read()
File "/usr/lib/python3.4/urllib/request.py", line 161, in urlopen
return opener.open(url, data, timeout)
File "/usr/lib/python3.4/urllib/request.py", line 461, in open
req = meth(req)
File "/usr/lib/python3.4/urllib/request.py", line 1112, in do_request_
raise TypeError(msg)
TypeError: POST data should be bytes or an iterable of bytes. It cannot be of type str.
This means you're trying to pass a unicode string into a function that's expecting bytes. When you do I/O (like reading/writing files on disk, or sending/receiving data over HTTP) you typically need to encode any unicode strings as bytes. See this presentation for a good explanation of unicode vs. bytes and when you need to encode and decode.
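The corrected code later in this answer does exactly that; the key step is encoding the urlencoded string before handing it to urlopen:
body = urllib.parse.urlencode(pastebin_vars).encode('utf-8')  # str -> bytes
urllib.request.urlopen(pastebin_url, body)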
Next, this line:
urllib.request.urlopen(pastebin_url, urllib.parse.urlencode(pastebin_vars)).read()
is throwing away the response, so you have no way of knowing the result of your API call. Assign it to a variable or return it from your function so you can inspect the value; it will either be a URL to the paste or an error message from the API.
Next, I think your code is sending a lot of unnecessary parameters to the API and your str_to_paste statements aren't necessary.
I was able to make a paste using the following, much simpler, code:
import urllib.parse
import urllib.request

PASTEBIN_KEY = 'my-api-key'  # developer api key, required. GET: http://pastebin.com/api
PASTEBIN_URL = 'http://pastebin.com/api/api_post.php'

def post(title, content):  # used for posting a new paste
    pastebin_vars = dict(
        api_option='paste',
        api_dev_key=PASTEBIN_KEY,
        api_paste_name=title,
        api_paste_code=content,
    )
    return urllib.request.urlopen(PASTEBIN_URL, urllib.parse.urlencode(pastebin_vars).encode('utf8')).read()
Here it is in use:
>>> post("test", "hello\nworld.")
b'http://pastebin.com/v8jCkHDB'
I didn't know about Pastebin until now. I read their API docs, tried it for the first time, and it worked perfectly fine.
Here's what I did:
I logged in to fetch the api_user_key.
Included that in the posting along with api_dev_key.
Checked the website, and the post was there.
Here's the code:
import urllib.parse
import urllib.request

def post(url, params):
    data = urllib.parse.urlencode(params).encode("utf-8")
    req = urllib.request.Request(url, data)
    with urllib.request.urlopen(req) as response:
        return response.read()

# Logging in to fetch api_user_key
login_url = "http://pastebin.com/api/api_login.php"
login_params = {"api_dev_key": "<the dev key they gave you>",
                "api_user_name": "<username goes here>",
                "api_user_password": "<password goes here>"}
api_user_key = post(login_url, login_params)

# Posting some random text
post_url = "http://pastebin.com/api/api_post.php"
post_params = {"api_dev_key": "<the dev key they gave you>",
               "api_option": "paste",
               "api_paste_code": "<head>Testing</head>",
               "api_paste_private": "0",
               "api_paste_name": "testing.html",
               "api_paste_expire_date": "10M",
               "api_paste_format": "html5",
               "api_user_key": api_user_key}
response = post(post_url, post_params)
Only the first three parameters are needed for posting something, the rest are optional.
FYI, the API doesn't seem to accept plain http requests as of writing this, so make sure your URLs are in the format https://pas...
