I need to get a list of all the public and private Slack channels that a particular user token ("xoxp...") belongs to. The problem is that the API currently returns only some of the private channels, not all of them. It used to return all of them, but now some channels the user could previously see are missing. This started happening sometime after March (the last time I queried the API).
I tried:
Creating a new private channel and adding the user to it, to see whether the API picks it up => it does
Removing the user from a channel the API call doesn't see, then re-adding the user to that channel => issue remains
Reinstalling the app to the workspace => issue remains
The workspace only has about 100 channels, including deprecated ones, so I know I'm not hitting the limit.
Here is my code (in Python):
def _getChannels(self, _next_cursor=""):
    """ Needs scope channels:read
    Archived channels are included by default.
    INCLUDES private channels the calling user (person whose token is being used) has access to
    """
    kwargs = {"limit": 1000,
              "types": "public_channel,private_channel"}
    if _next_cursor:
        kwargs["cursor"] = _next_cursor
    results_full = self._callApi("conversations.list", "_getChannels()", kwargs)
    results = results_full["channels"]
    next_cursor = results_full["response_metadata"]["next_cursor"]
    if next_cursor:  # Isn't empty
        results = results + self._getChannels(next_cursor)
    return results
def _callApi(self, call_type, calling_function, kwargs=None):
    """ calling_function is a string for error message reporting """
    # Avoid a mutable default argument, since kwargs gets mutated below
    kwargs = {} if kwargs is None else kwargs
    # New API can't handle booleans or extra params
    pass_error_through = kwargs.pop(self.PASS_ERROR_THROUGH, False)
    for key in kwargs:
        if isinstance(kwargs[key], bool):
            kwargs[key] = str(kwargs[key]).lower()
    # New API raises exceptions instead of returning an error, so we need to catch
    try:
        result = self._slack_client.api_call(call_type, params=kwargs)
    except Exception as E:
        result = str(E)  # You used to be able to just call result["error"]
    if "error" in result:
        if "ratelimited" in result:
            print("\nRatelimited. Waiting one min before retrying.", flush=True)
            sleep(60)
            return self._callApi(call_type, calling_function, kwargs)
        elif not pass_error_through:
            error_message = ("ERROR: Call to " + calling_function +
                             " failed due to " + result + ". " +
                             "\n\nkwargs: " + str(kwargs) + "\n\n--End Message--")
            #if "needed" in result:
            #    error_message += "It needs: " + result["needed"]
            print()  # To provide spacing before the traceback starts
            raise ValueError(error_message)
    return result
Did you try the groups:read scope for private channels?
https://api.slack.com/methods/conversations.list
Documentation says:
Information about required scopes
This Conversations API method's required scopes depend on the type of channel-like object you're working with. To use the method, you'll need at least one of the channels:, groups:, im: or mpim: scopes corresponding to the conversation type you're working with.
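For illustration, a minimal sketch of such a call (this assumes the newer slack_sdk WebClient rather than the legacy client in the question; the token would need both channels:read and groups:read for the types requested below):

from slack_sdk import WebClient

client = WebClient(token="xoxp-...")  # user token

# Private channels only come back if the token carries groups:read.
resp = client.conversations_list(
    types="public_channel,private_channel",
    limit=1000,
)
for channel in resp["channels"]:
    print(channel["name"], channel["is_private"])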
As stated in the question, I would like to check the status of a list of Twitter user ids. I have about 20k Twitter users. I was able to get the timelines of about half of them; the other half are probably suspended, deactivated, or have 0 tweets. I found a script online that supposedly allows checking the status of a Twitter user. Here is the script (https://github.com/dbrgn/Twitter-User-Checker/blob/master/checkuser.py):
`
#!/usr/bin/env python2
# Twitter User Checker
# Author: Danilo Bargen
# License: GPLv3
import sys
import tweepy
import urllib2
try:
    import json
except ImportError:
    import simplejson as json
from datetime import datetime

auth = tweepy.AppAuthHandler("xxx", "xxxx")
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)
if (not api):
    print ("Can't Authenticate")
    sys.exit(-1)

# Continue with rest of code
try:
    user = sys.argv[1]
except IndexError:
    print 'Usage: checkuser.py [username]'
    sys.exit(-1)

url = 'https://api.twitter.com/1.1/users/show.json?id=%s' % user

try:
    request = urllib2.urlopen(url)
    status = request.code
    data = request.read()
except urllib2.HTTPError, e:
    status = e.code
    data = e.read()

data = json.loads(data)
print data

if status == 403:
    print "helloooooooo"
#    if 'suspended' in data['error']:
#        print 'User %s has been suspended' % user
#    else:
#        print 'Unknown response'
elif status == 404:
    print 'User %s not found' % user
elif status == 200:
    days_active = (datetime.now() - datetime.strptime(data['created_at'],
                   '%a %b %d %H:%M:%S +0000 %Y')).days
    print 'User %s is active and has posted %s tweets in %s days' % \
        (user, data['statuses_count'], days_active)
else:
    print 'Unknown response'
`
I get the following error:
File "twitter_status_checker.py", line 16, in <module>
auth = tweepy.AppAuthHandler("xxx", "xxxx")
File "/Users/aloush/anaconda/lib/python2.7/site-packages/tweepy/auth.py", line 170, in __init__
'but got %s instead' % data.get('token_type'))
tweepy.error.TweepError: Expected token_type to equal "bearer", but got None instead
Could anyone help me fix the error, and also modify the script to check a list of users rather than one user per run?
Here is a list of the HTTP Status Codes I would like to check for: https://dev.twitter.com/overview/api/response-codes
Thank you.
It seems that you failed to authenticate with Twitter. For the latest version (3.5), tweepy uses OAuthHandler to authenticate; please check the tweepy documentation on authentication. Also, the linked script checks accounts one by one, which can be very slow.
To check the status of a large set of Twitter accounts with tweepy, particularly if you want to know the reason why an account is inactive (e.g., not found, suspended), you need to be aware of the following:
Which API should be used?
Twitter provides two related APIs: users/show and users/lookup. The former returns the profile of one specified user, while the latter returns profiles for a block of up to 100 users. The corresponding tweepy methods are API.get_user and API.lookup_users (I cannot find the latter in the documentation, but it does exist in the code). You should definitely use the second one. However, when some of the requested users are inactive, the lookup_users API returns only those that are active. This means you have to call the get_user API to get the detailed reason for each inactive account.
How to determine the status of a user?
Of course, you should pay attention to the response codes provided by Twitter. However, when using tweepy, instead of the HTTP error codes you should pay more attention to the error message. Here are some common cases:
If the profile is successfully fetched, it is an active user;
Otherwise, we could check the error code:
50 User not found.
63 User has been suspended.
... and possibly more codes relating to the user account
For tweepy, when fetching a profile fails, a TweepError is raised, and TweepError.message[0] is the error message from the Twitter API.
Okay, here is the logic to process:
(1) Divide the large block of user into pieces of size of 100;
(2) for each of these pieces, do (3) and (4);
(3) call lookup_users, the returned users will be treated as the active users and the remaining users will be treated as inactive users;
(4) call get_user for each of the inactive users to get the detailed reason.
Here is a sample code for you:
import logging

import tweepy

logger = logging.getLogger(__name__)


def to_bulk(a, size=100):
    """Transform a list into a list of lists. Each element of the new list
    is a list with size=100 (except the last one).
    """
    r = []
    qt, rm = divmod(len(a), size)
    i = -1
    for i in range(qt):
        r.append(a[i * size:(i + 1) * size])
    if rm != 0:
        r.append(a[(i + 1) * size:])
    return r


def fast_check(api, uids):
    """ Fast check the status of specified accounts.

    Parameters
    ---------------
    api: tweepy API instance
    uids: account ids

    Returns
    ----------
    Tuple (active_uids, inactive_uids).
    `active_uids` is a list of active users and
    `inactive_uids` is a list of inactive uids,
    either suspended or deactivated.
    """
    try:
        users = api.lookup_users(user_ids=uids,
                                 include_entities=False)
        active_uids = [u.id for u in users]
        inactive_uids = list(set(uids) - set(active_uids))
        return active_uids, inactive_uids
    except tweepy.TweepError as e:
        if e[0]['code'] == 50 or e[0]['code'] == 63:
            logger.error('None of the users is valid: %s', e)
            return [], uids  # all of the requested uids are inactive
        else:
            # Unexpected error
            raise


def check_inactive(api, uids):
    """ Check inactive accounts, one by one.

    Parameters
    ---------------
    uids : list
        A list of inactive account ids

    Returns
    ----------
    Yield tuple (uid, reason). Where `uid` is the account id,
    and `reason` is the error payload from Twitter.
    """
    for uid in uids:
        try:
            api.get_user(user_id=uid)
            logger.warning('This user %r should be inactive', uid)
            yield (uid, dict(code=-1, message='OK'))
        except tweepy.TweepError as e:
            yield (uid, e[0][0])


def check_one_block(api, uids):
    """Check the status of users for one block (<100). """
    active_uids, inactive_uids = fast_check(api, uids)
    inactive_users_status = list(check_inactive(api, inactive_uids))
    return active_uids, inactive_users_status


def check_status(api, large_uids):
    """Check the status of users for any number of users. """
    active_uids = []
    inactive_users_status = []
    for uids in to_bulk(large_uids, size=100):
        au, iu = check_one_block(api, uids)
        active_uids += au
        inactive_users_status += iu
    return active_uids, inactive_users_status


def main(twitter_credential, large_uids):
    """ The main function to call check_status. """
    # First prepare the tweepy API
    auth = tweepy.OAuthHandler(twitter_credential['consumer_key'],
                               twitter_credential['consumer_secret'])
    auth.set_access_token(twitter_credential['access_token'],
                          twitter_credential['access_token_secret'])
    api = tweepy.API(auth, wait_on_rate_limit=True)
    # Then, call check_status
    active_uids, inactive_user_status = check_status(api, large_uids)
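Hypothetical usage of the above, with placeholder credentials and ids:

twitter_credential = {
    'consumer_key': 'xxx',
    'consumer_secret': 'xxx',
    'access_token': 'xxx',
    'access_token_secret': 'xxx',
}
main(twitter_credential, [12345678, 23456789])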
Because of the lack of data, I never tested the code. There may be bugs; you should watch out for them.
Hope this is helpful.
Sooo, I'm relatively new to programming and trying to learn how to consume APIs. I figured I would start out by building a Slack bot for moderation purposes, since I use Slack a lot. For the most part everything works, except for when I try to delete a message. The API responds saying it can't find the message, even though it is right there in the channel (the Slack API uses timestamps to locate a message). The timestamps match, but it still claims the message doesn't exist. Here is my code:
def __init__(self, token):
    self.token = token
    self.users = {}
    self.channels = {}
    self.slack = SlackClient(self.token)
    self.as_user = True

def connect(self):
    if self.slack.rtm_connect():
        self.post_message('#test', "*AUTOMOD* _v0.1_")
        while True:
            # print(self.slack.rtm_read())
            self.parse_data(self.slack.rtm_read())
            time.sleep(1)

def parse_data(self, payload):
    if payload:
        if payload[0]['type'] == 'message':
            print(("user: {} message: {} channel: {}").format(
                payload[0]['user'], payload[0]['text'], payload[0]['channel']))
            self.handle_message(payload[0])

def handle_message(self, data):
    # these users can post whatever they want.
    WHITELISTED = ["U4DU2TS2F", "U3VSRJJD8", "U3WLZUTQE", "U3W1Q2ULT"]
    # get userid
    sent_from = (data['user'])
    # ignore whitelisted
    if sent_from in WHITELISTED:
        return
    # if message was sent from someone not in WHITELISTED, delete it
    else:
        print(("\n\ntimestamp of message: {}").format(data['ts']))
        self.delete_message(data['channel'], data['ts'])
        self.post_message(data['channel'], "```" + random.choice(dongers) + "```")

def delete_message(self, channel, timestamp):
    print(("deleting message in channel '{}'...").format(channel))
    print("timestamp check (just to make sure): ", timestamp)
    deleted = self.slack.api_call("chat.delete",
                                  channel=channel,
                                  timestamp=timestamp,
                                  as_user=self.as_user
                                  )
    if deleted.get('ok'):
        print("\nsuccessfully deleted.\n")
    else:
        print(("\ncouldn't delete message: {}\n").format(deleted['error']))
OUTPUT
timestamp of message: 1488822718.000040
deleting message in channel: 'G4DGYCW2X'
timestamp check (just to make sure...): 1488822718.000040
couldn't delete message: message_not_found
Any ideas on what could be happening? Here is the chat.delete method for context.
EDIT:
Due to @pvg's recommendation of a "Minimal, Complete, and Verifiable example", I have placed the ENTIRE code from the project in a gist.
One issue might be that you appear to be passing a timestamp parameter to chat.delete, when the API method takes a ts parameter instead. (See docs)
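If that's the case, a minimal sketch of the corrected method (the same delete_message as above, with only the parameter name changed) would be:

def delete_message(self, channel, timestamp):
    # chat.delete expects the message timestamp under `ts`, not `timestamp`
    deleted = self.slack.api_call("chat.delete",
                                  channel=channel,
                                  ts=timestamp,
                                  as_user=self.as_user)
    return deleted.get('ok', False)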
I have a method that pulls the information about a specific gerrit revision, and there are two possible endpoints that it could live under. Given the following sample values:
revision_id = rev1
change_id = chg1
proj_branch_id = pbi1
The revision can either live under
/changes/chg1/revisions/rev1/commit
or
/changes/pbi1/revisions/rev1/commit
I'm trying to handle this cleanly with minimal code re-use as follows:
@retry(
    wait_exponential_multiplier=1000,
    stop_max_delay=3000
)
def get_data(
        self,
        api_obj: GerritAPI,
        writer: cu.LockingWriter = None,
        test: bool = False
):
    """
    Make an HTTP Request using a pre-authed GerritAPI object to pull the
    JSON related to itself. It checks the endpoint using just the change
    id first, and if that doesn't return anything, will try the endpoint
    using the full id. It also cleans the data and adds ETL tracking
    fields.

    :param api_obj: An authed GerritAPI object.
    :param writer: A LockingWriter object.
    :param test: If True then the data will be returned instead of written
        out.
    :raises: If an HTTP Error occurs, an alternate endpoint will be
        attempted. If this also raises an HTTP Error then the error is
        re-raised and propagated to the caller.
    """
    logging.debug('Getting data for a commit for {f_id}'.format(
        f_id=self.proj_branch_id
    ))
    endpoint = (r'/changes/{c_id}/revisions/{r_id}/commit'.format(
        c_id=self.change_id,
        r_id=self.revision_id
    ))
    try:
        data = api_obj.get(endpoint)
    except HTTPError:
        try:
            endpoint = (r'/changes/{f_id}/revisions/{r_id}/commit'.format(
                f_id=self.proj_branch_id,
                r_id=self.revision_id
            ))
            data = api_obj.get(endpoint)
        except HTTPError:
            logging.debug('Neither endpoint returned data: {ep}'.format(
                ep=endpoint
            ))
            raise
        else:
            data['name'] = data.get('committer')['name']
            data['email'] = data.get('committer')['email']
            data['date'] = data.get('committer')['date']
            mu.clean_data(data)
    except ReadTimeout:
        logging.warning('Read Timeout occurred for a commit. Endpoint: '
                        '{ep}'.format(ep=endpoint))
    else:
        data['name'] = data.get('committer')['name']
        data['email'] = data.get('committer')['email']
        data['date'] = data.get('committer')['date']
        mu.clean_data(data)
    finally:
        try:
            data.update(self.__dict__)
        except NameError:
            data = self.__dict__
        if test:
            return data
        mu.add_etl_fields(data)
        self._write_data(data, writer)
I don't much like that I'm repeating the portion under the else, so I'm wondering whether there is a way to handle this more cleanly (one de-duplication idea is sketched below). As a side note, as it stands my program will write out up to three times for every commit that returns an HTTPError; is creating an instance variable self.written, which tracks whether the data has already been written out, a best-practice way to handle this?
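For what it's worth, here is a sketch of one way to avoid repeating the else block (hypothetical, reusing the api_obj, HTTPError, and logging names from the snippet above): loop over the candidate endpoints and return on the first success, so the committer extraction and clean-up only need to appear once in the caller.

from requests.exceptions import HTTPError  # assuming requests' HTTPError, as in the question
import logging

def _fetch_commit(self, api_obj):
    """Try each candidate endpoint in order; return the first JSON payload."""
    endpoints = [
        r'/changes/{c_id}/revisions/{r_id}/commit'.format(
            c_id=self.change_id, r_id=self.revision_id),
        r'/changes/{f_id}/revisions/{r_id}/commit'.format(
            f_id=self.proj_branch_id, r_id=self.revision_id),
    ]
    for i, endpoint in enumerate(endpoints):
        try:
            return api_obj.get(endpoint)
        except HTTPError:
            if i == len(endpoints) - 1:  # no endpoints left to try
                logging.debug('Neither endpoint returned data: {ep}'.format(
                    ep=endpoint))
                raise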
I have a very unreliable API that I request using Python. I have been thinking about using requests_cache and setting expire_after to 999999999999, as I have seen other people do.
The only problem is, I do not know whether the data will be updated when the API works again, or whether requests_cache will automatically replace the old entry.
I have tried reading the docs but I cannot really see this anywhere.
requests_cache will not update until the expire_after time has passed. In that case it will not detect that your API is back to a working state.
I note that the project has since added an option that I implemented in the past; you can now set the old_data_on_error option when configuring the cache; see the CachedSession documentation:
old_data_on_error – If True it will return expired cached response if update fails.
It would reuse existing cache data in case a backend update fails, rather than delete that data.
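For example, a minimal sketch of enabling that option (cache name and URL are placeholders):

import requests_cache

session = requests_cache.CachedSession(
    'cache_name', expire_after=180,
    old_data_on_error=True,  # serve the stale cached response if the refresh fails
)
response = session.get('https://unreliable.example.com/api')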
In the past, I created my own requests_cache session setup (plus a small patch) that would reuse cached values beyond expire_after if the backend gave a 500 error or timed out (using short timeouts), to deal with a problematic API layer, rather than relying on expire_after:
import logging
from datetime import (
    datetime,
    timedelta
)

from requests.exceptions import (
    ConnectionError,
    Timeout,
)
from requests_cache.core import (
    dispatch_hook,
    CachedSession,
)


log = logging.getLogger(__name__)
# Stop logging from complaining if no logging has been configured.
log.addHandler(logging.NullHandler())


class FallbackCachedSession(CachedSession):
    """Cached session that'll reuse expired cache data on timeouts

    This allows survival in case the backend is down, living off stale
    data until it comes back.
    """

    def send(self, request, **kwargs):
        # this *bypasses* CachedSession.send; we want to call the method
        # CachedSession.send() would have delegated to!
        session_send = super(CachedSession, self).send

        if (self._is_cache_disabled or
                request.method not in self._cache_allowable_methods):
            response = session_send(request, **kwargs)
            response.from_cache = False
            return response

        cache_key = self.cache.create_key(request)

        def send_request_and_cache_response(stale=None):
            try:
                response = session_send(request, **kwargs)
            except (Timeout, ConnectionError):
                if stale is None:
                    raise
                log.warning('No response received, reusing stale response for '
                            '%s', request.url)
                return stale

            if stale is not None and response.status_code == 500:
                log.warning('Response gave 500 error, reusing stale response '
                            'for %s', request.url)
                return stale

            if response.status_code in self._cache_allowable_codes:
                self.cache.save_response(cache_key, response)
            response.from_cache = False
            return response

        response, timestamp = self.cache.get_response_and_time(cache_key)
        if response is None:
            return send_request_and_cache_response()

        if self._cache_expire_after is not None:
            is_expired = datetime.utcnow() - timestamp > self._cache_expire_after
            if is_expired:
                self.cache.delete(cache_key)
                # try and get a fresh response, but if that fails reuse the
                # stale one
                return send_request_and_cache_response(stale=response)

        # dispatch hook here, because we've removed it before pickling
        response.from_cache = True
        response = dispatch_hook('response', request.hooks, response, **kwargs)
        return response


def basecache_delete(self, key):
    # We don't really delete; we instead set the timestamp to
    # datetime.min. This way we can re-use stale values if the backend
    # fails
    try:
        if key not in self.responses:
            key = self.keys_map[key]
        self.responses[key] = self.responses[key][0], datetime.min
    except KeyError:
        return

from requests_cache.backends.base import BaseCache
BaseCache.delete = basecache_delete
The above subclass of CachedSession bypasses the original send() method to instead go directly to the original requests.Session.send() method, so it can return the existing cached value even if the timeout has passed but the backend has failed. Deletion is disabled in favour of setting the cache timestamp to datetime.min, so we can still reuse that old value if a new request fails.
Use the FallbackCachedSession instead of a regular CachedSession object.
If you wanted to use requests_cache.install_cache(), make sure to pass in FallbackCachedSession to that function in the session_factory keyword argument:
import requests_cache
requests_cache.install_cache(
'cache_name', backend='some_backend', expire_after=180,
session_factory=FallbackCachedSession)
My approach is a little more comprehensive than what requests_cache implemented some time after I hacked together the above; my version will fall back to a stale response even if you explicitly marked it as deleted before.
Try something like this:
class UnreliableAPIClient:
    def __init__(self):
        self.some_api_method_cached = {}  # we will store results here

    def some_api_method(self, param1, param2):
        params_hash = "{0}-{1}".format(param1, param2)  # need to identify the input
        try:
            result = do_call_some_api_method_with_fail_probability(param1, param2)
            self.some_api_method_cached[params_hash] = result  # save result
        except Exception:
            result = self.some_api_method_cached.get(params_hash)  # fall back to the cached result
            if result is None:
                raise  # re-raise the exception if nothing is cached
        return result
Of course you can wrap that up in a simple decorator if you like (see http://www.artima.com/weblogs/viewpost.jsp?thread=240808); a sketch follows.
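A sketch of that decorator idea (a hypothetical helper that caches by positional arguments):

import functools

def fallback_cache(func):
    """Reuse the result of the last successful call if this call fails."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        key = args  # assumes hashable positional arguments identify the call
        try:
            cache[key] = func(*args)
        except Exception:
            if key not in cache:
                raise  # nothing cached, re-raise
        return cache[key]
    return wrapper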
I'm trying to subscribe to a tag. It appears that the callback URL is being called correctly with a hub.challenge and hub.mode, and I figured out how to access the challenge using self.request.get('hub.challenge'). I thought I was just supposed to echo the challenge, but that doesn't appear to work, since I receive the following errors in the GAE logs:
InstagramAPIError: (400) APISubscriptionError-Challenge verification failed. Sent "647bf6dbed31465093ee970577ce1b72", received "
647bf6dbed31465093ee970577ce1b72
".
Here is the full handler:
class InstagramHandler(BaseHandler):
    def get(self):
        def process_tag_update(update):
            update = update

        mode = self.request.get('hub.mode')
        challenge = self.request.get('hub.challenge')
        verify_token = self.request.get('hub.verify_token')
        if challenge:
            template_values = {'challenge': challenge}
            path = os.path.join(os.path.dirname(__file__),
                                '../templates/instagram.html')
            html = template.render(path, template_values)
            self.response.out.write(html)
        else:
            reactor = subscriptions.SubscriptionsReactor()
            reactor.register_callback(subscriptions.SubscriptionType.TAG,
                                      process_tag_update)
            x_hub_signature = self.request.headers.get('X-Hub-Signature')
            raw_response = self.request.data
            try:
                reactor.process('INSTAGRAM_SECRET', raw_response, x_hub_signature)
            except subscriptions.SubscriptionVerifyError:
                logging.error('Instagram signature mismatch')
So returning it as a string worked. I should have paid closer attention to the error message, but it took a helpful person on the Python IRC to point out the extra line breaks in the message. Once I put the template file's contents on one line, it worked. I can now confirm that my app is authorized via Instagram's list-subscriptions URL. A sketch of the simpler approach is below.
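For reference, a minimal sketch of the resulting fix (echoing the challenge directly, with no template to introduce stray newlines):

class InstagramHandler(BaseHandler):
    def get(self):
        challenge = self.request.get('hub.challenge')
        if challenge:
            # Write the bare challenge string back; rendering it through a
            # multi-line template surrounded it with extra line breaks.
            self.response.out.write(challenge)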