I'm implementing a Telegram bot that serves multiple users. Initially it picked up every new message sequentially, even in the middle of an ongoing session with another user, so whenever two or more users tried to use the bot at once the conversations got jumbled together. To solve this I implemented a queue system that puts users on hold until the ongoing conversation is finished, but the queue is turning out to be a big hassle. I think my problem would be solved by a method that fetches new messages only from a specific chat_id or user. This is the code I'm currently using to get any new message:
def get_next_message_result(self, update_id: int, chat_id: str):
    """
    Get the next message of a given chat.
    If the next message is from another user, put that user on the queue and keep
    waiting for the expected one.
    """
    update_id += 1
    link_requisicao = f'{self.url_base}getUpdates?timeout={message_timeout}&offset={update_id}'
    result = json.loads(requests.get(link_requisicao).content)["result"]
    if len(result) == 0:
        return result, update_id  # timeout
    message_chat_id = result[0]["message"]["chat"]["id"]
    if "text" not in result[0]["message"]:
        self.responder(speeches.no_text_speech, message_chat_id)
        return [], update_id  # message without text
    while message_chat_id != chat_id:
        self.responder(speeches.wait_speech, message_chat_id)
        if message_chat_id not in self.current_user_queue:
            self.current_user_queue.append(message_chat_id)
            print("Queuing user with the following chat_id:", message_chat_id)
        update_id += 1
        link_requisicao = f'{self.url_base}getUpdates?timeout={message_timeout}&offset={update_id}'
        result = json.loads(requests.get(link_requisicao).content)["result"]
        if len(result) == 0:
            return result, update_id  # timeout
        message_chat_id = result[0]["message"]["chat"]["id"]
        if "text" not in result[0]["message"]:
            self.responder(speeches.no_text_speech, message_chat_id)
            return [], update_id  # message without text
    return result, update_id
On another note: I use the queue so that the moment the current conversation ends, the bot calls the next user in line. Should I just drop the queue feature, tell concurrent users to wait a few minutes, and ignore any messages that are not from the current chat_id?
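For what it's worth, the Bot API's getUpdates has no per-chat filter, so one common workaround is to keep a single polling loop that routes every incoming update into a per-chat queue; each conversation handler then only ever sees its own chat's messages and no user has to wait for another. Below is a minimal sketch of that idea (per_chat_queues, poll_updates and get_next_message are illustrative names, and url_base is assumed to be the same base URL as in the code above), not a drop-in replacement for the method shown:

import json
import queue
import threading

import requests

per_chat_queues = {}  # chat_id -> queue.Queue of messages for that chat

def poll_updates(url_base, timeout=30):
    """Single polling loop: fetch every update once and route it by chat_id."""
    update_offset = 0
    while True:
        url = f'{url_base}getUpdates?timeout={timeout}&offset={update_offset}'
        results = json.loads(requests.get(url).content)["result"]
        for update in results:
            update_offset = update["update_id"] + 1
            message = update.get("message")
            if not message:
                continue
            chat_id = message["chat"]["id"]
            # each chat gets its own queue, so conversations never interleave
            per_chat_queues.setdefault(chat_id, queue.Queue()).put(message)

def get_next_message(chat_id, timeout=None):
    """Called by a conversation handler: blocks until a message for this chat arrives."""
    return per_chat_queues.setdefault(chat_id, queue.Queue()).get(timeout=timeout)

# threading.Thread(target=poll_updates, args=(url_base,), daemon=True).start()

With this layout the queue question largely answers itself: instead of holding users until the current conversation ends, each chat can be served by its own handler reading from its own queue.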
I'm having the following design problem: I open a websocket connection and send multiple request messages every second to get my open risk.
What's happening is that the message queue builds up with a lot of "stale" messages (from the frequent requests), so when I send a trade order out and it fills, the risk is delayed: the while loop has to iterate through all the old messages before it reaches the ones with the correct risk.
I.e., the deque in websockets["messages"] gets really long, and it takes the while loop some time to catch up to the "recent" messages.
Sample code:
import json

import websockets

url = 'wss://www.deribit.com/ws/api/v2'
risk1 = None
risk2 = None

async with websockets.connect(url) as ws:
    # send many request messages every second, which builds up a lot of responses in the queue
    await send_message("private/get_positions", "btc-perpetual")
    await send_message("private/get_positions", "eth-perpetual")
    ....
    await send_message("private/get_positions", "sol-perpetual")

    # responses from above build up in a deque, which gets iterated one at a time in the while loop
    while True:
        response = json.loads(await ws.recv())
        if response["id"] == 100:
            risk1 = response["result"]
        elif response["id"] == 200:
            risk2 = response["result"]
        else:
            pass
        # the message queue gets long and responses from ws.recv() go stale, resulting in out-of-date risk
        risk_usd = calculate_risk(risk1, risk2)
        if risk_usd > 0:
            await post_order()
Some ideas I've had, but I'm not sure if they are good practice (see the sketch after this list):
ignore a message if it is older than x seconds
unpack websockets["messages"] and only take the last item
Note: there are multiple variables (risk1, risk2) being updated on each iteration, and ALL of them need to be up to date before an order is posted.
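One way to act only on fresh data is to drain whatever is already buffered on the connection before computing risk, keeping just the newest result per request id, rather than handling one stale response per loop iteration. Below is a minimal sketch of that idea under the same JSON-RPC response shape as above; drain_latest and the 0.01-second timeout are illustrative choices, not part of the original code:

import asyncio
import json

async def drain_latest(ws, latest):
    """Pull every message already buffered on the connection without blocking for long,
    keeping only the newest result for each request id."""
    while True:
        try:
            raw = await asyncio.wait_for(ws.recv(), timeout=0.01)
        except asyncio.TimeoutError:
            break  # nothing left in the buffer right now
        msg = json.loads(raw)
        if "id" in msg and "result" in msg:
            latest[msg["id"]] = msg["result"]
    return latest

# usage sketch inside the main loop:
# latest = await drain_latest(ws, {})
# if 100 in latest and 200 in latest:
#     risk_usd = calculate_risk(latest[100], latest[200])

If the exchange offers a subscription channel for position/portfolio updates, subscribing instead of repeatedly polling get_positions avoids the backlog entirely, since only genuinely new data arrives.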
So far I have the following function, which computes NLP values for the messages a user sends to the chatbot. I want to store these values so that if the user sends 10 messages, textToneSum is the sum of all 10 textTone values; I will use this data later in another function with the help of the session. However, since the chat function is called via AJAX every time the user clicks the button to send a message, textToneSum never accumulates values from more than one message. How would I solve this? There is probably an easy solution, but my brain feels stuck.
@app.route('/chat', methods=['GET', 'POST'])
def chat():
    textToneSum = 0  # reset on every request, so the sum never grows
    message = request.args.get('message')
    textSample = TextBlob(message)
    textTone = textSample.sentiment.polarity
    textSubj = textSample.sentiment.subjectivity
    textToneSum += textTone
    print("sum is", textToneSum)
    # some irrelevant code deleted here
    session['textToneSum'] = textToneSum
    return jsonify(result=response_text, texttone=textTone, textsubj=textSubj)
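Since each AJAX call is a separate request, the running total has to live somewhere that survives between requests, and the session you are already writing to is the natural place. A minimal sketch of that change, using the same imports and globals as the code above (it assumes app.secret_key is set so Flask sessions work; response_text stands in for the code elided above):

@app.route('/chat', methods=['GET', 'POST'])
def chat():
    message = request.args.get('message')
    textSample = TextBlob(message)
    textTone = textSample.sentiment.polarity
    textSubj = textSample.sentiment.subjectivity

    # read the previous total out of the session (0 for the first message),
    # add this message's polarity, and write it back
    textToneSum = session.get('textToneSum', 0) + textTone
    session['textToneSum'] = textToneSum
    print("sum is", textToneSum)

    return jsonify(result=response_text, texttone=textTone, textsubj=textSubj)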
I am attempting to create a small dataset by pulling messages/responses from a Slack channel I am part of. I would like to use Python to pull the data from the channel, but I am having trouble figuring out my API key. I have created an app on Slack, and I can see my client secret, signing secret, and verification token, but I can't find my API key.
Here is a basic example of what I believe I am trying to accomplish:
import slack

sc = slack.SlackClient("api key")
sc.api_call(
    "channels.history",
    channel="C0XXXXXX"
)
I am willing to just download the data manually if that is possible as well. Any help is greatly appreciated.
messages
Below is example code for pulling messages from a channel in Python.
It uses the official Python Slack library and calls conversations_history with paging, so it works with any type of channel and can fetch large numbers of messages if needed.
The result is written to a file as a JSON array.
You can specify the channel and the maximum number of messages to retrieve.
threads
Note that the conversations.history endpoint does not return thread messages. Those have to be retrieved additionally, with one call to conversations.replies for every thread you want to fetch (see the sketch after the example code below).
Threads can be identified by checking for the thread_ts property in each message: if it exists, there is a thread attached to it. See the Slack documentation for more details on how threads work.
IDs
This script will not replace IDs with names, though. If you need that, here are some pointers on how to implement it (a small sketch follows this list):
You need to replace IDs for users, channels, bots, and usergroups (the latter only on a paid plan).
You can fetch the lists of users, channels, and usergroups from the API with users_list, conversations_list, and usergroups_list respectively; bots have to be fetched one by one with bots_info (if needed).
IDs occur in many places in messages:
the user top-level property
the bot_id top-level property
as links in any property that allows text, e.g. <@U12345678> for users or <#C1234567> for channels; these can occur in the top-level text property, but also in attachments and blocks.
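As a small illustration of those pointers, the sketch below builds a user-ID-to-name map with users_list and applies it to the top-level user property of each fetched message; channels, bots, and in-text mentions would follow the same pattern. build_user_map and attach_user_names are illustrative names, and messages_all refers to the list built in the example code below:

def build_user_map(client):
    """Map user IDs to display names using users_list (cursor paging omitted for brevity)."""
    response = client.users_list()
    assert response["ok"]
    return {
        member["id"]: member["profile"].get("real_name") or member["name"]
        for member in response["members"]
    }

def attach_user_names(messages, user_map):
    # annotate each message with a resolved name where the user ID is known
    for message in messages:
        if "user" in message:
            message["user_name"] = user_map.get(message["user"], message["user"])
    return messages

# usage, after running the example code below:
# user_map = build_user_map(client)
# messages_all = attach_user_names(messages_all, user_map)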
Example code
import os
import slack
import json
from time import sleep

CHANNEL = "C12345678"
MESSAGES_PER_PAGE = 200
MAX_MESSAGES = 1000

# init web client
client = slack.WebClient(token=os.environ['SLACK_TOKEN'])

# get first page
page = 1
print("Retrieving page {}".format(page))
response = client.conversations_history(
    channel=CHANNEL,
    limit=MESSAGES_PER_PAGE,
)
assert response["ok"]
messages_all = response['messages']

# get additional pages if we are below the max message count and there are more pages
while len(messages_all) + MESSAGES_PER_PAGE <= MAX_MESSAGES and response['has_more']:
    page += 1
    print("Retrieving page {}".format(page))
    sleep(1)  # need to wait 1 sec before next call due to rate limits
    response = client.conversations_history(
        channel=CHANNEL,
        limit=MESSAGES_PER_PAGE,
        cursor=response['response_metadata']['next_cursor']
    )
    assert response["ok"]
    messages = response['messages']
    messages_all = messages_all + messages

print(
    "Fetched a total of {} messages from channel {}".format(
        len(messages_all),
        CHANNEL
    ))

# write the result to a file
with open('messages.json', 'w', encoding='utf-8') as f:
    json.dump(
        messages_all,
        f,
        sort_keys=True,
        indent=4,
        ensure_ascii=False
    )
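As noted under threads above, conversations_history only returns the parent messages; the replies need one conversations_replies call per thread. A minimal sketch of that follow-up step, reusing client, CHANNEL, messages_all, and sleep from the example above (fetch_thread_replies is an illustrative name, and cursor paging of long threads is omitted):

def fetch_thread_replies(client, channel, messages):
    """For every message that opened a thread, pull its replies with conversations_replies."""
    threads = {}
    for message in messages:
        thread_ts = message.get("thread_ts")
        # a parent message carries a thread_ts equal to its own ts
        if thread_ts and thread_ts == message.get("ts"):
            response = client.conversations_replies(channel=channel, ts=thread_ts)
            assert response["ok"]
            threads[thread_ts] = response["messages"]
            sleep(1)  # stay under the rate limit, as in the paging loop above
    return threads

# thread_messages = fetch_thread_replies(client, CHANNEL, messages_all)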
This uses the Slack Web API directly; you need to install the requests package. It should grab all the messages in a channel. You need a token, which can be taken from the app management page, and you can look up channel IDs with the getChannels() function. Once you have grabbed all the messages, you will need to see who wrote which message; for that you have to map IDs to usernames, which you can do with the getUsers() function (see the usage sketch after the functions below). Follow https://api.slack.com/custom-integrations/legacy-tokens to generate a legacy token if you do not want to use a token from your app.
import requests

def getMessages(token, channelId):
    # fetch the message history of the given channel via the Web API
    print("Getting Messages")
    slack_url = "https://slack.com/api/conversations.history?token=" + token + "&channel=" + channelId
    messages = requests.get(slack_url).json()
    return messages

def getChannels(token):
    '''
    Returns a dictionary mapping the name of every channel in the
    workspace to its ID.
    '''
    channelsURL = "https://slack.com/api/conversations.list?token=%s" % token
    channelList = requests.get(channelsURL).json()["channels"]  # an array of channels
    channels = {}
    # put the channels and their IDs into a dictionary
    for channel in channelList:
        channels[channel["name"]] = channel["id"]
    return {"channels": channels}

def getUsers(token):
    # returns the list of users in the workspace, including bots
    channelsURL = "https://slack.com/api/users.list?token=%s&pretty=1" % token
    members = requests.get(channelsURL).json()["members"]
    return members
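To tie the three helpers together, here is a small usage sketch; the token value and the "general" channel name are placeholders, not part of the original answer:

token = "xoxp-..."  # placeholder; use your own token

channels = getChannels(token)["channels"]
history = getMessages(token, channels["general"])

# map user IDs to display names so we can see who wrote what
user_names = {member["id"]: member["name"] for member in getUsers(token)}
for message in history.get("messages", []):
    author = user_names.get(message.get("user"), message.get("user"))
    print(author, ":", message.get("text"))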
I am using PyAPNs to send notifications to iOS devices. I am often sending groups of notifications at once. If any of the tokens is bad for any reason, the process will stop. As a result I am using the enhanced setup and the following method:
apns.gateway_server.register_response_listener
I use this to track which token was the problem and then pick up from there, sending the rest. The issue is that, when sending, the only way to trap these errors is to use a sleep timer between token sends. For example:
for x in self.retryAPNList:
    apns.gateway_server.send_notification(x, payload, identifier=token)
    time.sleep(0.5)
If I don't use a sleep timer, no errors are caught, and thus my entire APN list is not sent to, because the process stops when there is a bad token. However, this sleep timer is somewhat arbitrary: sometimes 0.5 seconds is enough, while other times I have had to set it to 1. In no case has it worked without some sleep delay being added. Doing this slows down web calls, and entering arbitrary sleep times feels less than bulletproof.
Any suggestions for how this can work without a delay between APN calls or is there a best practice for the delay needed?
Adding more code due to the request made below. Here are 3 methods inside of a class that I use to control this:
class PushAdmin(webapp2.RequestHandler):
    retryAPNList = []
    channelID = ""
    channelName = ""
    userName = ""
    apns = APNs(use_sandbox=True, cert_file="mycert.pem", key_file="mykey.pem", enhanced=True)

    def devChannelPush(self, channel, name, sendAlerts):
        ucs = UsedChannelStore()
        pus = PushUpdateStore()
        channelName = ""
        refreshApnList = pus.getAPN(channel)
        if sendAlerts:
            alertApnList, channelName = ucs.getAPN(channel)
            if not alertApnList: alertApnList = []
            if not refreshApnList: refreshApnList = []
            pushApnList = list(set(alertApnList + refreshApnList))
        elif refreshApnList:
            pushApnList = refreshApnList
        else:
            pushApnList = []
        self.retryAPNList = pushApnList
        self.channelID = channel
        self.channelName = channelName
        self.userName = name
        self.retryAPNPush()

    def retryAPNPush(self):
        token = -1
        payload = Payload(alert="A message from " + self.userName + " posted to " + self.channelName, sound="default", badge=1, custom={"channel": self.channelID})
        if len(self.retryAPNList) > 0:
            token += 1
            for x in self.retryAPNList:
                self.apns.gateway_server.send_notification(x, payload, identifier=token)
                time.sleep(0.5)
Below is the calling class (abbreviated to remove unrelated items):
class ChannelStore(ndb.Model):
    def writeMessage(self, ID, name, message, imageKey, fileKey):
        notify = PushAdmin()
        notify.devChannelPush(ID, name, True)
Below is the slight change I made to the placement of the sleep timer, which seems to have resolved the issue. I am, however, still concerned about whether the time given will be the right amount in all circumstances.
def retryAPNPush(self):
    identifier = 1
    token = -1
    payload = Payload(alert="A message from " + self.userName + " posted to " + self.channelName, sound="default", badge=1, custom={"channel": self.channelID})
    if len(self.retryAPNList) > 0:
        token += 1
        for x in self.retryAPNList:
            self.apns.gateway_server.send_notification(x, payload, identifier=token)
        time.sleep(0.5)
Resolution:
As noted in the comments at bottom, the resolution to this problem was to move the following statement to the module level outside the class. By doing this there is no need for any sleep statements.
apns = APNs(use_sandbox=True,cert_file="mycert.pem", key_file="mykey.pem", enhanced=True)
In fact, PyAPNs will automatically resend dropped notifications for you; please see the PyAPNs documentation.
So you don't have to retry yourself; you can just record which notifications have bad tokens.
The behavior of your code might result from the APNs object being kept in a local scope (within if len(self.retryAPNList) > 0:).
I suggest pulling the APNs object out to class or module level, so that it can complete its error-handling procedure and reuse the TCP connection (see the sketch below).
Please kindly let me know if this helps, thanks :)
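To make that concrete, here is a minimal sketch of the module-level setup with a response listener that simply records failures instead of sleeping between sends. It assumes the enhanced protocol hands the listener an error response containing the identifier passed to send_notification plus an APNs status code; check those keys against your PyAPNs version, and failed_tokens / push_to_all are illustrative names:

from apns import APNs, Payload

# module level: the connection and its error-handling thread outlive any single request
apns = APNs(use_sandbox=True, cert_file="mycert.pem", key_file="mykey.pem", enhanced=True)

failed_tokens = {}  # identifier -> APNs status code, filled in by the listener

def response_listener(error_response):
    # assumption: error_response exposes the identifier and status of the failed send
    failed_tokens[error_response["identifier"]] = error_response["status"]

apns.gateway_server.register_response_listener(response_listener)

def push_to_all(tokens, payload):
    # one distinct identifier per token so a failure can be mapped back to its device
    for identifier, token in enumerate(tokens):
        apns.gateway_server.send_notification(token, payload, identifier=identifier)

After a batch, failed_tokens tells you which device tokens to remove, while PyAPNs itself re-sends the notifications that were dropped after the error, as described above.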
I'm using Flask and Tweepy to search for live tweets. On the front-end I have a text input and a button called "Search". Ideally, when a user enters a search term into the input and clicks the "Search" button, Tweepy should start listening for the new search term and stop the previous search term's stream. When the "Search" button is clicked, it executes this function:
@app.route('/search', methods=['POST'])
# gets search-keyword and starts stream
def streamTweets():
    search_term = request.form['tweet']
    search_term_hashtag = '#' + search_term
    # instantiate listener
    listener = StdOutListener()
    # stream object uses listener we instantiated above to listen for data
    stream = tweepy.Stream(auth, listener)
    if stream is not None:
        print("Stream disconnected...")
        stream.disconnect()
    stream.filter(track=[search_term or search_term_hashtag], async=True)
    redirect('/stream')  # execute '/stream' sse
    return render_template('index.html')
The /stream route executed in the second-to-last line of the code above is as follows:
@app.route('/stream')
def stream():
    # we will use a Pub/Sub process to send real-time tweets to the client
    def event_stream():
        # instantiate pubsub
        pubsub = red.pubsub()
        # subscribe to tweet_stream channel
        pubsub.subscribe('tweet_stream')
        # initiate server-sent events on messages pushed to the channel
        for message in pubsub.listen():
            yield 'data: %s\n\n' % message['data']
    return Response(stream_with_context(event_stream()), mimetype="text/event-stream")
My code works fine, in the sense that it starts a new stream and searches for a given term whenever the "Search" button is clicked, but it does not stop the previous search. For example, if my first search term was "NYC" and then I wanted to search for a different term, say "Los Angeles", it will give me results for both "NYC" and "Los Angeles", which is not what I want. I want just "Los Angeles" to be searched. How do I fix this? In other words, how do I stop the previous stream? I looked through other previous threads, and I know I have to use stream.disconnect(), but I'm not sure how to implement this in my code. Any help or input would be greatly appreciated. Thanks so much!!
Below is some code that cancels old streams whenever a new one is created. It works by adding each new stream to a global list and calling stream.disconnect() on every stream in that list before starting the next one.
diff --git a/app.py b/app.py
index 1e3ed10..f416ddc 100755
--- a/app.py
+++ b/app.py
@@ -23,6 +23,8 @@ auth.set_access_token(access_token, access_token_secret)
 
 app = Flask(__name__)
 red = redis.StrictRedis()
+# Add a place to keep track of current streams
+streams = []
 
 @app.route('/')
 def index():
@@ -32,12 +34,18 @@ def index():
 @app.route('/search', methods=['POST'])
 # gets search-keyword and starts stream
 def streamTweets():
+    # cancel old streams
+    for stream in streams:
+        stream.disconnect()
+
     search_term = request.form['tweet']
     search_term_hashtag = '#' + search_term
     # instantiate listener
     listener = StdOutListener()
     # stream object uses listener we instantiated above to listen for data
     stream = tweepy.Stream(auth, listener)
+    # add this stream to the global list
+    streams.append(stream)
     stream.filter(track=[search_term or search_term_hashtag],
                   async=True)  # make sure stream is non-blocking
     redirect('/stream')  # execute '/stream' sse
What this does not solve is the problem of session management. With your current setup a search by one user will affect the searches of all users. This can be avoided by giving your users some identifier and storing their streams along with their identifier. The easiest way to do this is likely to use Flask's session support. You could also do this with a requestId as Pierre suggested. In either case you will also need code to notice when a user has closed the page and close their stream.
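As an illustration of the session-based variant described above, here is a hedged sketch that tracks one stream per browser session instead of one global list. It assumes app.secret_key is set so Flask sessions work, and reuses auth and StdOutListener from the question; get_user_id and the streams dict are illustrative names:

import uuid

import tweepy
from flask import render_template, request, session

streams = {}  # session identifier -> that user's active stream

def get_user_id():
    # give each browser session a stable identifier stored in the Flask session cookie
    if 'uid' not in session:
        session['uid'] = uuid.uuid4().hex
    return session['uid']

@app.route('/search', methods=['POST'])
def streamTweets():
    uid = get_user_id()
    # stop only this user's previous stream, leaving other users' searches alone
    old_stream = streams.pop(uid, None)
    if old_stream is not None:
        old_stream.disconnect()
    search_term = request.form['tweet']
    listener = StdOutListener()
    stream = tweepy.Stream(auth, listener)
    streams[uid] = stream
    stream.filter(track=[search_term], async=True)  # note: this parameter was renamed in newer Tweepy versions, since async is a reserved word on Python 3.7+
    return render_template('index.html')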
Disclaimer: I know nothing about Tweepy, but this appears to be a design issue.
Are you trying to add state to a RESTful API? You may have a design problem.
As JRichardSnape answered, your API shouldn't be the one taking care of canceling a request; that should be done from the front-end. What I mean is: in the JavaScript / AJAX / etc. that calls this function, add another call to a new endpoint
@app.route('/cancelSearch', methods=['POST'])
with a POST body that carries the search terms. As long as you don't have state, you can't really do this safely in an async call: imagine someone else makes the same search at the same time; canceling one will cancel both (remember, without state you don't know whose search you are canceling). Perhaps you do need state in your design.
If you must keep using this and don't mind breaking the "stateless" rule, then add a "state" to your request. In this case it's not so bad, because you could launch a thread and name it with the userId, then kill that thread on every new search:
def streamTweets():
    search_term = request.form['tweet']
    userId = request.form['userId']  # if your limit is one request per user at a time; if multiple windows can be opened and you want to keep this limit, store userId in a cookie
    # look for any request currently running with this ID, and cancel it
Alternatively, you could return a requestId, which you would then keep in the front-end and use to call cancelSearch?requestId=$requestId. In cancelSearch, you would have to find the pending request (it sounds like that lives in Tweepy, since you're not using your own threads) and disconnect it (a sketch of this follows below).
Out of curiosity, I just watched what happens when you search on Google, and it uses a GET request. Have a look (debug tools -> Network; then enter some text and watch the autofill requests). Google uses a token sent with every request (every time you type something). It doesn't mean it's used for this, but that's basically what I described. If you don't want a session, then use a unique identifier.
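Here is a hedged sketch of the requestId approach described above; pending_streams, the /cancelSearch route, and the JSON response shape are illustrative choices rather than part of the original code, and auth and StdOutListener again come from the question:

import uuid

import tweepy
from flask import jsonify, request

pending_streams = {}  # requestId -> the tweepy stream started for that request

@app.route('/search', methods=['POST'])
def streamTweets():
    search_term = request.form['tweet']
    listener = StdOutListener()
    stream = tweepy.Stream(auth, listener)
    request_id = uuid.uuid4().hex
    pending_streams[request_id] = stream
    stream.filter(track=[search_term], async=True)  # parameter renamed in newer Tweepy versions
    # hand the id back so the front-end can cancel this specific stream later
    return jsonify(requestId=request_id)

@app.route('/cancelSearch', methods=['POST'])
def cancelSearch():
    # look up and disconnect the stream belonging to the given requestId
    stream = pending_streams.pop(request.form['requestId'], None)
    if stream is not None:
        stream.disconnect()
    return ('', 204)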
Well, I solved it by using a timer method, but I'm still looking for a more Pythonic way.
from streamer import StreamListener

def stream():
    hashtag = input
    # assign each user an ID (for pubsub)
    StreamListener.userid = random_user_id

    def handler(signum, frame):
        print("Forever is over")
        raise Exception("end of time")

    def main_stream():
        stream = tweepy.Stream(auth, StreamListener())
        stream.filter(track=track, async=True)
        redirect(url_for('map_stream'))

    def close_stream():
        # this is supposed to close the client list in redis, but I'm not sure it's working
        obj = redis.client_list(tweet_stream)
        redis_client_list = obj[0]['addr']
        redis.client_kill(redis_client_list)
        stream = tweepy.Stream(auth, StreamListener())
        stream.disconnect()

    import signal
    signal.signal(signal.SIGALRM, handler)
    signal.alarm(300)
    try:
        main_stream()
    except Exception:
        close_stream()
        print("function terminate")