Multithreading for a socket connection in Python

I'm trying to scrape really hectic Twitch chats for keywords, but sometimes the socket stalls for a split second, and in that split second five messages can go by. I thought of adding some multithreading, but had no luck with the code below: the threads seem to either all fail to catch a keyword or all succeed. Any help is appreciated. Code below:
import os
import time
from dotenv import load_dotenv
import socket
import logging
from emoji import demojize
import threading
# loading environment variables
load_dotenv()
# variables for socket
server = "irc.chat.twitch.tv"
port = 6667
nickname = "frankied003"
token = os.getenv("TWITCH_TOKEN")
channel = "#xqcow"
# creating the socket and connecting
sock = socket.socket()
sock.connect((server, port))
sock.send(f"PASS {token}\n".encode("utf-8"))
sock.send(f"NICK {nickname}\n".encode("utf-8"))
sock.send(f"JOIN {channel}\n".encode("utf-8"))
while True:
    consoleInput = input(
        "Enter correct answer to the question (use a ',' for multiple answers):"
    )
    # if console input is stop, the code will stop ofcourse lol
    if consoleInput == "stop":
        break
    # make array of all the correct answers
    correctAnswers = consoleInput.split(",")
    correctAnswers = [answer.strip().lower() for answer in correctAnswers]
    def threadingFunction():
        correctAnswerFound = False
        # while the correct answer is not found, the chats will keep on printing
        while correctAnswerFound is not True:
            while True:
                try:
                    resp = sock.recv(2048).decode(
                        "utf-8"
                    ) # sometimes this fails, hence retry until it succeeds
                except:
                    continue
                break
            if resp.startswith("PING"):
                sock.send("PONG\n".encode("utf-8"))
            elif len(resp) > 0:
                username = resp.split(":")[1].split("!")[0]
                message = resp.split(":")[2]
                strippedMessage = " ".join(message.split())
                # once the answer is found, the chats will stop, correct answer is highlighted in green, and onto next question
                if str(strippedMessage).lower() in correctAnswers:
                    print(bcolors.OKGREEN + username + " - " + message + bcolors.ENDC)
                    correctAnswerFound = True
                else:
                    if username == nickname:
                        print(bcolors.OKCYAN + username + " - " + message + bcolors.ENDC)
                    # else:
                    #     print(username + " - " + message)
    t1 = threading.Thread(target=threadingFunction)
    t2 = threading.Thread(target=threadingFunction)
    t3 = threading.Thread(target=threadingFunction)
    t1.start()
    time.sleep(.3)
    t2.start()
    time.sleep(.3)
    t3.start()
    time.sleep(.3)
    t1.join()
    t2.join()
    t3.join()

First, it does not make much sense to let three threads read from the same socket in parallel; it only leads to confusion and race conditions.
The main problem, though, is that you are assuming a single recv will always read a single message. That is not how TCP works. TCP has no concept of a message; it is only a byte stream. A message is an application-level concept. A single recv might return a single message, multiple messages, parts of messages ...
So you have to parse the data you get according to the semantics defined by the application protocol, i.e.
1. initialize a buffer
2. get some data from the socket and add it to the buffer - don't decode the data yet
3. extract all complete messages from the buffer, then decode and process each message separately
4. leave any remaining incomplete message in the buffer
5. continue with #2
Apart from that, don't blindly throw away errors during recv(..).decode(..). Given that you are using a blocking socket, recv will usually only fail if there is a fatal problem with the connection, in which case a retry will not help. The errors most likely come from calling decode on incomplete messages, which can contain invalid UTF-8. But since you simply ignore the problem, you essentially lose those messages.
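The buffering steps above can be sketched roughly as follows. This is a minimal Python 3 illustration, not the asker's code: the helper name extract_lines is made up for this example, and it assumes IRC-style messages that end in "\n" (optionally preceded by "\r").

```python
def extract_lines(buffer):
    """Remove every complete line from `buffer` (a bytearray) and return them decoded.

    Incomplete trailing data stays in the buffer for the next recv().
    The name and signature are hypothetical, chosen for this sketch.
    """
    lines = []
    while True:
        end = buffer.find(b"\n")
        if end == -1:
            break  # no complete message left; keep the remainder buffered
        raw = bytes(buffer[:end]).rstrip(b"\r")
        del buffer[:end + 1]
        # decode only now, when the whole message is known to be present
        lines.append(raw.decode("utf-8", errors="replace"))
    return lines


# Simulated recv() results: one message split in two, then a second one plus a fragment
buf = bytearray()
for chunk in (b"PING :tmi.twi", b"tch.tv\r\n:bob!bob@bob.tmi.twitch.tv PRIVMSG #chan :hi\r\n:al"):
    buf.extend(chunk)
    for line in extract_lines(buf):
        print(line)
```

In a real client the chunks would come from a single reader calling sock.recv(2048), with buf.extend(chunk) followed by a loop over extract_lines(buf); one reader thread is enough for this.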

Related

Python IRC ChatBot hangs on socket.recv after seemingly random time even though socket.settimeout is 8

Hey, so I decided to create an IRC chat bot whose sole purpose is to read incoming messages from Twitch chat and, if a giveaway is recognized by a keyword, enter the giveaway by sending !enter in chat.
I built the bot upon this source: https://github.com/BadNidalee/ChatBot. I only changed things in Run.py, so that's the only code I'm going to post. The unaltered ChatBot does work, but it has no reconnect ability and regularly stops receiving data because the socket closes or for other reasons.
All I wanted to change was to make the bot stable, so that it can stay in the IRC chat constantly without disconnecting. I tried to achieve this by setting a timeout of 8 seconds on my socket, catching the timeout exceptions that occur, and reconnecting afterwards.
All in all it does seem to work: my bot does what it's supposed to even when a lot of messages are coming in, it recognizes when a giveaway starts and answers accordingly. IRC server PING messages are also handled and answered correctly. If there is no message in chat for over 8 seconds, the exception gets thrown correctly and the bot also reconnects correctly to IRC.
BUT here's my problem: after seemingly random times the socket will literally just stop working. What I find strange is that it will sometimes work for 20 minutes and sometimes for an hour. It doesn't coincide with special events in chat, like lots of messages; it really seems random. It will not time out; there's just nothing happening anymore. If I cancel the program with CTRL-C at this point, the console says the last call was "readbuffer = s.recv(1024)". But why is it not throwing a timeout exception at that point? If s.recv was called, the socket should time out if nothing is received after 8 seconds, but the program just stops and there is no more output until you manually abort it.
Maybe I went about it completely the wrong way. I just want a stable, 24/7-capable chat bot that scans for one simple keyword and answers with one simple !enter.
This is also my first time programming in Python, so if I broke any conventions or made any grave mistakes, let me know.
The getUser method returns the username of the line of chat that is currently being scanned.
The getMessage method returns the message of the line of chat that is being scanned.
The openSocket method opens the socket and sends JOIN, NICK, PASS etc. to the IRC server.
#!/usr/bin/python
import string
import socket
import datetime
import time
from Read import getUser, getMessage
from Socket import openSocket, sendMessage
from Initialize import joinRoom
connected = False
readbuffer = ""
def connect():
    print "Establishing Connection..."
    irc = openSocket()
    joinRoom(irc)
    global connected
    connected = True
    irc.settimeout(8.0)
    print "Connection Established!"
    return irc
while True:
    s = connect()
    s.settimeout(8.0)
    while connected:
        try:
            readbuffer = s.recv(1024)
            temp = string.split(readbuffer, "\n")
            readbuffer = temp.pop()
            for line in temp:
                if "PING" in line:
                    s.send(line.replace("PING", "PONG"))
                    timern = str(datetime.datetime.now().time())
                    timern = timern[0:8]
                    print timern + " PING received"
                    break
                user = getUser(line)
                message = getMessage(line)
                timern = str(datetime.datetime.now().time())
                timern = timern[0:8]
                print timern +" " + user + ": " + message
                if "*** NEW" in message:
                    sendMessage(s, "!enter")
                    break
        except socket.timeout:
            connected = False
            print "Socket Timed Out, Connection closed!"
            break
        except socket.error:
            connected = False
            print "Socket Error, Connection closed!"
            break
I think the socket timeout is the wrong tool for this.
s.settimeout(8.0)
sets a timeout on every blocking operation on the socket (connect, send, recv), not only on s.connect(...), so by itself it does not tell you whether data is waiting to be read.
Furthermore, switching to s.setblocking(0) alone probably won't help you either.
Instead, what you want to use is:
import select
ready = select.select([s], [], [], timeout_in_seconds)
if ready[0]:
    data = s.recv(1024)
What select does is check the buffer to see if any incoming data is available, if there is you call recv() which in itself is a blocking operation. If there's nothing in the buffer select will return empty and you should avoid calling recv().
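A small self-contained way to try this out, as a Python 3 sketch: the helper name recv_with_timeout is invented here, and socket.socketpair() stands in for the real IRC connection so the example runs without a network.

```python
import select
import socket


def recv_with_timeout(sock, timeout_in_seconds, bufsize=1024):
    """Return received bytes, b"" if the peer closed, or None if nothing arrived in time.

    Hypothetical helper for this sketch; it only wraps select.select + recv.
    """
    ready, _, _ = select.select([sock], [], [], timeout_in_seconds)
    if not ready:
        return None  # timed out: the caller can decide to reconnect
    return sock.recv(bufsize)


# Local demonstration with a connected pair of sockets instead of a real server
left, right = socket.socketpair()
print(recv_with_timeout(left, 0.1))   # nothing has been sent yet
right.sendall(b"PING :tmi.twitch.tv\r\n")
print(recv_with_timeout(left, 0.1))   # the bytes that were just sent
left.close()
right.close()
```

Distinguishing None (timeout) from b"" (connection closed by the peer) matters for a reconnecting bot, since an empty read will never turn into a timeout on its own.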
If you're running everything on Linux, you're also better off using epoll.
from select import epoll, EPOLLIN
poll = epoll()
poll.register(s.fileno(), EPOLLIN)
events = poll.poll(1) # 1 sec timeout
for fileno, event in events:
    # event is a bitmask, so test the EPOLLIN bit instead of comparing with "is"
    if event & EPOLLIN and fileno == s.fileno():
        data = s.recv(1024)
This is a crude example of how epoll could be used.
But it's quite fun to play around with, and you should read more about it.

I am creating a Twitch-focused IRC bot in Python, however it is getting the responses slowly. What am I doing wrong?

As the title says. This is also the first time I have used Python to do anything big, and I'm not all that used to the language yet, so I am probably missing something. The code is fairly short and is as follows, with username and private pass removed:
import re
import socket
import sys
import time
import string
HOST = "irc.twitch.tv"
PORT = 6667
NICK = ""
PASS = ""
CHAN = ""
RATE = (20/30) # messages per second
CHAT_MSG=re.compile(r"^:\w+!\w+#\w+\.tmi\.twitch\.tv PRIVMSG #\w+ :")
def chat(sock, msg):
    sock.send("PRIVMSG #{} :{}".format(cfg.CHAN, msg))
public = socket.socket()
public.connect((HOST, PORT))
public.send("PASS {}\r\n".format(PASS).encode("utf-8"))
public.send("NICK {}\r\n".format(NICK).encode("utf-8"))
public.send("JOIN {}\r\n".format(CHAN).encode("utf-8"))
private = socket.socket()
private.connect((HOST, PORT))
private.send("PASS {}\r\n".format(PASS).encode("utf-8"))
private.send("NICK {}\r\n".format(NICK).encode("utf-8"))
private.send("CAP REQ :twitch.tv/tags twitch.tv/commands {}\r\n".format(CHAN).encode("utf-8"))
while True:
    channelResponse = public.recv(1024).decode("utf-8")
    privateResponse = private.recv(1024).decode("utf-8")
    if privateResponse == "PING :tmi.twitch.tv\r\n":
        private.send("PONG :tmi.twitch.tv\r\n".encode("utf-8"))
    else:
        privateResponseUsername = re.search(r"\w+", privateResponse).group(0) # return the entire match
        privateResponseMessage = CHAT_MSG.sub("", privateResponse)
        print(privateResponseUsername + ": " + privateResponseMessage)
    if channelResponse == "PING :tmi.twitch.tv\r\n":
        public.send("PONG :tmi.twitch.tv\r\n".encode("utf-8"))
    else:
        username = re.search(r"\w+", channelResponse).group(0) # return the entire match
        message = CHAT_MSG.sub("", channelResponse)
        print(username + ": " + message)
    time.sleep(1 / RATE)
One thing to mention is that I was following a basic template style, however it did not cover implementing whispers into the bot - so I'm having to guess by doing research on how to do that, and it seems to be that the most recommended way is two connections, one for public, one for private.
As you've structured your code, you can't get anything from the private socket until you've gotten something from the public socket. If IRC didn't send PING messages occasionally, this would work even worse.
The way to handle this is to use select and give it your two sockets. As soon as one of them has data that can be read, select will return and indicate which socket has bytes available for reading.
This answer has some general code. You might want to modify it to look something like:
while True:
    # this will block until at least one socket is ready
    ready_socks,_,_ = select.select([private, public], [], [])
    if private in ready_socks:
        privateResponse += private.recv(1024)  # recv() requires a buffer size
    if public in ready_socks:
        channelResponse += public.recv(1024)
    # check privateResponse buffer, do stuff
    # check channelResponse buffer, do stuff
There are a few other things you should keep in mind:
The network doesn't have to deliver entire IRC messages at the same time, nor does it have to deliver a single one at a time. You could get "PI", "NG :t", "mi.twitch.tv", "\r\n" as separate messages. So you should accumulate bytes in a buffer, and then when you've got at least one entire message, process it, and remove it from the buffer.
UTF-8 characters can span multiple bytes, and might be split up by the network. Don't decode UTF-8 until you're sure you've got an entire message to work with.
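Putting those two points together with select, a rough Python 3 sketch could keep one byte buffer per socket and decode only complete "\r\n"-terminated lines. The function name pump and the buffers dictionary are assumptions made for this example, not part of any library.

```python
import select
import socket


def pump(socks, buffers, timeout=None):
    """Read from whichever sockets are ready and return {sock: [complete decoded lines]}.

    `buffers` maps each socket to a bytearray of not-yet-complete data.
    Hypothetical helper; a real bot would also treat an empty recv() as a disconnect.
    """
    ready, _, _ = select.select(socks, [], [], timeout)
    result = {}
    for s in ready:
        buf = buffers.setdefault(s, bytearray())
        buf.extend(s.recv(1024))
        # everything before the last "\r\n" is complete; the tail may be a partial line
        *complete, rest = bytes(buf).split(b"\r\n")
        buffers[s] = bytearray(rest)
        result[s] = [raw.decode("utf-8", errors="replace") for raw in complete]
    return result
```

The main loop would then call pump([private, public], buffers) repeatedly and handle the lines returned for each socket, answering any line that starts with PING on the socket it arrived on.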

How to keep a python 3 script (Bot) running

(Not a native English speaker, so sorry for my probably broken English. I'm also a newbie at programming.)
Hello, I'm trying to connect to a TeamSpeak server using the QueryServer to make a bot. After days of struggling with it... it works, with only one problem, and I'm stuck on that one.
If you need to check, this is the TeamSpeak API that I'm using: http://py-ts3.readthedocs.org/en/latest/api/query.html
And this is a summary of what actually happens in my script:
1. It connects.
2. It checks for the channel ID (and its own client ID).
3. It joins the channel.
4. The script ends, so it disconnects.
My question is: how can I keep it from disconnecting? How can I make the script stay in a "waiting" state so it can read when someone types "hi bot" in the channel? All the code needed to read texts and answer them seems easy to write; however, I can't keep the bot "running", since the process exits as soon as the script finishes.
More info:
I am using Python 3.4.1.
I tried learning threading (http://www.tutorialspoint.com/python/python_multithreading.htm), but either I'm dumb or it doesn't work the way I thought it would.
In the API there's a function named on_event that I would like to keep running all the time. The bot code should only be run once and then stay "waiting" until an event happens. How should I do that? No clue.
Code:
import ts3
import telnetlib
import time
class BotPrincipal:
    def Conectar(ts3conn):
        MiID = [i["client_id"] for i in ts3conn.whoami()]
        ChannelToJoin = "[Pruebas] Bots"
        ts3conn.on_event = BotPrincipal.EventHappened()
        try:
            BuscandoIDCanal = ts3conn.channelfind(pattern=ChannelToJoin)
            IDCanal = [i["cid"] for i in BuscandoIDCanal]
            if not IDCanal:
                print("No channel found with that name")
                return None
            else:
                MiID = str(MiID).replace("'", "")
                MiID = str(MiID).replace("]", "")
                MiID = str(MiID).replace("[", "")
                IDCanal = str(IDCanal).replace("'", "")
                IDCanal = str(IDCanal).replace("]", "")
                IDCanal = str(IDCanal).replace("[", "")
                print("ID de canal " + ChannelToJoin + ": " + IDCanal)
                print("ID de cliente " + Nickname + ": " + MiID)
                try:
                    print("Moving you into: " + ChannelToJoin)
                    ts3conn.clientmove(cid=IDCanal, clid=MiID) #entra al canal
                    try:
                        print("Asking for notifications from: " + ChannelToJoin)
                        ts3conn.servernotifyregister(event="channel", id_=IDCanal)
                        ts3conn.servernotifyregister(event="textchannel", id_=IDCanal)
                    except ts3.query.TS3QueryError:
                        print("You have no permission to use the telnet command: servernotifyregister")
                    print("------- Bot Listo -------")
                except ts3.query.TS3QueryError:
                    print("You have no permission to use the telnet command: clientmove")
        except ts3.query.TS3QueryError:
            print("Error finding ID for " + ChannelToJoin + ". telnet: channelfind")
    def EventHappened():
        print("Doesn't work")
# Data needed #
USER = "thisisafakename"
PASS = "something"
HOST = "111.111.111.111"
PORT = 10011
SID = 1
if __name__ == "__main__":
    with ts3.query.TS3Connection(HOST, PORT) as ts3conn:
        ts3conn.login(client_login_name=USER, client_login_password=PASS)
        ts3conn.use(sid=SID)
        print("Connected to "+HOST)
        BotPrincipal.Conectar(ts3conn)
From a quick glimpse at the API, it looks like you need to explicitly tell the ts3conn object to wait for events. There seem to be a few ways to do it, but ts3conn.recv(True) seems like the most obvious:
Blocks untill all unfetched responses have been received or forever, if recv_forever is true.
Presumably as each command comes in, it will call your on_event handler, then when you return from that it will go back to waiting forever for the next command.
I don't know if you need threads here or not, but the docs for recv_in_thread make it sound like you might:
Calls recv() in a thread. This is useful, if you used servernotifyregister and you expect to receive events.
You presumably want to get both servernotify events and also commands, and I guess the way this library is written you need threads for that? If so, just call ts3conn.recv_in_thread() instead of ts3conn.recv(True). (If you look at the source, all that does is start a background thread and call self.recv(True) on that thread.)
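To illustrate the general pattern only (a blocking receive loop that invokes a handler, optionally run on a background thread), here is a Python sketch with a stand-in class. FakeConnection, push and recv_forever are invented for this example and are not the ts3 API; the real library's behaviour should be checked against its documentation.

```python
import queue
import threading


class FakeConnection:
    """Hypothetical stand-in for a query connection; NOT the ts3 library."""

    def __init__(self):
        self._events = queue.Queue()
        self.on_event = None  # store a callable here (no parentheses), do not call it

    def push(self, event):
        """Simulate the server delivering an event."""
        self._events.put(event)

    def recv_forever(self):
        """Block waiting for events and hand each one to the handler."""
        while True:
            event = self._events.get()  # blocks, which keeps the script alive
            if event is None:           # sentinel used here to stop the loop
                break
            if self.on_event is not None:
                self.on_event(event)

    def recv_in_thread(self):
        """Run the blocking loop on a background thread and return that thread."""
        thread = threading.Thread(target=self.recv_forever, daemon=True)
        thread.start()
        return thread


def handle(event):
    print("event received:", event)


conn = FakeConnection()
conn.on_event = handle           # assign the function itself
worker = conn.recv_in_thread()
conn.push("hi bot")
conn.push(None)
worker.join()
```

The point of the sketch is that something has to block (here the queue read) for the process to stay alive, and that the handler is assigned as a function object so the loop can call it later.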

My chat client freezes up after beginning threads

I made a better chat client with help from people here.
They told me that if I didn't want to block on .recv while waiting for messages, I would need to use threads, classes, functions, and queues.
So I followed the advice one person gave me: I created a thread from a class and then defined a function that is supposed to read incoming messages and print them.
I also created a function that lets you enter text to be sent off.
The thing is, when I run the program, nothing happens.
Can somebody help point out what is wrong? (I've asked questions and researched for 3 days without getting anywhere, so I did try.)
from socket import *
import threading
import json
import select
print("Client Version 3")
HOST = input("Connect to: ")
PORT = int(input("On port: "))
# Create Socket
s = socket(AF_INET,SOCK_STREAM)
s.connect((HOST,PORT))
print("Connected to: ",HOST,)
#-------------------Need 2 threads for handling incoming and outgoing messages--
# 1: Create out_buffer:
Buffer = []
rlist,wlist,xlist = select.select([s],Buffer,[])
class Incoming(threading.Thread):
    # made a function a thread
    def Incoming_messages():
        while True:
            for i in rlist:
                data = i.recv(1024)
                if data:
                    print(data.decode())
# Now for outgoing data.
def Outgoing():
    while True:
        user_input=("Your message: ")
        if user_input is True:
            Buffer += [user_input.encode()]
        for i in wlist:
            s.sendall(Buffer)
            Buffer = []
Thanks for taking a look, thanks also to Tony The Lion for suggesting this
Take a look at this revised version of your code: (in python3.3)
from socket import *
import threading
import json
import select
print("client")
HOST = input("connect to: ")
PORT = int(input("on port: "))
# create the socket
s = socket(AF_INET, SOCK_STREAM)
s.connect((HOST, PORT))
print("connected to:", HOST)
#------------------- need 2 threads for handling incoming and outgoing messages--
# 1: create out_buffer:
out_buffer = []
# for incoming data
def incoming():
    rlist,wlist,xlist = select.select([s], out_buffer, [])
    while 1:
        for i in rlist:
            data = i.recv(1024)
            if data:
                print("\nreceived:", data.decode())
# now for outgoing data
def outgoing():
    global out_buffer
    while 1:
        user_input=input("your message: ")+"\n"
        if user_input:
            out_buffer += [user_input.encode()]
        # for i in wlist:
        s.send(out_buffer[0])
        out_buffer = []
thread_in = threading.Thread(target=incoming, args=())
thread_out = threading.Thread(target=outgoing, args=())
thread_in.start() # this causes the thread to run
thread_out.start()
thread_in.join() # this waits until the thread has completed
thread_out.join()
Your program had various problems. Namely, you need to create and start the threads; just defining them isn't enough.
You had also forgotten the input() function in the line: user_input=input("your message: ")+"\n".
The select() call was blocking until there was something to read, so the program never reached the later sections of the code; it's better to move it into the reading thread.
The send function in Python doesn't accept a list; in Python 3.3 it accepts a bytes object, as returned by the encode() method, so that part of the code had to be adapted.
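As a compact illustration of the last two points (threads must be started, and a list of encoded messages has to become one bytes object before sending), here is a Python 3 sketch. The names receiver, send_all_pending and demo are made up for this example, and socket.socketpair() replaces the real server so it can run locally.

```python
import socket
import threading


def receiver(sock, received):
    """Collect raw chunks until the peer closes the connection."""
    while True:
        data = sock.recv(1024)
        if not data:   # b"" means the other side closed
            break
        received.append(data)


def send_all_pending(sock, out_buffer):
    """Join the queued bytes objects into one payload and send it completely."""
    payload = b"".join(out_buffer)
    out_buffer.clear()
    sock.sendall(payload)  # sendall keeps sending until every byte is written


def demo():
    left, right = socket.socketpair()
    received = []
    reader = threading.Thread(target=receiver, args=(left, received))
    reader.start()                      # defining the thread is not enough
    pending = ["hello\n".encode(), "world\n".encode()]
    send_all_pending(right, pending)
    right.close()                       # lets the receiver loop finish
    reader.join()
    left.close()
    return b"".join(received)
```

Calling demo() returns the bytes that travelled through the socket pair; in the chat client the sending side would be driven by input() instead of a fixed list.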

Python socket Recv not working properly could someone explain

I am trying to create a GUI client for my command-line server. However, I am running into some annoying problems I can't seem to fix.
I'm not 100% sure what the actual problem is, as sometimes the code works and other times it doesn't. I think the main problem is that originally I tried this:
while 1:
    self.data = s.recv(1024)
    if not self.data():
        break
    else:
        print self.data()
Then I was sending to it with this:
for f in files:
    s.send(f)
Each f was a string containing a filename. I expected one filename to be received per recv call, but instead a single recv call returned a big chunk of filenames, I assume 1024 characters' worth,
which made it impossible to check for the end of the data, and thus the loop never exited.
This is the code I have now:
def get_data(self,size = 1024):
    self.alldata = ""
    while 1:
        while gtk.events_pending():
            gtk.main_iteration()
        self.recvdata = self.s.recv(size)
        self.alldata += self.recvdata
        if self.alldata.find("\r\n\r\nEOF"):
            print "recieved end message"
            self.rdata = self.alldata[:self.alldata.find("\r\n\r\nEOF")]
            break
    print "All data Recieved: " + str(len(self.rdata)) + "Bytes"
    print "All data :\n" + self.rdata + "\n-------------------------------------------------"
    self.infiles = self.rdata.split("-EOS-")
    for nf in self.infiles:
        if len(nf) > 2:
            self.add_message(self.incomingIcon,nf)
At the minute I'm trying to get the client to read correctly from the server. What I want to happen is that when the command list is typed in and sent, the server sends back the data and each file is appended to the list store.
Sometimes this works OK; other times only one of 1200 files gets returned. If it executes OK and I then try to type another command and send it, the whole GTK window greys out and the program becomes unresponsive.
Sorry I can't explain this question better; I've tried a lot of different solutions, all of which give different errors.
Could someone explain the recv command and why it may be giving these errors? This is how I'm sending data to the client:
if(commands[0] == 'list'):
    whatpacketshouldlooklike=""
    print "[Request] List files ", address
    fil = list_files(path)
    for f in fil:
        sdata = f
        whatpacketshouldlooklike += sdata + "-EOS-"
        newSock.send(sdata +"-EOS-")
        #print "sent: " + sdata
    newSock.send("\r\n\r\nEOF")
    whatpacketshouldlooklike += "\r\n\r\nEOF"
    print "---------------------------------"
    print whatpacketshouldlooklike
    print "---------------------------------"
The problem you had in the first part is that sockets are stream-based, not message-based. You need to come up with a message abstraction to layer on top of the stream. This way the other end of the pipe knows what is going on (how much data to expect as part of one command) and isn't guessing at what is supposed to happen.
Use an abstraction layer (Pyro, XML-RPC, zeromq) or define your own protocol to distinguish messages.
For example, as your own protocol you can send the length of a message as a "header" before each string. In this case you should use the struct module to pack the length into a binary format. Ask again if you want to go this way, but I strongly recommend choosing one of the mentioned abstraction layers.
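For anyone who does want to go the length-header route, a minimal Python 3 sketch might look like this. The 4-byte big-endian header and the function names send_msg, recv_exact and recv_msg are choices made for this example, not an established protocol.

```python
import socket
import struct

HEADER = struct.Struct("!I")  # assumed framing: 4-byte unsigned length, network byte order


def send_msg(sock, payload):
    """Send one message: length header followed by the payload bytes."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes, looping because recv() may return fewer."""
    data = bytearray()
    while len(data) < n:
        part = sock.recv(n - len(data))
        if not part:
            raise ConnectionError("socket closed before the full message arrived")
        data.extend(part)
    return bytes(data)


def recv_msg(sock):
    """Read one length-prefixed message and return its payload."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)
```

With this framing the server can call send_msg once per filename and the client calls recv_msg in a loop, so each call returns exactly one filename regardless of how TCP splits or merges the bytes.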
There are different problems with your code.
Let's start with the fundamental point that some people already commented on: there is no one-to-one relation between send() and recv() calls. You do not control which part of the data is returned by a given recv() call, so you need some kind of protocol. In your case it could be as simple as terminating command strings with "\n" and checking for "\n" on the server to consume the data.
Now other problems:
You are using send() without checking its return value. send() does not guarantee that all the data is written; if you need that, use sendall().
recv(1024) on a blocking socket (the default) returns as soon as some data is available, up to 1024 bytes, but it blocks for as long as nothing has arrived. To keep doing other work while waiting, you need to use a non-blocking socket and the select module.
My source code:
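def readReliably(s,n):
    buf = bytearray(n)
    view = memoryview(buf)
    sz = 0
    while sz < n:
        k = s.recv_into(view[sz:],n-sz)
        if k == 0:
            # peer closed the connection; without this check the loop never ends
            raise EOFError("connection closed after %d of %d bytes" % (sz, n))
        sz += k
    # print 'readReliably()',sz
    return sz,buf
def writeReliably(s,buf,n):
    sz = 0
    while sz < n:
        # the second argument of send() is flags, not a length, so slice instead
        k = s.send(buf[sz:n])
        sz += k
    # obj = s.makefile(mode='w')
    # obj.flush()
    # print 'writeReliably()',sz
    return sz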
Usage of these functions:
# Server
s = socket.socket(socket.AF_INET,socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
s.bind((host,port))
s.listen(10) # unaccepted connections
while True:
    sk,skfrom = s.accept()
    sz,buf = io.readReliably(sk,4)
    a = struct.unpack("4B",buf)
    print repr(a)
    # ...
    # writeReliably takes the number of bytes to send as its third argument
    io.writeReliably(sk,struct.pack("4B",*[0x01,0x02,0x03,0x04]),4)
See also official docs about recv_into(...), https://docs.python.org/2/library/socket.html#socket.socket.recv_into
