I have made a Slack bot using slackclient 1.3.2 and Python 3.5 in a virtualenv, running on Windows 7. One of its functions sends a file from a directory when requested. This function works fine when sending files whose names and paths contain only ASCII, but it does nothing when I ask it to send a file with an accent in the name.
Here is the relevant part of the code:
import asyncio, requests
from slackclient import SlackClient

token = "MYTOKEN"
sc = SlackClient(token)

@asyncio.coroutine
def sendFile(sc, filename, channels):
    print(filename)
    f = {'file': (filename, open(filename, 'rb'), 'text/plain', {'Expires': '0'})}
    response = requests.post(url='https://slack.com/api/files.upload',
                             data={'token': token, 'channels': channels, 'media': f},
                             headers={'Accept': 'application/json'}, files=f)

@asyncio.coroutine
def sendExample(sc, chan, user, instructions):
    path1 = 'D:/Test Files/test.txt'  # note: just ascii characters
    path2 = 'D:/Test Files/tést.txt'  # note: same as path1 but with accent over e in filename
    print('instructions: ', instructions)
    if instructions[0] == 'path1':
        sc.rtm_send_message(chan, 'Sending path1!')
        asyncio.async(sendFile(sc, path1, chan))
    if instructions[0] == 'path2':
        sc.rtm_send_message(chan, 'Sending path2!')
        asyncio.async(sendFile(sc, path2, chan))

@asyncio.coroutine
def listen():
    x = sc.rtm_connect()
    while True:
        yield from asyncio.sleep(0)
        info = sc.rtm_read()
        if len(info) == 1:
            if 'text' in info[0]:
                print(info[0]['text'])
                if r'send' in info[0]['text'].lower().split(' '):
                    chan = info[0]['channel']
                    user = info[0]['user']
                    instructions = info[0]['text'].lower().split(' ')[1:]
                    asyncio.async(sendExample(sc, chan, user, instructions))

def main():
    print('Starting bot.')
    loop = asyncio.get_event_loop()
    asyncio.async(listen())
    loop.run_forever()

if __name__ == '__main__':
    main()
D:/Test Files/test.txt and D:/Test Files/tést.txt are identical text files; one just has an accent over the e in the file name.
Here is what it looks like from the Slack interface (screenshot omitted): when I request path1 (test.txt) it works, but when I request path2 (tést.txt) the bot says it is sending something but nothing actually arrives.
The command-line output looks the same for both files:
>python testSlackSend.py
Starting bot.
send path1
instructions: ['path1']
D:/Test Files/test.txt
Sending path1!
send path2
instructions: ['path2']
D:/Test Files/tést.txt
Sending path2!
I guess I could just rename all the files to get rid of the accents, but that's not really ideal. Is there a more elegant way to solve this problem?
If it would help, I could migrate the whole thing to slackclient 2.x, but that seems like a lot of work that I'd rather not do if it's avoidable.
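One workaround I'm considering (untested sketch below, in case it helps): since the failure is presumably in how the accented filename gets encoded into the multipart Content-Disposition header, I could send the file bytes under an ASCII-safe placeholder name and pass the real name through the filename and title form fields that files.upload accepts:

def sendFileSafe(filename, channels):
    # Hypothetical workaround: ASCII placeholder name in the multipart part;
    # the real (possibly accented) name goes in the filename/title form fields.
    realname = filename.split('/')[-1]  # e.g. 'tést.txt'
    with open(filename, 'rb') as fh:
        f = {'file': ('upload.txt', fh, 'text/plain', {'Expires': '0'})}
        return requests.post(url='https://slack.com/api/files.upload',
                             data={'token': token, 'channels': channels,
                                   'filename': realname, 'title': realname},
                             headers={'Accept': 'application/json'}, files=f)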
Related
I want my bot to receive videos that users send to it. Previously I wrote this code to receive images from users, but I still can't use my bot to receive videos. Can anyone help me solve this problem?
Code:
file = update.message.photo[-1].file_id
First of all, Telegram can receive two types of videos: a simple video file that the user sends as an attachment, and a "video_note", a video recorded directly in Telegram.
Receiving both of these types is pretty similar.
First we need to obtain the file_id:
def GetVideoNoteId(update):
    if update['message'].get('video_note') is not None:
        return update['message']['video_note']['file_id']
    else:
        return 'not videonote'
The second step is getting the path of that file on Telegram's servers, from which we should download it:
def GetVideoNoteFileSource(FileId):
    # TOKEN is your bot token; requests is assumed to be imported
    url = 'https://api.telegram.org/bot' + TOKEN + '/' + 'getFile'
    jsn = {'file_id': FileId}
    r = requests.get(url, json=jsn).json()
    fileSource = r['result']['file_path']
    return fileSource
And third, finally, downloading the file:
def GetFile(fileSource):
    url = 'https://api.telegram.org/file/bot' + TOKEN + '/' + fileSource
    r = requests.get(url)
    filename = 'Video.mp4'
    try:
        with open(filename, 'wb') as file:
            file.write(r.content)
        return 'file downloaded'
    except:
        return 'something went wrong'
Receiving a video file sent as an attachment works much the same way, but the return update['message']['video_note']['file_id'] line would look different (I don't remember exactly).
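For completeness, here is a minimal sketch of that variant (untested; in the Bot API an attached video arrives under the 'video' key of the message), reusing the helpers above:

def GetVideoId(update):
    # Attached video files arrive under 'video' instead of 'video_note'
    if update['message'].get('video') is not None:
        return update['message']['video']['file_id']
    return 'not video'

# Usage: chain the three steps together for an incoming update
file_id = GetVideoId(update)
if file_id != 'not video':
    print(GetFile(GetVideoNoteFileSource(file_id)))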
The browser uses the Vue element-ui el-upload component to upload a file, and an aiohttp backend receives the form data and saves it. But aiohttp's request.multipart() is always blank, while request.post() works fine.
vue:
<!-- :action is the upload url, passed from the outer component -->
<el-upload class="image-uploader"
           :data="dataObj"
           drag
           name="aaa"
           :multiple="false"
           :show-file-list="false"
           :action="action"
           :on-success="handleImageScucess">
  <i class="el-icon-upload"></i>
</el-upload>

export default {
  name: 'singleImageUpload3',
  props: {
    value: String,
    action: String
  },
  methods: {
    handleImageScucess(file) {
      this.emitInput(file.files.file)
    }
  }
}
aiohttp (does not work):
async def post_image(self, request):
    reader = await request.multipart()
    image = await reader.next()
    print(image.text())
    filename = image.filename
    print(filename)
    size = 0
    with open(os.path.join('', 'aaa.jpg'), 'wb') as f:
        while True:
            chunk = await image.read_chunk()
            print("chunk", chunk)
            if not chunk:
                break
            size += len(chunk)
            f.write(chunk)
    return await self.reply_ok([])
aiohttp (works):
async def post_image(self, request):
    data = await request.post()
    print(data)
    mp3 = data['aaa']
    filename = mp3.filename
    mp3_file = data['aaa'].file
    content = mp3_file.read()
    with open('aaa.jpg', 'wb') as f:
        f.write(content)
    return await self.reply_ok([])
(Browser console screenshot omitted.)
Is this a bug, or did I miss something? Please help me solve it; thanks in advance.
I think you may have checked the example in the aiohttp documentation about a file upload server. But that snippet is ambiguous, and the documentation doesn't explain itself very well.
After digging around its source code for a while, I found that request.multipart() actually yields a MultipartReader instance, which processes a multipart/form-data request one field at a time: each call to .next() yields another BodyPartReader instance.
In your non-working code, the line image = await reader.next() reads out exactly one field from the form data, and you can't be sure which field that actually is. It might be the token field, the key field, the filename field, the aaa field... or any of them. So your post_image coroutine will only process one single field of the request data, and you cannot be sure it is the aaa file field.
Here's my code snippet,
async def post_image(self, request):
    # Iterate through each field of MultipartReader
    async for field in (await request.multipart()):
        if field.name == 'token':
            # Do something about token
            token = (await field.read()).decode()
            pass
        if field.name == 'key':
            # Do something about key
            pass
        if field.name == 'filename':
            # Do something about filename
            pass
        if field.name == 'aaa':
            # Process any files you uploaded
            filename = field.filename
            # In your example, filename should be "2C80...jpg"
            # Deal with actual file data
            size = 0
            with open(os.path.join('', filename), 'wb') as fd:
                while True:
                    chunk = await field.read_chunk()
                    if not chunk:
                        break
                    size += len(chunk)
                    fd.write(chunk)
    # Reply ok, all fields processed successfully
    return await self.reply_ok([])
The snippet above can also deal with multiple files in a single request under a duplicate field name ('aaa' in your example). The filename in the Content-Disposition header is filled in automatically by the browser itself, so there is no need to worry about it.
By the way, when dealing with file uploads, data = await request.post() will eat up a considerable amount of memory loading the file data. So avoid request.post() when file uploads are involved and use request.multipart() instead.
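As a side note, a minimal sketch (assuming a standard aiohttp.web setup, with post_image adapted to a plain handler function taking only request): the amount of body that request.post() will buffer is capped by the application's client_max_size, which defaults to 1 MiB, so for larger uploads you either stream with request.multipart() or raise the cap explicitly:

from aiohttp import web

# client_max_size caps how much request body aiohttp buffers for request.post()
app = web.Application(client_max_size=20 * 1024**2)  # raise the cap to 20 MiB
app.router.add_post('/upload', post_image)
web.run_app(app)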
I am trying to write a script to call the Watson Speech-to-Text (STT) API to continually transcribe speech being recorded through a microphone word-for-word in real-time. I read that this should be possible using the Websockets version of the API.
I have a Python script that should be able to do this on Linux (assuming the dependencies are installed); however, it does not work on Mac OS X.
from ws4py.client.threadedclient import WebSocketClient
import base64, json, ssl, subprocess, threading, time

class SpeechToTextClient(WebSocketClient):
    def __init__(self):
        ws_url = "wss://stream.watsonplatform.net/speech-to-text/api/v1/recognize"
        username = "your username"
        password = "your password"
        auth_string = "%s:%s" % (username, password)
        base64string = base64.encodestring(auth_string).replace("\n", "")
        self.listening = False
        try:
            WebSocketClient.__init__(self, ws_url,
                headers=[("Authorization", "Basic %s" % base64string)])
            self.connect()
        except:
            print "Failed to open WebSocket."

    def opened(self):
        self.send('{"action": "start", "content-type": "audio/l16;rate=16000"}')
        self.stream_audio_thread = threading.Thread(target=self.stream_audio)
        self.stream_audio_thread.start()

    def received_message(self, message):
        message = json.loads(str(message))
        if "state" in message:
            if message["state"] == "listening":
                self.listening = True
        print "Message received: " + str(message)

    def stream_audio(self):
        while not self.listening:
            time.sleep(0.1)
        reccmd = ["arecord", "-f", "S16_LE", "-r", "16000", "-t", "raw"]
        p = subprocess.Popen(reccmd, stdout=subprocess.PIPE)
        while self.listening:
            data = p.stdout.read(1024)
            try:
                self.send(bytearray(data), binary=True)
            except ssl.SSLError:
                pass
        p.kill()

    def close(self):
        self.listening = False
        self.stream_audio_thread.join()
        WebSocketClient.close(self)

try:
    stt_client = SpeechToTextClient()
    raw_input()
finally:
    stt_client.close()
Ideally, I wouldn't even be doing this in Python but in R, my main language, which is where I will have to transfer the results back to for processing anyway.
Could anyone provide me with a solution for how I can get a streamed transcription?
Not sure if this answer is exactly what you want, but it sounds like an issue with the parameter continuous.
There is a Python SDK for the service: watson-developer-cloud.
You can install it with: pip install watson-developer-cloud
import json
from os.path import join, dirname
from watson_developer_cloud import SpeechToTextV1

speech_to_text = SpeechToTextV1(
    username='YOUR SERVICE USERNAME',
    password='YOUR SERVICE PASSWORD',
    x_watson_learning_opt_out=False
)

print(json.dumps(speech_to_text.models(), indent=2))
print(json.dumps(speech_to_text.get_model('en-US_BroadbandModel'), indent=2))

with open(join(dirname(__file__), '../resources/speech.wav'), 'rb') as audio_file:
    data = json.dumps(speech_to_text.recognize(
        audio_file, content_type='audio/wav', timestamps=False,
        word_confidence=False, continuous=True), indent=2)
    print(data)
Note: the service returns an array of results, one per utterance.
Line #L44 of the SDK source lists the params you can use; for continuous transcription you need to set the parameter continuous to true, as in the example above.
See the official documentation on WebSockets for keeping the connection alive (maybe this is what you need).
For some good examples of how to do this using R, check out these great blog posts by Ryan Anderson.
Voice Controlled Music Machine
Python as a tool to help with Continuous Audio - This shows how you can use R for your main logic and use Python just to handle the audio.
Ryan has done a lot of work with R and the Watson APIs - he shares a lot of his knowledge on his blog.
I'm implementing a simple bot that should send some photos and videos to my chat_id.
I'm using Python; this is the script:
import sys
import time
import random
import datetime
import telepot

def handle(msg):
    chat_id = msg['chat']['id']
    command = msg['text']
    print 'Got command: %s' % command
    if command == 'command1':
        bot.sendMessage(chat_id, *******)
    elif command == 'command2':
        bot.sendMessage(chat_id, ******)
    elif command == 'photo':
        bot.sendPhoto(...)

bot = telepot.Bot('*** INSERT TOKEN ***')
bot.message_loop(handle)
print 'I am listening ...'

while 1:
    time.sleep(10)
In the bot.sendPhoto line I would insert the path and the chat_id of my image, but nothing happens.
Where am I going wrong?
Thanks.
If you have a local image path:
bot.send_photo(chat_id, photo=open('path', 'rb'))
If you have the URL of an image from the internet:
bot.send_photo(chat_id, 'your URL')
You can do it using just the Requests lib:

import requests

# api_url is assumed to be 'https://api.telegram.org/bot<TOKEN>/'
def send_photo(chat_id, file_opened):
    method = "sendPhoto"
    params = {'chat_id': chat_id}
    files = {'photo': file_opened}
    resp = requests.post(api_url + method, params, files=files)
    return resp

send_photo(chat_id, open(file_path, 'rb'))
I have used the following call with python-telegram-bot to send an image along with a caption:
context.bot.sendPhoto(chat_id=chat_id, photo="url_of_image",
                      caption="This is the test photo caption")
I also tried sending from Python using requests. Maybe it's a late answer, but I'm leaving this here for others like me; maybe it'll come in handy.
I succeeded with subprocess like so:
import subprocess

def send_image(botToken, imageFile, chat_id):
    # curl's -F photo=@<path> uploads the file as multipart/form-data
    command = 'curl -s -X POST https://api.telegram.org/bot' + botToken + '/sendPhoto -F chat_id=' + chat_id + ' -F photo=@' + imageFile
    subprocess.call(command.split(' '))
    return
This is the complete code to send a photo in Telegram:
import telepot
bot = telepot.Bot('______ YOUR TOKEN ________')
# here replace chat_id and test.jpg with real things
bot.sendPhoto(chat_id, photo=open('test.jpg', 'rb'))
You need to pass two params:
bot.sendPhoto(chat_id, 'URL')
sendPhoto requires at least two parameters; the first is the target chat_id, and for the second, photo, you have three options (see the sketch after this list):
Pass a file_id if the photo is already uploaded to Telegram's servers (recommended, because you don't need to re-upload it).
If the photo is hosted somewhere else, pass the full HTTP URL and Telegram will download it (max photo size is 5 MB at the moment).
Post the file using multipart/form-data, as if you were uploading it through a browser (10 MB max photo size this way).
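As a rough illustration of the three options (untested sketch; TOKEN, chat_id, and known_file_id are placeholders), using the raw HTTP API via requests:

import requests

API = 'https://api.telegram.org/bot' + TOKEN + '/sendPhoto'

# 1. Reuse a photo already on Telegram's servers via its file_id
requests.post(API, data={'chat_id': chat_id, 'photo': known_file_id})

# 2. Let Telegram fetch the photo from a public URL (max 5 MB)
requests.post(API, data={'chat_id': chat_id, 'photo': 'https://example.com/pic.jpg'})

# 3. Upload the file yourself as multipart/form-data (max 10 MB)
with open('pic.jpg', 'rb') as f:
    requests.post(API, data={'chat_id': chat_id}, files={'photo': f})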
I am moving to Python from another language and I am not sure how to properly tackle this. Using the urllib2 library it is quite easy to set up a proxy and get data from a site:
import urllib2
req = urllib2.Request('http://www.voidspace.org.uk')
response = urllib2.urlopen(req)
the_page = response.read()
The problem I have is that the retrieved text file is very large (hundreds of MB) and the connection is often problematic. The code also needs to catch connection, server, and transfer errors (it will be part of a small, extensively used pipeline).
Could anyone suggest how to modify the code above to make sure the code automatically reconnects n times (for example 100 times) and perhaps split the response into chunks so the data will be downloaded faster and more reliably?
I have already split the requests as much as I could, so now I have to make sure that the retrieval code is as good as it can be. Solutions based on core Python libraries are ideal.
Perhaps the library is already doing the above, in which case: is there any way to improve the downloading of large files? I am using UNIX and need to deal with a proxy.
Thanks for your help.
I'm putting up an example of how you might want to do this with the python-requests library. The script below checks whether the destination file already exists. If it does, it is assumed to be a partially downloaded file, and the script tries to resume the download. If the server claims to support HTTP partial requests (i.e. the response to a HEAD request contains the Accept-Ranges header), then the script resumes based on the size of the partially downloaded file; otherwise it just does a regular download and discards the parts that were already downloaded. I think it should be fairly straightforward to convert this to use just urllib2 if you don't want to use python-requests; it will probably just be much more verbose.
Note that resuming a download may corrupt the file if the file on the server was modified between the initial download and the resume. This can be detected if the server supports the strong HTTP ETag header, so the downloader can check whether it is resuming the same file.
I make no claim that it is bug-free.
You should probably add checksum logic around this script to detect download errors and retry from scratch if the checksum doesn't match (a rough sketch of that appears after the script).
import logging
import os
import re
import requests

CHUNK_SIZE = 5*1024  # 5KB

logging.basicConfig(level=logging.INFO)


def stream_download(input_iterator, output_stream):
    for chunk in input_iterator:
        output_stream.write(chunk)


def skip(input_iterator, output_stream, bytes_to_skip):
    total_read = 0
    while total_read <= bytes_to_skip:
        chunk = next(input_iterator)
        total_read += len(chunk)
        if total_read > bytes_to_skip:
            # only write the tail of the chunk that lies past the skip point
            output_stream.write(chunk[bytes_to_skip - total_read:])
    assert total_read == output_stream.tell()
    return input_iterator


def resume_with_range(url, output_stream):
    dest_size = output_stream.tell()
    headers = {'Range': 'bytes=%s-' % dest_size}
    resp = requests.get(url, stream=True, headers=headers)
    input_iterator = resp.iter_content(CHUNK_SIZE)
    if resp.status_code != requests.codes.partial_content:
        logging.warn('server does not agree to do partial request, skipping instead')
        input_iterator = skip(input_iterator, output_stream, output_stream.tell())
        return input_iterator
    rng_unit, rng_start, rng_end, rng_size = re.match(
        r'(\w+) (\d+)-(\d+)/(\d+|\*)', resp.headers['Content-Range']).groups()
    # rng_size may be '*', so only the start/end offsets are converted
    rng_start, rng_end = map(int, [rng_start, rng_end])
    assert rng_start <= dest_size
    if rng_start != dest_size:
        logging.warn('server returned different Range than requested')
        output_stream.seek(rng_start)
    return input_iterator


def download(url, dest):
    ''' Download `url` to `dest`, resuming if `dest` already exists

        If `dest` already exists it is assumed to be a partially
        downloaded file for the url.
    '''
    output_stream = open(dest, 'ab+')
    output_stream.seek(0, os.SEEK_END)
    dest_size = output_stream.tell()

    if dest_size == 0:
        logging.info('STARTING download from %s to %s', url, dest)
        resp = requests.get(url, stream=True)
        input_iterator = resp.iter_content(CHUNK_SIZE)
        stream_download(input_iterator, output_stream)
        logging.info('FINISHED download from %s to %s', url, dest)
        return

    remote_headers = requests.head(url).headers
    remote_size = int(remote_headers['Content-Length'])
    if dest_size < remote_size:
        logging.info('RESUMING download from %s to %s', url, dest)
        # Accept-Ranges may be absent; treat that as "no range support"
        support_range = 'bytes' in [s.strip() for s in
                                    remote_headers.get('Accept-Ranges', '').split(',')]
        if support_range:
            logging.debug('server supports Range request')
            logging.debug('downloading "Range: bytes=%s-"', dest_size)
            input_iterator = resume_with_range(url, output_stream)
        else:
            logging.debug('skipping %s bytes', dest_size)
            resp = requests.get(url, stream=True)
            input_iterator = resp.iter_content(CHUNK_SIZE)
            input_iterator = skip(input_iterator, output_stream, bytes_to_skip=dest_size)
        stream_download(input_iterator, output_stream)
        logging.info('FINISHED download from %s to %s', url, dest)
        return

    logging.debug('NOTHING TO DO')
    return


def main():
    TEST_URL = 'http://mirror.internode.on.net/pub/test/1meg.test'
    DEST = TEST_URL.split('/')[-1]
    download(TEST_URL, DEST)

main()
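Following the checksum suggestion above, a wrapper might look roughly like this (hypothetical sketch; it assumes you know the expected SHA-256 of the file in advance):

import hashlib

def sha256_of(path):
    # Hash the file in blocks so large files aren't loaded into memory at once
    h = hashlib.sha256()
    with open(path, 'rb') as fd:
        for block in iter(lambda: fd.read(64 * 1024), b''):
            h.update(block)
    return h.hexdigest()

def download_verified(url, dest, expected_sha256, max_retries=100):
    for attempt in range(max_retries):
        try:
            download(url, dest)
        except Exception:
            logging.exception('attempt %s failed, retrying', attempt)
            continue
        if sha256_of(dest) == expected_sha256:
            return True
        # Checksum mismatch: the file is bad, delete it and start from scratch
        os.remove(dest)
    return False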
You can try something like this. It reads the response line by line and appends each line to a file, checking to make sure that you don't write the same line twice. I'll write another version that does it by chunks as well.
import urllib2

file_checker = None
print("Please Wait...")
while True:
    try:
        req = urllib2.Request('http://www.voidspace.org.uk')
        response = urllib2.urlopen(req, timeout=20)
        print("Connected")
        with open("outfile.html", 'w+') as out_data:
            for data in response.readlines():
                file_checker = open("outfile.html")
                if data not in file_checker.readlines():
                    out_data.write(str(data))
                file_checker.close()
        break
    except urllib2.URLError:
        print("Connection Error!")
        print("Connecting again...please wait")
        if file_checker:
            file_checker.close()
print("done")
Here's how to read the data in chunks instead of line by line:
import urllib2

CHUNK = 16 * 1024
file_checker = None
print("Please Wait...")
while True:
    try:
        req = urllib2.Request('http://www.voidspace.org.uk')
        response = urllib2.urlopen(req, timeout=1)
        print("Connected")
        with open("outdata", 'wb+') as out_data:
            while True:
                chunk = response.read(CHUNK)
                file_checker = open("outdata")
                already_written = chunk in file_checker.readlines()
                file_checker.close()
                if chunk and not already_written:
                    out_data.write(chunk)
                else:
                    break
        break
    except urllib2.URLError:
        print("Connection Error!")
        print("Connecting again...please wait")
        if file_checker:
            file_checker.close()
print("done")